TW202405723A - Information processing system, method and program - Google Patents

Information processing system, method and program

Publication number
TW202405723A
Authority
TW
Taiwan
Prior art keywords
user
attribute data
attribute
data
relationship
Prior art date
Application number
TW112111662A
Other languages
Chinese (zh)
Inventor
山下智彦
町田大樹
吳垠
史普拉塔 高司
河崎麻里子
艾希利 詹
梅田卓志
蛭子琢磨
薩蒂恩 阿布羅爾
Original Assignee
日商樂天集團股份有限公司
Priority date
Filing date
Publication date
Application filed by 日商樂天集團股份有限公司
Publication of TW202405723A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

To realize evaluation such as calculating a user score or to improve evaluation accuracy even when information of a target user is missing or information reliability is low. An information processing system includes: a reference user identification unit 22 that identifies a reference user who is mutually related to a target user; an attribute generation unit 26 that generates corresponding attribute data of the target user based on attribute data of the reference user identified for the target user; an attribute complementing unit 27 that complements a corresponding attribute data group of the target user based on at least a part of the generated corresponding attribute data of the target user; and a user score estimation unit 28 that estimates a user score to be set for the target user based on the complemented corresponding attribute data group of the target user.

Description

Information processing system, information processing method, and program product

The present disclosure relates to technology for supporting evaluations of a user, such as the calculation of a user score.

A determination device has previously been proposed that includes a user information acquisition unit that acquires behavior information indicating a user's behavior, and a creditworthiness determination unit that determines, based on the behavior information, the creditworthiness of the user's future ability to repay financing (see Patent Document 1). A system has also been proposed that decides whether a user score may be displayed according to the degree of closeness between users (see, for example, Patent Document 2).

[Prior art documents]

[Patent documents]

[Patent Document 1] Japanese Laid-Open Patent Publication No. 2021-174039

[Patent Document 2] Japanese Laid-Open Patent Publication No. 2020-129228

[Problem to be solved by the invention]

Techniques have previously been proposed for calculating, from a user's behavior history, a user score indicating the user's creditworthiness or the like. However, when some of the target user's information is missing or the information is of low reliability, the user score either cannot be calculated or is calculated with insufficient accuracy.

In view of the above problem, an object of the present disclosure is to make evaluation such as the calculation of a user score possible, or to improve evaluation accuracy, even when some of the target user's information is missing or the information is of low reliability.

[Means for solving the problem]

One example of the present disclosure is an information processing system comprising: reference user identification means for identifying a reference user who has a relationship with a target user; attribute generation means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user; attribute complementing means for complementing a corresponding attribute data group of the target user based on at least part of the generated corresponding attribute data of the target user; and user score estimation means for estimating a user score to be set for the target user based on the complemented corresponding attribute data group of the target user.

The present disclosure can be understood as an information processing device, a system, a method executed by a computer, or a program that causes a computer to execute that method. The present disclosure can also be understood as such a program recorded on a recording medium readable by a computer or by another device, machine, or the like. Here, a recording medium readable by a computer or the like means a recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action, and from which a computer or the like can read that information.

[Effects of the invention]

According to the present disclosure, evaluation such as the calculation of a user score becomes possible, or evaluation accuracy improves, even when some of the target user's information is missing or the information is of low reliability.

Embodiments of the information processing system, information processing method, and program product according to the present disclosure are described below with reference to the drawings. The embodiments described below are examples only, and the information processing system, information processing method, and program product according to the present disclosure are not limited to the specific configurations described below. In practice, a specific configuration suited to the mode of implementation may be adopted as appropriate, and various improvements and modifications may be made.

This embodiment describes the case where the technology according to the present disclosure is implemented in a user score management system that manages user scores representing some measure (for example, credit) associated with a user. The technology according to the present disclosure can, however, be used widely as technology for estimating user scores, and its scope of application is not limited to the examples shown in the embodiment.

<System configuration>

FIG. 1 is a schematic diagram showing the configuration of the information processing system according to the present embodiment. In this system, an information processing device 1 and one or more service providing systems 5 are connected so that they can communicate with each other. A user is a user of a service provided by a service providing system 5 and receives the service by accessing the service providing system 5 from a user terminal.

The information processing device 1 is a computer that includes a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage device 14 such as an EEPROM (Electrically Erasable and Programmable Read Only Memory) or an HDD (Hard Disk Drive), and a communication unit 15 such as a NIC (Network Interface Card). Components of the specific hardware configuration of the information processing device 1 may, however, be omitted, replaced, or added as appropriate for the mode of implementation. Nor is the information processing device 1 limited to a device housed in a single enclosure; it can be realized by a plurality of devices using, for example, so-called cloud or distributed computing technology.

The information processing device 1 manages a user score for each user and provides the user scores to the service providing systems 5. A service providing system 5 can customize the service for a target user according to the user score provided by the information processing device 1.

Each service providing system 5 is a computer that includes a CPU, a ROM, a RAM, a storage device, a communication unit, an input device, an output device, and the like (not shown). None of these systems and terminals is limited to a device housed in a single enclosure; each can be realized by a plurality of devices using, for example, so-called cloud or distributed computing technology.

In the system according to the present embodiment, an e-commerce transaction system 40, a golf course reservation system 42, a travel reservation system 44, and a card management system 46 are connected as service providing systems 5 so that they can communicate with each other. The services provided by the service providing systems 5 are not, however, limited to the examples in this embodiment. They may be, for example, a map information service, a credit card or deferred payment service, an electronic money payment service, an online shopping service, an online reservation service, or a call center service. Note that "deferred payment" is not limited to services known as Buy Now, Pay Later (BNPL) and may include any purchase of goods or services that is paid for afterward.

The service providing system 5 notifies the information processing device 1 of the attribute data group of a user that it obtained from that user when providing the service. The information processing device 1 can also access the service providing system 5, acquire the user attribute data registered in that system for a plurality of users including the target user, and include it in the attribute data groups. Here, a user's attribute data contains account data, which is information about the user of the system, and usage history data for the services used by that user. The content of the usage history data varies with the service; examples include history data of the user's location information, payment history data for credit card or deferred payment usage amounts, electronic money usage history data, transaction history data, reservation history data, history data of call center support provided to the user, and frequently visited places of stay identified from the location history data. The account data contains, for example, a user ID, name data, address data, age data, gender data, telephone number data, mobile phone number data, credit card number data, IP address data, place-of-study data, and workplace data.

The user ID is, for example, identification information for the user in the computer system concerned. The name data is, for example, data indicating the user's name (surname and given name). The address data is, for example, data indicating the user's address. When the computer system is the e-commerce transaction system 40, the address data may instead indicate the delivery address for goods purchased by the user. The age data is, for example, data indicating the user's age. The gender data is, for example, data indicating the user's gender. The telephone number data is, for example, data indicating the user's telephone number. The mobile phone number data is, for example, data indicating the user's mobile phone number. The credit card number data is, for example, data indicating the number of the credit card that the user uses for payment in the computer system. The IP address data is, for example, data indicating the IP address of the computer used by the user (for example, the source IP address of transmissions). The place-of-study data is, for example, data indicating where the user studies (the name or address of the educational institution, and so on) when the user is a student. The workplace data is, for example, data indicating where the user works (the name or address of the company, and so on) when the user is a working adult.

FIG. 2 is a diagram outlining the functional configuration of the information processing device 1 according to the present embodiment. A program recorded in the storage device 14 is read into the RAM 13 and executed by the CPU 11 to control each piece of hardware in the information processing device 1, whereby the information processing device 1 functions as an information processing device that includes a graph data generation unit 21, a reference user identification unit 22, a relationship identification unit 23, a relationship strength determination unit 24, an attribute selection unit 25, an attribute generation unit 26, an attribute complementing unit 27, a user score estimation unit 28, and a machine learning unit 29. In this embodiment and the other embodiments described later, each function of the information processing device 1 is executed by the CPU 11, a general-purpose processor, but some or all of these functions may be executed by one or more dedicated processors.

The graph data generation unit 21 identifies pairs of related users based on the attribute data group of each of a plurality of users, and thereby generates graph data (a social graph network) representing the relationships between users. More specifically, the graph data generation unit 21 generates graph data containing, for example, node data 50 associated with each of a plurality of users including the target user, and link data 52 associated with each pair of related users (see FIGS. 4, 6, 8, and 9). In addition, the graph data generation unit 21 learns the inter-user relationship graph made up of nodes (users) connected by explicit links (representation learning, relational learning, embedding learning, knowledge graph embedding) in order to predict and create implicit links between users. For this learning, the graph data generation unit 21 may use a known embedding model or an extension of one, as appropriate.

For example, as shown in FIG. 3, suppose that the attribute data group of user A is registered in the e-commerce transaction system 40, the attribute data group of user B is registered in the golf course reservation system 42, and the attribute data of user C is registered in the travel reservation system 44. Suppose further that the IP address data registered for user A in the e-commerce transaction system 40, for user B in the golf course reservation system 42, and for user C in the travel reservation system 44 all have the same value.

In this case, the graph data generation unit 21 generates, as shown in FIG. 4, graph data containing node data 50a associated with user A, node data 50b associated with user B, node data 50c associated with user C, link data 52a indicating that user A is related to user B, link data 52b indicating that user A is related to user C, and link data 52c indicating that user B is related to user C. Users with the same IP address are presumed to be people who use the same computer or who share a global address at the same residence or workplace. In this embodiment, such users are therefore associated with each other.
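The FIG. 3 and FIG. 4 example can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation in the disclosure: the record layout, the field names (`user_id`, `system`, `ip_address`), the function name `build_explicit_links`, and the sample IP addresses are all hypothetical. The sketch groups users by the value of one attribute and links every pair within a group.

```python
from collections import defaultdict
from itertools import combinations


def build_explicit_links(records, key):
    """Link every pair of users whose records share the same value for `key`.

    `records` is a list of dicts with a hypothetical "user_id" field.
    Returns a set of (user, user) tuples in sorted order.
    """
    groups = defaultdict(set)
    for record in records:
        value = record.get(key)
        if value is not None:  # users without the attribute form no link
            groups[value].add(record["user_id"])

    links = set()
    for users in groups.values():
        for a, b in combinations(sorted(users), 2):
            links.add((a, b))
    return links


# Users A, B, and C are registered in different systems with the same IP address.
records = [
    {"user_id": "A", "system": "e-commerce", "ip_address": "203.0.113.7"},
    {"user_id": "B", "system": "golf", "ip_address": "203.0.113.7"},
    {"user_id": "C", "system": "travel", "ip_address": "203.0.113.7"},
    {"user_id": "Z", "system": "travel", "ip_address": "198.51.100.1"},
]
links = build_explicit_links(records, "ip_address")
print(sorted(links))
```

The three links produced for A, B, and C play the role of link data 52a to 52c; user Z shares no value with anyone and stays unlinked. Passing a different attribute key to the same function would group users on that attribute instead.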

As another example, as shown in FIG. 5, suppose that the attribute data groups of user D, user E, and user F are registered in the e-commerce transaction system 40, and that the address data registered there for user D, user E, and user F all have the same value.

In this case, the graph data generation unit 21 generates, as shown in FIG. 6, graph data containing node data 50d associated with user D, node data 50e associated with user E, node data 50f associated with user F, link data 52d indicating that user D is related to user E, link data 52e indicating that user D is related to user F, and link data 52f indicating that user E is related to user F. Users with the same address are presumed to live together. In this embodiment, such users are therefore associated with each other.

As a further example, as shown in FIG. 7, suppose that the attribute data group of user G is registered in the e-commerce transaction system 40, the attribute data group of user H is registered in the golf course reservation system 42, and the attribute data group of user I is registered in the travel reservation system 44. Suppose further that the credit card number data registered for user G in the e-commerce transaction system 40, for user H in the golf course reservation system 42, and for user I in the travel reservation system 44 all have the same value.

In this case, the graph data generation unit 21 generates, as shown in FIG. 8, graph data containing node data 50g associated with user G, node data 50h associated with user H, node data 50i associated with user I, link data 52g indicating that user G is related to user H, link data 52h indicating that user G is related to user I, and link data 52i indicating that user H is related to user I. Users with the same credit card number are presumed to be family members, such as parent and child. In this embodiment, such users are therefore associated with each other.

The criteria for judging whether two users form a pair of related users are not limited to those described above. Pairs of users can be judged on a variety of criteria, such as location history or behavior history.

A link represented by link data 52 that associates users identified as related to each other, as described above, is called an explicit link. Now suppose, for example, that the users connected to a first user by explicit links and the users connected to a second user by explicit links have at least a predetermined number of users (for example, three or more) in common. In this case, in the present embodiment, the graph data generation unit 21 generates, for example, link data 52 indicating that the first user is related to the second user. A link represented by link data 52 generated in this way is called an implicit link.

For example, as shown in FIG. 9, suppose that node data 50j associated with user J and node data 50k associated with user K are connected by link data 52j representing an explicit link, that node data 50j associated with user J and node data 50l associated with user L are connected by link data 52k representing an explicit link, and that node data 50j associated with user J and node data 50m associated with user M are connected by link data 52l representing an explicit link.

Suppose further that node data 50k associated with user K and node data 50n associated with user N are connected by link data 52m representing an explicit link, that node data 50l associated with user L and node data 50n associated with user N are connected by link data 52n representing an explicit link, and that node data 50m associated with user M and node data 50n associated with user N are connected by link data 52o representing an explicit link.

In this case, the graph data generation unit 21 generates link data 52p indicating that user J is related to user N (link data 52p representing an implicit link). User N is thereby identified as a user related to user J.
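The common-neighbor rule behind the FIG. 9 example can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `find_implicit_links`, the tuple representation of links, and the choice to skip pairs that already have an explicit link are hypothetical, and the learned link prediction that the disclosure also mentions is not modeled here.

```python
from itertools import combinations


def find_implicit_links(explicit_links, min_common=3):
    """Return pairs of users that share at least `min_common` explicit neighbors.

    `explicit_links` is an iterable of (user, user) tuples. Pairs that already
    have an explicit link are skipped (an assumption of this sketch).
    """
    neighbors = {}
    for a, b in explicit_links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    implicit = set()
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already explicitly linked
        if len(neighbors[u] & neighbors[v]) >= min_common:
            implicit.add((u, v))
    return implicit


# Explicit links of FIG. 9: J-K, J-L, J-M, K-N, L-N, M-N.
explicit = {("J", "K"), ("J", "L"), ("J", "M"),
            ("K", "N"), ("L", "N"), ("M", "N")}
implicit = find_implicit_links(explicit, min_common=3)
print(implicit)
```

Users J and N share the three neighbors K, L, and M, so the sketch yields the single implicit link J-N, corresponding to link data 52p. The pairs K-L, K-M, and L-M share only two neighbors (J and N) and stay unlinked at the threshold of three.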

Suppose also, for example, that the users connected to a first user by explicit or implicit links and the users connected to a second user by explicit or implicit links have at least a predetermined number of users (for example, three or more) in common. In this case too, the graph data generation unit 21 may generate link data 52 indicating that the first user is related to the second user (link data 52 representing an implicit link).

The reference user identification unit 22 refers to the graph data generated by the graph data generation unit 21 and identifies, among the users contained in that graph data, other users who are related to the target user as reference users for the target user. Here, the reference user identification unit 22 may identify as reference users both the users identified as related to the target user and the users who have in common with the target user at least a predetermined number of users identified as related. The reference user identification unit 22 may also identify reference users from among a plurality of users based on the attributes of the target user and the attributes of those users.

The reference user identification unit 22 may also, for example, identify as a reference user for the target user any user associated with node data 50 that is connected, by link data 52 representing an explicit or implicit link, to the node data 50 associated with the target user.
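Selecting reference users as the direct neighbors of the target user in the graph can be sketched as follows. This is a simplified illustration: the function name `find_reference_users` and the representation of links as tuples are assumptions, and the attribute-based selection mentioned above is not covered.

```python
def find_reference_users(target, explicit_links, implicit_links=()):
    """Return users directly linked to `target` by an explicit or implicit link.

    Both link collections are iterables of (user, user) tuples; the link
    direction carries no meaning in this sketch.
    """
    reference_users = set()
    for a, b in list(explicit_links) + list(implicit_links):
        if a == target:
            reference_users.add(b)
        elif b == target:
            reference_users.add(a)
    return reference_users


# Links from the FIG. 9 example, with J-N as the implicit link.
explicit = [("J", "K"), ("J", "L"), ("J", "M"),
            ("K", "N"), ("L", "N"), ("M", "N")]
implicit = [("J", "N")]
refs_for_j = find_reference_users("J", explicit, implicit)
print(sorted(refs_for_j))
```

With user J as the target, K, L, and M are reached through explicit links and N through the implicit link, so all four are treated as reference users in this sketch.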

The relationship identification unit 23 identifies the relationships between users. The relationships identified here are, for example, (1) a parent-child or spousal relationship between people living in the same household, (2) a friendship, and (3) a relationship between people working at the same workplace. The relationships identified are not, however, limited to the examples in the present disclosure. In this embodiment, the relationship identification unit 23 identifies the relationship between users from the result of clustering based on values associated with the relationship between the users. There is no restriction on the kinds of values that can be used as values associated with the relationship between users; they may include, for example, at least one of the users' name, IP address, address, credit card number, age, gender, place of study, workplace, and place of stay.

The relationship identification unit 23 identifies the relationship between the target user and a reference user. Here, the relationship identification unit 23 may identify that relationship based on the attribute data group of the target user and the attribute data group of the reference user. The computer system in which the target user's attribute data group is registered may differ from the computer system in which the reference user's attribute data group is registered. For example, the relationship between the target user and the reference user may be identified based on the target user's attribute data group registered in the e-commerce transaction system 40 and the reference user's attribute data group registered in the golf course reservation system 42.

The relationship identification unit 23, for example, identifies pairs of node data 50 connected by link data 52. The relationship identification unit 23 then generates pair attribute data associated with each pair, based on the user attribute data groups of the two users associated with that pair. Here, the pair attribute data contains, for example, an IP-common flag, an address-common flag, a credit-card-number-common flag, a same-surname flag, age difference data, pair gender data, a place-of-study-common flag, a workplace-common flag, and a place-of-stay-common flag.

The IP-common flag is, for example, a flag indicating whether the value of the IP address data in the attribute data of one user in the pair is the same as the value of the IP address data in the attribute data of the other. For example, the IP-common flag may be set to 1 when the IP address values are the same and to 0 when they differ.

The address-common flag, place-of-study-common flag, workplace-common flag, and place-of-stay-common flag are, for example, flags indicating whether the value of the address data, place-of-study data, workplace data, or place-of-stay data, respectively, in the attribute data group of one user in the pair is the same as the corresponding value in the attribute data group of the other. For example, the address-common flag is set to 1 when the address values are the same and to 0 when they differ.

The common credit card number flag indicates, for example, whether the credit card number in the attribute data group of one user of the pair equals the credit card number in the attribute data group of the other user. For example, the flag is set to 1 when the two credit card numbers are the same and to 0 when they differ.

The same-surname flag indicates, for example, whether the surname given by the name data in the attribute data group of one user of the pair equals the surname given by the name data in the attribute data group of the other user. For example, the flag is set to 1 when the surnames are the same and to 0 when they differ.

The age difference data represents, for example, the difference between the age value in the attribute data group of one user of the pair and the age value in the attribute data group of the other user.

The pair gender data represents, for example, the combination of the gender value in the attribute data group of one user of the pair and the gender value in the attribute data group of the other user.
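The pair attributes described above can be sketched in a few lines of Python. This is only an illustration: the class name, field names, and dictionary keys are assumptions, as are the choices to treat a missing value as "not common" and to store the age difference as an absolute value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserAttributes:
    # Field names are illustrative assumptions, not taken from the description.
    ip_address: Optional[str]
    address: Optional[str]
    card_number: Optional[str]
    surname: Optional[str]
    age: Optional[int]
    gender: Optional[str]


def common_flag(a, b) -> int:
    """Return 1 when both values are present and equal, otherwise 0 (assumed policy)."""
    return 1 if a is not None and a == b else 0


def pair_attributes(u: UserAttributes, v: UserAttributes) -> dict:
    """Build pair attribute data from the attribute data of two users."""
    both_ages = u.age is not None and v.age is not None
    return {
        "ip_common": common_flag(u.ip_address, v.ip_address),
        "address_common": common_flag(u.address, v.address),
        "card_common": common_flag(u.card_number, v.card_number),
        "same_surname": common_flag(u.surname, v.surname),
        # Absolute difference is an assumption; the text only says "difference".
        "age_diff": abs(u.age - v.age) if both_ages else None,
        # Sorting makes the combination independent of the order of the pair.
        "pair_gender": tuple(sorted((u.gender or "?", v.gender or "?"))),
    }
```

The school, workplace, and stay-place flags would follow the same `common_flag` pattern once the corresponding fields are added.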

The relationship specifying unit 23 then performs clustering with a general clustering technique, based on the values of the pair attribute data associated with each of the pairs, and classifies the pairs into a plurality of clusters 54 as shown in FIG. 10.

FIG. 10 schematically shows an example in which the pairs are classified into five clusters 54 (54a, 54b, 54c, 54d, and 54e). Each cross mark in FIG. 10 corresponds to one pair and is placed at the position that corresponds to the values of that pair's pair attribute data. Although the pairs are classified into five clusters 54 in the example of FIG. 10, the number of clusters 54 is not limited to five; for example, the pairs may be classified into four clusters 54.
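The description only says that a general clustering technique is used. As one possible instance, the sketch below applies plain k-means to pairs that have already been encoded as numeric vectors. The choice of k-means, the function name, and the encoding (flags as 0/1, a scaled age difference) are all assumptions made for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical encoding: [address_common, same_surname, age_diff / 50]
pairs = [[1, 1, 0.0], [1, 1, 0.1], [0, 0, 0.9], [0, 0, 1.0]]
labels = kmeans(pairs, k=2)
```

In practice the number of clusters and the feature scaling would need tuning, since an unscaled age difference would dominate the 0/1 flags.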

FIG. 11 shows an example of visualizing the classification when the pairs are classified into four clusters 54. As shown in FIG. 11, pairs with the same address, the same gender, an age difference greater than X years, and the same surname may be classified into a first cluster. Pairs with the same address, the same gender, an age difference of X years or less, and the same surname may be classified into a second cluster. Pairs with the same address, different genders, an age difference greater than Y years, and the same surname may be classified into a third cluster. Pairs with the same address, different genders, an age difference of Y years or less, and the same surname may be classified into a fourth cluster.

In this case, the first cluster can be presumed to correspond to, for example, a parent and child of the same sex. The second cluster can be presumed to correspond to siblings of the same sex, the third cluster to a parent and child of opposite sexes, and the fourth cluster to a married couple.
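The interpretation of the four clusters in FIG. 11 can be written as a small decision rule. The sketch below is illustrative only: the dictionary keys and the threshold values passed for X and Y are hypothetical, and in the described system the grouping comes from clustering rather than from hand-written rules.

```python
def relationship_label(pair, x_years, y_years):
    """Map a pair's attributes to the four clusters illustrated in FIG. 11."""
    if not (pair["address_common"] and pair["same_surname"]):
        return None  # outside the four illustrated clusters
    if pair["same_gender"]:
        if pair["age_diff"] > x_years:
            return "same-sex parent and child"      # first cluster
        return "same-sex siblings"                  # second cluster
    if pair["age_diff"] > y_years:
        return "opposite-sex parent and child"      # third cluster
    return "married couple"                         # fourth cluster


example = {"address_common": 1, "same_surname": 1, "same_gender": 0, "age_diff": 3}
label = relationship_label(example, x_years=18, y_years=18)
```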

As described above, the relationship specifying unit 23 may specify the relationship between the target user and the reference user based on the result of clustering that uses values associated with the relationship between users. Clusters for friendships or for users working at the same workplace can be formed by clustering based on the common school flag, common workplace flag, and common stay-place flag; a concrete example would be essentially the same as the one described above and is therefore omitted. The relationship specifying unit 23 may also specify the relationship between the target user and the reference user based on the result of clustering that uses at least one of surname, IP address, address, credit card number, age difference, gender, school, workplace, and stay place.

The relationship strength determination unit 24 determines a relationship strength (hereinafter also called the "proximity score") that indicates how close the target user and the reference user are. It does so according to a judgment criterion that corresponds to the relationship between the two users, based on indicators of how strong that relationship is. In this embodiment, the relationship strength determination unit 24 determines the proximity score from the output obtained when data representing those indicators is input to a trained machine learning model corresponding to the relationship between the target user and the reference user.

The relationship strength determination unit 24 may include one trained machine learning model for each of the clusters 54 described above. For example, when the pairs are classified into five clusters 54, the unit may include five machine learning models. The unit may then determine the proximity score between the target user and the reference user from the output obtained when data representing the indicators of the strength of their relationship is input to the trained model corresponding to that relationship. In this case, the input-output relationship implemented in the trained model corresponds to the judgment criterion mentioned above.

As shown in FIG. 12, the relationship strength determination unit 24 may input, to the n-th machine learning model, the input data for a pair that has been classified into the cluster 54 associated with that model. For example, when the unit includes five machine learning models, n is any integer from 1 to 5. The unit may then take the value of the output data that the n-th model produces for that input data as the proximity score of the pair.
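The routing of a pair to the model for its cluster can be sketched as a simple lookup. The stand-in "models" below are hypothetical threshold rules on a single feature (a call count), used only so that the example runs; a real system would load trained models instead.

```python
from typing import Callable, Dict, Sequence

# A model is anything that maps a feature vector to a proximity score.
Model = Callable[[Sequence[float]], float]


def proximity_score(cluster_id: int, features: Sequence[float],
                    models: Dict[int, Model]) -> float:
    """Feed the pair's input data to the model associated with its cluster."""
    return models[cluster_id](features)


# Hypothetical stand-ins for trained models; features[0] is a call count.
models: Dict[int, Model] = {
    1: lambda f: 1.0 if f[0] >= 100 else 0.0,
    2: lambda f: 1.0 if f[0] >= 500 else 0.0,
}

score_in_cluster_1 = proximity_score(1, [120.0], models)
score_in_cluster_2 = proximity_score(2, [120.0], models)
```

The same input yields different scores in the two clusters, which reflects the idea that each relationship type has its own judgment criterion.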

The input data for a pair may include, for example, part or all of the pair attribute data for that pair. It may also include data that is not part of the pair attribute data, such as data representing the usage history of the e-commerce transaction system 40 or data that the relationship strength determination unit 24 obtains from other information sources such as social networking services. More specifically, the input data may include, for example, the number of calls or messages exchanged between the two users per unit period, the number of gifts one user has sent to the other, and the number of friends the two users have in common.

The kinds of data included in the input data for a pair may be the same or may differ depending on the cluster 54 to which the pair belongs. For example, the kinds of data included in the input data for the first machine learning model may differ from those included in the input data for the second machine learning model.

In this embodiment, for example, the n-th machine learning model is trained in advance, before the relationship strength determination unit 24 determines any proximity score, using a given set of training data associated with that model. The training data is prepared in advance so that the proximity scores determined within the cluster 54 associated with the n-th model are appropriate. The proximity score set in the training data may be one set (annotated) according to rules, or one that a machine learning model output in the past and that an administrator or the like subsequently corrected.

The n-th machine learning model may be trained by weakly supervised learning. For example, each item of training data may include learning input data, which contains the same kinds of data as the input data fed to the n-th model, and teacher data, which is compared with the output data that the n-th model produces for that learning input data.

Here, for example, the proximity score takes a value of either 0 or 1: it is set to 1 when the pair is in a close relationship and to 0 otherwise. In this case, the teacher data may contain an appropriate proximity score value for the corresponding learning input data, together with data indicating the probability that this value is appropriate. Weakly supervised learning that updates the parameter values of the n-th model may then be performed based on, for example, the output value that the model produces for the learning input data in an item of training data and the value of the teacher data in that same item.
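The description does not specify how the probability attached to each label enters the parameter update. One common option, shown here purely as an assumption, is to convert the label and its probability into a soft target and take a gradient step on the cross-entropy loss of a logistic model. The function names and the logistic model itself are illustrative choices.

```python
import math


def soft_target(label: int, confidence: float) -> float:
    """Probability that the true score is 1, given a noisy label and its confidence."""
    return confidence if label == 1 else 1.0 - confidence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(weights, features, label, confidence, lr=0.1):
    """One gradient step of logistic regression on the soft-target cross-entropy."""
    pred = sigmoid(sum(w * x for w, x in zip(weights, features)))
    # For cross-entropy with a sigmoid output, dLoss/dlogit = pred - target.
    error = pred - soft_target(label, confidence)
    return [w - lr * error * x for w, x in zip(weights, features)]


# A label of 1 that is believed to be correct with probability 0.9.
updated = sgd_step([0.0, 0.0], [1.0, 2.0], label=1, confidence=0.9)
```

A label with confidence 0.5 produces a target of 0.5 and so contributes no pull in either direction from an untrained model, which is the intended effect of down-weighting uncertain labels.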

The proximity score does not have to be binary data that takes only the value 0 or 1. For example, it may be a real value that grows larger the closer the pair's relationship is (for example, a real value from 0 to 10), or a multi-level integer value (for example, an integer from 1 to 10).

The training method for the machine learning models is not limited to weakly supervised learning. As a concrete example, consider a pair of siblings. The input data for that pair is input to the trained machine learning model corresponding to the sibling relationship. The model may be trained so that, for example, it outputs 1 when the two users' address values are the same, one user has sent the other 50 gifts, and the pair has made 1,200 calls to date. Likewise, it may be trained to output 0 when the address values differ, one user has sent the other 2 gifts, and the pair has made 30 calls to date. The criterion (for example, a threshold) for deciding whether the output corresponding to the proximity score is 1 or 0 may differ from model to model.

The attribute selection unit 25 selects the types of attribute data to be generated by the attribute generation unit 26 (that is, the types of attribute data to be complemented) according to the type of relationship between the target user and the reference user. Examples of relationship types and of the attribute data types selected for each are given below.

(1) Parent-child or spousal relationship within the same household
When the users are a parent and child or a married couple living in the same household, it can generally be assumed that money-related variables and variables describing the behavior of the household as a whole are the same for both. When this relationship is identified between users, the attribute selection unit 25 may therefore select, as the types of attribute data to be generated by the attribute generation unit 26, for example household income, annual household income, place of residence, whether the household has insurance, amount of savings, financial assets, and whether the household subscribes to a newspaper.

(2) Friendship
When the users are friends, it can be assumed that people of the same gender, age, and interests are more likely to become friends. When this relationship is identified between users, the attribute selection unit 25 may therefore select, as the types of attribute data to be generated by the attribute generation unit 26, for example interests, frequently visited places or areas, age, and gender.

(3) Working at the same workplace
When the users work at the same workplace, it can be assumed that people with the same education level and field of expertise often work together. When this relationship is identified between users, the attribute selection unit 25 may therefore select, as the types of attribute data to be generated by the attribute generation unit 26, for example the categories of specialized books purchased and education level.
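The rule-based selection in items (1) to (3) amounts to a lookup table from relationship type to attribute types. A minimal sketch follows; the key names and attribute identifiers are hypothetical labels chosen for illustration.

```python
# Relationship type -> attribute types to generate (identifiers are illustrative).
ATTRIBUTES_BY_RELATIONSHIP = {
    "same_household_family": [
        "household_income", "annual_household_income", "residence",
        "household_insurance", "savings", "financial_assets",
        "newspaper_subscription",
    ],
    "friends": ["interests", "frequent_places", "age", "gender"],
    "coworkers": ["specialized_book_categories", "education_level"],
}


def select_attribute_types(relationship: str) -> list:
    """Return the attribute types to complement for a relationship type."""
    # Unknown relationship types select nothing (an assumed default).
    return list(ATTRIBUTES_BY_RELATIONSHIP.get(relationship, []))


selected = select_attribute_types("friends")
```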

This embodiment describes a rule-based method by which the attribute selection unit 25 selects the types of attribute data to be complemented (generated), but the selection method is not limited to this example. For example, the types may be selected with a machine learning model that has learned whether, or how strongly, each type of relationship between users correlates with the types of attribute data that tend to be similar between them.

The attribute generation unit 26 generates the attribute data needed to complement attribute data that is missing or of low reliability in the target user's attribute data group, based on information about at least one reference user identified for the target user. As the information about the reference user, the attribute generation unit 26 refers to the attribute data of the types selected by the attribute selection unit 25 within the reference user's attribute data group, and generates the target user's attribute data corresponding to the attribute data referred to.

Specifically, when the relationship between the target user and the reference user is "(1) Parent-child or spousal relationship within the same household", the attribute generation unit 26 refers to the reference user's attribute data for household income, annual household income, place of residence, whether the household has insurance, amount of savings, financial assets, newspaper subscription, and so on, and generates the corresponding attribute data of the target user from it. When the relationship is "(2) Friendship", the unit refers to the reference user's attribute data for interests, frequently visited places or areas, age, gender, and so on, and generates the corresponding attribute data of the target user from it. When the relationship is "(3) Working at the same workplace", the unit refers to the reference user's attribute data for categories of specialized books purchased, education level, and so on, and generates the corresponding attribute data of the target user from it.

The attribute generation unit 26 may generate the target user's attribute data by copying the parameters of the reference user's attribute data directly into the corresponding attribute data of the target user. Alternatively, it may generate the corresponding attribute data by applying some processing to those parameters. For example, when generating the target user's attribute data, the unit may refer to the proximity score determined for the reference user and generate the attribute data based on both the parameters of the reference user's attribute data and the proximity score.

For example, the attribute generation unit 26 may generate the target user's attribute data by weighting the parameters of the reference user's attribute data with a weight determined from the proximity score. In this case, the unit sets a larger weighting coefficient the stronger the relationship indicated by the proximity score between the target user and the reference user. Processing the parameters of the reference user's attribute data with this weighting coefficient (for example, simply multiplying each parameter by the coefficient) brings the parameters complemented for the target user closer to those of the reference user referred to.

When a plurality of reference users are identified, the target user's attribute data may be generated from all of them. For example, the attribute generation unit 26 obtains, for each reference user, the proximity score and the parameter of the attribute data to be complemented, weights each obtained parameter according to its proximity score, and takes the average of the resulting weighted parameters as the parameter of the target user's corresponding attribute data. Other statistics, such as the median, may be used instead of the average.
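The weighted combination over several reference users can be sketched as follows. The description speaks of averaging the weighted parameters; dividing by the sum of the weights, as done here, is an added assumption that keeps the result on the same scale as the inputs. The function name and the sample figures are hypothetical.

```python
def generate_attribute(references):
    """Combine reference users' values into one value for the target user.

    references: list of (parameter_value, proximity_score) tuples,
    one per reference user.
    """
    total_weight = sum(score for _, score in references)
    if total_weight == 0:
        return None  # no reference user is close enough to contribute
    # Normalised weighted mean (the normalisation is an assumption).
    return sum(value * score for value, score in references) / total_weight


# Two reference users: a close one (0.9) and a distant one (0.1).
household_income = generate_attribute([(600.0, 0.9), (400.0, 0.1)])
```

With these inputs the result lies much nearer to the close user's value, which matches the stated aim that a higher proximity score pulls the generated parameter toward that reference user.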

The attribute generation unit 26 may also generate the target user's attribute data with an attribute generation model. The model takes as input at least some parameters of the target user's attribute data group before complementing, at least some parameters of the reference user's attribute data group, and the proximity score between the target user and the reference user, and it outputs the attribute data of the target user to be complemented. As with weighting, the attribute generation model is generated and/or updated so that the higher the proximity score between the target user and the reference user, the closer the parameters complemented for the target user are to those of the reference user referred to. Also as with weighting, proximity scores and attribute data for a plurality of reference users can be input to the attribute generation model to obtain the parameters of the target user's attribute data to be complemented.

The attribute complementing unit 27 complements a user's attribute data group based on at least part of the generated attribute data. A user's attribute data group includes attribute data containing the account data and usage history data obtained from the service providing system 5. The attribute complementing unit 27 adopts at least part of the attribute data generated by the attribute generation unit 26 as at least part of the target user's attribute data group, thereby complementing that group.
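A minimal sketch of the complementing step is shown below, assuming that attributes are held in a dictionary, that a missing attribute is represented by `None` or an absent key, and that each attribute may carry a reliability value between 0 and 1. The reliability threshold and all names are hypothetical.

```python
def complement(attributes, generated, reliability=None, min_reliability=0.5):
    """Fill missing or low-reliability attributes with generated values.

    attributes:  the target user's current attribute data group
    generated:   attribute data produced for the target user
    reliability: optional per-attribute reliability in [0, 1] (assumed format)
    """
    reliability = reliability or {}
    result = dict(attributes)  # leave the caller's data untouched
    for key, value in generated.items():
        missing = result.get(key) is None
        unreliable = reliability.get(key, 1.0) < min_reliability
        if missing or unreliable:
            result[key] = value
    return result


completed = complement(
    {"age": 34, "household_income": None, "residence": "Tokyo"},
    {"household_income": 580.0, "residence": "Osaka"},
    reliability={"residence": 0.2},
)
```

Attributes that are present and sufficiently reliable are kept as they are, so generated values never overwrite trusted data.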

The attribute data complemented by the attribute complementing unit 27 may include demographic attributes, behavioral attributes, and psychographic attributes. Demographic attributes are, for example, the user's gender, family composition, and age. Behavioral attributes are, for example, whether electronic money is used, whether fixed-limit payment is used, the deposit and withdrawal history of a given account, and the transaction history for certain products including gambling or lotteries (which may include online transaction history in online marketplaces and the like). Psychographic attributes are, for example, an interest in gambling or lotteries. The usable user attributes are not limited to these examples. For example, "time required for customer support (outbound calls, etc.)" from call center services and "credit card usage amount / deferred payment usage amount" may also be used as attribute data.

The user score estimation unit 28 estimates the user score to be set for a user based on the complemented attribute data group. In this embodiment, the user score estimation unit 28 estimates the user score by inputting the user's attribute data group into a user score estimation model. The output of the model is a user score normalized so that its minimum is 0 and its maximum is 1. The target user's attribute data group input to the model includes the attribute data generated by the attribute generation unit 26. As described above, that generated attribute data may include, for example, household income, annual household income, place of residence, whether the household has insurance, amount of savings, financial assets, newspaper subscription, interests, frequently visited places or areas, age, gender, categories of specialized books purchased, and education level.

The machine learning unit 29 generates and/or updates the user score estimation model used by the user score estimation unit 28. The user score estimation model may be a machine learning model that, given one or more items of attribute data (an attribute data group) about the target user, outputs a user score representing some measure associated with the user (for example, creditworthiness). It may also be some other function or statistical model capable of outputting a user score.

When generating and/or updating the user score estimation model, the machine learning unit 29 creates training data for each user from the data obtained from the service providing system 5. In the training data, the user's attribute data group, including demographic attributes, is defined as the input value, and the user score for that user is defined as the output value. The machine learning unit 29 then creates the user score estimation model from this training data. As described above, the attribute data group input to the model includes attribute data generated by the attribute generation unit 26; this data is combined with the corresponding user's score and supplied to the machine learning unit 29 as training data. The user score set in the training data may be one set (annotated) according to rules, or one that the user score estimation model output in the past and that an administrator or the like subsequently corrected.

The machine learning model generation framework that can be adopted when implementing the technology according to the present disclosure is, as one example, based on an ensemble learning algorithm. For instance, a machine learning framework based on a gradient boosting decision tree (GBDT), such as LightGBM, can be adopted. In other words, the framework can be one based on a decision tree model in which the error between the correct answer and the predicted value is passed on from each weak learner (weak classifier) to the next. The predicted value here is, for example, the predicted value of the user score. Besides LightGBM, other boosting methods such as XGBoost or CatBoost may also be adopted. Compared with a framework using a neural network, a framework using decision trees can produce a machine learning model with comparatively high performance with less parameter tuning effort. However, the machine learning model generation framework that can be adopted when implementing the technology according to the present disclosure is not limited to the example given in this embodiment. For example, another learner such as a random forest may be used in place of the gradient boosting decision tree, or a learner that is not a so-called weak learner, such as a neural network, may be adopted. In particular, when a learner that is not a so-called weak learner, such as a neural network, is adopted, ensemble learning need not be used.
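
The way each weak learner inherits the error left by its predecessors can be sketched from scratch as follows. This is a minimal illustration under stated assumptions, not the implementation of LightGBM, XGBoost, or CatBoost: it uses a single numeric attribute, one-split trees (stumps), squared error, and hypothetical function names (`fit_stump`, `fit_boosting`, `predict_score`).

```python
def fit_stump(xs, residuals):
    """Fit one weak learner: a single threshold split on one numeric attribute.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=100, learning_rate=0.5):
    """Each round fits a stump to the residual left by all previous rounds."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # The residual is the error handed on to the next weak learner.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict_score(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    total = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if x <= threshold else right_mean)
    return total
```

In practice one of the named libraries would normally be used instead; the sketch only makes the residual hand-off between successive learners visible.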

FIG. 13 is a simplified diagram of the concept of a decision tree of the machine learning model adopted in this embodiment. When a machine learning framework using gradient boosting based on a decision tree algorithm is adopted, the branching condition of each node of the decision tree is optimized. Specifically, in such a framework, a user score is calculated for each of the user groups having the attributes represented by the two child nodes that branch from one parent node, and the branching condition of the parent node is optimized so that the difference between these user scores becomes large (for example, so that the difference is maximized or becomes equal to or greater than a predetermined threshold), that is, so that the two child nodes are clearly separated. For example, when the attribute expressed as the branching condition of a node is age, the age set as the branching threshold may be changed, or the branching condition may be changed to an attribute other than age. By recursively optimizing the branching conditions of all nodes of the decision tree in this way, the accuracy of estimating the user score based on the attribute data group can be improved.
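
The optimization of a single parent node's branching condition can be illustrated as follows. The sketch follows the wording above (maximize the difference between the mean user scores of the two child groups) and is not claimed to be the split gain that any particular library computes. The function name `best_split`, the `"score"` key, and the attribute names used in the usage example are assumptions.

```python
def best_split(users, attributes, score_key="score"):
    """Return (attribute, threshold, gap) maximizing the child-group score gap.

    users: list of dicts holding numeric attributes and a user score.
    A user goes to the left child when its value is <= threshold.
    """
    best = (None, None, -1.0)
    for name in attributes:
        values = sorted({user[name] for user in users})
        # Excluding the largest value keeps both child groups non-empty.
        for threshold in values[:-1]:
            left = [u[score_key] for u in users if u[name] <= threshold]
            right = [u[score_key] for u in users if u[name] > threshold]
            gap = abs(sum(left) / len(left) - sum(right) / len(right))
            if gap > best[2]:
                best = (name, threshold, gap)
    return best
```

Called with both `"age"` and another attribute, the function may change the threshold or switch to the other attribute, which corresponds to the two kinds of change to the branching condition mentioned above.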

Further, when the attribute generation unit 26 uses an attribute generation model to generate the attribute data to be complemented, the machine learning unit 29 also generates and/or updates the attribute generation model used by the attribute generation unit 26 to generate the complement-target attribute data of the target user. The attribute generation model is a machine learning model that, when one or more items of attribute data and proximity scores relating to one or more reference users are input, outputs the complement-target attribute data for the target user.

When generating and/or updating the attribute generation model, the machine learning unit 29 creates training data in which the attribute data and proximity scores of one or more reference users, taken from the data acquired from the service providing system 5, are defined as input values, and one item of attribute data (the complement-target attribute data for the target user) is defined as the output value. Here, the output value set in the training data used for generating and/or updating the attribute generation model (the parameter of the target user's complement-target attribute data) may be an output value set (annotated) based on rules (for example, the weighting-based calculation method described above). Alternatively, it may be an output value that was output by the attribute generation model in the past and subsequently corrected by an administrator or the like.
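
One possible layout for a single training example of the attribute generation model is sketched below. The embodiment does not specify how a variable number of reference users is turned into model input; the fixed number of slots `k`, the ordering by proximity score, the padding value, and the name `make_training_row` are all assumptions made for illustration.

```python
def make_training_row(reference_users, label, k=3, pad=0.0):
    """Build one (input values, output value) pair.

    reference_users: list of (attribute_value, proximity_score) pairs.
    label: the complement-target attribute value (rule-based or corrected).
    """
    # Assumption: the most closely related reference users fill the first slots.
    ranked = sorted(reference_users, key=lambda pair: pair[1], reverse=True)[:k]
    features = []
    for value, proximity in ranked:
        features += [value, proximity]
    # Pad so that every row has the same length regardless of neighbour count.
    features += [pad] * (2 * (k - len(ranked)))
    return features, label
```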

The machine learning unit 29 then generates or updates the attribute generation model based on this training data. The one or more items of attribute data and proximity scores are combined with the corresponding attribute data and input to the machine learning unit 29 as training data. As with the user score estimation model described above, the machine learning model generation framework that can be adopted for generating or updating the attribute generation model is not limited, and a machine learning framework using gradient boosting based on a decision tree algorithm can be adopted.

<Processing flow>
Next, the flow of processing executed by the information processing system according to this embodiment will be described. The specific content and order of the processing described below are one example for implementing the present disclosure. The specific processing content and order may be selected as appropriate according to the embodiment of the present disclosure.

FIG. 14 is a flowchart showing the flow of the machine learning processing according to this embodiment. The processing shown in this flowchart is executed at a timing specified by an administrator.

In this embodiment, the user score estimation model is generated and/or updated in the machine learning processing. The machine learning unit 29 creates training data containing combinations of the attribute data group of each user accumulated in the past in the service providing system 5 and the user score determined in advance for the corresponding user (step S101). The machine learning unit 29 then inputs the created training data into the user score estimation model, thereby generating or updating the user score estimation model that the user score estimation unit 28 uses for user score estimation (step S102). The processing shown in this flowchart then ends. When the attribute generation unit 26 uses an attribute generation model for attribute complementation, the attribute generation model may be generated and/or updated by the same processing flow.

FIG. 15 is a flowchart showing the flow of the user score estimation processing according to this embodiment. The processing shown in this flowchart is executed for each target user at a timing specified by an administrator. Here, a target user is a user whose attribute data has missing items or whose attribute data has low reliability. Examples of low-reliability attribute data include attribute data generated from an insufficient amount of accumulated history data, and attribute data that clearly contradicts the content of other attribute data. It is assumed here that graph data for a plurality of users including the target user has already been generated, and that each machine learning model has already been trained.

In steps S201 and S202, the reference users are identified, and the relationship between the target user and each reference user is identified. The reference user identification unit 22 refers to the graph data and identifies, as reference users, one or more other users corresponding to node data 50 that is connected by an explicit link or an implicit link to the node data 50 corresponding to the target user (step S201). Then, for each pair consisting of the target user and one of the reference users identified in step S201, the relationship identification unit 23 identifies the type of relationship between the users (specifically, a parent-child relationship within the same household, a marital relationship, a friend relationship, a relationship of working at the same workplace, and so on) (step S202). The processing then proceeds to step S203.
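
Steps S201 and S202 can be pictured with a small sketch. It assumes, purely for illustration, that each item of link data already carries the relationship type; in the embodiment the relationship is identified by the relationship identification unit 23, and the `Link` type, its field names, and the relationship labels below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    user_a: str
    user_b: str
    kind: str          # "explicit" or "implicit"
    relationship: str  # e.g. "spouse", "parent_child", "friend", "coworker"


def find_reference_users(links, target):
    """Map each user directly linked to `target` to the relationship type."""
    references = {}
    for link in links:
        # Links are treated as undirected: either endpoint may be the target.
        if link.user_a == target:
            references[link.user_b] = link.relationship
        elif link.user_b == target:
            references[link.user_a] = link.relationship
    return references
```

Both explicit and implicit links are followed, matching the description of step S201.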

In steps S203 and S204, the type of attribute data to be complemented is selected, and the proximity score between users is determined. The attribute selection unit 25 selects the type of attribute data to be complemented for the target user according to the type of relationship identified in step S202 (step S203). The relationship strength determination unit 24 determines, for each pair consisting of the target user and a reference user, the value of the proximity score to be associated with that pair (step S204). The processing then proceeds to step S205.
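
The selection in step S203 could be as simple as a lookup keyed by relationship type, as in the following sketch. The table contents, the attribute names, and the additional restriction to attributes that are actually missing for the target user are assumptions; the embodiment states only that the selection depends on the type of relationship.

```python
# Hypothetical mapping from relationship type to attribute types that a
# reference user in that relationship can reasonably supply.
ATTRIBUTES_BY_RELATIONSHIP = {
    "spouse": ["household_income", "home_address"],
    "parent_child": ["home_address"],
    "coworker": ["workplace"],
    "friend": ["hobby"],
}


def select_attribute_types(relationship, missing):
    """Return the attribute types to complement for one target/reference pair."""
    candidates = ATTRIBUTES_BY_RELATIONSHIP.get(relationship, [])
    # Assumption: only attributes the target user lacks are complement targets.
    return [name for name in candidates if name in missing]
```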

In step S205, the attribute data to be complemented for the target user is generated. The attribute generation unit 26 generates this attribute data based on the parameter of the reference user's attribute data that corresponds to the complement-target attribute data, and on the proximity score determined for that reference user in step S204. The processing then proceeds to step S206.
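
For a numeric attribute, one way to combine the reference users' parameters with their proximity scores is a proximity-weighted mean, sketched below. The exact weighting formula is an assumption; the embodiment only states that both the parameter and the proximity score are used. The name `generate_attribute` is hypothetical.

```python
def generate_attribute(reference_values):
    """Proximity-weighted mean of the reference users' attribute values.

    reference_values: list of (value, proximity_score) pairs.
    Returns None when there is nothing to weight by.
    """
    total_weight = sum(proximity for _, proximity in reference_values)
    if total_weight <= 0:
        return None
    weighted_sum = sum(value * proximity for value, proximity in reference_values)
    return weighted_sum / total_weight
```

A reference user with a higher proximity score therefore pulls the generated value closer to its own attribute value.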

In steps S206 and S207, the user score is estimated and output. The attribute complementing unit 27 adds the complemented attribute data generated in step S205 to the attribute data group held in advance for the target user (for example, acquired from the service providing system 5), and uses the result as the attribute data group of that user (step S206). The user score estimation unit 28 then inputs the attribute data group containing the attribute data complemented for the target user in step S206 into the user score estimation model, and acquires the output value as the user score to be set for the user (step S207). However, the method of estimating the user score is not limited to the example in this embodiment. For example, the user score may include a value calculated by inputting the attribute data group into a predetermined function that is not a machine learning model. The processing shown in this flowchart then ends.
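
Steps S206 and S207 can be sketched as a merge followed by a call to a scoring function. Here the scoring function is passed in as a plain callable so that either a trained model or a predetermined non-machine-learning function can be used; the names `complement_attributes`, `estimate_user_score`, and the example rule `income_rule` are assumptions for illustration only.

```python
def complement_attributes(held, generated):
    """Step S206 sketch: add generated values where the held group has none."""
    merged = dict(held)
    for name, value in generated.items():
        if merged.get(name) is None:
            merged[name] = value
    return merged


def estimate_user_score(attributes, scoring_function):
    """Step S207 sketch: feed the complemented group to a model or function."""
    return scoring_function(attributes)


def income_rule(attributes):
    """Hypothetical non-ML scoring rule: income scaled into the range 0 to 1."""
    return min(1.0, attributes["annual_income"] / 1000)
```

With a trained model, `scoring_function` would wrap that model's prediction call instead of `income_rule`.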

The user score set for each user is provided to other systems such as the service providing system 5, and is used, for example, to customize the services that those systems provide to the target user.

This embodiment can also be used to estimate the user score of a new target user whose corresponding node data 50 is not yet included among the nodes of the graph. For example, node data 50 corresponding to the new target user, and at least one item of link data 52 connected to that node data 50, may be generated based on the user attribute data of the new target user. A user connected by link data 52 to the node data 50 corresponding to the target user may then be identified as a reference user of that target user.

<Effects>
According to this embodiment, a user's missing attributes are complemented based on a social graph network that covers the relationships between users, and the user score is estimated or determined using the complemented attribute group. As a result, a user score can be calculated, or the accuracy of the calculated user score can be improved, even when information on the target user is missing or the reliability of the information is low. In addition, by using a wide variety of user attribute data, a highly accurate user score can be calculated even when attribute data of a certain scope (for example, that of a credit card division) cannot be used because of regulations or laws, or when only part of the attribute data exists for the target user.

<Modifications>
The embodiment described above was explained using the example of an information processing device comprising the graph data generation unit 21, the reference user identification unit 22, the relationship identification unit 23, the relationship strength determination unit 24, the attribute selection unit 25, the attribute generation unit 26, the attribute complementing unit 27, the user score estimation unit 28, and the machine learning unit 29. However, some of these functional units may be omitted to the extent that the invention according to the present disclosure can still be carried out.

For example, in the embodiment described above, the relationship strength (proximity score) between the target user and the reference user is generated or referred to when the complement-target attribute data is generated, but the generation of and reference to the proximity score may be omitted. In this case, among the functional units of the information processing device 1 described with reference to FIG. 2, the relationship strength determination unit 24 can be omitted. The attribute generation unit 26 can then generate the complement-target attribute data of the target user based on the reference user's attribute data, without performing weighting or the like that refers to the proximity score.
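
When the proximity score is omitted, a categorical attribute could, for instance, be generated by taking the most frequent value among the reference users. This choice of aggregation, and the name `generate_attribute_unweighted`, are assumptions; the modification only states that no proximity-based weighting is performed.

```python
from collections import Counter


def generate_attribute_unweighted(reference_values):
    """Return the most common non-missing value among the reference users."""
    values = [value for value in reference_values if value is not None]
    if not values:
        return None
    # Every reference user counts equally; no proximity score is consulted.
    return Counter(values).most_common(1)[0][0]
```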

As another example, the attribute generation unit 26 may generate the target user's attribute data using an attribute generation model that takes, as input values, the parameters of at least part of the reference user's attribute data group and the proximity score between the target user and the reference user, and outputs the attribute data of the target user to be complemented. In this case, the attribute generation model is trained in advance as appropriate for the form of the input and output values.

As another example, the attribute generation unit 26 may generate the target user's attribute data using an attribute generation model that takes, as input values, the parameters of at least part of the target user's attribute data group and/or the parameters of at least part of the reference user's attribute data group, and outputs the attribute data of the target user to be complemented. In this case, the attribute generation model is trained in advance as appropriate for the form of the input and output values. Also in this case, the attribute generation unit 26 selects a predetermined attribute generation model from among a plurality of attribute generation models that differ according to the relationship and/or proximity score between a target user and a reference user. The selection is made according to the type of relationship and/or the proximity score between the target user being processed and the reference user, and the selected model is used to generate the attribute data of the target user to be complemented. Here, each of the plurality of attribute generation models may, for example, be trained in advance on training data whose relationship type and/or proximity score is common or similar (falls within a predetermined range).
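
Selecting among several attribute generation models could be done with a lookup keyed by relationship type and a proximity bucket, as sketched below. The bucket boundaries, the key layout, and the name `select_generation_model` are assumptions; the modification states only that the model is chosen according to the relationship type and/or proximity score.

```python
def select_generation_model(models, relationship, proximity, boundaries=(0.3, 0.7)):
    """Pick the model trained for this relationship type and proximity range.

    models: dict keyed by (relationship, bucket), where bucket is the number
    of boundaries that the proximity score reaches (0, 1, or 2 by default).
    """
    bucket = sum(1 for boundary in boundaries if proximity >= boundary)
    return models[(relationship, bucket)]
```

Each entry of `models` would hold a model trained only on pairs whose relationship type and proximity score fall in the matching range.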

As another example, the attribute generation unit 26 may generate the target user's attribute data using an attribute generation model that takes, as input values, embedded representations (vector representations, feature representations) of the users (the target user and the reference user) on the graph data as the parameters of at least part of the users' attribute data groups, and outputs the attribute data of the target user to be complemented. The input values of the attribute generation model may also include the distance, inner product, or the like between the target user and the reference user on the graph data (that is, the distance, inner product, or the like in a vector space based on the graph data). In this case, the attribute generation model is trained in advance as appropriate for the form of the input and output values.
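
Given embedding vectors for the two users, the input values mentioned above can be assembled as in the following sketch. How the embeddings are obtained from the graph data is outside the scope of this sketch; the Euclidean distance, the order of the concatenated values, and the name `pair_features` are assumptions.

```python
import math


def pair_features(target_vec, reference_vec):
    """Concatenate both embeddings with their distance and inner product."""
    inner = sum(a * b for a, b in zip(target_vec, reference_vec))
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(target_vec, reference_vec)))
    return list(target_vec) + list(reference_vec) + [distance, inner]
```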

As another example, when the attribute data output by the attribute generation model corresponds to a missing value (missing attribute data) or an improper value (low-reliability attribute data) in the target user's attribute data group before complementation, the attribute complementing unit 27 may adopt the output attribute data as part of the target user's attribute data group.
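
The condition described above can be sketched as a guarded overwrite. The per-attribute reliability value, the threshold `min_reliability`, the default applied when no reliability is recorded, and the name `apply_generated` are assumptions; the embodiment does not define how reliability is represented.

```python
def apply_generated(attributes, reliability, name, generated, min_reliability=0.5):
    """Adopt `generated` only for a missing or low-reliability attribute.

    attributes: the target user's attribute data group before complementation.
    reliability: hypothetical per-attribute reliability values in [0, 1];
    an attribute without a recorded value is treated as reliable.
    """
    current = attributes.get(name)
    is_missing = current is None
    is_unreliable = reliability.get(name, 1.0) < min_reliability
    if is_missing or is_unreliable:
        return {**attributes, name: generated}
    # Present and reliable values are left untouched.
    return dict(attributes)
```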

As another example, the attribute selection unit 25 or the attribute complementing unit 27 may treat, as the attribute data to be complemented, attribute data that has a high weight in an ensemble learning model, such as a gradient boosting decision tree, adopted as the user score estimation model or the like. Here, attribute data with a high weight may be, for example, attribute data corresponding to a tree whose weight in the user score estimation model exceeds a predetermined weight, or attribute data corresponding to a tree whose weight ranks high (at or above a predetermined rank) in the user score estimation model.
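
Both selection rules (a weight threshold and a rank cutoff) can be sketched as follows. The sketch assumes that a single weight has already been attached to each attribute type, for example by aggregating the weights of the trees that use it; that aggregation, the parameter names, and the function name `select_complement_targets` are assumptions.

```python
def select_complement_targets(weights, min_weight=None, top_k=None):
    """Return attribute names ordered by weight, filtered by threshold and/or rank.

    weights: dict mapping attribute name to its weight in the score model.
    """
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    if min_weight is not None:
        # Keep only attributes whose weight exceeds the predetermined weight.
        ranked = [(name, weight) for name, weight in ranked if weight > min_weight]
    if top_k is not None:
        # Keep only attributes at or above the predetermined rank.
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]
```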

1: Information processing device
5: Service providing system
11: CPU
12: ROM
13: RAM
14: Storage device
15: Communication unit
21: Graph data generation unit
22: Reference user identification unit
23: Relationship identification unit
24: Relationship strength determination unit
25: Attribute selection unit
26: Attribute generation unit
27: Attribute complementing unit
28: User score estimation unit
29: Machine learning unit
40: Electronic commerce system
42: Golf course reservation system
44: Travel reservation system
46: Card management system
50 (50a, 50b, 50c, 50d, 50e, 50f, 50g, 50h, 50i, 50j, 50k, 50l, 50m, 50n): Node data
52 (52a, 52b, 52c, 52d, 52e, 52f, 52g, 52h, 52i, 52j, 52k, 52l, 52m, 52n, 52o, 52p): Link data
54 (54a, 54b, 54c, 54d, 54e): Cluster

[FIG. 1] A schematic diagram showing the configuration of the information processing system according to the embodiment.
[FIG. 2] A diagram showing an outline of the functional configuration of the information processing device according to the embodiment.
[FIG. 3] A schematic diagram showing an example in which IP address data values are shared in the embodiment.
[FIG. 4] A diagram showing an example of graph data according to the embodiment.
[FIG. 5] A schematic diagram showing an example in which address data values are shared in the embodiment.
[FIG. 6] A diagram showing an example of graph data according to the embodiment.
[FIG. 7] A schematic diagram showing an example in which credit card number data values are shared in the embodiment.
[FIG. 8] A diagram showing an example of graph data according to the embodiment.
[FIG. 9] A diagram showing an example of graph data according to the embodiment.
[FIG. 10] A diagram showing an example of clusters according to the embodiment.
[FIG. 11] A diagram showing an example of visualization of classification according to the embodiment.
[FIG. 12] A diagram showing an example of determining the relationship strength (proximity score) using a machine learning model according to the embodiment.
[FIG. 13] A simplified diagram of the concept of a decision tree of the machine learning model adopted in the embodiment.
[FIG. 14] A flowchart showing the flow of the machine learning processing according to the embodiment.
[FIG. 15] A flowchart showing the flow of the user score estimation processing according to the embodiment.


Claims (15)

1. An information processing system comprising:
reference user identification means for identifying a reference user who has a mutual relationship with a target user;
attribute generation means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
attribute complementing means for complementing a corresponding attribute data group of the target user based on at least part of the generated corresponding attribute data of the target user; and
user score estimation means for estimating a user score to be set for the target user based on the complemented corresponding attribute data group of the target user.

2. The information processing system according to claim 1, wherein the reference user identification means identifies the reference user based on graph data representing relationships between users.

3. The information processing system according to claim 2, further comprising graph data generation means for identifying pairs of mutually related users based on the attribute data group of each of a plurality of users, thereby generating the graph data.

4. The information processing system according to claim 1, further comprising relationship identification means for identifying a relationship between users.

5. The information processing system according to claim 4, wherein the relationship identification means identifies the relationship between the users based on a result of clustering that is based on values associated with relationships between users.

6. The information processing system according to claim 5, wherein the relationship identification means identifies the relationship between the users based on a result of clustering that is based on at least one of the users' name, IP address, address, credit card number, age, gender, place of study, place of work, and place of stay.

7. The information processing system according to claim 4, further comprising relationship strength determination means for determining, in accordance with a judgment criterion corresponding to the relationship between the target user and the reference user and based on an index indicating the strength of the relationship between the target user and the reference user, a relationship strength indicating how close the target user and the reference user are,
wherein the attribute generation means generates the corresponding attribute data of the target user based on, for at least one reference user, information about the reference user and the relationship strength determined for the reference user.

8. The information processing system according to claim 7, wherein the relationship strength determination means determines the relationship strength indicating how close the target user and the reference user are, based on an output obtained when data representing the index is input to a trained machine learning model corresponding to the relationship between the target user and the reference user.

9. The information processing system according to claim 1, further comprising attribute selection means for selecting, according to the type of relationship between the target user and the reference user, the type of attribute data to be generated by the attribute generation means,
wherein the attribute generation means generates the corresponding attribute data of the target user based on attribute data, within the attribute data group of the reference user, of the type selected by the attribute selection means.

10. The information processing system according to claim 1, wherein the user score estimation means estimates the user score to be set for the target user by inputting the attribute data group of the target user into a machine learning model.

11. The information processing system according to claim 10, wherein the user score estimation means estimates the user score using a machine learning model generated using a machine learning framework based on gradient boosting decision trees.

12. The information processing system according to claim 10, wherein the user score estimation means estimates the user score to be set for the target user using the machine learning model generated using training data in which an attribute data group containing a user's demographic attributes is the input value and the user score associated with that user is the output value.

13. The information processing system according to claim 1, wherein the attribute complementing means generates, based on the attribute data of the reference user, the attribute data needed to complement attribute data that is missing or has low reliability in the attribute data group of the target user.

14. An information processing method in which a computer executes:
a reference user identification step of identifying a reference user who has a mutual relationship with a target user;
an attribute generation step of generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementing step of complementing a corresponding attribute data group of the target user based on at least part of the generated corresponding attribute data of the target user; and
a user score estimation step of estimating a user score to be set for the target user based on the complemented corresponding attribute data group of the target user.

15. A program product that causes a computer to function as:
reference user identification means for identifying a reference user who has a mutual relationship with a target user;
attribute generation means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
attribute complementing means for complementing a corresponding attribute data group of the target user based on at least part of the generated corresponding attribute data of the target user; and
user score estimation means for estimating a user score to be set for the target user based on the complemented corresponding attribute data group of the target user.
TW112111662A 2022-03-30 2023-03-28 Information processing system, method and program TW202405723A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-056450 2022-03-30
JP2022056450A JP2023148437A (en) 2022-03-30 2022-03-30 Information processing system, method and program

Publications (1)

Publication Number Publication Date
TW202405723A (en) 2024-02-01

Family

ID=88288291

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112111662A TW202405723A (en) 2022-03-30 2023-03-28 Information processing system, method and program

Country Status (2)

Country Link
JP (1) JP2023148437A (en)
TW (1) TW202405723A (en)

Also Published As

Publication number Publication date
JP2023148437A (en) 2023-10-13

Similar Documents

Publication Publication Date Title
CN109977151B (en) Data analysis method and system
Ahn et al. A survey on churn analysis in various business domains
CN110188198A (en) Anti-fraud method and device based on knowledge graph
JP4529058B2 (en) Distribution system
US11227217B1 (en) Entity transaction attribute determination method and apparatus
Kültür et al. Hybrid approaches for detecting credit card fraud
Klepac Developing churn models using data mining techniques and social network analysis
CN112330373A (en) User behavior analysis method and device and computer readable storage medium
TW202405723A (en) Information processing system, method and program
TWI837066B (en) Information processing devices, methods and program products
TWI843087B (en) Credit determination system, credit determination method and information storage medium
US20220366421A1 (en) Method and system for assessing the reputation of a merchant
Islam An efficient technique for mining bad credit accounts from both olap and oltp
TWI832588B (en) Information processing systems, information processing methods and program products
JP7459189B2 (en) Closeness score determination system, proximity score determination method and program
JP2024000694A (en) Information processing apparatus, method, and program
US20240095822A1 (en) Information processing apparatus, method, and medium
WO2023119577A1 (en) Information processing system, information processing method, and program
JP7345032B1 (en) Credit screening device, method and program
TWI835439B (en) Information processing systems, information processing methods and program products
TW202401337A (en) Reviewing device, reviewing method, and program product including a first score acquisition portion, a second score acquisition portion, a user section specifying portion, and a reviewing result determination portion
CN117372132B (en) User credit score generation method, device, computer equipment and storage medium
TW202414308A (en) Information processing devices, methods and program products
TW202401310A (en) Information processing device, information processing method and program product characterized in that the information processing device can estimate the effect of an operation even if the data required to evaluate the effect of the operation are insufficient
Van Haver Benchmarking analytical techniques for churn modelling in a B2B context