TW202312061A

TW202312061A - User management devices and methods capable of identifying anonymous users belonging to specific user groups, and storage media storing the methods

Info

Publication number: TW202312061A
Application number: TW110133240A
Authority: TW
Inventors: 陳昶佑
Original assignee: 伊雲谷數位科技股份有限公司
Priority date: 2021-09-07
Filing date: 2021-09-07
Publication date: 2023-03-16

Abstract

The present invention addresses how to use anonymous user data to group anonymous users so as to learn which specific user group an anonymous user belongs to. In this way, an enterprise can know not only which specific user group a real-name user belongs to, but also which specific user group an anonymous user belongs to, and marketers can use this for precision marketing. Specifically, the invention uses real-name user data to find the real-name user features that are important and meaningful to a specific user group, and then maps those real-name user features to the anonymous user features corresponding to that group. Whether an anonymous user belongs to the specific user group can then be determined from the anonymous user data.

Description

User management device, method and storage medium for storing the method capable of judging whether an anonymous user belongs to a specific user group

The present invention relates to user management devices, and in particular to a user management device and method capable of determining whether an anonymous user belongs to a specific user group (for example, but not limited to, a high-value user group, a frequent-purchase user group, or a high-spending user group), and to a storage medium storing the method.

For online shopping services, a user's purchase journey may pass through viewing advertisements (for example, advertisements pushed by Facebook or Google), searching the web, comparing options, and deciding to purchase. While viewing advertisements, searching, and comparing options, the user may not be logged in as a member and may instead browse anonymously. Moreover, many online shops now allow users to purchase goods or services directly without registering or logging in as a member. Consequently, a conventional user management system cannot know whether an anonymous user belongs to a specific user group, such as, but not limited to, a high-value user group, a frequent-purchase user group, or a high-spending user group. In addition, enterprises commonly use demographic statistics to segment customers and use member data to predict valuable users, but the two are usually done independently, so it remains unknown which behavioral features mark an anonymous user as a potential high-value user.

In view of the problems of the prior art, if it were possible to predict, in advance of an anonymous user logging in as a member, which specific user group the anonymous user belongs to, and to apply a corresponding marketing strategy (for example, but not limited to, offering discounts, giving gifts, or granting cumulative rewards), the willingness of anonymous users to purchase goods or subscribe to services could be effectively increased, thereby increasing the transaction amount.

According to an object of the present invention, the present invention provides a method capable of determining whether an anonymous user belongs to a specific user group, executed on a user management device, comprising: for a first specific user group of a plurality of real-name users, obtaining the real-name user features whose importance to the first specific user group is greater than a threshold or ranks among the top few, as a plurality of first real-name user features of a first important real-name user feature set of the first specific user group, wherein the feature engineering is performed by the user management device, which uses a machine learning algorithm to find the plurality of first real-name user features of the first important real-name user feature set of the first specific user group; mapping, from the plurality of first real-name user features of the first important real-name user feature set of the first specific user group, a plurality of first anonymous user features of a first important anonymous user feature set of the first specific user group, wherein at least some of the first anonymous user features are obtained from anonymous data of an anonymous user; and determining whether the anonymous user belongs to the first specific user group according to the at least some of the first anonymous user features obtained from the anonymous data of the anonymous user.

Optionally, the method further comprises: grouping the plurality of real-name users using a plurality of real-name user data of the real-name users, wherein at least some of the real-name users belong to the first specific user group.

Optionally, the real-name users are grouped using an RFM (Recency, Frequency, Monetary) model according to the real-name user data of the real-name users; or the real-name users are grouped according to a plurality of static features and/or a plurality of dynamic features obtained from the real-name user data; or the real-name users are grouped according to the real-name user data using a K-means algorithm, a support vector machine algorithm, and/or a machine learning algorithm.

Optionally, the method further comprises: collecting the real-name user data of the real-name users, and obtaining the real-name user features of the real-name users from the real-name user data.

Optionally, the method further comprises: determining a marketing strategy for marketing to the anonymous user according to whether the anonymous user belongs to the first specific user group.

Optionally, those of the first real-name user features that can be obtained from the anonymous data are identified and used as all of the first anonymous user features.

Optionally, those of the first real-name user features that can be obtained from the anonymous data are identified and used as some of the first anonymous user features, and those first real-name user features that cannot be obtained from the anonymous data, together with their values, are used as the remaining first anonymous user features and their values.
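The fallback described in this option can be sketched as follows. This is a minimal illustration, not the patented implementation: the feature names, the values, and the idea of storing one reference value per feature for the group (for example a group mean) are all assumptions, since the text does not say how the real-name values are chosen.

```python
def build_anonymous_vector(important_features, anonymous_values, group_reference_values):
    """Assemble one anonymous user's feature vector in the order of important_features.

    Features present in anonymous_values come from the anonymous data; every
    other feature falls back to the group's real-name reference value.
    """
    vector = []
    for name in important_features:
        if name in anonymous_values:
            vector.append(anonymous_values[name])
        else:
            vector.append(group_reference_values[name])
    return vector


# Hypothetical feature names and values, for illustration only.
important_features = ["weekly_visits", "avg_stay_seconds", "weekly_spend"]
group_reference_values = {
    "weekly_visits": 6.0,
    "avg_stay_seconds": 240.0,
    "weekly_spend": 1500.0,
}
# This anonymous user has browsing data but no purchase data.
anonymous_values = {"weekly_visits": 5.0, "avg_stay_seconds": 200.0}

vector = build_anonymous_vector(important_features, anonymous_values, group_reference_values)
print(vector)  # [5.0, 200.0, 1500.0]
```

Keeping the vector the same length as the real-name feature set means it can be compared directly with a real-name profile, at the cost of the filled-in positions always matching the group.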

Optionally, a grouping model for anonymous users is trained with the first anonymous user features, and the grouping model is used to determine whether the anonymous user belongs to the first specific user group according to the at least some of the first anonymous user features obtained from the anonymous data of the anonymous user.

Optionally, a feature comparison is performed between the first anonymous user features of the anonymous user and the corresponding first real-name user features to determine whether the anonymous user belongs to the first specific user group.

Optionally, the feature comparison is the calculation of a cosine similarity.
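A cosine-similarity comparison of this kind might look like the sketch below. The decision threshold of 0.9, the profile vector, and the use of a single representative vector for the group are assumptions made for illustration; the text only states that a cosine similarity is calculated.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def belongs_to_group(anonymous_features, group_profile, threshold=0.9):
    """Treat the user as a group member when the similarity reaches the threshold."""
    return cosine_similarity(anonymous_features, group_profile) >= threshold


# Hypothetical, already-normalized feature values.
group_profile = [0.8, 0.6, 0.9]    # representative real-name features of the group
similar_user = [0.7, 0.5, 0.8]     # anonymous user with a similar pattern
dissimilar_user = [0.9, 0.0, 0.0]  # anonymous user dominated by one feature

print(belongs_to_group(similar_user, group_profile))     # True
print(belongs_to_group(dissimilar_user, group_profile))  # False
```

Cosine similarity compares the direction of the two vectors and ignores their magnitude, so features on very different scales would normally be normalized first.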

According to an object of the present invention, the present invention provides a non-volatile storage medium storing a plurality of program codes, the program codes being read by a computer device to execute any one of the foregoing methods capable of determining whether an anonymous user belongs to a specific user group.

According to an object of the present invention, the present invention provides a user management device capable of determining whether an anonymous user belongs to a specific user group, implemented either as a pure hardware circuit or as a computer device combined with software, and configured as a plurality of modules that operate to execute any one of the foregoing methods capable of determining whether an anonymous user belongs to a specific user group.

Whereas the prior art cannot conduct precision marketing to anonymous users, the present invention can.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more readily understood by those skilled in the art and the scope of protection of the invention can be defined more clearly.

The user data collected by an enterprise's platform device usually comprises real-name user data (such as member data that a user registered on the enterprise's shopping website or membership system) and anonymous user data (for example, masked or de-identified data purchased from a third party, data of users who are not logged in as registered users, or data of users who have not registered on the enterprise's shopping website or membership system). More specifically, real-name user data is named user data (for example, member data or verified user data, which can be established by login or third-party verification). Real-name user data includes a unique identifier, email address, telephone number, or device number that can be traced back to the user's personal information, and it also includes the cookies, Internet Protocol (IP) address or device identifier, browsing behavior, device information, interests and preferences, demographic data, purchase records, interaction records, and accounting records associated with the named user. By contrast, anonymous user data is unnamed user data that includes only data such as cookies, IP address or device identifier, browsing behavior, device information, and interests and preferences, and it cannot be traced back to the user's personal information.

Enterprises generally use real-name user data to group real-name users and determine which specific user group each real-name user belongs to. Although the IP address in anonymous user data may reveal where an anonymous user comes from, it still does not effectively reveal which specific user group the anonymous user belongs to. For example, even if the IP address in the anonymous user data is an overseas address, one still cannot conclude that the anonymous user does not belong to the high-value user group, the frequent-purchase user group, or the high-spending user group.

The object of the present invention is to use anonymous user data to group anonymous users so as to learn which of the specific user groups an anonymous user belongs to. An enterprise can then know not only which specific user group a real-name user belongs to but also which specific user group an anonymous user belongs to, and marketers can use this for precision marketing. In short, the present invention solves the technical problem that the prior art cannot conduct precision marketing to anonymous users because it cannot know which specific user group they belong to.

In the present invention, real-name user data of a plurality of real-name users is first collected. The real-name users are then grouped according to a grouping algorithm (for example, but not limited to, a machine learning algorithm) or a specific model (for example, the RFM (Recency, Frequency, Monetary) model, a classification model that groups users by values such as most recent purchase, purchase frequency, and purchase amount), so that each real-name user belongs to one of the specific user groups. Taking the high-value user group as an example, to find out whether an anonymous user also belongs to it, feature engineering can be performed on the high-value user group to extract its important real-name user feature set (comprising a plurality of real-name user features associated with the high-value user group). Next, anonymous user data of a plurality of anonymous users is collected, and the anonymous user features that correspond to a subset of the real-name user features in the important real-name user feature set are taken as the important anonymous user feature set. Finally, whether an anonymous user belongs to the high-value user group is determined from that user's anonymous user features.
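The extraction of an important feature set by threshold or by rank can be sketched as below. The importance score used here, a standardized difference between the group mean and the non-group mean, is a deliberately simple stand-in chosen for illustration; the text leaves the machine learning algorithm unspecified, and the feature names and values are hypothetical.

```python
from statistics import mean, pstdev


def importance_scores(rows, in_group):
    """Score each feature by how strongly it separates group members from the rest.

    rows: list of dicts mapping feature name to a numeric value, one per user.
    in_group: list of booleans, True when that user is in the target group.
    Both the member and the non-member side must be non-empty.
    """
    scores = {}
    for name in rows[0]:
        inside = [r[name] for r, g in zip(rows, in_group) if g]
        outside = [r[name] for r, g in zip(rows, in_group) if not g]
        spread = pstdev([r[name] for r in rows])
        if spread == 0:
            scores[name] = 0.0  # a constant feature cannot separate anything
        else:
            scores[name] = abs(mean(inside) - mean(outside)) / spread
    return scores


def select_important(scores, threshold=None, top_k=None):
    """Keep features scoring above a threshold and/or ranking in the top k."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        ranked = [name for name in ranked if scores[name] > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


# Hypothetical real-name users; the first two are in the high-value group.
rows = [
    {"weekly_logins": 9, "avg_stay_seconds": 300, "uses_tablet": 1},
    {"weekly_logins": 8, "avg_stay_seconds": 280, "uses_tablet": 0},
    {"weekly_logins": 2, "avg_stay_seconds": 60, "uses_tablet": 1},
    {"weekly_logins": 1, "avg_stay_seconds": 40, "uses_tablet": 0},
]
in_group = [True, True, False, False]

scores = importance_scores(rows, in_group)
by_threshold = select_important(scores, threshold=1.0)
by_rank = select_important(scores, top_k=1)
print(by_threshold, by_rank)
```

In practice the score could come from any model that exposes feature importances; only the threshold and top-k selection step is taken from the text.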

The features mentioned above are representative data that can be understood as input parameters or variables; meaningful input parameters or variables affect the final user grouping result (that is, the prediction of whether a user belongs to a specific user group). For example, features can be extracted from browsing behavior: frequently clicking advertising content after arriving at a web page, preferring to watch videos, or staying on a page for no more than 10 seconds may each be a meaningful feature.

In addition, commonly extracted features can be divided by type into numeric, categorical, and temporal features, and the three types can be converted into one another; for example, a numeric feature can be processed into a temporal feature. A numeric feature is produced by using descriptive statistics to divide the overall data range into several intervals. A categorical feature indicates which category something belongs to; for example, device information may be laptop, mobile phone, or tablet, and which of these the device is constitutes a categorical feature. A numeric feature can also be converted into a categorical feature, and a categorical feature can be weighted by the proportion of each category to convert it into a numeric feature. A temporal feature can be obtained by computing the frequency, count, or magnitude of a numeric feature per week, month, or quarter; there are many such calculation methods, and they are not limited to those described here. In short, a feature of any one of the three types can be processed to produce a feature of another type.
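The three conversions mentioned above can be illustrated with a short sketch. The bin edges, labels, device values, and purchase events are hypothetical, and these are only one possible reading of each conversion.

```python
from collections import Counter
from datetime import date


def bin_numeric(value, edges, labels):
    """Numeric to categorical: label of the first bin whose upper edge exceeds value.

    labels must have one more entry than edges; the last label is the overflow bin.
    """
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def category_weights(values):
    """Categorical to numeric: weight each category by its share of the observations."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def weekly_totals(events):
    """Numeric to temporal: sum amounts per ISO week.

    events: iterable of (date, amount) pairs.
    Returns a dict keyed by (ISO year, ISO week number).
    """
    totals = {}
    for day, amount in events:
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        totals[key] = totals.get(key, 0) + amount
    return totals


# Hypothetical data, for illustration only.
stay_label = bin_numeric(5, edges=[10, 60], labels=["short", "medium", "long"])
device_weights = category_weights(["phone", "phone", "laptop", "tablet"])
spend_per_week = weekly_totals([
    (date(2021, 9, 6), 100),
    (date(2021, 9, 8), 200),
    (date(2021, 9, 13), 50),
])
print(stay_label, device_weights, spend_per_week)
```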

First, please refer to FIG. 1, which is a schematic diagram of the correspondence between real-name user features and anonymous user features in an embodiment of the present invention. In this figure, the anonymous user data may include device information 301, browsing behavior 302, interests and preferences 303, and IP address or device identifier 304; a training algorithm or an analyst's manual rules can define how a plurality of anonymous user features are obtained from the device information 301, browsing behavior 302, interests and preferences 303, and IP address or device identifier 304. The real-name user data may include device information 321, browsing behavior 322, interests and preferences 323, IP address or device identifier 324, demographic data 325, purchase records 326, interaction records 327, and account records 328; a training algorithm or an analyst's manual rules can likewise define how a plurality of real-name user features are obtained from the device information 321, browsing behavior 322, interests and preferences 323, IP address or device identifier 324, demographic data 325, purchase records 326, interaction records 327, and account records 328.

Within a specific user group, some real-name user features of the real-name users may be important and meaningful to that group. Through feature engineering, these real-name user features can be extracted as the important real-name user feature set 33 of the specific user group, which includes, for example, N real-name user features 331 to 33N. In general, fewer anonymous user features can be obtained from anonymous user data than real-name user features can be obtained from real-name user data. It is therefore necessary to select, from the N real-name user features 331 to 33N of the important real-name user feature set 33, those features that can also be obtained from anonymous user data. For example, the M anonymous user features 311 to 31M of the important anonymous user feature set 31 of the specific user group are those that can be obtained from anonymous user data and that are likewise important and meaningful to the specific user group, where M is usually less than or equal to N, and M and N are positive integers. In this way, whether an anonymous user belongs to the specific user group can be determined from the anonymous user's M anonymous user features 311 to 31M, and a marketing strategy can then be formulated to raise the success rate of transactions and subscriptions, thereby achieving precision marketing to anonymous users.
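The selection of the M anonymous user features from the N important real-name user features can be sketched as a simple filter. The feature names and the set of features assumed to be derivable from anonymous data are hypothetical.

```python
def map_to_anonymous_features(important_real_name_features, anonymous_available):
    """Keep, in order of importance, only the features derivable from anonymous data."""
    available = set(anonymous_available)
    return [name for name in important_real_name_features if name in available]


# Hypothetical important real-name user feature set (N = 5), most important first.
important_real_name = [
    "weekly_spend",       # needs purchase records, not available anonymously
    "weekly_visits",
    "member_level",       # needs a member account, not available anonymously
    "avg_stay_seconds",
    "device_type",
]
# Hypothetical features that can be computed from anonymous user data.
anonymous_available = {"weekly_visits", "avg_stay_seconds", "device_type", "ip_region"}

important_anonymous = map_to_anonymous_features(important_real_name, anonymous_available)
print(important_anonymous)  # ['weekly_visits', 'avg_stay_seconds', 'device_type']
print(len(important_anonymous) <= len(important_real_name))  # True, i.e. M <= N
```

Because the result is a subset of the real-name feature set, M can never exceed N, and features that exist only in anonymous data (such as `ip_region` here) are not added.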

Please refer to FIG. 2, which is a block diagram of an online real-name/anonymous user service system using the user management device of an embodiment of the present invention. The online or offline (online is used as the example below) real-name/anonymous user service system 4 of FIG. 2 not only provides services to real-name users and anonymous users but can also conduct precision marketing to anonymous users by determining which specific user group of the real-name users an anonymous user belongs to. The online real-name/anonymous user service system 4 includes a platform device 41, an anonymous user data capture device 42, an anonymous user database 43, a real-name user data capture device 44, and a user management device 45. The anonymous user data capture device 42 is signal-connected to the platform device 41, the anonymous user database 43, and the real-name user data capture device 44; the user management device 45 is signal-connected to the anonymous user database 43, the real-name user data capture device 44, and the platform device 41; and the real-name user data capture device 44 is signal-connected to the platform device 41. In the present invention, "signal connection" means a wireless or wired connection, made through software or hardware, that allows signals or information to be passed among the signal-connected components.

An anonymous user is signal-connected to the platform device 41 through a user terminal device (for example, but not limited to, a mobile phone, laptop, or tablet), and the platform device 41 can receive the data sent by the anonymous user. The platform device 41 may be a Software as a Service (SaaS) server, a web server, or a server that provides services through an application installed on the user terminal device. The services that the platform device 41 of the online real-name/anonymous user service system 4 provides to anonymous users and real-name users may be, for example, online shopping, online trading, online investment, online subscription, or online consultation, and the present invention is not limited thereto. By operating the user terminal device, a user can choose to log in as a member before using the services or to use the services anonymously without logging in.

The anonymous user data capture device 42 captures the data sent by anonymous users to obtain anonymous user data, for example cookies, IP address or device identifier, browsing behavior, device information, and interests and preferences; anonymous user data is data that cannot be traced back to the user's personal information. The anonymous user database 43 may be an unstructured database (although the present invention is not limited thereto, and it may also be a structured database) and is used to store the anonymous user data.

The real-name user data capture device 44 captures the data sent by real-name users to obtain real-name user data. Alternatively, when an anonymous user logs in to the platform device 41 under a real name, the real-name user data capture device 44 retrieves the anonymous user data from that user's earlier anonymous browsing and uses it as part of the real-name user data. The user management device 45 records and analyzes the real-name user data and can group the real-name users accordingly to determine which specific user group each real-name user belongs to.

The user management device 45 can further process the anonymous user data of anonymous users to find which real-name user features of a specific user group of real-name users can also be computed from anonymous user data, that is, to find the anonymous user features that can be computed from anonymous user data and that correspond to the real-name user features of the specific user group. The user management device 45 then determines from these anonymous user features whether an anonymous user belongs to the specific user group of real-name users and labels the anonymous user accordingly (that is, marks the anonymous user as belonging to the specific user group), and on that basis decides the marketing strategy for precision marketing to the anonymous user.

The user management device 45 is usually implemented as software running on a computer device. For example, the computer device includes a computing unit, a storage unit, and a communication unit, with the computing unit electrically connected to the storage unit and the communication unit. The computing unit reads a plurality of program codes stored in a non-volatile storage medium, and executing them carries out the method capable of determining whether an anonymous user belongs to a specific user group. The user management device 45 can also be implemented as a hardware circuit, for example by writing Verilog or VHDL code and programming it into a field-programmable gate array (FPGA) chip, or by fabricating an application-specific integrated circuit (ASIC), to realize a purely hardware user management device 45. In short, whether the user management device 45 is implemented in software or in hardware does not limit the present invention.

Please refer next to FIG. 3, which is a block diagram of a user management device capable of determining whether an anonymous user belongs to a specific user group according to an embodiment of the present invention. Whether the user management device is implemented in software or in hardware, it can be divided into a plurality of modules as shown in FIG. 3. In FIG. 3, the user management device 5 includes a real-name user grouping module 51, a real-name user feature engineering module 52, an anonymous user feature acquisition module 53, an anonymous/real-name user feature mapping module 54, and an anonymous user grouping module 55. The real-name user grouping module 51 is signal-connected to the real-name user feature engineering module 52; the anonymous/real-name user feature mapping module 54 is signal-connected to the real-name user feature engineering module 52, the anonymous user feature acquisition module 53, and the anonymous user grouping module 55; and the anonymous user feature acquisition module 53 is electrically signal-connected to the anonymous user grouping module 55. The module division in FIG. 3 is only one way of implementing the present invention and is not intended to limit it.

Please refer next to FIG. 3 and FIG. 4 together. FIG. 4 is a flowchart of a method capable of determining whether an anonymous user belongs to a specific user group according to an embodiment of the present invention. The flowchart of FIG. 4 is likewise only one way of implementing the present invention; in practice, the execution order and details of the method may vary slightly according to actual needs, so FIG. 4 is not intended to limit the present invention.

First, in step S61, the real-name user grouping module 51 performs a real-name user data collection step. Specifically, the real-name user grouping module 51 collects the real-name user data of a plurality of real-name users and processes it according to system computation results or manually defined rules to obtain real-name user features. The real-name user data is, for example but not limited to, the kinds of real-name user data described above, and may further include information such as age, gender, occupation, registration time, RFM parameters, membership level or promotion status, account balance, and the pickup or delivery method chosen for purchased goods. The real-name user features are, for example, weekly login frequency, average stay time, weekly number of home page visits, or weekly purchase amount; the present invention is not limited thereto, and different situations will call for different types of real-name user features.

Next, in process S62, the real-name user grouping module 51 performs a real-name user grouping step. Specifically, the module may group the real-name users with a predefined model. For example, with the RFM model, real-name users can be classified according to their real-name user data into a high-value user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or other user groups. Besides the RFM model, other grouping approaches may be used, such as grouping by static features (age, gender, occupation, registration time) or by dynamic features (RFM parameters, membership level or promotion status, account balance, pickup/delivery method chosen for purchased goods). The grouping may also be performed directly on the real-name user data of the plurality of real-name users by an algorithm such as K-means, a support vector machine, or another machine learning method (for example, a neural network).
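The RFM grouping described above can be sketched as a simple rule-based labeler. This is a minimal illustration, not the embodiment's model: the record fields, the group names, and every cut-off value (`recency_max`, `frequency_min`, `monetary_min`) are hypothetical, and the order in which the rules are checked is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RfmRecord:
    user_id: str
    recency_days: int   # days since the last purchase (lower is better)
    frequency: int      # number of purchases in the observation window
    monetary: float     # total amount spent in the observation window


def rfm_group(rec, recency_max=7, frequency_min=4, monetary_min=1000.0):
    """Assign one illustrative group label; all cut-offs are hypothetical."""
    recent = rec.recency_days <= recency_max
    frequent = rec.frequency >= frequency_min
    big_spender = rec.monetary >= monetary_min
    if recent and frequent and big_spender:
        return "high_value"
    if big_spender:
        return "high_spending"
    if frequent:
        return "frequent_purchase"
    if recent:
        return "recently_active"
    return "other"


# Made-up real-name users for illustration only.
users = [
    RfmRecord("0001", 2, 6, 4200.0),
    RfmRecord("0002", 30, 1, 1500.0),
    RfmRecord("0003", 3, 5, 300.0),
    RfmRecord("0004", 60, 1, 50.0),
]
groups = {u.user_id: rfm_group(u) for u in users}
```

A clustering algorithm such as K-means could replace the fixed cut-offs when no business rule is available; the rule-based form is used here only because it is easy to inspect.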

The grouping may produce several specific user groups, or may only determine whether a real-name user belongs to one particular user group (for example, when only membership in the high-value user group matters). The grouping may also target a specific or one-off campaign: for example, it may determine whether a real-name user belongs to the user group that regularly takes part in long-running promotions, or to the user group that takes part in a one-time, limited-quantity add-on sale of idol merchandise. In short, neither the grouping method nor the grouping type of the real-name users limits the present invention.

For example, in the membership systems commonly used by e-commerce businesses, each real-name user receives a user account upon registration, and all of the user's subsequent activity in the system is recorded: when the user made purchases, how much was spent, the time, place, and items of each purchase, which pages were clicked before purchasing, how many items were placed in the shopping cart, how many items were actually bought, and so on. Once this real-name user data is obtained, a grouping method (for example, the RFM model, grouping by product preference, or grouping by campaign participation) can identify which real-name users spend the most, which purchase most often, and which have been active recently, thereby defining the specific user group that currently contributes most to the business.

Then, in process S63, the real-name user feature engineering module 52 performs a real-name user feature engineering step to extract, for a specific user group of interest, the set of important real-name user features associated with that group. Specifically, the feature engineering may be implemented with machine learning or statistical algorithms, and the present invention is not limited in this respect, as long as at least one real-name user feature associated with the specific user group can be extracted.

Generally, different real-name user features differ in how much they matter to whether a real-name user belongs to a specific user group. This step therefore selects the real-name user features that are important and meaningful for the specific user group as that group's important real-name user feature set. For example, the features ranked highest in importance may be selected, or the features whose importance exceeds a certain threshold may be selected.
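The two selection rules just described (top-ranked features, or features above an importance threshold) can be sketched as follows. The importance scores are made-up values for illustration; in practice they would come from whatever machine learning or statistical method is used for feature engineering, and the function names are hypothetical.

```python
def select_top_k(importance, k):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def select_above(importance, threshold):
    """Return the feature names whose importance exceeds the threshold."""
    return [name for name, score in importance.items() if score > threshold]


# Hypothetical importance scores for one specific user group.
importance = {
    "weekly_login_frequency": 0.35,
    "average_dwell_time": 0.30,
    "weekly_spend": 0.25,
    "weekly_homepage_visits": 0.10,
}

top_three = select_top_k(importance, 3)
above_cutoff = select_above(importance, 0.2)
```

With these scores both rules keep the same three features; with other score distributions the two rules can disagree, which is why the cut-off choice is left open in the text.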

Once it is known which real-name users belong to, for example, the high-value user group and which do not, multiple features of the real-name users can be extracted to learn which features characterize the high-value user group (that is, to select the real-name user features that are important for that group). For example, after feature engineering, real-name users in the high-value user group may be found to browse the site for more than 10 seconds per day, to stay more than 5 seconds each time they visit the home page in a week, to spend more than 1,000 yuan per week, and to log in mostly through the mobile application rather than through the website on a computer. These real-name user features can then serve as the important real-name user feature set of the high-value user group.

For example, the real-name user data of one user might consist of the records "ID:0001; 2020-9-1 10:00:05; log in to home page; device=iphone12", "ID:0001; 2020-9-1 10:00:15; leave home page; device=iphone12", "ID:0001; 2020-9-5 14:08:30; log in to home page; device=iphone12", "ID:0001; 2020-9-5 14:08:30; purchase; amount=1,000; device=iphone12", and "ID:0001; 2020-9-5 14:08:50; leave home page; device=iphone12". Through process S62, the real-name user grouping module 51 can obtain the features of the real-name user "ID:0001" as "weekly login frequency = 2", "average dwell time = 15", "weekly number of home page visits = 2", and "weekly spending amount = 1,000".
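The conversion from raw records to features in this example can be sketched as below. The tuple layout and the action names (`enter_home`, `leave_home`, `purchase`) are hypothetical, all five records are assumed to fall within one weekly window, and the final record is assumed to occur on 2020-9-5 at 14:08:50 so that the second visit lasts 20 seconds.

```python
from datetime import datetime

# (user_id, timestamp, action, amount); layout and action names are hypothetical.
events = [
    ("0001", datetime(2020, 9, 1, 10, 0, 5), "enter_home", 0),
    ("0001", datetime(2020, 9, 1, 10, 0, 15), "leave_home", 0),
    ("0001", datetime(2020, 9, 5, 14, 8, 30), "enter_home", 0),
    ("0001", datetime(2020, 9, 5, 14, 8, 30), "purchase", 1000),
    ("0001", datetime(2020, 9, 5, 14, 8, 50), "leave_home", 0),
]


def weekly_features(user_events):
    """Derive the four example features from one user's events in one week."""
    logins = 0
    spend = 0
    dwell_seconds = []
    entered_at = None
    # sorted() is stable, so records with equal timestamps keep their order.
    for _, ts, action, amount in sorted(user_events, key=lambda e: e[1]):
        if action == "enter_home":
            logins += 1
            entered_at = ts
        elif action == "leave_home" and entered_at is not None:
            dwell_seconds.append((ts - entered_at).total_seconds())
            entered_at = None
        elif action == "purchase":
            spend += amount
    average = sum(dwell_seconds) / len(dwell_seconds) if dwell_seconds else 0.0
    return {
        "weekly_login_frequency": logins,
        "average_dwell_time": average,
        "weekly_homepage_visits": len(dwell_seconds),
        "weekly_spend": spend,
    }


features = weekly_features(events)
```

Under these assumptions the two visits last 10 and 20 seconds, which gives the average dwell time of 15 stated in the example.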

After the real-name user data of every real-name user has been converted into features and every real-name user has been assigned to a group (that is, after process S62), process S63 can apply machine learning or statistical algorithms to perform feature engineering, extracting and confirming which real-name user features are important and meaningful for a specific user group (for example, the high-value user group). Typically, feature selection is performed: the top-ranked real-name user features by importance score are selected as the important real-name user feature set of the specific user group. In the example above, feature engineering may show that the real-name user features that are important and meaningful for the high-value user group are weekly login frequency, average dwell time, and weekly spending amount, and that real-name users in the high-value user group have a weekly login frequency greater than 2, an average dwell time greater than 10, and a weekly spending amount greater than 100. These features are then used in the subsequent process S66.

In process S64, the anonymous user feature acquisition module 53 performs an anonymous user data collection step to collect the anonymous user data of each anonymous user. Then, in process S65, the module processes the anonymous user data according to system computation results or manually defined rules to obtain anonymous user features. Anonymous user features are, for example, weekly login frequency, average dwell time, or weekly number of home page visits. The present invention is not limited to these; different situations call for different types of anonymous user features. Some real-name user features cannot be obtained from anonymous user data: for example, the weekly spending amount mentioned above is a real-name user feature, but the anonymous user features do not include it.

After the important real-name user feature set of the specific user group and the anonymous user features of each anonymous user have been obtained, in process S66 the anonymous/real-name user feature mapping module 54 identifies which real-name user features in the group's important real-name user feature set can also be obtained from anonymous user data, and takes the anonymous user features corresponding to those real-name user features as the group's important anonymous user feature set (for example, by matching the names of anonymous user features against the names of real-name user features). For example, real-name users in the high-value user group have a weekly login frequency greater than 2, an average dwell time greater than 10, and a weekly spending amount greater than 100, but only weekly login frequency and average dwell time can be obtained from anonymous user data. The values of an anonymous user's weekly login frequency and average dwell time can therefore be used later to determine whether that anonymous user also belongs to the high-value user group.
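The name-matching form of this mapping amounts to keeping the important real-name features that also exist among the anonymous features. A minimal sketch, with hypothetical feature names and a hypothetical helper `map_features`:

```python
def map_features(important_real_name_features, anonymous_feature_names):
    """Keep the important real-name features that anonymous data also provides.

    The order of the important real-name feature list is preserved.
    """
    available = set(anonymous_feature_names)
    return [name for name in important_real_name_features if name in available]


# Important real-name features of the high-value user group (from the example).
important_real_name = [
    "weekly_login_frequency",
    "average_dwell_time",
    "weekly_spend",
]
# Features that can be derived from anonymous user data (from the example).
anonymous_available = [
    "weekly_login_frequency",
    "average_dwell_time",
    "weekly_homepage_visits",
]

important_anonymous = map_features(important_real_name, anonymous_available)
```

The result contains only weekly login frequency and average dwell time, because weekly spend cannot be derived from anonymous data and weekly home page visits is not in the important set.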

Preferably, in process S66, the anonymous user features in the important anonymous user feature set are used to train a model, thereby building a grouping model for determining whether an anonymous user belongs to the specific user group. In short, process S66 finds the important anonymous user feature set, that is, the anonymous user features that are important and meaningful for the specific user group and obtainable from anonymous user data, and uses it to train a grouping model that determines, from the anonymous user features derived from anonymous user data, whether an anonymous user belongs to the specific user group.

Note that process S66 need not train a separate grouping model; the grouping model of the real-name users can be applied directly, provided that the real-name user features in the important real-name user feature set that are missing (those that cannot be obtained from anonymous user data) are filled in by feature imputation. For example, the weekly spending amount can be set to 100 or more, so that the real-name grouping model can be applied directly to anonymous users to determine whether they belong to the high-value user group. The present invention is not limited to this, however. Another approach is to compare features directly between the similar portions of the anonymous user data and the real-name user data (for example, by computing cosine similarity) and, based on the comparison result, to assign the real-name user features corresponding to the similar portion to the anonymous user as anonymous user features, in order to further determine whether the anonymous user belongs to the high-value user group. Taking FIG. 1 as an example, if the feature comparison between the anonymous user data and the similar portion of the real-name user data shows a high degree of agreement, the similar portion of the real-name user data, for example the corresponding real-name user feature 33N (which cannot be obtained from anonymous user data), can be given a value directly and provided to the anonymous user as an anonymous user feature.
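The imputation approach can be sketched as follows. The function `real_name_rule` is a hypothetical stand-in for the real-name grouping model, built from the example thresholds (login frequency above 2, dwell time above 10, spend above 100), and the fill value 101 is an assumed choice of a value that satisfies the spend condition.

```python
def real_name_rule(features):
    """Hypothetical stand-in for the real-name grouping model."""
    return (
        features["weekly_login_frequency"] > 2
        and features["average_dwell_time"] > 10
        and features["weekly_spend"] > 100
    )


def impute(anonymous_features, defaults):
    """Fill features missing from anonymous data with default values."""
    filled = dict(defaults)
    filled.update(anonymous_features)  # observed values take precedence
    return filled


# Assumed fill value for the feature that anonymous data cannot provide.
defaults = {"weekly_spend": 101}

visitor_a = impute({"weekly_login_frequency": 3, "average_dwell_time": 25}, defaults)
visitor_b = impute({"weekly_login_frequency": 0.5, "average_dwell_time": 7}, defaults)

a_is_high_value = real_name_rule(visitor_a)
b_is_high_value = real_name_rule(visitor_b)
```

Because the imputed feature always passes its condition, the decision rests entirely on the features that were actually observed for the anonymous user.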

In short, the manner in which process S66 performs the anonymous/real-name user feature mapping does not limit the present invention. The important anonymous user feature set obtained in process S66 may have the same dimension as the important real-name user feature set of the specific user group or a different one; that is, it may have fewer dimensions than the important real-name user feature set or the same number. Finally, in process S67, the anonymous user grouping module 55 performs an anonymous user grouping step, which determines whether an anonymous user belongs to the specific user group according to the anonymous user's anonymous user features and the grouping model. If so, the anonymous user can be targeted with precision marketing, such as advertisements, coupons, discounts, or promotions likely to interest that user. The anonymous users may be grouped with a grouping model built by a machine learning algorithm as described above, or with a feature comparison algorithm that computes the similarity (for example, cosine similarity) between anonymous user features and real-name user features.
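The similarity-based alternative can be sketched with a plain cosine similarity function. The reference vector and the two candidate vectors below are hypothetical values of (weekly login frequency, average dwell time), chosen only to show how the measure behaves.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)


# Hypothetical reference profile of a real-name user group.
reference = (3.0, 20.0)

same_direction = cosine_similarity(reference, (6.0, 40.0))
different_direction = cosine_similarity(reference, (20.0, 3.0))
```

Cosine similarity compares only the direction of the vectors, not their magnitude, so a visitor with twice the login frequency and twice the dwell time scores the same as the reference. An implementation would likely need to scale the features or combine the similarity with a magnitude check; that choice is outside what the text specifies.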

For example, in process S66, the important real-name user feature set of the high-value user group includes weekly login frequency, average dwell time, and weekly spending amount, and real-name users with a weekly login frequency greater than 2, an average dwell time greater than 15, and a weekly spending amount greater than 100 belong to the high-value user group. Because the weekly spending amount cannot be obtained from anonymous user data, the important anonymous user feature set of the high-value user group includes only weekly login frequency and average dwell time.

Suppose the anonymous user features of several anonymous users are "ID:COOK0001; weekly login frequency = 2; average dwell time = 15", "ID:COOK0002; weekly login frequency = 0.5; average dwell time = 7", "ID:COOK0003; weekly login frequency = 1; average dwell time = 50", "ID:COOK0004; weekly login frequency = 3; average dwell time = 25", and "ID:COOK0005; weekly login frequency = 10; average dwell time = 50". In process S67, the weekly login frequency and average dwell time of the anonymous user features can then be compared with the weekly login frequency and average dwell time of the real-name user features by computing cosine similarity. If, in this embodiment, the anonymous users with the higher cosine similarity are "ID:COOK0004" and "ID:COOK0005", those two anonymous users can be determined to belong to the high-value user group.
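As a simpler cross-check than the similarity computation, the thresholds from the preceding example (weekly login frequency greater than 2, average dwell time greater than 15) can be applied directly to the five anonymous users. This threshold rule is an assumption made for illustration, not the embodiment's similarity calculation; the dictionary keys and the helper `exceeds_all` are hypothetical.

```python
# Anonymous user features from the example above.
anonymous_users = {
    "COOK0001": {"weekly_login_frequency": 2, "average_dwell_time": 15},
    "COOK0002": {"weekly_login_frequency": 0.5, "average_dwell_time": 7},
    "COOK0003": {"weekly_login_frequency": 1, "average_dwell_time": 50},
    "COOK0004": {"weekly_login_frequency": 3, "average_dwell_time": 25},
    "COOK0005": {"weekly_login_frequency": 10, "average_dwell_time": 50},
}

# Thresholds on the features that anonymous data can provide.
thresholds = {"weekly_login_frequency": 2, "average_dwell_time": 15}


def exceeds_all(features, limits):
    """True when every listed feature is strictly greater than its limit."""
    return all(features[name] > limit for name, limit in limits.items())


high_value_ids = [
    user_id
    for user_id, features in anonymous_users.items()
    if exceeds_all(features, thresholds)
]
```

With strict comparisons, only COOK0004 and COOK0005 pass both thresholds, which agrees with the grouping stated in the example; COOK0001 sits exactly on both limits and is excluded.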

In addition, in embodiments of the present invention, each specific user group may have two grouping models. Although a grouping model can be updated periodically, if frequent updates are undesirable, one grouping model can be designed to ignore all campaign-related features while the other takes into account the features related to the current campaign. For example, a special sale of anime festival merchandise is likely to attract mostly students. The Internet address in the anonymous user data can reveal whether an anonymous user is connecting from a school; if so, the anonymous user is identified as a student, and the anonymous user feature "occupation is student" is important and meaningful for this grouping model.

In some embodiments, the modules or methods of the present disclosure may be executed by a computer software program stored in a non-volatile storage medium. The non-volatile storage medium stores a computer software program for determining that anonymous user data belongs to a cluster of a real-name user data set; after a computer loads the program, it performs steps including: obtaining and storing a first real-name user data set and labeling it as a first user cluster, the first real-name user data set containing a plurality of raw real-name user records; performing feature extraction on the first real-name user data set to generate a first real-name user feature set, where the feature extraction includes performing feature importance analysis on the first real-name user data set with a machine learning algorithm; obtaining and storing a first anonymous user data set containing a plurality of raw anonymous user records, and performing feature extraction on the first anonymous user data set based on the first real-name user feature set to generate a first anonymous user feature set that is a subset of the first real-name user feature set; performing feature selection on the first anonymous user data set based on the first anonymous user feature set to generate a second anonymous user data set having the first anonymous user feature set; fitting the second anonymous user data set with the machine learning algorithm to generate an anonymous user grouping model; and using the anonymous user grouping model to make predictions on a third anonymous user data set containing at least one anonymous user record, and labeling the anonymous user records in the third anonymous user data set that are predicted as positive as belonging to the first user cluster.

In some embodiments, a computer program obtains and stores a first real-name user data set and a second real-name user data set. The two data sets may reflect records of a plurality of real-name users interacting with a software service, which may be executed by a server. The first and second real-name user data sets may respectively include a first periodic data set and a second periodic data set for their real-name users, and the computer program may perform feature extraction on the two real-name user data sets to derive the first and second periodic data sets. The computer program may perform feature engineering on the first and second real-name user data sets to derive a first feature data set and a second feature data set corresponding to them respectively. During feature engineering, the computer program may fit the first and second real-name user data sets with a first machine learning algorithm to generate the first and second feature data sets. For each real-name user data set, the computer program may perform feature selection by computing or identifying feature importance to judge how important each feature is, thereby generating the first and second feature data sets. The first feature data set may include a third periodic data set that is a subset of the first periodic data set, and the second feature data set may include a fourth periodic data set that is a subset of the second periodic data set. The computer program may obtain and store an anonymous user data set containing the data of a plurality of anonymous users; this data set may reflect records of the anonymous users interacting with the software service. The computer program may compare the features of the anonymous user data set with the first feature data set to generate a first anonymous feature data set, which may contain a plurality of anonymous features identical to some of the features in the first feature data set. The computer program may fit the first anonymous feature data set with a second machine learning algorithm, which may be the same as the first machine learning algorithm, to generate a first anonymous user grouping model. Likewise, the computer program may compare the features of the anonymous user data set with the second feature data set to generate a second anonymous feature data set, which may contain a plurality of anonymous features identical to some of the features in the second feature data set, and may fit the second anonymous feature data set with a third machine learning algorithm, which may be the same as the first and/or the second machine learning algorithm, to generate a second anonymous user grouping model. The computer program may apply the first and second anonymous user grouping models to the anonymous user data set to obtain a first prediction result and a second prediction result corresponding to the two models. Based on the first prediction result, the computer program may identify a first anonymous-user similar data set, that is, the portion of the anonymous user data set that is similar to the first real-name user data set; based on the second prediction result, it may identify a second anonymous-user similar data set, the portion that is similar to the second real-name user data set. In some examples, the computer program compares the number of records in the first anonymous-user similar data set with the number in the second to decide which similar data set to select for subsequent processing or application. For example, if the second anonymous-user similar data set contains more records, the computer program may select it; in the high-value user example above, the computer program may determine that the second anonymous-user similar data set contains the anonymous user data belonging to the high-value user group.

In some embodiments, the computer program may label a real-name user data set as belonging to a specific user group. Continuing the high-value user example, the computer program may label the first real-name user data set as the data set of the high-value user group, and may optionally label the second real-name user data set as the data set of another specific user group, for example a particular group in an RFM grouping. The computer program may label the first and second real-name user data sets according to existing grouping results, so that the two data sets belong to different user groups. In some examples, the computer program performs group prediction on new anonymous user data with the first and/or second anonymous user grouping model and, based on its assessment of the prediction results, labels the new anonymous user data with the specific user group of the real-name user data set it resembles. For example, the new anonymous user data may be a single anonymous user record. As another example, the computer program may run predictions on new anonymous user data in real time, in near real time, or in batches. As yet another example, batch prediction may be run after several records have accumulated for a single anonymous user.

In some embodiments, the first feature data set and the second feature data set may include a plurality of feature fields with different data attributes. Each feature field includes a plurality of feature values, and the feature values of each feature field form a feature value range. The computer program may calculate or select a feature threshold from the feature value range of each feature field, and the data of the aforementioned feature data sets may include the feature thresholds. Each feature threshold may be the maximum, minimum, mean, median, mode, a percentile, a quartile, the standard deviation, or the like of the feature value range of the corresponding feature field. Depending on the attributes of the feature values, the computer program may apply different feature threshold rules to determine the feature threshold. For example, the computer program may determine the correlation between the feature values and the target variable or label fitted by the first machine learning algorithm. If the correlation is positive, the minimum of the feature value range may be calculated or selected as the feature threshold; if the correlation is negative, the maximum of the feature value range may be calculated or selected as the feature threshold. Taking the feature selection for high-value users described above as an example, if the target variable is the amount spent, the feature is the weekly login frequency, and the minimum of the feature value range is 2, the computer program may determine that the amount spent is positively correlated with the weekly login frequency and select that minimum as the feature threshold. In some examples, if the correlation between the target variable and the feature is weak, the computer program may calculate or select the arithmetic mean of the feature value range as the feature threshold. In some examples, if the feature is a count attribute, the computer program may calculate or select the mode of the feature value range as the feature threshold. In some examples, the computer program may set a threshold coefficient h = 0, 0.5, 1, 1.8, 2, … and use the product of a statistic of the feature value range and the threshold coefficient as the feature threshold; for example, with h = 0.5, the feature threshold of the aforementioned weekly login frequency becomes 1 (0.5 × 2). The computer program may set the threshold coefficient to adjust the similarity between anonymous user features and real-name user features, that is, to adjust the similarity between the anonymous user data set and the first real-name user data set and/or the second real-name user data set, and thereby determine the first anonymous user similar data and/or the second anonymous user similar data. In some examples, the computer program may receive a setting instruction for a feature threshold and set the feature threshold according to that instruction.
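The feature threshold rules above may be sketched, purely for illustration, as the following Python code. The function names, the cut-off `weak=0.3` for a "weak" correlation, the use of the Pearson coefficient as the correlation measure, and the order in which the rules are checked are assumptions of this sketch and are not specified in the description.

```python
from statistics import mean, mode


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def feature_threshold(values, target, h=1.0, weak=0.3, is_count=False):
    """Pick a threshold for one feature field from its value range.

    h is the threshold coefficient; weak is a hypothetical cut-off below
    which the absolute correlation is treated as weak.
    """
    if is_count:
        stat = mode(values)            # count attribute: use the mode
    else:
        r = pearson(values, target)
        if abs(r) < weak:
            stat = mean(values)        # weak correlation: arithmetic mean
        elif r > 0:
            stat = min(values)         # positive correlation: minimum
        else:
            stat = max(values)         # negative correlation: maximum
    return h * stat


# Hypothetical high-value users: weekly logins versus amount spent.
weekly_logins = [2, 3, 4, 5, 6]
amount_spent = [100, 150, 210, 240, 300]
threshold = feature_threshold(weekly_logins, amount_spent)          # minimum
relaxed = feature_threshold(weekly_logins, amount_spent, h=0.5)     # 0.5 x minimum
```

With these hypothetical values the positive-correlation rule selects the minimum of 2, and h = 0.5 halves it to 1, matching the weekly login frequency example in the text.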

In some embodiments, when the computer program determines, based on the aforementioned prediction results, an anonymous user similar data set within the anonymous user data set that is similar to the aforementioned real-name user data set, it may determine the feature threshold of at least one feature field in the feature data set according to the feature threshold rules described above, and use it as the basis for judging the similarity between the anonymous user data set and the real-name user data set. In some embodiments, when the computer program performs feature selection, it may determine the feature threshold of at least one feature field in the feature data set according to the feature threshold rules described above, and use it as the basis for the feature comparison between the anonymous user data set and the first feature data set. Taking the high-value user example above, if the feature threshold of the weekly login frequency is 2, then among the anonymous user features of the records "ID:COOK0001" to "ID:COOK0005", the computer program may determine that the records whose weekly login frequency is greater than the feature threshold are "ID:COOK0004" and "ID:COOK0005". It should be noted that the aforementioned anonymous feature data set normally contains a plurality of features, so the anonymous feature data set may be an anonymous feature union. When generating the aforementioned anonymous user similar data set or performing the aforementioned feature comparison, the records that satisfy the feature thresholds of all features in the anonymous feature union may be used as the condition for the similarity judgment or the feature comparison.
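As an illustrative sketch of the multi-feature comparison just described, the following Python code keeps only the anonymous records that exceed the threshold of every listed feature. The record values, the second feature `pages_per_visit`, and its threshold are hypothetical; only the IDs and the weekly login threshold of 2 come from the example above. The sketch handles only the "greater than" direction of comparison.

```python
def matches_all_thresholds(record, thresholds):
    """True when every thresholded feature in the record exceeds its threshold."""
    return all(record[name] > limit for name, limit in thresholds.items())


# Hypothetical anonymous user features keyed by anonymous ID.
anonymous_records = {
    "COOK0001": {"weekly_logins": 1, "pages_per_visit": 3},
    "COOK0002": {"weekly_logins": 2, "pages_per_visit": 5},
    "COOK0003": {"weekly_logins": 2, "pages_per_visit": 2},
    "COOK0004": {"weekly_logins": 3, "pages_per_visit": 6},
    "COOK0005": {"weekly_logins": 5, "pages_per_visit": 3},
}

# One feature: weekly login frequency with a threshold of 2.
single = {"weekly_logins": 2}
similar = [i for i, r in anonymous_records.items()
           if matches_all_thresholds(r, single)]

# Several features: a record must satisfy every threshold to be kept.
combined = {"weekly_logins": 2, "pages_per_visit": 4}
strict = [i for i, r in anonymous_records.items()
          if matches_all_thresholds(r, combined)]
```

With these hypothetical values, the single-feature condition selects "COOK0004" and "COOK0005", while adding the second threshold narrows the result to "COOK0004".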

In some embodiments, the computer program may calculate the aforementioned feature importance using an impurity model, an entropy model, filter methods, wrapper methods, embedded methods, other statistical methods, and the like. For example, the computer program may use the feature_importances_ attribute of the scikit-learn machine learning library to identify important features. In some embodiments, the machine learning algorithms, machine learning models, or anonymous user grouping models mentioned above are used according to the categories recognized by those of ordinary skill in the art. For example, the machine learning algorithms or models may include regression models and classification models.
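To illustrate what an impurity-based importance measures, the following Python sketch scores each feature by the largest Gini impurity decrease that a single split on that feature achieves. This is a hand-written simplification for illustration only, not scikit-learn's implementation; in scikit-learn, tree-based estimators expose `feature_importances_` after fitting. The feature names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n          # share of positive labels
    return 2 * p * (1 - p)


def best_impurity_decrease(values, labels):
    """Largest Gini impurity decrease achievable by one split 'value <= t'."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:   # the largest value cannot split
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


# Hypothetical real-name users; label 1 marks the high-value group.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "weekly_logins": [1, 1, 2, 4, 5, 6],
    "device_age": [3, 1, 2, 3, 1, 2],
}
scores = {name: best_impurity_decrease(vals, labels)
          for name, vals in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this hypothetical data, `weekly_logins` separates the two groups perfectly and ranks first, while `device_age` yields no impurity decrease; features could then be kept by a threshold on the score or by taking the top-ranked ones.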

In some embodiments, the aforementioned periodic data set may include data generated by the computer program through iterative calculations over different periods. The periods may be measured in units of seconds, minutes, hours, days, weeks, months, quarters, years, and the like. The periods may be obtained through time series analysis, such as seasonality analysis. The period determination and/or time series analysis may be performed by a third-party system or service, and the computer program may obtain the periods, periodic data, or time series analysis results from that third-party system or service.

Accordingly, the present invention mainly uses the real-name user data of real-name users to obtain the important real-name user feature set of a specific user group, and examines the anonymous user data of an anonymous user for corresponding anonymous user features to determine whether the anonymous user belongs to the specific user group, so that precise marketing can also be directed at anonymous users.

Wording in the above description such as "the present invention…" or "the present invention mainly…", the descriptions of the embodiments and modifications, and the content disclosed in the drawings merely illustrate one or more embodiments or implementation features of the invention described in the claims, or merely express the principle or spirit of the present invention. They do not limit the present invention, nor are they the only or necessary implementations of the present invention, and various combinations containing different features or implementations are possible within the spirit of the present invention. Therefore, the invention described in the claims is not limited by the above "the present invention…" wording or by the content disclosed in the embodiments or drawings. The claims as originally filed in this application are merely an example, and the claims may be amended as appropriate based on the description, the drawings, and the like.

301: Device information
302: Browsing behavior
303: Interests and preferences
304: Internet Protocol address or device identification
31: Important anonymous user feature set of a specific user group
311～31M: Anonymous user features
321: Device information
322: Browsing behavior
323: Interests and preferences
324: Internet Protocol address or device identification
325: Demographic data
326: Purchase records
327: Interaction records
328: Account records
33: Important real-name user feature set of a specific user group
331～33N: Real-name user features
4: Online real-name/anonymous user service system
41: Platform device
42: Anonymous user data acquisition device
43: Anonymous user database
44: Real-name user data acquisition device
45: User management device
5: User management device
51: Real-name user grouping module
52: Real-name user feature engineering module
53: Anonymous user feature acquisition module
54: Anonymous/real-name user feature mapping module
55: Anonymous user grouping module
S61～S67: Process steps

FIG. 1 is a schematic diagram of the correspondence between real-name user features and anonymous user features in an embodiment of the present invention.

FIG. 2 is a block diagram of an online real-name/anonymous user service system using a user management device according to an embodiment of the present invention.

FIG. 3 is a block diagram of a user management device capable of judging whether an anonymous user belongs to a specific user group according to an embodiment of the present invention.

FIG. 4 is a flowchart of a method capable of judging whether an anonymous user belongs to a specific user group according to an embodiment of the present invention.

S61~S67: Process steps

Claims

A method capable of judging whether an anonymous user belongs to a specific user group, implemented in a user management device, comprising: for a first specific user group of a plurality of real-name users, obtaining through feature engineering, from a plurality of real-name user features, a plurality of first real-name user features whose importance to the first specific user group is greater than a threshold value or ranks among the top few, as a first important real-name user feature set of the first specific user group, wherein the feature engineering finds, through the user management device and based on a machine learning algorithm, the plurality of first real-name user features of the first important real-name user feature set of the first specific user group; mapping, according to the plurality of first real-name user features of the first important real-name user feature set of the first specific user group, a plurality of first anonymous user features of a first important anonymous user feature set of the first specific user group, wherein at least a part of the plurality of first anonymous user features is obtained from anonymous data of an anonymous user; and judging whether the anonymous user belongs to the first specific user group according to the at least a part of the plurality of first anonymous user features obtained from the anonymous data of the anonymous user.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, further comprising: grouping the plurality of real-name users by using a plurality of real-name user data of the plurality of real-name users, wherein at least a part of the plurality of real-name users belong to the first specific user group.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, wherein the plurality of real-name users are grouped by analyzing the plurality of real-name user data of the plurality of real-name users with an RFM model (Recency Frequency Monetary model); or the plurality of real-name users are grouped according to a plurality of static features and/or a plurality of dynamic features obtained from the plurality of real-name user data of the plurality of real-name users; or the plurality of real-name users are grouped according to the plurality of real-name user data by using a K-means algorithm, a support vector machine algorithm and/or a machine learning algorithm.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, further comprising: collecting a plurality of real-name user data of the plurality of real-name users, and acquiring the plurality of real-name user features of the plurality of real-name users according to the plurality of real-name user data.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, further comprising: determining, according to whether the anonymous user belongs to the first specific user group, a marketing strategy for marketing to the anonymous user.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, wherein those of the plurality of first real-name user features that can be acquired through the anonymous data are found and used as all of the plurality of first anonymous user features.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, wherein those of the plurality of first real-name user features that can be acquired through the anonymous data are found and used as a part of the plurality of first anonymous user features, and those of the plurality of first real-name user features that cannot be acquired through the anonymous data, together with their values, are used as the others of the plurality of first anonymous user features and their values.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, wherein a grouping model for the anonymous user is trained with the plurality of first anonymous user features, and the grouping model is used to judge whether the anonymous user belongs to the first specific user group according to the at least a part of the plurality of first anonymous user features obtained from the anonymous data of the anonymous user.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 1, wherein a feature comparison is performed between the plurality of first anonymous user features of the anonymous user and the corresponding plurality of first real-name user features, so as to determine whether the anonymous user belongs to the first specific user group.

The method capable of judging whether an anonymous user belongs to a specific user group as described in claim 9, wherein the feature comparison calculates a cosine similarity.

A non-volatile storage medium storing a plurality of program codes, wherein the plurality of program codes are read by a computer device to execute the method capable of judging whether an anonymous user belongs to a specific user group.

A user management device capable of judging whether an anonymous user belongs to a specific user group, realized by a pure hardware circuit or by a computer device combined with software and configured as a plurality of modules, wherein the plurality of modules operate to perform the method capable of judging whether an anonymous user belongs to a specific user group as described in one of claims 1-10.

A computer software program product for judging anonymous user data as belonging to a cluster of a real-name user data set, which, after being loaded into a computer, performs the following steps: obtaining and storing a first real-name user data set and marking the first real-name user data set as a first user cluster, the first real-name user data set including a plurality of real-name user raw data; performing feature extraction on the first real-name user data set to generate a first real-name user feature set from the first real-name user data set, wherein the feature extraction on the first real-name user data set includes performing feature importance analysis on the first real-name user data set with a machine learning algorithm to generate the first real-name user feature set; obtaining and storing a first anonymous user data set, the first anonymous user data set including a plurality of anonymous user raw data, and performing feature extraction on the first anonymous user data set based on the first real-name user feature set to generate a first anonymous user feature set, such that the first anonymous user feature set is a subset of the first real-name user feature set; performing feature selection on the first anonymous user data set based on the first anonymous user feature set to generate a second anonymous user data set having the first anonymous user feature set; fitting the second anonymous user data set using the machine learning algorithm to generate an anonymous user grouping model; and using the anonymous user grouping model to predict a third anonymous user data set, the third anonymous user data set containing at least one anonymous user data, and marking the anonymous user data predicted as positive in the third anonymous user data set as belonging to the first user cluster.
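For illustration only, and without limiting the claims, the feature comparison recited above as a cosine similarity may be sketched in Python as follows. The feature vectors, their ordering, and the decision cut-off of 0.9 are hypothetical values chosen for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0               # an all-zero vector has no defined direction
    return dot / (norm_a * norm_b)


# Hypothetical vectors: the group's first real-name user features, and the
# corresponding first anonymous user features of one anonymous user.
group_profile = [3.0, 4.0]
anonymous_user = [6.0, 8.0]

score = cosine_similarity(group_profile, anonymous_user)
belongs = score >= 0.9           # hypothetical decision cut-off
```

Because cosine similarity depends only on the direction of the vectors, an anonymous user whose feature values are proportional to the group profile scores as fully similar regardless of magnitude; a design that also needs to compare magnitudes would require a different measure or a prior normalization step.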