TWI807780B - Turnover rate prediction method and electronic apparatus thereof - Google Patents
Turnover rate prediction method and electronic apparatus thereof
- Publication number
- TWI807780B (application TW111114385A)
- Authority
- TW
- Taiwan
- Prior art keywords
- survival curves
- survival
- turnover rate
- time point
- data set
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
Description
The disclosure relates to a statistical analysis mechanism, and in particular to a turnover rate prediction method and an electronic device thereof.
In general, the turnover rate varies with seniority and with other important variables. Judging the turnover rate directly from a single variable therefore cannot yield an objective evaluation. For example, the turnover rate of employees who live in a dormitory is many times higher than that of employees who do not. In practice, however, the turnover rate changes with the length of employment. For example, employees without dormitory housing are more likely to leave early on, but once they have been employed beyond a certain period, whether they have a dormitory has less influence on whether they leave. Accordingly, a continuous variable needs to be categorized, and how to choose suitable cut points is a factor to be considered.
The disclosure provides a turnover rate prediction method and an electronic device thereof, which can improve the reference value of the analysis results.
The turnover rate prediction method of the disclosure is adapted to be executed by a processor and includes: obtaining an employee data set corresponding to a plurality of employees; grouping the employees into a plurality of subsets according to the employee data set and an influencing variable, and calculating a plurality of survival curves respectively corresponding to the subsets; grouping the survival curves into a plurality of curve groups according to the similarity of the survival curves; calculating a plurality of grouped survival curves respectively corresponding to the curve groups, and obtaining at least one split time point according to slope changes of the grouped survival curves; and, based on the split time point, using a proportional hazards model to calculate a turnover rate prediction result of the employees corresponding to the influencing variable.
The electronic device for survival analysis of the disclosure includes: a storage that stores at least one program code instruction; and a processor coupled to the storage to execute the at least one program code instruction to implement the turnover rate prediction method.
Based on the above, the disclosure splits a variable at reasonable cut points before evaluating it, which can improve the reference value of the analysis results.
100: electronic device
110: processor
120: storage
121: database
S205~S225: steps of the turnover rate prediction method
301~306: survival curves
401~403: grouped survival curves
T1, T2: split time points
FIG. 1 is a block diagram of an electronic device according to an embodiment of the disclosure.
FIG. 2 is a flowchart of a turnover rate prediction method according to an embodiment of the disclosure.
FIG. 3 is a schematic diagram of survival curves according to an embodiment of the disclosure.
FIG. 4 is a schematic diagram of grouped survival curves according to an embodiment of the disclosure.
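FIG. 1 is a block diagram of an electronic device according to an embodiment of the disclosure. Referring to FIG. 1, the electronic device 100 includes a processor 110 and a storage 120. The processor 110 is coupled to the storage 120. The storage 120 includes a database 121 and at least one program code instruction, and the database 121 stores an employee data set corresponding to a plurality of employees.
The processor 110 is, for example, a central processing unit (CPU), a physics processing unit (PPU), a programmable microprocessor, an embedded control chip, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or another similar device.
The storage 120 is, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk, another similar device, or a combination of these devices. The storage 120 stores the database 121 and one or more program code segments. After the program code segments are installed, the processor 110 executes them to carry out the turnover rate prediction method described below.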
FIG. 2 is a flowchart of a turnover rate prediction method according to an embodiment of the disclosure. Referring to FIG. 2, in step S205, an employee data set corresponding to a plurality of employees is obtained. Next, in step S210, the employees are grouped into a plurality of subsets according to the employee data set and an influencing variable, and a plurality of survival curves respectively corresponding to the subsets are calculated. In some embodiments, the influencing variable is a variable that affects the turnover rate and is a continuous variable, such as age, seniority, overtime hours, or salary.
Specifically, the processor 110 divides the value range of the influencing variable into multiple intervals and divides the employee data set into multiple subsets corresponding to those intervals. In one embodiment, a set value v is used to define the intervals, for example: <N-2×v, N-2×v~N-v, N-v~N, N~N+v, N+v~N+2×v, >N+2×v, where N is a base value greater than 2×v. For example, with N=35000 and v=5000, the intervals are: <25000, 25000~30000, 30000~35000, 35000~40000, 40000~45000, >45000. This is only an example and is not limiting.
Next, the processor 110 divides the employee data set into subsets corresponding to the intervals. For example, assume the influencing variable is "salary" in units of "dollars". The salary can be divided into the following 6 intervals: <25000 (subset A1), 25000~30000 (subset A2), 30000~35000 (subset A3), 35000~40000 (subset A4), 40000~45000 (subset A5), and >45000 (subset A6). The employee data set includes each employee's days in service, salary, age, overtime hours, and other data. Based on these salary intervals, the employee data set is divided into 6 data groups: salary <25000, salary between 25000 and 30000, salary between 30000 and 35000, salary between 35000 and 40000, salary between 40000 and 45000, and salary greater than 45000. These 6 data groups are the subsets A1~A6, respectively.
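The interval construction and subset assignment described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the `salary` field, the sample records, and the rule that a value equal to a cut point falls into the upper interval are all hypothetical, since the text does not specify how boundary values are handled.

```python
from bisect import bisect_right


def interval_edges(base, step):
    """Return the five cut points N-2v, N-v, N, N+v, N+2v."""
    if base <= 2 * step:
        raise ValueError("base value N must be greater than 2*v")
    return [base + k * step for k in (-2, -1, 0, 1, 2)]


def assign_subset(value, edges):
    """Map a value to a subset label A1..A6.

    Assumption: a value equal to a cut point goes to the upper interval.
    """
    return f"A{bisect_right(edges, value) + 1}"


def split_employees(employees, key, base, step):
    """Group employee records into subsets A1..A6 by the chosen variable."""
    edges = interval_edges(base, step)
    subsets = {f"A{i}": [] for i in range(1, len(edges) + 2)}
    for emp in employees:
        subsets[assign_subset(emp[key], edges)].append(emp)
    return subsets


# Hypothetical records, for illustration only.
employees = [
    {"id": 1, "salary": 24000},
    {"id": 2, "salary": 32000},
    {"id": 3, "salary": 46000},
]
subsets = split_employees(employees, "salary", base=35000, step=5000)
for label, members in subsets.items():
    print(label, [emp["id"] for emp in members])
```

Using `bisect_right` keeps the lookup independent of the number of intervals, so the same helper works if a different set value v produces more or fewer cut points.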
After the subsets are obtained, the processor 110 uses a survival analysis algorithm to calculate the survival curves corresponding to the subsets. The survival analysis algorithm is, for example, the Kaplan-Meier method. In detail, the processor 110 uses the survival analysis algorithm to calculate the survival rate at different time points in each of the subsets A1~A6 to obtain the corresponding survival curves.
Table 1 is used as an example below. In Table 1, the subsets A1~A6 correspond to the 6 intervals (<N-2×v, N-2×v~N-v, N-v~N, N~N+v, N+v~N+2×v, >N+2×v), and the survival rates at time points t1~tn are calculated for each subset. For the subset A1, the survival rates S(A1,t1)~S(A1,tn) at time points t1~tn are calculated in the data group of its interval <N-2×v. The other subsets are handled in the same way, giving the survival rates of each of the subsets A1~A6 at time points t1~tn and thus the corresponding survival curves.
The following example illustrates how the survival rate is calculated.
Referring to Table 2, the subset A1 and the data group of its interval <N-2×v are used as an example. Multiple time points t1~tn are set for the subset A1, and the corresponding data group is queried for each time point in turn to obtain the number of employees in service $I_{t-1}$ at time point t-1 and the number of employees $d_t$ who left between time point t-1 and time point t.
Next, the turnover rate H(t) between time point t-1 and time point t is calculated with the following formula: $H(t) = d_t / I_{t-1}$.
Then, the survival rate S(t) at time point t is calculated with the following formula: $S(t) = S(t-1) \times (1 - H(t))$, where $S(0) = 1$.
For time point t1, the data group of the interval <N-2×v (subset A1) is queried to obtain the number of employees in service at time point t0 (=0), $I_{t0}$ = 300, and the number of employees who left between t0 and t1, $d_{t1}$ = 98. The turnover rate is then H(t1) = 98/300 ≈ 0.327, and the survival rate is S(t1) = 1 × (1 − 0.327) = 0.673. The other values in Table 2 are obtained in the same way.
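The two formulas above can be checked with a short Python sketch. It assumes the per-period counts (employees in service at the start of the period, employees who left during it) have already been queried from the data group. The function name and the second period's counts are hypothetical; only the first pair (300, 98) comes from the text.

```python
def survival_rates(records):
    """Compute S(t1..tn) from (in_service, leavers) pairs.

    records[k] = (I_{t-1}, d_t) for the k-th period.
    Uses H(t) = d_t / I_{t-1} and S(t) = S(t-1) * (1 - H(t)), S(0) = 1.
    """
    survival = 1.0
    rates = []
    for in_service, leavers in records:
        # With nobody in service, no one can leave, so the rate is unchanged.
        turnover = leavers / in_service if in_service else 0.0
        survival *= 1.0 - turnover
        rates.append(survival)
    return rates


# First period from the text; second period is a hypothetical continuation.
rates = survival_rates([(300, 98), (202, 40)])
print([round(r, 3) for r in rates])
```

For the first period this gives 1 × (1 − 98/300) ≈ 0.673, matching the worked example.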
After the survival rate at each time point is calculated, the corresponding survival curve can be drawn with time on the horizontal axis and the survival rate on the vertical axis. FIG. 3 is a schematic diagram of survival curves according to an embodiment of the disclosure. In FIG. 3, the survival curves 301~306 correspond to the subsets A1~A6, respectively.
After the survival curve of each subset is obtained, in step S215, the survival curves are grouped into a plurality of curve groups according to their similarity. Each curve group includes at least one survival curve. For example, a clustering algorithm such as hierarchical clustering or the K-means method can be used for the grouping.
In FIG. 3, each of the survival curves 301~306 is treated as a matrix and fed into the clustering algorithm. For example, the survival curves 301 and 302 (subsets A1 and A2) are grouped into the curve group G1, the survival curves 303 and 304 (subsets A3 and A4) are grouped into the curve group G2, and the survival curves 305 and 306 (subsets A5 and A6) are grouped into the curve group G3. Based on the intervals of the subsets, the intervals of the curve groups G1~G3 are therefore <N-v, N-v~N+v, and >N+v, respectively, as shown in Table 3 below.
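As an illustration of grouping survival curves by similarity, the sketch below implements a small average-linkage hierarchical clustering in plain Python. It is one possible reading of the step, not the method fixed by the disclosure: the Euclidean distance, the average linkage, the target number of groups, and the six sample curves are all assumptions chosen for illustration.

```python
import math


def curve_distance(a, b):
    """Euclidean distance between two curves sampled at the same time points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_curves(curves, n_groups):
    """Average-linkage agglomerative clustering of named survival curves.

    curves: dict mapping a subset name to its list of survival rates.
    Returns a list of groups, each a list of subset names.
    """
    groups = [[name] for name in curves]
    while len(groups) > n_groups:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                total = sum(
                    curve_distance(curves[a], curves[b])
                    for a in groups[i]
                    for b in groups[j]
                )
                dist = total / (len(groups[i]) * len(groups[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        groups[i] = groups[i] + groups[j]  # merge the closest pair
        del groups[j]
    return groups


# Hypothetical survival rates at four time points, for illustration only.
curves = {
    "A1": [0.70, 0.55, 0.45, 0.40],
    "A2": [0.72, 0.57, 0.46, 0.41],
    "A3": [0.85, 0.75, 0.68, 0.64],
    "A4": [0.86, 0.77, 0.70, 0.65],
    "A5": [0.95, 0.92, 0.90, 0.89],
    "A6": [0.96, 0.93, 0.91, 0.90],
}
print(cluster_curves(curves, n_groups=3))
```

With these sample curves, adjacent subsets end up in the same group, which mirrors the G1~G3 example. In practice the number of groups would have to be chosen by some criterion that the text does not specify.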
Then, in step S220, a plurality of grouped survival curves respectively corresponding to the curve groups are calculated, and at least one split time point is obtained according to slope changes of the grouped survival curves.
The processor 110 divides the employee data set into data groups corresponding to the curve groups G1~G3. It then uses the survival analysis algorithm to calculate, in each data group, the survival rates at multiple time points, so as to obtain the grouped survival curve of each curve group. The survival rates are calculated as described for Table 2 above.
For the curve group G1, the survival rates S(G1,t1)~S(G1,tn) at time points t1~tn are calculated in the data group of its interval <N-v. The other curve groups are handled in the same way, giving the survival rates of each of the curve groups G1~G3 at time points t1~tn and thus the corresponding grouped survival curves.
For example, FIG. 4 is a schematic diagram of grouped survival curves according to an embodiment of the disclosure. In FIG. 4, the grouped survival curves 401~403 correspond to the curve groups G1~G3, respectively.
After the grouped survival curves 401~403 are obtained, split time points are located in them. In some embodiments, the grouped survival curves 401~403 are first smoothed with a moving average; the slopes at multiple time points are then calculated on each smoothed curve, and the time point with the largest change in slope is taken as the split time point. For example, in the embodiment shown in FIG. 4, the split time point T1 is found in the grouped survival curve 401, and the split time point T2 is found in the grouped survival curve 403.
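The smoothing and slope-change search can be sketched as follows. This is a hedged illustration: the centered moving average that keeps only fully covered windows, the window size of 3, the use of the absolute difference between consecutive slopes, and the sample curve are assumptions, since the text names only the moving average and the largest slope change.

```python
def moving_average(times, values, window=3):
    """Centered moving average; keeps only points with a full window."""
    if window % 2 == 0 or window < 1:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    centers = times[half:len(times) - half]
    smooth = [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]
    return centers, smooth


def split_time_point(times, survival, window=3):
    """Return the time point where the smoothed slope changes the most."""
    t, s = moving_average(times, survival, window)
    slopes = [(s[i + 1] - s[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]
    changes = [abs(slopes[i + 1] - slopes[i]) for i in range(len(slopes) - 1)]
    if not changes:
        raise ValueError("not enough points to measure a slope change")
    k = max(range(len(changes)), key=lambda i: changes[i])
    # changes[k] compares the slopes on either side of t[k + 1].
    return t[k + 1]


# Hypothetical grouped survival curve sampled monthly, for illustration only.
months = list(range(1, 11))
curve = [0.95, 0.85, 0.75, 0.65, 0.60, 0.58, 0.57, 0.56, 0.55, 0.54]
print(split_time_point(months, curve))
```

Dropping the edge points avoids artificial slope changes that a shrinking window would create at the start and end of the curve.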
Finally, in step S225, based on the split time point, a proportional hazards model is used to calculate the turnover rate prediction result of the employees corresponding to the influencing variable. For example, the obtained split time points are used to define time ranges. With the split time points T1 and T2 of FIG. 4, three time ranges can be defined: <T1, T1~T2, and >T2. The proportional hazards model is, for example, the Cox proportional hazards model.
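Turning split time points into time ranges can be sketched as below. The function name, the day-based units, and the choice to place a value equal to a split point in the later range are assumptions; the text gives only the three range labels.

```python
from bisect import bisect_right


def time_range_label(tenure, split_points):
    """Return the label of the time range that a tenure value falls into.

    Assumption: a tenure equal to a split point belongs to the later range.
    """
    pts = sorted(split_points)
    idx = bisect_right(pts, tenure)
    if idx == 0:
        return f"<{pts[0]}"
    if idx == len(pts):
        return f">{pts[-1]}"
    return f"{pts[idx - 1]}~{pts[idx]}"


# Hypothetical split points T1 = 90 days and T2 = 365 days.
for days in (30, 200, 400):
    print(days, time_range_label(days, [90, 365]))
```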
The Cox proportional hazards model is a semi-parametric regression model that can be used to predict the effect of one or more variables on the survival rate at a given time. The formula of the Cox proportional hazards model is as follows: $h(t \mid x) = h_0(t)\,\exp(\beta x)$
Here, β is the regression coefficient; t is the survival time point; h(t|x) is the hazard at the t-th time point given x; $h_0(t)$ is the baseline hazard at the t-th time point, which may be any baseline hazard function; and x = I(influencing variable, seniority) is an indicator that equals 1 when the condition is met and 0 otherwise. For example, assume the influencing variable is "salary" and the condition is seniority < 30, that is, I(salary, seniority < 30); then x = 1 when the condition is met and x = 0 otherwise. The influencing variable may also be age, overtime hours, and so on.
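To make the roles of x, β, and the baseline hazard concrete, here is a minimal Python sketch. It assumes the standard Cox form h(t|x) = h0(t) · exp(βx). The field names, the salary threshold, and the numeric values are hypothetical and serve only to show how the indicator feeds the hazard.

```python
import math


def indicator(employee, condition):
    """x = 1 when the employee meets the condition, otherwise 0."""
    return 1 if condition(employee) else 0


def hazard(baseline_hazard, beta, x):
    """Hazard under the standard Cox form h0(t) * exp(beta * x)."""
    return baseline_hazard * math.exp(beta * x)


# Hypothetical condition combining a salary group and a seniority range.
def low_salary_early(employee):
    return employee["salary"] < 30000 and employee["seniority"] < 30


employee = {"salary": 28000, "seniority": 12}
x = indicator(employee, low_salary_early)

# Hypothetical baseline hazard and coefficient.
print(x, hazard(baseline_hazard=0.02, beta=math.log(2), x=x))
```

When x = 0 the exponential term is 1, so the employee's hazard equals the baseline hazard. When x = 1 the baseline is scaled by exp(β).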
The formula of the Cox proportional hazards model is expanded as follows, where ε is an error term:
First, the employee data set is used to estimate β (including $\beta_1$~$\beta_k$), for example by maximum likelihood estimation (MLE). β is then substituted into the formula to calculate the hazard ratio HR. A larger β indicates a higher turnover risk for the influencing variable $x_i$.
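The estimation step can be illustrated with a small Newton-Raphson fit of a single-covariate Cox model. This is a sketch under stated assumptions, not the disclosure's procedure: it maximizes the Cox partial likelihood (the usual estimation target for this model) with Breslow handling of ties, supports one binary covariate only, and uses made-up tenure data.

```python
import math


def fit_cox_single(durations, events, x, iterations=25, tol=1e-9):
    """Estimate beta for one covariate by Newton-Raphson on the partial likelihood.

    durations: time in service; events: 1 if the employee left, 0 if censored;
    x: covariate value per employee (for example a 0/1 indicator).
    """
    beta = 0.0
    for _ in range(iterations):
        grad = 0.0
        hess = 0.0
        for i, (t_i, e_i) in enumerate(zip(durations, events)):
            if not e_i:
                continue  # censored records contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(durations, x):
                if t_j >= t_i:  # employee j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            grad += x[i] - mean
            hess -= s2 / s0 - mean * mean
        if hess == 0.0:
            break
        step = grad / hess
        beta -= step
        if abs(step) < tol:
            break
    return beta


# Hypothetical data: months in service, all observed departures, indicator x.
durations = [2, 3, 4, 5, 6, 7, 8, 9]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [1, 1, 0, 1, 0, 1, 0, 0]

beta = fit_cox_single(durations, events, x)
print(round(beta, 3), round(math.exp(beta), 3))
```

A positive estimate here means the x = 1 group tends to leave earlier than the x = 0 group in the sample data. For real analyses, an established survival analysis library would normally be preferred over a hand-written optimizer.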
In addition, one of the curve groups can be used as a baseline, and the hazard ratios of the other curve groups can be compared with that of the baseline. For example, the risk level relative to the baseline can be calculated with the following formula: $HR(x = x_1) = \exp(\beta_1)$.
Here $\beta_1$ is obtained from the maximum likelihood estimation above. HR > 1 means that the influencing variable $x_1$ carries a higher turnover risk than the baseline, and HR < 1 means that it carries a lower turnover risk than the baseline.
Table 4 uses the curve group G1 as the baseline to show how much higher the turnover risk of the other curve groups G2 and G3 is than the baseline.
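The baseline comparison can be sketched in a few lines of Python. The coefficient values below are hypothetical placeholders, not values from Table 4, and the function names are illustrative.

```python
import math


def hazard_ratio(beta):
    """HR = exp(beta), relative to the baseline group."""
    return math.exp(beta)


def compare_to_baseline(betas):
    """Map each group to (HR, verdict) relative to the baseline group."""
    report = {}
    for group, beta in betas.items():
        hr = hazard_ratio(beta)
        if hr > 1:
            verdict = "higher"
        elif hr < 1:
            verdict = "lower"
        else:
            verdict = "same"
        report[group] = (hr, verdict)
    return report


# Hypothetical coefficients for G2 and G3 with G1 as the baseline.
for group, (hr, verdict) in compare_to_baseline({"G2": 0.4, "G3": -0.5}).items():
    print(f"{group}: HR={hr:.2f}, turnover risk {verdict} than baseline")
```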
In summary, the disclosure splits the influencing variable at reasonable cut points before evaluating it, which can improve the reference value of the analysis results. Specifically, the disclosure groups the estimated survival curves, which reduces the arbitrariness of manually grouping a continuous variable, and uses the split time points to obtain time-interval (seniority) variables, so that seniority and other influencing variables can be evaluated together and subsequent talent retention policies can be formulated more effectively.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW111114385A TWI807780B (en) | 2022-04-15 | 2022-04-15 | Turnover rate prediction method and electronic apparatus thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW111114385A TWI807780B (en) | 2022-04-15 | 2022-04-15 | Turnover rate prediction method and electronic apparatus thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI807780B true TWI807780B (en) | 2023-07-01 |
TW202343317A TW202343317A (en) | 2023-11-01 |
Family
ID=88149241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW111114385A TWI807780B (en) | 2022-04-15 | 2022-04-15 | Turnover rate prediction method and electronic apparatus thereof |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI807780B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160171398A1 (en) * | 2000-10-17 | 2016-06-16 | Asset Reliance, Inc. | Predictive Model Development System Applied To Enterprise Risk Management |
CN110163418A (en) * | 2019-04-26 | 2019-08-23 | 重庆大学 | A kind of labor turnover behavior prediction method based on survival analysis |
TW202141368A (en) * | 2020-04-23 | 2021-11-01 | 和碩聯合科技股份有限公司 | Electronic device and method for turnover rate prediction |
CN113723689A (en) * | 2021-09-01 | 2021-11-30 | 畅捷通信息技术股份有限公司 | Method, system, terminal and medium for constructing enterprise employee leave prediction model |
- 2022-04-15: TW application TW111114385A, patent TWI807780B, status active
Also Published As
Publication number | Publication date |
---|---|
TW202343317A (en) | 2023-11-01 |