WO2021054905A1 - A machine learning based prediction system and method - Google Patents

A machine learning based prediction system and method

Info

Publication number
WO2021054905A1
WO2021054905A1 PCT/TR2019/050775 TR2019050775W WO2021054905A1 WO 2021054905 A1 WO2021054905 A1 WO 2021054905A1 TR 2019050775 W TR2019050775 W TR 2019050775W WO 2021054905 A1 WO2021054905 A1 WO 2021054905A1
Authority
WO
WIPO (PCT)
Prior art keywords
machine learning
cases
sub
algorithm
realization
Prior art date
Application number
PCT/TR2019/050775
Other languages
French (fr)
Inventor
Şadi Evren ŞEKER
Original Assignee
Bi̇lkav Eği̇ti̇m Danişmanlik Anoni̇m Şi̇rketi̇
Priority date
Filing date
Publication date
Application filed by Bi̇lkav Eği̇ti̇m Danişmanlik Anoni̇m Şi̇rketi̇
Priority to PCT/TR2019/050775
Publication of WO2021054905A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G06Q40/08 Insurance

Definitions

  • the present disclosure relates to a prediction system and method based on feature extraction and machine learning that can be used in all sectors.
  • the present disclosure specifically relates to a system and a method similar to funnel modeling that allows the user to predict both the realization and non realization possibilities of cases and sub-cases that can be defined by the user in different numbers and depending on different circumstances.
  • FIG. 1 shows a schematic view of the marketing funnel traditionally used.
  • scoring is referred to as a field of application of the concepts of regression or prediction.
  • the reputation of a digital asset can be considered as the credit score of an individual or organization, or a risk score in the insurance field, or the scoring of the probability of an employee leaving.
  • scoring allows ranking because all of the values in an index are scored. For example, scoring and ranking of all institutions are enough to find the most reputable institution.
  • a manual formula or rule-based system works. For example, when calculating a person's digital reputation, values from different social media platforms (facebook, twitter, Google trends, etc.) are multiplied by different weights to calculate the total score.
  • each rule base is scanned for different events (for example, transaction amounts, such as when the number of purchases on the same day and from same institution exceeds a certain value), and then if these events occur -which is a rule- risk score is calculated.
  • a red, an orange or a yellow alarm is issued and depending on the alarm level, the customer is called, texted or no action is taken.
  • scoring is done by running a separate algorithm at each stage.
  • the marketing funnel in the previous stage scores the conditions that customers provide in certain rules (for example, checking events/rules such as when a visitor to the site purchases something or re-purchases made by someone who has previous purchases) and historical data is analyzed.
  • a single algorithm checks the level of the customer on the funnel.
  • the current technique does not involve a method of scoring for cases that can be defined depending on the flexible number and flexible conditions, similar to funnel structuring, and for sub-cases that are linked to the said cases.
  • Complex event processing is used to analyze events from a single center and control the relationship between events to facilitate the management of multiple different events. For example, in a marketing campaign, gifting a movie ticket for people flying with a specific airline company and taking coffee from a specific coffee shop requires the verification and management of multiple complex events from one center.
  • Complex event processing systems with a visual interface include interfaces customized to a specific area. None of these interface solutions offer scoring and prediction algorithms.
  • the general purpose of complex event processing software (and algorithms) is to control events from a single center and most do not offer prediction with artificial intelligence. The solutions offered are generally for management purposes.
  • the invention proposes a system that allows customers to be predicted to end their purchase of services/products from the company by analyzing the data about a company's customers with machine learning methods, and also to prevent the abandonment of customers who are likely to leave with personalized offers using machine learning and data analysis methods, and to increase their loyalty.
  • the said invention comprises a data collection unit, data warehouse, analysis unit, proposal unit, reporting unit, management, and observation unit.
  • a new and original scoring algorithm for each case and associated sub-cases in said system.
  • the present disclosure relates to a machine learning based prediction system and method that meets the above-mentioned requirements, eliminates all disadvantages and brings some additional advantages.
  • the main purpose of the invention is to develop a machine learning-based prediction system and method that predicts the possibility of the realization and non-realization of user-prepared conditions in a flexible manner and predicts the possibility of the realization and non-realization of later sub-cases depending on each other among these predicted realizations.
  • Another purpose of the invention is to develop a machine learning-based prediction system and method in which results can be visualized for users.
  • Another purpose of the invention is to develop a machine learning-based prediction system and method in which the machine learning algorithm to be used can be selected either automatically or by the user.
  • AutoML: automated machine learning
  • Another purpose of the invention is to develop a machine learning-based prediction system and method in which transformations are automatically provided by extracting the features to be used by the pre-processing modules.
  • Another purpose of the invention is to develop a machine learning-based prediction system and method in which new algorithms can be selected by processing new parameters in the prediction of scores of sub-cases linked to the defined case.
  • the invention is a machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, characterized by comprising at least one interface that allows the flexible organization of the possible cases and associated sub-cases of business events and visual presentation of prediction results to the user,
  • At least one pre-processing module that determines the features to be used in machine learning by processing data within the said data source and the machine learning method that is appropriate to the said features
  • the invention is a machine learning-based prediction method that can be used in different sectors, allowing to predict the probability of occurring for each defined event over the fiction created on an interface by the user, characterized by comprising; a) directing/routing the user through an interface to the data loading screen for loading the relevant data to a data source, b) following the completion of the data loading, directing the user to the identification screen to identify the cases and the sub cases of the said cases, which may have multiple layers, c) determining the features to be used in machine learning by the pre-processing modules after the user has defined said cases and sub-cases over the data loaded into the said data source, d) selecting the most appropriate algorithm for machine learning through the feature selection, e) optimizing the selected algorithm parameters, f) prediction of the possibilities of realization and/or non realization of the defined cases with the realization of machine learning by means of the selected algorithm, selected features, and previous transitions, g) visual representation of the defined cases and predicted possibilities to the user on the interface, h) if at least one sub-case
  • Figure - 1 A schematic view of the marketing funnel.
  • Figure - 2 A schematic flow diagram of the machine learning approach.
  • Figure - 3 A schematic view of the elements of the machine learning based prediction system of the invention.
  • Figure - 4 A representation of the interface of a machine learning based prediction system of the invention for an exemplary human resources fiction.
  • Figure - 5 A schematic representation of the working interface of a machine learning based prediction system of the invention for an exemplary human resources fiction.
  • Figure - 6 A schematic view of the funnel structure of the machine learning based prediction system of the invention.
  • Figure - 7 A representative view of the use of the interface used to visualize the results of the machine learning-based prediction system of the invention in automated machine learning.
  • the drawings are not necessarily to scale, and details that are not necessary for understanding the invention may be omitted. Other than that, elements that are substantially identical, or at least have substantially identical functions, are denoted by the same number.
  • a machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, comprising: at least one interface (3) that allows the flexible organization of the possible cases and associated sub-cases of business events and visual presentation of prediction results to the user, at least one data source (1) that enables data to be loaded into the system for the realization of machine learning, at least one pre-processing module (2) that determines, by processing data within the said data source (1), the features to be used in machine learning and the machine learning method that is appropriate to the said features, and at least one algorithm (4) that predicts the probability of the defined events occurring by means of the said features.
  • In order for the system to work, the user first loads the data into the system from the data loading screen. The user then defines the cases, the associated sub-cases and the connections between them in a flexible manner, similar to the funnel structure. Furthermore, during the creation of the organizational layers, the automated machine learning (AutoML) process is visualized on the interface (3). The data is processed quickly via AutoML and predictions are made.
  • in the machine learning-based prediction method, which can be used in different sectors and allows the probability of occurring or not occurring to be predicted for each defined event through the fiction created on an interface (3) by the user, the user is first directed through the interface (3) to the data loading screen for loading the relevant data to a data source (1).
  • events can be added to the fiction by connecting to a data source (1) and creating a connection layer.
  • the user is directed to the identification screen to identify the cases and the sub-cases of the said cases, which may have multiple layers.
  • the features to be used in machine learning are determined by the pre-processing modules (2) over the data loaded into said data source (1), and the most appropriate algorithm (4) for machine learning is selected depending on the feature selection.
  • the pre-processing modules (2) both extract critical information for the selection of the algorithm (4) and automatically perform the transformations which will increase the success of the algorithm (4).
  • the machine learning algorithm to be used (4) can also be selected by the user through the interface (3).
  • the section up to this point may be called preprocessing.
  • the parameters of the selected algorithm (4) are optimized by examining the available data from the available algorithm library. Hyperparameter optimization is used to optimize the parameters of the algorithms in a preferred embodiment of the invention.
  • the realization and/or non-realization possibilities of the defined cases are predicted, and the identified cases and the predicted possibilities are presented to the user visually on the interface (3).
  • the algorithm (4) most appropriate to the sub-case for machine learning is selected again, the realization and/or non-realization probabilities of the sub-cases in the respective layer are predicted, and the sub-cases and their predicted possibilities are presented visually to the user on the interface (3).
  • At least one sub-case is defined for the said case which can be nested in multiple layers
  • the algorithm (4) most appropriate to the sub-case for machine learning is re-selected
  • the realization and/or non-realization probabilities of the sub-cases in the respective layer are predicted
  • the sub-cases and their predicted possibilities are presented visually to the user on the interface (3).
  • For the determination of features at the pre-processing level, the following are used: PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), feature elimination (backward and forward), where the p-value is used and the significance level is checked, the correlation matrix, where the rho value is checked, missing-value imputation with the mean or mode (not used if LightGBM is available in later steps), date conversions (automatic recognition of the date field and extraction of the day of the week, month, and day), and segmentation/clustering algorithms.
  • the selected algorithm (4) is carried into the machine learning stage using the previous transitions between the events.
  • algorithms such as KNN (k-nearest neighbor), linear regression, XGBoost, LightGBM, random forest, support vector regression, and decision tree regression are used, together with hyperparameter optimization.
  • both previous events and predicted future transitions are presented visually via the interface (3).
  • the probability (and/or non-probability) of transition between these events for the future is calculated and visualized by means of the interface (3).
  • the outputs to be obtained from here can be automatically used for different purposes (For example, campaigning or texting to customers who are not considered to be going to the next stage).
  • Each stage produced between all events is predicted with a special scoring algorithm and the algorithm produces different outputs at each stage.
  • the feature extraction, algorithm selection, machine learning, scoring and visualization steps described above are performed repeatedly for the scoring process when each funnel layer is opened and can be run consecutively.
  • the probability of a condition which is prepared in a completely flexible way is predicted for the future, and, among those predicted to realize it, it is predicted who will realize the next possibility. That is, each step of the funnel returns scored results that feed the next step, so that a new and original scoring algorithm works at each step for the next one.
  • the machine learning based prediction method of the invention can be explained with an exemplary scenario of human resources.
  • a user defines the human resources process with some complex events. Let these events be given as follows:
  • a sample customer scoring in the telecommunications field can be as follows:
  • the setups mentioned may vary entirely according to the user's description of the cases and their connected, multiple sub-cases; each time, a new algorithm (4) is selected for the said cases and their connected sub-cases, and the probability of both realization and non-realization of the cases and sub-cases is predicted.
  • machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not and a machine learning-based prediction method that can be used in different sectors, allowing to predict the probability of occurring for each defined event over the fiction created on an interface (3) by the user, which use funnel approach, scoring and complex event processing approaches.
  • The mentioned system and method visualize past and future events using machine learning/artificial intelligence and interpretability.
  • the mentioned system and method can control data flow from multiple events from a single center (this is descriptive analytics) while also calculating and visualizing the probability of transition between these events (or the probability of not passing).
  • The outputs can even be used automatically for different purposes (e.g. campaigning or sending messages to customers who are not expected to go to the next step).
  • Each step produced between all events is estimated with a special scoring algorithm and the algorithm can produce different outputs at each stage.
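Two of the pre-processing steps named in the list above, filling missing values with the mean or mode and expanding a date field into day of the week, month, and day, can be illustrated with a minimal sketch. It uses only the Python standard library; the function names `impute` and `expand_date` and the ISO date format are assumptions made for illustration and are not taken from the disclosure.

```python
from datetime import date
from statistics import mean, mode
from typing import Dict, List, Union

Value = Union[float, str, None]


def impute(column: List[Value]) -> List[Value]:
    """Fill None with the mean (numeric column) or the mode (categorical column)."""
    present = [v for v in column if v is not None]
    if not present:
        return list(column)  # nothing to learn a fill value from
    numeric = all(isinstance(v, (int, float)) for v in present)
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in column]


def expand_date(iso_text: str) -> Dict[str, int]:
    """Turn an ISO date string (assumed format) into weekday, month and day features."""
    d = date.fromisoformat(iso_text)
    # weekday(): Monday is 0 and Sunday is 6
    return {"weekday": d.weekday(), "month": d.month, "day": d.day}
```

For example, `impute([1.0, None, 3.0])` fills the gap with the column mean, while a column of strings is filled with its most frequent value.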

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure is a machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, comprising: at least one interface (3) that allows the flexible organization of the possible cases and associated sub-cases of business events and visual presentation of prediction results to the user, at least one data source (1) that enables data to be loaded into the system for the realization of machine learning, at least one pre-processing module (2) that determines, by processing data within the said data source (1), the features to be used in machine learning and the machine learning method that is appropriate to the said features, and at least one algorithm (4) that predicts the probability of the defined events occurring by means of the said features.

Description

A Machine Learning Based Prediction System and Method
Technical Field
The present disclosure relates to a prediction system and method based on feature extraction and machine learning that can be used in all sectors.
The present disclosure specifically relates to a system and a method similar to funnel modeling that allows the user to predict both the realization and non realization possibilities of cases and sub-cases that can be defined by the user in different numbers and depending on different circumstances.
Prior Art
The marketing funnel, scoring methods, and complex event processing methods used in the prior art are described below.
Today, the marketing funnel used to visualize customers at different levels can consist of a different number of layers depending on the situation. The said marketing funnel is one of the most preferred representations for products in the literature and in the marketing industry. Figure 1 shows a schematic view of the marketing funnel traditionally used. However, there are no scoring and/or prediction algorithms in the traditional funnel representation used to visualize customers. A schematic representation of the funnel structure used in the machine learning prediction system and method of the invention is given in Figure 6.
In the current technique, scoring is referred to as a field of application of the concepts of regression or prediction. For example, the reputation of a digital asset can be considered as the credit score of an individual or organization, or a risk score in the insurance field, or the scoring of the probability of an employee leaving. The advantage of scoring is that it allows ranking because all of the values in an index are scored. For example, scoring and ranking of all institutions are enough to find the most reputable institution. In the scoring field, in most cases, a manual formula or rule-based system works. For example, when calculating a person's digital reputation, values from different social media platforms (Facebook, Twitter, Google Trends, etc.) are multiplied by different weights to calculate the total score. Similarly, to determine whether or not a person has committed credit card fraud, each rule base is scanned for different events (for example, transaction amounts, or the number of purchases on the same day and from the same institution exceeding a certain value), and if these events occur, that is, if a rule is triggered, a risk score is calculated. Accordingly, a red, an orange or a yellow alarm is issued and, depending on the alarm level, the customer is called, texted or no action is taken.
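The manual, rule-based scoring described above can be sketched in a few lines. This is only an illustration of the prior-art approach, not of the invention: the weight values, the purchase limit, the cut-offs, and the pairing of each alarm colour with an action are all assumptions, since the text names the platforms and the alarm levels but gives no numbers.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; the text names the platforms but gives no values.
WEIGHTS: Dict[str, float] = {"facebook": 0.5, "twitter": 0.3, "google_trends": 0.2}


def reputation_score(values: Dict[str, float]) -> float:
    """Weighted sum of per-platform values; a missing platform counts as 0."""
    return sum(w * values.get(name, 0.0) for name, w in WEIGHTS.items())


def rank(entities: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Score every entity, then sort from most to least reputable."""
    scored = [(name, reputation_score(vals)) for name, vals in entities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def fraud_alarm(same_day_purchases: int, limit: int = 3) -> Tuple[str, str]:
    """Map a rule-based risk score to (alarm colour, action).

    The risk score here is simply how far the purchase count exceeds the
    limit; the limit, the cut-offs and the colour-to-action pairing are
    assumptions made for this sketch.
    """
    excess = same_day_purchases - limit
    if excess >= 5:
        return "red", "call"
    if excess >= 2:
        return "orange", "text"
    if excess >= 1:
        return "yellow", "no action"
    return "none", "no action"
```

Because every entity receives a score, finding the most reputable one reduces to taking the first element of `rank(...)`, which is the ranking property the paragraph above describes.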
Nowadays, scoring is done by running a separate algorithm at each stage. For example, the marketing funnel in the previous stage scores the conditions that customers meet under certain rules (for example, checking events/rules such as a visitor to the site purchasing something, or a re-purchase made by someone who has previous purchases) and historical data is analyzed. In solutions that determine how valuable a customer is, a single algorithm checks the level of the customer on the funnel. However, the current technique does not involve a method of scoring for cases that can be defined in a flexible number and under flexible conditions, similar to funnel structuring, and for sub-cases that are linked to the said cases.
Complex event processing (CEP) is used to analyze events from a single center and control the relationship between events to facilitate the management of multiple different events. For example, in a marketing campaign, gifting a movie ticket for people flying with a specific airline company and taking coffee from a specific coffee shop requires the verification and management of multiple complex events from one center. Complex event processing systems with a visual interface include interfaces customized to a specific area. None of these interface solutions offer scoring and prediction algorithms. The general purpose of complex event processing software (and algorithms) is to control events from a single center and most do not offer prediction with artificial intelligence. The solutions offered are generally for management purposes.
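The campaign example above, a movie ticket for people who both flew with a specific airline and bought coffee from a specific shop, amounts to correlating several event streams per customer in one place. The sketch below shows that idea only; the event names, the `(customer, event)` tuple shape, and the function name are hypothetical, and a real CEP engine would also handle time windows and ordering.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, str]  # (customer id, event type)

# Hypothetical event names standing in for "a specific airline" and
# "a specific coffee shop".
REQUIRED: Set[str] = {"flight:AirlineX", "coffee:ShopY"}


def eligible_customers(events: Iterable[Event],
                       required: Set[str] = REQUIRED) -> List[str]:
    """Return customers for whom every required event has been observed."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for customer, kind in events:
        seen[customer].add(kind)
    # A customer qualifies when the required set is a subset of what was seen.
    return sorted(c for c, kinds in seen.items() if required <= kinds)
```

Note that this only verifies and manages events; as the paragraph states, it performs no scoring or prediction.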
In the prior art application TR 2016/20097, a system for predicting and preventing customer abandonment is described. The invention proposes a system that predicts which customers will stop purchasing services/products from a company by analyzing the data about the company's customers with machine learning methods, and that also prevents the abandonment of customers who are likely to leave, and increases their loyalty, with personalized offers using machine learning and data analysis methods. The said invention comprises a data collection unit, data warehouse, analysis unit, proposal unit, reporting unit, management, and observation unit. However, there is no mention of the operation of a new and original scoring algorithm for each case and associated sub-cases in said system.
As a result, the inadequacy of existing solutions necessitated an improvement in the relevant technical field.
The Purpose of Invention
The present disclosure relates to a machine learning based prediction system and method that meets the above-mentioned requirements, eliminates all disadvantages and brings some additional advantages.
The main purpose of the invention is to develop a machine learning-based prediction system and method that predicts the possibility of the realization and non-realization of user-prepared conditions in a flexible manner and predicts the possibility of the realization and non-realization of later sub-cases depending on each other among these predicted realizations.
Another purpose of the invention is to develop a machine learning-based prediction system and method in which results can be visualized for users.
Another purpose of the invention is to develop a machine learning-based prediction system and method in which the machine learning algorithm to be used can be selected either automatically or by the user.
Another purpose of the invention is to develop a machine learning-based prediction system and method that can perform high accuracy prediction via feedback from parameters used with hyperparameter optimization.
Another purpose of the invention is to develop a machine learning-based prediction system and method that can quickly process data through the use of the automated machine learning (AutoML) method.
Another purpose of the invention is to develop a machine learning-based prediction system and method in which transformations are automatically provided by extracting the features to be used by the pre-processing modules.
Another purpose of the invention is to develop a machine learning-based prediction system and method in which new algorithms can be selected by processing new parameters in the prediction of scores of sub-cases linked to the defined case.
In order to achieve the objectives mentioned above, the invention is a machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, characterized by comprising;
• at least one interface that allows the flexible organization of the possible cases and associated sub-cases of business events and visual presentation of prediction results to the user,
• at least one data source that enables data to be loaded into the system for the realization of machine learning,
• at least one pre-processing module that determines the features to be used in machine learning by processing data within the said data source and the machine learning method that is appropriate to the said features,
• at least one algorithm that enables the probability of events being defined to occur by means of the said features.
In addition, the invention is a machine learning-based prediction method that can be used in different sectors, allowing the probability of occurring to be predicted for each defined event over the fiction created on an interface by the user, characterized by comprising;
a) directing/routing the user through an interface to the data loading screen for loading the relevant data to a data source,
b) following the completion of the data loading, directing the user to the identification screen to identify the cases and the sub-cases of the said cases, which may have multiple layers,
c) determining the features to be used in machine learning by the pre-processing modules after the user has defined said cases and sub-cases over the data loaded into the said data source,
d) selecting the most appropriate algorithm for machine learning through the feature selection,
e) optimizing the selected algorithm parameters,
f) prediction of the possibilities of realization and/or non-realization of the defined cases with the realization of machine learning by means of the selected algorithm, selected features, and previous transitions,
g) visual representation of the defined cases and predicted possibilities to the user on the interface,
h) if at least one sub-case is defined for the said cases, which may have multiple layers, when the sub-case is opened through the interface, it comprises,
• selecting the most appropriate algorithm for machine learning through the feature selection,
• prediction of the realization and/or non-realization of the sub-cases in the related layer, and visual representation of the sub-cases and predicted possibilities to the user on the interface,
• if at least one more sub-case is defined for the said sub-case, which may have multiple layers, when the sub-case is opened through the interface, it comprises,
o selecting the most appropriate algorithm for machine learning through the feature selection,
o prediction of the realization and/or non-realization of the sub-cases in the related layer, and visual representation of the sub-cases and predicted possibilities to the user on the interface,
o rechecking if at least one sub-case is defined for at least one said sub-case, which may have multiple layers.
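The layered structure of step h), where each opened sub-case is scored again and then checked for further sub-cases, is naturally recursive. The sketch below shows only that control flow. The `estimate` function is a deliberately simple stand-in (a historical frequency) for the feature selection, algorithm selection, and machine learning of steps c) to f); the `Case` class, the record layout, and the case names used in the test are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Record = Dict[str, bool]  # one historical row: case name -> was it realized?
Score = Tuple[str, int, float, float]  # (case, layer, p_realized, p_not_realized)


@dataclass
class Case:
    name: str
    sub_cases: List["Case"] = field(default_factory=list)


def estimate(case: Case, rows: List[Record]) -> float:
    """Stand-in predictor: share of rows in which the case was realized."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(case.name, False)) / len(rows)


def score_funnel(case: Case, rows: List[Record], layer: int = 0) -> List[Score]:
    """Score a case, then recurse into its sub-cases layer by layer."""
    p = estimate(case, rows)
    results: List[Score] = [(case.name, layer, p, 1.0 - p)]
    # Only rows that realized this case feed the next layer of the funnel.
    passed = [r for r in rows if r.get(case.name, False)]
    for sub in case.sub_cases:
        results.extend(score_funnel(sub, passed, layer + 1))
    return results
```

Each layer reports both the realization and the non-realization probability, and the rows passed downward shrink at every step, which mirrors the funnel shape.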
The structural and characteristic features and all advantages of the invention will be understood clearly from the figures below and from the detailed description written with reference to these figures; the evaluation should therefore take these figures and the detailed description into consideration.
Brief Description of the Figures
In order to be able to understand the advantages of the present invention together with the additional elements, it is necessary to evaluate it with the figures explained below.
Figure - 1 : A schematic view of the marketing funnel.
Figure - 2: A schematic flow diagram of the machine learning approach.
Figure - 3: A schematic view of the elements of the machine learning based prediction system of the invention.
Figure - 4: A representation of the interface of a machine learning based prediction system of the invention for an exemplary human resources fiction.
Figure - 5: A schematic representation of the working interface of a machine learning based prediction system of the invention for an exemplary human resources fiction.
Figure - 6: A schematic view of the funnel structure of the machine learning based prediction system of the invention.
Figure - 7: A representative view of the use of the interface used to visualize the results of the machine learning-based prediction system of the invention in automated machine learning.
The drawings do not necessarily have to be scaled, and the details that are not necessary to understand the invention may be neglected. Other than that, elements that are substantially identical, or at least have substantially identical functions, are denoted by the same number.
Reference Numbers
1. Data Source
2. Pre-Processing Module
3. Interface
4. Algorithm
Detailed Description of the Invention
In this detailed description, the machine learning based prediction system and method of the invention will be explained through examples only for a better understanding of the subject matter and without any restrictive effect.
A machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not comprises: at least one interface (3) that allows the flexible organization of the possible cases and associated sub-cases of business events and the visual presentation of prediction results to the user; at least one data source (1) that enables data to be loaded into the system for the realization of machine learning; at least one pre-processing module (2) that determines, by processing data within the said data source (1), the features to be used in machine learning and the machine learning method that is appropriate to the said features; and at least one algorithm (4) that enables the probability of occurrence of the defined events to be predicted by means of the said features.
In order for the system to work, the user first loads the data into the system from the data loading screen. In the next stage, the user defines the cases, the associated sub-cases and the connections between them in a flexible manner, similar to the funnel structure. Furthermore, while the organizational layers are being created, the interface (3) visualizes the automated machine learning (AutoML) process. The data is processed quickly via AutoML and predictions are made.
In the machine learning-based prediction method, which can be used in different sectors and allows the probability of occurrence or non-occurrence to be predicted for each defined event through the fiction created by the user on an interface (3), the user is first directed through the interface (3) to the data loading screen in order to load the relevant data for machine learning to a data source (1). In a preferred embodiment of the invention, events can be added to the fiction by connecting to a data source (1) and creating a connection layer.
Following the completion of the data loading, the user is directed to the identification screen to identify the cases and the sub-cases of the said cases, which may have multiple layers. After said cases and sub-cases have been defined, the features to be used in machine learning are determined by the pre-processing modules (2) over the data loaded into said data source (1), and the most appropriate algorithm (4) for machine learning is selected depending on the feature selection. In a preferred embodiment of the invention, during the selection of the features, the pre-processing modules (2) both extract critical information for the selection of the algorithm (4) and automatically perform the transformations which will increase the success of the algorithm (4). For example, a pre-processing module (2) automatically detects that a column has date-type data, and then automatically extracts information from those dates, such as the day of the week, the day of the month, the day of the year, the month, and the year. In addition, in a preferred embodiment of the invention, the machine learning algorithm (4) to be used can also be selected by the user through the interface (3). The section up to this point may be called preprocessing.
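The date handling described above can be illustrated with a minimal sketch. It is not the implementation of the invention: the input format `%Y-%m-%d`, the table layout (column name mapped to a list of strings) and all function names are assumptions made only for illustration.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # assumed input format; the source does not specify one
DATE_KEYS = ("day_of_week", "day_of_month", "day_of_year", "month", "year")


def parse_date(value):
    """Return a date if value matches DATE_FORMAT, otherwise None."""
    try:
        return datetime.strptime(value, DATE_FORMAT).date()
    except (TypeError, ValueError):
        return None


def is_date_column(values):
    """Treat a column as date-typed when every non-empty value parses."""
    non_empty = [v for v in values if v not in (None, "")]
    return bool(non_empty) and all(parse_date(v) is not None for v in non_empty)


def expand_date(d):
    """Derive the calendar features named in the description."""
    return {
        "day_of_week": d.isoweekday(),  # 1 = Monday ... 7 = Sunday
        "day_of_month": d.day,
        "day_of_year": d.timetuple().tm_yday,
        "month": d.month,
        "year": d.year,
    }


def expand_columns(table):
    """Replace each detected date column by one derived column per feature."""
    out = {}
    for name, values in table.items():
        if not is_date_column(values):
            out[name] = values
            continue
        rows = [expand_date(d) if d else None
                for d in (parse_date(v) for v in values)]
        for key in DATE_KEYS:
            out[f"{name}_{key}"] = [row[key] if row else None for row in rows]
    return out
```

In this sketch a column counts as a date column only if all of its non-empty values parse, which is a deliberately strict rule; a real pre-processing module would probably tolerate several formats.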
Then, the parameters of the algorithm (4) selected from the available algorithm library are optimized by examining the available data. Hyperparameter optimization is used to optimize the parameters of the algorithms in a preferred embodiment of the invention.
Following the optimization process, machine learning is performed by means of the selected algorithm (4), the selected features and the previous transitions; the possibilities of realization and/or non-realization of the defined cases are predicted, and the defined cases and the predicted possibilities are presented to the user visually on the interface (3).
If at least one sub-case is defined for the said cases, which can be nested in multiple layers, then when the said sub-case is opened through the interface (3), the algorithm (4) most appropriate to the sub-case is selected again for machine learning, the realization and/or non-realization probabilities of the sub-cases in the respective layer are predicted, and the sub-cases and their predicted possibilities are presented visually to the user on the interface (3).
If at least one more sub-case is defined for the said sub-case, which can be nested in multiple layers, then when that sub-case is opened through the interface (3), the algorithm (4) most appropriate to the sub-case is re-selected for machine learning, the realization and/or non-realization probabilities of the sub-cases in the respective layer are predicted, and the sub-cases and their predicted possibilities are presented visually to the user on the interface (3). Whether at least one further sub-case is defined for at least one said sub-case, which can have more than one layer, is then re-checked; if so, selecting the most appropriate algorithm (4), calculating the possibility of realization of the sub-case and visualizing it continue down to the innermost sub-case, provided the user chooses to open it.
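As a hedged illustration of hyperparameter optimization, the sketch below tunes the single parameter k of a one-dimensional k-nearest-neighbor rule by exhaustive search with leave-one-out validation. The search strategy, the validation scheme and the function names are assumptions; the source does not state which optimization procedure is used.

```python
def knn_predict(train, x, k):
    """Estimate P(event) for x as the share of positives among the k nearest rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def loo_accuracy(data, k):
    """Leave-one-out accuracy of the k-NN rule with a 0.5 decision threshold."""
    hits = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out row i
        predicted = 1 if knn_predict(rest, x, k) >= 0.5 else 0
        hits += predicted == label
    return hits / len(data)


def grid_search(data, candidate_ks):
    """Return the candidate k with the best leave-one-out accuracy (first wins ties)."""
    return max(candidate_ks, key=lambda k: loo_accuracy(data, k))
```

Each row of `data` is a pair `(feature_value, label)` with a 0/1 label. The same loop structure extends to several hyperparameters by iterating over combinations of candidate values.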
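The layer-by-layer behaviour described above is essentially a recursive walk over a tree of cases. The following sketch assumes a hypothetical `Case` structure and a `predict` callable that stands in for the per-layer algorithm selection and training; neither name comes from the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Case:
    name: str
    sub_cases: List["Case"] = field(default_factory=list)


def open_case(case, predict, opened, depth=0):
    """Score the sub-cases one layer below `case`; recurse only into opened ones.

    Returns tuples (layer, name, P(realization), P(non-realization)).
    """
    results = []
    for sub in case.sub_cases:
        p = predict(sub)  # placeholder for selecting and training a model for this layer
        results.append((depth + 1, sub.name, p, 1.0 - p))
        # Re-check for a further layer, and descend only if the user opened it.
        if sub.sub_cases and sub.name in opened:
            results.extend(open_case(sub, predict, opened, depth + 1))
    return results
```

Passing the set of opened case names keeps deeper layers from being scored until the user opens them, which mirrors the on-demand behaviour the description attributes to the interface.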
For the determination of features at the pre-processing level, the following are used: PCA (Principal Component Analysis); LDA (Linear Discriminant Analysis); backward and forward feature elimination, where the p-value is used and the significance level is checked; a correlation matrix, where the rho value is checked; missing-value imputation, for which the mean or mode is used (but which is not applied if LightGBM is used in later steps); date conversions (automatic recognition of date fields and extraction of the day of the week, month and day); and segmentation/clustering algorithms. After the feature selection process is completed, the selected algorithm (4) is carried into the machine learning stage using the previous transitions between the events.
For the machine learning operation, algorithms such as KNN (k-nearest neighbors), linear regression, XGBoost, LightGBM, random forest, support vector regression and decision tree regression are used, together with hyperparameter optimization. However, the present disclosure is not limited to the above-mentioned algorithms. After machine learning has been performed, the available data are analyzed and the algorithm (4) predicts who will proceed to the later event, and with what probability, so that scoring is performed.
As the final stage after the scoring, both previous events and predicted future transitions are presented visually via the interface (3). In addition to descriptive-analytic control of the data flow resulting from more than one event, the probability of transition (and/or of non-transition) between these events in the future is calculated and visualized by means of the interface (3). The outputs obtained here can also be used automatically for different purposes (for example, campaigning or texting customers who are not expected to go on to the next stage). Each stage between the events is predicted with its own scoring algorithm, and the algorithm produces different outputs at each stage. The feature extraction, algorithm selection, machine learning, scoring and visualization steps described above are performed again for the scoring process each time a funnel layer is opened, and can be run consecutively.
In the machine learning based prediction method of the present invention, the probability of a condition that has been set up in a completely flexible way is predicted for the future, and among those predicted to realize it, it is predicted who will realize the next possibility. That is, each step of the funnel returns scored results that feed the next step, so that a new and original scoring algorithm works for the next step at each step.
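The missing-value step can be sketched as follows. The rule "mean for numeric columns, mode otherwise" and the flag `model_handles_missing` are assumptions for illustration; the flag reflects the remark that imputation is skipped when a model such as LightGBM, which can handle missing values itself, is used later.

```python
from statistics import mean, mode


def impute(values, model_handles_missing=False):
    """Fill None entries with the mean (numeric column) or the mode (otherwise)."""
    if model_handles_missing:
        return list(values)  # leave gaps for the downstream model
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to learn a fill value from
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]
```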
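How one algorithm might be chosen among several candidates can be sketched by fitting each candidate and comparing validation error. The two toy models below (a constant base rate and a single-split stump) merely stand in for the algorithms listed above, and the Brier score is an assumed selection criterion that the source does not name.

```python
def brier_score(predict, rows):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((predict(x) - y) ** 2 for x, y in rows) / len(rows)


def fit_base_rate(train):
    """Toy model 1: always predict the overall share of positives."""
    rate = sum(y for _, y in train) / len(train)
    return lambda x: rate


def fit_stump(train):
    """Toy model 2: separate positive rates below and above the mean of x."""
    cut = sum(x for x, _ in train) / len(train)

    def rate(rows):
        return sum(y for _, y in rows) / len(rows) if rows else 0.5

    low = rate([r for r in train if r[0] <= cut])
    high = rate([r for r in train if r[0] > cut])
    return lambda x: low if x <= cut else high


def select_algorithm(candidates, train, valid):
    """Fit every candidate and keep the one with the lowest validation Brier score."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: brier_score(fitted[name], valid))
    return best, fitted[best]
```

The returned model maps a feature value to a probability, so its output can serve directly as the score of a record for the next event.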
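The way each funnel step feeds the next can be sketched as a chain of scorers, where only the records predicted to pass one step are scored for the following step. The threshold of 0.5, the record layout and the name `run_funnel` are illustrative assumptions.

```python
def run_funnel(population, steps, threshold=0.5):
    """Run scorers in sequence; each step sees only the survivors of the previous one.

    steps: list of (step_name, scorer), where scorer maps a record to the
    predicted probability of reaching that step.
    """
    survivors = list(population)
    report = []
    for name, scorer in steps:
        scored = [(record, scorer(record)) for record in survivors]
        survivors = [record for record, p in scored if p >= threshold]
        report.append((name, len(scored), len(survivors)))
    return report, survivors
```

Each entry of the report gives the step name, how many records were scored and how many were predicted to proceed, which is the kind of per-layer summary a funnel view would display.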
The machine learning based prediction method of the invention can be explained with an exemplary scenario of human resources. A user defines the human resources process with some complex events. Let these events be given as follows:
• Publication of the job advertisement
• Evaluation of job applications / interview
• Recruitment
• Learning and development processes of employees
• Performance reviews
• Retirement
The steps above can be visualized using the funnel approach (an entirely different arrangement of events or sequences can also be set up for human resources, and the funnel approach works there as well). Figure 4 also shows that at each step there will be people who continue to the next step and people who leave the main flow by not doing so. It is possible to score these abandonments. In addition, the method can issue risk-management or team-leader-specific scores based on the scoring results. In Figure 5, in addition to these additional scorings, the step-by-step operation of the whole system is visualized. For example, after the job announcement, a separate scoring algorithm works to predict who will come to the interview. A new scoring model that predicts who will be recruited within the cluster derived from this prediction is also produced by machine learning. The method then predicts anew what the performance of those employed will be and, depending on that performance, who will be fired, who will retire or who will resign, completely independently of the previous scoring models. In this way, scoring algorithms based on re-prediction are produced at each step and the results are interpreted specifically for that step.
For the machine learning-based prediction method of the invention, a different example may be given for customer scoring. Most sales channels have now become subscription-based. For example, whereas Microsoft used to bundle and sell Office products, they are now available by subscription through Office 365; bank credit cards require annual renewal; and insurance policies (automobile insurance, mandatory earthquake insurance, etc.) require renewal. Securing these renewals is critical for companies because gaining a new customer is about three times more costly than retaining an existing one. The machine learning based prediction method of the invention provides visualization, scoring and complex event processing analysis by resolving the connections between the customer's actions. A sample customer scoring in the telecommunications field can be as follows:
• All Customers
• Those who renewed their subscriptions for the first time
• Those who renewed their subscriptions for 3 or more times
In our approach to the 3-step funnel above, those who are likely to go to the next step in each step are scored and presented visually.
The following events can also be added to this funnel example:
• Pre-paid customers
• Subscriptions
• Those who renewed their subscriptions for the first time
• Those who renewed their subscriptions for 3 or more times
Or completely another setup can be as follows:
• All Customers
• Customers receiving additional packages
• Customers upgrading packages
The setups mentioned may vary entirely according to the user's definition of the cases and their connected, multiple sub-cases; each time, a new algorithm (4) is selected for the said cases and their connected sub-cases, and the probability of both realization and non-realization of the cases and sub-cases is predicted.
In the invention, three major solutions which can be used in every sector and which can solve big problems are brought together on a single structure at the same time.
As a result, the present disclosure is a machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, and a machine learning-based prediction method that can be used in different sectors and allows the probability of occurrence to be predicted for each defined event over the fiction created by the user on an interface (3); both use the funnel approach, scoring and complex event processing. The said system and method visualise past and future events using machine learning/artificial intelligence and interpretability. The said system and method can also control the data flow from multiple events from a single center (this is descriptive analytics) while calculating and visualizing the probability of transition between these events (or the probability of not passing). The outputs can even be used automatically for different purposes (e.g. campaigning or sending messages to customers who are not expected to go to the next step). Each step between the events is estimated with its own scoring algorithm, and the algorithm can produce different outputs at each stage.

Claims

1. A machine learning-based prediction system which can be used in different sectors to predict the probability of events occurring or not, characterized by comprising;
• at least one interface (3) that allows the flexible organization of the possible cases and associated sub-cases of business events and visual presentation of prediction results to the user,
• at least one data source (1) that enables data to be loaded into the system for the realization of machine learning,
• at least one pre-processing module (2) that determines the features to be used in machine learning by processing data within the said data source (1) and the machine learning method that is appropriate to the said features,
• at least one algorithm (4) that enables the probability of events being defined to occur by means of the said features.
2. A machine learning-based prediction method that can be used in different sectors, allowing to predict the probability of occurring for each defined event over the fiction created on an interface (3) by the user, characterized by comprising;
a) directing/routing the user through an interface (3) to the data loading screen for loading the relevant data to a data source (1),
b) following the completion of the data loading, directing the user to the identification screen to identify the cases and the sub-cases of the said cases, which may have multiple layers,
c) determining the features to be used in machine learning by the pre-processing modules (2) after the user has defined said cases and sub-cases over the data loaded into the said data source (1),
d) selecting the most appropriate algorithm (4) for machine learning through the feature selection,
e) optimizing the selected algorithm (4) parameters,
f) prediction of the possibilities of realization and/or non-realization of the defined cases with the realization of machine learning by means of the selected algorithm (4), selected features, and previous transitions,
g) visual representation of the defined cases and predicted possibilities to the user on the interface (3),
h) if at least one sub-case is defined for the said case, which may have multiple layers, when the sub-case is opened through the interface (3), it comprises,
• selecting the most appropriate algorithm (4) for machine learning through the feature selection,
• prediction of the realization and/or non-realization of the sub-cases in the related layer, and visual representation of the sub-cases and predicted possibilities to the user on the interface (3),
• if at least one more sub-case is defined for the said sub-case, which may have multiple layers, when the sub-case is opened through the interface (3), it comprises,
o selecting the most appropriate algorithm (4) for machine learning through the feature selection,
o prediction of the realization and/or non-realization of the sub-cases in the related layer, and visual representation of the sub-cases and predicted possibilities to the user on the interface (3),
o rechecking if at least one sub-case is defined for at least one said sub-case, which may have multiple layers.
3. A machine learning based prediction method according to Claim 2, characterized by comprising; during the extraction of the features, the pre-processing modules (2) both extract critical information for the selection of the algorithm (4) and automatically perform the transformations which will increase the success of the algorithm (4).
4. A machine learning based prediction method according to Claim 2, characterized by comprising; hyperparameter optimization is used to optimize parameters of algorithms (4).
PCT/TR2019/050775 2019-09-19 2019-09-19 A machine learning based prediction system and method WO2021054905A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/TR2019/050775 WO2021054905A1 (en) 2019-09-19 2019-09-19 A machine learning based prediction system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/TR2019/050775 WO2021054905A1 (en) 2019-09-19 2019-09-19 A machine learning based prediction system and method

Publications (1)

Publication Number Publication Date
WO2021054905A1 (en) 2021-03-25

Family

ID=74883860

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2019/050775 WO2021054905A1 (en) 2019-09-19 2019-09-19 A machine learning based prediction system and method

Country Status (1)

Country Link
WO (1) WO2021054905A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140195466A1 (en) * 2013-01-08 2014-07-10 Purepredictive, Inc. Integrated machine learning for a data management product
US20160048766A1 (en) * 2014-08-13 2016-02-18 Vitae Analytics, Inc. Method and system for generating and aggregating models based on disparate data from insurance, financial services, and public industries
WO2018075995A1 (en) * 2016-10-21 2018-04-26 DataRobot, Inc. Systems for predictive data analytics, and related methods and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946208

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09/06/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19946208

Country of ref document: EP

Kind code of ref document: A1