CN117235524A - Learning training platform of automatic valuation model - Google Patents

Info

Publication number
CN117235524A
Authority
CN
China
Prior art keywords
model
data
platform
module
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311212514.9A
Other languages
Chinese (zh)
Inventor
赵先明
林昀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hongshan Information Technology Research Institute Co Ltd
Original Assignee
Beijing Hongshan Information Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Hongshan Information Technology Research Institute Co Ltd
Priority to CN202311212514.9A
Publication of CN117235524A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a learning and training platform for an automatic valuation model, in the technical field of training platforms. The platform comprises a main control center, a data collection and processing module, an intelligent feature extraction module, a model training and optimization module, a visualization and analysis module and an automatic deployment module, wherein the data collection and processing module can automatically collect and integrate a large amount of economic, financial, industry and market data from various data sources. The beneficial effects of the invention are as follows. Automation: the platform automates the entire valuation model workflow, without manual intervention or expert knowledge. Accuracy: through deep learning and big data analysis, the platform can provide more accurate and reliable valuation results. Efficiency: the intelligent feature extraction and model optimization algorithms of the platform greatly improve the training efficiency of the valuation model and save time and resources. Scalability: the platform scales well, and new data sources, features and models can be added as required.

Description

Learning training platform of automatic valuation model
Technical Field
The invention relates to the technical field of training platforms, and in particular to a learning and training platform for an automatic valuation model.
Background
Currently, valuation models play a vital role in fields such as finance, real estate and enterprise assessment. However, creating and training a valuation model by conventional means is often cumbersome and time-consuming, and requires expertise and a large amount of data. An automated method is therefore urgently needed to simplify and accelerate the learning and training process of valuation models.
Disclosure of Invention
The invention aims to provide a learning and training platform for an automatic valuation model which, by combining artificial intelligence and big data technology, creates and trains valuation models quickly and accurately, so as to solve the problems set out in the background above.
In order to achieve the above purpose, the present invention provides the following technical solution: a learning and training platform for an automatic valuation model, comprising a main control center, a data collection and processing module, an intelligent feature extraction module, a model training and optimization module, a visualization and analysis module and an automatic deployment module, wherein:
the data collection and processing module is capable of automatically collecting and integrating a large amount of economic, financial, industry and market data from various data sources;
the intelligent feature extraction module can intelligently extract features related to the valuation, and automatically mine potential influencing factors according to historical data and domain knowledge;
the model training and optimization module automatically builds and trains a valuation model by using deep learning technology, and iterates and optimizes the model according to new market conditions;
the visualization and analysis module is used for providing a user interface that visualizes the results and analysis reports of the valuation model and provides information such as prediction results, sensitivity analysis and uncertainty estimation;
the automatic deployment module supports the deployment of the trained valuation model to different application scenarios and provides a real-time valuation service;
the data collection and processing module further comprises a data source interface and a data cleaning and standardization function, wherein the data source interface is used for connecting to various data sources and obtaining data, and the data cleaning and standardization function is used for processing and cleaning the data obtained from the data sources so that the data are consistent and accurate;
the intelligent feature extraction module further comprises an automatic feature selection and combination algorithm, a domain knowledge base that assists feature extraction, and a feature conversion and encoding function, wherein the automatic feature selection and combination algorithm is used for selecting and combining the most informative features, the domain knowledge base is used for storing experience knowledge and rules, and the feature conversion and encoding function is used for converting raw data into feature representations that can be used for model training;
the deep learning model construction and optimization module further comprises a deep neural network structure design function, a model training algorithm and a model optimization algorithm, wherein the deep neural network structure design function is used for constructing a neural network structure suited to the valuation task, the model training algorithm is used for training the parameters of the valuation model on historical data, and the model optimization algorithm is used for iterating and optimizing the model according to new market conditions;
the automatic deployment module further comprises a model deployment interface and a real-time valuation service function, wherein the model deployment interface is used for deploying the trained valuation model into different application scenarios, and the real-time valuation service function is used for providing a real-time valuation service based on the deployed model;
the visualization and analysis interface further comprises a data visualization function, an analysis report generation function and a decision assistance function, wherein the data visualization function is used for visualizing and displaying the input data and output results of the valuation model, the analysis report generation function is used for generating detailed valuation analysis reports, and the decision assistance function is used for providing information such as model prediction results, sensitivity analysis and uncertainty estimation to assist the user in making decisions.
As a further scheme of the invention: the data collection and processing module further comprises the following sub-modules:
data source interface: the interface can connect to various data sources and acquire data from them, including financial data providers, economic research institutions and industry data platforms; by interfacing with a plurality of data sources, the platform can acquire a wide range of data types;
data cleaning and standardization functions: in order to ensure the accuracy and consistency of the data, the platform provides data cleaning and standardization functions;
data quality assessment: the module can automatically detect potential quality problems in the data and, if any are found, the platform can repair or complete the data according to a preset rule or algorithm;
data integration and format conversion: the platform can integrate data from different data sources and perform format conversion to meet the requirements of the valuation model;
data storage and management: the platform has data storage and management functions and can effectively manage a large amount of historical data and real-time data;
data update and synchronization: in order to ensure the timeliness and accuracy of the valuation model, the platform can automatically update and synchronize data;
custom dataset integration: the platform also supports the integration and use of user-defined datasets.
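The data integration and format conversion sub-module described above can be illustrated with a minimal sketch. The source names (`provider_a`, `provider_b`), field names, date formats and the unified schema are all hypothetical and are not taken from the invention; the sketch only shows one way records from differently structured sources might be mapped into a single consistent format.

```python
from datetime import datetime

# Hypothetical per-source field mappings: source field name -> unified field name.
FIELD_MAPS = {
    "provider_a": {"ticker": "entity_id", "rev": "revenue", "dt": "date"},
    "provider_b": {"company": "entity_id", "revenue_usd": "revenue", "report_date": "date"},
}
# Hypothetical per-source date formats.
DATE_FORMATS = {"provider_a": "%Y-%m-%d", "provider_b": "%d/%m/%Y"}


def to_unified(source, record):
    """Rename fields and convert types so every source yields the same schema."""
    mapping = FIELD_MAPS[source]
    unified = {mapping[k]: v for k, v in record.items() if k in mapping}
    unified["revenue"] = float(unified["revenue"])
    parsed = datetime.strptime(unified["date"], DATE_FORMATS[source])
    unified["date"] = parsed.date().isoformat()
    unified["source"] = source
    return unified


def integrate(batches):
    """Merge batches from several sources into one list sorted by entity and date."""
    rows = [to_unified(src, rec) for src, recs in batches.items() for rec in recs]
    return sorted(rows, key=lambda r: (r["entity_id"], r["date"]))


batches = {
    "provider_a": [{"ticker": "ACME", "rev": "120.5", "dt": "2023-03-31"}],
    "provider_b": [{"company": "ACME", "revenue_usd": 98, "report_date": "31/12/2022"}],
}
unified_rows = integrate(batches)
```

Keeping the mappings in plain dictionaries means that a new data source could be added by configuration alone, which is in the spirit of the scalability described for the platform.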
As a still further scheme of the invention: the intelligent feature extraction module further comprises the following sub-modules:
deep learning feature extractor: the platform has a built-in deep learning feature extractor which can automatically learn useful feature representations from the raw data;
experience knowledge base guidance: in order to further improve the accuracy and effectiveness of feature extraction, the platform draws on the experience of domain experts to construct an experience knowledge base;
intelligent selection and combination algorithm: the feature extraction module is also provided with an intelligent selection and combination algorithm, which can automatically select the most informative features and combine them into a more expressive feature representation;
dynamic feature update: the feature extraction module of the platform also has a dynamic feature update function;
multidimensional feature combination: the feature extraction module can combine multidimensional features from different data sources so as to obtain a more comprehensive feature representation;
preprocessing and noise handling: the feature extraction module also includes data preprocessing and noise handling functions.
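As an illustration of the preprocessing and noise handling sub-module, the following sketch smooths a series with a trailing moving average and then standardizes it. The choice of a moving average and of z-score standardization, the window size and the sample values are assumptions made for this example; the invention does not prescribe particular techniques.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def moving_average(values, window=3):
    """Trailing moving average; early points average over the values seen so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


prices = [10.0, 12.0, 11.0, 30.0, 12.0, 13.0]  # 30.0 is a noisy spike
smoothed = moving_average(prices, window=3)
scaled = standardize(smoothed)
```

Smoothing before standardization damps the spike so that it has less influence on the mean and standard deviation used for scaling.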
As a still further scheme of the invention: the model training and optimization module further comprises the following sub-modules:
multiple model selection: the platform offers a choice of several different types of valuation model, including a linear regression model, a decision tree model, a support vector machine model and a neural network model, and a user can select a suitable model for training and optimization according to task requirements and data characteristics;
model initialization and hyperparameter tuning: the model training and optimization module comprises a model initialization function and a hyperparameter tuning function;
training data partitioning and cross-validation: in order to evaluate the performance and robustness of the model, the platform divides the training data into a training set, a validation set and a test set;
loss function definition and optimization algorithm: the platform allows a user to define a custom loss function according to the requirements of specific tasks;
model iteration and automatic tuning: the model training and optimization module supports iteration and automatic tuning of the model;
ensemble learning and model fusion: the platform provides ensemble learning and model fusion functions, which combine the prediction results of a plurality of different models to improve the accuracy and stability of the valuation model;
model interpretation and evaluation: in order to improve the interpretability and reliability of the model, the platform provides model interpretation and evaluation functions.
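The training data partitioning and cross-validation sub-module can be sketched as follows. The 70/15/15 split ratios, the fixed random seed and the choice of five folds are assumptions made for this example only, and the function names are hypothetical.

```python
import random


def split_dataset(samples, train=0.7, val=0.15, seed=0):
    """Shuffle once, then cut into training, validation and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, held_out_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        held_out = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, held_out
        start += size


train_set, val_set, test_set = split_dataset(range(100))
folds = list(k_fold_indices(len(train_set), k=5))
```

The test set is set aside before cross-validation so that it plays no part in hyperparameter selection and remains available for a final check of generalization.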
As a still further scheme of the invention: the visualization and analysis module further comprises the following sub-modules:
data visualization tool: the platform provides intuitive data visualization tools and can display raw data, feature data and model results in graphical form;
visualization of model results: the platform can display the results of the valuation model visually so as to help a user understand the model's predictions and valuation results;
feature importance visualization: the visualization and analysis module can display the importance of the features in an intuitive way;
model interpretation tool: in order to improve the interpretability of the model, the platform provides a model interpretation tool capable of explaining the internal logic and rules of the valuation model;
data exploration and interactive analysis: the platform provides data exploration and interactive analysis functions, so that a user can freely explore data and conduct in-depth analysis;
geographic information analysis: if the data involve geographic location information, the platform can provide a geographic information analysis function;
real-time monitoring and early warning: the visualization and analysis module also has real-time monitoring and early warning functions.
As a still further scheme of the invention: the automatic deployment module further comprises the following sub-modules:
algorithm packaging and model export: the platform provides algorithm packaging and model export functions and can export the trained valuation model in a standard format;
environment configuration and dependency management: the automatic deployment module is responsible for environment configuration and dependency management, and ensures that the software packages, libraries and dependencies in the deployment environment are consistent with the training environment;
system integration and API support: the platform supports system integration and APIs, so that the valuation model can be seamlessly integrated into other applications or systems;
automated deployment tool and pipeline: the automatic deployment module provides an automated deployment tool and pipeline, making the deployment process automated and repeatable;
deployment monitoring and troubleshooting: the automatic deployment module also comprises deployment monitoring and troubleshooting functions;
model update and rollback: the automatic deployment module supports updating and rollback of the model;
security and access control: the automatic deployment module provides security and access control.
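The model update and rollback sub-module can be pictured with a small in-memory sketch. The `ModelRegistry` class, its method names and the two toy models are hypothetical and serve only to illustrate the idea of keeping earlier versions available; a real deployment would persist versions and would also have to handle the environment and access control concerns listed above.

```python
class ModelRegistry:
    """Keeps every deployed version so the active one can be rolled back."""

    def __init__(self):
        self._versions = []   # list of (version label, model callable)
        self._active = -1     # index into _versions; -1 means nothing deployed

    def update(self, version, model):
        """Register a new version and make it the active one."""
        self._versions.append((version, model))
        self._active = len(self._versions) - 1

    def rollback(self):
        """Reactivate the version registered before the active one, if any."""
        if self._active <= 0:
            raise RuntimeError("no earlier version to roll back to")
        self._active -= 1

    @property
    def active_version(self):
        return self._versions[self._active][0]

    def predict(self, features):
        _, model = self._versions[self._active]
        return model(features)


registry = ModelRegistry()
registry.update("v1", lambda x: 100.0 * x["area"])
registry.update("v2", lambda x: 110.0 * x["area"] + 5.0)
after_update = registry.predict({"area": 2.0})    # served by v2
registry.rollback()
after_rollback = registry.predict({"area": 2.0})  # served by v1 again
```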
Compared with the prior art, the invention has the beneficial effects that:
Automation: the platform automates the entire valuation model workflow, without manual intervention or expert knowledge.
Accuracy: through deep learning and big data analysis, the platform can provide more accurate and reliable valuation results.
Efficiency: the intelligent feature extraction and model optimization algorithms of the platform greatly improve the training efficiency of the valuation model and save time and resources.
Scalability: the platform scales well, and new data sources, features and models can be added as required.
Drawings
Fig. 1 is a schematic diagram of a system structure according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it should be understood that the terms "upper", "lower", "front", "rear", "left", "right", "top", "bottom", "inner", "outer", "one end", "one side", etc. indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Referring to FIG. 1, an embodiment of the present invention includes a data collection and processing module for automatically collecting economic, financial, industry and market data from a plurality of data sources, and for processing and standardizing the data;
the intelligent feature extraction module is used for automatically extracting features related to the valuation and mining potential influencing factors according to historical data and domain knowledge;
the deep learning model construction and optimization module is used for automatically constructing and training a valuation model and for iterating and optimizing it according to new market conditions;
the visualization and analysis interface is used for visualizing the results and reports of the valuation model and providing information such as prediction results, sensitivity analysis and uncertainty estimation;
and the automatic deployment module is used for deploying the trained valuation model into different application scenarios and providing a real-time valuation service.
Preferably, the data collection and processing module further comprises:
a) Data source interface: the interface can connect to a variety of data sources and acquire data from them, including financial data providers, economic research institutions, industry data platforms, and the like. By interfacing with multiple data sources, the platform can obtain a wide range of data types, such as financial reports, market indexes, macroeconomic indicators, industry reports, and the like.
b) Data cleaning and standardization functions: to ensure the accuracy and consistency of data, the platform provides data cleaning and standardization functions. These can automatically detect and handle missing values, abnormal values and duplicate values in the data, ensuring the integrity and reliability of the data. The platform can also standardize the data, for example by normalization or standardization, so that the data meet the input requirements of the model.
c) Data quality assessment: data quality is critical to the accuracy of the valuation model. The data collection and processing module therefore also includes a data quality assessment function. This function automatically detects potential quality problems in the data, covering integrity, consistency and accuracy. If a data quality problem is found, the platform can repair or complete the data according to a preset rule or algorithm.
d) Data integration and format conversion: the platform can integrate data from different data sources and perform format conversion to meet the requirements of the valuation model. Through data integration and format conversion, the platform can unify the data of multiple data sources into a consistent format and data structure, which facilitates subsequent feature extraction and model training.
e) Data storage and management: the platform has data storage and management functions and can effectively manage a large amount of historical data and real-time data. The platform can use efficient database technology to store, index and retrieve data quickly. The platform also provides data backup and security control measures to protect the integrity and confidentiality of data.
f) Data update and synchronization: to ensure the timeliness and accuracy of the valuation model, the platform can automatically update and synchronize data. It can periodically acquire the latest data from the data sources and synchronize them with the existing historical data. Through data updating and synchronization, the platform can keep the model up to date and adapt it to market changes and the arrival of new data.
g) Custom dataset integration: the platform also supports the integration and use of user-defined datasets. Users can import their own datasets into the platform, integrate them with the data sources within the platform, and use them with the functions provided by the platform to meet specific valuation requirements.
Through this comprehensive data collection and processing module, the learning and training platform for the automatic valuation model can efficiently and accurately collect, process and manage a large amount of diverse data, and provides strong support for subsequent feature extraction and model training.
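The data cleaning and standardization functions in item b) can be sketched as follows. This is a minimal example under stated assumptions: duplicates are exact row matches, missing values are imputed with the median, abnormal values are clipped to the usual 1.5 x IQR fences, and normalization is min-max scaling. None of these specific rules is stated in the invention, and the sample rows are made up.

```python
from statistics import median, quantiles


def drop_duplicates(rows):
    """Keep the first occurrence of each row (rows are dicts with hashable values)."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def clean_column(values):
    """Impute missing values with the median, then clip outliers to the IQR fences."""
    present = [v for v in values if v is not None]
    fill = median(present)
    filled = [fill if v is None else v for v in values]
    q1, _, q3 = quantiles(filled, n=4)
    low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    return [min(max(v, low), high) for v in filled]


def min_max(values):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


rows = [
    {"id": 1, "price": 100.0},
    {"id": 1, "price": 100.0},   # exact duplicate
    {"id": 2, "price": None},    # missing value
    {"id": 3, "price": 104.0},
    {"id": 4, "price": 98.0},
    {"id": 5, "price": 101.0},
    {"id": 6, "price": 99.0},
    {"id": 7, "price": 103.0},
    {"id": 8, "price": 102.0},
    {"id": 9, "price": 5000.0},  # abnormal value
]
unique_rows = drop_duplicates(rows)
cleaned = clean_column([r["price"] for r in unique_rows])
normalized = min_max(cleaned)
```

Clipping rather than deleting the abnormal value keeps the row available for other fields; whether that is appropriate depends on the data, and a preset rule could equally drop or flag such rows.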
Wherein the intelligent feature extraction module further comprises:
a) Deep learning feature extractor: the platform incorporates a deep learning feature extractor that automatically learns useful feature representations from raw data. Using a multi-layer neural network structure, the feature extractor applies nonlinear transformations layer by layer to extract abstract, high-order features, and thereby better captures the intrinsic patterns and correlations in the data.
b) Experience knowledge base guidance: to further improve the accuracy and effectiveness of feature extraction, the platform draws on the experience of domain experts to construct an experience knowledge base. The knowledge base contains industry rules, expertise, prior information and the like, and can guide the selection and combination of key features during feature extraction. By incorporating domain knowledge, the feature extraction module can more accurately mine features that are relevant to the valuation.
c) Intelligent selection and combination algorithm: the feature extraction module is also equipped with an intelligent selection and combination algorithm that automatically selects the most informative features and combines them into a more expressive feature representation. By evaluating indicators such as the relevance, importance and uniqueness of the features, the algorithm can determine which features have a significant influence on the valuation model, which reduces redundant features and improves the accuracy and interpretability of the valuation model.
d) Dynamic feature update: the feature extraction module of the platform also has a dynamic feature update function. As time passes and new data arrive, the module can automatically update the parameters and structure of the feature extractor to capture dynamic changes in the market and the data. The valuation model can thus adapt to new market conditions in time and provide accurate prediction and valuation results.
e) Multidimensional feature combination: the feature extraction module can combine multidimensional features from different data sources to obtain a more comprehensive feature representation. It can fuse features from multiple dimensions, such as macroeconomic indicators, industry data and enterprise financial indicators, further improving the valuation model's ability to understand a complex market environment.
f) Preprocessing and noise handling: the feature extraction module also includes data preprocessing and noise handling functions. It can apply preprocessing operations such as normalization, standardization and smoothing to the raw data, reducing noise and abnormal values and their interference with the feature extraction process.
Through this comprehensive intelligent feature extraction module, the learning and training platform for the automatic valuation model can automatically extract valuable, highly abstract feature representations from the raw data. These features better reflect the inherent regularities and correlations of the data, providing a more accurate and reliable basis for subsequent model training and prediction.
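The idea in item c) of scoring features by relevance and uniqueness can be sketched with a simple greedy filter. Using the Pearson correlation as both the relevance and the redundancy measure, the two thresholds, and the sample feature names and values are all assumptions made for this example; the invention does not specify the algorithm.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def select_features(features, target, min_relevance=0.3, max_redundancy=0.95):
    """Rank by |correlation with target|, then skip near-duplicates of kept features."""
    ranked = sorted(features,
                    key=lambda name: abs(pearson(features[name], target)),
                    reverse=True)
    kept = []
    for name in ranked:
        if abs(pearson(features[name], target)) < min_relevance:
            continue  # too weakly related to the valuation target
        if any(abs(pearson(features[name], features[k])) > max_redundancy
               for k in kept):
            continue  # nearly identical to a feature that was already kept
        kept.append(name)
    return kept


features = {
    "revenue":      [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "revenue_copy": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],  # redundant with revenue
    "debt_ratio":   [0.9, 0.5, 0.7, 0.4, 0.3, 0.2],
    "noise":        [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}
valuation = [12.0, 21.0, 33.0, 39.0, 52.0, 60.0]
kept = select_features(features, valuation)
```

On this toy data the filter keeps `debt_ratio` and only one of the two revenue columns, and drops `noise`. Correlation captures only linear relationships, so a production system would likely combine it with model-based importance scores.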
Wherein the model training and optimization module further comprises:
a) Multiple model selection: the platform offers a choice of several different types of valuation model, including linear regression models, decision tree models, support vector machine models, neural network models, and the like. The user can select a suitable model for training and optimization according to task requirements and data characteristics.
b) Model initialization and hyperparameter tuning: the model training and optimization module comprises model initialization and hyperparameter tuning functions. The platform can initialize the parameters of the valuation model according to an initial parameter configuration, in preparation for subsequent training. The platform also provides an automatic hyperparameter tuning algorithm that searches for suitable hyperparameter combinations so as to improve the performance and generalization ability of the model.
c) Training data partitioning and cross-validation: to evaluate the performance and robustness of the model, the platform divides the training data into a training set, a validation set and a test set. The training set is used for training the model parameters, the validation set is used for hyperparameter selection and model tuning, and the test set is used for evaluating the generalization ability of the model. In addition, the platform provides a model evaluation method based on cross-validation so as to evaluate the performance of the model more accurately.
d) Loss function definition and optimization algorithm: the platform allows the user to define custom loss functions according to the needs of a particular task. The loss function measures the error between the model's predicted values and the actual values and serves as the objective function of the optimization algorithm. The platform supports a variety of optimization algorithms, such as gradient descent, stochastic gradient descent and the Adam optimizer, to minimize the loss function and optimize the model parameters.
e) Model iteration and automatic tuning: the model training and optimization module supports iteration and automatic tuning of the model. By repeatedly training and optimizing the model, the platform can improve the performance and accuracy of the valuation model. In addition, the platform can automatically adjust model parameters and structure according to new market conditions so as to adapt to dynamic changes in the market.
f) Ensemble learning and model fusion: the platform provides ensemble learning and model fusion functions, which combine the prediction results of multiple different models to improve the accuracy and stability of the valuation model. Common ensemble learning methods include voting, stacking, weighted averaging, and the like.
g) Model interpretation and evaluation: to improve the interpretability and reliability of the model, the platform provides model interpretation and evaluation functions. By analyzing the model's weights, feature importance, gradients and the like, the platform can explain the decision process and key factors of the model. The platform also provides various evaluation metrics, such as root mean square error (RMSE), mean absolute error (MAE) and the coefficient of determination (R²), to evaluate the predictive ability and performance of the model.
Through this comprehensive model training and optimization module, the learning and training platform for the automatic valuation model can perform flexible, efficient and automated model training and optimization. These functions can improve the performance and generalization ability of the valuation model and help it predict accurately.
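The evaluation metrics named in item g) and the weighted averaging mentioned in item f) can be written out directly. The metric definitions below are the standard ones; the sample values, the equal weights and the function names are made up for illustration, and the two toy models are deliberately chosen so that their errors cancel when averaged.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def weighted_average(predictions, weights):
    """Combine per-model prediction lists into one list using normalized weights."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, column)) / total
            for column in zip(*predictions)]


actual = [100.0, 200.0, 300.0, 400.0]
model_a = [110.0, 190.0, 310.0, 390.0]  # errors +10, -10, +10, -10
model_b = [90.0, 210.0, 290.0, 410.0]   # errors mirror those of model_a
blend = weighted_average([model_a, model_b], weights=[1.0, 1.0])
```

Real models rarely have perfectly opposite errors, but the example shows why averaging helps when the errors of the individual models are not strongly correlated.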
Wherein the visualization and analysis module further comprises:
a) Data visualization tool: the platform provides intuitive data visualization tools and can display raw data, feature data and model results in graphical form. Users can select an appropriate chart type, such as a line chart, bar chart or scatter plot, according to their needs, to better understand and analyze the characteristics and trends of the data.
b) Visualization of model results: the platform can present the results of the valuation model visually to help the user understand the model's predictions and valuation results. The platform supports comparing predicted values with actual values and plotting charts such as prediction curves and error distributions, helping the user evaluate the accuracy and reliability of the model.
c) Feature importance visualization: the visualization and analysis module can display the importance of the features in an intuitive manner. By plotting feature importance charts or heat maps, a user can see how much each feature contributes to the valuation model and thereby better understand the decision process and key factors of the model.
d) Model interpretation tool: to improve the interpretability of the model, the platform provides a model interpretation tool that can explain the internal logic and rules of the valuation model. Through the interpretation tool, the user can understand how the model makes decisions and which features have an important impact on its predictions. This helps the user to understand the credibility and reasoning process of the model more deeply.
e) Data exploration and interactive analysis: the platform provides data exploration and interactive analysis functions, so that a user can freely explore data and conduct in-depth analysis. Through the visual and interactive interfaces, a user can screen, sort and filter the data and quickly find trends, patterns and anomalies, yielding more insightful analysis results.
f) Geographic information analysis: if the data involve geographic location information, the platform can provide a geographic information analysis function. The user can display the data on a map showing the valuations or trends of different areas, which helps the user understand geographic distribution and spatial correlation.
g) Real-time monitoring and early warning: the visualization and analysis module also has real-time monitoring and early warning functions. The platform can update data in real time and display them visually. Early warning conditions and thresholds can be set, and when a condition is met the system automatically raises an alarm, helping users keep track of market dynamics and risk in a timely manner.
Through this comprehensive visualization and analysis module, the learning and training platform for the automatic valuation model can provide intuitive, interpretable and flexible data analysis and visualization tools. These tools enable users to understand the data, model predictions and feature importance in depth and to make more accurate decisions and judgments.
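The early warning behaviour in item g) can be sketched as a set of rules evaluated against each incoming data snapshot. The `AlertRule` class, the metric names (`valuation_change_pct`, `rolling_mae`) and the threshold values are hypothetical and are used only to show how preset conditions and thresholds might be expressed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertRule:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # True when the alert should fire


def check_alerts(snapshot: Dict[str, float], rules: List[AlertRule]) -> List[str]:
    """Return the names of all rules whose condition holds for this snapshot."""
    return [rule.name for rule in rules if rule.condition(snapshot)]


rules = [
    AlertRule("valuation_drop", lambda s: s["valuation_change_pct"] <= -10.0),
    AlertRule("high_prediction_error", lambda s: s["rolling_mae"] > 50.0),
]

# Simulated stream of monitoring snapshots arriving over time.
stream = [
    {"valuation_change_pct": -2.0, "rolling_mae": 20.0},
    {"valuation_change_pct": -12.5, "rolling_mae": 35.0},
    {"valuation_change_pct": 1.0, "rolling_mae": 80.0},
]
alerts = [check_alerts(snapshot, rules) for snapshot in stream]
```

Expressing each condition as a callable keeps the rules independent of one another, so new warning conditions can be added without changing the checking logic.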
Wherein the automated deployment module further comprises:
a) Algorithm packaging and model export: the platform provides algorithm packaging and model export functions. The trained valuation model can be exported in a standard format, such as PMML (Predictive Model Markup Language) or ONNX (Open Neural Network Exchange), to ensure portability and cross-platform deployment capability of the model.
b) Environment configuration and dependency management: the automated deployment module is responsible for environment configuration and dependency management, ensuring that software packages, libraries and dependent items in the deployment environment are consistent with the training environment. Thus, compatibility problems caused by environmental differences can be avoided, and normal operation of the model in different deployment environments is ensured.
c) System integration and API support: the platform supports system integration and API interfaces so that the valuation model can be seamlessly integrated into other applications or systems. By defining and exposing the API interface, a user can conveniently call the valuation model to realize automated valuation calculation and prediction.
d) Automated deployment tool and pipeline: the automatic deployment module provides an automatic deployment tool and a pipeline, and realizes the automation and repeatability of the deployment process. Through the script, the configuration file or the visual interface, a user can set deployment parameters, configuration file paths and the like, so that a deployment flow is simplified, and deployment efficiency is improved.
e) Deployment monitoring and troubleshooting: the automated deployment module also includes deployment monitoring and troubleshooting functions. The platform can monitor the state and performance index of deployment, and timely discover and solve potential problems. Meanwhile, for errors or anomalies in deployment, the platform can provide a fault checking tool to help users to quickly locate and solve the problems.
f) Model update and rollback: the automated deployment module supports updating and rollback of the model. Over time and with the advent of new data, users can automatically deploy updated models into a production environment. If the new model is found to perform poorly or to introduce unforeseen problems, the user can quickly roll back to the previous stable version.
g) Security and rights control: the automated deployment module attaches importance to security and rights control. The platform can support a user identity authentication and authorization mechanism and ensure the security of the model and data. Only authorized users can access and deploy the model, thereby preventing unauthorized access and potential risk of data leakage.
Through these automated deployment functions, the learning training platform of the automatic valuation model can realize rapid, reliable and extensible model deployment. The functions reduce the complexity and workload of deployment and improve the real-time performance and usability of the model, thereby providing users with a better experience and better results.
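The update and rollback behavior described in item f) can be sketched as a small version history. The class `ModelRegistry` and its methods are hypothetical names chosen for illustration; the sketch keeps versions in memory, whereas a real platform would presumably persist them.

```python
from typing import Any, List, Optional


class ModelRegistry:
    """Hypothetical in-memory registry that tracks deployed model versions."""

    def __init__(self) -> None:
        self._history: List[Any] = []  # oldest first; the last entry is live

    def deploy(self, model: Any) -> None:
        """Make `model` the live version while keeping earlier versions."""
        self._history.append(model)

    def current(self) -> Optional[Any]:
        """Return the live version, or None if nothing has been deployed."""
        return self._history[-1] if self._history else None

    def rollback(self) -> Any:
        """Discard the live version and return to the previous one."""
        if len(self._history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self._history.pop()
        return self._history[-1]


registry = ModelRegistry()
registry.deploy("valuation-model-v1")
registry.deploy("valuation-model-v2")
restored = registry.rollback()  # v2 performed poorly, so return to v1
print(restored, registry.current())
```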
The data collection and processing module of the present invention is intended to collect, clean and prepare data for training an estimation model. The following are specific examples of the module:
Data source identification and access: the module first identifies and accesses various data sources, such as financial market data, economic indicators and corporate financial data. These data sources may be public data sources or proprietary data sources provided by data suppliers that cooperate with the company.
Data capture and update: the module collects data from the data sources by crawling and updates the existing data periodically. The collection process may use web crawlers, API interfaces or other suitable means to extract data from the data sources. Updates may be scheduled automatically according to the update frequency of each data source to keep the data timely.
Data cleaning and preprocessing: the collected raw data often contains missing values, outliers, noise and similar problems. The module cleans and preprocesses the data, including filling missing values, handling outliers, removing noise and smoothing the data. Preprocessing operations such as feature selection, dimensionality reduction and standardization can also be performed to facilitate subsequent model training and analysis.
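The cleaning steps above can be sketched for a single numeric column. The specific choices here (median imputation, clipping to the 1.5 × IQR fences, z-score standardization) are assumptions made for illustration; the invention does not prescribe particular methods, and `clean_column` is a hypothetical name.

```python
import statistics
from typing import List, Optional


def clean_column(values: List[Optional[float]]) -> List[float]:
    """Fill missing values, clip outliers, then standardize one column."""
    # 1. Fill missing entries (None) with the median of the observed values.
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    filled = [median if v is None else v for v in values]

    # 2. Clip outliers to the 1.5 * IQR fences around the quartiles.
    q1, _, q3 = statistics.quantiles(filled, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    clipped = [min(max(v, low), high) for v in filled]

    # 3. Standardize to zero mean and unit (population) standard deviation.
    mean = statistics.fmean(clipped)
    std = statistics.pstdev(clipped)
    if std == 0:
        return [0.0 for _ in clipped]
    return [(v - mean) / std for v in clipped]


raw = [10.0, 12.0, None, 11.0, 13.0, 500.0]  # one missing value, one outlier
cleaned = clean_column(raw)
print([round(v, 2) for v in cleaned])
```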
Feature engineering: the module performs feature engineering to convert raw data into more valuable features. Feature engineering comprises feature extraction, feature transformation and feature construction. Features related to the valuation target can be extracted from the raw data according to domain knowledge and the requirements of data analysis. Feature transformation and feature construction can then transform and combine the raw data to generate new features that capture more information.
Data storage and management: the module stores the processed data in a database or data warehouse for subsequent model training and analysis. Storage may use suitable technologies such as relational databases, NoSQL databases or distributed file systems. Data indexes and data tags can also be established so that data can be searched and managed quickly and conveniently.
Data quality monitoring and reporting: the module is responsible for monitoring and reporting data quality. By setting data quality metrics and thresholds, it monitors the accuracy, completeness and consistency of the data in real time. If a data quality problem is found, the system can raise an alert automatically or prompt manual intervention so that the data is repaired and processed.
Data privacy and security protection: during data collection and processing, the module focuses on data privacy and security protection. Appropriate measures such as encryption, authentication, rights control, etc. are taken to protect the confidentiality and integrity of the data. Meanwhile, the validity and compliance of data collection and processing are ensured by following the related privacy laws and data protection policies.
Through the above embodiments, the data collection and processing module accesses and captures data from various data sources, cleans and preprocesses the data, performs feature engineering, and safeguards data quality and security. This data processing flow provides a high-quality data basis for subsequent model training and analysis.
The intelligent feature extraction module in the invention automatically extracts effective features related to the valuation target from the raw data by using machine learning and intelligent algorithms. The following are specific examples of the module:
Automatic feature engineering: feature engineering is a tedious but critical task, and the module uses automation to perform it. According to the data types and feature distributions, the module can automatically perform operations such as feature selection, feature transformation and feature construction. For example, normalization or logarithmic transformation may be applied automatically to continuous data, and one-hot encoding or feature hashing may be applied automatically to categorical data. Automatic feature engineering can generate features with greater predictive power and thereby improve model performance.
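A minimal sketch of type-driven encoding follows. It assumes that numeric columns are non-negative (so that `log1p` is defined) and that string columns are categorical; the function name `auto_encode` and the column names are hypothetical.

```python
import math
from typing import Dict, List, Set, Union

Value = Union[float, str]


def auto_encode(rows: List[Dict[str, Value]]) -> List[Dict[str, float]]:
    """Log-transform numeric columns and one-hot encode string columns."""
    # First pass: collect the category set of every string-valued column.
    categories: Dict[str, Set[str]] = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                categories.setdefault(name, set()).add(value)

    # Second pass: pick the transformation from the type of each value.
    encoded = []
    for row in rows:
        out: Dict[str, float] = {}
        for name, value in row.items():
            if isinstance(value, str):
                for cat in sorted(categories[name]):
                    out[f"{name}={cat}"] = 1.0 if value == cat else 0.0
            else:
                out[f"log1p({name})"] = math.log1p(value)  # assumes value >= 0
        encoded.append(out)
    return encoded


rows = [{"area": 0.0, "city": "A"}, {"area": 99.0, "city": "B"}]
encoded = auto_encode(rows)
print(encoded[0])
```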
Nonlinear feature extraction: the module uses nonlinear feature extraction algorithms to increase the expressive power of the model by finding nonlinear relationships in the data. For example, a polynomial feature extraction algorithm may apply a polynomial transformation to the original features, introducing higher-order terms and cross terms to capture more complex data relationships. A kernel-based feature mapping method may also be used to map the data into a high-dimensional feature space, so that a problem that is not linearly separable in the original space becomes linearly separable.
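The polynomial expansion mentioned above can be sketched as follows. The function name `polynomial_features` is hypothetical; the sketch enumerates every monomial of the inputs up to the chosen degree, which yields both the higher-order terms and the cross terms.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import List


def polynomial_features(x: List[float], degree: int = 2) -> List[float]:
    """Return all monomials of the inputs from degree 1 up to `degree`."""
    features = []
    for d in range(1, degree + 1):
        # Each index combination (with repetition) names one monomial,
        # e.g. (0, 0) -> x0 * x0 and (0, 1) -> x0 * x1.
        for combo in combinations_with_replacement(range(len(x)), d):
            features.append(prod(x[i] for i in combo))
    return features


# For inputs (x0, x1) = (2, 3) the terms are x0, x1, x0^2, x0*x1, x1^2.
print(polynomial_features([2.0, 3.0]))  # [2.0, 3.0, 4.0, 6.0, 9.0]
```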
Feature extraction based on deep learning: deep learning is highly capable at feature extraction, and the module can use a deep learning model to automatically learn abstract representations and high-level features of the data. For example, a convolutional neural network (CNN) may be used to extract features from image data, and a recurrent neural network (RNN) may be used to extract features from sequence data. With a deep learning model, more discriminative features can be extracted from the raw data, improving the accuracy and generalization ability of the model.
Feature importance assessment: the module can evaluate the importance of each feature in the valuation model to help the user understand the relationship between the features and the valuation target. It may rank the features using tree-based algorithms (such as random forests or gradient-boosted trees), or calculate the relative weights of the features using neural-network-based methods. Through feature importance assessment, the user can learn which features contribute most to the predictive performance of the valuation model, and thereby better understand the data and the model.
Incremental feature learning: the module supports incremental feature learning, so that the feature extraction model can be updated dynamically as new data arrives. An online learning algorithm or an incremental learning method can be used to extract features from the new data incrementally and update the model parameters, adapting to changes in the data distribution and to the addition of new samples.
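One simple instance of incremental updating is maintaining the mean and variance used for feature standardization one sample at a time. The sketch below uses Welford's online algorithm, which is a technique chosen here for illustration and is not named in the invention; `RunningStats` is a hypothetical class name.

```python
import math


class RunningStats:
    """Running mean and variance updated one sample at a time (Welford)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new sample into the statistics without storing old data."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of all samples seen so far."""
        return self._m2 / self.n if self.n else 0.0

    def standardize(self, x: float) -> float:
        """Scale a value using the statistics learned so far."""
        std = math.sqrt(self.variance)
        return (x - self.mean) / std if std > 0 else 0.0


stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)  # each new sample arrives on its own
print(stats.mean, stats.variance)
```

Because only three numbers are stored, the statistics can follow a changing data distribution without the historical samples being revisited.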
The model training and optimizing module in the invention trains the valuation model on the data prepared by feature engineering and then optimizes the model. The following are specific examples of the module:
Model selection: the module implements several model selection algorithms suited to the characteristics and requirements of the valuation task. By comparing the performance metrics of different models, it automatically selects a model suitable for the valuation task. For example, the performance of models such as linear regression, decision trees, support vector machines and neural networks may be compared, and the best-performing model selected as the final valuation model.
Model training: the module trains the selected valuation model algorithm on the prepared data. During training, the model learns patterns and relationships in the data from the input features and the valuation target. Training can use optimization algorithms such as batch gradient descent and stochastic gradient descent, which iteratively update the model parameters so that the model gradually approaches the optimal solution. Cross-validation and model evaluation may be performed during training to ensure the generalization ability and accuracy of the model.
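The iterative parameter update can be illustrated with batch gradient descent on a one-variable linear model under a mean squared error loss. The function name `fit_linear`, the learning rate, the epoch count and the toy data are assumptions made for this sketch.

```python
from typing import List, Tuple


def fit_linear(xs: List[float], ys: List[float],
               lr: float = 0.05, epochs: int = 2000) -> Tuple[float, float]:
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) over the whole batch.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit_linear(xs, ys)
print(round(w, 4), round(b, 4))
```

Stochastic gradient descent follows the same update rule but computes the gradient from one sample (or a small batch) per step instead of the full data set.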
Hyperparameter tuning: the module implements automatic hyperparameter tuning to find the best combination of model hyperparameters. Hyperparameters are parameters that must be specified before training rather than learned from the data, such as the learning rate, regularization coefficient and tree depth. Using techniques such as cross-validation, grid search and Bayesian optimization, the system can automatically try different hyperparameter combinations, evaluate the model on a validation set, and select the best combination, thereby improving the performance and generalization ability of the model.
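Grid search combined with k-fold cross-validation can be sketched for a single hyperparameter. The model here is a one-variable ridge regression without intercept, chosen only because it has a closed-form fit; the function names, the grid values and the toy data are assumptions.

```python
from typing import Dict, List, Tuple


def ridge_fit(xs: List[float], ys: List[float], lam: float) -> float:
    """Closed-form ridge solution for y = w * x with penalty lam * w^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_mse(xs: List[float], ys: List[float], lam: float, k: int = 5) -> float:
    """Mean validation error over k folds (sample i belongs to fold i % k)."""
    total = 0.0
    for fold in range(k):
        pairs = list(enumerate(zip(xs, ys)))
        train = [p for i, p in pairs if i % k != fold]
        valid = [p for i, p in pairs if i % k == fold]
        w = ridge_fit([x for x, _ in train], [y for _, y in train], lam)
        total += sum((w * x - y) ** 2 for x, y in valid) / len(valid)
    return total / k


def grid_search(xs: List[float], ys: List[float],
                grid: List[float], k: int = 5) -> Tuple[float, Dict[float, float]]:
    """Try every candidate and keep the one with the lowest validation error."""
    scores = {lam: cv_mse(xs, ys, lam, k) for lam in grid}
    best = min(scores, key=scores.get)
    return best, scores


xs = [float(i) for i in range(1, 11)]
# Roughly y = 3x with a small alternating disturbance.
ys = [3.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
grid = [0.0, 1.0, 100.0]
best_lam, scores = grid_search(xs, ys, grid)
print(best_lam, scores)
```

Bayesian optimization would replace the exhaustive loop over `grid` with a search that proposes the next candidate from the scores observed so far.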
Model ensembling: the module supports model ensembling, which combines several trained valuation models. The predictions of the different models are combined by methods such as voting, weighted averaging and stacking to obtain a more stable and accurate valuation result. Ensembling can improve the robustness of the model, reduce overfitting and underfitting, and improve the overall performance of the valuation model.
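Of the combination methods listed above, weighted averaging is the simplest to show. In the sketch below the function name `weighted_average`, the weights and the predicted values are hypothetical.

```python
from typing import List


def weighted_average(predictions: List[List[float]],
                     weights: List[float]) -> List[float]:
    """Combine per-model predictions into one prediction per sample."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per model")
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Three models valuing the same two assets; the third model is trusted most.
model_outputs = [[100.0, 200.0], [110.0, 190.0], [90.0, 210.0]]
combined = weighted_average(model_outputs, [1.0, 1.0, 2.0])
print(combined)  # [97.5, 202.5]
```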
Model verification and evaluation: after training is complete, the module verifies and evaluates the trained valuation model. Using data from the validation set or the test set, it calculates evaluation metrics such as accuracy, precision, recall and F1 score to assess model performance. Methods such as cross-validation, AUC-ROC curve analysis and confusion matrices can be used to evaluate and validate the model comprehensively.
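The listed metrics are defined for classification, so the sketch below assumes that the valuation task has been framed as a binary decision (for example, undervalued versus not); that framing, the function name `classification_metrics` and the labels are assumptions made for illustration.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    # The four cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```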
Model persistence and deployment: the trained valuation model needs to be persisted and deployed into a real environment for use. The module implements model saving and loading, stores the trained model on disk or in cloud storage, and creates an API interface or publishes the model as a service so that other applications or systems can call it for valuation calculation.
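Saving and reloading a model can be sketched with a JSON file holding the parameters of a linear model. The file name, the parameter layout and the function names are hypothetical; a real platform would more likely use a standard model format and a storage service.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def save_model(params: Dict[str, Any], path: str) -> None:
    """Persist model parameters to disk as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)


def load_model(path: str) -> Dict[str, Any]:
    """Load previously saved model parameters."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def predict(params: Dict[str, Any], features: List[float]) -> float:
    """Linear valuation: bias plus the weighted sum of the features."""
    return params["bias"] + sum(w * x for w, x in zip(params["weights"], features))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "valuation_model.json")
    save_model({"weights": [2.0, 0.5], "bias": 10.0}, path)
    restored = load_model(path)

print(predict(restored, [3.0, 4.0]))  # 10 + 2*3 + 0.5*4 = 18.0
```

A service endpoint would wrap `load_model` and `predict` so that other applications can request valuations without access to the stored file.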
The visualization and analysis module in the invention visualizes and analyzes the valuation results and the model performance through a graphical interface, data processing techniques and data analysis methods. The following are specific examples of the module:
Data visualization: the module can display the raw data and the processed data in an intuitive way. Common chart types such as line charts, bar charts and scatter plots may be used to present the distribution, trends and relationships of the data. For example, for stock valuation, a line chart may show the historical trend of the stock price; for property valuation, a map may mark the property locations and present their valuation results. Data visualization helps the user better understand the data and discover patterns and trends in it.
Visualization of results: the module can visually display the estimated result to intuitively present the estimated result and the predicted value. By displaying the estimation results by using charts, images, maps and the like, the user can intuitively understand the distribution, differences and changes of the estimation results. For example, for stock valuations, the valuated price may be compared to the actual price and presented in a bar graph or line graph; for property estimation, the estimation results of different property locations can be marked on a map. Through the result visualization, the user can intuitively know the influence factors and trends of the estimation result.
Model analysis: the module can analyze and interpret the valuation model to help the user understand the performance and working principles of the model. Methods such as feature importance analysis and error analysis can be used to evaluate the accuracy and stability of the model and to find its limitations and room for improvement. For example, the importance of individual features can be shown with bar charts or heat maps to evaluate their influence on the model, and the performance of the model in different data ranges can be analyzed by plotting a prediction error distribution or a residual plot. Model analysis helps users evaluate the model and decide how to optimize it.
User interaction: the module provides the user with the interactive capability of the system so that the user can perform customized data analysis and query. Through the user interface and the operation control, the user can customize the query conditions, select specific data views and analyze indexes. For example, the user may select different time ranges, specific subsets of data, indicators of interest, etc., and observe the corresponding analysis results. The user interaction capability may customize the data analysis process, providing a personalized data visualization and analysis experience.
Data export and report generation: the module may support exporting the analysis results as charts or report documents so that the user can save, share and further analyze them. The user may export a chart in a common image format (e.g., PNG, JPEG) or generate a report in a format such as PDF or Excel. The exported charts and reports may serve as references in the decision-making process and may be shared and discussed with others.
The automated deployment module aims to realize quick deployment and application integration of the valuation model, so that the model can be conveniently applied to actual business scenarios. The following are specific embodiments of the module:
Environment configuration and dependency management: the module can detect and configure the environment and dependencies required for the model to run. For example, the corresponding software packages, libraries and tools are automatically installed and configured according to the requirements of the model, ensuring that the model can run properly in the deployment environment.
Deployment workflow automation: the module simplifies the steps of model deployment through an automated deployment workflow. For example, by providing a visual interface or a command line interface, the system can guide the user through uploading, installing and configuring the model, reducing the complexity of manual operation and the risk of errors.
Elastic deployment and scaling: the module supports elastic deployment and scaling of the model to cope with business demands of different sizes and loads. By automatically monitoring system load and performance metrics, the system can automatically scale the capacity and resources allocated to the model, ensuring that the valuation service meets its performance requirements under high load.
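A load-driven scaling decision can be sketched with a proportional rule: the number of service instances is scaled by the ratio of observed load to target load and then clamped to fixed bounds. The rule, the function name `desired_replicas` and all numeric values are assumptions made for illustration; the invention does not specify a scaling policy.

```python
import math


def desired_replicas(current: int, load_pct: int, target_pct: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return the instance count that brings per-instance load near the target."""
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Scale proportionally to the load ratio and round up to be safe.
    wanted = math.ceil(current * load_pct / target_pct)
    # Never go below the minimum or above the configured maximum.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(2, 90, 60))   # overloaded: scale out to 3
print(desired_replicas(4, 20, 60))   # underused: scale in to 2
print(desired_replicas(5, 300, 60))  # capped at max_replicas = 10
```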
Configuration management and version control: the module can provide management and version control functions of model configuration, and is convenient for tracking and managing different configurations and updates of the model. By recording and comparing changes in model configuration, the system can trace back and recover previous configurations, reducing problems and risks due to configuration errors.
Monitoring and alerting: the module monitors the deployed model, checks its state in real time, and provides alerting and notification functions. By monitoring the model's performance metrics, abnormal behavior and error logs, the system can discover and respond to problems in time, reducing the impact on the business and the resulting losses.
Cross-platform and integration: the module may support deployment and integration of valuation models in different platforms and systems. Whether deployed to a cloud platform, a local server, or a mobile device, the system may provide a corresponding deployment solution that ensures that the model can run and integrate on different platforms.
Automatic update and rollback: the module can realize automatic updating and rollback functions of the model so as to rapidly cope with business changes and model improvement. By automatically detecting new model versions and implementing updates, the system can timely apply new model improvements and optimizations. Meanwhile, the system can also provide a rollback mechanism, so that the system can quickly recover to the previous version when a problem occurs in the updating process, and the stability of the service is ensured.
Through the above embodiments, the automated deployment module can provide an efficient, reliable and easy-to-use deployment solution, and realizes quick application and integration of the valuation model.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical scheme and inventive concept, shall be covered by the scope of protection of the present invention.

Claims (6)

1. A learning training platform of an automatic valuation model, characterized in that the platform comprises a main control center, a data collection and processing module, an intelligent feature extraction module, a model training and optimizing module, a visualization and analysis module and an automatic deployment module, wherein:
the data collection and processing module is capable of automatically collecting and integrating a large amount of economic, financial, industrial and market data from various data sources;
the intelligent feature extraction module can intelligently extract features related to the valuation, and automatically mine potential influencing factors according to historical data and domain knowledge;
the model training and optimizing module automatically builds and trains a valuation model by using deep learning technology, and iterates and optimizes the model according to new market conditions;
the visualization and analysis module is used for providing a user interface that visualizes the results and analysis reports of the valuation model and provides information such as prediction results, sensitivity analysis and uncertainty estimation;
the automatic deployment module supports the deployment of the trained valuation model to different application scenarios and provides a real-time valuation service;
the data collection and processing module further comprises a data source interface and a data cleaning and standardization function, wherein the data source interface is used for connecting to various data sources and obtaining data, and the data cleaning and standardization function is used for processing and cleaning the data obtained from the data sources so that the data is consistent and accurate;
the intelligent feature extraction module further comprises an automatic feature selection and combination algorithm, a domain knowledge base and auxiliary feature extraction, wherein the automatic feature selection and combination algorithm is used for selecting and combining the most informative features, the domain knowledge base is used for storing experience knowledge and rules, and the feature conversion and encoding function is used for converting raw data into feature representations that can be used for model training;
the deep learning model construction and optimization module further comprises a deep neural network structure design function, a model training algorithm and a model optimization algorithm, wherein the deep neural network structure design function is used for constructing a neural network structure adapted to the valuation task, the model training algorithm is used for training the parameters of the valuation model based on historical data, and the model optimization algorithm is used for iterating and optimizing the model according to new market conditions;
the automatic deployment module further comprises a model deployment interface and a real-time valuation service function, wherein the model deployment interface is used for deploying the trained valuation model into different application scenarios, and the real-time valuation service function is used for providing the real-time valuation service based on the deployed model;
the visualization and analysis interface further comprises a data visualization function, an analysis report generation function and a decision assistance function, wherein the data visualization function is used for visualizing and displaying the input data and output results of the valuation model, the analysis report generation function is used for generating detailed valuation analysis reports, and the decision assistance function is used for providing information such as model prediction results, sensitivity analysis and uncertainty estimation to assist the user in making decisions.
2. The learning training platform of an automatic valuation model of claim 1, wherein: the data collection and processing module further comprises the following sub-modules:
data source interface: the interface can be connected with various data sources and can acquire data, including financial data providers, economic research institutions and industry data platforms, and the platform can acquire wide data types by interfacing with a plurality of data sources;
data cleaning and normalization functions: in order to ensure the accuracy and consistency of the data, the platform provides data cleaning and standardization functions;
data quality assessment: the module can automatically detect potential quality problems in the data, and if the potential quality problems are found, the platform can repair or complement the data according to a preset rule or algorithm;
data integration and format conversion: the platform can integrate data from different data sources and perform format conversion to adapt to the requirements of an estimation model;
data storage and management: the platform has the functions of data storage and management, and can effectively manage a large amount of historical data and real-time data;
data update and synchronization: in order to ensure timeliness and accuracy of the estimation model, the platform can automatically update and synchronize data;
Custom dataset integration: the platform also supports the integration and use of user-defined datasets.
3. The learning training platform of an automatic valuation model of claim 1, wherein: the intelligent feature extraction module further comprises the following sub-modules:
deep learning feature extractor: the platform has a built-in deep learning feature extractor that can automatically learn useful feature representations from the raw data;
experience knowledge base guidance: in order to further improve the accuracy and effectiveness of feature extraction, the platform combines the experience knowledge of domain experts to construct an experience knowledge base;
intelligent selection and combination algorithm: the feature extraction module is also provided with an intelligent selection and combination algorithm, and the algorithm can automatically select the features with the most information quantity and combine the features into a feature representation with more expressive force;
dynamic feature update: the feature extraction module of the platform also has the function of dynamic feature update;
multidimensional feature combination: the feature extraction module can combine multidimensional features from different data sources so as to obtain more comprehensive feature representation;
preprocessing and noise handling: the feature extraction module also includes data preprocessing and noise handling functions.
4. The learning training platform of an automatic valuation model of claim 1, wherein: the model training and optimizing module further comprises the following sub-modules:
multiple model choices: the platform provides a choice of several different types of valuation model, including a linear regression model, a decision tree model, a support vector machine model and a neural network model, and a user can select a suitable model for training and optimization according to task requirements and data characteristics;
model initialization and hyperparameter tuning: the model training and optimizing module comprises a model initialization function and a hyperparameter tuning function;
training data partitioning and cross-validation: in order to evaluate the performance and robustness of the model, the platform divides the training data into a training set, a validation set and a test set;
loss function definition and optimization algorithm: the platform allows a user to define a custom loss function according to the requirements of specific tasks;
model iteration and automatic tuning: the model training and optimizing module supports iteration and automatic optimization of the model;
ensemble learning and model fusion: the platform provides ensemble learning and model fusion functions, which combine the prediction results of several different models and thereby improve the accuracy and stability of the valuation model;
Model interpretation and evaluation: to improve the interpretability and reliability of the model, the platform provides the functionality for model interpretation and evaluation.
5. The learning training platform of an automatic valuation model of claim 1, wherein: the visualization and analysis module further comprises the following sub-modules:
data visualization tool: the platform provides visual data visualization tools, and can display original data, characteristic data and model results in a graphical form;
visualization of model results: the platform can display the result of the estimation model in a visual mode so as to help a user understand the prediction and estimation result of the model;
feature importance visualization: the visualization and analysis module can display the importance of the features in an intuitive way;
model interpretation tool: in order to improve the interpretability of the model, the platform provides a model interpretation tool capable of resolving the internal logic and rules of the valuation model;
data exploration and interaction analysis: the platform provides the functions of data exploration and interaction analysis, so that a user can freely explore data and conduct deep analysis;
geographic information analysis: if the data includes geographic location information, the platform may provide geographic information analysis functionality;
Real-time monitoring and early warning: the visualization and analysis module also has the functions of real-time monitoring and early warning.
6. The learning training platform of an automatic valuation model of claim 1, wherein: the automated deployment module further comprises the following sub-modules:
algorithm packaging and model export: the platform provides algorithm packaging and model export functions, and can export the trained valuation model in a standard format;
environment configuration and dependency management: the automatic deployment module is responsible for environment configuration and dependency management, and ensures that software packages, libraries and dependency items in the deployment environment are consistent with the training environment;
system integration and API support: the platform supports system integration and API interfaces, so that the estimation model can be seamlessly integrated into other application programs or systems;
automated deployment tool and pipeline: the automatic deployment module provides an automatic deployment tool and a pipeline, and realizes the automation and repeatability of the deployment process;
deployment monitoring and troubleshooting: the automatic deployment module also comprises deployment monitoring and fault checking functions;
model update and rollback: the automatic deployment module supports updating and rollback of the model;
security and rights control: the automated deployment module attaches importance to security and rights control.
CN202311212514.9A 2023-09-20 2023-09-20 Learning training platform of automatic valuation model Pending CN117235524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311212514.9A CN117235524A (en) 2023-09-20 2023-09-20 Learning training platform of automatic valuation model

Publications (1)

Publication Number Publication Date
CN117235524A (en) 2023-12-15

Family

ID=89085649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311212514.9A Pending CN117235524A (en) 2023-09-20 2023-09-20 Learning training platform of automatic valuation model

Country Status (1)

Country Link
CN (1) CN117235524A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117474125A (en) * 2023-12-21 2024-01-30 环球数科集团有限公司 Automatic training machine learning model system
CN117474125B (en) * 2023-12-21 2024-03-01 环球数科集团有限公司 Automatic training machine learning model system
CN117608865A (en) * 2024-01-23 2024-02-27 江西科技学院 Mathematical model service method and system of take-away meal delivery platform based on cloud computing
CN117608865B (en) * 2024-01-23 2024-04-05 江西科技学院 Mathematical model service method and system of take-away meal delivery platform based on cloud computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination