CN117764726A - Real estate financial risk prevention and control method and system based on big data and artificial intelligence - Google Patents

Info

Publication number
CN117764726A
Authority
CN
China
Prior art keywords
risk
data
real estate
value
artificial intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410196772.0A
Other languages
Chinese (zh)
Inventor
冯永玉
薛秀荣
史辉
高洁
王萌
王燕
高瑜
王人杰
赵军
常胜男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Institute Of Land And Spatial Data And Remote Sensing Technology Shandong Sea Area Dynamic Monitoring And Monitoring Center
Original Assignee
Shandong Institute Of Land And Spatial Data And Remote Sensing Technology Shandong Sea Area Dynamic Monitoring And Monitoring Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Institute Of Land And Spatial Data And Remote Sensing Technology Shandong Sea Area Dynamic Monitoring And Monitoring Center
Priority to CN202410196772.0A
Publication of CN117764726A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a real estate financial risk prevention and control method and system based on big data and artificial intelligence. The method comprises the following steps: receiving and storing input real estate registration information, and extracting features from its structured data and unstructured data; performing cluster analysis on the extracted feature data with a preset cluster-number parameter to obtain a clustering result; constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and a historical data set, and obtaining a trained risk prediction model by optimizing the model parameters through cross-validation; combining historical data and real-time data, evaluating real estate financial risk with the trained risk prediction model, and calculating a risk level from the resulting risk assessment; and issuing a risk early warning when the risk level exceeds a set threshold. The method and system enable comprehensive monitoring of the real estate market and timely prediction and identification of potential risks, so as to provide a more accurate and effective risk prevention and control strategy.

Description

Real estate financial risk prevention and control method and system based on big data and artificial intelligence
Technical Field
The invention relates to the technical field of computers, and in particular to a real estate financial risk prevention and control method and system based on big data and artificial intelligence.
Background
In the current economic environment, the real estate market has become an important field that attracts all kinds of investors. It bears not only on the asset allocation of individuals and enterprises but also directly on the stability of the whole financial system. As real estate transaction activity increases, the financial products and services involved become more diverse and complex. Real estate financial risk management has therefore become a focus of attention for market participants and regulatory authorities.
Traditional real estate financial risk management relies primarily on statistical analysis of historical data and the empirical judgment of professionals. In real-world applications, however, this approach faces several limitations:
1. Complexity: the real estate market is affected by many factors, including economic cycles, policy regulation, market supply and demand, and interest rate changes, and the interaction of these factors increases the complexity of risk management.
2. Large data volume: with the development of information technology, real estate-related data, including transaction records, market quotations and credit data, has grown explosively, and conventional analysis tools struggle to cope with such volumes.
3. Dynamic change: the real estate market is highly dynamic and requires real-time or near-real-time data monitoring and analysis to capture the latest market trends.
4. Difficulty of prediction: because of the nonlinear characteristics of the real estate market and the limitations of existing predictive models, traditional methods have difficulty predicting market trends and risk events accurately.
In view of the above problems, there is a need for a real estate financial risk prevention and control method and system based on big data and artificial intelligence that uses data analysis and artificial intelligence algorithms to process and analyze large-scale data efficiently, and that provides real-time monitoring, dynamic prediction and intelligent early warning so as to improve the accuracy and efficiency of risk management.
Disclosure of Invention
In view of the above, the invention aims to provide a real estate financial risk prevention and control method and system based on big data and artificial intelligence, which realize comprehensive monitoring of a real estate market through a deep learning model and big data analysis technology and timely predict and identify potential risks, thereby providing a more accurate and effective risk prevention and control strategy.
In order to achieve the above purpose, the present invention provides the following technical solutions:
In a first aspect, the invention provides a real estate financial risk prevention and control method based on big data and artificial intelligence, comprising the following steps:
Receiving and storing input real estate registration information, and extracting features from its structured data and unstructured data;
Performing K-means cluster analysis on the extracted feature data with a preset cluster-number parameter to obtain a clustering result;
Constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and a historical data set, and obtaining a trained risk prediction model by optimizing the model parameters through cross-validation;
Combining historical data and real-time data, evaluating real estate financial risk with the trained risk prediction model, and calculating a risk level from the resulting risk assessment;
Issuing a risk early warning when the risk level exceeds a set threshold.
As a further aspect of the invention, the real estate financial risk prevention and control method based on big data and artificial intelligence further comprises the following step: predicting real estate market trends by processing time-series data with a long short-term memory (LSTM) network.
As a further aspect of the present invention, receiving and storing input real estate registration information and performing feature extraction on the structured data comprises the following steps:
Screening the numerical structured features related to risk assessment from the real estate registration information, and grouping the numerical structured features of the real estate according to risk level;
For each numerical structured feature, testing the means across the different risk-level groups with a one-way analysis of variance (ANOVA), and calculating the F value and the P value of the inter-group means;
Setting a significance threshold, and selecting the structured features whose P value is less than the set significance threshold as the extracted features.
As a further aspect of the present invention, calculating the F value and the P value of the inter-group means includes:
Grouping the numerical structured features in the real estate registration information according to risk level, and calculating the mean of each group and the overall mean; wherein the ANOVA calculation includes an inter-group variance and an intra-group variance;
The calculation formula of the inter-group variance (SSB, Sum of Squares Between groups) is:
$$SSB = \sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}\right)^2$$
Wherein $k$ represents the number of groups, $n_i$ is the number of samples in the i-th group, $\bar{X}_i$ is the sample mean of the i-th group, and $\bar{X}$ is the overall mean of all samples;
The calculation formula of the intra-group variance (SSW, Sum of Squares Within groups) is:
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i\right)^2$$
Where $X_{ij}$ represents the j-th observation in the i-th group of the data matrix $X$, the rows of $X$ corresponding to the groups and the columns to the observations within a group, so that the row index $i$ runs over the $k$ groups and the column index $j$ over the $n_i$ samples of the i-th group.
When calculating the variance ratio (F value), the inter-group mean square (MSB, Mean Square Between groups) is calculated:
$$MSB = \frac{SSB}{k - 1}$$
Wherein $k$ represents the number of groups;
The intra-group mean square (MSW, Mean Square Within groups) is calculated:
$$MSW = \frac{SSW}{N - k}$$
Where $N$ is the total number of all samples.
From the inter-group mean square MSB and the intra-group mean square MSW, the variance ratio F statistic is calculated:
$$F = \frac{MSB}{MSW}$$
When the P value of the inter-group means is calculated, the F statistic and the corresponding degrees of freedom are looked up in an F distribution table, or statistical software is used to obtain the P value, wherein the numerator degrees of freedom are $k-1$ and the denominator degrees of freedom are $N-k$.
As a further aspect of the present invention, receiving and storing input real estate registration information, performing feature extraction on unstructured data, comprising the steps of:
Reading input real estate registration information, dividing text content of the real estate registration information into words, and cleaning and normalizing;
Counting the term frequency and inverse document frequency of the words in the text content, and calculating the TF-IDF value of each word in each document of the document set from the term frequency and the inverse document frequency;
Constructing a feature matrix from the TF-IDF values of each document in the document set, wherein each column represents a word, each row represents a document, and each value in the matrix is the TF-IDF value of the corresponding word in the corresponding document.
As a further aspect of the invention, performing K-means cluster analysis on the extracted feature data with a preset cluster-number parameter to obtain a clustering result comprises the following steps:
1) Selecting the number of clusters K:
Selecting the number K of clusters to be partitioned according to the preset cluster-number parameter;
2) Initializing the centroids:
Randomly selecting K pieces of feature data as initial centroids, calculating the distance between each piece of feature data and each initial centroid, and assigning each piece of feature data to the cluster represented by the nearest initial centroid;
3) Updating the centroids:
Recalculating the center of all feature data assigned to each cluster to obtain a new centroid; repeating the assignment and update steps until a preset number of iterations is reached, and taking the final centroid positions and the assignment of the feature data as the clustering result.
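The three steps above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function name `kmeans`, the squared Euclidean distance, the fixed random seed, and the choice to keep the previous centroid when a cluster becomes empty are all assumptions made for illustration.

```python
import random

def kmeans(points, k, n_iter=10, seed=0):
    """Plain K-means: random initial centroids, fixed number of iterations."""
    rng = random.Random(seed)
    # Step 2: pick K feature vectors at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to the nearest centroid (squared Euclidean distance).
        for idx, p in enumerate(points):
            labels[idx] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Step 3: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical two-dimensional feature vectors forming two obvious groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(features, k=2)
```

The iteration count plays the role of the preset number of iterations in step 3); a production system would more likely rely on an established clustering library.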
As a further aspect of the present invention, evaluating real estate financial risk with the trained risk prediction model comprises the following steps:
Combining the received real-time data set and the historical data set into a feature set, and synchronizing them by timestamp;
Standardizing the feature set as a preprocessing step, inputting it into the risk prediction model for scoring, and recording the risk score output by the model to obtain a risk assessment result.
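The preprocessing in these two steps might look like the following sketch. The text does not specify the record layout or the standardization formula, so the `(timestamp, feature_vector)` records, the rule that a real-time record overrides a historical one with the same timestamp, and the z-score standardization are all assumptions; the scoring model itself is left out.

```python
from statistics import fmean, pstdev

def merge_by_timestamp(historical, realtime):
    """Merge (timestamp, feature_vector) records and order them by timestamp.
    Assumption: a real-time record replaces a historical one with the same timestamp."""
    merged = dict(historical)
    merged.update(dict(realtime))
    return sorted(merged.items())

def standardize(rows):
    """Z-score every feature column: subtract its mean, divide by its standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # constant columns are left unscaled
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical records: (date, [transaction price, building age]).
historical = [("2024-01-01", [100.0, 1.0]), ("2024-01-02", [110.0, 3.0])]
realtime = [("2024-01-03", [120.0, 5.0])]

records = merge_by_timestamp(historical, realtime)
feature_set = standardize([features for _, features in records])
```

The standardized rows in `feature_set` would then be passed to the trained risk prediction model to obtain risk scores.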
As a further aspect of the present invention, calculating a risk level includes the following steps:
Setting thresholds for dividing the risk levels, and dividing the risk into a plurality of levels;
Calculating the quantile of the risk score output by the risk prediction model;
Mapping the calculated quantile to the risk levels to obtain the determined risk level.
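One way to read these steps is sketched below. The text does not say against which population the quantile is computed or where the level boundaries lie, so the reference score list, the cut points 0.5 and 0.8, and the three level names are hypothetical.

```python
from bisect import bisect_right

def score_quantile(score, reference_scores):
    """Fraction of reference scores that are less than or equal to `score`."""
    ordered = sorted(reference_scores)
    return bisect_right(ordered, score) / len(ordered)

def risk_level(quantile, cut_points=(0.5, 0.8), names=("low", "medium", "high")):
    """Map a quantile in [0, 1] to a named level using ascending cut points."""
    return names[bisect_right(cut_points, quantile)]

# Hypothetical historical risk scores produced by the model.
reference = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

level = risk_level(score_quantile(0.95, reference))  # quantile 0.9 maps to "high"
```

An early warning would then be issued whenever the resulting level exceeds the configured warning threshold.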
In a second aspect, the invention also provides a real estate financial risk prevention and control system based on big data and artificial intelligence, comprising:
A data acquisition module, configured to receive and store input real estate registration information;
A feature extraction module, configured to extract features from the structured data and unstructured data in the real estate registration information;
A cluster analysis module, configured to perform K-means cluster analysis on the extracted feature data with a preset cluster-number parameter to obtain a clustering result;
A model construction module, configured to construct a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and a historical data set, and to obtain a trained risk prediction model by optimizing the model parameters through cross-validation;
A risk assessment module, configured to combine historical data and real-time data, evaluate real estate financial risk with the trained risk prediction model, and calculate a risk level from the risk assessment result;
An early warning module, configured to monitor the risk level and issue a risk early warning when the risk level exceeds a set threshold.
Compared with the prior art, the real estate financial risk prevention and control method and system based on big data and artificial intelligence have the following beneficial effects:
1. Improved accuracy of risk prediction: the invention uses big data analysis to process and analyze a large amount of historical and real-time data, thereby providing a more accurate risk assessment, and uses the risk prediction model to identify real estate financial risks in the data.
2. Dynamic risk monitoring: by continuously monitoring market changes and real-time data, the risk score can be updated immediately, market changes are reflected quickly, and the risk level is adjusted promptly, which ensures the timeliness and accuracy of risk assessment.
3. Improved decision-making efficiency: the automated risk assessment flow reduces the time and errors of manual operation and makes the decision-making process more efficient. An explicit risk level helps the decision maker decide quickly on loan approval, asset management and risk control.
4. Reduced human bias: because the assessment is based on objective data analysis and model prediction, the influence of human factors is reduced and the objectivity of the assessment is improved; the system can be adjusted to different market environments and data sources, which ensures broad adaptability and scalability; multiple data formats and sources are supported for easy integration with existing financial information systems.
5. Optimized resource allocation: accurate risk classification allows capital and resources to be allocated more effectively, facilitating risk-controlled adjustment of real estate.
These and other aspects of the invention will be more readily apparent from the following description of the embodiments. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the related art more clearly, the drawings required for describing the exemplary embodiments or the related art are briefly introduced below. The drawings provide a further understanding of the present invention and form a part of the specification; together with the embodiments they serve to explain the present invention and do not limit it. In the drawings:
Fig. 1 is a flowchart of a real estate financial risk prevention and control method based on big data and artificial intelligence according to an embodiment of the present invention.
Fig. 2 is a flowchart of feature extraction of structured data in a real estate financial risk prevention and control method based on big data and artificial intelligence according to an embodiment of the present invention.
Fig. 3 is a flowchart of feature extraction of unstructured data in a real estate financial risk prevention and control method based on big data and artificial intelligence according to an embodiment of the present invention.
Fig. 4 is a flowchart of performing cluster analysis on extracted feature data in a real estate financial risk prevention and control method based on big data and artificial intelligence according to an embodiment of the present invention.
Fig. 5 is a flowchart of calculating risk level in a real estate financial risk prevention and control method based on big data and artificial intelligence according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In some of the flows described in the specification and claims of the present invention and in the foregoing figures, a plurality of operations occurring in a particular order are included, but it should be understood that the operations may be performed out of order or performed in parallel, with the order of operations such as 101, 102, etc., being merely used to distinguish between the various operations, the order of the operations themselves not representing any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and "second" being different types.
Technical solutions in exemplary embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in exemplary embodiments of the present invention, and it is apparent that the described exemplary embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Because traditional real estate financial risk management relies mainly on statistical analysis of historical data and the empirical judgment of professionals, the invention provides a real estate financial risk prevention and control method and system based on big data and artificial intelligence that uses data analysis and artificial intelligence algorithms to process and analyze large-scale data efficiently, thereby achieving real-time monitoring, dynamic prediction and intelligent early warning and improving the accuracy and efficiency of risk management. The invention thereby addresses the problems of traditional real estate financial risk management: the increased complexity caused by multi-factor interaction, large data volumes, dynamic market changes and difficulty of prediction.
The technical solution of the invention is further described below with reference to specific embodiments:
Referring to fig. 1, the real estate financial risk prevention and control method based on big data and artificial intelligence provided by the embodiment of the invention comprises the following steps:
Step 1), receiving and storing input real estate registration information, and extracting features from its structured data and unstructured data;
Step 2), performing K-means cluster analysis on the extracted feature data with a preset cluster-number parameter to obtain a clustering result;
Step 3), constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and a historical data set, and obtaining a trained risk prediction model by optimizing the model parameters through cross-validation;
Step 4), combining historical data and real-time data, evaluating real estate financial risk with the trained risk prediction model, and calculating a risk level from the resulting risk assessment;
Step 5), issuing a risk early warning when the risk level exceeds a set threshold.
In this embodiment, real estate registration information is received through a designed user interface that allows a user or an organization to enter detailed information about the real estate, including but not limited to structured data such as location, area, nature of use, transaction price and proof of ownership, and unstructured data such as ownership history and developer information; the entered real estate registration information is stored in a database. During feature extraction, key fields of structured data (such as form data) are extracted as features, for example a price-change trend derived from transaction prices; for unstructured data (such as text descriptions and pictures), natural language processing and image recognition are used to extract key information, such as the developer's reputation level. The results of the K-means cluster analysis are then analyzed to identify different risk groups, such as high-risk and low-risk real estate projects.
For example, suppose a financial institution or user wants to assess the risk of its real estate portfolio. The institution or user applies the above method and enters detailed information about the real estate in step 1). K-means clustering groups the real estate into three categories: residential, commercial and industrial. Next, using the deep learning model, the institution can predict the risk level of each category under future market changes. When market data shows that commercial real estate risk in a certain area is increasing, the system issues a timely early warning, and the financial institution can take measures such as reducing investment in that area or increasing its risk reserves.
The real estate financial risk prevention and control method based on big data and artificial intelligence realizes the whole process from data acquisition and processing to risk assessment and early warning. The method utilizes advanced data analysis technology and artificial intelligence model to improve the accuracy and efficiency of risk assessment. In addition, the risk early warning is also automated by setting the threshold value, so that the financial institution is helped to timely cope with the potential risk, and the assets and the investment of the financial institution are protected. In this way, financial institutions are able to obtain a scientific, systematic and automated risk management tool to help them remain competitive in a changing market.
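The cross-validation used in step 3) to optimize model parameters is not detailed in the text. A common form is k-fold cross-validation, sketched here under that assumption; `k_fold_splits`, `cross_validate` and the `fit_and_score` callback are hypothetical names, and the callback stands in for training the neural network on the training indices and scoring it on the validation indices.

```python
def k_fold_splits(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each of n_folds contiguous folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder over early folds
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

def cross_validate(candidates, n_samples, n_folds, fit_and_score):
    """Return the candidate parameter set with the highest mean validation score."""
    best_params, best_score = None, float("-inf")
    for params in candidates:
        scores = [fit_and_score(params, train, validation)
                  for train, validation in k_fold_splits(n_samples, n_folds)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

In use, `candidates` might be a list of hyperparameter settings (for example hidden-layer sizes or learning rates), and the setting returned would be used to train the final risk prediction model on the full historical data set.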
In this embodiment, referring to fig. 2, input real estate registration information is received and stored, and feature extraction is performed on the structured data, including the following steps:
Step S101, screening the numerical structured features related to risk assessment from the real estate registration information, and grouping the numerical structured features of the real estate according to risk level;
Step S102, for each numerical structured feature, testing the means across the different risk-level groups with a one-way analysis of variance (ANOVA), and calculating the F value and the P value of the inter-group means;
Step S103, setting a significance threshold, and selecting the structured features whose P value is less than the significance threshold as the extracted features.
In the above steps, when the feature extraction is performed on the structured data, feature screening and grouping are performed first, and during feature screening, numerical structured features related to risk assessment, such as price, area, age, geographic location index (such as longitude and latitude), historical transaction data, and the like, are screened out from a large amount of real estate registration information. When the characteristics are grouped, the characteristic data are divided into a plurality of groups according to the risk level of the real property, and each group represents one risk level. For example, real estate may be divided into three groups of low risk, medium risk, and high risk.
One-way analysis of variance (ANOVA) is then performed: each numerical structured feature is subjected to a one-way ANOVA across the risk-level groups to test whether the means of two or more samples differ significantly. The ANOVA test outputs an F value (variance ratio) and a P value (significance probability); a high F value indicates a large difference between groups, and the P value is used to determine whether that difference is statistically significant.
Finally, significant features are selected. The significance threshold is set to a significance level (such as 0.05) and is used to screen the features: only features whose P value is less than the set threshold are selected, since these have a statistically significant effect in distinguishing risk levels.
Illustratively, assume a set of real estate with three risk levels: low, medium and high. Each group includes the following numerical structured features: transaction price, year of construction and lease yield.
Feature screening and grouping: the features are extracted from the database, and the real estate cases are divided into low, medium and high risk groups according to the existing risk assessment.
In the one-way analysis of variance:
ANOVA on the transaction price gives an F value of 15.3 and a P value of 0.0001.
ANOVA on the year of construction gives an F value of 3.5 and a P value of 0.04.
ANOVA on the lease yield gives an F value of 0.8 and a P value of 0.45.
In the significant feature selection, the significance threshold is set to 0.05. Since the P values of the transaction price and the year of construction are less than 0.05, these two features are selected as risk assessment features; the P value of the lease yield is greater than 0.05, so it is not selected as a risk assessment feature.
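The selection rule in this example amounts to a simple filter on the P values. The sketch below reuses the F and P values quoted in the example; the dictionary layout and the function name `select_significant` are illustrative assumptions.

```python
# F and P values quoted in the worked example above.
anova_results = {
    "transaction price": {"F": 15.3, "P": 0.0001},
    "year of construction": {"F": 3.5, "P": 0.04},
    "lease yield": {"F": 0.8, "P": 0.45},
}
SIGNIFICANCE_THRESHOLD = 0.05

def select_significant(results, alpha):
    """Keep the features whose P value is strictly below the significance threshold."""
    return [name for name, stats in results.items() if stats["P"] < alpha]

selected = select_significant(anova_results, SIGNIFICANCE_THRESHOLD)
print(selected)  # ['transaction price', 'year of construction']
```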
In embodiments of the present invention, one-way analysis of variance (ANOVA) is used to compare the sample means of multiple independent groups to determine whether the means of all groups are equal. A P value below the predetermined significance level (typically 0.05) indicates that the mean of at least one group differs significantly from the others, and hence that the feature discriminates between different risk levels.
Therefore, in the real estate financial risk prevention and control method based on big data and AI, feature extraction and analysis of the structured data make it possible to identify which features have a statistically significant association with the risk level of the real estate. One-way analysis of variance (ANOVA) helps determine whether features differ significantly between risk levels and selects the features that contribute to risk prediction, so that a more accurate risk assessment model can be built to better manage and control real estate financial risk.
In the above embodiment, to calculate the F value and the P value of the inter-group means in the ANOVA test, the numerical structured features in the real estate registration information are first grouped by risk level, and the mean of each group and the overall mean are calculated. The ANOVA calculation includes an inter-group variance and an intra-group variance.
The calculation formula of the inter-group variance (SSB, Sum of Squares Between groups) is as follows:
$$SSB = \sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}\right)^2$$
Wherein $k$ represents the number of groups, $n_i$ is the number of samples in the i-th group, $\bar{X}_i$ is the sample mean of the i-th group, and $\bar{X}$ is the overall mean of all samples.
The calculation formula of the intra-group variance (SSW, Sum of Squares Within groups) is:
$$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i\right)^2$$
Where $X_{ij}$ represents the j-th observation in the i-th group of the data matrix $X$, the rows of $X$ corresponding to the groups and the columns to the observations within a group, so that the row index $i$ runs over the $k$ groups and the column index $j$ over the $n_i$ samples of the i-th group.
When calculating the variance ratio (F value), the inter-group mean square (MSB, Mean Square Between groups) and the intra-group mean square (MSW, Mean Square Within groups) are calculated from the inter-group variance and the intra-group variance.
The calculation formula of the inter-group mean square MSB is as follows:
$$MSB = \frac{SSB}{k - 1}$$
Where $k$ represents the number of groups.
The calculation formula of the intra-group mean square MSW is as follows:
$$MSW = \frac{SSW}{N - k}$$
Where $N$ is the total number of all samples.
Further, the variance ratio F statistic is calculated from the inter-group mean square MSB and the intra-group mean square MSW:
$$F = \frac{MSB}{MSW}$$
When the P value of the inter-group means is calculated, the F statistic and the corresponding degrees of freedom are looked up in an F distribution table, or statistical software is used to obtain the P value, wherein the numerator degrees of freedom are $k-1$ and the denominator degrees of freedom are $N-k$.
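The ANOVA quantities described above (SSB, SSW, MSB, MSW and F) can be computed directly from grouped observations, as in the following pure-Python sketch. The function name `one_way_anova` and the sample data are hypothetical. The P value is not computed here; as the text says, it would be read from an F distribution table or obtained from statistical software, for instance `scipy.stats.f.sf(F, k - 1, N - k)` if SciPy is available.

```python
from statistics import fmean

def one_way_anova(groups):
    """Return (SSB, SSW, MSB, MSW, F) for a list of groups of numeric observations."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # N, the total number of samples
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]
    # SSB: group size times squared deviation of the group mean from the overall mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: squared deviation of every observation from its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msb = ssb / (k - 1)          # numerator degrees of freedom: k - 1
    msw = ssw / (n_total - k)    # denominator degrees of freedom: N - k
    return ssb, ssw, msb, msw, msb / msw

# Hypothetical values of one feature in the low, medium and high risk groups.
low, medium, high = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]
ssb, ssw, msb, msw, f_value = one_way_anova([low, medium, high])
# Here SSB = 26, SSW = 6, MSB = 13, MSW = 1, so F = 13 (up to floating-point rounding).
```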
In an embodiment of the present invention, referring to fig. 3, receiving and storing input real estate registration information and performing feature extraction on the unstructured data comprises the following steps:
Step S201, reading the input real estate registration information, segmenting its text content into words, and cleaning and normalizing the words;
Step S202, counting the term frequency and inverse document frequency of the words in the text content, and calculating the TF-IDF value of each word in each document of the document set from the term frequency and the inverse document frequency;
Step S203, constructing a feature matrix from the TF-IDF values of each document in the document set, wherein each column represents a word, each row represents a document, and each value in the matrix is the TF-IDF value of the corresponding word in the corresponding document.
In the above step, word frequency (TF) is used to count the number of occurrences of each word in the document, inverse Document Frequency (IDF) is used to calculate the distribution density of words in all documents, and the less frequent words have higher IDF values. During TF-IDF calculation, the word frequency (TF) and the Inverse Document Frequency (IDF) are multiplied to obtain the importance score of a word for a document, namely: TF-IDF value. When the feature matrix is constructed, the rows represent documents, the columns represent words, each element is a corresponding TF-IDF value, and the constructed feature matrix can be used for training a machine learning model.
By way of example, assume the following text summaries of two pieces of real estate registration information:
Document 1: "The new apartment is located in the city center and has two bedrooms. The price is reasonable."
Document 2: "The old apartment is in the city center, with a low price and a beautiful environment."
In text preprocessing, the word segmentation result is: ["new", "apartment", "located in", "city center", "two", "bedroom", "price", "reasonable"] and ["city center", "old", "apartment", "price", "cheap", "environment", "grace"]. After stop words are removed and lemmatization is applied, the result is: ["new", "apartment", "city center", "two", "bedroom", "price", "reasonable"] and ["city center", "old", "apartment", "price", "cheap", "environment", "grace"].
TF-IDF is then calculated. The word "apartment" appears in both documents, so its IDF value is relatively low, whereas "new" and "old" each appear in only one document, so their IDF values are higher. A feature matrix is then constructed; assuming a 2x5 matrix (2 documents, 5 words), it contains the TF-IDF values of the five words "new", "old", "city center", "price" and "environment".
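The two-document example can be reproduced with a short Python sketch. The text does not fix a particular TF or IDF formula, so the variants used here are assumptions: TF is the raw count divided by document length, and IDF is ln(N / df), which is zero for a word that occurs in every document. The function name `tfidf_matrix` is hypothetical.

```python
import math

# Token lists after stop-word removal, taken from the example above.
docs = [
    ["new", "apartment", "city center", "two", "bedroom", "price", "reasonable"],
    ["city center", "old", "apartment", "price", "cheap", "environment", "grace"],
]

def tfidf_matrix(docs):
    """Return (vocabulary, matrix): one row per document, one column per word."""
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    # Assumed IDF variant: ln(N / df); zero for words found in every document.
    idf = {w: math.log(n_docs / df[w]) for w in vocab}
    matrix = []
    for d in docs:
        # Assumed TF variant: raw count divided by document length.
        matrix.append([d.count(w) / len(d) * idf[w] for w in vocab])
    return vocab, matrix

vocab, matrix = tfidf_matrix(docs)
for word, v1, v2 in zip(vocab, matrix[0], matrix[1]):
    print(f"{word:12s} {v1:.3f} {v2:.3f}")
```

Under these assumptions the words shared by both documents ("apartment", "city center", "price") score zero, while words unique to one document receive a positive weight in that document only. Library implementations often add smoothing to the IDF term, which would give shared words a small non-zero weight instead.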
Therefore, in the risk assessment or information extraction process of the real estate registration information, unstructured text data is converted into structured numerical characteristics, and the importance of words in the document is assessed by using TF-IDF, so that a feature matrix capable of representing the original text content can be constructed. The feature matrix may further be used in various machine learning applications to aid in the decision making process. Such an approach is particularly important, for example, in credit risk assessment, automatic classification of real estate types, or prediction of market trends.
In the embodiment of the present invention, referring to fig. 4, performing K-means cluster analysis on the extracted feature data with the preset cluster number parameter to obtain a clustering result includes the following steps:
1) Selecting the number of clusters K:
Step S301, selecting the number K of clusters to be divided according to the preset cluster number parameter;
2) Initializing the centroids:
Step S302, randomly selecting K pieces of feature data as initial centroids, calculating the distance between each piece of feature data and each initial centroid, and assigning each piece of feature data to the cluster represented by the nearest initial centroid;
3) Updating the centroids:
Step S303, recalculating the center of all feature data assigned to each cluster to obtain a new centroid; repeating the assignment and update steps until the preset number of iterations is reached, and taking the final centroid positions and the assignment of the feature data as the clustering result.
In this embodiment, the feature data of the real estate registration information is analyzed with the K-means clustering algorithm to classify the data: n observations are divided into K sets (K <= n) so that each observation belongs to the set represented by the mean (i.e., centroid) nearest to it.
For example, assume the following feature data set, representing the prices and areas of different real properties, with the preset cluster number parameter K equal to 2: [(200, 30), (150, 35), (300, 70), (250, 60)]. The number of clusters selected is K=2. When initializing the centroids, assuming the initial centroids are [(200, 30), (250, 60)], the distance from each data point to the two centroids is calculated and each point is assigned to the nearest centroid, giving the two clusters {(200, 30), (150, 35)} and {(300, 70), (250, 60)}. When updating the centroids, the first iteration yields the new centroids [(175, 32.5), (275, 65)], and the assignment and update steps are repeated until the centroids no longer change. K-means clustering minimizes the variance between each point and its centroid, and by iteratively updating the centroids and reassigning data points to the nearest centroid it uncovers potential groupings in the data. In real estate market analysis, K-means clustering identifies real properties with similar characteristics so that the market can be segmented, for example by subdividing a residential market according to the size, price and location of the properties, which has practical value in market analysis, price estimation and investment decisions.
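The worked example can be checked with a small pure-Python K-means sketch. It is illustrative only: the function name `kmeans` is hypothetical, the initial centroids are fixed to those of the example rather than chosen randomly, and plain Euclidean distance on unscaled price and area values is an assumption.

```python
import math

def kmeans(points, centroids, max_iter=10):
    """Plain K-means; returns (centroids, labels)."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged before the iteration limit
        centroids = new_centroids
    return centroids, labels

points = [(200, 30), (150, 35), (300, 70), (250, 60)]
centroids, labels = kmeans(points, [(200, 30), (250, 60)])
print(centroids)  # [(175.0, 32.5), (275.0, 65.0)]
print(labels)     # [0, 0, 1, 1]
```

Because price and area are on different scales, a practical pipeline would usually standardize the features before clustering so that one feature does not dominate the distance.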
In this embodiment, evaluating the real estate financial risk with the trained risk prediction model includes the following steps:
combining the received real-time data set and the historical data set into a feature set, and synchronizing them according to the timestamp; performing data standardization preprocessing on the feature set, inputting it into the risk prediction model for scoring, and recording the risk score output by the model to obtain the risk assessment result.
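A minimal sketch of the data preparation described here follows. It makes several assumptions that are not in the text: records are dictionaries with a hypothetical `ts` timestamp field, "synchronizing" is interpreted as ordering the combined records by timestamp, and standardization is a z-score per feature. The scoring step itself is omitted because the trained model is not specified.

```python
from statistics import mean, pstdev

def merge_by_timestamp(historical, realtime):
    """Combine two record lists and order them by their 'ts' field."""
    return sorted(historical + realtime, key=lambda r: r["ts"])

def standardize(rows, fields):
    """Z-score each field; constant columns map to 0.0."""
    stats = {}
    for f in fields:
        values = [r[f] for r in rows]
        stats[f] = (mean(values), pstdev(values))
    return [
        [(r[f] - stats[f][0]) / stats[f][1] if stats[f][1] else 0.0 for f in fields]
        for r in rows
    ]

# Hypothetical records; field names and values are illustrative only.
historical = [{"ts": 1, "price": 200.0, "area": 30.0}, {"ts": 2, "price": 150.0, "area": 35.0}]
realtime = [{"ts": 4, "price": 250.0, "area": 60.0}, {"ts": 3, "price": 300.0, "area": 70.0}]

rows = merge_by_timestamp(historical, realtime)
features = standardize(rows, ["price", "area"])
print(features)
```

In practice the mean and standard deviation would normally be taken from the training data and reused at scoring time, so that real-time inputs are scaled the same way as the data the model was trained on.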
As shown in fig. 5, calculating the risk level includes the following steps:
Step S401, setting thresholds for classifying the risk level, and dividing the risk into a plurality of levels.
In this step, the desired number of risk levels is determined, for example 5 (very low, low, medium, high, very high).
Step S402, calculating the quantiles of the risk scores output by the risk prediction model.
In this step, the risk prediction model outputs a risk score based on the historical data and the real-time data; the risk score represents the probability of default, and quantiles such as the 20%, 40%, 60% and 80% quantiles are calculated from the risk scores.
Step S403, mapping the calculated quantiles to the risk levels to obtain the determined risk level. For example:
1. Very low risk: the probability of default is below the 20% quantile.
2. Low risk: the probability of default is between the 20% and 40% quantiles.
3. Medium risk: the probability of default is between the 40% and 60% quantiles.
4. High risk: the probability of default is between the 60% and 80% quantiles.
5. Very high risk: the probability of default is above the 80% quantile.
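The quantile mapping of steps S401 to S403 can be sketched with the Python standard library. The function names, the sample scores and the handling of scores that fall exactly on a cut-off are assumptions; the text does not say which side of a boundary a tied score belongs to.

```python
from bisect import bisect_right
from statistics import quantiles

LEVELS = ["very low", "low", "medium", "high", "very high"]

def risk_level_cutoffs(scores):
    """Return the 20%, 40%, 60% and 80% quantiles of the scores."""
    return quantiles(scores, n=5, method="inclusive")

def risk_level(score, cutoffs):
    """Map a score to one of five levels using the quantile cut-offs."""
    return LEVELS[bisect_right(cutoffs, score)]

# Hypothetical default probabilities output by a risk prediction model.
scores = [0.02, 0.05, 0.08, 0.11, 0.15, 0.21, 0.30, 0.42, 0.55, 0.73, 0.91]
cutoffs = risk_level_cutoffs(scores)
levels = [risk_level(s, cutoffs) for s in scores]
for s, lvl in zip(scores, levels):
    print(f"{s:.2f} -> {lvl}")
```

Because the cut-offs are quantiles of the scores themselves, each level receives roughly one fifth of the scored properties; fixed absolute thresholds would be an alternative when levels need to be comparable over time.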
Finally, the risk scores and levels are presented to decision makers in the form of a report, and loan conditions, interest rates or real estate trading strategies are adjusted according to the risk level.
It should be understood that, although the steps are described in a certain order, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, some steps of this embodiment may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In this embodiment, the present invention further provides a real estate financial risk prevention and control system based on big data and artificial intelligence, including the following modules:
Data acquisition module: used for receiving and storing input real estate registration information;
Feature extraction module: used for extracting features from the structured data and unstructured data in the real estate registration information;
Cluster analysis module: used for performing K-means cluster analysis on the extracted feature data with the preset cluster number parameter to obtain a clustering result;
Model construction module: used for constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and the historical data set, and obtaining a trained risk prediction model by optimizing the model parameters through cross-validation;
Risk assessment module: used for integrating historical data and real-time data, evaluating the real estate financial risk with the trained risk prediction model, and calculating the risk level from the risk assessment result;
Early warning module: used for monitoring the risk level and issuing a risk early warning when the risk level exceeds a set threshold.
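The check performed by the early warning module can be sketched as a comparison on an ordered list of levels. The five level names follow the example given for step S401; the function name `should_warn` and the default threshold of "medium" are hypothetical choices for illustration.

```python
# Risk levels ordered from lowest to highest.
LEVELS = ["very low", "low", "medium", "high", "very high"]

def should_warn(level, threshold="medium"):
    """True when the level ranks strictly above the threshold level."""
    return LEVELS.index(level) > LEVELS.index(threshold)

# Hypothetical stream of monitored risk levels.
monitored = ["low", "high", "medium", "very high"]
alerts = [lvl for lvl in monitored if should_warn(lvl)]
print(alerts)  # ['high', 'very high']
```

Comparing positions in an ordered list keeps the threshold configurable without hard-coding numeric scores into the warning logic.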
In this embodiment, the real estate financial risk prevention and control system based on big data and artificial intelligence executes the steps of the real estate financial risk prevention and control method described above, so its operation process is not described in detail again here.
The real estate financial risk prevention and control system based on big data and artificial intelligence is based on the modules, and improves the risk management capability of the real estate market through accurate data classification and pattern recognition. The system can collect real estate related data including price, area, geographic location, historical transaction records and the like from a plurality of channels and platforms through an efficient data acquisition module. The collected data is preprocessed, including cleaning, normalization, and feature extraction, to adapt to subsequent cluster analysis. The characteristic data are classified by using a K-means algorithm, the data can be divided into a plurality of categories by setting a clustering number parameter K, each category represents a real estate set with similar attributes, and a clustering result is used for analyzing the integral structure of the real estate market and identifying high-value or high-risk areas. In combination with historical data and market dynamics, the system can predict future price change trends and evaluate potential financial risks. Decision support is provided to assist financial institutions, investors and policy makers in making corresponding risk prevention and control measures and policies.
The real estate financial risk prevention and control system based on big data and artificial intelligence has the characteristics of high efficiency, accuracy, flexibility and practicability, and the automatic data processing and analysis flow greatly improves the efficiency of processing large-scale data sets. The K-means clustering can accurately classify the real estate data, and helps users to quickly identify market patterns and risk points. The flexible setting of the number of clusters K enables the system to adapt to market analysis requirements of different scales and characteristics. The cluster analysis result output by the system can be directly applied to the practice of real estate financial risk prevention and control, and has obvious market application value.
The real estate financial risk prevention and control system combines big data analysis and artificial intelligence algorithm, and provides a scientific, efficient and intelligent risk management solution for the real estate market. By deeply mining the potential information of the data, the system is helpful for realizing the stable development of the real estate market and the financial security.
In an embodiment of the present invention, a computer device is further provided, including at least one processor and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the at least one processor executes the instructions to implement the steps of the above real estate financial risk prevention and control method based on big data and artificial intelligence.
In an embodiment, a computer-readable storage medium is provided, storing computer instructions for causing a computer to perform the steps of the real estate financial risk prevention and control method based on big data and artificial intelligence.
Those skilled in the art will appreciate that implementing all or part of the above described methods in accordance with the embodiments may be accomplished by instructing the associated hardware by a computer program characterized by computer instructions, which may be stored on a non-transitory computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory.
The non-volatile memory may include read-only memory, magnetic tape, floppy disk, flash memory, optical memory, etc. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration, and not limitation, RAM can take many forms, such as static random access memory or dynamic random access memory.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (10)

1. The real estate financial risk prevention and control method based on big data and artificial intelligence is characterized by comprising the following steps:
receiving and storing input real estate registration information, and performing feature extraction on structured data and unstructured data;
performing K-means cluster analysis on the extracted feature data with a preset cluster number parameter to obtain a clustering result;
constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and a historical data set, and obtaining a trained risk prediction model by optimizing model parameters through cross-validation;
integrating historical data and real-time data, evaluating the real estate financial risk with the trained risk prediction model, and calculating a risk level from the obtained risk assessment result;
issuing a risk early warning when the risk level exceeds a set threshold.
2. The real estate financial risk prevention and control method based on big data and artificial intelligence of claim 1, further comprising: processing time series data with a long short-term memory (LSTM) network to predict real estate market trends.
3. The real estate financial risk prevention and control method based on big data and artificial intelligence according to claim 1, characterized in that receiving and storing input real estate registration information and performing feature extraction on structured data comprises the following steps:
screening the numerical structured features related to risk assessment from the real estate registration information, and grouping the numerical structured features of the real estate according to risk level;
for the numerical structured features of each group, testing the means of the different risk level groups with one-way ANOVA, and calculating the F value and P value for the group means;
setting a significance threshold, and selecting the structured features whose P value is less than the set significance threshold as the extracted features.
4. The real estate financial risk prevention and control method based on big data and artificial intelligence as claimed in claim 3, characterized in that calculating the F value and P value for the group means includes:
grouping the numerical structured features in the real estate registration information according to risk level, and calculating the mean of each group and the total mean; wherein the ANOVA calculation includes an inter-group variance and an intra-group variance;
The calculation formula of the inter-group variance SSB is:
$$SSB = \sum_{i=1}^{k} n_i \left(\bar{X}_i - \bar{X}\right)^2$$
wherein k represents the number of groups, $n_i$ is the number of samples of the i-th group, $\bar{X}_i$ is the sample mean of the i-th group, and $\bar{X}$ is the total mean of all samples;
The calculation formula of the intra-group variance SSW is:
$$SSW = \sum_{i=1}^{n} \sum_{j=1}^{m} \left(X_{ij} - \bar{X}_i\right)^2$$
wherein n represents the number of rows, m represents the number of columns, and $X_{ij}$ represents the j-th observation in the i-th group of the matrix X;
When calculating the variance ratio, the inter-group mean square MSB is calculated:
$$MSB = \frac{SSB}{k-1}$$
wherein k represents the number of groups;
The intra-group mean square MSW is calculated:
$$MSW = \frac{SSW}{N-k}$$
wherein N represents the total number of samples;
From the inter-group mean square MSB and the intra-group mean square MSW, the variance-ratio F statistic is calculated:
$$F = \frac{MSB}{MSW}$$
5. The real estate financial risk prevention and control method based on big data and artificial intelligence according to claim 4, wherein, when calculating the P value for the group means, the P value is obtained by looking up the F statistic and its degrees of freedom in an F distribution table or by using statistical software, wherein the numerator degrees of freedom are k-1 and the denominator degrees of freedom are N-k.
6. The real estate financial risk prevention and control method based on big data and artificial intelligence according to claim 1, characterized in that receiving and storing input real estate registration information and performing feature extraction on unstructured data comprises the following steps:
reading the input real estate registration information, segmenting its text content into words, and cleaning and normalizing the words;
counting the word frequency and inverse document frequency of the words in the text content, and calculating the TF-IDF value of each word in each document of the document set from the word frequency and the inverse document frequency;
constructing a feature matrix from the TF-IDF values of each document in the document set, wherein each column represents a word, each row represents a document, and each value in the matrix is the TF-IDF value of the corresponding word in the corresponding document.
7. The real estate financial risk prevention and control method based on big data and artificial intelligence according to claim 6, characterized in that performing K-means cluster analysis on the extracted feature data with the preset cluster number parameter to obtain a clustering result comprises the following steps:
1) Selecting the number of clusters K:
selecting the number K of clusters to be divided according to the preset cluster number parameter;
2) Initializing the centroids:
randomly selecting K pieces of feature data as initial centroids, calculating the distance between each piece of feature data and each initial centroid, and assigning each piece of feature data to the cluster represented by the nearest initial centroid;
3) Updating the centroids:
recalculating the center of all feature data assigned to each cluster to obtain a new centroid; repeating the assignment and update steps until the preset number of iterations is reached, and taking the final centroid positions and the assignment of the feature data as the clustering result.
8. The real estate financial risk prevention and control method based on big data and artificial intelligence of claim 7, wherein evaluating the real estate financial risk with the trained risk prediction model comprises the following steps:
combining the received real-time data set and the historical data set into a feature set, and synchronizing them according to the timestamp;
performing data standardization preprocessing on the feature set, inputting it into the risk prediction model for scoring, and recording the risk score output by the model to obtain the risk assessment result.
9. The real estate financial risk prevention and control method based on big data and artificial intelligence of claim 8 wherein calculating risk level comprises the following steps:
setting a threshold value for classifying the risk level, and classifying the risk level into a plurality of levels;
calculating the quantile of the risk score output by the risk prediction model;
Mapping the calculated quantiles to the risk level to obtain the determined risk level.
10. A real estate financial risk prevention and control system based on big data and artificial intelligence, characterized by being used for executing the real estate financial risk prevention and control method based on big data and artificial intelligence as claimed in any of claims 1-9, the system comprising:
a data acquisition module: used for receiving and storing input real estate registration information;
a feature extraction module: used for extracting features from the structured data and unstructured data in the real estate registration information;
a cluster analysis module: used for performing K-means cluster analysis on the extracted feature data with the preset cluster number parameter to obtain a clustering result;
a model construction module: used for constructing a risk prediction model with a neural network architecture based on the extracted feature data, the clustering result and the historical data set, and obtaining a trained risk prediction model by optimizing the model parameters through cross-validation;
a risk assessment module: used for integrating historical data and real-time data, evaluating the real estate financial risk with the trained risk prediction model, and calculating the risk level from the risk assessment result;
an early warning module: used for monitoring the risk level and issuing a risk early warning when the risk level exceeds a set threshold.
Application number: CN202410196772.0A
Title: Real estate financial risk prevention and control method and system based on big data and artificial intelligence
Priority date: 2024-02-22
Filing date: 2024-02-22
Publication: CN117764726A (en), published 2024-03-26
Status: Pending
Family ID: 90318669
Country: CN



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination