CN110310012B - Data analysis method, device, equipment and computer readable storage medium - Google Patents

Info

Publication number
CN110310012B
Authority
CN
China
Prior art keywords
data
financing
model
prediction
enterprise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910479378.7A
Other languages
Chinese (zh)
Other versions
CN110310012A (en)
Inventor
陈娴娴
阮晓雯
徐亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910479378.7A
Publication of CN110310012A
Application granted
Publication of CN110310012B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G06Q10/06375 Prediction of business process outcome or impact based on a proposed change
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/90 Financial instruments for climate change mitigation, e.g. environmental taxes, subsidies or financing

Abstract

The invention relates to the technical field of artificial intelligence and discloses a data analysis method comprising: obtaining internal portrait data and external portrait data of an object according to a data analysis request and grading them; calculating the maximum financing range and the asset bearing capacity grade of the object; selecting a corresponding model training algorithm according to the maximum financing range and the asset bearing capacity grade to train a financing prediction model; and outputting a financing prediction result based on the enterprise data of the object to be predicted and the model. The invention also discloses a data analysis device, equipment, and a computer readable storage medium. Funds are planned on the basis of the resulting financing mechanism, which avoids the mismatch between financing and income caused by blind financing planning. Because the financing plan is formed from data both inside and outside the enterprise, the accuracy of the system's planning analysis is improved, the maximum benefit of the enterprise's financing is ensured, and the enterprise's enthusiasm for the banked funds is increased.

Description

Data analysis method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a data analysis method, apparatus, device, and computer readable storage medium.
Background
With the continuing development of artificial intelligence, especially in enterprise data statistics and business planning, artificial intelligence can save enterprises considerable human resources. In the prior art, however, an enterprise's planning analysis system is deployed inside the enterprise and its data must be kept confidential, so the system usually cannot be connected to an external network. When analyzing data, such a system therefore usually plans from the enterprise's internal historical data for the current year, which is mostly revenue data. External information would have to be imported continuously from the external network, so the data are not updated in time and the analysis becomes biased and inaccurate. In addition, the system performs little large-scale analysis of the enterprise's assets and maximum capacity, so the enterprise's own planning easily drifts toward one of two extremes: oversaturation, which can seriously threaten the enterprise's survival, or undersaturation, which can limit its development.
Especially when an enterprise needs financing, untimely data updates make the system's analysis incomplete. The analysis easily ignores the limits of the enterprise's own development and capital and exceeds the enterprise's bearing capacity, so the plan is inaccurate. No system or method for high-level, systematic financing analysis has yet been established. As a result, data analysis is inaccurate and inefficient, financing cannot reasonably satisfy the feedback mechanisms appropriate to enterprises of different scales, enterprises operate poorly and face greater financing risk, and the likelihood that financing matches income is reduced.
Disclosure of Invention
The main purpose of the invention is to provide a data analysis method, device, equipment, and computer readable storage medium that solve the technical problem of inaccurate enterprise financing planning analysis caused by data that are not updated in time.
In order to achieve the above object, the present invention provides a data analysis method comprising:
receiving a data analysis request sent by a terminal, and acquiring a corresponding object data set based on an object to be analyzed in the data analysis request, wherein the object data set at least comprises object internal portrait data and external portrait data;
grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, wherein the data subsets correspond one-to-one to the objects to be analyzed;
calculating, according to the data subset, the maximum financing range of the object to be analyzed and the maximum bearing capacity level of its assets;
selecting a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the assets;
and training on the data subset according to the model training algorithm to obtain a financing prediction model, performing ecological balance prediction on the object to be predicted based on the financing prediction model, and outputting a financing prediction result.
Optionally, after the step of obtaining the corresponding object data set based on the object to be analyzed in the data analysis request, the method further includes:
acquiring a data format of a data set used for training the financing prediction model, wherein the data format comprises a tag column, a sequencing order of the tag column and a storage position of data;
according to the data format, adjusting tag columns in the internal portrait data and the external portrait data according to the sorting order, and detecting whether missing or redundant tag columns exist in the tag columns;
if tag columns are missing from the external portrait data and the internal portrait data, adding the missing tag columns at the corresponding positions in the internal portrait data and the external portrait data and filling them with blank data to form a standardized object data set;
if redundant tag columns exist in the external portrait data and the internal portrait data, deleting the redundant tag columns and their data from the data set, or masking them so that they are invalid, to form a standardized object data set.
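The column alignment described above can be sketched in a few lines. This is a minimal illustration, assuming the portrait data arrives as a list of dictionaries; the function names `column_report` and `standardize_rows` and the column names are hypothetical and do not come from the patent.

```python
def column_report(rows, expected_columns):
    """Return (missing, redundant) tag columns relative to the expected format."""
    present = set()
    for row in rows:
        present.update(row)
    missing = [col for col in expected_columns if col not in present]
    redundant = sorted(present - set(expected_columns))
    return missing, redundant


def standardize_rows(rows, expected_columns, fill_value=None):
    """Reorder columns to the expected order, fill missing ones, drop extras."""
    # Building each dict by iterating expected_columns fixes the column order,
    # inserts fill_value for absent columns, and silently drops redundant ones.
    return [{col: row.get(col, fill_value) for col in expected_columns} for row in rows]


# Hypothetical expected format and raw portrait rows.
expected = ["enterprise_rank", "annual_income", "enterprise_scale"]
raw = [
    {"annual_income": 5.2, "enterprise_rank": 3, "source_note": "survey"},
    {"enterprise_rank": 7, "annual_income": 1.1, "source_note": "web"},
]

missing, redundant = column_report(raw, expected)
clean = standardize_rows(raw, expected)
```

Here the redundant column is dropped rather than masked; a masking variant would keep the column and flag it as invalid instead.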
Optionally, after the step of grading the object internal portrait data and external portrait data according to a preset financing grade to obtain at least one data subset, the method further comprises:
scoring the at least one data subset using the weight ratio coefficients in a preset scoring model to obtain a scoring result;
and sorting the data subsets by scoring result in descending order and selecting the N top-scoring data subsets as the effective data sets for training the financing prediction model, where N ≥ 1.
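As a rough illustration of the scoring and top-N selection, the sketch below assumes the preset scoring model is a simple weighted sum. The weights, feature names, and subset names are invented for the example; the patent does not specify them.

```python
def score_subset(features, weights):
    """Weighted sum over the features named in `weights`; absent features count as 0."""
    return sum(weights[name] * features.get(name, 0.0) for name in weights)


def top_n_subsets(subsets, weights, n):
    """Return the names of the n highest-scoring subsets, best first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(subsets, key=lambda item: score_subset(item[1], weights), reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical weight ratio coefficients and graded data subsets.
weights = {"credit": 0.5, "income": 0.3, "scale": 0.2}
subsets = [
    ("grade_low", {"credit": 0.2, "income": 0.4, "scale": 0.1}),
    ("grade_mid", {"credit": 0.6, "income": 0.5, "scale": 0.5}),
    ("grade_high", {"credit": 0.9, "income": 0.8, "scale": 0.7}),
]

selected = top_n_subsets(subsets, weights, n=2)
```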
Optionally, after the step of grading the object internal portrait data and external portrait data according to a preset financing grade to obtain at least one data subset, the method further comprises:
performing feature analysis on the data subset to obtain the data features shared by the data in the data subset;
and performing feature derivation on the data features to obtain data similar to the data in the data subset, wherein the feature derivation further subdivides the data features or expands them with similar features.
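One way to read "subdividing" and "expanding" features is binning a continuous value and adding a ratio feature. The sketch below is only an assumption about what such derivation could look like; the field names and the band thresholds are arbitrary and hypothetical.

```python
def derive_features(record):
    """Return a copy of the record with two derived features added."""
    derived = dict(record)
    income = record["annual_income"]

    # Subdivide: turn continuous income into a coarse band (thresholds are arbitrary).
    if income < 10:
        band = "small"
    elif income < 100:
        band = "medium"
    else:
        band = "large"
    derived["income_band"] = band

    # Expand: add a ratio feature built from two existing features.
    derived["financing_to_income"] = record["past_financing"] / income if income else None
    return derived


example = derive_features({"annual_income": 50.0, "past_financing": 10.0})
```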
Optionally, the step of training on the data subset according to the model training algorithm to obtain a financing prediction model, performing ecological balance prediction on the object to be predicted based on the financing prediction model, and outputting a financing prediction result includes:
when training with the training algorithm of a LightGBM model, matching the LightGBM model training framework corresponding to the grade of the data subset according to the grading result of the data subset, and inputting the data subset into the model framework for training to obtain the financing prediction model, wherein the financing prediction model is as follows:
wherein Obj is the output of the financing prediction model; n > 1; y_i is the tag value of the tag column after normalization of the portrait data; ŷ_i is the estimate of y_i; f_t(x_t) = f(x) is the approximate calculation function of the feature values in the portrait data; f_t is the approximate target value of the feature value x_t; i is the item number in the tag column; x_t is the feature value corresponding to y_i; t is the item number of the feature value; and constant, L and Ω denote constant terms;
acquiring internal portrait data and external portrait data of the object to be predicted, inputting them into the financing prediction model, and outputting the financing grade corresponding to the object to be predicted;
and, according to the financing grade, looking up the corresponding financing evaluation report in a preset correspondence table between financing grades and financing evaluation reports.
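The expression for the financing prediction model in the LightGBM training step is not reproduced in this text. One form consistent with the listed symbols (Obj, n, y_i, ŷ_i, f_t, L, Ω, constant) is the standard additive gradient-boosting objective at round t. This is an assumption, not the patent's confirmed expression, and it writes the per-sample feature vector as x_i where the text writes x_t:

```latex
\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} L\bigl(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + \text{constant}
```

In that standard formulation, L is a per-sample loss, Ω(f_t) penalizes the complexity of the newly added tree f_t, and constant collects the terms that do not depend on f_t.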
Optionally, after the step of looking up the corresponding financing evaluation report in the preset correspondence table between financing grades and financing evaluation reports according to the financing grade, the method further comprises:
calculating the minimum value of the financing prediction model and judging the feasibility of the generated financing report based on the minimum value, wherein the formula for calculating the minimum value is as follows:
wherein R_1(j,s) = {x | x^(j) ≤ s} and R_2(j,s) = {x | x^(j) > s} are the respective dimension intervals in the portrait data, i is the item number in the tag column, j is the splitting feature of the portrait data, s is the split point, and c_i is the absolute value corresponding to y and y_i.
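The minimization formula itself is not reproduced in this text. Given the region definitions R_1(j,s) and R_2(j,s), one consistent reading is the least-squares split criterion of a regression tree. This is an assumption, and c_1 and c_2 (the constant fitted in each region) are names introduced here for illustration:

```latex
\min_{j,\,s}\left[\ \min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 \;+\; \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \right]
```

Under this reading, each inner minimum is attained when c_1 and c_2 equal the mean of the y_i in their respective regions, so only the splitting feature j and the split point s need to be searched.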
In addition, to achieve the above object, the invention provides a data analysis device comprising:
the data acquisition module, used for receiving a data analysis request sent by the terminal and acquiring a corresponding object data set based on an object to be analyzed in the data analysis request, wherein the object data set at least comprises object internal portrait data and external portrait data;
the data grading module, used for grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, wherein the data subsets correspond one-to-one to the objects to be analyzed;
the calculating module, used for calculating, according to the data subset, the maximum financing range of the object to be analyzed and the maximum bearing capacity level of its assets;
the prediction module, used for selecting a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the assets, performing financing prediction training on the at least one data subset according to the model training algorithm to obtain a financing prediction model, performing ecological balance prediction on the object to be predicted based on the financing prediction model, and outputting a financing prediction result.
Optionally, the data analysis device further comprises a format conversion module configured to: obtain the data format of the data set used for training the financing prediction model, wherein the data format comprises tag columns, the sorting order of the tag columns, and the storage position of the data; adjust the tag columns in the internal portrait data and the external portrait data to that sorting order according to the data format, and detect whether any tag columns are missing or redundant; if tag columns are missing from the external portrait data and the internal portrait data, add the missing tag columns at the corresponding positions and fill them with blank data to form a standardized object data set; and if redundant tag columns exist in the external portrait data and the internal portrait data, delete the redundant tag columns and their data from the data set, or mask them so that they are invalid, to form a standardized object data set.
Optionally, the data analysis device further includes a scoring module configured to score the data subsets using the weight ratio coefficients in a preset scoring model to obtain a scoring result, sort the data subsets by scoring result in descending order, and select the N top-scoring data subsets as the effective data sets for training the financing prediction model, where N ≥ 1.
Optionally, the data analysis device further includes a derivation module configured to perform feature analysis on the data subsets to obtain the data features shared by the data in each data subset, and to perform feature derivation on the data features to obtain data similar to the data in the data subset, wherein the feature derivation further subdivides the data features or expands them with similar features.
Optionally, the prediction module comprises a model training unit and a report generating unit;
the model training unit is configured to, when training with the training algorithm of a LightGBM model, match the LightGBM model training framework corresponding to the grade of the data subset according to the grading result of the data subset, and input the data subset into the model framework for training to obtain the financing prediction model, where the financing prediction model is:
wherein Obj is the output of the financing prediction model; n > 1; y_i is the label value of the label column after normalization of the portrait data; ŷ_i is the estimate of y_i; f_t(x_t) = f(x) is the approximate calculation function of the feature values in the portrait data; f_t is the approximate target value of the feature value x_t; i is the item number in the label column; x_t is the feature value corresponding to y_i; t is the item number of the feature value; and constant, L and Ω denote constant terms;
the report generating unit is used for acquiring the internal portrait data and the external portrait data of the object to be predicted, inputting them into the financing prediction model, and outputting the financing grade corresponding to the object to be predicted; and, according to the financing grade, looking up the corresponding financing evaluation report in a preset correspondence table between financing grades and financing evaluation reports.
Optionally, the data analysis device further includes a judging module, configured to calculate the minimum value of the financing prediction model and judge the feasibility of the generated financing report based on the minimum value, where the formula for calculating the minimum value is:
wherein R_1(j,s) = {x | x^(j) ≤ s} and R_2(j,s) = {x | x^(j) > s} are the respective dimension intervals in the portrait data, i is the item number in the tag column, j is the splitting feature of the portrait data, s is the split point, and c_i is the absolute value corresponding to y and y_i.
Further, to achieve the above object, the present invention provides a data analysis apparatus comprising: a memory, a processor, and a data analysis program stored on the memory and executable on the processor, which when executed by the processor performs the steps of the data analysis method as claimed in any one of the preceding claims.
In addition, in order to achieve the above object, the present invention provides a computer-readable storage medium having stored thereon a data analysis program which, when executed by a processor, implements the steps of the data analysis method according to any one of the above.
According to the invention, internal and external portrait data of the enterprise are acquired according to the data analysis request and used for a planning analysis of the financing data, forming a preliminary financing analysis result. A financing plan for the enterprise is then generated by deducing the enterprise's ecological balance from that analysis result, which establishes a financing mechanism that balances financing and income during the financing process. Funds are planned on the basis of this mechanism, which avoids the mismatch between financing and income caused by blind financing planning. Because the financing plan is formed from data both inside and outside the enterprise, the accuracy of the system's planning analysis is improved, the maximum benefit of the enterprise's financing is ensured, and the stability of the enterprise's financing is improved.
Drawings
FIG. 1 is a schematic flow chart of a first embodiment of a data analysis method according to the present invention;
FIG. 2 is a flow chart of a second embodiment of the data analysis method according to the present invention;
FIG. 3 is a schematic diagram of a functional module of an embodiment of a data analysis device according to the present invention;
FIG. 4 is a schematic structural diagram of a server operating environment according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The data analysis method provided by the invention is mainly a method for predicting a financing plan, that is, for planning financing data so that an enterprise's financing and income are balanced; it can also be used for the planning analysis of other businesses. The method can be implemented in an existing financing system, preferably by adding software code that implements the method to that system. The system can physically be a personal computer (PC), a server, a smartphone, or the like. Based on such a hardware structure, various embodiments of the data analysis method of the present invention are presented.
Referring to fig. 1, fig. 1 is a flowchart of a data analysis method according to an embodiment of the present invention. In this embodiment, the data analysis method specifically includes the following steps:
Step S110, receiving a data analysis request sent by a terminal, and obtaining a corresponding object data set based on the object to be analyzed in the data analysis request;
In this step, the object data set includes at least the object's internal portrait data and external portrait data. The object data set can be obtained from an existing enterprise credit system or from review websites on the Internet, and is mainly used to judge an enterprise's resources and bearing capacity. The portrait data include the enterprise ranking, enterprise influence index, enterprise scale, annual income, enterprise type, enterprise category, historical annual financing intensity, financing category, required publicity intensity, publicity market, places where marketing events or exhibitions are held, and so on.
The enterprise ranking includes rankings of operating status, tax payment, total assets, credibility, profit, and the like within the enterprise's locality; depending on the application, nationwide rankings can be used instead.
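To make the shape of the object data set concrete, the sketch below models it as a small container with separate internal and external portrait dictionaries. The class name, field names, and the split of fields between "internal" and "external" are assumptions for illustration; the patent lists the portrait fields but does not say which belong to which side.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectDataSet:
    """Hypothetical container for one enterprise's portrait data."""
    enterprise_name: str
    internal_portrait: dict = field(default_factory=dict)
    external_portrait: dict = field(default_factory=dict)

    def merged(self):
        """Combine both portraits; internal values win if a key appears in both."""
        combined = dict(self.external_portrait)
        combined.update(self.internal_portrait)
        return combined


# Assumed assignment of fields to the internal and external portraits.
data_set = ObjectDataSet(
    enterprise_name="ExampleCo",
    internal_portrait={"annual_income": 50.0, "enterprise_scale": "medium"},
    external_portrait={"enterprise_rank": 12, "influence_index": 0.7},
)
features = data_set.merged()
```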
Step S120, grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, wherein the data subset corresponds to the object to be analyzed one by one;
Step S130, calculating, according to the data subset, the maximum financing range of the object to be analyzed and the maximum bearing capacity level of its assets;
In this embodiment, the object is an enterprise. Steps S120 and S130 perform a planning analysis of financing data on the enterprise data set to obtain a preliminary financing analysis result, which includes the enterprise's maximum financing range and the maximum bearing capacity level of its assets. This planning analysis mainly examines the enterprise's asset bearing capacity, which reflects the enterprise's development status comparatively well and also helps in preparing its future development plan. Financing is one mode of enterprise development: it can advance the enterprise's own growth and can also provide support to external parties.
The asset bearing capacity is calculated from both the tangible and the intangible assets of the enterprise. Intangible assets are value conferred from outside as a result of sound management, in other words the enterprise's credibility, which is a trusted resource that underpins its actual financing.
The maximum financing range and the bearing capacity level also depend on the field in which the enterprise operates, such as its main direction of development or the products it makes. They must be calculated according to the enterprise type and the industry it serves, since not every enterprise can freely finance in any industry or field.
For example, the enterprise's maximum financing capability can be judged from its current net income and liabilities; the financing amount is determined from that capability level, and the enterprise's bearing capacity level is then determined from the financing amount. The calculation can also take into account the enterprise's current business trend, actual expenditure, development status, and other factors.
Step S140, selecting a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the assets thereof;
In practice, the model training algorithm can be selected from a correspondence table. The user first estimates the company's financing range from its actual financing situation and factors such as assets and income, classifies the range into low, medium, and high levels, and assigns a model training algorithm to each level, thereby creating the correspondence table. In use, the maximum financing range and the maximum bearing capacity level of the assets serve as the search keys for retrieving the corresponding model training algorithm from the table.
Alternatively, the algorithm can be chosen from the company's historical financing records: the records are searched using the maximum financing range and the maximum bearing capacity level of the assets, records of a similar level are selected, and the model training algorithm recorded there is extracted.
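The correspondence-table approach can be sketched as a dictionary lookup keyed by the two levels. The thresholds, the table contents, and the fallback are assumptions made for the example; apart from LightGBM, which the text mentions, the algorithm names are placeholders.

```python
def range_level(max_financing, low_cap=1_000_000, high_cap=10_000_000):
    """Classify a maximum financing amount as low, medium, or high (caps are arbitrary)."""
    if max_financing < low_cap:
        return "low"
    if max_financing < high_cap:
        return "medium"
    return "high"


# Hypothetical correspondence table: (financing range level, capacity level) -> algorithm.
ALGORITHM_TABLE = {
    ("low", "low"): "decision_tree",
    ("medium", "medium"): "random_forest",
    ("high", "high"): "lightgbm",
}


def select_algorithm(max_financing, capacity_level, default="lightgbm"):
    """Look up the training algorithm; fall back to a default when no row matches."""
    key = (range_level(max_financing), capacity_level)
    return ALGORITHM_TABLE.get(key, default)


choice = select_algorithm(500_000, "low")
```

A table keeps the selection rule auditable and editable by the user, which matches the text's description of a user-created correspondence table.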
Step S150, performing financing prediction training on the at least one data subset according to the model training algorithm to obtain a financing prediction model, performing ecological balance prediction on the object to be predicted based on the financing prediction model, and outputting a financing prediction result.
In this embodiment, the object to be predicted is the enterprise for which the user needs a financing plan prediction. The object to be analyzed is an enterprise used for model training; there may be several, and they mainly supply the data for training the model. Financing prediction by the model is an ecological deduction of the enterprise's overall balance. Deduction of the enterprise's ecological balance means deducing the degree of balance between the enterprise's financing and its income: based on the preliminary financing analysis result, the enterprise's financing is simulated to check whether it can reach maximum financing saturation while the enterprise's survival remains assured. Specifically, the balance level between financing and income can be deduced from the maximum financing range and the maximum bearing capacity level of the assets, and the corresponding financing amount determined from the balance level; the preparation of the financing can even be planned directly from the financing scale or the start of the financing.
In this embodiment, for step S120, the enterprise may first be rated from the enterprise ranking, enterprise influence index, enterprise scale, annual income, enterprise type, enterprise category, historical annual financing intensity, financing category, required publicity intensity, publicity market, and other data in its portrait data. For example, the operating condition of the enterprise can first be analyzed from the enterprise influence index, annual income, historical annual financing intensity, and publicity situation. If the operating condition is good, a deeper computational analysis is performed, that is, a machine analysis that combines more of the enterprise's portrait data, yielding the final financing intensity and, given that financing intensity, the maximum bearing capacity level of the enterprise's assets.
Evaluating an enterprise in this way determines the corresponding partial financing plan, so the enterprise can carry out its financing operations more effectively, which increases the enterprise's enthusiasm for the banked financing and also ensures that the banked financing is fully implemented and used.
In this embodiment, in step S110, after the corresponding object data set is obtained based on the object to be analyzed in the data analysis request, the method further includes:
preprocessing the portrait data in the object data set, wherein the preprocessing converts the format of the portrait data into the data format required by the data analysis system, yielding a normalized object data set.
In practice, normalizing the data format converts the enterprise portrait data in the data set into a fixed format to simplify subsequent calculation. Simplifying the data in this way prevents disordered data from affecting the calculation result, which improves the reliability of the calculation on the enterprise data and of the final prediction of the financing scale.
In this embodiment, the preprocessing the image data in the object data set includes:
acquiring the data format of the data set used for training the financing prediction model, wherein the data format comprises the tag columns, the sort order of the tag columns and the storage position of the data;
according to the data format, arranging the tag columns in the internal portrait data and the external portrait data in the sort order, and detecting whether any tag columns are missing or redundant;
if tag columns are missing from the external portrait data and the internal portrait data, adding the missing tag columns at the corresponding positions in the internal portrait data and the external portrait data and filling them with blank data, so as to form a standardized object data set;
if redundant tag columns exist in the external portrait data and the internal portrait data, deleting the redundant tag columns and their data from the data set, or masking them so that they are invalid, so as to form a standardized object data set.
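The column alignment described in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the column names in `EXPECTED` and the helper `normalize_columns` are hypothetical, and blank data is represented by `None`.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def normalize_columns(rows: List[Row], expected: List[str]) -> List[Row]:
    """Reorder columns to `expected`, add missing ones as blanks, drop extras."""
    normalized = []
    for row in rows:
        # Building the dict in `expected` order fixes the sort order;
        # row.get() yields None (blank) for a missing tag column, and any
        # column not listed in `expected` is simply left out (deleted).
        normalized.append({col: row.get(col) for col in expected})
    return normalized


# Hypothetical tag columns of the training data set.
EXPECTED = ["annual_income", "enterprise_scale", "influence_index"]

raw = [
    {"influence_index": 0.7, "annual_income": 1200.0, "fax_number": 123.0},
    {"enterprise_scale": 300.0, "annual_income": 450.0},
]
clean = normalize_columns(raw, EXPECTED)
```

Here the redundant column `fax_number` is dropped and the missing columns are added as blanks, so every row ends up with the same columns in the same order.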
That is, the original data is normalized by masking the label column, filling missing values in the missing columns by tree-model prediction, applying a box-plot anomaly test, and the like. Rather than adding columns, the original data is cleaned by means of data cleaning methods. The label column is masked because it is the tag column: apart from model training and verification, the label column should be used as cautiously as possible, because its data is very important. Therefore, when missing values are filled with the tree model, this column is preferably removed to reduce its influence, and only the other portrait data is considered. For the same reason, data processing and feature engineering in many cases mask the label column rather than use it.
In practical applications, the acquired portrait data set is generally stored as a data table. When the data table is output by an enterprise or a statistics company, various tag columns are set in it, and such tag columns are not needed in the prediction of the financing plan in the present application.
For example, if the check finds that data is missing, a blank column is added to the portrait data so that it meets the preset format requirement; preferably, the portrait data is normalized by methods such as filling missing values in the missing column by tree-model prediction and the box-plot anomaly test. If the check finds that data is redundant, the redundant information is removed by deleting the data or by masking the tag column.
Processing the original data into the same format in this way standardizes the data and avoids deviations in the subsequent grade evaluation caused by diverse data formats.
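The box-plot anomaly test mentioned above is commonly implemented with the interquartile range (IQR). The sketch below assumes the conventional 1.5 × IQR fences, which the text does not specify; the function name `boxplot_outliers` and the sample values are hypothetical.

```python
import statistics
from typing import List


def boxplot_outliers(values: List[float], k: float = 1.5) -> List[float]:
    """Return the values lying outside the box-plot fences Q1 - k*IQR, Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical annual incomes with one obviously abnormal entry.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 100]
outliers = boxplot_outliers(incomes)
```

Values flagged this way could then be treated as missing and filled by the tree-model prediction, or reviewed manually.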
In this embodiment, performing planning analysis of the financing data according to the obtained object data set to obtain the preliminary financing analysis result of the enterprise includes:
performing data grading on the preprocessed portrait data according to preset grades to obtain a plurality of data subsets, wherein the data subsets are in one-to-one correspondence with the enterprises;
and calculating, according to the data subsets, the maximum financing range corresponding to the enterprise and the maximum bearing capacity level of the assets in the portrait data.
The portrait data is a data set comprising external portrait data and internal portrait data, and the portrait data set can be graded by feature binning (bucketing).
In practical application, the data features of each grade correspond to a different financing grade. For example, suppose the financing grades of an enterprise are divided into 10 grades. During grading, keywords in the portrait data that has been converted into the specified format are compared with the data features of each of the 10 grades, and the grade is determined from the comparison result. Assume the portrait data is enterprise data of a single type covering a period of time. During grading, the portrait data is first segmented by small time intervals to obtain a plurality of small sets; each small set is then compared with the data features corresponding to the 10 grades, and a small set is assigned to a grade when its data features reach the value corresponding to that grade. When the comparison is complete, the small sets form at least one data subset.
For example, if the obtained portrait data set is continuous data such as the enterprise's annual income over consecutive years, the data is divided by year to obtain a plurality of small sets. Each grade corresponds to an average annual income amount, and the average value of each small set is compared with the amount corresponding to each grade: an average above 1000W (W denoting units of ten thousand) corresponds to a factor variable of 10 (that is, grade 10), an average above 500W corresponds to 8, and so on.
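The income-based grading in this example can be sketched as a threshold lookup. Only the first two rules (above 1000W gives 10, above 500W gives 8) come from the text; the third rule, the default grade and the function name `grade_income` are hypothetical placeholders for the "and so on".

```python
from statistics import mean
from typing import List, Tuple

# (lower bound in W, grade). Only the first two pairs come from the text;
# the third pair and DEFAULT_GRADE are hypothetical.
RULES: List[Tuple[float, int]] = [(1000, 10), (500, 8), (100, 6)]
DEFAULT_GRADE = 1


def grade_income(yearly_incomes: List[float]) -> int:
    """Grade a small set of yearly incomes by comparing its average to each bound."""
    avg = mean(yearly_incomes)
    for bound, grade in RULES:  # rules are ordered from highest bound down
        if avg > bound:
            return grade
    return DEFAULT_GRADE


grade_a = grade_income([900, 1200, 1500])  # average 1200W
grade_b = grade_income([400, 600, 800])    # average 600W
```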
Of course, the above is merely an example using amounts for ease of understanding. Some data are not amounts but text-type data, which also need to be graded or discretized after text matching in the manner described above, except that the comparison is made against specific keywords in the text. The grading is also set according to the actual situation of the enterprises participating in the financing evaluation. For example, when the overall strength of the participating enterprises is high, fewer grades may be used and the requirement of each grade is relatively high; when their overall strength is low, more grades are used and the requirement of the lower grades is relatively low, so that the method can be oriented mainly toward small enterprises and supports poverty alleviation.
Of course, the grading may also be implemented by a grading model. The analysis model is trained on the normalized data; that is, model training may use the data labels directly or may be performed according to the data features obtained in grading.
In this embodiment, after the object internal portrait data and external portrait data are graded according to the preset financing grades to obtain at least one data subset, the method further includes:
performing feature analysis based on the data subset, and extracting the data features shared by the data in the data subset;
and deriving from the data features so as to expand the data subset with data similar to the data it contains, wherein derivation refers to further subdividing the data features or expanding them with similar features, so that the division of the portrait data set is more accurate.
In practical applications, it is not necessary to derive all data features; only some of them may be derived, and which ones is determined by the actual classification of the enterprise data. Preferably, the derivation is performed on the preset graded data features. Suppose, for example, that a certain column of features is derived by converting it into one-hot form. Derivation is not applied to every item of data; rather, features are screened in turn according to the nature of the data itself, which is an experimental process. Several different feature processing methods may be tried on the portrait data, and the best feature engineering algorithm is finally selected based on the results.
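As an illustration of deriving a column into one-hot form, the following sketch expands a categorical column into indicator features. The category values and the `is_` naming scheme are hypothetical.

```python
from typing import Dict, List


def one_hot(values: List[str]) -> List[Dict[str, int]]:
    """Expand one categorical column into one 0/1 indicator column per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical enterprise-type column.
enterprise_types = ["private", "state_owned", "private", "foreign"]
encoded = one_hot(enterprise_types)
```

Each row then carries exactly one indicator set to 1. Whether such a derivation actually helps is, as the text notes, decided by experiment.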
Further, feature derivation is implemented as follows. First, it is judged through the grade comparison whether each data feature can be derived. If so, the feature is verified against its type and scene within the grade, and, if the verification passes, it is derived with regard to the plausibility of the portrait data itself. The derivation must not go beyond the actual development of the enterprise's portrait data; excessive derivation biases the subsequently trained model and makes the enterprise's financing evaluation inaccurate. For example, if the current type of a data feature is A and type B is similar to type A, whether the feature can be derived to type B is judged with regard to the development of the portrait data. If so, the feature is derived, and the other features under type B are used as features for the current feature comparison. This expands the number of features while still meeting the deduction requirement of the enterprise, and improves the evaluation accuracy.
In this embodiment, feature derivation includes feature differentiation and feature search. For feature differentiation, a differentiation method may be selected according to the type of the data feature or the category of the data subset, and each data feature is split with that method, each resulting small feature being a phrase that belongs to the same type as, or has the same meaning as, the original data feature.
For feature search, words may be grouped according to the semantics of the data features in the data subset to obtain more similar features, and the features belonging to the same category as the data in the data subset are then selected from these similar features.
In this embodiment, in the step of generating the recommended financing plan, the financing plan may be generated by model training on the acquired object data set. Specifically, a financing deduction model is obtained by training, and the corresponding financing plan is output when the corresponding enterprise data is input into the model.
In practical application, model training is performed on the portrait data that has undergone data grading, so as to obtain the financing prediction model.
To train the model, the known enterprise portrait data is first labeled by the enterprise itself or by an expert, and the corresponding data table is output. For example, enterprise A has portrait dimensions f1 to fn and label 1, which maps to its enterprise financing, publicity strength and publicity market; similarly, enterprise B has f1 to fn and label 2, and so on. In this way a data set with known labels is collected, a model is built and trained on it, and the rule is learned from the labeled data set. With the simplest linear model, y = a1·f1 + a2·f2 + a3·f3 + …, training on the labeled data yields the values of a1, a2, a3, …, where f1 … fn are the portrait data and a1 … an are the coefficients of the model function.
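For the simplest linear model above, one common way to obtain the coefficients a1 … an is ordinary least squares. The text does not say which fitting method is used, so the normal-equation approach below is an assumption, and the tiny data set is invented so that the expected answer is obvious.

```python
from typing import List


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_linear(F: List[List[float]], y: List[float]) -> List[float]:
    """Least-squares coefficients for y ~ a1*f1 + ... + an*fn via (F^T F) a = F^T y."""
    n = len(F[0])
    FtF = [[sum(row[i] * row[j] for row in F) for j in range(n)] for i in range(n)]
    Fty = [sum(row[i] * yi for row, yi in zip(F, y)) for i in range(n)]
    return solve(FtF, Fty)


# Invented labeled data generated by y = 2*f1 + 3*f2.
F = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, 3.0, 5.0, 7.0]
coeffs = fit_linear(F, y)
```

On this data the fit recovers coefficients close to 2 and 3, matching the rule that generated the labels.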
In this scheme, besides training and deduction with the linear model, the LightGBM model is preferably used for training. During training, the feature data is sampled and selected according to gradients, which reduces the amount of data to be computed and greatly improves training efficiency. The model is trained as follows.
Input: training data, number of iterations d, sampling rate a for large-gradient data, sampling rate b for small-gradient data, loss function, and the type of weak learner (generally a decision tree);
Output: a trained strong learner;
(1) Sorting the sample points in descending order of the absolute value of their gradients;
(2) Selecting the top a × 100% of the sorted samples to form the subset of large-gradient sample points;
(3) Randomly selecting sample points amounting to b × 100% of all samples from the remaining (1 − a) × 100% of the samples to form the set of small-gradient sample points;
(4) Combining the large-gradient samples and the sampled small-gradient samples;
(5) Multiplying the small-gradient samples by the weight coefficient (1 − a)/b;
(6) Learning a new weak learner using the sampled samples;
(7) Repeating steps (1) to (6) until the prescribed number of iterations is reached or the model converges.
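Steps (1) to (5) above correspond to one round of gradient-based one-side sampling. The sketch below covers only the sampling, not the weak learner, and assumes that b is a fraction of the full sample, which is what makes the weight (1 − a)/b compensate for the discarded small-gradient points. The function name `goss_sample` and the gradient values are hypothetical.

```python
import random
from typing import List, Tuple


def goss_sample(gradients: List[float], a: float, b: float,
                rng: random.Random) -> List[Tuple[int, float]]:
    """Return (sample index, weight) pairs for one sampling round."""
    n = len(gradients)
    # (1) sort indices by absolute gradient, largest first
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_n = int(a * n)
    rand_n = int(b * n)
    large = order[:top_n]                      # (2) keep all large gradients
    small = rng.sample(order[top_n:], rand_n)  # (3) sample from the rest
    weight = (1.0 - a) / b                     # (5) re-weight small gradients
    # (4) combine both groups
    return [(i, 1.0) for i in large] + [(i, weight) for i in small]


rng = random.Random(0)
grads = [0.1 * i * (-1) ** i for i in range(20)]  # invented gradients
picked = goss_sample(grads, a=0.2, b=0.1, rng=rng)
```

With 20 samples, a = 0.2 and b = 0.1, four large-gradient points are kept with weight 1 and two small-gradient points are drawn with weight 8.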
The algorithm greatly speeds up model learning without losing much of the learner's accuracy and without substantially changing the data distribution.
In this embodiment, training the at least one data subset for financing prediction according to the model training algorithm to obtain a financing prediction model, predicting the ecological balance of the object to be predicted based on the financing prediction model, and outputting a financing prediction result includes:
when training is performed with the training algorithm of the LightGBM model, matching the LightGBM training framework corresponding to the grade of the data subset according to the grading result of the data subset, and inputting the data subset into the model framework for training to obtain the financing prediction model, wherein the financing prediction model is as follows:
wherein Obj is the output result of the financing prediction model; n > 1; y_i is the tag value of the tag column after normalization of the portrait data; ŷ_i is the estimated value of y_i; f_t(x_t) = f(x) is the approximate calculation function of the feature value in the portrait data; f_t is the approximate target value at feature value x_t; i is the index over the entries of the tag column; x_t = x is the feature value corresponding to y_i; t is the index over the feature values; and constant, L and Ω represent constant terms;
acquiring internal portrait data and external portrait data of the object to be predicted, inputting them into the financing prediction model, and outputting the financing grade corresponding to the object to be predicted;
and querying, according to the financing grade, the corresponding financing evaluation report from a preset correspondence table between financing grades and financing evaluation reports.
After the corresponding financing evaluation report is queried from the preset correspondence table between financing grades and financing evaluation reports according to the financing grade, the method further includes:
calculating the minimum value of the financing prediction model, and judging the feasibility of the generated financing report based on the minimum value, wherein the formula for calculating the minimum value is as follows:
wherein R_1(j,s) = {x | x^(j) ≤ s} and R_2(j,s) = {x | x^(j) > s} are the value ranges of the two dimension intervals in the portrait data; i is the index over the entries of the tag column; j is the splitting feature of the portrait data; s is the split point; and c_i is the absolute value corresponding to y and y_i.
Specifically, the object data set is trained with the LightGBM model to obtain the financing prediction model as follows:
Let the object data set contain n instances {x_1, …, x_n} with feature dimension s. In each gradient boosting iteration, the negative gradients of the loss function with respect to the model output are denoted g_1, …, g_n. The decision tree splits the data at each node at the optimal split point (the point of maximum information gain), and the split data is then trained in the preset model framework of the LightGBM model to obtain the final financing plan prediction model, whose function formula is as follows:
wherein y_i is the tag value of the tag column after normalization of the portrait data; f_t(x_t) is the approximate calculation function of the feature value in the acquired portrait data; x_t is the feature value corresponding to y_i; j is the splitting feature of the portrait data; s is the split point; and constant represents a constant term.
Further, a Taylor expansion is calculated on the financing plan prediction model to obtain an approximate target value of the enterprise financing plan, using the following calculation formula:
then the minimum value of the model function is obtained, and the feasibility of the enterprise's financing plan is judged based on this minimum value, using the following formula:
wherein R_1(j,s) = {x | x^(j) ≤ s} and R_2(j,s) = {x | x^(j) > s} are the value ranges of the two dimension intervals in the portrait data; and c_i is the absolute value corresponding to y and y_i.
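For orientation, the symbols defined above (Obj, L, Ω, constant, f_t, R_1, R_2, j, s) match the standard forms used in the gradient boosting and regression tree literature. The equations below are those standard forms, given as an assumption about what the three formulas look like rather than as the confirmed text of this document. The index x_i, the second-order term h_i and the region constants c_1, c_2 are notation introduced here.

The objective at boosting round t:

$$
\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} L\bigl(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + \text{constant}
$$

Its second-order Taylor approximation, with $g_i$ and $h_i$ the first and second derivatives of $L$ with respect to $\hat{y}_i^{(t-1)}$:

$$
\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Bigl[ L\bigl(y_i, \hat{y}_i^{(t-1)}\bigr) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Bigr] + \Omega(f_t) + \text{constant}
$$

The least-squares split search over splitting feature $j$ and split point $s$:

$$
\min_{j,s} \Bigl[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \Bigr]
$$

In the last expression the inner minima are attained when $c_1$ and $c_2$ are the mean tag values of the samples falling in $R_1$ and $R_2$ respectively.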
LightGBM adopts a leaf-wise growth strategy: each time, it finds among all current leaves the one with the largest split gain (generally also the one with the most data), splits it, and repeats. Leaf-wise growth can, however, produce deep decision trees and cause overfitting, so LightGBM adds a maximum depth limit on top of leaf-wise growth, preventing overfitting while keeping efficiency high. LightGBM also optimizes support for categorical features: categorical features can be input directly without additional 0/1 expansion, and decision rules for categorical features are added to the decision tree algorithm. In data parallelism, Reduce Scatter is used to distribute the task of histogram merging across different machines, which reduces communication and computation, and histogram subtraction further halves the communication volume. Voting-based data parallelism (Parallel Voting) further optimizes the communication cost of data parallelism, making it constant.
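In the LightGBM library, the leaf-wise growth and the depth limit described here are controlled through training parameters. The dictionary below is a configuration sketch only: the parameter names are standard LightGBM parameters, but every value is a hypothetical starting point, and treating the ten financing grades as a ten-class problem is an assumption.

```python
# Hypothetical values; the keys are standard LightGBM parameter names.
params = {
    "objective": "multiclass",  # assumption: one class per financing grade
    "num_class": 10,            # ten grades, as in the grading example
    "num_leaves": 31,           # caps the number of leaves grown leaf-wise
    "max_depth": 6,             # depth limit that guards against overfitting
    "learning_rate": 0.1,
    "tree_learner": "serial",   # "data" or "voting" select the parallel modes
}
```

Keeping `num_leaves` no larger than `2 ** max_depth` keeps the two limits consistent, since a tree of that depth cannot hold more leaves.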
In summary, the LightGBM model is robust, guards well against overfitting, and is optimized for performance, so that it runs faster and consumes less memory; these are important reasons for selecting the LightGBM model.
The corresponding financing scale is then generated according to the maximum financing range and the maximum bearing capacity level of the assets, in combination with the financing plan prediction model.
In this embodiment, the financing plan prediction model may be obtained by training in advance on a large amount of financing plan data. Of course, the model may also be established from demand theory and then trained continuously during actual prediction, thereby improving its accuracy.
Further, after the object internal portrait data and external portrait data are graded according to the preset financing grades to obtain at least one data subset, the method further includes:
scoring the at least one data subset with the weight ratio coefficients in a scoring model;
and selecting, according to the scoring result, the higher-scoring data subsets from the at least one data subset as the effective data set for training the financing prediction model.
In practical application, when the prediction model is trained, weight ratios are introduced to score and screen the data. Of course, the weight ratios may also be set in the data preprocessing step so that the data is screened in advance. Specifically, this may be done with the scoring model, based on specific indexes such as enterprise financing, publicity strength and publicity market. Enterprise financing and publicity strength are considered the most important scoring indexes and are each assumed to have a weight of 0.3, so the two together carry a weight of 0.6; the remaining indexes share the remaining weight evenly, summing to 0.4. Next, the specific values of the indexes are normalized and the weighted sum is calculated. Finally, each enterprise obtains one specific quantified label value. This completes the combing of the label column (that is, the preprocessing of the data).
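The weighted scoring just described can be sketched as follows. The weights 0.3 and 0.3 and the even split of the remaining 0.4 come from the text; min-max normalization is an assumption (the text says only that the values are normalized), and the index names and values are invented.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Scale a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_weights(indexes: List[str], key_indexes: List[str],
                  key_weight: float = 0.3) -> Dict[str, float]:
    """Give each key index `key_weight`; split the rest evenly over the others."""
    rest = [name for name in indexes if name not in key_indexes]
    remaining = 1.0 - key_weight * len(key_indexes)
    weights = {name: key_weight for name in key_indexes}
    weights.update({name: remaining / len(rest) for name in rest})
    return weights


def score(table: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted sum of the normalized index values, one score per enterprise."""
    normalized = {name: min_max(col) for name, col in table.items()}
    n = len(next(iter(table.values())))
    return [sum(weights[name] * normalized[name][i] for name in table)
            for i in range(n)]


# Invented index values for three enterprises.
table = {
    "financing": [100.0, 300.0, 200.0],
    "publicity_strength": [1.0, 3.0, 2.0],
    "publicity_market": [5.0, 5.0, 10.0],
    "enterprise_scale": [2.0, 4.0, 2.0],
}
weights = build_weights(list(table), ["financing", "publicity_strength"])
scores = score(table, weights)
```

The resulting scores can be used directly as the quantified label value of each enterprise, or to keep only the higher-scoring data subsets for training.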
In this embodiment, besides predicting the financing scale with the above model, simple analysis mechanisms may be used for a simple classification. This is not applicable when the data size is large, and its logic is not fully rigorous. However, if resources and time are limited, a simple scheme is to extract part of the portrait data for simple classification, running correlation tests against the label data and using the portrait data with higher correlation as the basis for division.
Fig. 2 is a flowchart of a data analysis method that performs training and financing prediction based on the LightGBM model according to an embodiment of the present invention. The method specifically includes the following steps:
Step S210, obtaining, over an Internet communication connection, the portrait data of the enterprise to be subjected to financing plan prediction from data systems and websites related to the enterprise;
in this step, the obtained portrait data specifically includes the enterprise ranking, enterprise influence index, enterprise scale, annual income, enterprise type, enterprise category, historical annual financing strength, financing category, required publicity strength, publicity market, holding place, and the like.
Step S220, performing data format processing and feature derivation on the portrait data;
in this step, the acquired data is first tabulated, that is, header labels are generated in the table for the different data, and the data is classified and stored in the data table. Not all of the acquired data is useful, however, and some unnecessary or redundant information may exist.
Furthermore, the feature data in the various portrait data is bucketed; for example, continuous data such as annual income is counted by grade and mapped to specific values. Year-on-year and period-on-period (ring ratio) features are derived for the enterprise's net income, liabilities and the like.
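Year-on-year and period-on-period features are both relative changes against an earlier observation and differ only in the lag. The sketch below assumes quarterly data, so the year-on-year lag is 4 and the period-on-period lag is 1; the series and the function name `growth_rates` are invented.

```python
from typing import List, Optional


def growth_rates(series: List[float], lag: int) -> List[Optional[float]]:
    """Relative change against the value `lag` periods earlier (None if undefined)."""
    rates: List[Optional[float]] = []
    for i, value in enumerate(series):
        if i < lag or series[i - lag] == 0:
            rates.append(None)  # no earlier period, or division by zero
        else:
            rates.append((value - series[i - lag]) / series[i - lag])
    return rates


# Invented quarterly net income.
quarterly_net_income = [100.0, 110.0, 121.0, 133.1, 150.0]
ring_ratio = growth_rates(quarterly_net_income, lag=1)    # period-on-period
year_on_year = growth_rates(quarterly_net_income, lag=4)  # same quarter last year
```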
In this step, where the data table has no type tag but can be arranged with a label column structure, enterprise portraits such as the enterprise's financing, publicity strength, publicity market and holding place are combed into a quantified label by the scoring model.
In this embodiment, the features may also be graded according to the nature of the data itself. Continuous data such as annual income can be graded, for example with a factor variable of 10 for more than 1000W, 8 for more than 500W, and so on. Text-type data is likewise graded or discretized after text matching.
Step S230, training the financing prediction model on the processed portrait data based on the LightGBM model.
In practical application, the portrait data is first split into data subsets and the split point of each subset is determined; training is then performed with the LightGBM model on the subsets based on these split points, for example using LightGBM's splitting strategy for categorical features. Take a set of four colour categories, red, yellow, blue and green, as an example: the possible strategies are all the ways of dividing the four categories of samples into two groups, for example red and yellow in one group and blue and green in the other. Enumerating all such strategies fully mines the information contained in this feature dimension and finds the optimal splitting strategy, but the time complexity of doing so is high. For regression trees there is an efficient solution. The basic idea is to reorder the categories according to their relevance to the training target. More specifically, the histogram of the categorical feature is reordered according to its accumulated values, and the best split point is then found in the ordered histogram. The data set is input into the training framework of the LightGBM model based on the split point to obtain the following model formula:
A Taylor expansion is then calculated on the financing plan prediction model to obtain an approximate target value of the enterprise financing plan, using the following calculation formula:
then the minimum value of the model function is obtained, and the feasibility of the enterprise's financing plan is judged based on this minimum value, using the following formula:
wherein R_1(j,s) = {x | x^(j) ≤ s} and R_2(j,s) = {x | x^(j) > s} are the value ranges of the two dimension intervals in the portrait data.
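The category-reordering idea described for step S230 can be sketched as follows. For k categories there are 2^(k−1) − 1 ways to divide them into two non-empty groups (7 for the four colours), whereas sorting the categories first leaves only k − 1 candidate split points at a sorting cost of O(k log k). The LightGBM documentation describes sorting by the ratio of summed gradient to summed Hessian per category; the gain formula below is a simplified, unregularized version chosen for illustration, and the statistics are invented.

```python
from typing import Dict, List, Tuple


def best_categorical_split(
    stats: Dict[str, Tuple[float, float]]
) -> Tuple[List[str], float]:
    """stats maps category -> (sum_gradient, sum_hessian), with hessians > 0.

    Returns the categories of the left group and the (simplified) split gain.
    """
    # Reorder categories by their accumulated gradient/hessian ratio.
    ordered = sorted(stats, key=lambda c: stats[c][0] / stats[c][1])
    total_g = sum(g for g, _ in stats.values())
    total_h = sum(h for _, h in stats.values())
    best_left: List[str] = []
    best_gain = float("-inf")
    left_g = left_h = 0.0
    # Scan the k - 1 split points of the ordered histogram.
    for k in range(len(ordered) - 1):
        g, h = stats[ordered[k]]
        left_g += g
        left_h += h
        right_g, right_h = total_g - left_g, total_h - left_h
        gain = (left_g ** 2 / left_h + right_g ** 2 / right_h
                - total_g ** 2 / total_h)
        if gain > best_gain:
            best_left, best_gain = ordered[: k + 1], gain
    return best_left, best_gain


# Invented per-category statistics for the four colours.
stats = {
    "red": (-4.0, 2.0),
    "yellow": (-3.0, 2.0),
    "blue": (3.0, 2.0),
    "green": (5.0, 2.0),
}
left, gain = best_categorical_split(stats)
```

With these numbers the best split puts red and yellow in one group and blue and green in the other, which is the partition used as the example in the text.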
Step S240, obtaining current data of the enterprise, inputting it into the financing prediction model, and outputting the predicted financing plan result.
Furthermore, when the model is trained on the split data, weight ratios are also introduced to score and screen the resulting data subsets. Of course, the weight ratios may also be set in the data preprocessing step so that the data is screened in advance. Specifically, this may be done with the scoring model, based on known specific indexes such as enterprise financing, publicity strength and publicity market. Enterprise financing and publicity strength are considered the most important scoring indexes and are each assumed to have a weight of 0.3, so the two together carry a weight of 0.6; the other indexes share the remaining weight evenly, summing to 0.4. Next, the specific values of the indexes are normalized and the weighted sum is calculated. Finally, each enterprise obtains one specific quantified label value. Training of the model is carried out after the label column has been combed, which further improves the accuracy of the model's prediction results.
In order to solve the above-mentioned problems, the present invention further provides a data analysis device, which may be used to implement the data analysis method provided by the embodiments of the present invention. The data analysis device is physically implemented as a server, a specific hardware implementation of which is shown in fig. 3.
Referring to fig. 3, the server includes: a processor 301, such as a CPU, a communication bus 302, a user interface 303, a network interface 304 and a memory 305. The communication bus 302 is used to enable connection and communication between these components. The user interface 303 may comprise a display screen (Display) and an input unit such as a keyboard (Keyboard), and the network interface 304 may optionally comprise a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 305 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as a disk memory. The memory 305 may alternatively be a storage device separate from the aforementioned processor 301.
Those skilled in the art will appreciate that the hardware configuration of the apparatus shown in fig. 3 does not constitute a limitation on the data analysis device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in fig. 3, the memory 305, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module and a data analysis program. The operating system is a program that manages and controls the hardware and software resources of the device and supports the running of the data analysis program and other software and/or programs.
In the hardware architecture of the server shown in fig. 3, the network interface 304 is mainly used for accessing the network; the user interface 303 is mainly used to communicate with the external Internet or with a server providing enterprise data, so as to retrieve various credit and asset information of the enterprise; and the processor 301 may be used to invoke the data analysis program stored in the memory 305 and to perform the operations of the various embodiments of the data analysis method.
In this embodiment of the present invention, the device of fig. 3 may also be a touch-operated mobile terminal such as a mobile phone, whose processor performs deduction and prediction of the enterprise's financing plan by reading, from a buffer or storage unit, program code that implements the data analysis method.
In order to solve the above-mentioned problems, an embodiment of the present invention further provides a data analysis device, and referring to fig. 4, fig. 4 is a schematic diagram of functional modules of the data analysis device according to the embodiment of the present invention. In this embodiment, the apparatus includes:
The data acquisition module 41 is configured to receive a data analysis request sent by a terminal, and acquire a corresponding object data set based on an object to be analyzed in the data analysis request, where the object data set includes at least object internal portrait data and external portrait data;
the data grading module 42 is used for grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, and the data subsets are in one-to-one correspondence with the object to be analyzed;
a calculating module 43, configured to calculate, according to the data subset, a maximum financing range corresponding to the object to be analyzed and a maximum bearing capacity level of an asset thereof;
a prediction module 44, configured to select a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the asset; and training the data subset according to the model training algorithm to obtain a financing prediction model, predicting ecological balance of the object to be predicted based on the financing prediction model, and outputting a financing prediction result.
In this embodiment, the data analysis device further includes a format conversion module, configured to preprocess the portrait data in the object data set, where the preprocessing performs format conversion on the portrait data according to the data format required by the data analysis system, so as to obtain a normalized object data set.
In this embodiment, the apparatus further includes a determining module configured to calculate a minimum value of the financing prediction model, and determine a feasibility of the generated recommended financing plan based on the minimum value.
The device embodiment is based on the same embodiments as the data analysis method of the present invention, so its content is not described in detail here.
According to the method, the device and the system, planning analysis of the financing data is carried out on the internal portrait data and external portrait data of the enterprise acquired according to the data analysis request, forming a preliminary financing analysis result. A financing plan corresponding to the enterprise is then generated by deducing the ecological balance of the enterprise from the analysis result, forming a financing mechanism that balances financing and profit during the financing process, and funds are planned based on this mechanism. This avoids the mismatch between financing and income caused by blind financing planning. Because the financing plan is formed from data both inside and outside the enterprise, the accuracy of the system's planning analysis is greatly improved, the benefit of the enterprise's financing is maximized, and the enterprise's financing is also improved.
The invention also provides a computer readable storage medium.
In this embodiment, the computer readable storage medium stores a data analysis program, and the data analysis program, when executed by the processor, implements the steps of the data analysis method according to any one of the foregoing embodiments. The method implemented when the data analysis program is executed by the processor may refer to the various embodiments of the data analysis method of the present invention, and thus will not be described in detail.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM), comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server or a network device, etc.) to perform the method according to the embodiments of the present invention.
While the embodiments of the present invention have been described above with reference to the drawings, the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Those of ordinary skill in the art may make many modifications without departing from the spirit of the present invention and the scope of the appended claims. Any equivalent structure or equivalent flow change made using the description and drawings of the present invention, or any direct or indirect application to other relevant technical fields, likewise falls within the scope of protection of the present invention.

Claims (8)

1. A data analysis method, the data analysis method comprising:
receiving a data analysis request sent by a terminal, and acquiring a corresponding object data set based on an object to be analyzed in the data analysis request, wherein the object data set at least comprises object internal portrait data and external portrait data;
grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, wherein the data subset is in one-to-one correspondence with the object to be analyzed;
calculating the maximum financing range corresponding to the object to be analyzed and the maximum bearing capacity grade of the asset according to the data subset;
selecting a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the asset;
training the data subset for financing prediction according to the model training algorithm to obtain a financing prediction model, carrying out ecological balance prediction on an object to be predicted based on the financing prediction model, and outputting a financing prediction result;
wherein training the data subset for financing prediction according to the model training algorithm to obtain the financing prediction model, carrying out ecological balance prediction on the object to be predicted based on the financing prediction model, and outputting the financing prediction result comprises:
when the training algorithm of a LightGBM model is adopted for training, matching a LightGBM model training framework corresponding to the grade of the data subset according to the grading result of the data subset, and inputting the data subset into the LightGBM model training framework for training to obtain the financing prediction model, wherein the financing prediction model is as follows:
wherein Obj is the output result of the financing prediction model; $n > 1$; $y_i$ is the tag value of the tag column after normalization of the portrait data; $\hat{y}_i$ is the estimated value of $y_i$; $f_t(x_t) = f(x)$ is an approximate calculation function of the feature value in the portrait data; $f_t$ is the approximate target value of the feature value $x_t$; $i$ is the term number of the tag column; $x_t = x$ is the feature value corresponding to $y_i$; $t$ is the term number of the feature value; and constant, $L$ and $\Omega$ represent constant terms;
acquiring internal portrait data and external portrait data of the object to be predicted, inputting the internal portrait data and the external portrait data into the financing prediction model, and outputting a financing grade corresponding to the object to be predicted;
and querying, according to the financing grade, the corresponding financing evaluation report from a preset correspondence table of financing grades and financing evaluation reports.
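The last step of claim 1, querying a report by predicted grade, can be illustrated with a minimal sketch. The grade labels, the report texts, and the names `GRADE_TO_REPORT` and `lookup_financing_report` are hypothetical and do not come from the patent; the model's output is represented here simply as a grade string.

```python
from typing import Dict

# Hypothetical correspondence table between preset financing grades and
# financing evaluation reports. Grades and texts are illustrative only.
GRADE_TO_REPORT: Dict[str, str] = {
    "A": "Grade A evaluation report (placeholder text)",
    "B": "Grade B evaluation report (placeholder text)",
    "C": "Grade C evaluation report (placeholder text)",
}


def lookup_financing_report(grade: str, table: Dict[str, str] = GRADE_TO_REPORT) -> str:
    """Return the evaluation report stored for a predicted financing grade."""
    try:
        return table[grade]
    except KeyError:
        # An unknown grade is reported explicitly instead of returning a default.
        raise ValueError(f"no financing evaluation report for grade {grade!r}") from None
```

Raising on an unknown grade, rather than falling back to a default report, is a design choice of this sketch and is not specified by the claim.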
2. The data analysis method according to claim 1, further comprising, after the step of acquiring the corresponding object data set based on the object to be analyzed in the data analysis request:
acquiring a data format of a data set used for training the financing prediction model, wherein the data format comprises a tag column, a sequencing order of the tag column and a storage position of data;
according to the data format, adjusting tag columns in the internal portrait data and the external portrait data according to the sorting order, and detecting whether missing or redundant tag columns exist in the tag columns;
if missing tag columns exist in the external portrait data and the internal portrait data, adding the missing tag columns at the corresponding positions in the internal portrait data and the external portrait data, and filling them with blank data to form a standardized object data set;
if redundant tag columns exist in the external portrait data and the internal portrait data, deleting the redundant tag columns and their corresponding data from the internal portrait data and the external portrait data, or masking them as invalid in the data set, so as to form a standardized object data set.
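The column alignment described in claim 2 can be sketched as follows. This is only an illustration under stated assumptions: records are modelled as plain dictionaries, blank data is represented by `None`, redundant columns are deleted rather than masked, and the function name `standardize_records` is hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def standardize_records(
    records: Sequence[Dict[str, Any]],
    required_columns: Sequence[str],
    fill_value: Optional[Any] = None,
) -> List[Dict[str, Any]]:
    """Align each record to the tag columns and order the model expects."""
    standardized = []
    for record in records:
        # Building the row from required_columns fixes the column order,
        # fills missing tag columns with blank data, and drops redundant
        # tag columns in a single pass.
        row = {column: record.get(column, fill_value) for column in required_columns}
        standardized.append(row)
    return standardized
```

For example, with `required_columns = ["revenue", "debt"]`, a record that lacks `debt` gains a blank `debt` entry, and a record carrying an extra column loses it.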
3. The data analysis method according to claim 2, further comprising, after the step of grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset:
scoring the data subset through a weight ratio coefficient in a preset scoring model to obtain a scoring result;
and sorting the data subsets in descending order of the scoring result, and selecting the N top-scoring data subsets as the effective data set for training the financing prediction model, wherein N is greater than or equal to 1.
4. The data analysis method according to any one of claims 1 to 3, further comprising, after the step of grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset:
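The scoring and top-N selection of claim 3 can be sketched as a weighted sum followed by a descending sort. The indicator names, the weights, and the function name `select_top_subsets` are assumptions made for illustration; the patent does not specify the form of the preset scoring model.

```python
from typing import Dict, List, Tuple


def select_top_subsets(
    subsets: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    n: int,
) -> List[Tuple[str, float]]:
    """Score each data subset by a weighted sum and keep the N best."""
    if n < 1:
        raise ValueError("n must be at least 1")
    scored = [
        # Indicators without a configured weight contribute nothing.
        (name, sum(weights.get(key, 0.0) * value for key, value in indicators.items()))
        for name, indicators in subsets.items()
    ]
    # Sort from the largest score to the smallest, then take the first N.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]
```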
performing feature analysis on the data subset to obtain the data features of the same data in the data subset;
and carrying out feature derivatization on the data features to obtain data similar to the data in the data subset, wherein the feature derivatization is used for further subdividing the data features or expanding similar features.
5. The data analysis method according to claim 4, further comprising, after the querying of the corresponding financing evaluation report from a preset correspondence table of financing levels and financing evaluation reports according to the financing levels:
calculating the minimum value of the financing prediction model, judging the feasibility of the generated financing evaluation report based on the minimum value, wherein the formula for finding the minimum value is as follows:
wherein the quantities in the formula respectively take the values of the dimension intervals in the portrait data; $i$ is the term number of the tag column; $j$ is the segmentation feature of the portrait data; $s$ is the segmentation point; and $c_i$ is the corresponding absolute value.
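The minimum-value formula of claim 5 is not reproduced in the text. Its symbols (a segmentation feature j, a segmentation point s, and a per-interval value c) resemble the least-squares split search used in regression trees, so the sketch below assumes that reading. It is an illustration under that assumption, not the claimed formula. The function names are hypothetical, and each interval's value is taken to be the mean of its targets.

```python
from typing import Optional, Sequence, Tuple


def _sse(values: Sequence[float]) -> float:
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    rows: Sequence[Sequence[float]], targets: Sequence[float]
) -> Tuple[int, float, float]:
    """Return (j, s, loss) minimising squared error over the two intervals."""
    if not rows:
        raise ValueError("rows must not be empty")
    best: Optional[Tuple[int, float, float]] = None
    for j in range(len(rows[0])):
        # Every observed value of feature j is a candidate segmentation point.
        for s in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, targets) if row[j] <= s]
            right = [y for row, y in zip(rows, targets) if row[j] > s]
            if not left or not right:
                continue  # a split must leave data on both sides
            loss = _sse(left) + _sse(right)
            if best is None or loss < best[2]:
                best = (j, s, loss)
    if best is None:
        raise ValueError("no valid split: every feature is constant")
    return best
```

A small minimum loss indicates that the data separate cleanly at the chosen point. How that minimum is turned into a feasibility judgment for the report is not specified in the claim and is left out of the sketch.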
6. A data analysis device, characterized in that the data analysis device comprises:
the data acquisition module is used for receiving a data analysis request sent by the terminal, and acquiring a corresponding object data set based on an object to be analyzed in the data analysis request, wherein the object data set at least comprises object internal portrait data and external portrait data;
the data grading module is used for grading the internal portrait data and the external portrait data of the object according to a preset financing grade to obtain at least one data subset, wherein the data subset is in one-to-one correspondence with the object to be analyzed;
the calculating module is used for calculating the maximum financing range corresponding to the object to be analyzed and the maximum bearing capacity grade of the asset according to the data subset;
the prediction module is used for selecting a corresponding model training algorithm according to the maximum financing range and the maximum bearing capacity level of the assets; training the data subset for financing prediction according to the model training algorithm to obtain a financing prediction model, carrying out ecological balance prediction on an object to be predicted based on the financing prediction model, and outputting a financing prediction result;
the prediction module comprises a model training unit and a report generating unit;
the model training unit is configured to, when the training algorithm of a LightGBM model is adopted for training, match a LightGBM model training framework corresponding to the grade of the data subset according to the grading result of the data subset, and input the data subset into the LightGBM model training framework for training, so as to obtain the financing prediction model, where the financing prediction model is:
wherein Obj is the output result of the financing prediction model; $n > 1$; $y_i$ is the tag value of the tag column after normalization of the portrait data; $\hat{y}_i$ is the estimated value of $y_i$; $f_t(x_t) = f(x)$ is an approximate calculation function of the feature value in the portrait data; $f_t$ is the approximate target value of the feature value $x_t$; $i$ is the term number of the tag column; $x_t = x$ is the feature value corresponding to $y_i$; $t$ is the term number of the feature value; and constant, $L$ and $\Omega$ represent constant terms;
the report generating unit is used for acquiring the internal portrait data and the external portrait data of the object to be predicted, inputting the internal portrait data and the external portrait data into the financing prediction model, and outputting the financing grade corresponding to the object to be predicted; and querying, according to the financing grade, the corresponding financing evaluation report from a preset correspondence table of financing grades and financing evaluation reports.
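The formula for the financing prediction model is not reproduced in the text of claims 1 and 6. The listed symbols (Obj, $y_i$, $\hat{y}_i$, $f_t$, $L$, $\Omega$, constant) match the usual additive-training objective of gradient boosted trees, so one plausible reading, offered as an assumption rather than as the claimed formula, is:

```latex
\mathrm{Obj}^{(t)} \;=\; \sum_{i=1}^{n} L\!\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) \;+\; \Omega(f_t) \;+\; \mathrm{constant}
```

In that standard form, $L$ is a loss function, $\Omega$ is a regularization term on the newly added tree $f_t$, and $x_i$ is the feature value paired with $y_i$ (written $x_t$ in the claim text). The claim itself groups constant, $L$ and $\Omega$ together as constant terms.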
7. A data analysis device, characterized in that the data analysis device comprises: a memory, a processor, and a data analysis program stored on the memory and executable on the processor, wherein the data analysis program, when executed by the processor, implements the steps of the data analysis method according to any one of claims 1-5.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a data analysis program which, when executed by a processor, implements the steps of the data analysis method according to any of claims 1-5.
CN201910479378.7A 2019-06-04 2019-06-04 Data analysis method, device, equipment and computer readable storage medium Active CN110310012B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910479378.7A CN110310012B (en) 2019-06-04 2019-06-04 Data analysis method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910479378.7A CN110310012B (en) 2019-06-04 2019-06-04 Data analysis method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110310012A CN110310012A (en) 2019-10-08
CN110310012B true CN110310012B (en) 2023-07-28

Family

ID=68075289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910479378.7A Active CN110310012B (en) 2019-06-04 2019-06-04 Data analysis method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110310012B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110956303A (en) * 2019-10-12 2020-04-03 未鲲(上海)科技服务有限公司 Information prediction method, device, terminal and readable storage medium
CN112598341B (en) * 2021-03-08 2021-08-27 工福(北京)科技发展有限公司 Data processing system and method for idle article poverty alleviation platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101149821A (en) * 2007-05-15 2008-03-26 佟辛 Complete information based dynamically interactive type enterprise finance model construction and operation method
CN105719069A (en) * 2016-01-15 2016-06-29 中国南方电网有限责任公司电网技术研究中心 Method and system for measuring enterprise cash flows
KR101750825B1 (en) * 2016-04-01 2017-06-26 주식회사 조이펀 A crowd funding platform system for securing investors by continuously providing objective investing information and the funding method by using the same
CN107767259A (en) * 2017-09-30 2018-03-06 平安科技(深圳)有限公司 Loan risk control method, electronic installation and readable storage medium storing program for executing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140136294A1 (en) * 2012-11-13 2014-05-15 Creat Llc Comprehensive quantitative and qualitative model for a real estate development project

Also Published As

Publication number Publication date
CN110310012A (en) 2019-10-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant