CN111523085A - Stock tendency analysis method based on autocorrelation linear neighbor analysis - Google Patents
- Publication number
- CN111523085A (application CN202010277361.6A)
- Authority
- CN
- China
- Prior art keywords
- stock
- factors
- data
- analysis
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/06—Asset management; Financial planning or analysis
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- Development Economics (AREA)
- Databases & Information Systems (AREA)
- General Business, Economics & Management (AREA)
- Computational Mathematics (AREA)
- Technology Law (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- General Engineering & Computer Science (AREA)
- Economics (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Operations Research (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Entrepreneurship & Innovation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Human Resources & Organizations (AREA)
- Algebra (AREA)
- Game Theory and Decision Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention discloses a stock trend analysis method based on autocorrelation linear neighbor analysis, comprising the following steps: step 1, analyzing influence factors; step 2, acquiring stock and company data; step 3, preprocessing the stock data; step 4, establishing a model based on autocorrelation linear nearest neighbor analysis; step 5, testing the model and analyzing the conclusions; and step 6, evaluating the model and generating an analysis report. Using literature research, quantitative analysis, qualitative analysis and, at its core, multiple linear regression, the method establishes a model based on autocorrelation linear nearest neighbor analysis that fits the stock price curve accurately even though many factors influence the stock price trend.
Description
Technical Field
The invention relates to the field of machine learning and data mining, in particular to a stock trend analysis method based on autocorrelation linear neighbor analysis.
Background
The Chinese stock market was formed in the 1990s and has developed for more than twenty years. It has reached a certain scale but is still at an early stage of development. The high-return, high-risk character of stocks is especially pronounced in China because, compared with European and American markets, the Chinese stock market is more weakly correlated with the macro economy and fluctuates more. In addition, market participants are not yet mature and the herd effect is obvious: people are often swayed by the majority into following the thinking or behavior of the crowd.
Quantitative analysis of stock investment is therefore a sound choice. The invention applies data mining algorithms to stocks in practice and provides decision support, giving investors a scientific algorithm and method for decision making that can be applied in practice. However, so many factors influence the stock price trend that it is difficult to fit the trend very accurately, so some economic losses are inevitable.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a stock trend analysis method based on autocorrelation linear neighbor analysis.
To achieve this aim, the invention adopts the following technical solution: a stock trend analysis method based on autocorrelation linear neighbor analysis, comprising the following steps: step 1, analyzing influence factors, specifically: determining the factors that affect the stock price trend, the factors comprising company factors and market factors;
step 2, acquiring stock and company data, specifically comprising: step (2-1), crawling all Shenzhen Stock Exchange stock codes through an interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data comprising the daily stock transaction data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 potential factors, specifically: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover amount $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically comprising: step (3-1), deleting the redundant fields of client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the number of occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the number of rows; step (3-4), deleting in the same way the fields whose null and missing values meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically comprising: step (4-1), selecting the form of the relationship between the variables and the model;
step (4-2), determining the general form of the regression equation:
$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$
where $Y$ is the stock return, $x_1$ is the growth rate of the stock's value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock price index, $x_4$ is the trading volume, $\eta$ is a random variable, and $\alpha$, $\beta$, $\gamma$ are the beta coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the four variables with the maximum and minimum weights and establishing a system of linear optimization equations to solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), the test is passed and the model conforms to a normal distribution, giving the final regression equation:
$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$
step (4-7), drawing a stock price trend graph;
step 5, model testing and conclusion analysis;
step 6, evaluating the model and generating an analysis report.
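Steps (3-3) and (3-5) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names `drop_redundant_fields` and `lagrange_fill`, the dict-of-columns table layout, the `window` parameter, and the use of `None` to mark a value to be filled are all hypothetical. `collections.Counter` stands in for the pandas `value_counts` call named in step (3-3), and the 80% ratio is taken from that step.

```python
from collections import Counter

def drop_redundant_fields(table, ratio=0.8):
    """Drop every column whose most frequent value fills >= ratio of the rows.

    `table` maps a column name to a list of values (hypothetical layout).
    """
    kept = {}
    for name, values in table.items():
        # Counter plays the role of pandas' value_counts here.
        top_count = Counter(values).most_common(1)[0][1]
        if top_count < ratio * len(values):
            kept[name] = values
    return kept

def lagrange_fill(values, window=2):
    """Fill None entries by Lagrange interpolation over nearby known points."""
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Use up to `window` known neighbours on each side of the gap.
        left = [p for p in known if p[0] < i][-window:]
        right = [p for p in known if p[0] > i][:window]
        pts = left + right
        total = 0.0
        for j, (xj, yj) in enumerate(pts):
            # Lagrange basis polynomial for point j, evaluated at index i.
            weight = 1.0
            for k, (xk, _) in enumerate(pts):
                if k != j:
                    weight *= (i - xk) / (xj - xk)
            total += yj * weight
        filled[i] = total
    return filled

# Toy data (invented for illustration only).
table = {
    "agency": ["A", "A", "A", "A", "B"],    # "A" fills 80% of the rows
    "close": [1.0, 4.0, None, 16.0, 25.0],  # one value to be filled
}
cleaned = drop_redundant_fields(table)
cleaned["close"] = lagrange_fill(cleaned["close"])
```

In this toy table the `agency` column is dropped because a single value reaches the 80% threshold, and the gap in `close` is filled from its four nearest known neighbours.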
Preferably, in step 1, the factors further include economic factors, political factors, and industry factors; the company factors include the company's operating condition, financial reports, and personnel changes in key positions; and the market factors include market demand and the public evaluation and social influence of the company.
Preferably, in step 5, the model test is specifically: inputting the test-set data for testing, the test result passing the model check and conforming to a normal distribution; and the conclusion analysis is specifically: determining the highly correlated factors according to the weights of the corresponding variables in the regression equation.
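The conclusion analysis can be illustrated by ranking the coefficients of the final regression equation. The sketch below assumes that "high correlation degree" means a large absolute weight, which the text does not state explicitly; raw coefficients also depend on the scale of each factor, so this ordering is only indicative. The helper name `rank_factors` is hypothetical.

```python
# Coefficients of the final regression equation, in the order x1..x21.
weights = [-27.76, -2.15, -1.94, -1.47, -0.31, -0.23, -0.14, -0.02, -0.02,
           0.01, 0.02, 0.10, 0.09, 0.11, 0.15, 0.73, 1.13, 2.12, 8.97,
           17.17, 17.22]

def rank_factors(weights, top=3):
    """Return the `top` (name, weight) pairs ordered by absolute weight."""
    named = [(f"x{i}", w) for i, w in enumerate(weights, start=1)]
    named.sort(key=lambda item: abs(item[1]), reverse=True)
    return named[:top]

strongest = rank_factors(weights)
```

Under this reading the three strongest factors are x1 (opening price), x21 (closing price), and x20 (previous close).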
Compared with the prior art, the stock trend analysis method based on autocorrelation linear neighbor analysis adopting the above technical solution has the following beneficial effect: a model based on autocorrelation linear neighbor analysis is established and the final test result passes the model check, so that the stock price curve is fitted accurately even though many factors influence the stock price trend.
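The text does not say how curve-fitting accuracy is measured. One common choice, assumed here purely for illustration, is the coefficient of determination $R^2$; the function name `r_squared` and the sample numbers are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [10.0, 12.0, 11.0, 13.0]          # invented closing prices
perfect = r_squared(actual, actual)        # predictions equal to the data
baseline = r_squared(actual, [11.5] * 4)   # always predicting the mean
```

A value of 1 means the fitted curve reproduces the data exactly, while a model that always predicts the mean scores 0.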
Drawings
FIG. 1 is a schematic flow chart of an embodiment of a stock trend analysis method based on autocorrelation linear nearest neighbor analysis according to the present invention;
FIG. 2 is a schematic flow chart of model testing in the stock trend analysis method based on autocorrelation linear nearest neighbor analysis according to this embodiment;
FIG. 3 is a diagram of the curve fitted to the stock trend in this embodiment.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a schematic flow chart of the stock trend analysis method based on autocorrelation linear neighbor analysis, which comprises the following steps: step 1, analyzing influence factors, specifically: determining the factors that affect the stock price trend, the factors comprising company factors, market factors, economic factors, political factors, and industry factors, wherein the company factors comprise the company's operating condition, financial reports, and personnel changes in key positions, and the market factors comprise market demand and the public evaluation and social influence of the company;
step 2, acquiring stock and company data, specifically comprising: step (2-1), crawling all Shenzhen Stock Exchange stock codes through an interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data comprising the daily stock transaction data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 potential factors, specifically: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover amount $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically comprising: step (3-1), deleting the redundant fields of client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the number of occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the number of rows; step (3-4), deleting in the same way the fields whose null and missing values meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically comprising: step (4-1), selecting the form of the relationship between the variables and the model;
step (4-2), determining the general form of the regression equation:
$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$
where $Y$ is the stock return, $x_1$ is the growth rate of the stock's value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock price index, $x_4$ is the trading volume, $\eta$ is a random variable, and $\alpha$, $\beta$, $\gamma$ are the beta coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the four variables with the maximum and minimum weights and establishing a system of linear optimization equations to solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), the test is passed and the model conforms to a normal distribution, giving the final regression equation:
$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$
step (4-7), drawing a stock price trend graph;
step 5, model testing and conclusion analysis;
step 6, evaluating the model and generating an analysis report.
This completes the description of the stock trend analysis method based on autocorrelation linear neighbor analysis. Because the invention involves many variables, Table 1 explains the variables used in each step:
TABLE 1
The foregoing is a preferred embodiment of the present invention, and it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.
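Step (4-5) calls for a serial autocorrelation test but does not name one. As a hedged illustration only, the sketch below uses the Durbin-Watson statistic, a standard first-order autocorrelation check that is normally applied to regression residuals; applying it here is an assumption, and the function name and sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Invented residual series for illustration.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # sign flips every step
trending = durbin_watson([1.0, 1.0, -1.0, -1.0])     # runs of equal sign
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.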
Claims (3)
1. A stock trend analysis method based on autocorrelation linear neighbor analysis, characterized by comprising the following steps: step 1, analyzing influence factors, specifically: determining the factors that affect the stock price trend, the factors comprising company factors and market factors;
step 2, acquiring stock and company data, specifically comprising: step (2-1), crawling all Shenzhen Stock Exchange stock codes through an interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data comprising the daily stock transaction data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 potential factors, specifically: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover amount $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically comprising: step (3-1), deleting the redundant fields of client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the number of occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the number of rows; step (3-4), deleting in the same way the fields whose null and missing values meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically comprising: step (4-1), selecting the form of the relationship between the variables and the model;
step (4-2), determining the general form of the regression equation:
$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$
where $Y$ is the stock return, $x_1$ is the growth rate of the stock's value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock price index, $x_4$ is the trading volume, $\eta$ is a random variable, and $\alpha$, $\beta$, $\gamma$ are the beta coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the four variables with the maximum and minimum weights and establishing a system of linear optimization equations to solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), the test is passed and the model conforms to a normal distribution, giving the final regression equation:
$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$
step (4-7), drawing a stock price trend graph;
step 5, model testing and conclusion analysis;
step 6, evaluating the model and generating an analysis report.
2. The stock trend analysis method based on autocorrelation linear neighbor analysis according to claim 1, characterized in that: in step 1, the factors further include economic factors, political factors, and industry factors; the company factors include the company's operating condition, financial statements, and personnel changes in key positions; and the market factors include market demand and the public evaluation and social influence of the company.
3. The stock trend analysis method based on autocorrelation linear neighbor analysis according to claim 1, characterized in that: in step 5, the model test is specifically: inputting the test-set data for testing, the test result passing the model check and conforming to a normal distribution; and the conclusion analysis is specifically: determining the highly correlated factors according to the weights of the corresponding variables in the regression equation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010277361.6A CN111523085A (en) | 2020-04-10 | 2020-04-10 | Stock tendency analysis method based on autocorrelation linear neighbor analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111523085A (en) | 2020-08-11 |
Family
ID=71902041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010277361.6A Withdrawn CN111523085A (en) | 2020-04-10 | 2020-04-10 | Stock tendency analysis method based on autocorrelation linear neighbor analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111523085A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112380489A (en) * | 2020-11-03 | 2021-02-19 | 武汉光庭信息技术股份有限公司 | Data processing time calculation method, data processing platform evaluation method and system |
CN112380489B (en) * | 2020-11-03 | 2024-04-16 | 武汉光庭信息技术股份有限公司 | Data processing time calculation method, data processing platform evaluation method and system |
CN112927081A (en) * | 2021-03-16 | 2021-06-08 | 北京同邦卓益科技有限公司 | Data processing method, device, system and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20200811 |