CN111523085A - Stock tendency analysis method based on autocorrelation linear neighbor analysis - Google Patents

Stock tendency analysis method based on autocorrelation linear neighbor analysis

Info

Publication number
CN111523085A
Authority
CN
China
Prior art keywords
stock
factors
data
analysis
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010277361.6A
Other languages
Chinese (zh)
Inventor
石建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN202010277361.6A
Publication of CN111523085A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Mathematics (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Operations Research (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Resources & Organizations (AREA)
  • Algebra (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a stock trend analysis method based on autocorrelation linear neighbor analysis, comprising the following steps: step 1, analyzing influencing factors; step 2, acquiring stock and company data; step 3, preprocessing the stock data; step 4, establishing a model based on autocorrelation linear nearest neighbor analysis; step 5, model testing and conclusion analysis; and step 6, evaluating the model and generating an analysis report. Using literature research, quantitative analysis, qualitative analysis and, at its core, multiple linear regression, the method establishes a model based on autocorrelation linear nearest neighbor analysis and fits the price curve accurately under the influence of multiple stock price factors.

Description

Stock tendency analysis method based on autocorrelation linear neighbor analysis
Technical Field
The invention relates to the field of machine learning and data mining, in particular to a stock trend analysis method based on autocorrelation linear neighbor analysis.
Background
China's stock market was established in the 1990s and has developed for more than twenty years; it has reached a certain scale but is still at an early stage of development. Compared with European and American markets, China's stock market is more weakly correlated with the macroeconomy and more volatile, so the high-return, high-risk character of stocks is particularly pronounced in China. In addition, market participants are not yet mature and the herd effect is evident: investors are often swayed by the majority into following the thinking or behavior of the crowd.
Quantitative analysis of stock investment is therefore a good choice. The invention applies data mining algorithms to stocks in practice, providing decision support and a scientific, practically applicable algorithm and method for investors. However, so many factors influence the stock price trend that it is difficult to fit the trend very accurately, and some economic loss is therefore unavoidable.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and to provide a stock trend analysis method based on autocorrelation linear neighbor analysis.
To achieve this purpose, the invention adopts the following technical scheme: a stock trend analysis method based on autocorrelation linear neighbor analysis, comprising the following steps: step 1, analyzing influencing factors, specifically: determining the factors that affect the stock price trend, the factors including company factors and market factors;
step 2, acquiring stock and company data, specifically: step (2-1), crawling all Shenzhen Stock Exchange stock codes through a data interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data including the daily trading data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 candidate factors, as follows: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically: step (3-1), deleting the redundant fields client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the row count; step (3-4), deleting, in the same way, null and missing values that meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically: step (4-1), selecting the form of the relation between the variables and the model;
step (4-2), determining the general form of the regression equation:
$$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$$
wherein Y is the stock return, $x_1$ is the growth rate of the stock value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock commodity price index, $x_4$ is the trading volume, η is a random variable, and α, β, γ are the regression (beta) coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the largest and smallest four variables to establish a system of linear optimization equations and solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), once the test is passed and the model conforms to a normal distribution, obtaining the final regression equation:
$$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$$
step (4-7), drawing a stock price trend graph;
step 5, model testing and conclusion analysis;
step 6, evaluating the model and generating an analysis report.
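The redundant-field rule of steps (3-2) and (3-3) can be illustrated with a minimal sketch. The patent names pandas-style `value_counts`; the version below swaps in `collections.Counter` from the standard library so it runs without dependencies. The function name `drop_redundant_fields`, the sample rows and the field names are hypothetical, and α = 0.8 is taken from the 80% figure in step (3-3).

```python
from collections import Counter

def drop_redundant_fields(rows, alpha=0.8):
    """Drop every field whose most frequent value fills at least alpha of the rows."""
    if not rows:
        return []
    n_rows = len(rows)
    redundant = set()
    for field in rows[0]:
        # Counter plays the role that value_counts plays in the patent text.
        counts = Counter(row[field] for row in rows)
        top_count = counts.most_common(1)[0][1]
        if top_count >= alpha * n_rows:
            redundant.add(field)
    return [{k: v for k, v in row.items() if k not in redundant} for row in rows]

# Hypothetical sample: client_code holds "A" in 4 of 5 rows (80%), so it is dropped.
rows = [
    {"client_code": "A", "open": 10.1, "close": 10.4},
    {"client_code": "A", "open": 10.4, "close": 10.2},
    {"client_code": "A", "open": 10.2, "close": 10.9},
    {"client_code": "A", "open": 10.9, "close": 11.0},
    {"client_code": "B", "open": 11.0, "close": 10.8},
]
cleaned = drop_redundant_fields(rows)
print(sorted(cleaned[0]))
```

Because `None` is counted like any other value, the same function also removes fields dominated by missing entries, which is one possible reading of step (3-4).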
Preferably, in step 1, the factors further include economic, political and industry factors; the company factors include the company's operating condition, financial reports and personnel changes in key positions; and the market factors include market demand and the company's public evaluation and social influence.
Preferably, in step 5, the model test is as follows: the test-set data are input for testing, and the test result passes the model test and conforms to a normal distribution; the conclusion analysis is as follows: the factors with high relevance are determined from the weights of the corresponding variables in the regression equation.
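The conclusion analysis can be sketched by ranking the 21 weights of the final regression equation by absolute value. The weights below are copied from that equation; the factor names are English renderings of the step (2-4) list, and the cut-off of 1.0 is an assumption chosen for illustration, since the patent does not state a threshold.

```python
# Weights copied from the final regression equation; names are English
# renderings of the 21 factors preset in step (2-4).
WEIGHTS = [
    ("x1", "opening price", -27.76),
    ("x2", "highest price", -2.15),
    ("x3", "total operating cost", -1.94),
    ("x4", "total profit", -1.47),
    ("x5", "turnover", -0.31),
    ("x6", "price change amount", -0.23),
    ("x7", "total liabilities", -0.14),
    ("x8", "taxes payable", -0.02),
    ("x9", "diluted earnings per share", -0.02),
    ("x10", "company type", 0.01),
    ("x11", "surplus reserve", 0.02),
    ("x12", "trading volume", 0.10),
    ("x13", "undistributed profit", 0.09),
    ("x14", "employee compensation payable", 0.11),
    ("x15", "price change percentage", 0.15),
    ("x16", "basic earnings per share", 0.73),
    ("x17", "operating profit", 1.13),
    ("x18", "total operating revenue", 2.12),
    ("x19", "lowest price", 8.97),
    ("x20", "previous close", 17.17),
    ("x21", "closing price", 17.22),
]

def high_weight_factors(weights, threshold=1.0):
    """Return factors with |weight| >= threshold, largest magnitude first.

    The threshold is a hypothetical cut-off, not a value from the patent.
    """
    ranked = sorted(weights, key=lambda item: abs(item[2]), reverse=True)
    return [item for item in ranked if abs(item[2]) >= threshold]

top = high_weight_factors(WEIGHTS)
for symbol, name, weight in top:
    print(f"{symbol:>3}  {name:<24} {weight:+.2f}")
```

With this cut-off the sketch selects nine factors, the same nine that the detailed description names as strongly influencing the stock price. Ranking by raw coefficient magnitude is only meaningful when the factors are on comparable scales.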
Compared with the prior art, the stock trend analysis method based on autocorrelation linear neighbor analysis adopting the above technical scheme has the following beneficial effects: a model based on autocorrelation linear neighbor analysis is established, the final test result passes the model test, and the curve is fitted accurately under the influence of multiple stock price factors.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of a stock trend analysis method based on autocorrelation linear nearest neighbor analysis according to the present invention;
FIG. 2 is a schematic diagram illustrating a flow of model detection in the stock trend analysis method based on autocorrelation linear nearest neighbor analysis according to this embodiment;
FIG. 3 is a schematic diagram of the curve fitted to the stock trend in this embodiment.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a schematic flow chart of the stock trend analysis method based on autocorrelation linear neighbor analysis, which comprises the following steps: step 1, analyzing influencing factors, specifically: determining the factors that affect the stock price trend, the factors including company, market, economic, political and industry factors; the company factors include the company's operating condition, financial reports and personnel changes in key positions, and the market factors include market demand and the company's public evaluation and social influence;
step 2, acquiring stock and company data, specifically: step (2-1), crawling all Shenzhen Stock Exchange stock codes through a data interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data including the daily trading data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 candidate factors, as follows: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically: step (3-1), deleting the redundant fields client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the row count; step (3-4), deleting, in the same way, null and missing values that meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically: step (4-1), selecting the form of the relation between the variables and the model;
step (4-2), determining the general form of the regression equation:
$$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$$
wherein Y is the stock return, $x_1$ is the growth rate of the stock value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock commodity price index, $x_4$ is the trading volume, η is a random variable, and α, β, γ are the regression (beta) coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the largest and smallest four variables to establish a system of linear optimization equations and solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), once the test is passed and the model conforms to a normal distribution, obtaining the final regression equation:
$$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$$
step (4-7), drawing a stock price trend graph.
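The serial autocorrelation test of step (4-5) is not specified further in the text. As a hedged illustration, the sketch below computes the Durbin-Watson statistic, one common check for first-order serial correlation. Applying it to model residuals, rather than to the weight parameters the patent mentions, is an assumption, and the sample sequences are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator

# Hypothetical residual sequences for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]            # sign flips every step
trending = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # long runs of the same sign

print(durbin_watson(alternating))  # 12 / 4 = 3.0
print(durbin_watson(trending))     # 4 / 6, about 0.667
```

The statistic lies between 0 and 4. Values near 2 indicate little first-order autocorrelation, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation.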
step 5, model testing and conclusion analysis; FIG. 3 shows the curve fitted to the stock trend. The model test is as follows: the test-set data are input for testing, and the test result passes the model test and conforms to a normal distribution. The conclusion analysis is as follows: the factors with high relevance are determined from the weights of the corresponding variables in the regression equation, and the analysis concludes that an enterprise's stock price is strongly influenced by the opening price, highest price, total operating cost, total profit, operating profit, total operating revenue, lowest price, previous close and closing price;
step 6, model evaluation and analysis report generation. This completes the description of the stock trend analysis method based on autocorrelation linear neighbor analysis. Because the invention involves many variables, Table 1 explains the variables used in each step:
[Table 1 image not reproduced in this text]
TABLE 1
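Two of the preprocessing steps in this embodiment, the Lagrange fill of step (3-5) and the seeded 80/20 split of step (3-7), can be sketched as follows. This is a minimal standard-library version, not the patent's implementation: the text uses `train_test_split`, for which `random.Random` stands in here. Treating abnormal values as `None`, the `window` of two neighbours on each side, the seed 42 and all function names are assumptions.

```python
import random

def lagrange_eval(points, x):
    """Evaluate at x the Lagrange polynomial through (xi, yi) points."""
    total = 0.0
    for j, (xj, yj) in enumerate(points):
        term = yj
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total

def lagrange_fill(values, window=2):
    """Replace None entries using up to `window` known neighbours on each side."""
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = [p for p in known if p[0] < i][-window:]
        right = [p for p in known if p[0] > i][:window]
        points = left + right
        if points:  # leave the gap untouched if nothing is known
            filled[i] = lagrange_eval(points, i)
    return filled

def split_80_20(rows, seed=42):
    """Shuffle with a fixed seed, then cut 80% train / 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]

# Hypothetical closing prices with one value flagged as abnormal (None).
closes = [10.0, 10.2, None, 10.6, 10.8]
filled = lagrange_fill(closes)
print(round(filled[2], 6))

train, test = split_80_20(list(range(10)))
print(len(train), len(test))
```

Fixing the seed makes the split reproducible across runs, which is the purpose of the random seed in step (3-7).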
The foregoing is a preferred embodiment of the present invention, and it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (3)

1. A stock trend analysis method based on autocorrelation linear neighbor analysis, characterized in that the method comprises the following steps: step 1, analyzing influencing factors, specifically: determining the factors that affect the stock price trend, the factors including company factors and market factors;
step 2, acquiring stock and company data, specifically: step (2-1), crawling all Shenzhen Stock Exchange stock codes through a data interface; step (2-2), extracting the stock data of all enterprises from 2010 to 2020, the stock data including the daily trading data of each enterprise; step (2-3), selecting from the existing data the data relevant to this analysis and denoting it data1; step (2-4), presetting 21 candidate factors, as follows: opening price $x_1$, highest price $x_2$, total operating cost $x_3$, total profit $x_4$, turnover $x_5$, price change amount $x_6$, total liabilities $x_7$, taxes payable $x_8$, diluted earnings per share $x_9$, company type $x_{10}$, surplus reserve $x_{11}$, trading volume $x_{12}$, undistributed profit $x_{13}$, employee compensation payable $x_{14}$, price change percentage $x_{15}$, basic earnings per share $x_{16}$, operating profit $x_{17}$, total operating revenue $x_{18}$, lowest price $x_{19}$, previous close $x_{20}$, and closing price $x_{21}$;
step 3, preprocessing the stock data, specifically: step (3-1), deleting the redundant fields client code and agency number; step (3-2), setting a threshold α as the criterion for removing redundant fields; step (3-3), using value_counts to count the occurrences n of each value in each field, and deleting the field as redundant if n is greater than or equal to 80% of the row count; step (3-4), deleting, in the same way, null and missing values that meet the condition; step (3-5), filling the remaining abnormal values by Lagrange interpolation to obtain data2; step (3-6), extracting the principal features by PCA dimensionality reduction to obtain data3; step (3-7), setting a random seed and using train_test_split to randomly draw 80% of data3 as the training data train and 20% of data3 as the test data test; step (3-8), feeding train into model training;
step 4, establishing a model based on autocorrelation linear neighbor analysis, specifically: step (4-1), selecting the form of the relation between the variables and the model;
step (4-2), determining the general form of the regression equation:
$$Y = C + \alpha x_1 + \beta x_2 + \gamma x_3 + x_4 + \eta$$
wherein Y is the stock return, $x_1$ is the growth rate of the stock value, $x_2$ is the trading volume (total capital construction investment), $x_3$ is the stock commodity price index, $x_4$ is the trading volume, η is a random variable, and α, β, γ are the regression (beta) coefficients of the stock;
step (4-3), calling the LinearRegression and KNeighborsRegressor functions multiple times to calculate the beta coefficients;
step (4-4), screening out the largest and smallest four variables to establish a system of linear optimization equations and solve for the maximum yield and its optimal solution;
step (4-5), performing a serial autocorrelation test on the model weight parameters;
step (4-6), once the test is passed and the model conforms to a normal distribution, obtaining the final regression equation:
$$Y = -27.76x_1 - 2.15x_2 - 1.94x_3 - 1.47x_4 - 0.31x_5 - 0.23x_6 - 0.14x_7 - 0.02x_8 - 0.02x_9 + 0.01x_{10} + 0.02x_{11} + 0.10x_{12} + 0.09x_{13} + 0.11x_{14} + 0.15x_{15} + 0.73x_{16} + 1.13x_{17} + 2.12x_{18} + 8.97x_{19} + 17.17x_{20} + 17.22x_{21}$$
step (4-7), drawing a stock price trend graph;
step 5, model testing and conclusion analysis;
step 6, evaluating the model and generating an analysis report.
2. The stock trend analysis method based on autocorrelation linear neighbor analysis according to claim 1, wherein in step 1 the factors further include economic, political and industry factors; the company factors include the company's operating condition, financial reports and personnel changes in key positions; and the market factors include market demand and the company's public evaluation and social influence.
3. The stock trend analysis method based on autocorrelation linear neighbor analysis according to claim 1, wherein in step 5 the model test is as follows: the test-set data are input for testing, and the test result passes the model test and conforms to a normal distribution; and the conclusion analysis is as follows: the factors with high relevance are determined from the weights of the corresponding variables in the regression equation.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010277361.6A CN111523085A (en) 2020-04-10 2020-04-10 Stock tendency analysis method based on autocorrelation linear neighbor analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010277361.6A CN111523085A (en) 2020-04-10 2020-04-10 Stock tendency analysis method based on autocorrelation linear neighbor analysis

Publications (1)

Publication Number Publication Date
CN111523085A 2020-08-11

Family

ID=71902041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010277361.6A Withdrawn CN111523085A (en) 2020-04-10 2020-04-10 Stock tendency analysis method based on autocorrelation linear neighbor analysis

Country Status (1)

Country Link
CN (1) CN111523085A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380489A (en) * 2020-11-03 2021-02-19 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN112380489B (en) * 2020-11-03 2024-04-16 武汉光庭信息技术股份有限公司 Data processing time calculation method, data processing platform evaluation method and system
CN112927081A (en) * 2021-03-16 2021-06-08 北京同邦卓益科技有限公司 Data processing method, device, system and storage medium

Similar Documents

Publication Publication Date Title
CN114048436A (en) Construction method and construction device for forecasting enterprise financial data model
CN111523085A (en) Stock tendency analysis method based on autocorrelation linear neighbor analysis
Desmon et al. Factors Affecting Investment in the Provinces of Sumatra Island
CN106228399A (en) A kind of stock trader's customer risk preference categories method based on big data
CN113240527A (en) Bond market default risk early warning method based on interpretable machine learning
CN111523086A (en) Room price trend analysis method based on logarithmic linear regression and random forest
CN115205011A (en) Bank user portrait model generation method based on LSF-FC algorithm
CN117541057A (en) Enterprise operation early warning monitoring method and system based on data analysis
Gao et al. Data analytics and audit quality
CN116468536A (en) Automatic risk control rule generation method
CN112037006A (en) Credit risk identification method and device for small and micro enterprises
Akyol et al. How do experienced analysts improve price efficiency?
Cunningham Extracting a better signal from uncertain data
Crespi et al. The productivity of science
CN116883070A (en) Bank generation payroll customer loss early warning method
CN115907533A (en) Method and system for evaluating continuous operation capability of individual industrial and commercial customers
KR102499182B1 (en) Loan regular auditing system using artificia intellicence
CN116227958A (en) Method and system for dynamically and quantitatively evaluating offset fund manager based on holding bin and net value
Herwartz et al. Do rising top incomes spur economic growth? Evidence from OECD countries based on a novel identification strategy
Lebedchenko The impact of interregional economic differentiation on the economic security of the regions of Ukraine
Friedman et al. Technological Investment and Accounting: A Demand-Side Perspective on Accounting Enrollment Declines
Yang et al. An empirical analysis on distribution patterns of software maintenance effort
Sari et al. The Effect of Working Capital Turnover on Profitability (Empirical Study of Textile and Garment Companies Listed on The Indonesia Stock Exchange for The 2014-2018 Period
Damiani et al. When robots do (not) enhance job quality: the role of innovation regimes
Wang et al. Nonlinearity in the cross-section of stock returns: Evidence from China

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200811
