CN110889024A - Method and device for calculating information-related stock - Google Patents



Publication number
CN110889024A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911024061.0A
Other languages
Chinese (zh)
Inventor
吴祥
樊国鹏
朱留锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Lighthouse Light Technology Co Ltd
Original Assignee
Wuhan Lighthouse Light Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Lighthouse Light Technology Co Ltd
Priority to CN201911024061.0A
Publication of CN110889024A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of information processing, and particularly relates to a method and a device for calculating information-related stocks, wherein the method comprises the following steps: acquiring a plurality of pieces of information according to search conditions input by a user, and calculating to obtain one or more keywords of each piece of information; for each piece of information, performing model training on each keyword of the piece of information, and calculating to obtain a keyword vector corresponding to each keyword in the piece of information; for each piece of information, calculating the association degree between the keywords of the piece of information and the stock words of the market based on the keyword vector, and further determining one or more stocks related to the piece of information. The invention can acquire mass information according to the needs of users, calculates the keywords of each piece of information, and quickly analyzes and acquires the stocks related to the information by calculating the similarity between the keywords and each stock word in the market and presents the stocks to the users, thereby assisting the users in carrying out stock investment transactions and providing effective reference for the investment of the users.

Description

Method and device for calculating information-related stock
[ technical field ]
The invention relates to the technical field of information processing, in particular to a method and a device for calculating information-related stocks.
[ background of the invention ]
With the rapid development of the internet and the domestic securities market, internet information is updated ever more frequently, and a large volume of information often bursts onto the market within a short time. This overload overwhelms investors, and it becomes very difficult to accurately and rapidly extract the associated stock information from a massive information set in the background and present it to users in time.
At present, information is usually configured manually: in the face of massive real-time information, a large amount of manpower is used to select the stocks related to each piece of information and display them to users, so as to assist users in stock investment transactions. However, because the information is updated frequently, manual configuration is costly, subjective and inaccurate, and purely manual analysis is slow. It cannot achieve a good effect in a short time or deliver the associated stocks to users quickly enough, so effective reference cannot be provided for user investment in time.
In view of the above, it is an urgent problem in the art to overcome the above-mentioned drawbacks of the prior art.
[ summary of the invention ]
The technical problems to be solved by the invention are as follows:
at present, internet information is updated frequently, and obtaining information-related stocks purely through manual analysis leads to high configuration cost, strong subjectivity, low accuracy and a slow configuration process, so that the information-related stocks cannot be screened out and presented to users in time, and effective reference cannot be provided for user investment.
The invention achieves the above purpose by the following technical scheme:
in a first aspect, the present invention provides a method for calculating information-related stocks, comprising:
acquiring a plurality of pieces of information according to search conditions input by a user, and calculating to obtain one or more keywords of each piece of information;
for each piece of information, performing model training on each keyword of the piece of information, and calculating to obtain a keyword vector corresponding to each keyword in the piece of information;
for each piece of information, calculating the association degree between the keywords of the piece of information and the stock words of the market based on the keyword vector, and further determining one or more stocks related to the piece of information.
Preferably, after determining one or more stocks related to the piece of information, the method further comprises:
displaying the one or more stocks to a user according to a preset sorting mode; the preset sorting mode is sorting by the association degree of each stock word, by the release time of the corresponding information, or by the number of corresponding keywords.
Preferably, the acquiring of a plurality of pieces of information according to the search condition input by the user and the calculating of one or more keywords of each piece of information specifically include:
acquiring a plurality of pieces of related information according to a search condition input by a user; wherein the search condition comprises one or more of an information title, information content and an information keyword;
performing data cleaning on each piece of information to remove useless tags and dirty data in the information;
and performing word segmentation on the text of each piece of information, removing invalid words, selecting one or more of the remaining words as the keywords of the piece of information, and calculating the weight of each keyword.
Preferably, for each piece of information, selecting one or more of the remaining words as the keywords of the piece of information specifically comprises:
for each remaining word, comparing the probability of the word appearing in the piece of information with the probability of the word appearing in the full set of information; if the probability of the word appearing in the piece of information is higher than the probability of it appearing in the full set of information, the word is used as a keyword of the piece of information.
Preferably, the invalid words include one or more of stop words, dirty words and noise words.
Preferably, when the keywords are trained, the CBOW model is selected for training, and the finally obtained keyword vector is a 1 × 256-dimensional vector.
Preferably, for each piece of information, the association degree between the keyword of the piece of information and each stock word in the market is calculated based on the keyword vector, so as to determine one or more stocks related to the piece of information, specifically:
traversing and calculating the cosine of the included angle between the keyword vector of the information and each stock word vector in the market, and multiplying the cosine value by the weight of the corresponding keyword to obtain the association degree between the keyword of the information and the corresponding stock word;
after each calculation, comparing the obtained association degree with a preset threshold value, and if the association degree is higher than the preset threshold value, storing the corresponding stock words and the association degree value in a stock dictionary;
and after the traversal calculation is completed, determining one or more stocks related to the information based on one or more stock words stored in the stock dictionary.
Preferably, after determining one or more stocks related to the information based on one or more stock words stored in the stock dictionary, the method further comprises:
sorting the stock words in the stock dictionary in descending order of their association degree values, and forming the n top-ranked stocks into an ordered associated stock array; wherein n ≥ 1;
and displaying the associated stock array to the user according to a sorting mode so that the user can conveniently perform stock investment based on information association.
Preferably, the plurality of information is obtained from one or more information platforms, and after the obtaining of the plurality of information, the method further comprises:
respectively analyzing the nature of the plurality of pieces of information, screening out one or more pieces of predictive information, and recording the information platform and the predicted result corresponding to each piece of predictive information;
obtaining a related result corresponding to each piece of predictive information through big data crawler analysis and/or access to a national reference information platform;
matching the predicted result of each piece of predictive information with the corresponding related result to obtain the accuracy of each piece of predictive information, and thereby the reliability of the corresponding information platform;
then, the next time information-related stocks are calculated, the acquiring of the plurality of pieces of information specifically comprises: acquiring the information from information platforms whose reliability is higher than a preset reference value.
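The platform reliability evaluation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the "related result" is taken to be the outcome that actually occurred, reliability is taken to be the fraction of a platform's predictions that matched it, and the names `platform_reliability`, `trusted_platforms` and the sample platform names are hypothetical.

```python
def platform_reliability(records):
    """Compute per-platform reliability from (platform, predicted, actual) records.

    Assumption: reliability is the share of a platform's predictive items
    whose predicted result equals the related (actual) result.
    """
    hits, totals = {}, {}
    for platform, predicted, actual in records:
        totals[platform] = totals.get(platform, 0) + 1
        hits[platform] = hits.get(platform, 0) + (1 if predicted == actual else 0)
    return {platform: hits[platform] / totals[platform] for platform in totals}


def trusted_platforms(reliability, reference_value):
    """Keep only platforms whose reliability is higher than the reference value."""
    return sorted(p for p, score in reliability.items() if score > reference_value)


# Hypothetical sample data: two platforms, two predictive items each.
records = [
    ("PlatformA", "up", "up"),
    ("PlatformA", "down", "down"),
    ("PlatformB", "up", "down"),
    ("PlatformB", "up", "up"),
]
reliability = platform_reliability(records)
trusted = trusted_platforms(reliability, reference_value=0.6)
```

Only the platforms returned by `trusted_platforms` would then be used as sources the next time information is acquired.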
In a second aspect, the present invention provides an apparatus for calculating information-related stocks, comprising at least one processor and a memory, the at least one processor and the memory being connected via a data bus, the memory storing instructions executable by the at least one processor, the instructions being configured to perform the method for calculating information-related stocks according to the first aspect after being executed by the processor.
The invention has the beneficial effects that:
the method for calculating the information-associated stocks can acquire mass information according to the requirements of users, calculates the keywords of each piece of information, calculates the similarity between the keywords of each piece of information and each stock word in the market, quickly analyzes and obtains the stocks associated with the information and presents the stocks to the users, can help the users to quickly locate the stocks corresponding to the current information, assists the users in carrying out stock investment transactions, increases the opportunity of selecting stocks for the users, enhances the investment capacity of investors, and provides effective reference for the investment of the users.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a flowchart of a method for calculating information-related stocks according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for calculating keywords in information according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining stocks related to information based on similarity according to an embodiment of the present invention;
FIG. 4 is a diagram of a page display of information and related stocks provided by an embodiment of the present invention;
FIG. 5 is a diagram of another page display of information and related stocks provided by an embodiment of the present invention;
FIG. 6 is a display diagram of a page of news information according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for performing reliability evaluation on an information platform according to an embodiment of the present invention;
FIG. 8 is a diagram of a system for computing information-related stocks, according to an embodiment of the present invention;
FIG. 9 is a structural diagram of an algorithm processing module according to an embodiment of the present invention;
FIG. 10 is a block diagram of an apparatus for calculating information-related stocks according to an embodiment of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the embodiments of the present invention, the symbol "/" indicates that both functions are provided, and the expression "A and/or B" indicates that the combination of the objects connected by it covers three cases: "A", "B", and "A and B".
The intelligent terminal of the embodiments of the present invention may exist in various forms, including but not limited to:
(1) A mobile communication device: such devices are characterized by mobile communication capabilities and are primarily aimed at providing voice and data communication. Such terminals include smart phones (e.g., iPhones), multimedia phones, feature phones, and low-end phones, among others.
(2) An ultra-mobile personal computer device: such devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile internet access. Such terminals include PDA, MID, and UMPC devices, such as iPads.
(3) A portable entertainment device: such devices can display and play video content, and generally also have mobile internet access. Such devices include video players, handheld game consoles, intelligent toys and portable car navigation devices.
(4) A server: the device for providing the computing service comprises a processor, a hard disk, a memory, a system bus and the like, and the server is similar to a general computer architecture, but has higher requirements on processing capacity, stability, reliability, safety, expandability, manageability and the like because of the need of providing high-reliability service.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The invention will be described in detail below with reference to the figures and examples.
Example 1:
The embodiment of the invention provides a method for calculating information-related stocks, which can quickly find the stocks associated with a piece of information and increase the user's opportunities for stock selection. As shown in FIG. 1, the method provided by the embodiment of the present invention mainly includes the following steps:
Step 10, acquiring a plurality of pieces of information according to the search condition input by the user, and calculating one or more keywords of each piece of information.
When a user uses the system of the invention to obtain information-related stocks, the user usually enters input describing the information he or she is interested in, and this input serves as the search condition for obtaining the information-related stock labels; the search condition may be one or more of an information title, information content and an information keyword. According to the search condition input by the user, the system can automatically acquire the full set of individual-stock news from the information platform. After all the acquired information is gathered, the keywords of each piece of news and the weight of each keyword (namely the importance of the keyword in that piece of news) can be obtained through calculation; each piece of news may have one or more keywords that convey its main content. For each piece of news, the corresponding keywords are obtained through a series of steps including data cleaning, word segmentation, calculation of the term frequency tf (namely the probability of the word appearing in the piece of information) and the inverse document frequency idf (derived from how often the word appears across the full set of information), and comparison; the specific method is described later.
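As an illustration of how a keyword weight could be derived from tf and idf, a minimal sketch follows. The patent does not give the exact weighting formula, so the smoothed idf used here is an assumption, and the function name `tf_idf_weights` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf_weights(doc_tokens, corpus_tokens):
    """Return {word: tf * idf} for one tokenized news item.

    doc_tokens: list of words in the item.
    corpus_tokens: list of token lists, one per item in the full set.
    Assumption: a smoothed idf, log((1 + N) / (1 + df)) + 1, is used.
    """
    n_docs = len(corpus_tokens)
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    weights = {}
    for word, count in counts.items():
        tf = count / total  # share of this item's tokens equal to `word`
        doc_freq = sum(1 for tokens in corpus_tokens if word in tokens)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1
        weights[word] = tf * idf
    return weights


# Hypothetical, already segmented sample corpus.
corpus = [
    ["pork", "price", "rise"],
    ["bank", "rate", "cut"],
    ["pork", "supply", "pork", "farm"],
]
weights = tf_idf_weights(corpus[2], corpus)
```

In this sample, "pork" occurs twice in the third item and therefore receives a larger weight than "farm", which occurs once.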
Step 20, for each piece of information, performing model training on each keyword of the piece of information to calculate the keyword vector corresponding to each keyword in the piece of information.
When training the keyword model, tools such as word2vec or GloVe can be used; the CBOW model is selected for training, and the finally obtained keyword vector is a 1 × 256-dimensional vector. For example, in a specific embodiment, the training window size of the CBOW model may be set to 8, hierarchical softmax is used, the sampling threshold is set to 1e-4, the learning rate is set to 0.025, the minimum word frequency is set to 5, and the number of threads is set to 20, yielding a 1 × 256-dimensional vector for each keyword.
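The settings of this embodiment can be written down as a parameter set. The sketch below assumes the gensim 4.x `Word2Vec` keyword-argument names; the patent names only word2vec and GloVe as tools, so the choice of gensim, the name `CBOW_PARAMS` and the `negative` setting are assumptions.

```python
# Hypothetical mapping of the embodiment's settings onto gensim 4.x
# Word2Vec keyword arguments (the patent does not name a specific library).
CBOW_PARAMS = {
    "vector_size": 256,  # 1 x 256-dimensional keyword vector
    "window": 8,         # training window size
    "sg": 0,             # 0 selects CBOW rather than skip-gram
    "hs": 1,             # use hierarchical softmax
    "negative": 0,       # assumption: negative sampling switched off
    "sample": 1e-4,      # sampling threshold
    "alpha": 0.025,      # learning rate
    "min_count": 5,      # minimum word frequency
    "workers": 20,       # number of threads
}
```

With gensim installed, `Word2Vec(sentences=tokenized_news, **CBOW_PARAMS)` would train such a model on a list of token lists (`tokenized_news` is a hypothetical name), and `model.wv[word]` would return the 256-dimensional vector of a word.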
Step 30, for each piece of information, calculating the association degree between the keywords of the piece of information and each stock word in the market based on the keyword vectors, and thereby determining one or more stocks related to the piece of information.
In the embodiment of the invention, the similarity between the information keyword and each stock word in the whole market is calculated by traversal using the cosine similarity method, and the association degree is then obtained by multiplying the similarity by the weight of the corresponding keyword; the higher the association degree, the more closely the keyword is related to the corresponding stock word. One or more stocks related to the information can therefore be determined based on the calculated association degrees; the specific process is described later.
Generally, after step 30, i.e., after determining one or more stocks related to the piece of information, the method further comprises:
and displaying the one or more stocks to the user according to a preset sequencing mode so that the user can obtain the stock information related to the information in time and perform the stock investment based on the information correlation. The preset ordering may be an association ordering of each stock word, an issuing time ordering of corresponding information, or a quantity ordering of corresponding keywords, etc.; of course, the corresponding sorting selection mode can be directly provided on the display interface, the user can select the desired sorting method on the display interface according to the requirement, and the system orderly displays the one or more stocks to the user according to the sorting mode selected by the user, so that the use requirement of the user can be met to the maximum extent.
According to the method provided by the embodiment of the invention, massive information can be obtained according to the requirements of the user, the keyword of each piece of information is obtained through calculation, and then the similarity between the keyword of each piece of information and each stock word in the market is calculated, so that stocks related to the information are obtained through rapid analysis and presented to the user, the user can be helped to rapidly locate stocks corresponding to the current information, the user is helped to conduct stock investment transaction, the opportunity of selecting stocks is increased for the user, and effective reference is provided for the user investment.
Referring further to FIG. 2, step 10, in which a plurality of pieces of information are acquired according to the search condition input by the user and one or more keywords of each piece of information are calculated, specifically includes the following steps:
Step 101, acquiring a plurality of pieces of related information according to a search condition input by a user; wherein the search condition comprises one or more of an information title, information content and an information keyword.
Therefore, after the user inputs the search condition according to the information of interest, the system can automatically acquire the full set of historical individual-stock news from the information platform according to the search condition and gather all the acquired information. Generally, the more recent a piece of information is, the newer the content it reflects and the greater its reference value, so the system can directly acquire only the information within the latest preset time period (for example, the latest month). This reduces the total amount of information to be processed and relieves the processing pressure, without greatly affecting the subsequent calculation of associated stocks. In addition, since the news is acquired mainly to obtain the related stock information and provide reference for investment, the information platform is usually a specific platform mainly used for providing information on matters such as finance and policy.
Step 102, performing data cleaning on each information to remove useless tags and dirty data in the information.
The system acquires and gathers the information from the specific information platforms, and then performs a round of data cleaning on each piece of information to remove useless tags and dirty data. Useless tags are HTML page-format markup in the acquired news, such as <head></head> and <p></p>; this markup is irrelevant to the text of the information and therefore needs to be removed. Dirty data refers to interfering words in the news text, such as the stop words "的" and "地" (common Chinese particles) and some sensitive dirty words.
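The tag-removal part of the cleaning step might look like the following sketch. It is a simplification that assumes the news arrives as an HTML string; the regular expression only strips the markup itself, and the name `clean_item` is hypothetical.

```python
import html
import re

# Matches any HTML tag such as <head>, </head>, <p> or </p>.
TAG_RE = re.compile(r"<[^>]+>")


def clean_item(raw):
    """Strip HTML tags, decode entities and collapse whitespace."""
    text = TAG_RE.sub(" ", raw)      # drop page-format markup
    text = html.unescape(text)       # turn &amp; into & and so on
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_item("<head></head><p>Pork prices &amp; supply</p>")
```

A production cleaner would more likely use a real HTML parser so that script and style content is dropped as well; the regular expression is kept here only for brevity.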
Step 103, performing word segmentation on the text in each piece of information, removing the invalid words, selecting one or more keywords as the keywords of the piece of information from the remaining words, and calculating the weight of each keyword.
For each piece of information, word segmentation is first performed on the text of the information to obtain the words it contains; invalid words such as stop words, dirty words and noise words are then removed, and one or more of the remaining words are selected as the keywords of the piece of news. The keywords are mainly financial terms or words related to persons, organizations and the like. The keywords are selected as follows: for each remaining word, the probability of the word appearing in the piece of news and the probability of the word appearing in the full set of information are calculated respectively and compared; if the probability of the word appearing in the piece of news is higher than the probability of it appearing in the full set of information, the word is used as a keyword of the information. In this way, a plurality of words can be selected from the remaining words as the keywords of the piece of news.
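The selection rule above can be sketched as follows, assuming the text has already been segmented into tokens and that both probabilities are relative frequencies computed after invalid words are removed. The invalid-word list, the sample corpus and the name `select_keywords` are illustrative assumptions.

```python
from collections import Counter

INVALID_WORDS = {"the", "of", "a"}  # illustrative stop/noise words only


def select_keywords(doc_tokens, corpus_tokens, invalid_words=INVALID_WORDS):
    """Select words that are more probable in one item than in the full set.

    corpus_tokens is assumed to contain the item itself.
    """
    doc = [t for t in doc_tokens if t not in invalid_words]
    corpus = [t for tokens in corpus_tokens for t in tokens
              if t not in invalid_words]
    if not doc or not corpus:
        return []
    doc_counts, corpus_counts = Counter(doc), Counter(corpus)
    keywords = []
    for word, count in doc_counts.items():
        p_doc = count / len(doc)                       # probability in this item
        p_corpus = corpus_counts[word] / len(corpus)   # probability in the full set
        if p_doc > p_corpus:
            keywords.append(word)
    return keywords


corpus = [
    ["the", "pork", "price", "of", "pork"],
    ["the", "market", "price", "price"],
    ["market", "price", "rally"],
]
keywords = select_keywords(corpus[0], corpus)
```

Here "pork" is kept because it is far more frequent in the first item than in the corpus as a whole, while "price" is dropped because it is common everywhere.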
With continued reference to FIG. 3, step 30, in which, for each piece of information, the association degree between the keywords of the piece of information and the stock words in the market is calculated based on the keyword vectors and one or more stocks related to the piece of information are thereby determined, specifically includes the following steps:
Step 301, traversing and calculating the cosine of the included angle between the keyword vector of the information and each stock word vector in the market, and multiplying the cosine value by the weight of the corresponding keyword to obtain the association degree between the keyword of the information and the corresponding stock word.
As described above, in the embodiment of the present invention the similarity between two vectors is calculated by the cosine similarity method. The keyword vector corresponding to each keyword has already been calculated in step 20, and the stock word vector corresponding to each stock word in the market can be calculated by the same method, which is not described again here. Therefore, for each keyword of each piece of information, the keyword vector can be located in the high-dimensional space, and the cosine of the included angle between the keyword vector and each stock word vector in the market is then calculated by traversal to obtain the similarity between the keyword and each stock word in the market. In general, the cosine similarity between two vectors ranges from -1 to 1, and the closer the value is to 1, the higher the similarity between the two vectors. The similarity value is then multiplied by the weight corresponding to the keyword to obtain the association degree between the keyword of the information and each stock word.
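The calculation in step 301 can be illustrated with a short sketch. Short plain-Python lists stand in for the 256-dimensional vectors, and the function names `cosine_similarity` and `association_degree` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the included angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a zero vector is treated as unrelated
    return dot / (norm_u * norm_v)


def association_degree(keyword_vec, keyword_weight, stock_vec):
    """Similarity between keyword and stock word, scaled by the keyword weight."""
    return keyword_weight * cosine_similarity(keyword_vec, stock_vec)


same_direction = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
degree = association_degree([3.0, 4.0], 0.5, [3.0, 4.0])
```

Because the similarity is multiplied by the keyword weight, a highly similar stock word still receives a low association degree when the keyword itself carries little weight in the news item.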
And step 302, after each calculation, comparing the obtained association degree with a preset threshold value, and if the association degree is higher than the preset threshold value, storing the corresponding stock words and the association degree value in a stock dictionary.
Obviously, the higher the association degree obtained in a calculation, the more closely the keyword of the information is related to the corresponding stock word. A preset threshold can be set based on a large number of manually accumulated empirical samples; if the obtained association degree is higher than the preset threshold, the corresponding stock word can be considered highly related to the information, and the stock word and its association degree value are therefore stored in the stock dictionary.
Step 303, after the traversal calculation is completed, one or more stocks related to the information are determined based on one or more stock words stored in the stock dictionary.
After the traversal calculation is completed, one or more stock words may be already stored in the stock dictionary, and the stock corresponding to each stock word can be regarded as a stock associated with the information, so that one or more stocks related to the information are determined, and the stocks are displayed to the user.
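Steps 301 to 303 taken together can be sketched as one traversal. This is an illustration only: the patent does not say what happens when several keywords match the same stock word, so keeping the largest association degree is an assumption, and the names `build_stock_dictionary`, `StockA`, `StockB` and `StockC` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the included angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_stock_dictionary(keywords, stock_vectors, threshold):
    """Traverse all stock words and keep those above the preset threshold.

    keywords: {keyword: (vector, weight)}
    stock_vectors: {stock_word: vector}
    """
    stock_dict = {}
    for vector, weight in keywords.values():
        for stock_word, stock_vec in stock_vectors.items():
            degree = weight * cosine(vector, stock_vec)
            if degree > threshold:
                # Assumption: keep the highest degree seen for a stock word.
                stock_dict[stock_word] = max(degree, stock_dict.get(stock_word, 0.0))
    return stock_dict


keywords = {"pork": ([1.0, 0.0], 0.8), "price": ([0.0, 1.0], 0.3)}
stock_vectors = {
    "StockA": [1.0, 0.0],
    "StockB": [0.6, 0.8],
    "StockC": [0.0, 1.0],
}
stock_dict = build_stock_dictionary(keywords, stock_vectors, threshold=0.4)
```

In this sample only `StockA` and `StockB` exceed the threshold; `StockC` matches the low-weight keyword "price" and is filtered out.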
Furthermore, after one or more stocks related to the information are determined, the stock words in the stock dictionary can be sorted in descending order of their association degree values, and the n top-ranked stocks are combined into an ordered associated stock array (n ≥ 1). The associated stock array corresponds to the n stocks most related to the information and logically represents the stock labels of the information. After the associated stock array is obtained, it can be displayed to the user in sorted order, so that the user can conveniently make stock investments based on information association.
As shown in FIG. 4 to FIG. 6, after the user inputs the search condition in the intelligent terminal, the related news and the related stocks can be displayed on the display interface of the intelligent terminal. According to the display result, the user can quickly obtain the stocks associated with the information, which provides effective reference for the user's investment.
In the embodiment shown in FIG. 4, the user wishes to obtain stocks related to information about the bull market in China, so the input search terms may be "bull market", "bull stocks", etc.; the system automatically acquires a plurality of pieces of related information according to the search condition, and the upper part of the page in FIG. 4 shows one piece of news related to the bull market. After calculation by the method provided by the invention, four stocks related to this information are obtained, namely Merrill medical treatment, Merrill, Huazhi wine and Convergence technologies, which are displayed at the bottom of the page on the intelligent terminal. In this embodiment, the news and the related stocks can be displayed on the same page.
In the embodiment corresponding to FIG. 5 and FIG. 6, the user wishes to obtain stocks related to pork price information, so the input search terms may be "pork", "pig", "pork price", etc.; the system automatically acquires the related information according to the search condition, and the related news items are shown in the upper part of the page in FIG. 5. After calculation by the method provided by the invention, two stocks related to the information are obtained, namely the herdship stock and the wen's stock shown at the bottom of the page in FIG. 5. When one of the news items is clicked, the full text of that item is retrieved and displayed on the interface; FIG. 6 shows one such news item related to the pork price.
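Forming the ordered associated stock array from the stock dictionary can be sketched as follows; the function name `top_n_stocks` and the sample entries are hypothetical.

```python
def top_n_stocks(stock_dict, n):
    """Return the n stock words with the highest association degree, best first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(stock_dict.items(), key=lambda item: item[1], reverse=True)
    return [stock_word for stock_word, _ in ranked[:n]]


# Hypothetical stock dictionary: stock word -> association degree.
associated = top_n_stocks({"StockA": 0.8, "StockB": 0.48, "StockC": 0.55}, n=2)
```

If the dictionary holds fewer than n entries, the slice simply returns all of them in descending order.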
Further, as can be seen from the foregoing, in step 10, the pieces of information are obtained from one or more information platforms, and the information platforms generally refer to some specific platforms mainly used for providing information news in financial aspects. Among a plurality of information news acquired from the information platforms, some information news with estimation properties inevitably exist, and whether the estimation results in the part of information news are accurate or not cannot be known, and the credibility of the corresponding information platform cannot be known. If the reliability of a certain information platform is poor and the accuracy of the provided information news is also poor, certain errors may be brought to the related stock calculation according to the information news, and the accuracy of the result is affected.
To solve the above problem, after acquiring a plurality of pieces of information in step 10, referring to fig. 7, the method may further include:
Step 401, performing property analysis on the plurality of pieces of information respectively, screening out one or more pieces of predictive information, and recording the information platform and the predicted result corresponding to each piece of predictive information.
Whether a news item belongs to predictive information can be determined by detecting whether it contains predictive vocabulary. The predictive vocabulary includes, but is not limited to, "possible", "predicted", "estimated", "about", etc.; if such words occur, the news item is judged to be predictive information.
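The vocabulary check of step 401 can be sketched as follows. This is a minimal sketch under stated assumptions: the word list is the English rendering of the examples above, the news items are invented, and the `platform`/`text` record layout and the function names are hypothetical. Whole-word matching on English text is used for illustration only; a Chinese-language implementation would need word segmentation instead.

```python
import re

# English renderings of the example predictive words; not an exhaustive list.
PREDICTIVE_WORDS = frozenset({"possible", "predicted", "estimated", "about"})


def is_predictive(text, vocabulary=PREDICTIVE_WORDS):
    """Return True if the text contains at least one predictive word."""
    # Lowercase the text and split it into alphabetic tokens.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return not tokens.isdisjoint(vocabulary)


def screen_predictive(items):
    """Keep only the predictive news items, preserving their platform field."""
    return [item for item in items if is_predictive(item["text"])]


# Invented news items, each tagged with the platform it came from.
items = [
    {"platform": "platform_a", "text": "Pork prices are predicted to rise next quarter."},
    {"platform": "platform_b", "text": "The company released its annual report today."},
]
predictive_items = screen_predictive(items)
```

A plain vocabulary match of this kind will also flag non-predictive uses of common words such as "about", so a practical system would likely need a stricter rule.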
Step 402, obtaining a correlation result corresponding to each piece of predictive information through big data crawler analysis and/or access to a national reference information platform.
After the predicted result corresponding to each piece of predictive information is recorded, big data crawler analysis can be performed periodically at a later stage (for example, every day or every week) on network data, and/or data can be accessed from a national reference information platform, so as to obtain the correlation result corresponding to the content of each piece of predictive information. The national reference information platform is an authoritative information platform with high reliability (usually up to 100%), such as People's Daily Online and China Securities Network. The correlation results obtained through big data analysis or from an authoritative information platform can generally be considered highly accurate, and can therefore be used as standard results.
Step 403, matching the predicted result of each piece of predictive information with the corresponding correlation result to obtain the accuracy of each piece of predictive information, and thereby the reliability of the corresponding information platform.
Matching the predicted result with the corresponding standard result yields the similarity between them. The higher the similarity, the higher the accuracy of that piece of predictive information and the higher the credibility of the corresponding information platform; conversely, the lower the similarity, the lower the accuracy of the predictive information and the lower the reliability of the corresponding information platform. The system can screen out, according to its own requirements, the information platforms whose reliability is higher than a preset reference value; because these platforms are more reliable, the news they provide is more accurate, and they can continue to be used as data sources. The remaining information platforms, whose reliability is lower than the preset reference value, are no longer used as data sources because the news they provide is less accurate.
Therefore, when information-related stocks are calculated next time, the acquiring of the plurality of pieces of information in step 10 specifically includes: acquiring the plurality of pieces of information from the information platforms whose reliability is higher than the preset reference value. Ensuring the accuracy of the news at the data source further ensures the accuracy of the final association result and improves the user experience.
When the information-related stocks are presented to the user, the information source (namely the information platform) corresponding to the information, the credibility of the source, the number of news items, the release time of the information and the like can also be presented, so that the user obtains more comprehensive and detailed information and a better experience.
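The platform screening of step 403 can be illustrated with the sketch below. The embodiment does not specify how the similarity is computed or how per-item accuracy is aggregated into a platform reliability, so the sketch makes two assumptions: each similarity is already available as a number between 0 and 1, and a platform's reliability is the mean similarity of its predictive items. The function names, platform names and values are hypothetical.

```python
from collections import defaultdict


def platform_reliability(records):
    """Average the per-item similarity scores of each platform.

    records is an iterable of (platform, similarity) pairs. Using the mean
    as the reliability is an assumption, not something the text specifies.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for platform, similarity in records:
        totals[platform] += similarity
        counts[platform] += 1
    return {platform: totals[platform] / counts[platform] for platform in totals}


def trusted_platforms(reliability, reference_value):
    """Return the platforms whose reliability exceeds the preset reference value."""
    return sorted(p for p, score in reliability.items() if score > reference_value)


# Invented similarity scores between predicted results and standard results.
records = [
    ("platform_a", 1.0), ("platform_a", 0.5),
    ("platform_b", 0.25), ("platform_b", 0.5),
]
reliability = platform_reliability(records)
trusted = trusted_platforms(reliability, reference_value=0.6)
```

In this toy data only `platform_a` stays above the reference value and would remain a data source for the next calculation.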
Example 2:
On the basis of embodiment 1 above, an embodiment of the present invention provides a system for calculating information-related stocks. As shown in fig. 8, the system mainly includes:
the user input module is used by the user to input search conditions, such as information titles, information contents and information keywords, into the system;
the algorithm processing module is used for acquiring a plurality of information according to the user retrieval conditions, and obtaining one or more stocks related to the information after calculation and analysis;
and the user output module is used for displaying one or more stocks related to the information to the user according to the preset sequence so that the user can obtain the stock information related to the information in time and invest.
With further reference to fig. 9, the algorithm processing module may specifically include:
the information acquisition module is used for acquiring a plurality of pieces of relevant information from one or more information platforms according to the retrieval conditions input by the user and summarizing the information.
The keyword calculation module is used for obtaining one or more keywords of each piece of information through calculation, specifically: performing data cleaning on each piece of information to remove useless tags and dirty data; and performing word segmentation on the text of each piece of information, removing invalid words, selecting one or more of the remaining words as the keywords of the piece of information, and calculating the weight of each keyword.
The keyword vector calculation module is used for performing model training on the one or more keywords corresponding to each piece of information and calculating the keyword vector corresponding to each keyword in the piece of information. Specifically, tools such as word2vec and GloVe can be adopted, with the CBOW model selected for training; the resulting keyword vector is a 1 × 256-dimensional vector.
The relevance calculation module is used for traversing and calculating, by the cosine similarity method, the relevance between the keywords of the information and each stock word in the market, storing the stock words whose relevance is higher than a preset threshold, together with the corresponding relevance values, in a stock dictionary, and determining one or more stocks related to the information.
The stock ordering module is used for sorting the stock words in the stock dictionary in descending order of their relevance values and combining the n top-ranked stocks into an ordered associated stock array (n ≥ 1), which corresponds to the n stocks most relevant to the information and logically serves as the stock label of the information, so that the associated stock array can be displayed to the user in sorted order.
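The work of the relevance calculation module can be sketched as follows. This is a simplified illustration rather than the implementation of the embodiment: the vectors are 3-dimensional toy values instead of 1 × 256 vectors, the weights and threshold are invented, and keeping the highest relevance when several keywords match the same stock word is an assumption, since the text does not say how multiple keywords are combined.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # A zero vector has no direction; treat it as unrelated.
    return dot / (norm_u * norm_v)


def build_stock_dictionary(keywords, stock_vectors, threshold):
    """Traverse every (stock word, keyword) pair and keep relevant stock words.

    keywords is a list of (keyword vector, keyword weight) pairs and
    stock_vectors maps a stock word to its vector (hypothetical layouts).
    """
    stock_dict = {}
    for stock_word, stock_vector in stock_vectors.items():
        for keyword_vector, weight in keywords:
            # Relevance = cosine similarity multiplied by the keyword weight.
            relevance = cosine_similarity(keyword_vector, stock_vector) * weight
            if relevance > threshold:
                # Assumption: keep the best score seen for this stock word.
                previous = stock_dict.get(stock_word, 0.0)
                stock_dict[stock_word] = max(previous, relevance)
    return stock_dict


keywords = [([1.0, 0.0, 0.0], 0.8), ([0.0, 1.0, 0.0], 0.5)]
stock_vectors = {
    "stock_a": [1.0, 0.0, 0.0],
    "stock_b": [0.0, 0.0, 1.0],
    "stock_c": [0.0, 1.0, 0.0],
}
stock_dict = build_stock_dictionary(keywords, stock_vectors, threshold=0.6)
```

With these toy values only `stock_a` exceeds the threshold; `stock_c` matches the second keyword exactly, but the keyword weight of 0.5 keeps its relevance below 0.6.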
Example 3:
On the basis of the method for calculating information-related stocks provided in embodiment 1, the present invention further provides a device for calculating information-related stocks, which can be used to implement the method. Fig. 10 is a schematic structural diagram of the device according to an embodiment of the present invention. The apparatus for calculating information-related stocks of the present embodiment includes one or more processors 21 and a memory 22. In fig. 10, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, and fig. 10 illustrates the connection by a bus as an example.
The memory 22, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as those implementing the method for calculating information-related stocks in embodiment 1. The processor 21 performs the various functional applications and data processing of the apparatus for calculating information-related stocks by running the non-volatile software programs, instructions, and modules stored in the memory 22, that is, it implements the method for calculating information-related stocks of embodiment 1.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the method for calculating information-related stocks of embodiment 1, for example, perform the steps shown in fig. 1-3 described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for computing information-related stocks, comprising:
acquiring a plurality of pieces of information according to search conditions input by a user, and calculating to obtain one or more keywords of each piece of information;
for each piece of information, performing model training on each keyword of the piece of information, and calculating to obtain a keyword vector corresponding to each keyword in the piece of information;
for each piece of information, calculating the association degree between the keywords of the piece of information and the stock words of the market based on the keyword vector, and further determining one or more stocks related to the piece of information.
2. The method of claim 1, wherein after determining one or more stocks related to the piece of information, the method further comprises:
displaying the one or more stocks to a user according to a preset sorting mode; the preset sequence is the sequence of the relevance of each stock word, the sequence of the release time of corresponding information or the sequence of the number of corresponding keywords.
3. The method of claim 1, wherein the obtaining of the plurality of pieces of information according to the search condition input by the user and the calculating of the one or more keywords of each piece of information comprises:
acquiring a plurality of related information according to a retrieval condition input by a user; wherein, the search condition comprises one or more items of information title, information content and information keyword;
performing data cleaning on each information to remove useless tags and dirty data in the information;
and performing word segmentation operation on the text in each piece of information, removing invalid words, selecting one or more keywords as the keywords of the piece of information from the rest words, and calculating the weight of each keyword.
4. The method of claim 3, wherein for each piece of information, one or more keywords selected from the remaining vocabularies are used as keywords of the piece of information, and the method comprises:
for each remaining word, comparing the probability of the word appearing in the full set of news information with the probability of the word appearing in the piece of news information; if the probability of the word appearing in the piece of news information is higher than the probability of the word appearing in the full set of news information, the word is used as a keyword of the piece of information.
5. The method of claim 3, wherein the invalid words comprise one or more of stop words, dirty words, and noise words.
6. The method as claimed in claim 1, wherein the CBOW model is selected for training when performing model training on the keywords, and the final keyword vector is a 1 × 256-dimensional vector.
7. The method of claim 1, wherein for each piece of information, the keyword vector is used to calculate the degree of association between the keyword of the piece of information and the stock words in the market, and further determine one or more stocks related to the piece of information, specifically:
traversing and calculating the included-angle cosine distance between the keyword vector of the information and each stock word vector in the market, and multiplying the distance by the corresponding weight to obtain the association degree between the keyword of the information and the corresponding stock word;
after each calculation, comparing the obtained association degree with a preset threshold value, and if the association degree is higher than the preset threshold value, storing the corresponding stock words and the association degree value in a stock dictionary;
and after the traversal calculation is completed, determining one or more stocks related to the information based on one or more stock words stored in the stock dictionary.
8. The method of claim 7, wherein after determining one or more stocks related to the piece of information based on one or more stock words stored in the stock dictionary, the method further comprises:
sorting the stock words in the stock dictionary in descending order of the corresponding association degree values, and combining the n top-ranked stocks into an ordered associated stock array; wherein n is greater than or equal to 1;
and displaying the associated stock array to the user according to a sorting mode so that the user can conveniently perform stock investment based on information association.
9. The method of any one of claims 1-8, wherein the plurality of pieces of information are obtained from one or more information platforms, and after obtaining the plurality of pieces of information, the method further comprises:
respectively analyzing the properties of the plurality of pieces of information, screening out one or more pieces of predictive information, and recording an information platform and a predictive result corresponding to each piece of predictive information;
obtaining a correlation result corresponding to each piece of predictive information through big data crawler analysis and/or access to a national reference information platform;
matching the estimation result of each piece of estimation information with the corresponding correlation result to obtain the accuracy of each piece of estimation information so as to obtain the reliability of the corresponding information platform;
then, when the stocks related to the information are calculated next time, the acquiring of the plurality of pieces of information specifically includes: acquiring the plurality of pieces of information from the information platforms whose reliability is higher than the preset reference value.
10. An apparatus for calculating information related stocks, comprising at least one processor and a memory, wherein the at least one processor and the memory are connected by a data bus, and the memory stores instructions executable by the at least one processor, and the instructions, after being executed by the processor, are used for performing the method for calculating information related stocks according to any one of claims 1 to 9.
CN201911024061.0A 2019-10-25 2019-10-25 Method and device for calculating information-related stock Pending CN110889024A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911024061.0A CN110889024A (en) 2019-10-25 2019-10-25 Method and device for calculating information-related stock

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911024061.0A CN110889024A (en) 2019-10-25 2019-10-25 Method and device for calculating information-related stock

Publications (1)

Publication Number Publication Date
CN110889024A true CN110889024A (en) 2020-03-17

Family

ID=69746482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911024061.0A Pending CN110889024A (en) 2019-10-25 2019-10-25 Method and device for calculating information-related stock

Country Status (1)

Country Link
CN (1) CN110889024A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020084906A (en) * 2000-09-30 2002-11-16 주식회사 그린피스 Stock information service system for supporting the news information of securities market on the internet
CN103226554A (en) * 2012-12-14 2013-07-31 西藏同信证券有限责任公司 Automatic stock matching and classifying method and system based on news data
CN107025264A (en) * 2017-02-13 2017-08-08 闽南师范大学 A kind of automatic share-selecting method based on news big data
CN107885806A (en) * 2017-11-03 2018-04-06 上海宽全智能科技有限公司 Plate intelligence division methods and device, computing device and storage medium
CN108038164A (en) * 2017-12-06 2018-05-15 上海宽全智能科技有限公司 Data correlation method, equipment and storage medium
CN108287819A (en) * 2018-01-12 2018-07-17 深圳市富途网络科技有限公司 A method of realizing that financial and economic news is automatically associated to stock
CN108629693A (en) * 2018-05-08 2018-10-09 平安科技(深圳)有限公司 Automatically generate method, apparatus, computer equipment and the storage medium of suggestion for investment
CN110020104A (en) * 2017-09-05 2019-07-16 腾讯科技(北京)有限公司 News handles method, apparatus, storage medium and computer equipment
TWM584469U (en) * 2019-05-03 2019-10-01 國立臺北商業大學 Financial management news credibility evaluation device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
董守斌等: "《网络信息检索》", 30 April 2010, 西安电子科技大学出版社 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111640025A (en) * 2020-06-09 2020-09-08 国泰君安证券股份有限公司 Method for realizing information labeling processing based on label system
CN112464081A (en) * 2020-09-08 2021-03-09 广东省华南技术转移中心有限公司 Project information matching method, device and storage medium
CN113378555A (en) * 2021-06-22 2021-09-10 富途网络科技(深圳)有限公司 Intelligent association method for individual stock and related product
CN113722432A (en) * 2021-08-26 2021-11-30 杭州隆埠科技有限公司 Method and device for associating news with stocks
CN113722432B (en) * 2021-08-26 2024-01-09 杭州隆埠科技有限公司 Method and device for associating news with stocks

Similar Documents

Publication Publication Date Title
US11334635B2 (en) Domain specific natural language understanding of customer intent in self-help
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN106649818B (en) Application search intention identification method and device, application search method and server
CN111125422B (en) Image classification method, device, electronic equipment and storage medium
CN110889024A (en) Method and device for calculating information-related stock
EP2192500B1 (en) System and method for providing robust topic identification in social indexes
US9589277B2 (en) Search service advertisement selection
CN110704603B (en) Method and device for discovering current hot event through information
US8949227B2 (en) System and method for matching entities and synonym group organizer used therein
Shi et al. Learning-to-rank for real-time high-precision hashtag recommendation for streaming news
CN111797214A (en) FAQ database-based problem screening method and device, computer equipment and medium
TW201839628A (en) Method, system and apparatus for discovering and tracking hot topics from network media data streams
CN114707074B (en) Content recommendation method, device and system
CN109388743B (en) Language model determining method and device
RU2680746C2 (en) Method and device for developing web page quality model
CN110096614B (en) Information recommendation method and device and electronic equipment
CN111274365A (en) Intelligent inquiry method and device based on semantic understanding, storage medium and server
CN110795542A (en) Dialogue method and related device and equipment
CN109299227B (en) Information query method and device based on voice recognition
CN114549874A (en) Training method of multi-target image-text matching model, image-text retrieval method and device
CN113342976A (en) Method, device, storage medium and equipment for automatically acquiring and processing data
CN106407316B (en) Software question and answer recommendation method and device based on topic model
CN110990533A (en) Method and device for determining standard text corresponding to query text
CN108664515A (en) A kind of searching method and device, electronic equipment
CN112860865A (en) Method, device, equipment and storage medium for realizing intelligent question answering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200317