CN105373853A - Stock public opinion index prediction method and device - Google Patents

Info

Publication number: CN105373853A
Application number: CN201510796661.4A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 张立邦
Current Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Prior art keywords: stock, data source, preassigned pattern, weighted value, public sentiment

Abstract

The invention provides a stock public opinion index prediction method and device. The method includes the following steps: obtaining real-time data from different types of data sources, where the different types of data sources include a search engine-based data source, a community/forum-based data source, and a news-based data source; determining the description information, contained in that real-time data, that relates to a stock whose public opinion index is to be predicted; and determining the stock public opinion index of the stock from all the description information contained in the real-time data of the different types of data sources. Because the technical scheme considers the real-time data of the different types of data sources together, a relatively accurate stock public opinion index can be obtained. The index obtained in this way can serve as a reference indicator in quantitative investment, so the method and device are conducive to quantitative investment.

Description

Stock public sentiment index prediction method and device
Technical field
The present invention relates to network technology, and in particular to a stock public sentiment index prediction method and a stock public sentiment index prediction device.
Background
Quantitative investment is attracting growing attention and adoption both in China and abroad. According to statistics, no less than 60% of the trading volume in the US market comes from quantitative trading, and domestic investors have called 2010 the first year of quantitative investment in China.
Quantitative investment does not rely on human intuition to manage assets. Instead, a mathematical model is built from people's investment ideas and experience, and computer equipment validates the model against a large amount of historical stock price and volume data used as the data source. A model that passes validation can then be used in quantitative investment.
In the course of realizing the present invention, the inventor found that stock price and volume data are noisy and fully public. As a result, it is often difficult to build an effective quantitative investment strategy from price and volume data alone, which hinders quantitative investment.
Summary of the invention
The object of the present invention is to provide a stock public sentiment index prediction method and device.
According to one aspect of the present invention, a stock public sentiment index prediction method is provided. The method mainly comprises the following steps: obtaining real-time data from different types of data sources, where the different types of data sources include a data source based on a search engine, a data source based on a community/forum, and a data source based on news; determining the descriptors, contained in the real-time data of the different types of data sources, that relate to the stock for which the stock public sentiment index is to be predicted; and determining the stock public sentiment index of the stock from all the descriptors contained in the real-time data of the different types of data sources.
According to another aspect of the present invention, a stock public sentiment index prediction device is provided. The device mainly comprises: a means for obtaining real-time data from different types of data sources, where the different types of data sources include a data source based on a search engine, a data source based on a community/forum, and a data source based on news; a means for determining the descriptors, contained in the real-time data of the different types of data sources, that relate to the stock for which the stock public sentiment index is to be predicted; and a means for determining the stock public sentiment index of the stock from all the descriptors contained in the real-time data of the different types of data sources.
Compared with the prior art, the present invention has the following advantages. By separately examining the real-time data in the search engine-based data source, the community/forum-based data source, and the news-based data source, the invention can promptly determine each descriptor in the different types of real-time data that relates to the stock for which the index is to be predicted, so that multi-angle data mining yields more descriptors that may affect the stock price. By determining the stock public sentiment index from the descriptors so determined, the predicted index rests on a comprehensive consideration of multiple descriptors, drawn from multiple different types of data sources, that may affect the stock price. The technical scheme therefore considers the real-time data of several different types of data sources together and can obtain a relatively accurate stock public sentiment index. Because the predicted index can serve as an indicator with reference value in quantitative investment, the technical scheme is conducive to quantitative investment.
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a flow chart of the stock public sentiment index prediction method of Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the stock public sentiment index prediction device of Embodiment 2 of the present invention.
In the drawings, the same or similar reference numerals denote the same or similar components.
Detailed description
Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flow charts. Although a flow chart describes the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not shown in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
In this context, "computer equipment", also called a "computer", refers to an intelligent electronic device that can perform predetermined processing, such as numerical and/or logical computation, by running predetermined programs or instructions. It may comprise a processor and a memory, with the processor executing instructions pre-stored in the memory to perform the predetermined processing; or the predetermined processing may be performed by hardware such as an ASIC, FPGA, or DSP; or by a combination of the two. Computer equipment includes but is not limited to servers, personal computers, and notebook computers.
The computer equipment includes user equipment and network equipment. The user equipment includes but is not limited to computers, smartphones, and PDAs. The network equipment includes but is not limited to a single network server, a server group composed of multiple network servers, or a cloud based on cloud computing (Cloud Computing) and composed of a large number of computers or network servers, where cloud computing is a form of distributed computing: a super virtual computer made up of a group of loosely coupled computers. The computer equipment can realize the present invention by running on its own, or by accessing a network and interacting with other computer equipment in the network. The network in which the computer equipment is located includes but is not limited to the Internet, wide area networks, metropolitan area networks, local area networks, and VPNs.
It should be noted that the user equipment, network equipment, and networks above are only examples. Other computer equipment or networks, existing now or appearing in the future, that are applicable to the present invention should also fall within its scope of protection and are incorporated herein by reference.
The methods discussed below (some of which are illustrated by flow charts) may be implemented by hardware, software, firmware, middleware, microcode, a hardware description language, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that carry out the necessary tasks may be stored in a machine-readable or computer-readable medium (such as a storage medium). One or more processors may carry out the necessary tasks.
The specific structural and functional details disclosed herein are merely representative and serve to describe exemplary embodiments of the present invention. The present invention may, however, be embodied in many alternative forms and should not be construed as limited to the embodiments set forth herein.
It should be understood that although the terms "first", "second", and so on may be used here to describe units, the units should not be limited by these terms. The terms are used only to distinguish one unit from another. For example, without departing from the scope of the exemplary embodiments, a first unit could be called a second unit, and similarly a second unit could be called a first unit. The term "and/or" used here includes any and all combinations of one or more of the associated listed items.
It should be understood that when a unit is referred to as being "connected" or "coupled" to another unit, it may be directly connected or coupled to the other unit, or an intermediate unit may be present. In contrast, when a unit is referred to as being "directly connected" or "directly coupled" to another unit, no intermediate unit is present. Other words used to describe the relationship between units should be interpreted in the same way (for example, "between" versus "directly between", and "adjacent to" versus "directly adjacent to").
The terms used here serve only to describe specific embodiments and are not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" used here are intended to include the plural as well. It should also be understood that the terms "comprise" and/or "include" used here specify the presence of the stated features, integers, steps, operations, units, and/or components, and do not exclude the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.
It should also be noted that in some alternative implementations, the functions/actions mentioned may occur in an order different from that indicated in the drawings. For example, depending on the functions/actions involved, two figures shown in succession may in fact be executed substantially simultaneously or may sometimes be executed in the reverse order.
The present invention is described in further detail below with reference to the drawings.
Embodiment 1: stock public sentiment index prediction method.
Fig. 1 is a flow chart of the stock public sentiment index prediction method of this embodiment. The method shown in Fig. 1 mainly comprises step S100, step S110, and step S120. The method of this embodiment is normally executed on computer equipment; preferably, it can be executed on servers, desktop computers, and other network equipment. Each step in Fig. 1 is described below.
S100: obtain real-time data from different types of data sources.
Specifically, the different types of data sources in this embodiment mainly include: a data source based on a search engine (also called search big data, such as Baidu search big data), a data source based on a community/forum (also called community/forum big data, such as Baidu community big data), and a data source based on news (also called news big data, such as Baidu news big data). As a concrete example, in practical applications the different types of data sources include Baidu search big data, Baidu community big data, and Baidu news big data. In addition to these three types, the different types of data sources in this embodiment may also include data sources of other types.
As an example, the real-time data in this embodiment may also be called current data, non-historical data, fresh data, or non-expired data. It typically refers to data generated within a predetermined time window. For example, this embodiment may treat all data from the current day as real-time data, or may treat all data generated between 15:00 on the previous day and 7:00 on the current day as real-time data.
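The second time window mentioned above (15:00 on the previous day to 7:00 on the current day) can be sketched as a simple timestamp check. This is only an illustrative sketch: the function names `realtime_window` and `is_realtime` are hypothetical, and treating both window boundaries as inclusive is an assumption the source does not state.

```python
from datetime import datetime, time, timedelta


def realtime_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end): 15:00 on the previous day to 07:00 on the current day."""
    today = now.date()
    start = datetime.combine(today - timedelta(days=1), time(15, 0))
    end = datetime.combine(today, time(7, 0))
    return start, end


def is_realtime(created_at: datetime, now: datetime) -> bool:
    """True if a record's generation time falls inside the window (inclusive by assumption)."""
    start, end = realtime_window(now)
    return start <= created_at <= end
```

A record generated at 20:00 the evening before would count as real-time data under this window, while one generated at 14:59 the previous afternoon would not.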
S110: determine the descriptors, contained in the real-time data of the different types of data sources, that relate to the stock for which the stock public sentiment index is to be predicted.
Specifically, a descriptor in this embodiment typically refers to a finance-based descriptor related to a stock, that is, a financial term related to a stock, such as being placed under investigation, acquisition, being acquired, or asset restructuring. This embodiment does not limit the specific content of the descriptors.
As an example, this embodiment may preset a descriptor set containing multiple descriptors. The descriptor set and the stock identification information of the stock to be predicted can then be used to filter each piece of real-time data in the different types of data sources, in order to judge whether that piece of real-time data contains both the stock identification information and a descriptor from the descriptor set.
As an example, the stock identification information in this embodiment may be one or more of the stock name, the stock code, and the abbreviation of the stock name. Each piece of stock identification information uniquely identifies one stock, and different pieces of stock identification information identify different stocks. In practical applications, the stock identification information preferably includes the stock name, the stock code, and the abbreviation of the stock name together, so that the real-time data in the data sources can subsequently be filtered and counted thoroughly.
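The filtering described above can be sketched as a substring check against the stock identifiers and the descriptor set. This is a minimal sketch under stated assumptions: plain substring matching stands in for whatever text matching is actually used, the `StockId` class and `match_descriptors` function are hypothetical names, and the descriptor strings are English renderings chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StockId:
    """Stock identification information: name, code, and abbreviated name."""
    name: str
    code: str
    short_name: str

    def aliases(self) -> tuple[str, ...]:
        # Any non-empty identifier may be used to recognise the stock in text.
        return tuple(a for a in (self.name, self.code, self.short_name) if a)


# Illustrative descriptor set (assumed English renderings of the financial terms).
DESCRIPTORS = {"placed under investigation", "asset restructuring", "acquisition", "acquired"}


def match_descriptors(text: str, stock: StockId, descriptors: set[str]) -> set[str]:
    """Return the descriptors found in text, but only if text also mentions the stock."""
    if not any(alias in text for alias in stock.aliases()):
        return set()
    return {d for d in descriptors if d in text}
```

Requiring both a stock identifier and a descriptor in the same piece of real-time data mirrors the two-part judgment in the text; a production system would likely need tokenization and disambiguation that this sketch omits.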
As an example, this embodiment may obtain the stock identification information of the stock to be predicted from externally supplied information, such as information entered from a keyboard or information received over a network.
As an example, this embodiment may also obtain the stock identification information of the stock to be predicted from a locally stored file. As a concrete example, the locally stored file contains the stock identification information of all stocks in the current A-share market, so the identification information of any stock to be predicted can be read from it. When the index must be predicted for every stock in the current A-share market, this embodiment can read the stock identification information from the file one entry at a time and perform the filtering and judging operations of this step for each entry read, thereby predicting the stock public sentiment index for each stock in the A-share market.
This embodiment does not limit how the stock identification information of the stock to be predicted is obtained, nor what the stock identification information contains.
As an example, this embodiment may filter the real-time data of the different types of data sources one after another: first all real-time data in Baidu search big data, then all real-time data in Baidu community big data, and finally all real-time data in Baidu news big data. Alternatively, this embodiment may filter the real-time data of the different types of data sources in parallel: while all real-time data in Baidu search big data is being filtered, all real-time data in Baidu community big data and all real-time data in Baidu news big data can be filtered at the same time.
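The parallel variant can be sketched with a thread pool that filters each data source independently. This is only one possible arrangement, not the method the source prescribes: the function names, the source labels, and the use of keyword substring matching as the filter are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def filter_source(records: list[str], keywords: set[str]) -> list[str]:
    """Keep the records that mention at least one keyword."""
    return [r for r in records if any(k in r for k in keywords)]


def filter_all_sources(sources: dict[str, list[str]], keywords: set[str]) -> dict[str, list[str]]:
    """Filter every data source concurrently, one task per source."""
    with ThreadPoolExecutor(max_workers=len(sources) or 1) as pool:
        futures = {
            name: pool.submit(filter_source, records, keywords)
            for name, records in sources.items()
        }
        # result() blocks until the corresponding source has been filtered.
        return {name: future.result() for name, future in futures.items()}
```

The sequential variant is the same computation with a plain loop over `sources`; the results are identical, only the scheduling differs.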
S120: determine the stock public sentiment index of the stock from the descriptors contained in the real-time data of the different types of data sources.
Specifically, the stock public sentiment index in this embodiment mainly reflects the trend of a stock. That is, it reflects the market's recent attitude toward the stock in terms of polarity (such as the stock price rising or falling) and intensity (such as the momentum behind the rise or fall). As a concrete example, when the maximum value of the index is preset to +1 and the minimum to -1, the closer the index is to +1, the stronger the market's recent bullish sentiment toward the individual stock; the closer it is to -1, the stronger the market's recent bearish sentiment toward the individual stock; and when the index hovers near 0, the market has shown no obvious bullish or bearish bias toward the individual stock recently.
As an example, the stock public sentiment index in this embodiment can be applied in stock-trading software products, so that such a product can use the index to provide users with decision-making reference information for quantitative investment. For instance, the index can be applied in the Baidu Gushitong (stock market) app. This embodiment does not limit the specific applications of the stock public sentiment index.
As an example, this embodiment may introduce preassigned patterns when determining the stock public sentiment index from the descriptors, that is, the index is determined by using preassigned patterns.
The number of preassigned patterns preset in this embodiment is usually more than one, and the preassigned patterns are set on the basis of financial theory. In other words, the preassigned patterns in this embodiment are based on financial terms, such as being placed under investigation, asset restructuring, acquisition, and being acquired. These financial terms may also be called finance-based descriptors. This embodiment does not limit the specific content of the financial terms.
As an example, a preassigned pattern in this embodiment usually comprises two parts. One part states that a specific financial term (that is, a finance-based descriptor) must appear in the real-time data of a given data source. The other part states that the number of times the specific financial term appears in the real-time data of that data source must satisfy a predetermined condition (that is, a frequency statistics requirement). As a concrete example, one preassigned pattern may be described as follows: the financial term "placed under investigation" appears in the real-time data of a specific data source, and the number of times it appears there exceeds the mean of its occurrence counts in that source's historical data over the past 20 days plus 2 standard deviations. This embodiment does not limit the concrete form of a preassigned pattern. In addition, when preassigned patterns are used, the descriptor set in step S110 above is the union of the descriptors involved in all the preassigned patterns.
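The frequency condition in the example pattern (today's count exceeds the 20-day mean plus 2 standard deviations) can be sketched as follows. The function name and parameters are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption; the source does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev


def frequency_condition_met(today_count: int, past_counts: list[int], k: float = 2.0) -> bool:
    """True if today's occurrence count exceeds mean + k * std of the past daily counts.

    past_counts holds one count per historical day (20 days in the text's example).
    """
    if not past_counts:
        return False  # no history: the condition cannot be evaluated
    threshold = mean(past_counts) + k * pstdev(past_counts)
    return today_count > threshold
```

With a history whose mean is 2 and whose standard deviation is 1, the threshold is 4, so a count of 5 satisfies the condition and a count of 4 does not.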
As an example, the preassigned patterns in this embodiment are normally set per data source. That is, each data source may correspond to one preassigned pattern set, and the preassigned pattern sets of different types of data sources differ. In addition, two preassigned patterns located in different preassigned pattern sets may target the same financial term or different financial terms.
As an example, this embodiment may set the preassigned patterns of each data source by performing data mining on the historical data of that data source. As a concrete example, for a first data source, data mining is performed on its historical data (such as the data of the previous half year or of the previous quarter). For instance, the historical data is filtered and counted against all finance-based descriptors in a preset financial term set, in order to determine all finance-based descriptors that recur in the historical data of the first data source for a first stock identification information (such as being placed under investigation, asset restructuring, acquisition, and being acquired). Optionally, this embodiment may generate frequent patterns from all the recurring finance-based descriptors; like a preassigned pattern, a frequent pattern comprises two parts (as described above for preassigned patterns, not repeated here). Next, for each recurring finance-based descriptor found by data mining, this embodiment obtains the stock price information for the corresponding historical period (such as the several days or few dozen days following the latest occurrence of that descriptor) and judges from it whether the stock price fluctuated during that period and whether the fluctuation meets a predetermined requirement. If the stock price fluctuated and the fluctuation meets the predetermined requirement (for example, the fluctuation amplitude exceeds a certain threshold), this embodiment can use the corresponding recurring finance-based descriptor to generate a preassigned pattern. In addition, when frequent patterns have optionally been generated as described above, this embodiment need not perform the pattern generation operation again and can directly use the frequent pattern as the preassigned pattern. The description above covers only the first data source and the first stock identification information; this embodiment can likewise generate preassigned patterns from the first data source and other stock identification information, or from other data sources and each stock identification information, which is not described in detail here.
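The price-fluctuation check used when mining preassigned patterns can be sketched as below. This is a simplified illustration, not the source's procedure: measuring fluctuation as the largest relative move from the first price in the window, the 5% default threshold, and the function names are all assumptions.

```python
def max_relative_move(prices: list[float]) -> float:
    """Largest absolute relative change from the first price in the window.

    Assumes at least two prices and a positive first price.
    """
    base = prices[0]
    return max(abs(p - base) / base for p in prices[1:])


def select_pattern_descriptors(
    recurring: dict[str, list[float]], threshold: float = 0.05
) -> list[str]:
    """Pick recurring descriptors whose follow-up price window moved more than threshold.

    recurring maps each descriptor to the stock prices observed in the historical
    period after that descriptor's latest occurrence.
    """
    return [
        descriptor
        for descriptor, prices in recurring.items()
        if len(prices) >= 2 and max_relative_move(prices) > threshold
    ]
```

Descriptors that pass this check would then be turned into preassigned patterns by attaching a frequency condition; descriptors followed by no meaningful price movement are discarded.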
As an example, determining the preassigned patterns that exist for a stock in a data source of a given type may proceed as follows. For a first data source, when a piece of real-time data is judged to contain a specific financial term from one or more of the preassigned patterns corresponding to the first data source, and that term in the real-time data refers to the stock identification information of the stock to be predicted, the corresponding statistical operation is performed for the term, and the result is checked against the predetermined condition of each preassigned pattern corresponding to the first data source. If the result satisfies the predetermined condition of a preassigned pattern corresponding to the first data source, that pattern is determined to exist in the first data source. If the result satisfies none of the predetermined conditions, none of the preassigned patterns is determined to exist in the first data source. The filtering then continues with the next piece of real-time data in the first data source. This is repeated until the judging and statistical operations have been carried out for every piece of real-time data in every data source, at which point the preassigned patterns that exist in the different types of data sources for the stock identification information to be predicted have been determined.
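The detection procedure for a single data source can be sketched as a count-then-check loop. This is a minimal sketch under assumptions: the `Pattern` class and `detect_patterns` function are hypothetical, occurrences are counted as the number of records containing the term, and substring matching is used to decide whether a record concerns the stock.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Pattern:
    name: str                          # hypothetical label, e.g. "B"
    term: str                          # the specific financial term
    condition: Callable[[int], bool]   # the frequency requirement on the term's count


def detect_patterns(
    records: list[str], stock_aliases: list[str], patterns: list[Pattern]
) -> list[str]:
    """Return the names of the patterns that exist in one data source for one stock."""
    terms = {p.term for p in patterns}
    counts = Counter()
    for text in records:
        # Only records that mention the stock contribute to the statistics.
        if not any(alias in text for alias in stock_aliases):
            continue
        for term in terms:
            if term in text:
                counts[term] += 1
    # A pattern exists when its term appeared and its frequency condition holds.
    return [p.name for p in patterns if counts[p.term] > 0 and p.condition(counts[p.term])]
```

Running this once per data source, with that source's own pattern set, yields the per-source lists of detected patterns that the index calculation consumes.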
As an example, each data source in this embodiment is assigned a weight in advance, and every preassigned pattern corresponding to each data source is also assigned a weight in advance. Both the weights of the data sources and the weights of their preassigned patterns can be adjusted dynamically. This embodiment can calculate the stock public sentiment index from the weights of the different types of data sources and the weights of the preassigned patterns that appear in them, so that the index rests on a comprehensive consideration of multiple data sources and multiple pieces of information that may affect the stock price.
As an example, one specific way of calculating the stock public sentiment index for a stock identification information is as follows: multiply the weight of each preassigned pattern that exists in the different types of data sources by the weight of the corresponding data source, sum the products, and take the sum as the stock public sentiment index of the stock identification information.
As a more specific example, suppose the weight of Baidu search big data is 0.5, the weight of Baidu community big data is 0.3, and the weight of Baidu news big data is 0.2. Suppose the preassigned patterns corresponding to Baidu search big data are pattern A with weight +1, pattern B with weight +2, and pattern C with weight -3; the preassigned patterns corresponding to Baidu community big data are pattern D with weight +3, pattern E with weight +1, and pattern F with weight -2; and the preassigned patterns corresponding to Baidu news big data are pattern G with weight +1, pattern H with weight +4, and pattern I with weight -2. If, for a first stock identification information, the steps above determine that pattern B exists in Baidu search big data, pattern E exists in Baidu community big data, and pattern H exists in Baidu news big data, then the stock public sentiment index predicted for the first stock identification information is: 0.5 × 2 + 0.3 × 1 + 0.2 × 4 = 2.1.
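The worked example above can be reproduced with a short weighted sum. The weights are those given in the text; the dictionary keys ("search", "community", "news") and the function name `raw_sentiment_index` are hypothetical labels introduced for this sketch.

```python
# Data source weights and per-source pattern weights, as given in the example.
SOURCE_WEIGHTS = {"search": 0.5, "community": 0.3, "news": 0.2}
PATTERN_WEIGHTS = {
    "search": {"A": 1, "B": 2, "C": -3},
    "community": {"D": 3, "E": 1, "F": -2},
    "news": {"G": 1, "H": 4, "I": -2},
}


def raw_sentiment_index(detected: dict[str, list[str]]) -> float:
    """Sum of (data source weight x pattern weight) over all detected patterns."""
    return sum(
        SOURCE_WEIGHTS[source] * PATTERN_WEIGHTS[source][pattern]
        for source, patterns in detected.items()
        for pattern in patterns
    )


# Pattern B in search, E in community, H in news, as in the example.
score = raw_sentiment_index({"search": ["B"], "community": ["E"], "news": ["H"]})
print(round(score, 2))  # 2.1
```

Keeping the weights in plain dictionaries reflects the statement that both kinds of weight can be adjusted dynamically: updating an entry changes subsequent index values without touching the calculation.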
Exemplarily, the calculating of the present embodiment determines that another detailed process of the stock public sentiment index of stock identification information can be: superpose after being multiplied with the weighted value of corresponding data source by the weighted value of each preassigned pattern existing in dissimilar data source, the numerical value that superposition obtains is carried out mapping process, the numerical value obtained to make superposition is converted into predetermined interval (as [-1,1] numerical value), thus can will map the stock public sentiment index of the numerical value after process as stock identification information.The numerical value mapped after process definitely can show market in the recent period to the attitude that polarity and the intensity of stock are held.The present embodiment can adopt various ways to carry out mapping process to the numerical value that superposition obtains, and the present embodiment does not limit the specific implementation mapping process.
As a more specific example, suppose the weight of the Baidu search big data is 0.5, the weight of the Baidu community big data is 0.3, and the weight of the Baidu news big data is 0.2. Suppose the predetermined patterns corresponding to the Baidu search big data are pattern A with weight +1, pattern B with weight +2, and pattern C with weight -3; the predetermined patterns corresponding to the Baidu community big data are pattern D with weight +3, pattern E with weight +1, and pattern F with weight -2; and the predetermined patterns corresponding to the Baidu news big data are pattern G with weight +1, pattern H with weight +4, and pattern I with weight -2. If, for the first stock identification information, step S110 above determines that pattern B exists in the Baidu search big data, pattern E exists in the Baidu community big data, and pattern H exists in the Baidu news big data, then the superposed value in this embodiment is: 0.5 × 2 + 0.3 × 1 + 0.2 × 4 = 2.1. The superposed value is then mapped into the interval [-1, 1], and the stock public opinion index predicted for the first stock identification information after the mapping is: 0.6.
It should be noted that the above only illustrates two specific implementations by which this embodiment calculates the stock public opinion index from the corresponding weights. This embodiment may also use other calculation methods to determine the stock public opinion index of the stock identification information from the weights of the different types of data sources and the weights of the predetermined patterns existing in them; this embodiment does not limit the specific implementation of calculating the stock public opinion index from the corresponding weights.
The manner of setting the weight of each data source and the weight of each predetermined pattern in this embodiment is described below.
One specific example of setting a weight for each data source in advance is as follows. An initial weight is first set for each data source; for example, the initial weight of the Baidu search big data is set to 0.5, that of the Baidu community big data to 0.3, and that of the Baidu news big data to 0.2. Historical data of each data source (such as data from the previous half year or the previous quarter) is then obtained and used to predict the stock public opinion indexes of multiple pieces of stock identification information (such as all stock identification information in the current A-share market). Because the data used is historical, the actual price of each stock in the corresponding historical period is available and can be used to check the accuracy of the predicted stock public opinion indexes. For example, a predetermined algorithm (such as a neural network algorithm) learns from the predicted stock public opinion indexes of all the stock identification information and the actual prices of the corresponding stocks, so as to finally determine the current weights of the Baidu search big data, the Baidu community big data, and the Baidu news big data (for example, raising the current weight of a data source that reflects actual stock prices well and lowering the current weight of a data source that reflects them poorly).
One specific example of setting a weight for each predetermined pattern in each data source in advance is as follows. While setting the predetermined patterns for each data source as described above, this embodiment may also set weights for the predetermined patterns of the different data sources, that is, assign a weight to a predetermined pattern according to the amplitude of the stock price fluctuation observed when the pattern is set. For example, the amplitude of the stock price fluctuation is evaluated: when the fluctuation is a rise, the weight of the predetermined pattern is set to a positive value, and the larger the rise, the larger the weight; when the fluctuation is a fall, the weight is set to a negative value, and the larger the fall, the smaller the weight. Because the same predetermined pattern may be assigned different weights in different weight-setting passes, this embodiment may determine a single weight for the pattern by, for example, averaging those weights. Historical data of each data source (such as data from the previous half year or the previous quarter) is then obtained and used to predict the stock public opinion indexes of multiple pieces of stock identification information (such as all stock identification information in the current A-share market). Because the data used is historical, the actual price of each stock in the corresponding historical period is available and can be used to check the accuracy of the predicted stock public opinion indexes. For example, a predetermined algorithm (such as a neural network algorithm) learns from the predicted stock public opinion indexes of all the stock identification information and the actual prices of the corresponding stocks, so as to finally determine the current weight of each predetermined pattern (for example, raising the current weight of a predetermined pattern that reflects actual stock prices well and lowering the current weight of a predetermined pattern that reflects them poorly).
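The initial pattern-weight step (a signed weight derived from the fluctuation amplitude, then averaged over several weight-setting passes) can be sketched as follows. Using the fractional price change itself as the weight is an assumption chosen for simplicity; the embodiment only requires that rises give positive weights that grow with the rise and falls give negative weights that shrink with the fall. The function names are hypothetical.

```python
from statistics import mean

def weight_from_fluctuation(price_change):
    """Return a signed weight for one observed price fluctuation.

    Assumption: the fractional price change is used directly, so a rise
    gives a positive weight, a fall gives a negative weight, and a larger
    amplitude gives a larger magnitude.
    """
    return price_change

def initial_pattern_weight(price_changes):
    """Average the weights obtained in several weight-setting passes."""
    return mean(weight_from_fluctuation(c) for c in price_changes)

# Hypothetical observations: the same pattern preceded three different rises.
print(initial_pattern_weight([0.02, 0.04, 0.03]) > 0)
```

The learned adjustment that follows (for example, by a neural network) is not shown here.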
After a weight has been set in advance for each data source and for each predetermined pattern in each data source, and after the stock public opinion index prediction method has been put into actual use, this embodiment may still periodically use the above learning process to dynamically adjust the weight of each data source and the weight of each predetermined pattern (for example, every day a neural network algorithm learns from the stock public opinion indexes of all stock identification information and the actual prices of the corresponding stocks), so as to keep improving the prediction accuracy of the stock public opinion index. A specific example of this periodic adjustment is: obtain the stock public opinion indexes that this embodiment predicted yesterday for all stocks and today's actual prices of all stocks, and use a predetermined algorithm (such as a neural network algorithm) to learn from them in order to adjust the weight of each data source and the weight of each predetermined pattern. This embodiment does not limit the specific implementation of the learning performed with the predetermined algorithm (such as a neural network algorithm).
Embodiment 2: stock public opinion index prediction device.
The stock public opinion index prediction device of this embodiment can generally be arranged in a computer device; preferably, it can be arranged in a server, a desktop computer, or another network device. The main structure of the stock public opinion index prediction device of this embodiment is shown in Fig. 2.
The stock public opinion index prediction device is described below with reference to a specific embodiment.
In Fig. 2, the stock public opinion index prediction device of this embodiment mainly comprises: a device for obtaining real-time data in different types of data sources (hereinafter "real-time data acquisition device 200"); a device for determining the description information, contained in the real-time data in the different types of data sources, that relates to a stock requiring stock public opinion index prediction (hereinafter "description information determining device 210"); and a device for determining the stock public opinion index of the stock according to all the description information contained in the real-time data in the different types of data sources (hereinafter "stock public opinion index determining device 220").
The real-time data acquisition device 200 is mainly used to obtain real-time data in different types of data sources.
Specifically, the different types of data sources in this embodiment mainly include: a search-engine-based data source (also called search big data, such as Baidu search big data), a community/forum-based data source (also called community/forum big data, such as Baidu community big data), and a news-based data source (also called news big data, such as Baidu news big data). As a specific example, in practical applications the different types of data sources include Baidu search big data, Baidu community big data, and Baidu news big data. In addition to these three types, the different types of data sources in this embodiment may also include data sources of other types.
By way of example, the real-time data obtained by the real-time data acquisition device 200 may also be called current data, non-historical data, fresh data, or unexpired data. It generally refers to data whose generation time falls within a predetermined time period. For example, the real-time data acquisition device 200 may obtain all data of the current day as real-time data; as another example, it may obtain as real-time data all data generated between 15:00 on the previous day and 7:00 on the current day.
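The second time window mentioned above (15:00 on the previous day to 7:00 on the current day) can be expressed as a simple timestamp filter. This is a minimal sketch; the function name `in_realtime_window` and the choice of inclusive bounds are assumptions.

```python
from datetime import datetime, time, timedelta

def in_realtime_window(generated_at, now):
    """Return True if a record was generated between 15:00 on the day
    before `now` and 07:00 on the day of `now` (both bounds inclusive)."""
    start = datetime.combine(now.date() - timedelta(days=1), time(15, 0))
    end = datetime.combine(now.date(), time(7, 0))
    return start <= generated_at <= end

now = datetime(2015, 11, 18, 8, 0)  # hypothetical current time
print(in_realtime_window(datetime(2015, 11, 17, 16, 0), now))
```

A record generated at 14:00 on the previous day, or at 07:30 on the current day, would fall outside this window.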
The description information determining device 210 is mainly used to determine the description information, contained in the real-time data in the different types of data sources, that relates to the stock requiring stock public opinion index prediction.
Specifically, the description information in this embodiment generally refers to finance-based description information related to a stock, that is, financial terms related to the stock, such as being placed under investigation, acquisition, being acquired, and asset restructuring. This embodiment does not limit the specific content of the description information.
By way of example, this embodiment may preset a description information set containing multiple pieces of description information. The description information determining device 210 may use this set, together with the stock identification information of the stock requiring stock public opinion index prediction, to filter each piece of real-time data in the different types of data sources, so as to judge whether each piece of real-time data contains both the stock identification information and description information from the set.
By way of example, the stock identification information used by the description information determining device 210 may specifically be one, two, or all of the stock name, the stock code, and the abbreviation of the stock name. One piece of stock identification information identifies exactly one stock, and different pieces of stock identification information identify different stocks. In practical applications, the stock identification information used by the description information determining device 210 preferably includes the stock name, the stock code, and the abbreviation of the stock name, so that the real-time data in the data sources can subsequently be filtered and counted thoroughly.
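The filtering step can be sketched as a substring check: a piece of real-time data is kept only if it mentions the stock (by name, code, or abbreviation) and at least one item from the description information set. The term list, the stock identifiers, and the function name `matched_terms` below are hypothetical, and plain substring matching is an assumption; the embodiment does not prescribe a matching technique.

```python
# Hypothetical description information set (finance-based terms).
DESCRIPTION_TERMS = {"placed under investigation", "acquisition", "asset restructuring"}

def matched_terms(text, stock_ids, terms=DESCRIPTION_TERMS):
    """Return the description terms found in `text`.

    An empty set is returned when the text mentions none of the stock
    identifiers, because such a record is irrelevant to the stock.
    """
    if not any(stock_id in text for stock_id in stock_ids):
        return set()
    return {term for term in terms if term in text}

# Hypothetical stock name, stock code and abbreviation.
stock_ids = {"ExampleCo Holdings", "600000", "ExampleCo"}
record = "ExampleCo (600000) announced an asset restructuring plan."
print(matched_terms(record, stock_ids))
```

Using all three identifiers increases the chance that a relevant record is caught, which is why the full combination is preferred.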
By way of example, the description information determining device 210 may obtain the stock identification information requiring stock public opinion index prediction from externally input information, for example from information entered via a keyboard or, as another example, from information transmitted over a network.
By way of example, the description information determining device 210 may also obtain the stock identification information requiring stock public opinion index prediction from a locally stored file. As a specific example, the locally stored file contains the stock identification information of all stocks in the current A-share market, so the description information determining device 210 can obtain from this file any stock identification information requiring prediction. When the stock public opinion index prediction device of this embodiment needs to predict the stock public opinion index for all stocks in the current A-share market, the description information determining device 210 can read the stock identification information from the file one item at a time and perform the filtering, judging, and other operations for each item read, thereby predicting the stock public opinion index for every stock in the current A-share market.
This embodiment does not limit the specific way in which the description information determining device 210 obtains the stock identification information requiring stock public opinion index prediction, nor the specific content of the stock identification information.
By way of example, the description information determining device 210 may filter the real-time data in the different types of data sources one source after another: it first filters all the real-time data in the Baidu search big data, then all the real-time data in the Baidu community big data, and finally all the real-time data in the Baidu news big data. Alternatively, the description information determining device 210 may filter the real-time data in the different types of data sources in parallel; for example, while filtering all the real-time data in the Baidu search big data, it may at the same time filter all the real-time data in the Baidu community big data and all the real-time data in the Baidu news big data.
The stock public opinion index determining device 220 is mainly used to determine the stock public opinion index of the stock according to all the description information contained in the real-time data in the different types of data sources.
Specifically, the stock public opinion index determined by the stock public opinion index determining device 220 is mainly used to reflect the trend of a stock; that is, it reflects the polarity (such as a rise or fall in the stock price) and intensity (such as the momentum supporting the rise or fall) of the market's recent attitude toward the stock. As a specific example, when the maximum of the stock public opinion index is preset to +1 and its minimum to -1, the closer the index is to +1, the stronger the market's recent bullish sentiment toward the individual stock; the closer the index is to -1, the weaker the market's recent bullish sentiment toward the individual stock; and when the index hovers around 0, the market has shown no obvious bullish or bearish bias toward the individual stock recently.
By way of example, the stock public opinion index determined by the stock public opinion index determining device 220 may be applied in stock trading software products, so that such a product can use the index to provide users with reference information for quantitative investment decisions; for example, the index may be applied in Baidu's stock market app. This embodiment does not limit the specific application of the stock public opinion index.
By way of example, the stock public opinion index determining device 220 may introduce predetermined patterns when determining the stock public opinion index of a stock from the description information; that is, the stock public opinion index determining device 220 determines the stock public opinion index of the stock by using predetermined patterns.
Multiple predetermined patterns are generally preset in the stock public opinion index prediction device, and they are set on the basis of financial theory; that is, the predetermined patterns in this embodiment are based on financial terms, such as being placed under investigation, asset restructuring, acquisition, and being acquired. These financial terms may also be called finance-based description information. This embodiment does not limit the specific content of the financial terms.
By way of example, a predetermined pattern in the stock public opinion index prediction device generally comprises two parts. One part states that a specific financial term (that is, finance-based description information) must appear in the real-time data of a specific data source; the other part states that the number of times the specific financial term appears in the real-time data of that data source must satisfy a predetermined condition. As a specific example, a predetermined pattern in the stock public opinion index prediction device may be described as: the financial term "placed under investigation" appears in the real-time data of a specific data source, and the number of times it appears there exceeds the mean of its occurrence counts in the historical data of that data source over the past 20 days plus 2 standard deviations. This embodiment does not limit the specific form of a predetermined pattern.
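The two-part pattern in this example (the term appears, and its count exceeds the 20-day mean plus 2 standard deviations) can be checked with a short Python sketch. The function name `pattern_triggered` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the embodiment does not say which estimator is meant.

```python
from statistics import mean, pstdev

def pattern_triggered(today_count, past_counts, k=2.0):
    """Check both parts of the example pattern.

    Part 1: the financial term appears at least once today.
    Part 2: today's count exceeds mean + k * standard deviation of the
    daily counts over the look-back window (e.g. the past 20 days).
    """
    if today_count == 0 or not past_counts:
        return False
    threshold = mean(past_counts) + k * pstdev(past_counts)
    return today_count > threshold

# Hypothetical 20-day history: mean 2, population standard deviation 1.
history = [1] * 10 + [3] * 10
print(pattern_triggered(5, history))
```

With this history the threshold is 2 + 2 × 1 = 4, so a count of 5 triggers the pattern and a count of 4 does not.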
By way of example, the predetermined patterns in the stock public opinion index prediction device are generally set for specific data sources; that is, each data source in the stock public opinion index prediction device corresponds to one predetermined pattern set, and the predetermined pattern sets corresponding to different types of data sources are not identical. In addition, the financial terms targeted by two predetermined patterns that belong to different predetermined pattern sets may be the same or may be different.
By way of example, the stock public opinion index prediction device of this embodiment may optionally comprise a device for setting each predetermined pattern by performing data mining on the historical data in each data source (hereinafter "predetermined pattern setting device", not shown in Fig. 2); that is, the predetermined pattern setting device is mainly used to set, for each data source, its own predetermined patterns by performing data mining on the historical data in that data source. Optionally, the predetermined pattern setting device may specifically comprise: a device for performing data mining on the historical data in each of the different types of data sources to determine the finance-based description information that occurs repeatedly for a stock (hereinafter "data mining device", not shown in Fig. 2); a device for determining the stock price fluctuation according to the price of the corresponding stock in the corresponding historical period (hereinafter "stock price fluctuation determining device", not shown in Fig. 2); and a device for generating a predetermined pattern from the repeatedly occurring finance-based description information when the stock price fluctuation meets a predetermined requirement (hereinafter "predetermined pattern generating device", not shown in Fig. 2).
As a specific example, for the first data source, the predetermined pattern setting device (such as the data mining device) performs data mining on the historical data in the first data source (such as data from the previous half year or the previous quarter). For example, it filters and counts the historical data according to all the finance-based description information in a preset financial term set, so as to determine all the finance-based description information that occurs repeatedly for the first stock identification information in the historical data of the first data source (such as being placed under investigation, asset restructuring, acquisition, and being acquired). Optionally, the predetermined pattern setting device (such as the predetermined pattern generating device) may generate frequent patterns from all the repeatedly occurring finance-based description information; like a predetermined pattern, a frequent pattern so generated comprises two parts (see the description of predetermined patterns above, which is not repeated here). Afterwards, for each piece of repeatedly occurring finance-based description information found by data mining, the predetermined pattern setting device (such as the stock price fluctuation determining device) obtains the stock price information of the corresponding historical period (such as the several days or tens of days following the latest occurrence of that description information), and judges from it whether the stock price fluctuated in that period and whether the fluctuation meets a predetermined requirement. If the stock price fluctuated and the fluctuation meets the predetermined requirement (for example, the fluctuation amplitude exceeds a certain threshold), the predetermined pattern setting device (such as the predetermined pattern generating device) may use the corresponding repeatedly occurring finance-based description information to generate a corresponding predetermined pattern. In addition, if frequent patterns were generated in the optional step above, the predetermined pattern setting device (such as the predetermined pattern generating device) need not perform this pattern generation operation again and may instead use the frequent pattern directly as the predetermined pattern. The above is described only for the first data source and the first stock identification information; the predetermined pattern setting device may likewise use the first data source with other stock identification information, or other data sources with each piece of stock identification information, to generate predetermined patterns, which is not described one by one here.
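The core decision in this mining procedure (keep a repeatedly occurring term as a predetermined pattern only if the subsequent price fluctuation exceeds a threshold) can be sketched as follows. The function name `generate_patterns`, the dictionary-based inputs, the 5% threshold, and the recorded `direction` field are all illustrative assumptions.

```python
def generate_patterns(recurring_terms, price_change_after, threshold=0.05):
    """Turn recurring financial terms into candidate predetermined patterns.

    `price_change_after` maps each term to the fractional price change
    observed in the days after its latest occurrence. A pattern is generated
    only when the absolute change exceeds `threshold` (hypothetical value).
    """
    patterns = []
    for term in recurring_terms:
        change = price_change_after.get(term)
        if change is not None and abs(change) > threshold:
            direction = "up" if change > 0 else "down"
            patterns.append({"term": term, "direction": direction})
    return patterns

# Hypothetical mining result for one stock in one data source.
terms = ["asset restructuring", "acquisition"]
changes = {"asset restructuring": 0.08, "acquisition": 0.01}
print(generate_patterns(terms, changes))
```

Here only "asset restructuring" yields a pattern, because the 1% move that followed "acquisition" is below the threshold.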
By way of example, the stock public opinion index determining device 220 may optionally comprise: a device for performing frequency statistics on all the description information contained in the real-time data in the different types of data sources (hereinafter "frequency statistics device", not shown in Fig. 2); a device for, when the result of the frequency statistics is determined to satisfy the frequency statistics requirement of a predetermined pattern, determining that predetermined pattern to be a predetermined pattern that exists for the stock in the data source of the corresponding type (hereinafter "frequency statistics judging device", not shown in Fig. 2); and a device for determining the stock public opinion index of the stock according to each predetermined pattern that exists for the stock in the different types of data sources (hereinafter "first index determining device", not shown in Fig. 2). The first index determining device may optionally comprise a device for calculating the stock public opinion index of the stock from the weights of the different types of data sources and the weights of the predetermined patterns existing in them (hereinafter "second index determining device", not shown in Fig. 2).
By way of example, the stock public opinion index determining device 220 may determine the predetermined patterns that exist for a stock in the data source of the corresponding type as follows. For the first data source, when the frequency statistics device judges that a piece of real-time data contains the specific financial term of one or more predetermined patterns corresponding to the first data source, and that the term relates to the stock identification information requiring stock public opinion index prediction, the frequency statistics device performs the corresponding counting operation for the specific financial term contained in that piece of real-time data. The frequency statistics judging device then judges whether the result of the counting operation satisfies the predetermined condition of each predetermined pattern corresponding to the first data source. If the result satisfies the predetermined condition of a predetermined pattern corresponding to the first data source, the frequency statistics judging device determines that pattern to be a predetermined pattern existing in the first data source; if the result satisfies none of the predetermined conditions, the frequency statistics judging device determines none of those patterns to be a predetermined pattern existing in the first data source. Afterwards, the frequency statistics device continues by filtering the next piece of real-time data in the first data source. This continues until the frequency statistics device and the frequency statistics judging device have performed the above judging and counting operations for each piece of real-time data in each data source, at which point every predetermined pattern that exists in the different types of data sources for the stock identification information requiring stock public opinion index prediction has been determined.
By way of example, a weight is preset for each data source in this embodiment and for every predetermined pattern corresponding to each data source, and all of these weights can be adjusted dynamically. The first index determining device (such as the second index determining device within it) may calculate the stock public opinion index from the weights of the different types of data sources and the weights of the predetermined patterns that appear in them, so that the stock public opinion index produced by the stock public opinion index prediction device of this embodiment rests on a comprehensive consideration of multiple data sources and of the various kinds of information that may affect the stock price.
By way of example, the second index determining device may optionally comprise: a device for multiplying the weight of each predetermined pattern existing in the different types of data sources by the weight of the corresponding data source and superposing the products (hereinafter "superposition device", not shown in Fig. 2); and a device for using the superposed value as the stock public opinion index of the stock identification information (hereinafter "first index determining device", not shown in Fig. 2). Alternatively, the second index determining device may comprise: the superposition device described above (not shown in Fig. 2); a device for mapping the superposed value into a predetermined interval (hereinafter "mapping device", not shown in Fig. 2); and a device for using the mapped value as the stock public opinion index of the stock identification information (hereinafter "third index determining device", not shown in Fig. 2).
By way of example, one specific process by which the second index determining device calculates the stock public opinion index of the stock identification information may be: the second index determining device (such as the superposition device) multiplies the weight of each predetermined pattern existing in the different types of data sources by the weight of the corresponding data source and superposes the products, and the second index determining device (such as the first index determining device) then uses the superposed value as the stock public opinion index of the stock identification information.
As a more specific example, suppose the weight of the Baidu search big data is 0.5, the weight of the Baidu community big data is 0.3, and the weight of the Baidu news big data is 0.2. Suppose the predetermined patterns corresponding to the Baidu search big data are pattern A with weight +1, pattern B with weight +2, and pattern C with weight -3; the predetermined patterns corresponding to the Baidu community big data are pattern D with weight +3, pattern E with weight +1, and pattern F with weight -2; and the predetermined patterns corresponding to the Baidu news big data are pattern G with weight +1, pattern H with weight +4, and pattern I with weight -2. If, for the first stock identification information, the frequency statistics judging device determines that pattern B exists in the Baidu search big data, pattern E exists in the Baidu community big data, and pattern H exists in the Baidu news big data, then the stock public opinion index predicted by the first index determining device for the first stock identification information is: 0.5 × 2 + 0.3 × 1 + 0.2 × 4 = 2.1.
By way of example, another specific process by which the second index determining device calculates the stock public sentiment index of the stock identification information may be as follows: the second index determining device (e.g., the superposition device) multiplies the weighted value of each preassigned pattern existing in the different types of data sources by the weighted value of the corresponding data source and sums (superposes) the products; the second index determining device (e.g., the mapping device) then applies a mapping process to the superposed value so that it is converted into a value within a predetermined interval (such as [-1, 1]); the second index determining device (e.g., the third index determining device) can then take the mapped value as the stock public sentiment index of the stock identification information. The mapped value can clearly express the polarity and the intensity of the market's recent attitude toward the stock. The second index determining device (e.g., the mapping device) may map the superposed value in various ways, and the present embodiment does not limit the specific implementation of the mapping process.
As a more specific example, suppose the weighted value of Baidu search big data is set to 0.5, the weighted value of Baidu community big data to 0.3, and the weighted value of Baidu news big data to 0.2. Suppose further that the preassigned patterns corresponding to Baidu search big data comprise preassigned pattern A with weighted value +1, preassigned pattern B with weighted value +2, and preassigned pattern C with weighted value -3; that the preassigned patterns corresponding to Baidu community big data comprise preassigned pattern D with weighted value +3, preassigned pattern E with weighted value +1, and preassigned pattern F with weighted value -2; and that the preassigned patterns corresponding to Baidu news big data comprise preassigned pattern G with weighted value +1, preassigned pattern H with weighted value +4, and preassigned pattern I with weighted value -2. If, for the first stock identification information, the frequency statistics judgment device determines that the preassigned pattern existing in Baidu search big data is preassigned pattern B, the preassigned pattern existing in Baidu community big data is preassigned pattern E, and the preassigned pattern existing in Baidu news big data is preassigned pattern H, then the value obtained by the second index determining device (e.g., the superposition device) after superposition is: 0.5 × 2 + 0.3 × 1 + 0.2 × 4 = 2.1. The second index determining device (e.g., the mapping device) maps the superposed value to the interval [-1, 1], and after the mapping process the stock public sentiment index obtained by the second index determining device (e.g., the third index determining device) for the first stock identification information is: 0.6.
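The embodiment leaves the mapping function open, so the following Python sketch shows just one possible choice and makes no attempt to reproduce any particular mapped value from the text. It assumes a linear scaling by the largest magnitude the superposed value can reach when at most one preassigned pattern is matched per data source, followed by clipping to [-1, 1]. The function names `max_magnitude` and `map_to_unit_interval` and the short source names are hypothetical.

```python
# Weighted values from the example; the key names are hypothetical labels.
SOURCE_WEIGHTS = {"search": 0.5, "community": 0.3, "news": 0.2}
PATTERN_WEIGHTS = {
    "search": {"A": 1, "B": 2, "C": -3},
    "community": {"D": 3, "E": 1, "F": -2},
    "news": {"G": 1, "H": 4, "I": -2},
}


def max_magnitude():
    """Largest |superposed value| if one pattern is matched per source."""
    return sum(
        weight * max(abs(p) for p in PATTERN_WEIGHTS[source].values())
        for source, weight in SOURCE_WEIGHTS.items()
    )


def map_to_unit_interval(value, bound):
    """Scale `value` by `bound` and clip the result to [-1, 1].

    This linear-scale-and-clip rule is an assumption of this sketch,
    not the mapping used in the embodiment.
    """
    if bound <= 0:
        raise ValueError("bound must be positive")
    return max(-1.0, min(1.0, value / bound))


bound = max_magnitude()                      # 0.5*3 + 0.3*3 + 0.2*4
mapped = map_to_unit_interval(2.1, bound)    # 2.1 is the superposed value
print(round(bound, 2), round(mapped, 5))
```

The sign of the mapped value keeps the polarity of the superposed value and its magnitude keeps the relative intensity; a smooth function such as tanh would be another option with the same two properties.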
It should be noted that the above illustrates only two specific implementations by which the second index determining device calculates the stock public sentiment index according to the corresponding weighted values. The second index determining device may also use other calculation methods to determine the stock public sentiment index corresponding to the stock identification information according to the weighted values of the different types of data sources and the weighted values of the preassigned patterns existing in those data sources. The present embodiment does not limit the specific implementation by which the second index determining device calculates the stock public sentiment index according to the corresponding weighted values.
The manner in which the weighted value of each data source and the weighted value of each preassigned pattern are set in the stock public sentiment index prediction device is illustrated below.
By way of example, the preassigned pattern setting device in the stock public sentiment index prediction device may optionally comprise: a device for setting the weighted value of a preassigned pattern according to the amplitude of the stock price fluctuation (hereinafter "weighted value setting device", not shown in Fig. 2). The weighted value setting device may optionally comprise: a device for judging the amplitude of the stock price fluctuation (hereinafter "amplitude judgment device", not shown in Fig. 2); a device for setting the weighted value of the preassigned pattern to a positive weighted value when the amplitude of the stock price fluctuation is a rise (hereinafter "first setting device", not shown in Fig. 2); and a device for setting the weighted value of the preassigned pattern to a negative weighted value when the amplitude of the stock price fluctuation is a fall (hereinafter "second setting device", not shown in Fig. 2).
By way of example, the stock public sentiment index prediction device may further comprise: a device for learning, by means of a predetermined algorithm, from the stock public sentiment index of the stock identification information and the stock price corresponding to the stock identification information, so as to adjust the weighted values of the different types of data sources and/or the weighted values of the preassigned patterns (hereinafter "learning device", not shown in Fig. 2).
A specific example of the stock public sentiment index prediction device setting a weighted value for each data source in advance is as follows. The weighted value setting device first sets an initial weighted value for each data source; for example, it sets the initial weighted value of Baidu search big data to 0.5, the initial weighted value of Baidu community big data to 0.3, and the initial weighted value of Baidu news big data to 0.2. The stock public sentiment index determining device 220 then obtains the historical data of each data source (such as the data of the previous half year or of the previous quarter) and uses the historical data to predict the stock public sentiment indexes of multiple pieces of stock identification information (such as all stock identification information in the current A-share market). Because the data used by the stock public sentiment index determining device 220 are historical data of each data source, the actual stock price of each stock in the corresponding historical period is already available, and the learning device can use these prices to check the accuracy of the predicted stock public sentiment index of each stock. For example, the learning device uses a predetermined algorithm (such as a neural network algorithm) to learn from the stock public sentiment indexes of all stock identification information and the actual stock prices corresponding to the respective stock identification information, and thereby finally determines the current weighted values of Baidu search big data, Baidu community big data and Baidu news big data (for example, raising the current weighted value of a data source that reflects actual stock prices well and lowering the current weighted value of a data source that reflects them poorly).
A specific example of the stock public sentiment index prediction device setting a weighted value in advance for each preassigned pattern in each data source is as follows. While the preassigned pattern setting device sets the preassigned patterns for each data source, the weighted value setting device can also set weighted values for the preassigned patterns of the different data sources. That is, while the preassigned pattern setting device sets a preassigned pattern, the weighted value setting device (e.g., the amplitude judgment device) judges the amplitude of the stock price fluctuation, and the weighted value setting device (e.g., the first setting device and the second setting device) sets a weighted value for the corresponding preassigned pattern according to that amplitude. For example, when the amplitude of the stock price fluctuation is a rise, the weighted value setting device (e.g., the first setting device) sets the weighted value of the preassigned pattern to a positive weighted value, and the larger the rise, the larger the weighted value; when the amplitude of the stock price fluctuation is a fall, the weighted value setting device (e.g., the second setting device) sets the weighted value of the preassigned pattern to a negative weighted value, and the larger the fall, the smaller the weighted value. Because one preassigned pattern may be assigned different weighted values on different occasions, the weighted value setting device can determine a single weighted value for the preassigned pattern by, for example, averaging the multiple weighted values. The stock public sentiment index determining device 220 then obtains the historical data of each data source (such as the data of the previous half year or of the previous quarter) and uses the obtained historical data to predict the stock public sentiment indexes of multiple pieces of stock identification information (such as all stock identification information in the current A-share market). Because the data used are historical data of each data source, the actual stock price of each stock in the corresponding historical period is already available, and the learning device can use these prices to check the accuracy of the predicted stock public sentiment index of each stock. For example, the learning device uses a predetermined algorithm (such as a neural network algorithm) to learn from the stock public sentiment indexes of all stock identification information and the actual stock prices corresponding to the respective stock identification information, and thereby finally determines the current weighted value of each preassigned pattern (for example, raising the current weighted value of a preassigned pattern that reflects actual stock prices well and lowering the current weighted value of a preassigned pattern that reflects them poorly).
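The rule for initial pattern weighted values (positive for a rise, negative for a fall, larger in magnitude for larger moves, averaged over several occurrences) can be sketched as follows. The text gives no concrete scale, so the linear rule of one weight unit per `step_pct` percent of price change, the default of 2.0, and both function names are assumptions made only for illustration.

```python
from statistics import mean


def weight_from_amplitude(change_pct, step_pct=2.0):
    """Signed weight for one observed stock price fluctuation.

    A rise (change_pct > 0) gives a positive weight and a fall gives a
    negative one; a larger move gives a weight of larger magnitude.
    The linear scale `step_pct` is a hypothetical parameter.
    """
    return change_pct / step_pct


def pattern_weight(observed_changes_pct, step_pct=2.0):
    """Average the weights from every occurrence of one preassigned pattern."""
    return mean(weight_from_amplitude(c, step_pct) for c in observed_changes_pct)


# A pattern seen before rises of 4 %, 2 % and 6 % gets the mean of 2, 1 and 3.
rising = pattern_weight([4.0, 2.0, 6.0])
# A pattern seen once before a 6 % fall gets a negative weight.
falling = pattern_weight([-6.0])
print(rising, falling)
```

Averaging is the only combination rule the text names explicitly; a median or a recency-weighted mean would fit the same description of "modes such as the average".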
After the weighted value setting device has set a weighted value in advance for each data source and for each preassigned pattern in each data source, and after the stock public sentiment index prediction device has been put into actual use, the stock public sentiment index prediction device can still periodically use the learning process of the learning device (for example, having the learning device learn every day, based on a neural network algorithm, from the stock public sentiment indexes of all stock identification information and the actual stock prices corresponding to the respective stock identification information) to dynamically adjust the weighted value of each data source and the weighted value of each preassigned pattern, so as to continuously improve the prediction accuracy of the stock public sentiment index. A specific example of this periodic adjustment is: the learning device obtains the stock public sentiment indexes of all stocks predicted yesterday by the stock public sentiment index prediction device and today's actual stock prices of all stocks, and uses a predetermined algorithm (such as a neural network algorithm) to learn from yesterday's predicted stock public sentiment indexes and today's actual stock prices, so as to adjust the weighted value of each data source and the weighted value of each preassigned pattern. The present embodiment does not limit the specific implementation by which the learning device learns using the predetermined algorithm (such as a neural network algorithm).
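The periodic adjustment can be illustrated with a deliberately simple stand-in for the learning device. The text names neural network algorithms as one option and leaves the implementation open; the sketch below instead uses a plain least-mean-squares update on the data source weighted values, which is a substitute technique chosen here for brevity. The function name, the sample layout, the learning rate, and the assumption that the target is today's observed price move expressed on the same scale as the index are all hypothetical.

```python
def adjust_source_weights(weights, samples, learning_rate=0.01):
    """Return data source weights after one least-mean-squares pass.

    `weights` maps data source name -> current weighted value.
    `samples` is a list of (scores, target) pairs, where `scores` maps
    data source name -> summed pattern weights matched yesterday for one
    stock, and `target` is today's observed move for that stock
    (hypothetically expressed on the same scale as the index).
    The input dictionary is not modified.
    """
    updated = dict(weights)
    for scores, target in samples:
        predicted = sum(updated[s] * scores.get(s, 0.0) for s in updated)
        error = target - predicted
        for s in updated:
            # Move each weight in the direction that reduces the error.
            updated[s] += learning_rate * error * scores.get(s, 0.0)
    return updated


old = {"search": 0.5, "community": 0.3, "news": 0.2}
# Yesterday patterns B (+2), E (+1) and H (+4) were matched; the index was 2.1.
samples = [({"search": 2, "community": 1, "news": 4}, 3.0)]
new = adjust_source_weights(old, samples)
print({name: round(value, 3) for name, value in new.items()})
```

Because the observed move (3.0) was larger than the predicted index (2.1), every data source that contributed to the prediction has its weighted value raised, in proportion to its contribution. The same update could be applied to the pattern weighted values instead of, or in addition to, the data source weighted values.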
It should be noted that the present invention can be implemented in software and/or in a combination of software and hardware; for example, each device of the present invention can be implemented using an application-specific integrated circuit (ASIC) or any other similar hardware device. In one embodiment, the software program of the present invention can be executed by a processor to realize the steps or functions described above. Similarly, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example, RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present invention can be implemented in hardware, for example, as a circuit that cooperates with a processor to perform each step or function.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should in every respect be regarded as exemplary and not restrictive; the scope of the present invention is defined by the claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the present invention. Any reference numeral in a claim should not be regarded as limiting the claim concerned. In addition, the word "comprising" obviously does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be realized by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Although exemplary embodiments have been particularly shown and described above, those skilled in the art will understand that changes in form and detail may be made without departing from the spirit and scope of the claims. The protection sought here is set forth in the appended claims.

Claims (20)

1. A stock public sentiment index prediction method, wherein the method comprises the following steps:
Obtaining real-time data in different types of data sources, wherein the different types of data sources comprise: a data source based on a search engine, a data source based on a community/forum, and a data source based on news;
Determining the description information, contained in the real-time data in the different types of data sources, that is related to a stock for which stock public sentiment index prediction is to be performed;
Determining the stock public sentiment index of the stock according to all the description information contained in the real-time data in the different types of data sources.
2. The method according to claim 1, wherein the step of determining the stock public sentiment index of the stock according to all the description information contained in the real-time data in the different types of data sources comprises:
Performing corresponding frequency statistics respectively on all the description information contained in the real-time data in the different types of data sources;
When it is determined that the result of the frequency statistics meets the frequency statistics requirement of a preassigned pattern, determining the preassigned pattern whose frequency statistics requirement is met as a preassigned pattern existing for the stock in the data source of the corresponding type;
Determining the stock public sentiment index of the stock according to each preassigned pattern existing for the stock in the different types of data sources.
3. The method according to claim 2, wherein the step of determining the stock public sentiment index of the stock according to each preassigned pattern existing for the stock in the different types of data sources comprises:
Calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources.
4. The method according to claim 3, wherein the step of calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources comprises:
Multiplying the weighted value of each preassigned pattern existing in the different types of data sources by the weighted value of the corresponding data source and superposing the products;
Taking the superposed value as the stock public sentiment index of the stock.
5. The method according to claim 3, wherein the step of calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources comprises:
Multiplying the weighted value of each preassigned pattern existing in the different types of data sources by the weighted value of the corresponding data source and superposing the products;
Mapping the superposed value into a predetermined interval;
Taking the mapped value as the stock public sentiment index of the stock.
6. The method according to any one of claims 2 to 5, wherein the method further comprises:
Setting the preassigned patterns by performing data mining on the historical data in each data source.
7. The method according to claim 6, wherein the step of setting each preassigned pattern by performing data mining on the historical data in each data source comprises:
Performing data mining respectively on the historical data in each data source of the different types, so as to determine finance-related description information that appears repeatedly for a stock;
Determining the stock price fluctuation according to the price of the corresponding stock in the corresponding historical period;
When the stock price fluctuation meets a predetermined requirement, generating a preassigned pattern according to the repeatedly appearing finance-related description information.
8. The method according to claim 7, wherein the step of setting each preassigned pattern by performing data mining on the historical data in each data source further comprises:
Setting the weighted value of the preassigned pattern according to the amplitude of the stock price fluctuation.
9. The method according to claim 8, wherein the step of setting the weighted value of the preassigned pattern according to the amplitude of the stock price fluctuation comprises:
Judging the amplitude of the stock price fluctuation;
When the amplitude of the stock price fluctuation is a rise, setting the weighted value of the preassigned pattern to a positive weighted value;
When the amplitude of the stock price fluctuation is a fall, setting the weighted value of the preassigned pattern to a negative weighted value.
10. The method according to any one of claims 3 to 5, wherein the method further comprises:
Learning, by means of a predetermined algorithm, from the stock public sentiment index of the stock identification information and the stock price corresponding to the stock identification information, so as to adjust the weighted values of the different types of data sources and/or the weighted value of each preassigned pattern.
11. A stock public sentiment index prediction device, comprising:
A device for obtaining real-time data in different types of data sources, wherein the different types of data sources comprise: a data source based on a search engine, a data source based on a community/forum, and a data source based on news;
A device for determining the description information, contained in the real-time data in the different types of data sources, that is related to a stock for which stock public sentiment index prediction is to be performed;
A device for determining the stock public sentiment index of the stock according to all the description information contained in the real-time data in the different types of data sources.
12. The device according to claim 11, wherein the device for determining the stock public sentiment index of the stock according to all the description information contained in the real-time data in the different types of data sources comprises:
A device for performing corresponding frequency statistics respectively on all the description information contained in the real-time data in the different types of data sources;
A device for determining, when it is determined that the result of the frequency statistics meets the frequency statistics requirement of a preassigned pattern, the preassigned pattern whose frequency statistics requirement is met as a preassigned pattern existing for the stock in the data source of the corresponding type;
A device for determining the stock public sentiment index of the stock according to each preassigned pattern existing for the stock in the different types of data sources.
13. The device according to claim 12, wherein the device for determining the stock public sentiment index of the stock according to each preassigned pattern existing for the stock in the different types of data sources comprises:
A device for calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources.
14. The device according to claim 13, wherein the device for calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources comprises:
A device for multiplying the weighted value of each preassigned pattern existing in the different types of data sources by the weighted value of the corresponding data source and superposing the products;
A device for taking the superposed value as the stock public sentiment index of the stock.
15. The device according to claim 13, wherein the device for calculating the stock public sentiment index of the stock according to the weighted values of the different types of data sources and the weighted value of each preassigned pattern existing in the different types of data sources comprises:
A device for multiplying the weighted value of each preassigned pattern existing in the different types of data sources by the weighted value of the corresponding data source and superposing the products;
A device for mapping the superposed value into a predetermined interval;
A device for taking the mapped value as the stock public sentiment index of the stock.
16. The device according to any one of claims 12 to 15, wherein the stock public sentiment index prediction device further comprises:
A device for setting the preassigned patterns by performing data mining on the historical data in each data source.
17. The device according to claim 16, wherein the device for setting each preassigned pattern by performing data mining on the historical data in each data source comprises:
A device for performing data mining respectively on the historical data in each data source of the different types, so as to determine finance-related description information that appears repeatedly for a stock;
A device for determining the stock price fluctuation according to the price of the corresponding stock in the corresponding historical period;
A device for generating, when the stock price fluctuation meets a predetermined requirement, a preassigned pattern according to the repeatedly appearing finance-related description information.
18. The device according to claim 17, wherein the device for setting each preassigned pattern by performing data mining on the historical data in each data source further comprises:
A device for setting the weighted value of the preassigned pattern according to the amplitude of the stock price fluctuation.
19. The device according to claim 18, wherein the device for setting the weighted value of the preassigned pattern according to the amplitude of the stock price fluctuation comprises:
A device for judging the amplitude of the stock price fluctuation;
A device for setting the weighted value of the preassigned pattern to a positive weighted value when the amplitude of the stock price fluctuation is a rise;
A device for setting the weighted value of the preassigned pattern to a negative weighted value when the amplitude of the stock price fluctuation is a fall.
20. The device according to any one of claims 13 to 15, wherein the stock public sentiment index prediction device further comprises:
A device for learning, by means of a predetermined algorithm, from the stock public sentiment index of the stock identification information and the stock price corresponding to the stock identification information, so as to adjust the weighted values of the different types of data sources and/or the weighted value of each preassigned pattern.
CN201510796661.4A 2015-11-18 2015-11-18 Stock public opinion index prediction method and device Pending CN105373853A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510796661.4A CN105373853A (en) 2015-11-18 2015-11-18 Stock public opinion index prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510796661.4A CN105373853A (en) 2015-11-18 2015-11-18 Stock public opinion index prediction method and device

Publications (1)

Publication Number Publication Date
CN105373853A true CN105373853A (en) 2016-03-02

Family

ID=55376032

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510796661.4A Pending CN105373853A (en) 2015-11-18 2015-11-18 Stock public opinion index prediction method and device

Country Status (1)

Country Link
CN (1) CN105373853A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019095569A1 (en) * 2017-11-17 2019-05-23 平安科技(深圳)有限公司 Financial analysis method based on financial and economic event on microblog, application server, and computer readable storage medium
CN110390408A (en) * 2018-04-16 2019-10-29 北京京东尚科信息技术有限公司 Trading object prediction technique and device
CN110390408B (en) * 2018-04-16 2024-03-05 北京京东尚科信息技术有限公司 Transaction object prediction method and device
WO2019205378A1 (en) * 2018-04-26 2019-10-31 平安科技(深圳)有限公司 Method and apparatus for selecting investment stocks based on public sentiment factor, and storage medium
CN108830722A (en) * 2018-06-27 2018-11-16 东莞市波动赢机器人科技有限公司 Based on transaction machine people recommended method, electronic equipment and the storage medium to liquidate
CN109087205A (en) * 2018-08-10 2018-12-25 北京字节跳动网络技术有限公司 Prediction technique and device, the computer equipment and readable storage medium storing program for executing of public opinion index
CN109087205B (en) * 2018-08-10 2020-09-18 北京字节跳动网络技术有限公司 Public opinion index prediction method and device, computer equipment and readable storage medium
CN109493228A (en) * 2018-12-12 2019-03-19 安徽省泰岳祥升软件有限公司 A kind of method and device generating stock news in brief model
CN111460354A (en) * 2020-03-31 2020-07-28 上海蜜度信息技术有限公司 Method and equipment for digitizing data subject of network public opinion data
CN113643060A (en) * 2021-08-12 2021-11-12 工银科技有限公司 Product price prediction method and device

Similar Documents

Publication Publication Date Title
CN105373853A (en) Stock public opinion index prediction method and device
Ginting et al. Technical approach of TOPSIS in decision making
CN105719001B (en) Large scale classification in neural networks using hashing
CN108833458B (en) Application recommendation method, device, medium and equipment
CN106909931B (en) Feature generation method and device for machine learning model and electronic equipment
JP2019517057A (en) Wide and deep machine learning model
CN103778548A (en) Goods information and keyword matching method, and goods information releasing method and device
Hikmawati et al. Minimum threshold determination method based on dataset characteristics in association rule mining
CN111639253B (en) Data weight judging method, device, equipment and storage medium
CN104992347A (en) Video matching advertisement method and device
JP6907664B2 (en) Methods and equipment used to predict non-stationary time series data
CN105095414A (en) Method and apparatus used for predicting network search volume
CN107832794A (en) A kind of convolutional neural networks generation method, the recognition methods of car system and computing device
CN102945273B (en) A kind of for providing the method and apparatus of Search Results
CN111612581A (en) Method, device and equipment for recommending articles and storage medium
CN105373854A (en) Stock public opinion index prediction method and device
CN112766402A (en) Algorithm selection method and device and electronic equipment
CN113807469A (en) Multi-energy user value prediction method, device, storage medium and equipment
Cole (Infra) structural discontinuity: Capital, labour, and technological change
CN110019563B (en) Portrait modeling method and device based on multi-dimensional data
CN113301017B (en) Attack detection and defense method and device based on federal learning and storage medium
CN110580265B (en) ETL task processing method, device, equipment and storage medium
CN115687764B (en) Training method of vehicle track evaluation model, vehicle track evaluation method and device
CN104834958A (en) Method and device for evaluating steps of answer
CN105335385A (en) Project-based collaborative filtering recommendation method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160302
