CN105183836B - An algorithm for obtaining event big data information based on symbolic features - Google Patents

An algorithm for obtaining event big data information based on symbolic features

Info

Publication number
CN105183836B
Authority
CN
China
Prior art keywords
decimal
event
big data
symbolic
time series
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510553189.1A
Other languages
Chinese (zh)
Other versions
CN105183836A (en
Inventor
张雨
张弛
史焕然
邹建平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU RUNBANG INTELLIGENT PARKING EQUIPMENT CO., LTD.
Original Assignee
Jiangsu Run State Intelligent Garage Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Run State Intelligent Garage Ltd By Share Ltd
Priority to CN201510553189.1A
Publication of CN105183836A
Application granted
Publication of CN105183836B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Complex Calculations (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses an algorithm for obtaining event big data information based on symbolic features. Step 1: acquire the decimal time series {x_m} of an event and set the total sampling length. Step 2: set the binary symbol length L to be encoded and the sampling delay τ. Step 3: calculate the mean μ of the decimal time series {x_m}. Step 4: take μ as the dividing line P_0 between the two symbol domains 0 and 1, and set a threshold function. Step 5: apply the threshold function to every element of the decimal time series {x_m} and build the binary symbol code sequence {s_n}. Step 6: encode the binary symbol code sequence {s_n} in decimal, converting it into the decimal symbol code sequence {S_n}. Step 7: count the frequency P_n with which each decimal symbol code S_n occurs in {S_n}, forming a histogram of frequency P_n against decimal symbol code S_n. The big data features are thereby made explicit, which makes it easy to judge whether the event represented by the decimal time series {x_m} has big data features.

Description

An algorithm for obtaining event big data information based on symbolic features
Technical field
The present invention relates to an algorithm for obtaining event big data information based on symbolic features.
Background
The research firm Gartner defines "big data" (Big data) as follows: "big data" is high-volume, high-growth and diversified information assets that require new processing modes in order to deliver stronger decision-making power, insight discovery and process optimization capability.
For the decimal time series describing a certain generalized event, as shown in Fig. 1, what are its big data features? If big data features exist, how can they be obtained? The prior art offers more than one method of obtaining big data; this patent proposes an algorithm for obtaining event big data information based on symbolic features.
Summary of the invention
In view of the above problems, the present invention provides an algorithm for obtaining event big data information based on symbolic features. It makes big data features explicit, which makes it easy to judge whether the particular event corresponding to a given decimal symbol code S_n has a big data feature. Further, it makes it easy to judge whether the generalized event corresponding to the decimal symbol code sequence {S_n} (that is, to the decimal time series {x_m}) is random or deterministic.
To achieve the above technical purpose and technical effect, the invention is realized through the following technical scheme:
An algorithm for obtaining event big data information based on symbolic features, characterized in that it includes the following steps:
Step 1: Acquire the decimal time series {x_m} of the event and set the total sampling length;
Step 2: Set the binary symbol length L to be encoded and the sampling delay τ;
Step 3: Calculate the mean μ of the decimal time series {x_m};
Step 4: Take μ as the dividing line P_0 between the two symbol domains 0 and 1, and set the threshold function;
Step 5: Apply the threshold function to every element of the decimal time series {x_m}: according to the binary symbol length L and the sampling delay τ, transform each element x_m of the decimal time series {x_m} into the element t_m of the binary symbol sequence {t_m}, and build the binary symbol code sequence {s_n};
Step 6: Encode the binary symbol code sequence {s_n} in decimal, converting it into the decimal symbol code sequence {S_n};
Step 7: Count the frequency P_n with which each decimal symbol code S_n occurs in {S_n}, forming a histogram of frequency P_n against decimal symbol code S_n.
Preferably, the algorithm further includes Step 8: calculate the improved entropy H_s(L) from the histogram of frequency P_n against decimal symbol code S_n.
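Steps 3 to 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's reference implementation: the function name `symbolize` is hypothetical, values equal to the mean are assumed to map to 1 (the threshold function itself is not reproduced in this text), each word is assumed to take L elements spaced τ apart with the window sliding one element at a time, and frequencies are normalized to sum to 1.

```python
from collections import Counter
from statistics import fmean


def symbolize(series, L=3, tau=1):
    """Steps 3-7: return (bits, codes, freqs) for a numeric series."""
    mu = fmean(series)  # Step 3: mean of {x_m}
    # Steps 4-5: threshold at mu. Assumption: x >= mu maps to 1, otherwise 0.
    bits = [1 if x >= mu else 0 for x in series]
    span = (L - 1) * tau  # distance from the first to the last bit of a word
    codes = []
    for n in range(len(bits) - span):
        # Assumption: word n is (t_n, t_{n+tau}, ..., t_{n+(L-1)tau}).
        word = bits[n : n + span + 1 : tau]
        # Step 6: read the L-bit word as a base-2 number, first bit most significant.
        codes.append(int("".join(map(str, word)), 2))
    counts = Counter(codes)  # Step 7: occurrences of each decimal code
    freqs = {code: c / len(codes) for code, c in sorted(counts.items())}
    return bits, codes, freqs
```

For the alternating series [1, 3, 1, 3, 1, 3] with L = 3 and τ = 1, the words are 010, 101, 010, 101, so the decimal codes are 2, 5, 2, 5 and each of the two codes has frequency 0.5.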
The beneficial effects of the invention are as follows:
The time series {x_m} is "coarse-grained", that is, symbolized: the original time series with its widely varying numerical values is converted into a symbol sequence that takes only a few values. Coarse-graining yields the plot of frequency P_n against decimal symbol code S_n, in which high-frequency symbol codes correspond to strong information and low-frequency symbol codes correspond to weak information, thereby making the big data features explicit.
Further, the improved entropy H_s(L) can be calculated from the histogram of frequency P_n against decimal symbol code S_n. Since H_s(L) ≥ 0.9 for random events and H_s(L) ≤ 0.1 for deterministic events, it can be determined whether the generalized event corresponding to the decimal symbol code sequence {S_n} (that is, to the decimal time series {x_m}) is random or deterministic.
Brief description of the drawings
Fig. 1 is the decimal time series {x_m} of a certain generalized event;
Fig. 2 is a schematic diagram of the conversion of the decimal time series {x_m} into the binary symbol code sequence {s_n};
Fig. 3 is the weekly chart of a stock index variation {x_m} and a schematic diagram of its conversion into the binary symbol code sequence {s_n};
Fig. 4 is the histogram of frequency P_n against decimal symbol code S_n for the weekly chart of the stock index variation {x_m}, with binary symbol length L = 3;
Fig. 5 is a schematic diagram of the body vibration {x_m} of a four-cylinder diesel engine;
Fig. 6 is the histogram of frequency P_n against decimal symbol code S_n for the body vibration {x_m} of the four-cylinder diesel engine, with binary symbol length L = 6.
Detailed description of the embodiments
The technical scheme of the present invention is described in further detail below with reference to the drawings and specific embodiments, so that those skilled in the art can better understand and practice the invention; the embodiments given do not limit the invention.
An algorithm for obtaining event big data information based on symbolic features, characterized in that it includes the following steps:
Step 1: Acquire the decimal time series {x_m} of the event and set the total sampling length;
Step 2: Set the binary symbol length L to be encoded and the sampling delay τ;
Step 3: Calculate the mean μ of the decimal time series {x_m};
Step 4: Take μ as the dividing line P_0 between the two symbol domains 0 and 1, and set the threshold function;
Step 5: Apply the threshold function to every element of the decimal time series {x_m}: according to the binary symbol length L and the sampling delay τ, transform each element x_m of the decimal time series {x_m} into the element t_m of the binary symbol sequence {t_m}, and build the binary symbol code sequence {s_n};
Step 6: Encode the binary symbol code sequence {s_n} in decimal, converting it into the decimal symbol code sequence {S_n};
Step 7: Count the frequency P_n with which each decimal symbol code S_n occurs in {S_n}, forming a histogram of frequency P_n against decimal symbol code S_n.
In the plot of frequency P_n against symbol code S_n, each symbol code S_n characterizes a particular event, and the corresponding frequency P_n is the intensity with which that event occurs. High-frequency symbol codes correspond to strong information and low-frequency symbol codes to weak information. If the frequency P_n of a symbol code is large relative to that of the other symbol codes, the particular event can be judged to have a big data feature, and the big data feature is thereby made explicit. In other words, the frequency P_n of the decimal symbol code S_n corresponding to a particular event (an "individual event") indicates whether that event has a regular big data feature.
A threshold can be set empirically: when the frequency P_n of a decimal symbol code S_n exceeds the set threshold, the particular event is judged to have a big data feature.
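The threshold test on individual symbol codes amounts to a simple filter over the histogram. The sketch below assumes the histogram is held as a dictionary mapping each decimal code to its frequency; the helper name `dominant_codes`, the example frequencies and the threshold value are all hypothetical and are not taken from the patent figures.

```python
def dominant_codes(freqs, threshold):
    """Return the codes whose frequency exceeds an empirically chosen threshold."""
    return {code: p for code, p in freqs.items() if p > threshold}


# Hypothetical frequencies for L = 3 (possible codes 0..7), for illustration only.
example = {0: 0.05, 2: 0.25, 5: 0.40, 7: 0.30}
print(dominant_codes(example, threshold=0.28))
```

With these made-up numbers, codes 5 and 7 pass the threshold and would be the ones judged to carry a big data feature.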
Further, the "improved entropy" H_s(L) can be calculated from the histogram of frequency P_n against decimal symbol code S_n, using formula (1):
In formula (1): N_seq is the total number of decimal symbol codes with non-zero frequency; i is the index of a decimal symbol code; p_{i,L} is the frequency of the i-th decimal symbol code of length L.
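Formula (1) itself does not appear in this text. The symbol definitions above match the modified (normalized) Shannon entropy commonly used in symbolic time series analysis; assuming that is the intended formula, it would read:

```latex
H_s(L) = -\frac{1}{\log N_{seq}} \sum_{i=1}^{N_{seq}} p_{i,L} \, \log p_{i,L}
```

Under this form, H_s(L) approaches 1 when all observed codes are equally frequent and approaches 0 when a single code dominates, which is consistent with the 0.9 and 0.1 bounds quoted for random and deterministic events.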
Since H_s(L) ≥ 0.9 for random events and H_s(L) ≤ 0.1 for deterministic events, it can be determined whether the generalized event (the "whole event") corresponding to the decimal symbol code sequence {S_n} (that is, to the decimal time series {x_m}) is random or deterministic.
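The entropy-based judgment can be sketched as below. The entropy function assumes the normalized Shannon form (entropy divided by the logarithm of the number of codes with non-zero frequency), which is an assumption because the patent's formula is not reproduced in this text; the function names and the label strings are hypothetical. The 0.9 and 0.1 bounds are the ones stated in the description.

```python
import math


def modified_entropy(freqs):
    """Shannon entropy of the code frequencies, normalized by log(N_seq)."""
    total = sum(freqs.values())
    probs = [f / total for f in freqs.values() if f > 0]
    n_seq = len(probs)  # number of codes with non-zero frequency
    if n_seq <= 1:
        return 0.0  # at most one observed code: treated as fully deterministic
    return -sum(p * math.log(p) for p in probs) / math.log(n_seq)


def classify(h, low=0.1, high=0.9):
    """Map an entropy value to the three cases described in the text."""
    if h >= high:
        return "random"
    if h <= low:
        return "deterministic"
    return "mixed"  # both deterministic and random factors at work
```

Because the function normalizes its input, it accepts either raw counts or relative frequencies.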
By determining the length L and delay τ of the binary symbol code sequence {s_n}, determining the mean μ of the decimal time series {x_m} and setting the threshold function, the decimal time series {x_m} can be converted into the binary time series {t_m}; the binary symbol code sequence {s_n} is then built and decimal-encoded into the decimal symbol code sequence {S_n}. The preferred parameters are: total sampling length ≥ 50 points, L in the range 3 to 6, and τ in the range 1 to 3. Note that the range 1 to 3 for τ means that, in the binary symbol domain, the next element is taken from the binary time series {t_m} at an interval of 1 to 3 data points. Fig. 2 shows the conversion of the decimal time series {x_m} of the generalized event in Fig. 1 into the binary symbol code sequence {s_n}; for simplicity and clarity, the symbol length is L = 3 and the delay is τ = 1.
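The preferred parameter ranges lend themselves to a small guard before the conversion is run. The helper below is a hypothetical sketch: the name `check_parameters` and the error messages are illustrative, and the returned word count assumes that one word spans (L - 1)·τ + 1 elements and that the window slides one element at a time.

```python
def check_parameters(n_samples, L, tau):
    """Raise ValueError if the settings fall outside the preferred ranges."""
    if n_samples < 50:
        raise ValueError("total sampling length should be at least 50 points")
    if not 3 <= L <= 6:
        raise ValueError("L should be between 3 and 6")
    if not 1 <= tau <= 3:
        raise ValueError("tau should be between 1 and 3")
    # Assumption: a word covers (L - 1) * tau + 1 consecutive positions.
    return n_samples - (L - 1) * tau  # number of words that fit in the series
```

Under these assumptions, 50 samples with L = 3 and τ = 1 give 48 words, while 100 samples with L = 6 and τ = 3 give 85 words.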
The first example analyzes the pattern of stock index variation in the economic field, seeking the relationship between the long and short sides. Fig. 3 shows the weekly (5-day) chart of a stock index variation {x_m} and its conversion into the binary symbol code sequence {s_n}. To capture the dense fluctuations of the stock index in Fig. 3, the symbol length L = 3 and the delay τ = 1 are taken; the corresponding histogram of frequency P_n against decimal symbol code S_n for the stock index variation {x_m} is shown in Fig. 4.
As can be seen from Fig. 4, the symbol code "101" occurs most often (6 times), followed by the symbol code "010" (4 times). In Fig. 3, "101" characterizes a deep ∨-shaped rebound of the stock index, and "010" characterizes a sharp ∧-shaped drop. The intensity of the contest between the long and short sides in the stock market that week is thus fully and quantitatively expressed by the histogram of frequency P_n against decimal symbol code S_n of the weekly chart, with the long side holding the upper hand over the short side. The improved entropy of Fig. 4 is H_s(L) = 0.68, indicating that the weekly stock index variation is governed by both deterministic and random factors.
The second example analyzes diesel engine vibration in the engineering field, seeking the effect of the relevant influencing factors. Fig. 5 shows the body vibration {x_m} of a four-cylinder diesel engine. To express the characteristic of Fig. 5, namely brief large vibrations separated by long intervals of small vibration, the symbol length L = 6 and the delay τ = 3 are taken; the corresponding histogram of frequency P_n against decimal symbol code S_n for the body vibration {x_m} is shown in Fig. 6.
As can be seen from Fig. 5, the body vibration time history {x_m} contains several brief large vibrations. These result from the excited piston striking the cylinder liner near the firing top dead center and bottom dead center and near the exhaust top dead center and bottom dead center, respectively, and are related to the diesel engine load, the piston-to-cylinder-liner clearance, the sticking state of the piston rings, and so on. In Fig. 6, these possible influencing factors are expressed quantitatively by the histogram of frequency P_n against decimal symbol code S_n of the body vibration, in which several decimal symbol codes have relatively high frequency. These codes can be converted back into binary symbols of length L = 6 and the moments at which they occur can be located in Fig. 5, making it possible to judge which factor contributes most to the diesel engine vibration. The improved entropy of Fig. 6 is H_s(L) = 0.9754, indicating that the diesel engine vibration as a whole has the attributes of a random event.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect use in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (6)

1. An algorithm for obtaining event big data information based on symbolic features, characterized in that it includes the following steps:
Step 1: Acquire the decimal time series {x_m} of the event and set the total sampling length;
Step 2: Set the binary symbol length L to be encoded and the sampling delay τ;
Step 3: Calculate the mean μ of the decimal time series {x_m};
Step 4: Take μ as the dividing line P_0 between the two symbol domains 0 and 1, and set the threshold function;
Step 5: Apply the threshold function to every element of the decimal time series {x_m}: according to the binary symbol length L and the sampling delay τ, transform each element x_m of the decimal time series {x_m} into the element t_m of the binary symbol sequence {t_m}, and build the binary symbol code sequence {s_n};
Step 6: Encode the binary symbol code sequence {s_n} in decimal, converting it into the decimal symbol code sequence {S_n};
Step 7: Count the frequency P_n with which each decimal symbol code S_n occurs in {S_n}, forming a histogram of frequency P_n against decimal symbol code S_n.
2. The algorithm for obtaining event big data information based on symbolic features according to claim 1, characterized in that the total sampling length is ≥ 50 points.
3. The algorithm for obtaining event big data information based on symbolic features according to claim 1, characterized in that the value range of L is 3 to 6.
4. The algorithm for obtaining event big data information based on symbolic features according to claim 1, characterized in that the value range of τ is 1 to 3.
5. The algorithm for obtaining event big data information based on symbolic features according to claim 1, characterized in that when the frequency P_n of a decimal symbol code S_n exceeds a set threshold, the particular event corresponding to that symbol code is judged to have a big data feature.
6. The algorithm for obtaining event big data information based on symbolic features according to claim 1, characterized in that the improved entropy H_s(L) is calculated from the histogram of frequency P_n against decimal symbol code S_n; when H_s(L) ≥ 0.9, the generalized event corresponding to the symbol code sequence {S_n} is judged to be random; when H_s(L) ≤ 0.1, the generalized event corresponding to the symbol code sequence {S_n} is judged to be deterministic; when 0.1 < H_s(L) < 0.9, the generalized event corresponding to the symbol code sequence {S_n} is judged to be governed by both deterministic and random factors.
CN201510553189.1A 2015-09-01 2015-09-01 An algorithm for obtaining event big data information based on symbolic features Active CN105183836B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510553189.1A CN105183836B (en) 2015-09-01 2015-09-01 An algorithm for obtaining event big data information based on symbolic features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510553189.1A CN105183836B (en) 2015-09-01 2015-09-01 An algorithm for obtaining event big data information based on symbolic features

Publications (2)

Publication Number Publication Date
CN105183836A CN105183836A (en) 2015-12-23
CN105183836B true CN105183836B (en) 2018-06-15

Family

ID=54905918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510553189.1A Active CN105183836B (en) 2015-09-01 2015-09-01 An algorithm for obtaining event big data information based on symbolic features

Country Status (1)

Country Link
CN (1) CN105183836B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942300A (en) * 2014-04-15 2014-07-23 大连海事大学 Dynamic solution method of center time series
CN104636325A (en) * 2015-02-06 2015-05-20 中南大学 Document similarity determining method based on maximum likelihood estimation
CN104679991A (en) * 2015-01-27 2015-06-03 吉林大学 Ordered proposition-oriented novel method of information fusion
CN104866929A (en) * 2015-06-11 2015-08-26 陈虹 International investment index data processing and analysis method and international investment index data processing and analysis system

Also Published As

Publication number Publication date
CN105183836A (en) 2015-12-23

Similar Documents

Publication Publication Date Title
Guez et al. Influence of autocorrelation on the topology of the climate network
CN105824880A (en) Webpage grasping method and device
Abd El Hady Exponentiated transmuted weibull distribution a generalization of the weibull distribution
CN103324888B (en) Based on virus characteristic extraction method and the system of family's sample
CN105183836B (en) An algorithm for obtaining event big data information based on symbolic features
Wu et al. Chaos criteria design based on modified sign functions with one or three‐threshold
CN101576872B (en) Chinese text processing method and device thereof
CN104253685B (en) Symmetric key generation and the dynamic quantization method of distribution based on radio channel characteristic
Pham et al. Enhance exploring temporal correlation for data collection in WSNs
CN101060411B (en) A multi-mode matching method for improving the detection rate and efficiency of intrusion detection system
Jiang et al. Application of improved median filtering algorithm to image de-noising
CN104091123A (en) Community network level virus immunization method
CN103714346A (en) People quantity estimation method based on video monitoring
Karimi Firozjaei et al. Monitoring and prediction of land use changes and physical expansion of Babol city during 1985-2040 using multi-temporal Landsat imagery
CN107144874A (en) A kind of method and system based on to ENPEMF signals progress BSWT DDTFA time frequency analysis
Chen Solvers with a bit-encoding phase selection policy and a decision-depth-sensitive restart policy
Cai et al. Score2SAT: solver description
Cheng et al. A fuzzy decision tree model for airport terminal departure passenger traffic forecasting
Ki et al. Addressing water pollution hotspots in the tributary monitoring network using a non-linear data analysis tool
Bhat et al. Steganalysis of YASS using Huffman Length statistics
Ke et al. Grey-exponential smoothing prediction model of mining safety based on forecasting validity
Sun et al. Metallographical image segmentation and compression
CN113917428B (en) Complex multi-spread signal and jitter signal separation method
Jiang et al. Quantum correlations in the dimerized spin chain at zero and finite temperatures
Zhang et al. Lower approximation reduction in ordered information system with fuzzy decision

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Zhang Yu

Inventor after: Zhang Chi

Inventor after: Shi Huanran

Inventor after: Zou Jianping

Inventor before: Zhang Yu

Inventor before: Zhang Chi

TA01 Transfer of patent application right

Effective date of registration: 20170707

Address after: 211803 Star Industrial Park, Pukou District, Jiangsu, Nanjing

Applicant after: JIANGSU RUNBANG INTELLIGENT PARKING EQUIPMENT CO., LTD.

Address before: 1 No. 211167 Jiangsu city of Nanjing province Jiangning Science Park Hongjing Road

Applicant before: Nanjing Institute of Technology

CB02 Change of applicant information
CB02 Change of applicant information

Address after: 211803 Pukou District Star Industrial Park in Nanjing, Jiangsu

Applicant after: Jiangsu run state intelligent garage Limited by Share Ltd

Address before: 211803 Pukou District Star Industrial Park in Nanjing, Jiangsu

Applicant before: JIANGSU RUNBANG INTELLIGENT PARKING EQUIPMENT CO., LTD.

GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: An algorithm for obtaining event big data information based on symbol features

Effective date of registration: 20211125

Granted publication date: 20180615

Pledgee: Bank of China Limited Nanjing Jiangbei New Area Branch

Pledgor: JIANGSU RUNBANG INTELLIGENT GARAGE CO.,LTD.

Registration number: Y2021980013224
