CN103853766B

CN103853766B - An online processing method and system for streaming data

Info

Publication number: CN103853766B
Application number: CN201210510056.2A
Authority: CN
Inventors: 张瑾; 程学旗; 林祥辉; 黄康平
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2012-12-03
Filing date: 2012-12-03
Publication date: 2017-04-05
Anticipated expiration: 2032-12-03
Also published as: CN103853766A

Abstract

The invention discloses an online processing method for streaming data, comprising: Step 1, establishing an online in-memory cache layer, extracting attributes from the streaming data, and storing the data in the online in-memory cache layer in a key-value structure; Step 2, building a hybrid index structure over the streaming data in the in-memory cache layer; Step 3, adding an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record and recording the state of each analysis program's access to it; Step 4, data cleanup: if a streaming data record has been accessed by all of the designated analysis programs in the in-memory cache layer, the record is cleaned up. The invention significantly reduces data read/write pressure during stream processing, effectively relieves the load on the database in a large-scale streaming data processing system, and improves the real-time processing speed of streaming data.

Description

An online processing method and system for streaming data

Technical field

The present invention relates to large-scale data processing, and in particular to an online processing method and system for streaming data.

Background art

With the progress of the times and the growth of the economy, people's demand for information in daily life keeps increasing. In particular, as the internet becomes ever more widespread, massive amounts of information are published and disseminated online every day. In 2011, the market research firm IDC released the report "Extracting Value from Chaos". The report shows that the total amount of information worldwide doubles every two years. In 2011, the total amount of data created and replicated worldwide was 1.8 ZB. By way of illustration, 1.8 ZB is equivalent to the data that would be produced if every person in the world had 215 million high-resolution MRI scans per day.

The task of a large-scale data analysis and processing system is to process massive data and to mine valuable knowledge from it. A typical data processing system collects data from each data source and stores it, then reads the data back from the storage device for analysis and processing. A commonly used architecture for such systems sets up a central database to handle the storage and retrieval of data. First, collection programs gather data of different categories from the internet, such as news, forums, blogs, microblogs, social networks, and search engines, and write it to the central database. Various analysis programs then read the data from the database and carry out subsequent analysis and processing. The central database thus handles both the writing and the reading of data.

System architectures with a database as the storage center have been widely accepted and applied. In a massive-data environment, however, as the variety of data sources, the volume of data, and the number of analysis applications all grow, the problems of the central database architecture become increasingly prominent. Its shortcomings show in three main respects: first, real-time response performance degrades; second, database interactions become excessive; third, data processing is delayed.

As data sources, data volume, and the number of applications increase, the shortcomings of traditional data processing and analysis systems built on a central database architecture become increasingly prominent. A new data processing architecture therefore needs to be proposed so that the above problems can be effectively alleviated.

In general, approaches to this problem can be grouped into the following four:

Message-oriented middleware. Message-oriented middleware is a middleware technology built on a message-passing mechanism or a message-queue model. It can deliver messages to each application, which relieves data read/write pressure, and the middleware can also control each application's access to messages. Message-oriented middleware has played an important role in many industry applications. Enterprise-level applications require message delivery to be reliable and secure; however, a heavy emphasis on reliability and security increases data processing time and transmission latency, which makes this approach unsuitable for the throughput requirements of large-scale data processing.

Distributed message queues. A growing number of companies and research institutions are trying distributed message-oriented systems to alleviate the problems caused by the central database architecture; most of these distributed message queues are released as open-source projects. Distributed message processing systems can handle messaging services efficiently in a massive-data environment. However, they have two problems: first, they all read and write data by primary-key lookup, cannot query on other key fields, and so cannot fully replace the query capability of a relational database; second, in order to guarantee high throughput, they cannot fully guarantee the integrity and security of the data.

Caching. In computer architecture, memory reads and writes are more than 10 times faster than disk reads and writes. To avoid frequent database reads and writes, caching has therefore been adopted: a block of memory outside the database is set aside as a data buffer, which reduces the database load and improves data access speed. Such memory-based caching still has two problems: first, it cannot optimize the efficiency of writing data into the database; second, data organized as key-value (Key-Value) pairs does not support range queries on a specific field.

In-memory databases. In Web applications, data such as user visits and user clicks arrives as a stream, so methods for the online processing of streaming data have become a problem of great interest to both academia and industry. Another branch of research on online data processing is the development of in-memory databases. An in-memory database, as the name suggests, is a database that keeps its data in memory and operates on it there. Memory reads and writes are several orders of magnitude faster than disk, so keeping data in memory can greatly improve application performance compared with accessing it from disk. In-memory databases also abandon the traditional approach of disk-based data management: the architecture is redesigned on the assumption that all data resides in memory, with corresponding improvements in data caching, fast algorithms, and parallel operation, so data processing is much faster than in a traditional database, typically by a factor of 10 or more. The defining feature of an in-memory database is that its "primary copy" or "working version" is memory-resident; that is, active transactions deal only with the in-memory copy of the real-time in-memory database. The main shortcoming of Redis is that it does not solve the problem of data service reliability well: all data is stored in the memory space of the user application, so once the process restarts or exits abnormally, data is lost. It also cannot meet the need to query on different fields of the data.

In summary, the ability of the prior art to relieve data access pressure is limited by various factors and cannot meet practical needs.

Summary of the invention

The purpose of the present invention is to introduce a memory-based online cache layer that, in view of the characteristics of streaming data, shifts the heavy read/write load originally borne by the database onto the online cache layer, thereby significantly reducing data read/write pressure during stream processing, effectively relieving the load on the database in a large-scale streaming data processing system, and improving the real-time processing speed of streaming data.

To achieve the above object, the present invention proposes an online processing method for streaming data, comprising:

Step 1, establishing an online in-memory cache layer, extracting attributes from the streaming data, and storing the data in the online in-memory cache layer in a key-value structure;

Step 2, building a hybrid index structure over the streaming data in the in-memory cache layer;

Step 3, adding an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record, and recording the state of each analysis program's access to the streaming data;

Step 4, data cleanup: if a streaming data record has been accessed by all of the designated analysis programs in the in-memory cache layer, the record is cleaned up.

The online processing method further includes: after an analysis program reads a streaming data record from the in-memory cache layer, checking the access flag of that record:

If the record has already been accessed by the analysis program, that is, its flag bit is in the read state, the record is not returned to the analysis program;

If the record has not been accessed by the analysis program, that is, its flag bit is in the unread state, the record is returned to the analysis program and its flag bit is set to read.

The online processing method further includes: after a streaming data record is read, checking the access flag of that record:

If the record has been accessed by all registered analysis programs, it is removed from the in-memory cache layer;

Otherwise, it is determined whether the residence time of the record exceeds a threshold; if the threshold is not exceeded, the record continues to wait for access by analysis programs, and if the threshold is exceeded, the record is removed from the in-memory cache layer.

The key-value structure in Step 1 is established as follows: for each streaming data record, the in-memory cache layer assigns a unique ID as the key of the record, and the value corresponds to all attribute information of the record. The hybrid index structure in Step 2 is built by combining the key-value structure, a B+ tree index structure, and a hash index structure.

Step 2 includes:

Determining whether the streaming data in the online cache layer needs to be queried by field:

If it needs to be queried by field: where a range query on the current attribute is required, a B+ tree index structure is built on that attribute field; where a primary-key query on the current attribute is required, a hash index structure is built on that attribute field;

If it does not need to be queried by field, no index structure needs to be built on that attribute field.
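The per-field decision in Step 2 can be illustrated with a minimal Python sketch. The names `IndexKind` and `choose_index`, and the example fields `url`, `timestamp`, and `body`, are hypothetical and do not come from the patent; the sketch only records which index types would be built for each attribute field.

```python
from enum import Enum


class IndexKind(Enum):
    HASH = "hash"      # exact-match (primary-key style) lookups
    BTREE = "b+tree"   # range queries


def choose_index(needs_field_query, needs_range=False, needs_exact=False):
    """Return the set of index kinds to build for one attribute field."""
    kinds = set()
    if not needs_field_query:
        return kinds                    # field is never queried: no index
    if needs_range:
        kinds.add(IndexKind.BTREE)      # range queries get a B+ tree index
    if needs_exact:
        kinds.add(IndexKind.HASH)       # exact lookups get a hash index
    return kinds


# Hypothetical query requirements for a microblog-like record.
plan = {
    "url": choose_index(True, needs_exact=True),
    "timestamp": choose_index(True, needs_range=True),
    "body": choose_index(False),
}
```

Whether a single field may carry both index types at once is not stated in the text; the sketch allows it as an assumption.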

In Step 3: the access flag is a 32-bit integer, each bit of which can represent the access state of one analysis program with respect to a streaming data record; when streaming data in memory is initialized, every bit of each record's access flag is 0;

When an analysis program registers with the in-memory cache layer, the cache layer assigns it an access flag bit; after an analysis program accesses a streaming data record, the cache layer performs a bitwise operation on the record's access flag and the analysis program's access identifier, and takes the result as the record's current access flag.
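As a rough illustration of the flag arithmetic, the following Python sketch treats the access flag as a 32-bit mask and uses bitwise OR as the bitwise operation (the detailed description names OR explicitly). The function names and the two example analyzers are assumptions made for illustration only.

```python
MASK_32 = 0xFFFFFFFF  # the access flag is described as a 32-bit integer


def mark_accessed(record_flag, analyzer_bit):
    """Bitwise-OR the analyzer's bit into the record's access flag."""
    return (record_flag | analyzer_bit) & MASK_32


def was_accessed(record_flag, analyzer_bit):
    """True if this analyzer's bit is already set in the record's flag."""
    return (record_flag & analyzer_bit) != 0


# Two hypothetical analyzers registered at bit positions 0 and 1.
analyzer_a = 1 << 0
analyzer_b = 1 << 1

flag = 0                                 # every bit is 0 at initialization
flag = mark_accessed(flag, analyzer_a)   # analyzer A reads the record
```

After these lines, `flag` has only bit 0 set, so a second read by analyzer A would be refused while analyzer B could still read the record.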

In Step 4:

After a streaming data record is read, the access flag of that record is checked:

If the record has been accessed by all registered analysis programs, it is removed from the in-memory cache layer;

Otherwise, it is determined whether the residence time of the record exceeds a threshold; if the threshold is not exceeded, the record continues to wait for access by analysis programs, and if the threshold is exceeded, the record is removed from the in-memory cache layer.

To achieve the above object, the present invention also provides an online processing system for streaming data, comprising:

An online in-memory cache layer construction module, configured to establish an online in-memory cache layer, extract attributes from the streaming data, and store the data in the online in-memory cache layer in a key-value structure;

A hybrid index structure construction module, configured to build a hybrid index structure over the streaming data in the in-memory cache layer;

An access flag construction module, configured to add an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record, and to record the state of each analysis program's access to the streaming data;

An in-memory streaming data cleanup module, configured to clean up streaming data that has been accessed by all of the designated analysis programs in the in-memory cache layer.

The online processing system further includes:

A streaming data return module, configured to check the access flag of a streaming data record after it is read:

If the record has already been accessed by the analysis program, that is, its flag bit is in the read state, the record is not returned to the analysis program; if the record has not been accessed by the analysis program, that is, its flag bit is in the unread state, the flag bit is set to read and the record is returned to the analysis program.

In the in-memory streaming data cleanup module:

After an analysis program reads a streaming data record from the in-memory cache layer, the access flag of that record is checked: if the record has been accessed by all registered analysis programs, it is removed from the in-memory cache layer; otherwise, it is determined whether the residence time of the record exceeds a threshold; if the threshold is not exceeded, the record continues to wait for access by analysis programs, and if the threshold is exceeded, the record is removed from the in-memory cache layer.

The beneficial effects of the present invention are as follows: by adding a memory-based data cache and taking the characteristics of streaming data into account, the online processing method and system for streaming data shift the heavy read/write load originally borne by the database onto the online cache layer. This effectively relieves the load on the database in a large-scale streaming data processing system, greatly reduces the read/write pressure of streaming data, and improves both the real-time processing speed of streaming data and the timeliness of the data processing system.

The present invention is described below with reference to the drawings and specific embodiments, which are not intended to limit the invention.

Description of the drawings

Fig. 1 is a flow chart of the online processing method for streaming data of the present invention;

Fig. 2 is a schematic diagram of the online processing system for streaming data of the present invention.

Detailed description

The core idea of the present invention is to introduce a memory-based online cache layer into the original architecture and, in view of the characteristics of streaming data, to shift the heavy read/write load originally borne by the database onto the online cache, so that data services can be provided efficiently.

Fig. 1 is a flow chart of the online processing method for streaming data of the present invention. As shown in Fig. 1, the method includes:

Step 1, establishing an online in-memory cache layer, extracting attributes from the streaming data, and storing the data in the online in-memory cache layer in a key-value structure.

Step 2, building a hybrid index structure over the streaming data in the in-memory cache layer.

Step 3, adding an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record, and recording the state of each analysis program's access to the streaming data.

Streaming data exists dynamically, and for each streaming data record, the set of analysis programs that will access it is fixed.

The key-value structure in Step 1 is established as follows: for each streaming data record, the in-memory cache layer assigns a unique ID as the key of the record, and the value corresponds to all attribute information of the record. An online in-memory cache layer is added on top of the original central-database architecture. The added cache layer manages streaming data in memory and exposes data read/write services externally through a network interface. Adding the in-memory cache layer changes the data flow of the data processing system. On the one hand, collection programs write the collected streaming data into the in-memory cache, and analysis programs read streaming data from the in-memory cache for analysis. On the other hand, the in-memory cache periodically writes the streaming data held in memory to the database for persistent storage.
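The data flow just described (collectors write to the cache, analysis programs read from it, and the cache periodically persists to the database) might be sketched as follows. This is only an illustration under stated assumptions: the class name `MemoryCacheLayer`, the JSON serialization, and the use of an in-memory SQLite database as a stand-in for the central database are all choices made here, not details from the patent, and the network interface is omitted.

```python
import itertools
import json
import sqlite3


class MemoryCacheLayer:
    def __init__(self, db):
        self._ids = itertools.count(1)   # source of unique record IDs
        self.records = {}                # key: ID, value: attribute dict
        self._unflushed = []             # IDs not yet persisted
        self.db = db
        db.execute(
            "CREATE TABLE IF NOT EXISTS stream (id INTEGER PRIMARY KEY, attrs TEXT)"
        )

    def write(self, attrs):
        """Called by a collection program: store one record under a new ID."""
        rid = next(self._ids)
        self.records[rid] = dict(attrs)
        self._unflushed.append(rid)
        return rid

    def read(self, rid):
        """Called by an analysis program: served from memory only."""
        return self.records.get(rid)

    def flush(self):
        """Persist pending records in one batch; would be run periodically."""
        rows = [
            (rid, json.dumps(self.records[rid]))
            for rid in self._unflushed
            if rid in self.records
        ]
        with self.db:                    # commits on success
            self.db.executemany("INSERT INTO stream VALUES (?, ?)", rows)
        self._unflushed.clear()
        return len(rows)
```

A caller would create the layer with `MemoryCacheLayer(sqlite3.connect(":memory:"))`, call `write` for each collected record, and invoke `flush` from a timer. Batching the writes is one plausible way to reduce the number of database interactions; the patent does not specify the flush interval or batching strategy.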

In the online in-memory cache, each streaming data record is organized and stored as a key-value pair. For each record, the in-memory cache assigns a globally unique ID as the key of the record, and what is stored after the key is the information of all attributes of the record. All streaming data is stored in key-value form, and each record is uniquely identified by its key. On top of the key-value storage, the present invention builds a hybrid multi-index structure for streaming data, in which different types of index structure are built on different fields of each record. For the stored streaming data, some queries require a uniqueness lookup on an attribute field, while others require a range query on a field. For fields with uniqueness-lookup requirements, a hash index is built in memory, with the unique field serving as the index value of the hash index; a uniqueness lookup in the hash index structure can then retrieve a record in O(1) (constant) time in the average case. For attribute fields with range-query requirements, a B+ tree index is built on those fields in memory. A range query through the B+ tree index structure can be completed in O(log n) (logarithmic) time in the average case.
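A small Python sketch can make the hybrid index concrete. It is not the patented implementation: a `dict` plays the role of the hash index, and a sorted list searched with the standard `bisect` module is swapped in for the B+ tree. The sorted list gives logarithmic lookup of the range boundaries, but its insertions cost O(n), unlike a real B+ tree. The class name `HybridIndex` and the fields `url` and `timestamp` are hypothetical.

```python
import bisect


class HybridIndex:
    def __init__(self):
        self.records = {}    # primary key-value store: ID -> attributes
        self.by_url = {}     # hash index on a unique field
        self.by_time = []    # sorted (timestamp, ID) pairs; B+ tree stand-in

    def insert(self, rid, attrs):
        self.records[rid] = attrs
        self.by_url[attrs["url"]] = rid
        bisect.insort(self.by_time, (attrs["timestamp"], rid))

    def find_url(self, url):
        """Uniqueness lookup through the hash index (average O(1))."""
        rid = self.by_url.get(url)
        return None if rid is None else self.records[rid]

    def find_time_range(self, lo, hi):
        """Records with lo <= timestamp <= hi, in timestamp order."""
        start = bisect.bisect_left(self.by_time, (lo,))
        end = bisect.bisect_right(self.by_time, (hi, float("inf")))
        return [self.records[rid] for _, rid in self.by_time[start:end]]
```

The two binary searches locate the range boundaries in O(log n); returning the k matching records adds O(k) on top of that.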

The online processing method further includes a dynamic registration step:

After an analysis program reads a streaming data record from the in-memory cache layer, the access flag of that record is checked:

If the record has not been accessed by the analysis program, that is, its flag bit is in the unread state, the record is returned to the analysis program and its flag bit is set to read.

The present invention establishes, in memory, a mechanism for the dynamic registration and deregistration of applications based on access control tags, providing highly scalable streaming reads of data. For streaming data, the invention adds a data access tag to each streaming data record in memory. The data access tag is a 32-bit integer, each bit of which can represent one analysis program's usage of the record. An analysis program must register with the in-memory cache, which assigns it a data access identifier; that is, one bit of the 32-bit integer represents the registered analysis program. Once registration succeeds, the analysis program accesses and uses streaming data through this identifier. To reduce the network bandwidth consumed by duplicate streaming data during processing, no analysis program may access the same record more than once. When data in memory is initialized, every bit of each record's data access tag is 0. After an application accesses a record, the in-memory cache performs a bitwise OR of the record's data access tag with the analysis program's data access identifier and takes the result as the record's current data access control tag. Once an application has accessed a record, it cannot access that record again.
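The registration and deregistration mechanism can be sketched as a small registry that hands out one bit position per analysis program. This is a minimal sketch under assumptions: the class name `AnalyzerRegistry`, the lowest-free-bit allocation policy, and the analyzer names are hypothetical, since the patent does not say how bit positions are chosen or reused.

```python
class AnalyzerRegistry:
    MAX_ANALYZERS = 32  # one bit per analyzer in a 32-bit access tag

    def __init__(self):
        self._bit_of = {}  # analyzer name -> assigned bit position

    def register(self, name):
        """Assign the lowest free bit and return the analyzer's bit mask."""
        if name in self._bit_of:
            return 1 << self._bit_of[name]
        used = set(self._bit_of.values())
        for pos in range(self.MAX_ANALYZERS):
            if pos not in used:
                self._bit_of[name] = pos
                return 1 << pos
        raise RuntimeError("all 32 tag bits are in use")

    def deregister(self, name):
        """Release the analyzer's bit so that it can be assigned again."""
        self._bit_of.pop(name, None)

    def registered_mask(self):
        """Mask with one bit set for every currently registered analyzer."""
        mask = 0
        for pos in self._bit_of.values():
            mask |= 1 << pos
        return mask
```

One design caveat: if a released bit is reassigned, records already in memory may still carry that bit from the previous owner, so a real implementation would need to clear it or delay reuse. The text does not address this case.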

Step 4 includes:

After a streaming data record is read, the access flag of that record is checked:

That is, the present invention establishes an efficient in-memory data cleanup and eviction mechanism that promptly cleans up streaming data resident in memory and improves the availability of the data service. The cleanup mechanism for streaming data in memory distinguishes two cases. In the normal case, the in-memory data cache checks the access control tag of a record in memory; if it finds that all registered analysis programs have used the record, it starts the data cleanup process and deletes the record from memory. In the abnormal case, the in-memory data cache checks the access control tag and finds that some analysis programs have not yet accessed the record; it then evaluates how long the record has been resident in memory. If the record has been resident for longer than the specified time threshold, the data cleanup process is started and the record is deleted from memory; if the residence time does not exceed the threshold, the record is left untouched and remains in memory.
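The two cleanup cases can be expressed as a short Python sketch. The record layout (a dict with `flag` and `arrived_at` keys), the function names, and the explicit `now` parameter are assumptions made for illustration; the patent does not specify the threshold value or how the sweep is triggered.

```python
import time


def should_evict(flag, registered_mask, arrived_at, max_age_seconds, now):
    """Decide whether one record may be deleted from memory."""
    # Normal case: every registered analyzer has its bit set in the flag.
    # Assumption: with no analyzers registered this is vacuously true.
    if (flag & registered_mask) == registered_mask:
        return True
    # Abnormal case: some analyzer has not read the record yet, so evict
    # only once the residence time exceeds the threshold.
    return (now - arrived_at) > max_age_seconds


def sweep(records, registered_mask, max_age_seconds, now=None):
    """Delete evictable records in place and return their IDs."""
    now = time.time() if now is None else now
    evicted = [
        rid
        for rid, rec in records.items()
        if should_evict(rec["flag"], registered_mask,
                        rec["arrived_at"], max_age_seconds, now)
    ]
    for rid in evicted:
        del records[rid]
    return evicted
```

For example, with two analyzers registered (mask `0b11`) and a 60-second threshold, a record with flag `0b11` is evicted immediately, while a record with flag `0b01` stays until it has been resident for more than 60 seconds.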

Fig. 2 is a schematic diagram of the online processing system for streaming data of the present invention. As shown in Fig. 2, the system includes:

An access flag construction module, configured to add an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record, and to record the state of each analysis program's access to the streaming data;

An online in-memory cache layer is added on top of the original central-database architecture. The added cache layer manages streaming data in memory and exposes data read/write services externally through a network interface. Adding the in-memory cache layer changes the data flow of the data processing system. On the one hand, collection programs write the collected streaming data into the in-memory cache, and analysis programs read streaming data from the in-memory cache for analysis. On the other hand, the in-memory cache periodically writes the streaming data held in memory to the database for persistent storage.

In the online in-memory cache, each streaming data record is organized and stored as a key-value pair. For each record, the in-memory cache assigns a globally unique ID as the key of the record, and the value corresponds to all attribute information of the record. All streaming data is stored in key-value form, and each record is uniquely identified by its key. On top of the key-value storage, the present invention builds a hybrid multi-index structure for streaming data, in which different types of index structure are built on different fields of each record. For the stored streaming data, some queries require a uniqueness lookup on an attribute field, while others require a range query on a field. For fields with uniqueness-lookup requirements, a hash index is built in memory, with the unique field serving as the index value of the hash index; a uniqueness lookup in the hash index structure can then retrieve a record in O(1) (constant) time in the average case. For attribute fields with range-query requirements, a B+ tree index is built on those fields in memory. A range query through the B+ tree index structure can be completed in O(log n) (logarithmic) time in the average case.

The online processing system further includes:

If the record has already been accessed by the analysis program, that is, its flag bit is in the read state, the record is not returned to the analysis program; if the record has not been accessed by the analysis program, that is, its flag bit is in the unread state, the flag bit is set to read and the record is returned to the analysis program.

The present invention establishes, in memory, a mechanism for the dynamic registration and deregistration of applications based on access control tags, providing highly scalable streaming reads of data. For streaming data, the invention adds a data access tag to each streaming data record in memory. The data access tag is a 32-bit integer, each bit of which can represent one analysis program's usage of the record. An analysis program must register with the in-memory cache, which assigns it a data access identifier; that is, one bit of the 32-bit integer represents the registered analysis program. Once registration succeeds, the analysis program accesses and uses streaming data through this identifier. To reduce the network bandwidth consumed by duplicate data during the processing of streaming data, no analysis program may access the same record more than once. When data in memory is initialized, every bit of each record's data access tag is 0. After an application accesses a record, the in-memory cache performs a bitwise OR of the record's data access tag with the analysis program's data access identifier and takes the result as the record's current data access control tag. Once an application has accessed a record, it cannot access that record again.

In the in-memory streaming data cleanup module:

That is, the present invention establishes an efficient in-memory data cleanup and eviction mechanism that promptly cleans up streaming data resident in memory and improves the availability of the data service. The cleanup mechanism for streaming data in memory distinguishes two cases. In the normal case, the in-memory data cache checks the access control tag of a record in memory; if it finds that all registered analysis programs have used the record, it starts the data cleanup process and deletes the record from memory, which improves the effective utilization of memory. In the abnormal case, the in-memory data cache checks the access control tag and finds that some analysis programs have not yet accessed the record; it then evaluates how long the record has been resident in memory. If the record has been resident for longer than the specified time threshold, the data cleanup process is started and the record is deleted from memory; if the residence time does not exceed the threshold, the record is left untouched and remains in memory.

Of course, the present invention may have various other embodiments. Without departing from the spirit and essence of the invention, those skilled in the art can make various corresponding changes and modifications according to the invention, but all such changes and modifications shall fall within the scope of protection of the claims of the invention.

Claims

1. An online processing method for streaming data, characterized by comprising:

Step 1, establishing an online in-memory cache layer, extracting attributes from the streaming data, and storing the data in the online in-memory cache layer in a key-value structure;

Step 3, adding an access flag to each indexed streaming data record, the flag indicating the registration status of the different analysis programs with respect to that record, and recording the state of each analysis program's access to the streaming data;

Step 4, data cleanup: if a streaming data record has been accessed by all of the designated analysis programs in the in-memory cache layer, the record is cleaned up.

2. The online processing method of claim 1, characterized in that the online processing method further includes a dynamic registration step:

After an analysis program reads a streaming data record from the in-memory cache layer, the access flag of that record is checked:

If the record has already been accessed by the analysis program, that is, its flag bit is in the read state, the record is not returned to the analysis program;

If the record has not been accessed by the analysis program, that is, its flag bit is in the unread state, the record is returned to the analysis program and its flag bit is set to read.

3. on-line processing method as claimed in claim 1, it is characterised in that the foundation side of the key value structure in the step 1 Formula is：For each stream data, memory cache floor will distribute unique No. ID key as record for which, the key is remembered Record the information of the stream data all properties.

4. The online processing method according to claim 1, characterized in that the hybrid index structure in step 2 is established by combining the key-value structure, a B+ tree index structure, and a hash index structure.

5. The online processing method according to claim 1, characterized in that step 2 comprises:

If querying by field is required: if an interval (range) query needs to be performed on the current attribute, a B+ tree index structure is established for that attribute field; if a primary-key query needs to be performed on the current attribute, a hash index structure is established for that attribute field.
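
The per-field index choice in claim 5 might be sketched as below. Python's standard library has no B+ tree, so a sorted list searched with `bisect` is swapped in here as a stand-in that supports the same range queries; it is not a B+ tree. All class and function names are hypothetical.

```python
import bisect


class HashIndex:
    """Hash index for exact-match (primary-key style) lookups."""

    def __init__(self):
        self._buckets = {}

    def add(self, value, record_id):
        self._buckets.setdefault(value, []).append(record_id)

    def lookup(self, value):
        return list(self._buckets.get(value, []))


class SortedIndex:
    """Stand-in for a B+ tree: parallel sorted lists searched by bisection."""

    def __init__(self):
        self._values = []
        self._ids = []

    def add(self, value, record_id):
        pos = bisect.bisect_right(self._values, value)
        self._values.insert(pos, value)
        self._ids.insert(pos, record_id)

    def range(self, low, high):
        """Return ids whose value lies in the closed interval [low, high]."""
        start = bisect.bisect_left(self._values, low)
        end = bisect.bisect_right(self._values, high)
        return self._ids[start:end]


def build_index(query_kind):
    """Pick an index type from the kind of query the field must serve."""
    if query_kind == "range":
        return SortedIndex()
    if query_kind == "point":
        return HashIndex()
    raise ValueError("unknown query kind: " + repr(query_kind))
```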

6. The online processing method according to claim 1, characterized in that in step 3: the access flag is a 32-bit integer, each bit of which can represent the access state of one analysis program with respect to the stream data; when the stream data in memory is initialized, every bit of the access flag of each piece of stream data is 0;

When an analysis program registers with the memory cache layer, the memory cache layer assigns it an access identifier; after an analysis program accesses a piece of stream data, the memory cache layer performs a bitwise operation on the access flag of the stream data and the access identifier of the analysis program, and uses the result as the current access flag of the stream data.
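
A minimal sketch of the 32-bit access flag follows. The claim only says "a bitwise operation"; this sketch assumes a bitwise OR with a one-hot identifier per analysis program, which is one natural reading rather than something the claim states. `FlagRegistry`, `mark_accessed`, and `has_accessed` are hypothetical names.

```python
MAX_PROGRAMS = 32  # one bit per analysis program in a 32-bit flag


class FlagRegistry:
    """Hands each registered analysis program its own bit (assumed one-hot)."""

    def __init__(self):
        self._bits = {}

    def register(self, name):
        """Return the program's identifier bit, allocating one if needed."""
        if name in self._bits:
            return self._bits[name]
        if len(self._bits) >= MAX_PROGRAMS:
            raise RuntimeError("no free bit left in the 32-bit access flag")
        bit = 1 << len(self._bits)
        self._bits[name] = bit
        return bit

    def all_registered_mask(self):
        """OR together the bits of every registered program."""
        mask = 0
        for bit in self._bits.values():
            mask |= bit
        return mask


def mark_accessed(flag, program_bit):
    """Assumed bitwise operation: OR the program's bit into the flag."""
    return (flag | program_bit) & 0xFFFFFFFF


def has_accessed(flag, program_bit):
    """True if the program's bit is already set in the flag."""
    return (flag & program_bit) != 0
```

Under this encoding a freshly cached record starts with flag 0, and a record has been seen by every registered program exactly when its flag equals `all_registered_mask()`.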

7. The online processing method according to claim 1, characterized in that in step 4:

After a piece of stream data is read, the access flag of the stream data is checked:

If the stream data has been accessed by all registered analysis programs, the stream data is removed from the memory cache layer;

Otherwise, it is checked whether the residence time of the stream data exceeds a threshold: if the threshold is not exceeded, the stream data continues to wait for access by the analysis programs; if the threshold is exceeded, the stream data is removed from the memory cache layer.
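
The cleanup decision of claim 7 can be sketched as a single predicate. The function name `should_evict`, its parameters, and the use of seconds as the time unit are assumptions for illustration; the claim does not specify units or how an empty set of registered programs is handled (here it is treated as "not fully read").

```python
def should_evict(flag, registered_mask, arrived_at, now, max_residence_seconds):
    """Decide whether a cached record should be removed.

    flag: the record's access flag (bits set for programs that read it).
    registered_mask: OR of the bits of all registered analysis programs.
    arrived_at, now: timestamps in seconds (hypothetical unit).
    """
    # Case 1: every registered program has read the record.
    if registered_mask != 0 and (flag & registered_mask) == registered_mask:
        return True
    # Case 2: not fully read, so fall back to the residence-time threshold.
    return (now - arrived_at) > max_residence_seconds
```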

8. An online processing system for stream data, characterized by comprising:

An online memory cache layer construction module, configured to establish an online memory cache layer and to store the stream data in the online memory cache layer according to a key-value structure after attribute extraction is performed on the stream data;

A hybrid index structure establishment module, configured to establish a hybrid index structure for the stream data in the memory cache layer;

An access flag construction module, configured to add an access flag to each piece of stream data for which the index structure has been established, the flag being used to indicate the registration status of different analysis programs with respect to the stream data, while recording the state of each analysis program's access to the stream data;

An in-memory stream data cleanup module, configured to perform a cleanup operation on stream data that has been accessed by all the specified analysis programs in the memory cache layer.

9. The online processing system according to claim 8, characterized in that the online processing system further comprises:

A stream data exit-and-return module, configured to check the access flag of a piece of stream data after it is read: if the stream data has already been accessed by the analysis program, i.e., the flag is in the read state, the stream data is not returned to the analysis program; if the stream data has not been accessed by the analysis program, i.e., the flag is in the unread state, the flag of the stream data is set to the read state and the stream data is returned to the analysis program.

10. The online processing system according to claim 8, characterized in that in the in-memory stream data cleanup module:

After an analysis program reads a piece of stream data from the memory cache layer, the access flag of the stream data is checked: if the stream data has been accessed by all registered analysis programs, the stream data is removed from the memory cache layer; otherwise, it is checked whether the residence time of the stream data exceeds a threshold: if the threshold is not exceeded, the stream data continues to wait for access by the analysis programs; if the threshold is exceeded, the stream data is removed from the memory cache layer.