CN105022699A - Cache region data preprocessing method and system - Google Patents

Cache region data preprocessing method and system

Info

Publication number
CN105022699A
Authority
CN
China
Prior art keywords
data
user
buffer memory
query
lambda
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510412138.7A
Other languages
Chinese (zh)
Other versions
CN105022699B (en)
Inventor
施文进
阎九吉
吴青
王飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Huiyin Science & Technology Co Ltd
ZHENJIANG HUILONG YANGTSE RIVER PORT CO Ltd
WELLONG ETOWN INTERNATIONAL LOGISTICS Co Ltd
Original Assignee
Jiangsu Huiyin Science & Technology Co Ltd
ZHENJIANG HUILONG YANGTSE RIVER PORT CO Ltd
WELLONG ETOWN INTERNATIONAL LOGISTICS Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Huiyin Science & Technology Co Ltd, ZHENJIANG HUILONG YANGTSE RIVER PORT CO Ltd, WELLONG ETOWN INTERNATIONAL LOGISTICS Co Ltd
Priority to CN201510412138.7A
Publication of CN105022699A
Application granted
Publication of CN105022699B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cache region data preprocessing method and system. The method can accurately estimate user query time, user retention time, and user query content. Specifically, the method comprises: recording and constructing basic data and preprocessing the basic data; establishing a least-squares model to simulate user behavior and to predict the data relationships among parameters such as user working time and query content; and storing the data received from the cache input in a cache region and outputting the data from the cache region in first-in, first-out order. The system studies user behavior rules, predicts the user query time, each working time, the query content, and the like, and sets the cache region data in advance according to the predicted information. This optimizes the users' query experience and solves the technical problem of accurately estimating user working time, user retention time, and user query content in an e-commerce data processing system.

Description

Cache region data preprocessing method and system
Technical field
The present invention relates to a data preprocessing method and system, and in particular to a preprocessing method and system applied to cache region data.
Background
Current data preprocessing technology mainly relies on a single technique. E-commerce data, however, is highly bursty and subject to unusually heavy instantaneous load, so relying on a single processing technique produces a very heavy data processing load and cannot meet the demands of e-commerce.
A first-in, first-out (FIFO) queue is a traditional in-order execution scheme: when the cache region is full, the data or instruction that entered the cache region first is executed first and leaves the cache region, and only then is the second data item or instruction executed. A FIFO is a first-in, first-out data buffer. Unlike ordinary memory, it has no external read/write address lines, which makes it very simple to use. Its drawback is that data can only be written sequentially and read sequentially: the data address is advanced by internal read and write pointers that automatically increment by 1, so a specified address cannot be read or written through address lines as in ordinary memory. Nor can a FIFO accurately estimate the user query time, retention time, and query content in an e-commerce data system. The statistical approach applies mathematical statistics to the system access frequency and preferentially keeps active-user information in the cache region; color registers are cached in the cache region whose color corresponds to the memory region of the physical address currently being accessed. This can improve cache utilization and system performance, but the approach still cannot accommodate the characteristics of e-commerce data.
The invention provides a cache region data preprocessing method that uses machine learning to study user behavior rules and to predict the user query time, each working time, the query content, and the like. The system sets the cache region data in advance according to the predicted information, thereby optimizing the user's query experience.
Summary of the invention
Embodiments of the invention provide a cache region data preprocessing method that uses machine learning to study user behavior rules and to predict the user query time, each working time, the query content, and the like. The system sets the cache region data in advance according to the predicted information, thereby optimizing the user's query experience.
To achieve the above object, embodiments of the invention adopt the following technical solutions:
A first aspect of the invention provides a cache region data preprocessing method, comprising:
recording and constructing basic data, and preprocessing the basic data;
establishing a least-squares model to simulate user behavior, and predicting the data relationships among parameters such as user working time and query content;
storing the data received from the cache input in a cache region, and outputting the data from the cache region in first-in, first-out order.
Preferably, according to the first aspect, recording and constructing the basic data specifically comprises:
The basic data are the user query time TimeUserQuery, the user retention time TimeUserStand, and the user query content ContentUserQuery. The interface functions TimeUserQuery, TimeUserStand, and ContentUserQuery are constructed to obtain the client user's query time, retention time, and query content from the origin server. A timer, Timer, is preset in the TimeUserQuery and TimeUserStand functions, and cookie ActiveX techniques are used to obtain the user's query time and retention time for the current behavior. The collected data are sent to the target server asynchronously by GET or POST. The basic data are presented to the target server through the interface in JSON format.
Preferably, the user query content ContentUserQuery specifically comprises:
The system presets all query contents that the user can operate on as one of Loading, Unloading, Cargo, Carrier, and Route, or any combination thereof (different industries and requirements may preset different query contents). The parameters of the ContentUserQuery interface function are Loading, Unloading, Cargo, Carrier, and Route. Different parameter values are returned and displayed according to the user's operations: the return value of a parameter whose query content was performed is set to 1, and the return value of a parameter whose query content was not performed is set to 0.
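The 1/0 return values described above can be illustrated with a minimal Python sketch. The function name `content_user_query` and the dictionary return shape are assumptions made for illustration; the source names only the ContentUserQuery interface and its five parameters.

```python
# Hypothetical helper: the five names come from the text, the rest is assumed.
QUERY_CONTENTS = ("Loading", "Unloading", "Cargo", "Carrier", "Route")


def content_user_query(performed):
    """Map each preset query content to 1 if the user performed it, else 0."""
    performed = set(performed)
    unknown = performed - set(QUERY_CONTENTS)
    if unknown:
        raise ValueError(f"unknown query content: {sorted(unknown)}")
    return {name: int(name in performed) for name in QUERY_CONTENTS}


# A user who queried Cargo and Route only.
flags = content_user_query(["Cargo", "Route"])
print(flags)
```

Encoding the contents as a fixed-order set of indicators keeps every record the same width, which is convenient when the records are later stacked into a data matrix.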
Preferably, according to the first aspect, preprocessing the basic data specifically comprises:
After the target server receives the return values and returned content, the system uses the Parse method of JObject or JArray to convert the JSON string into a JSON object and extracts the basic data from the JSON object. It then analyzes the association between query content and query time in the basic data, that is, it constructs a relationship graph of Loading, Unloading, Cargo, Carrier, and Route against TimeUserQuery and TimeUserStand.
Preferably, according to the first aspect, one possible implementation of constructing the relationship graph of Loading, Unloading, Cargo, Carrier, and Route against TimeUserQuery and TimeUserStand is as follows:
Preferably, in the relationship graph, TimeUserQuery and TimeUserStand are each taken as the dependent variable, and Loading, Unloading, Cargo, Carrier, and Route are taken as the independent variables. Inspection of the graph shows a certain linear regression trend, so the least-squares method is considered for prediction.
Preferably, the least-squares method is a mathematical optimization technique that finds the best-fitting function for the data by minimizing the sum of squared errors. With the least-squares method, unknown data can be obtained easily such that the sum of squared errors between the obtained data and the actual data is minimal, which yields the optimal value of the objective function.
Step 1: The target server receives multiple query operations from one user, who has queried one or more of the query contents. Let the number of query contents be $n$. The times the user spends querying each query content are denoted:
$$T = (t_1, t_2, t_3, \ldots, t_i, \ldots, t_n) \tag{1}$$
where $t_i$ denotes the query time when the user queries the $i$-th query content.
Step 2: For $m$ queries of the query contents by one user, the query time is expressed as:
$$y(t_1, \ldots, t_n; x_0, x_1, \ldots, x_n) = x_0 + x_1 t_1 + \cdots + x_n t_n \tag{2}$$
where $y$ represents the working time the user spends querying the query contents, and $x_0, x_1, \ldots, x_n$ are the model parameters, chosen so that the sum of squared differences between the observed values and the estimated values is minimal; usually $x_0 = 1$ is taken. Written as a system of linear equations:
$$
\begin{aligned}
y_1 &= x_0 + x_1 t_{11} + \cdots + x_j t_{1j} + \cdots + x_n t_{1n} \\
y_2 &= x_0 + x_1 t_{21} + \cdots + x_j t_{2j} + \cdots + x_n t_{2n} \\
&\;\;\vdots \\
y_i &= x_0 + x_1 t_{i1} + \cdots + x_j t_{ij} + \cdots + x_n t_{in} \\
&\;\;\vdots \\
y_m &= x_0 + x_1 t_{m1} + \cdots + x_j t_{mj} + \cdots + x_n t_{mn}
\end{aligned}
\tag{3}
$$
where $y_i$ denotes the query time of the user's $i$-th query and $t_{ij}$ denotes the query time spent on the $j$-th query content in the user's $i$-th query.
Usually the $t_{ij}$ are written as the data matrix $A$, the model parameters $x_j$ as the parameter vector $X$, and the user's query times $y_i$ as $Y$, so that the system of linear equations can be expressed as:
$$
\begin{pmatrix}
1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\
1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix},
\quad \text{i.e. } AX = Y \tag{4}
$$
where
$$
A = \begin{pmatrix}
1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\
1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn}
\end{pmatrix},
\quad
X = \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix},
\quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix}. \tag{5}
$$
Step 3: Fit the real user behavior to obtain the value of the model parameter vector $X$ relating query time and query content.
Using the least-squares model and the estimated values of the model parameters, the estimated value for the user's $i$-th query can be defined as:
$$\hat{y}_i = \hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in}, \quad i = 1, 2, \ldots, m. \tag{6}$$
Setting the partial derivatives to zero gives:
$$
\begin{cases}
\dfrac{\partial Q}{\partial \hat{x}_0} = 0 \\
\dfrac{\partial Q}{\partial \hat{x}_1} = 0 \\
\dfrac{\partial Q}{\partial \hat{x}_2} = 0 \\
\quad\vdots \\
\dfrac{\partial Q}{\partial \hat{x}_n} = 0
\end{cases}
\tag{7}
$$
where
$$Q = \sum_{i=1}^{m} e_i^2 = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{m} \bigl(y_i - (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\bigr)^2 \tag{8}$$
This yields the system of equations for the estimated values of the model parameters:
$$
\begin{cases}
\sum y_i - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in}) = 0 \\
\sum y_i t_{i1} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{i1} = 0 \\
\sum y_i t_{i2} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{i2} = 0 \\
\quad\vdots \\
\sum y_i t_{in} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{in} = 0
\end{cases}
\tag{9}
$$
From (8) and (9), the relationship between the observed values and the estimated values of the time the user spends querying the query contents is:
$$Q = \sum_{i=1}^{m} e_i^2 = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = e^{\mathrm{T}} e = (Y - A\hat{X})^{\mathrm{T}} (Y - A\hat{X}) \tag{10}$$
According to the least-squares principle, the model parameters satisfy:
$$\frac{\partial}{\partial \hat{X}} (Y - A\hat{X})^{\mathrm{T}} (Y - A\hat{X}) = 0 \tag{11}$$
The estimated values of the model parameters are finally obtained as:
$$
\begin{aligned}
\frac{\partial}{\partial \hat{X}} \bigl(Y^{\mathrm{T}} Y - 2\hat{X}^{\mathrm{T}} A^{\mathrm{T}} Y + \hat{X}^{\mathrm{T}} A^{\mathrm{T}} A \hat{X}\bigr) &= 0 \\
A^{\mathrm{T}} Y &= A^{\mathrm{T}} A \hat{X} \\
\hat{X} &= (A^{\mathrm{T}} A)^{-1} A^{\mathrm{T}} Y
\end{aligned}
\tag{12}
$$
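As a small worked illustration of the closed-form estimate $\hat{X} = (A^{\mathrm{T}} A)^{-1} A^{\mathrm{T}} Y$, consider a single query content ($n = 1$) observed in three queries. The numbers below are made up for illustration and do not come from the source.

$$
A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix},
\quad
Y = \begin{pmatrix} 3 \\ 5 \\ 7 \end{pmatrix},
\quad
A^{\mathrm{T}} A = \begin{pmatrix} 3 & 6 \\ 6 & 14 \end{pmatrix},
\quad
A^{\mathrm{T}} Y = \begin{pmatrix} 15 \\ 34 \end{pmatrix}
$$

$$
(A^{\mathrm{T}} A)^{-1} = \frac{1}{3 \cdot 14 - 6 \cdot 6} \begin{pmatrix} 14 & -6 \\ -6 & 3 \end{pmatrix}
= \frac{1}{6} \begin{pmatrix} 14 & -6 \\ -6 & 3 \end{pmatrix}
$$

$$
\hat{X} = \frac{1}{6} \begin{pmatrix} 14 \cdot 15 - 6 \cdot 34 \\ -6 \cdot 15 + 3 \cdot 34 \end{pmatrix}
= \frac{1}{6} \begin{pmatrix} 6 \\ 12 \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \end{pmatrix}
$$

Here $\hat{x}_0 = 1$ and $\hat{x}_1 = 2$, so the fitted model $y = 1 + 2t$ reproduces all three observations exactly and $Q = 0$.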
Step 4: Predict the user's TimeUserQuery time:
$$y = x_0 + \sum_{i=1}^{n} t_i x_i \tag{13}$$
where $t_i$ denotes the query time when the user queries the $i$-th query content and $x_i$ denotes the model parameter corresponding to the $i$-th query content, with $x_0 = 1$. If the user performs only the Cargo operation, the predicted Cargo query time is:
$$y_3 = x_0 + t_3 x_3$$
A SessionId is set in the data table for each of the query contents Loading, Unloading, Cargo, Carrier, and Route. In Step 4 above, the relevant parameter values are obtained directly through the SessionId, and the resulting data are used as the raw input data of the cache region.
Preferably, in a second aspect, a cache region data preprocessing method is provided, further comprising:
The main cache is configured to store the data received from the cache input. The cache controller selectively routes the received data from the cache region to the standby cache, so that the data received from the cache input can be output from the standby cache to the cache output in FIFO order.
Preferably, the standby cache stores the received data from the cache input or the received data held by the main cache, and outputs the received data to the cache output in the same order as in the main cache.
Preferably, the cache controller operates as follows: when the main cache is in the empty state, the main cache transfers data from the cache input to the standby cache; or, when the standby cache is in the full state, the standby cache transfers data from the cache input to the main cache; or, when the main cache is not in the empty state, the received data are transferred from the cache input to the main cache.
Preferably, the main cache and the standby cache are independent FIFO queues that can store different types of data, and the data storage space of the main cache is larger than that of the standby cache.
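One way to picture a main cache, a smaller standby cache, and a controller that preserves FIFO order is the Python sketch below. The routing policy here (fill the main cache first, overflow into the standby cache, refill the main cache from the standby cache on output) is an assumption chosen for illustration and is not the exact set of routing conditions stated in the text; the class and method names are also hypothetical.

```python
from collections import deque


class TwoTierFifoCache:
    """Sketch of a main queue plus a smaller standby queue with global FIFO order."""

    def __init__(self, main_capacity, standby_capacity):
        # The text requires the main cache to be larger than the standby cache.
        if not 0 < standby_capacity < main_capacity:
            raise ValueError("need 0 < standby_capacity < main_capacity")
        self.main = deque()
        self.standby = deque()
        self.main_capacity = main_capacity
        self.standby_capacity = standby_capacity

    def put(self, item):
        """Controller input path; returns False when both tiers are full."""
        if not self.standby and len(self.main) < self.main_capacity:
            self.main.append(item)       # main has room and nothing is waiting
            return True
        if len(self.standby) < self.standby_capacity:
            self.standby.append(item)    # overflow goes to the standby cache
            return True
        return False

    def get(self):
        """Controller output path; emits the oldest item, then refills main."""
        if not self.main:
            raise IndexError("cache is empty")
        item = self.main.popleft()
        if self.standby:
            self.main.append(self.standby.popleft())
        return item


cache = TwoTierFifoCache(main_capacity=3, standby_capacity=2)
accepted = [cache.put(n) for n in range(1, 7)]   # the sixth item finds both tiers full
drained = [cache.get() for _ in range(5)]
```

Because an item enters the standby queue only while the main queue is full, and leaves it only to join the tail of the main queue, the output order always matches the input order.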
Preferably, in a third aspect, a cache region data preprocessing system is provided, comprising:
A transmitter, which sends data to the cache region; a cache region, which receives data from the transmitter and sends the received data to the receiving device in first-in, first-out order; and a receiving device, which receives the data coming from the cache region.
The system first trains on and processes the data. Because the data volume is large, the data are first placed into the cache region by the transmitter.
Preferably, one possible implementation according to the third aspect is:
The cache region comprises a main cache and a standby cache. The main cache is configured mainly to store the data received from the cache input. The standby cache is mainly used to store the received data from the cache input or the received data held by the main cache, and to output the received data to the cache output in the same order as in the main cache.
Preferably, the cache region further comprises a cache controller, which operates as follows: when the main cache is in the full state, the main cache transfers data from the cache input to the standby cache; or, when the standby cache is in the full state, the standby cache transfers data from the cache input to the main cache; or, when the standby cache is not in the full state, the main cache transfers data from the cache input to the standby cache.
Preferably, a second possible implementation according to the third aspect is:
To improve the performance of the system, the data are first continuously trained and preprocessed using the least-squares method, multiple cache regions are configured for the system, and finally the data storage space of the main cache is made larger than that of the standby cache.
Brief description of the drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention.
Fig. 1 is a diagram of part of the function interfaces of a cache region data preprocessing method provided by an embodiment of the invention;
Fig. 2 is a schematic flowchart of solving for the model parameters by the least-squares method, provided by an embodiment of the invention;
Fig. 3 is a schematic flowchart of a cache region data preprocessing method provided by an embodiment of the invention;
Fig. 4 is a schematic structural diagram of a cache region data preprocessing system provided by an embodiment of the invention.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solutions, and the advantages of the invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
Embodiments of the invention provide a cache region data preprocessing method and system. The invention can be used for cache region data preprocessing. First, for a given user's behavior on the platform, parameters such as query content and working time are recorded as basic data and preprocessed. A least-squares model is then established from the preprocessed basic data to simulate user behavior and to predict the data relationships among parameters such as user query time and query content. The resulting data are allocated to the cache region as the data received from the cache input and are output from the cache region in first-in, first-out order.
Specifically, embodiments of the invention provide a cache region data preprocessing method and system. Recording parameters such as query content and working time from a user's behavior on the platform, with reference to Fig. 1, comprises the following:
The user's behavior on the platform is recorded, and the user's query content and working time are recorded as basic data, specifically as follows:
The basic data are the user query time TimeUserQuery, the user retention time TimeUserStand, and the user query content ContentUserQuery. The interface functions TimeUserQuery, TimeUserStand, and ContentUserQuery are constructed to obtain the client user's query time, retention time, and query content from the origin server. A timer, Timer, is preset in the TimeUserQuery and TimeUserStand functions, and cookie ActiveX techniques are used to obtain the user's query time and retention time for the current behavior. The collected data are sent to the target server asynchronously by GET or POST. The basic data are presented to the target server through the interface in JSON format.
The user query content ContentUserQuery specifically comprises the following. The system presets all query contents that the user can operate on as Loading, Unloading, Cargo, Carrier, and Route (different industries and requirements may preset different query contents). The parameters of the ContentUserQuery interface function are Loading, Unloading, Cargo, Carrier, and Route. Different parameter values are returned and displayed according to the user's operations: the return value of a parameter whose query content was performed is set to 1, and the return value of a parameter whose query content was not performed is set to 0.
After the basic data are recorded, they are preprocessed, specifically as follows. After the target server receives the return values and returned content, the system uses the Parse method of JObject or JArray to convert the JSON string into a JSON object and extracts the basic data from the JSON object. It then analyzes the association between query content and query time in the basic data, that is, it constructs a relationship graph of Loading, Unloading, Cargo, Carrier, and Route against TimeUserQuery and TimeUserStand. One possible implementation of constructing this relationship graph is as follows:
In the relationship graph, TimeUserQuery and TimeUserStand are each taken as the dependent variable, and Loading, Unloading, Cargo, Carrier, and Route are taken as the independent variables. Inspection of the graph shows a certain linear regression trend, so the least-squares method is considered for modeling and prediction.
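The JSON parsing and extraction step can be sketched as follows. The text names the Parse method of JObject or JArray; this sketch uses Python's standard `json` module as a stand-in, and the payload layout (a list of records with a nested `ContentUserQuery` object) is an assumption, since the source does not specify the JSON schema.

```python
import json

CONTENTS = ("Loading", "Unloading", "Cargo", "Carrier", "Route")


def extract_basic_data(payload):
    """Parse a JSON string into rows of (content flags, query time, retention time).

    Assumed layout: a JSON array of objects with the keys TimeUserQuery,
    TimeUserStand, and ContentUserQuery (an object of 0/1 flags).
    """
    rows = []
    for record in json.loads(payload):
        content = record["ContentUserQuery"]
        # Missing contents are treated as not queried (flag 0).
        flags = [int(content.get(name, 0)) for name in CONTENTS]
        rows.append((flags,
                     float(record["TimeUserQuery"]),
                     float(record["TimeUserStand"])))
    return rows


sample = (
    '[{"TimeUserQuery": 1.5, "TimeUserStand": 12.0,'
    ' "ContentUserQuery": {"Cargo": 1, "Route": 1}}]'
)
rows = extract_basic_data(sample)
```

Each row pairs the independent variables (the five content flags) with the two candidate dependent variables, which is the tabular form needed to plot or fit the relationship described above.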
An embodiment of the invention provides the modeling and prediction flow of the least-squares method and obtains the optimal solution of the model parameters. With reference to Fig. 2, it comprises the following steps:
The least-squares method is a mathematical optimization technique that finds the best-fitting function for the data by minimizing the sum of squared errors. With the least-squares method, unknown data can be obtained easily such that the sum of squared errors between the obtained data and the actual data is minimal, which yields the optimal value of the objective function.
Step 1: The target server receives multiple query operations from one user, who has queried one or more of the query contents. Let the number of query contents be $n$. The times the user spends querying each query content are denoted:
$$T = (t_1, t_2, t_3, \ldots, t_i, \ldots, t_n) \tag{1}$$
where $t_i$ denotes the query time when the user queries the $i$-th query content.
Step 2: For $m$ queries of the query contents by one user, the query time is expressed as:
$$y(t_1, \ldots, t_n; x_0, x_1, \ldots, x_n) = x_0 + x_1 t_1 + \cdots + x_n t_n \tag{2}$$
where $y$ represents the working time the user spends querying the query contents, and $x_0, x_1, \ldots, x_n$ are the model parameters, chosen so that the sum of squared differences between the observed values and the estimated values is minimal; usually $x_0 = 1$ is taken. Written as a system of linear equations:
$$
\begin{aligned}
y_1 &= x_0 + x_1 t_{11} + \cdots + x_j t_{1j} + \cdots + x_n t_{1n} \\
y_2 &= x_0 + x_1 t_{21} + \cdots + x_j t_{2j} + \cdots + x_n t_{2n} \\
&\;\;\vdots \\
y_i &= x_0 + x_1 t_{i1} + \cdots + x_j t_{ij} + \cdots + x_n t_{in} \\
&\;\;\vdots \\
y_m &= x_0 + x_1 t_{m1} + \cdots + x_j t_{mj} + \cdots + x_n t_{mn}
\end{aligned}
\tag{3}
$$
where $y_i$ denotes the query time of the user's $i$-th query and $t_{ij}$ denotes the query time spent on the $j$-th query content in the user's $i$-th query.
Usually the $t_{ij}$ are written as the data matrix $A$, the model parameters $x_j$ as the parameter vector $X$, and the user's query times $y_i$ as $Y$, so that the system of linear equations can be expressed as:
$$
\begin{pmatrix}
1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\
1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix},
\quad \text{i.e. } AX = Y \tag{4}
$$
where
$$
A = \begin{pmatrix}
1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\
1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn}
\end{pmatrix},
\quad
X = \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix},
\quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix}. \tag{5}
$$
Step 3: Fit the real user behavior to obtain the value of the model parameter vector $X$ relating query time and query content.
Using the least-squares model and the estimated values of the model parameters, the estimated value for the user's $i$-th query can be defined as:
$$\hat{y}_i = \hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in}, \quad i = 1, 2, \ldots, m. \tag{6}$$
Setting the partial derivatives to zero gives:
$$
\begin{cases}
\dfrac{\partial Q}{\partial \hat{x}_0} = 0 \\
\dfrac{\partial Q}{\partial \hat{x}_1} = 0 \\
\dfrac{\partial Q}{\partial \hat{x}_2} = 0 \\
\quad\vdots \\
\dfrac{\partial Q}{\partial \hat{x}_n} = 0
\end{cases}
\tag{7}
$$
where
$$Q = \sum_{i=1}^{m} e_i^2 = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{m} \bigl(y_i - (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\bigr)^2 \tag{8}$$
This yields the system of equations for the estimated values of the model parameters:
$$
\begin{cases}
\sum y_i - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in}) = 0 \\
\sum y_i t_{i1} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{i1} = 0 \\
\sum y_i t_{i2} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{i2} = 0 \\
\quad\vdots \\
\sum y_i t_{in} - \sum (\hat{x}_0 + \hat{x}_1 t_{i1} + \hat{x}_2 t_{i2} + \cdots + \hat{x}_n t_{in})\, t_{in} = 0
\end{cases}
\tag{9}
$$
From (8) and (9), the relationship between the observed values and the estimated values of the time the user spends querying the query contents is:
$$Q = \sum_{i=1}^{m} e_i^2 = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 = e^{\mathrm{T}} e = (Y - A\hat{X})^{\mathrm{T}} (Y - A\hat{X}) \tag{10}$$
According to the least-squares principle, the model parameters satisfy:
$$\frac{\partial}{\partial \hat{X}} (Y - A\hat{X})^{\mathrm{T}} (Y - A\hat{X}) = 0 \tag{11}$$
The estimated values of the model parameters are finally obtained as:
$$
\begin{aligned}
\frac{\partial}{\partial \hat{X}} \bigl(Y^{\mathrm{T}} Y - 2\hat{X}^{\mathrm{T}} A^{\mathrm{T}} Y + \hat{X}^{\mathrm{T}} A^{\mathrm{T}} A \hat{X}\bigr) &= 0 \\
A^{\mathrm{T}} Y &= A^{\mathrm{T}} A \hat{X} \\
\hat{X} &= (A^{\mathrm{T}} A)^{-1} A^{\mathrm{T}} Y
\end{aligned}
\tag{12}
$$
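The closed-form estimate can be computed by solving the normal equations $A^{\mathrm{T}} A \hat{X} = A^{\mathrm{T}} Y$ directly. The following is a minimal pure-Python sketch under stated assumptions: the function names are hypothetical, the sample data are made up, and $x_0$ is estimated together with the other parameters (as the column of ones in $A$ implies) rather than fixed in advance.

```python
def solve(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(vector[i])]
           for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("A^T A is singular; parameters are not identifiable")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_least_squares(t_rows, y):
    """Return [x0, x1, ..., xn] minimizing the sum of squared residuals."""
    a = [[1.0] + [float(t) for t in row] for row in t_rows]   # prepend the 1s column
    a_t = [list(col) for col in zip(*a)]                      # A^T
    ata = [[sum(p * q for p, q in zip(r, c)) for c in a_t] for r in a_t]  # A^T A
    aty = [sum(p * q for p, q in zip(r, y)) for r in a_t]     # A^T Y
    return solve(ata, aty)


# Made-up observations generated from y = 1 + 3*t1 + 2*t2.
t_rows = [[1, 0], [0, 1], [1, 1], [2, 1]]
y = [4, 3, 6, 9]
x_hat = fit_least_squares(t_rows, y)
```

Solving the linear system instead of forming $(A^{\mathrm{T}} A)^{-1}$ explicitly gives the same estimate with less arithmetic; the singularity check flags the case where the recorded queries do not contain enough independent observations to determine every parameter.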
Step 4: Predict the user's TimeUserQuery time:
$$y = x_0 + \sum_{i=1}^{n} t_i x_i \tag{13}$$
where $t_i$ denotes the query time when the user queries the $i$-th query content and $x_i$ denotes the model parameter corresponding to the $i$-th query content, with $x_0 = 1$. If the user performs only the Cargo operation, the predicted Cargo query time is:
$$y_3 = x_0 + t_3 x_3$$
A SessionId is set in the data table for each of the query contents Loading, Unloading, Cargo, Carrier, and Route. In Step 4 above, the relevant parameter values are obtained directly through the SessionId, and the resulting data are used as the raw input data of the cache region.
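The prediction in Step 4 amounts to a weighted sum over the query contents the user actually performed. The sketch below assumes the parameters are held in a list ordered as Loading, Unloading, Cargo, Carrier, Route after $x_0$; the parameter values and the function name are made up for illustration, and the SessionId lookup is not modeled.

```python
CONTENTS = ("Loading", "Unloading", "Cargo", "Carrier", "Route")


def predict_query_time(x, times):
    """Evaluate y = x0 + sum(t_i * x_i) over the contents present in `times`.

    x     -- [x0, x1, ..., x5], one parameter per query content after x0
    times -- dict mapping a content name to its query time t_i;
             contents that were not queried contribute nothing
    """
    total = x[0]
    for i, name in enumerate(CONTENTS, start=1):
        total += x[i] * times.get(name, 0.0)
    return total


# Made-up parameters with x0 = 1, and a user who performs only the Cargo query.
x = [1.0, 0.5, 0.4, 0.8, 0.3, 0.6]
y3 = predict_query_time(x, {"Cargo": 2.0})   # x0 + t3 * x3
```

With only Cargo present, every other term drops out and the sum reduces to the single-content form $y_3 = x_0 + t_3 x_3$.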
An embodiment of the invention provides a cache region data preprocessing method in which the preprocessed basic data serve as the data received at the cache input. The operating flow on the platform, with reference to Fig. 3, comprises the following:
The main cache is configured to store the data received from the cache input. The cache controller selectively routes the received data from the cache region to the standby cache, so that the data received from the cache input can be output from the standby cache to the cache output in FIFO order.
The standby cache stores the received data from the cache input or the received data held by the main cache, and outputs the received data to the cache output in the same order as in the main cache.
The cache controller operates as follows: when the main cache is in the empty state, the main cache transfers data from the cache input to the standby cache;
or
when the standby cache is in the full state, the standby cache transfers data from the cache input to the main cache;
or
when the main cache is not in the empty state, the received data are transferred from the cache input to the main cache.
The main cache and the standby cache are independent FIFO queues that can store different types of data, and the data storage space of the main cache is larger than that of the standby cache.
The storage state of the cache region is updated, and data requests are received.
The cache region data have now been set in advance.
An embodiment of the invention provides a cache region data preprocessing system. With reference to Fig. 4, it comprises the following:
A transmitter, which sends data to the cache region; a cache region, which receives data from the transmitter and sends the received data to the receiving device in first-in, first-out order; and a receiving device, which receives the data coming from the cache region.
The cache region data preprocessing method first trains on and processes the data. Because the data volume is large, the data are first placed into the cache region. The cache region receives data from the transmitter and sends the received data to the receiving device in first-in, first-out order.
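The transmitter, cache region, and receiving device described above can be sketched as a bounded FIFO pipeline. The class and function names are hypothetical, and the policy of forwarding the oldest item whenever the region is full is an assumption about how back-pressure is handled, which the text does not specify.

```python
from collections import deque


class CacheRegion:
    """Bounded FIFO buffer sitting between a transmitter and a receiving device."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._queue = deque()

    def is_full(self):
        return len(self._queue) >= self.capacity

    def is_empty(self):
        return not self._queue

    def receive(self, item):
        """Accept one item from the transmitter; the caller checks is_full first."""
        self._queue.append(item)

    def forward(self):
        """Hand the oldest item to the receiving device."""
        return self._queue.popleft()


def run_pipeline(items, capacity):
    """Send items through the cache region and return them as the receiver sees them."""
    region = CacheRegion(capacity)
    received = []
    for item in items:                      # transmitter side
        if region.is_full():
            received.append(region.forward())
        region.receive(item)
    while not region.is_empty():            # drain what is still buffered
        received.append(region.forward())
    return received


output = run_pipeline(["a", "b", "c", "d"], capacity=2)
```

The receiver sees the items in exactly the order the transmitter sent them, regardless of the capacity chosen, which is the first-in, first-out property the system relies on.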
The cache region comprises a main cache and a standby cache. The main cache is configured mainly to store the data received from the cache input. The standby cache is mainly used to store the received data from the cache input or the received data held by the main cache, and to output the received data to the cache output in the same order as in the main cache.
The cache region further comprises a cache controller, which operates as follows: when the main cache is in the empty state, the main cache transfers data from the cache input to the standby cache; or, when the standby cache is in the full state, the standby cache transfers data from the cache input to the main cache; or, when the main cache is not in the empty state, the received data are transferred from the cache input to the main cache.
To improve the performance of the system, the data are first continuously trained and preprocessed using the least-squares method, multiple caches are configured for the system, and finally the data storage space of the main cache is made larger than that of the standby cache.
The above are preferred embodiments of the invention. It should be understood that persons skilled in the art may make improvements and modifications without departing from the principle of the invention; such improvements and modifications are natural extensions of the invention and shall also be regarded as falling within its scope of protection.

Claims (15)

1. a preprocess method for buffer area data, is characterized in that, comprising:
Record structure foundation data, to basic data pre-service;
Set up LEAST SQUARES MODELS FITTING modelling customer behavior, the data relationship between parameter such as prediction user job time and query contents etc.;
Store and input the data of reception to buffer area from buffer memory, export from described buffer area according to first in first out order.
2. The method according to claim 1, characterized in that recording and constructing the basic data specifically comprises:
the basic data are the user query time TimeUserQuery, the user residence time TimeUserStand and the user query content ContentUserQuery;
the interface functions TimeUserQuery, TimeUserStand and ContentUserQuery are constructed to obtain the client user's query time, residence time and query content from the initial server;
a timer Timer is preset in the TimeUserQuery and TimeUserStand functions, and cookie ActiveX techniques are used to obtain the user's query time and residence time in the current behavior;
the collected data are sent asynchronously to the destination server by GET or POST; the basic data are presented to the destination server through the interfaces in JSON format.
3. The method according to claim 2, characterized in that the user query content ContentUserQuery specifically comprises:
the system presets that every query content of a user operation is one of Loading, Unloading, Cargo, Carrier and Route, or any combination of them; the parameters of the ContentUserQuery interface function are Loading, Unloading, Cargo, Carrier and Route; the parameter values returned and displayed differ according to the user's operating behavior: the return value of a parameter whose query content was performed is set to 1, and the return value of a parameter whose query content was not performed is set to 0.
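The 0/1 encoding of claim 3 and the JSON presentation of claim 2 can be illustrated with a minimal Python sketch. Only the names TimeUserQuery, TimeUserStand, ContentUserQuery and the five query contents come from the claims; the helper functions, the record layout and the sample values are assumptions made for illustration.

```python
import json

# The five query contents named in claim 3.
QUERY_FIELDS = ("Loading", "Unloading", "Cargo", "Carrier", "Route")


def content_user_query(performed):
    """Return a 0/1 flag per query content: 1 if the user performed it, else 0."""
    unknown = set(performed) - set(QUERY_FIELDS)
    if unknown:
        raise ValueError(f"unknown query content: {sorted(unknown)}")
    return {name: int(name in performed) for name in QUERY_FIELDS}


def build_record(performed, time_user_query, time_user_stand):
    """Serialize one observation of the basic data as a JSON string (hypothetical layout)."""
    record = {
        "TimeUserQuery": time_user_query,   # time spent querying, in seconds (assumed unit)
        "TimeUserStand": time_user_stand,   # residence time, in seconds (assumed unit)
        "ContentUserQuery": content_user_query(performed),
    }
    return json.dumps(record)


flags = content_user_query({"Cargo", "Route"})
payload = build_record({"Cargo", "Route"}, time_user_query=2.5, time_user_stand=40.0)
```

Encoding each query content as a 0/1 indicator makes every record directly usable as a row of independent variables in a regression.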
4. The method according to claim 2, characterized in that preprocessing the basic data specifically comprises:
after the destination server receives the return values and the returned content, the system uses the Parse method of JObject or JArray to convert the JSON string into a JSON object and extracts the basic data from that JSON object; the association between query content and query time in the basic data is then analyzed, that is, a relation graph of Loading, Unloading, Cargo, Carrier and Route against TimeUserQuery and TimeUserStand is constructed.
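The parsing step can be sketched in Python as follows. The claim names the Parse method of JObject or JArray; this sketch substitutes Python's standard json module as a stand-in, and the record layout and the parse_records helper are assumptions rather than part of the patent.

```python
import json

QUERY_FIELDS = ("Loading", "Unloading", "Cargo", "Carrier", "Route")


def parse_records(json_text):
    """Parse a JSON array of records into (flags, TimeUserQuery, TimeUserStand) rows."""
    rows = []
    for record in json.loads(json_text):
        content = record["ContentUserQuery"]
        # A missing query content is treated as "not performed" (assumption).
        flags = [int(content.get(name, 0)) for name in QUERY_FIELDS]
        rows.append((flags, float(record["TimeUserQuery"]), float(record["TimeUserStand"])))
    return rows


# Hypothetical sample input in the assumed layout.
sample = json.dumps([
    {"TimeUserQuery": 2.5, "TimeUserStand": 40.0,
     "ContentUserQuery": {"Loading": 0, "Unloading": 0, "Cargo": 1, "Carrier": 0, "Route": 1}},
])
rows = parse_records(sample)
```

Each row pairs the five indicator values with the two measured times, which is the tabular form needed to examine how the times relate to the query contents.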
5. The method according to claim 4, characterized in that constructing the relation graph of Loading, Unloading, Cargo, Carrier and Route against TimeUserQuery and TimeUserStand specifically comprises:
according to the relation graph, TimeUserQuery and TimeUserStand, each taken as the dependent variable, show a linear regression trend against Loading, Unloading, Cargo, Carrier and Route as the independent variables, and predictions are made by the least squares method.
6. The method according to claim 1, characterized in that establishing the least-squares model specifically comprises:
Step 1: the destination server receives multiple query operations from a user, the user having queried one or more of the query contents; if there are n query contents, the times the user spends querying each query content are denoted:
$$T = (t_1, t_2, t_3, \ldots, t_i, \ldots, t_n) \tag{1}$$
where $t_i$ denotes the query time when the user queries the i-th query content;
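Step 2: the query time of a user over m queries of the query contents is expressed as:
$$y(t_1, \ldots, t_n; x_0, x_1, \ldots, x_n) = x_0 + x_1 t_1 + \cdots + x_n t_n \tag{2}$$
where $y$ denotes the working time the user spends querying the query contents and $x_0, x_1, \ldots, x_n$ denote the model parameters, chosen so that the sum of squared differences between the actual and estimated values is minimal; usually $x_0 = 1$ is taken. Written as a system of linear equations:
$$y_i = x_0 + x_1 t_{i1} + x_2 t_{i2} + \cdots + x_n t_{in}, \qquad i = 1, 2, \ldots, m \tag{3}$$
where $y_i$ denotes the query time the user spends on the i-th query of the query contents, and $t_{ij}$ denotes the query time the user spends on the j-th query content in the i-th query;
Usually the $t_{ij}$ are collected into the data matrix $A$, the model parameters $x_i$ into the parameter vector $X$, and the user's query times $y_i$ into $Y$, so that the system of linear equations can be written as:
$$\begin{pmatrix} 1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\ 1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn} \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix}, \quad \text{i.e. } AX = Y \tag{4}$$
where
$$A = \begin{pmatrix} 1 & t_{11} & \cdots & t_{1j} & \cdots & t_{1n} \\ 1 & t_{21} & \cdots & t_{2j} & \cdots & t_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & t_{i1} & \cdots & t_{ij} & \cdots & t_{in} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & t_{m1} & \cdots & t_{mj} & \cdots & t_{mn} \end{pmatrix}, \quad X = \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_j \\ \vdots \\ x_n \end{pmatrix}, \quad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_m \end{pmatrix} \tag{5}$$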
Step 3: the value of the model parameter matrix $X$ that fits the query time and query content of real user behavior is obtained as follows:
With the least-squares model, the estimated value $\hat{y}_i$ of the time the user spends on one query of the query contents can be defined in terms of the estimated values $\hat{x}_k$ of the model parameters:
$$\hat{y}_i = \hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki}, \qquad i = 1, 2, \ldots, n, \quad k = 1, 2, \ldots, m \tag{6}$$
This gives:
$$\begin{cases} \dfrac{\partial Q}{\partial \hat{x}_0} = 0 \\ \dfrac{\partial Q}{\partial \hat{x}_1} = 0 \\ \dfrac{\partial Q}{\partial \hat{x}_2} = 0 \\ \quad \vdots \\ \dfrac{\partial Q}{\partial \hat{x}_k} = 0 \end{cases} \tag{7}$$
where
$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left( y_i - (\hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki}) \right)^2 \tag{8}$$
This yields the system of equations for the estimated values of the model parameters:
$$\begin{cases} \sum y_i - \sum (\hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki}) = 0 \\ \sum y_i t_{1i} - \sum (\hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki})\, t_{1i} = 0 \\ \sum y_i t_{2i} - \sum (\hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki})\, t_{2i} = 0 \\ \quad \vdots \\ \sum y_i t_{ki} - \sum (\hat{x}_0 + \hat{x}_1 t_{1i} + \hat{x}_2 t_{2i} + \cdots + \hat{x}_k t_{ki})\, t_{ki} = 0 \end{cases} \tag{9}$$
From (8) and (9), the relation between the observed and estimated values of the time the user spends querying the query contents is:
$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = e^{T} e = (Y - A\hat{X})^{T} (Y - A\hat{X}) \tag{10}$$
By the principle of least squares, the model parameters satisfy:
$$\frac{\partial}{\partial \hat{X}} (Y - A\hat{X})^{T} (Y - A\hat{X}) = 0 \tag{11}$$
Finally, the estimated value of the model parameters is obtained:
$$\frac{\partial}{\partial \hat{X}} \left( Y^{T} Y - 2 \hat{X}^{T} A^{T} Y + \hat{X}^{T} A^{T} A \hat{X} \right) = 0$$
$$A^{T} Y = A^{T} A \hat{X}, \qquad \hat{X} = (A^{T} A)^{-1} A^{T} Y \tag{12}$$
Step 4: the user's TimeUserQuery time is predicted as:
$$y = x_0 + \sum_{i=1}^{n} x_i t_i$$
where $t_i$ denotes the query time when the user queries the i-th query content and $x_i$ denotes the model parameter corresponding to the i-th query content, with $x_0 = 1$; if the user performs only the Cargo operation, the Cargo query time can be predicted as:
$$y_3 = x_0 + t_3 x_3$$
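Steps 2 to 4 can be sketched numerically with NumPy. The function names and the sample data below are assumptions made for illustration. The sketch also estimates $x_0$ from the data rather than fixing it at 1, and it calls `numpy.linalg.lstsq`, which solves the same minimization as the closed form $(A^{T}A)^{-1}A^{T}Y$ without forming the inverse explicitly.

```python
import numpy as np


def fit_least_squares(t, y):
    """Estimate the parameters of y = x0 + x1*t1 + ... + xn*tn by least squares."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Data matrix A: a leading column of ones followed by the observed t values.
    A = np.column_stack([np.ones(len(t)), t])
    x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x_hat


def predict(x_hat, t_row):
    """Predicted query time y = x0 + sum_i x_i * t_i for one row of inputs."""
    return float(x_hat[0] + np.dot(x_hat[1:], t_row))


# Made-up observations generated exactly from y = 1 + 2*t1 + 3*t2.
t_obs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y_obs = [3.0, 4.0, 6.0, 8.0]
x_hat = fit_least_squares(t_obs, y_obs)
```

Because the sample data lie exactly on the plane $y = 1 + 2t_1 + 3t_2$, the fit recovers those coefficients up to floating-point error; with noisy measurements the same call returns the parameters that minimize $Q$.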
7. The method according to claim 6, characterized in that the user query content specifically comprises:
a SessionId is set in the data table for each of the query contents Loading, Unloading, Cargo, Carrier and Route; in Step 4 above, the related parameter values are obtained directly through the SessionId.
8. The method according to claim 1, characterized in that storing data received from the cache input in the buffer area and outputting the data from the buffer area in first-in, first-out order specifically comprises:
the master cache is arranged to store the data received from the cache input, and the cache controller selectively routes the received data from the buffer area to the backup cache, so that the data received from the cache input are output from the backup cache to the cache output in FIFO order.
9. The method according to claim 8, characterized in that the backup cache stores the received data from the cache input or the received data stored in the master cache, and outputs the received data to the cache output in the same order in which the data were received in the master cache.
10. The method according to claim 8, characterized in that the cache controller specifically comprises:
the cache controller acts so that, when the master cache is in the empty state, data from the cache input are transferred to the backup cache;
or;
when the backup cache is in the full state, data from the cache input are transferred to the master cache;
or;
when the master cache is not empty, the received data are transferred from the cache input to the master cache.
11. The method according to claim 8, characterized in that the master cache and the backup cache are independent FIFO queues that can store different types of data, and the data storage space of the master cache is larger than that of the backup cache.
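One possible reading of claims 8 to 11 is a two-stage FIFO: a small backup cache feeds the cache output, and a larger master cache holds the overflow. The Python sketch below implements that reading and is not a definitive rendering of the claimed controller. The class name, the capacities, and the step that refills the backup cache from the master cache after each output are assumptions.

```python
from collections import deque


class BufferArea:
    """Two-stage FIFO: a large master cache feeding a small backup cache."""

    def __init__(self, master_capacity, backup_capacity):
        if master_capacity <= backup_capacity:
            raise ValueError("master cache must be larger than backup cache")
        self.master = deque()
        self.backup = deque()
        self.master_capacity = master_capacity
        self.backup_capacity = backup_capacity

    def push(self, item):
        """Route one item from the cache input; return False if no room is left."""
        backup_full = len(self.backup) >= self.backup_capacity
        if self.master or backup_full:
            # Master not empty, or backup full: input goes to the master cache.
            if len(self.master) >= self.master_capacity:
                return False
            self.master.append(item)
        else:
            # Master empty and backup has room: input goes straight to the backup cache.
            self.backup.append(item)
        return True

    def pop(self):
        """Emit the oldest item at the cache output, then refill backup from master."""
        item = self.backup.popleft()  # raises IndexError when the buffer area is empty
        while self.master and len(self.backup) < self.backup_capacity:
            self.backup.append(self.master.popleft())
        return item

    def __len__(self):
        return len(self.master) + len(self.backup)


buf = BufferArea(master_capacity=4, backup_capacity=2)
for value in range(1, 6):
    buf.push(value)
drained = [buf.pop() for _ in range(len(buf))]
```

Under these routing rules every item in the backup cache is older than every item in the master cache, so draining the backup cache first preserves arrival order across both queues.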
12. A system for preprocessing buffer area data, characterized by specifically comprising:
a transmitting device, which sends data to the buffer area;
a buffer area, which receives data from the transmitting device and sends the received data to the receiving device in first-in, first-out order;
a receiving device, which receives the data coming from the buffer area;
wherein the system first trains and processes the data and, because the data volume is large, the data are first placed in the buffer area by the transmitting device.
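The transmit, buffer, receive arrangement of claim 12 can be sketched with Python's standard `queue.Queue`, which is a bounded FIFO. The thread layout, the end-of-stream sentinel and the function names are assumptions made for illustration and do not come from the patent.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream (assumption of this sketch)


def transmitter(items, buffer_area):
    """Send every item into the buffer area, then signal completion."""
    for item in items:
        buffer_area.put(item)  # blocks while the buffer area is full
    buffer_area.put(_DONE)


def receiver(buffer_area, received):
    """Take items out of the buffer area in first-in, first-out order."""
    while True:
        item = buffer_area.get()
        if item is _DONE:
            break
        received.append(item)


def run_pipeline(items, buffer_size=4):
    """Run one transmitter and one receiver around a bounded FIFO buffer area."""
    buffer_area = queue.Queue(maxsize=buffer_size)
    received = []
    sender = threading.Thread(target=transmitter, args=(items, buffer_area))
    consumer = threading.Thread(target=receiver, args=(buffer_area, received))
    sender.start()
    consumer.start()
    sender.join()
    consumer.join()
    return received


result = run_pipeline(list(range(10)))
```

The bounded queue lets a fast transmitter run ahead of the receiver only by `buffer_size` items, which is the buffering role the claim assigns to the buffer area when the data volume is large.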
13. The system according to claim 12, characterized by specifically comprising:
the buffer area comprises a master cache and a backup cache; the master cache is configured mainly to store the data received from the cache input; the backup cache is used mainly to store the received data from the cache input or the received data stored in the master cache, and outputs the received data to the cache output in the same order in which the data were received in the master cache.
14. The system according to claim 13, characterized by specifically comprising:
the buffer area also comprises a cache controller; when the master cache is in the empty state, data from the cache input are transferred to the backup cache; or, when the backup cache is in the full state, data from the cache input are transferred to the master cache; or, when the master cache is not empty, the received data are transferred from the cache input to the master cache.
15. The system according to claim 13, characterized by specifically comprising:
the system uses the least squares method to continuously train the data, multiple buffer areas are configured for the system, and the data storage space of the master cache is larger than that of the backup cache.
CN201510412138.7A 2015-07-14 2015-07-14 The preprocess method and system of buffer area data Active CN105022699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510412138.7A CN105022699B (en) 2015-07-14 2015-07-14 The preprocess method and system of buffer area data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510412138.7A CN105022699B (en) 2015-07-14 2015-07-14 The preprocess method and system of buffer area data

Publications (2)

Publication Number Publication Date
CN105022699A true CN105022699A (en) 2015-11-04
CN105022699B CN105022699B (en) 2018-04-24

Family

ID=54412687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510412138.7A Active CN105022699B (en) 2015-07-14 2015-07-14 The preprocess method and system of buffer area data

Country Status (1)

Country Link
CN (1) CN105022699B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784364A (en) * 2016-08-25 2018-03-09 微软技术许可有限责任公司 The asynchronous training of machine learning model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101369276A (en) * 2008-09-28 2009-02-18 杭州电子科技大学 Evidence obtaining method for Web browser caching data
WO2011019759A2 (en) * 2009-08-10 2011-02-17 Visa U.S.A. Inc. Systems and methods for targeting offers
CN103546394A (en) * 2013-10-25 2014-01-29 杭州华三通信技术有限公司 Communication device
CN103714130A (en) * 2013-12-12 2014-04-09 深圳先进技术研究院 Video recommendation system and method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
付关友 等: "个性化服务中基于用户行为分析的用户兴趣建模", 《计算机工程与科学》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784364A (en) * 2016-08-25 2018-03-09 微软技术许可有限责任公司 The asynchronous training of machine learning model
CN107784364B (en) * 2016-08-25 2021-06-15 微软技术许可有限责任公司 Asynchronous training of machine learning models

Also Published As

Publication number Publication date
CN105022699B (en) 2018-04-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Shi Wenjin; Hu Fanghuai; Yan Jiuji; Wu Qing; Wang Fei

Inventor before: Shi Wenjin; Yan Jiuji; Wu Qing; Wang Fei

GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Preprocessing methods and systems for cache data

Effective date of registration: 20230920

Granted publication date: 20180424

Pledgee: China Construction Bank Corporation Zhenjiang Runzhou Sub branch

Pledgor: ZHENJIANG HUILONG YANGTZE RIVER PORT CO.,LTD.

Registration number: Y2023980057474
