CN108108438A - Method and device for identifying behavioral data - Google Patents

Method and device for identifying behavioral data

Info

Publication number
CN108108438A
Authority
CN
China
Prior art keywords
time
behavioral data
value
aggregate
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711388582.5A
Other languages
Chinese (zh)
Inventor
赵明露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sz Hengteng Network Co Ltd
Original Assignee
Sz Hengteng Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sz Hengteng Network Co Ltd
Priority to CN201711388582.5A
Publication of CN108108438A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs

Abstract

This application discloses a method and device for identifying behavioral data, relating to the technical field of data processing. Its main purpose is to solve the following problems in the related art. When behavioral data is identified, MySQL supports only queries, so the business requirements it can serve are narrow and continually changing requirements cannot be met; technical staff must redevelop the corresponding query programs, which wastes substantial development resources. In addition, identifying data in MySQL occupies database resources, and when the database cannot be scaled out it cannot keep up with growing business demand, which causes the system to stall for lack of memory and slows the whole system. Main technical solution: determine the time identifier of the behavioral data to be identified; read the behavioral data corresponding to the time identifier through a preset time window mode in a preset non-relational database, where the preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database.

Description

Method and device for identifying behavioral data
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and device for identifying behavioral data.
Background
As e-commerce has developed, marketing has come to rely increasingly on personalized marketing: in different business scenarios, user behavior such as transaction records can be flexibly analyzed, modeled, and computed to identify a marketing strategy that fits the user's behavior and achieve precision marketing. For example, for a new customer who has only one day of e-commerce shopping records generated through a trading platform app and whose first-order customer unit price is 88-87, a customer-reward marketing strategy is applied to that new customer. To find an accurate marketing strategy, the behavioral data within a specific time range must be identified, for example by using a time window to compute statistics such as marketing cost and profit conversion rate.
At present, behavioral data is identified by storing it in the relational database MySQL and then, for a specified time period and each business requirement, using specific code statements to query the factors that influence the marketing strategy in different time periods, and by building a user behavior list for each business scenario so that statistics and identification can be performed. However, MySQL supports only queries, so the business requirements it can serve are narrow and continually changing requirements cannot be met; technical staff must redevelop the corresponding query programs, which wastes substantial development resources. In addition, identifying data in MySQL occupies database resources, and when the database cannot be scaled out it cannot keep up with growing business demand, which causes the system to stall for lack of memory and slows the whole system.
Summary of the invention
Embodiments of the present invention provide a method and device for identifying behavioral data, which solve the following problems in the related art. When behavioral data is identified, MySQL supports only queries, so the business requirements it can serve are narrow and continually changing requirements cannot be met; technical staff must redevelop the corresponding query programs, which wastes substantial development resources. In addition, identifying data in MySQL occupies database resources, and when the database cannot be scaled out it cannot keep up with growing business demand, which causes the system to stall for lack of memory and slows the whole system.
According to one aspect of the embodiments of the present invention, a method for identifying behavioral data is provided, comprising:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the point in time of the behavioral data corresponding to different behavioral features when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window mode in a preset non-relational database, where the preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database.
Further, before determining the time identifier of the behavioral data to be identified, the method further comprises:
counting the aggregate values of the behavioral data under different time identifiers;
determining whether the time identifier is in an invalid time range and, if it is, keeping the largest aggregate value in the invalid time range and its corresponding time identifier, and deleting the remaining aggregate values in the invalid time range and their corresponding time identifiers;
storing the aggregate values of the behavioral data under different time identifiers in the preset non-relational database in time-series order.
Further, before determining the time identifier of the behavioral data to be identified, the method further comprises:
if the time identifier awaiting reversal is within the valid time range, searching, according to the business process, for behavioral data whose aggregate value under a time identifier within the valid time range is less than or equal to a preset threshold;
adding the current reversal amount to the aggregate value of the behavioral data and updating the result in the preset non-relational database.
Further, after storing the aggregate values of the behavioral data under different time identifiers in the preset non-relational database in time-series order, the method further comprises:
determining whether the aggregate values of the behavioral data under different time identifiers in the preset non-relational database exist in a preset distributed database;
if not, updating the aggregate values into the preset distributed database and performing reversal processing on the behavioral data under the different time identifiers.
Further, reading the aggregate value of the behavioral data corresponding to the time identifier through the preset time window mode in the preset non-relational database comprises:
in a scanning sliding-interval mode, reading the aggregate values of the behavioral data in time order by sliding a window of a first preset window length; or,
in a periodic specific-range mode, reading the aggregate values of the behavioral data in time order at scheduled times with a second preset window length.
Further, counting the aggregate values of the behavioral data under different time identifiers comprises:
obtaining the behavioral data under different time identifiers;
sorting the aggregate values of the different behavioral data under the same time identifier, and taking the largest aggregate value as the aggregate value of the behavioral data under that time identifier.
Further, the method further comprises:
selecting, according to the business scenario, the behavioral features, the first preset window length, and the second preset window length, the program engines that store and read aggregate values for the scanning sliding-interval mode and the periodic specific-range mode.
According to another aspect of the embodiments of the present invention, a device for identifying behavioral data is provided, comprising:
a determination unit, configured to determine the time identifier of the behavioral data to be identified, where the time identifier identifies the point in time of the behavioral data corresponding to different behavioral features when a user transacts in different business scenarios;
a reading unit, configured to read the behavioral data corresponding to the time identifier through a preset time window mode in a preset non-relational database, where the preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database.
According to another aspect of the present invention, a storage device is provided, storing a computer program which, when executed by a processor, implements the following steps:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the point in time of the behavioral data corresponding to different behavioral features when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window mode in a preset non-relational database, where the preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database.
According to another aspect of the present invention, a terminal device is provided, comprising a storage device, a processor, and a computer program stored on the storage device and runnable on the processor, where the processor implements the following steps when executing the program:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the point in time of the behavioral data corresponding to different behavioral features when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window mode in a preset non-relational database, where the preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database.
With the above technical solution, the method and device for identifying behavioral data provided by the present invention, compared with the current approach of storing behavioral data in the relational database MySQL, can read behavioral data by running the preset time window mode according to the determined time identifier of the behavioral data to be identified. This enables diversified behavioral data queries, accommodates changing business requirements without technical staff redeveloping programs, and reduces resource consumption. Because the preset non-relational database can be scaled out as far as needed, system stalls are reduced and running speed improves, which improves the efficiency of identifying behavioral data.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 shows a flow diagram of a method for identifying behavioral data provided by an embodiment of the present invention;
Fig. 2 shows a flow diagram of another method for identifying behavioral data provided by an embodiment of the present invention;
Fig. 3 shows a first schematic diagram of the correspondence between time identifiers and behavioral data provided by an embodiment of the present invention;
Fig. 4 shows a second schematic diagram of the correspondence between time identifiers and behavioral data provided by an embodiment of the present invention;
Fig. 5 shows a schematic diagram of reading in the scanning sliding-interval mode provided by an embodiment of the present invention;
Fig. 6 shows a third schematic diagram of the correspondence between time identifiers and behavioral data provided by an embodiment of the present invention;
Fig. 7 shows a schematic diagram of reading in the periodic specific-range mode provided by an embodiment of the present invention;
Fig. 8 shows a block diagram of a device for identifying behavioral data provided by an embodiment of the present invention;
Fig. 9 shows a block diagram of another device for identifying behavioral data provided by an embodiment of the present invention;
Fig. 10 shows a schematic diagram of the physical structure of a terminal device provided by an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed completely to those skilled in the art.
An embodiment of the present invention provides a method for identifying behavioral data. As shown in Fig. 1, the method includes:
101. Determine the time identifier of the behavioral data to be identified.
The time identifier identifies the point in time of the behavioral data corresponding to different behavioral features when a user transacts in different business scenarios. The behavioral data may include data generated by behaviors the user triggers while using the trading platform, such as order data from purchasing behavior and payment data from payment behavior; the embodiment of the present invention does not specifically limit this. Generally, the method is executed by a server. Because the trading platform may receive behaviors triggered by multiple users at the same point in time, one time identifier can correspond to the behavioral data of multiple users. When behavioral data is to be identified, the time identifier of that data must be determined. The time identifier is the key, which may be specific date data (year, month, day) or a numeric identifier; the embodiment of the present invention does not specifically limit this.
It should be noted that, in e-commerce, the different business scenarios cover the specific transaction content, such as buying goods, paying household bills, and second-hand exchange between users. In these scenarios the trading platform server can push targeted marketing activities to users based on analysis of their behavioral data, encouraging them to make more transactions, so the behavioral data under different time identifiers needs to be identified and analyzed. In addition, because all behavioral data is stored in advance in the preset non-relational database, when the time identifier is determined, the time identifier of the input behavioral data to be identified can be used to look up the time identifiers stored in the preset non-relational database, thereby determining the behavioral data that needs to be identified.
102. Read the behavioral data corresponding to the time identifier through the preset time window mode in the preset non-relational database.
The preset time window mode is a way of identifying, in time-series order, the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database. A time identifier can correspond to multiple behavioral data, and to simplify statistics and analysis each behavioral data can be stored as the aggregate value of its own behavioral feature. For example, if the behavioral feature is the number of purchases, the aggregate value of a user's behavioral data under a time identifier could be a cumulative purchase count of 500 under the key 2017.5.4.21.05. In addition, because a time identifier can correspond to multiple user behaviors, other data besides the aggregate value may also be read, depending on the marketing requirement, such as the user profile or whether the user is a new customer, so that statistics such as the cost and profit conversion rate of the planned marketing activity can be determined from these data; examples include user static attributes, user dynamic attributes, user brand attributes, user behavior attributes, user purchase attributes, and user category behavior. The preset non-relational database is an open-source, log-structured key-value database written in ANSI C that supports networking and can run in memory or with persistence, such as a Redis database. Because data in Redis is stored as a key mapped to a value, the aggregate value is read by exactly matching the time identifier key under the preset time window mode. The preset time window mode determines a time window from a preset window size, step, and time range, and then reads the behavioral data under the time identifiers by moving the time window; the time identifiers and behavioral data stored in time-series order in the preset non-relational database can thus be read one window at a time to obtain the required aggregate values. The embodiment of the present invention does not specifically limit this.
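The exact-match read under a time window can be sketched as follows. This is a minimal illustration, not the patent's implementation: a plain dictionary stands in for the key-value store, the function name `read_window` is hypothetical, and the zero-padded, minute-granularity key format is an assumption modeled on the 2017.5.4.21.05 example.

```python
from datetime import datetime, timedelta

KEY_FORMAT = "%Y.%m.%d.%H.%M"  # assumed key layout: year.month.day.hour.minute


def read_window(store, start, size):
    """Return {time identifier: aggregate value} for each minute in [start, start + size)."""
    result = {}
    t = start
    while t < start + size:
        key = t.strftime(KEY_FORMAT)
        if key in store:  # exact match on the time identifier key
            result[key] = store[key]
        t += timedelta(minutes=1)  # assumed step of one minute
    return result


# Illustrative data only: aggregate purchase counts keyed by time identifier.
store = {
    "2017.05.04.21.05": 500,
    "2017.05.04.21.06": 520,
    "2017.05.04.21.09": 610,
}
window = read_window(store, datetime(2017, 5, 4, 21, 5), timedelta(minutes=3))
```

Moving the window is then a matter of calling `read_window` again with a later `start`.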
With the above technical solution, the method for identifying behavioral data provided by the present invention, compared with the current approach of storing behavioral data in the relational database MySQL, can read behavioral data by running the preset time window mode according to the determined time identifier of the behavioral data to be identified. This enables diversified behavioral data queries, accommodates changing business requirements without technical staff redeveloping programs, and reduces resource consumption. Because the preset non-relational database can be scaled out as far as needed, system stalls are reduced and running speed improves, which improves the efficiency of identifying behavioral data.
An embodiment of the present invention provides another method for identifying behavioral data. As shown in Fig. 2, the method includes:
201. Count the aggregate values of the behavioral data under different time identifiers.
In this embodiment, so that the aggregate values of behavioral data can serve as the modeling basis for marketing strategies, the aggregate values need to be counted by time interval. The time identifier for an aggregate value can be marked according to a preset time interval, that is, set along time dimensions such as year, month, day, hour, and minute; the embodiment of the present invention does not specifically limit this. In addition, because the statistics are computed directly on the preset non-relational database Redis, they can be implemented using the increment operation on Redis string values together with the single-threaded model of Redis: each request obtains the latest aggregate value through INCRBY key increment, which ensures that an aggregate value is not counted twice.
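The counting step can be illustrated with a small in-memory stand-in that mimics the semantics of INCRBY: add the increment to the value stored under the key, treating a missing key as 0, and return the new value. The class `CounterStore` is hypothetical and exists only so the sketch runs without a server; against a real Redis instance the same call would typically be issued through a client library.

```python
class CounterStore:
    """In-memory stand-in for the INCRBY behavior described above (illustrative only)."""

    def __init__(self):
        self._data = {}

    def incrby(self, key, increment):
        # A missing key is treated as 0, then the increment is applied.
        self._data[key] = self._data.get(key, 0) + increment
        return self._data[key]  # each request sees the latest aggregate value


counters = CounterStore()
first = counters.incrby("2017.05.04.21.05", 3)   # three purchases in this minute
second = counters.incrby("2017.05.04.21.05", 2)  # two more under the same time identifier
```

Because each call both updates and returns the aggregate value in one step, the caller never has to read, add, and write back separately, which is what prevents double counting.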
It should be noted that, because the data is stored in Redis, the behavioral data can further be summed, averaged, or reduced to its mode, maximum and minimum, range, quantiles, or coefficients, so that the window time can be custom-configured and behavioral data can be identified for different requirements.
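As a rough illustration of these statistics, the sketch below reduces the aggregate values read from one window using the Python standard library. The function name `summarize` and the sample values are made up for the example; the median stands in for the quantile statistics mentioned above.

```python
import statistics


def summarize(values):
    """Compute summary statistics over the aggregate values read from one window."""
    return {
        "sum": sum(values),
        "mean": statistics.mean(values),
        "mode": statistics.mode(values),
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
        "median": statistics.median(values),  # the 50% quantile
    }


summary = summarize([500, 520, 520, 610])
```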
In this embodiment, step 201 may specifically be: obtain the behavioral data under different time identifiers; sort the aggregate values of the different behavioral data under the same time identifier, and take the largest aggregate value as the aggregate value of the behavioral data under that time identifier.
In this embodiment, to avoid repeatedly accumulating the aggregate values of different behavioral data under the same time identifier, and to avoid having to find the largest aggregate value on every read, which would slow reading, the aggregate values of all behavioral data under the same time identifier are sorted and the largest is taken as the aggregate value of the behavioral data. One time identifier can correspond to multiple behavioral data, and when Redis stores data in a sorted set, each aggregate value corresponds to one time identifier and the aggregate value under a time identifier cannot be modified. The aggregate values of the behavioral data must therefore be sorted and the largest taken as the aggregate value of this behavioral data. If a behavioral data item has only one aggregate value, no sorting or comparison is needed and that value is used directly. For example, if there are multiple aggregate values under one time identifier, the largest is kept and the smaller aggregate values are deleted.
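The rule of keeping only the largest aggregate value per time identifier can be sketched as below. The function name and the sample records are hypothetical; the records are assumed to arrive as (time identifier, aggregate value) pairs.

```python
def max_aggregate_per_identifier(records):
    """Keep only the largest aggregate value seen for each time identifier."""
    best = {}
    for time_id, value in records:
        if time_id not in best or value > best[time_id]:
            best[time_id] = value
    return best


# Illustrative data: two aggregate values were recorded under "21.05".
records = [("21.05", 480), ("21.05", 500), ("21.06", 520)]
largest = max_aggregate_per_identifier(records)
```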
It should be noted that Redis sorted sets provide an ordered data model. Each behavioral data item in a sorted set is associated with a time identifier, and the idea is to use the values accumulated under the time identifiers to sort the aggregate values of the behavioral data in the set from smallest to largest. Although the behavioral data items in a sorted set must be unique, the aggregate values may repeat.
202. Determine whether the time identifier is in the invalid time range; if it is, keep the largest aggregate value in the invalid time range and its corresponding time identifier, and delete the remaining aggregate values in the invalid time range and their corresponding time identifiers.
In this embodiment, data in the invalid time range is cleared, which could leave the behavioral data discontinuous and leave time-difference statistics without a baseline value for the subtraction. To avoid this, one item of data from the invalid time range is retained: the behavioral data with the largest aggregate value among the data to be cleared is kept as the basis for statistics. The invalid time range is the time that has expired, together with its data, including time ranges whose behavioral data has already been identified or whose data has been determined to be invalid, as shown in Fig. 3; the embodiment of the present invention does not specifically limit this. For example, as shown in Fig. 3, time identifiers 1 and 2 are in the invalid time range, so the behavioral data of time identifier 2, which has the largest aggregate value, is retained.
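The pruning rule can be sketched as follows, assuming one aggregate value per time identifier and a known set of identifiers that fall in the invalid time range. The function name `prune_invalid` and the numbers are illustrative only.

```python
def prune_invalid(aggregates, invalid_ids):
    """Drop expired entries but keep the one with the largest aggregate value."""
    expired = {t: v for t, v in aggregates.items() if t in invalid_ids}
    if not expired:
        return dict(aggregates)
    keep = max(expired, key=expired.get)  # time identifier of the largest expired value
    return {t: v for t, v in aggregates.items() if t not in expired or t == keep}


# Time identifiers 1 and 2 are expired, as in the Fig. 3 example; values are made up.
aggregates = {1: 100, 2: 180, 3: 240, 4: 300}
pruned = prune_invalid(aggregates, invalid_ids={1, 2})
delta = pruned[3] - pruned[2]  # the time-difference statistic still has its baseline
```

Keeping that single entry is what lets the first valid window still be differenced against the last known cumulative value.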
203. Store the aggregate values of the behavioral data under different time identifiers in the preset non-relational database in time-series order.
In this embodiment, to meet the need to identify behavioral data in diverse business scenarios, and so that the aggregate values of behavioral data can be read quickly through a time window, the aggregate values of the behavioral data under different time identifiers can be stored in the preset non-relational database. The time series refers to the chronological order in which the time identifiers are generated; through the time series, the behavioral data under each time identifier can be ordered along the time dimension so that it can be read by the preset time window mode.
It should be noted that the preset non-relational database is a Redis database, and data can be stored and read using the algorithms built into Redis; the embodiment of the present invention does not specifically limit this.
204. Determine whether the aggregate values of the behavioral data under different time identifiers in the preset non-relational database exist in the preset distributed database.
In this embodiment, the memory of the preset non-relational database is limited, and the database refreshes its data at preset intervals, so stored behavioral data may be overwritten before it has been read. The data in the preset non-relational database can therefore be backed up to a database with large storage capacity and high-performance data storage. The preset distributed database is such a database, for example MongoDB; the embodiment of the present invention does not specifically limit this.
It should be noted that, to back data up to the preset distributed database promptly and accurately, the aggregate values of the behavioral data under different time identifiers in Redis can be checked at a specified interval to determine whether they already exist in MongoDB. If they do, the backup has already been made. When behavioral data is identified and cannot be read from Redis by its time identifier, the query is redirected to MongoDB. This ensures that behavioral data is stored accurately and is not lost.
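The check-then-back-up logic and the fallback read can be sketched with two dictionaries standing in for the two stores. This is an assumption-laden illustration: `cache` plays the role of Redis, `archive` the role of MongoDB, and both function names are hypothetical.

```python
def backup_missing(cache, archive):
    """Copy aggregate values that are in the cache but not yet in the archive."""
    copied = []
    for time_id, value in cache.items():
        if time_id not in archive:
            archive[time_id] = value
            copied.append(time_id)
    return copied


def read_aggregate(time_id, cache, archive):
    """Read from the cache first; fall back to the archive if the entry is gone."""
    if time_id in cache:
        return cache[time_id]
    return archive.get(time_id)


cache = {"21.05": 500, "21.06": 520}
archive = {"21.05": 500}
copied = backup_missing(cache, archive)  # only "21.06" needs backing up
del cache["21.05"]                       # simulate the cache refreshing its data
value = read_aggregate("21.05", cache, archive)
```

In a deployment the periodic check would run on a timer at the specified interval; the sketch shows a single pass.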
205. If not, update the aggregate values into the preset distributed database and perform reversal processing on the behavioral data under the different time identifiers.
In this embodiment, to ensure that behavioral data is stored continuously and accurately and is not lost, if the aggregate value of the behavioral data under a time identifier exists in Redis but not in MongoDB, the aggregate value of the behavioral data under that time identifier is updated into MongoDB. Reversal processing is the remedial measure taken when a transaction in the behavioral data fails. For example, the terminal has marked an order transaction as successful, but the account on the server side has not responded to the request sent to it, possibly because the transaction timed out. To protect the user's interests, the current order transaction data can be cancelled; if the transaction did succeed, it is rolled back, thereby completing the reversal. Performing reversal processing on the behavioral data under different time identifiers ensures that all data in MongoDB is accurate.
It should be noted that reversal processing is applied not only to data stored in MongoDB but also to data stored in Redis, so that all data is repaired.
206. If the time identifier awaiting reversal is within the valid time range, search, according to the business process, for behavioral data whose aggregate value under a time identifier within the valid time range is less than or equal to the preset threshold.
In this embodiment, to ensure that data is repaired accurately and that batch reversal processing does not interfere with identifying or storing behavioral data, the aggregate values under time identifiers within the valid time range need further repair. The business process is the flow triggered by user behavior in a business scenario. The preset threshold equals the total after the current reversal value has been accumulated, minus the current reversal value. If an aggregate value is less than or equal to the preset threshold, the aggregate value under that time identifier has not yet been repaired, so the reversal repair can be made by adding the current reversal value to the aggregate values of all such behavioral data under the time identifier.
It should be noted that, to avoid the situation described in step 201 after the current reversal value is added, the other aggregate values must be deleted so that the post-reversal aggregate value is the largest. In addition, if the time identifier falls in the invalid time range, reversal is meaningless and need not be performed.
207. Add the current reversal amount to the aggregate value of the behavioral data and update the result into the preset non-relational database.
For example, with time points 1, 5 and 10 as shown in Figure 4, reversal is meaningful only when the time point lies between 5 and 10. Let A be the current reversal amount, B the accumulated reversal total, and C the accumulated total after the current reversal amount has been added. First, B is incremented by A, which returns C. Next, the behavioral data within the effective time period whose aggregate value under the time identifier key is less than or equal to (C - A) are queried. Then A is added to the aggregate value of each such behavioral data item, the results are put into the set, and other aggregate values smaller than the current aggregate value are deleted.
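The A, B, C example above can be sketched in a few lines. This is only an illustrative, in-memory reading of the step and not the patented implementation: the function name `apply_reversal`, the dictionary layout and the sample numbers are assumptions, and overwriting a dictionary entry stands in for deleting the stale smaller aggregate value.

```python
def apply_reversal(aggregates, effective_range, amount_a, total_b):
    """Add the current reversal amount A to every not-yet-repaired aggregate value.

    aggregates: dict mapping a time identifier to its aggregate value (hypothetical layout).
    effective_range: (start, end) time identifiers, inclusive, in which reversal is meaningful.
    """
    start, end = effective_range
    total_c = total_b + amount_a      # increment B by A, which yields C
    threshold = total_c - amount_a    # preset threshold C - A
    repaired = dict(aggregates)
    for time_id, value in aggregates.items():
        # Only entries inside the effective range and at or below the threshold are repaired.
        if start <= time_id <= end and value <= threshold:
            repaired[time_id] = value + amount_a
    return total_c, repaired


# Hypothetical data: time points 1, 5, 7 and 10, effective range 5..10, A = 5, B = 100.
new_total, repaired = apply_reversal({1: 10, 5: 20, 7: 30, 10: 200}, (5, 10), 5, 100)
```

With these sample values, the entry at time point 1 is outside the effective range and the entry at time point 10 is above the threshold, so only the entries at 5 and 7 are increased.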
208. Determine the time identifier of the behavioral data to be identified.
This step is identical to step 101 shown in Figure 1 and is not described again here.
209a. According to the scan-type sliding interval mode, read the aggregate values of the behavioral data in time order by sliding a window of the first preset window length.
In this embodiment of the present invention, because different business requirements affect the marketing strategy differently, different time-window reading modes can be selected. In the scan-type sliding interval mode, the behavioral data with time identifiers stored in redis or MongoDB are read group by group, one window at a time, according to the first preset window length, as shown in Figure 5. The preset window length is the span of time identifiers covered by one read; it may be 1 minute or 5 minutes, and the embodiment of the present invention does not specifically limit it. For example, if the first time window contains the time identifiers a, b, c and d and the window length is 1 minute, the aggregate values of the behavioral data with time identifiers a, b, c and d within that minute are read, then the aggregate values of the behavioral data with time identifiers e, f, g and h in the next minute are read, and so on.
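The group-by-group reading described above behaves like a non-overlapping (tumbling) window. The following is a minimal sketch under that assumption; the generator name `scan_windows`, the `(timestamp, identifier)` record layout and the sample timestamps in seconds are hypothetical and do not come from the patent.

```python
def scan_windows(records, window_length):
    """Yield consecutive, non-overlapping groups of records.

    records: list of (timestamp, identifier) tuples sorted by timestamp.
    window_length: length of one window, in the same unit as the timestamps.
    """
    if not records:
        return
    window_start = records[0][0]
    group = []
    for timestamp, identifier in records:
        # Advance the window until the current record fits into it.
        while timestamp >= window_start + window_length:
            if group:
                yield group
                group = []
            window_start += window_length
        group.append((timestamp, identifier))
    if group:
        yield group


# Eight identifiers, 15 seconds apart, read with a 60-second window.
records = [(0, "a"), (15, "b"), (30, "c"), (45, "d"),
           (60, "e"), (75, "f"), (90, "g"), (105, "h")]
groups = [[ident for _, ident in g] for g in scan_windows(records, 60)]
```

Here the first read covers a, b, c and d and the second covers e, f, g and h, matching the example in the text.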
For example, as shown in Figure 6, adding, deleting or updating a behavioral data item in a Sorted-Set is very fast: the time complexity is logarithmic in the number of behavioral data items in the set. Because the items in a Sorted-Set are kept in order by position, even accessing items in the middle of the set remains very efficient. This reduces the needless load that network and I/O interaction places on the application servers. The values are obtained as follows: a. cumulative total before 5 minutes: ZREVRANGEBYSCORE key 5 0 LIMIT 0 1; b. cumulative total within 5 minutes: (ZREVRANGEBYSCORE key 10 5 LIMIT 0 1) - (ZREVRANGEBYSCORE key 5 0 LIMIT 0 1).
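The two queries above amount to taking the latest cumulative total at or below a score and subtracting two such totals. The sketch below emulates that logic in memory with `bisect` rather than calling redis; the function names, the parallel-list layout and the sample totals are assumptions made for illustration.

```python
import bisect


def latest_cumulative(scores, totals, max_score, min_score=0):
    """Emulate `ZREVRANGEBYSCORE key max min LIMIT 0 1` on parallel sorted lists.

    scores: ascending time scores; totals: cumulative total stored at each score.
    Returns the total at the highest score within [min_score, max_score], or None.
    """
    idx = bisect.bisect_right(scores, max_score) - 1
    if idx < 0 or scores[idx] < min_score:
        return None
    return totals[idx]


def window_sum(scores, totals, start, end):
    """Cumulative total at `end` minus cumulative total at `start`."""
    upper = latest_cumulative(scores, totals, end, start)
    lower = latest_cumulative(scores, totals, start, 0)
    if upper is None:
        return 0.0          # nothing was recorded inside the window
    return upper - (lower or 0.0)


# Hypothetical cumulative totals recorded at minutes 1, 3, 5, 8 and 10.
scores = [1, 3, 5, 8, 10]
totals = [10.0, 25.0, 40.0, 70.0, 95.0]
before_5 = latest_cumulative(scores, totals, 5)
between_5_and_10 = window_sum(scores, totals, 5, 10)
```

Because only cumulative totals are stored, a windowed amount costs two logarithmic lookups and one subtraction, which is the efficiency argument made in the text.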
In this embodiment of the present invention, step 209b is an alternative in parallel with step 209a: according to the periodic specific-range mode, read the aggregate values of the behavioral data in time order at timed intervals using the second preset window length.
In this embodiment of the present invention, because different business requirements affect the marketing strategy differently, different time-window reading modes can be selected. In the periodic specific-range mode, the behavioral data with time identifiers stored in redis or MongoDB are read according to the second preset window length, with the window advancing one step at a time over a fixed number of time identifiers, as shown in Figure 7. The preset window length is the span of time identifiers covered by one read; it may be 1 minute or 5 minutes, and the embodiment of the present invention does not specifically limit it. For example, suppose the time identifiers are a, b, c, d, e, f and g and the first time window covers a, b, c and d. Once the aggregate values of all the behavioral data in that window have been read, the window is moved by a range of 1 minute so that it covers b, c, d and e, the aggregate value of the behavioral data corresponding to time identifier e is read, and so on.
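The window movement in this example (a, b, c, d, then b, c, d, e) can be sketched as an overlapping sliding window that advances by one identifier per step. The generator name `periodic_windows` and the use of a bounded `deque` are illustrative assumptions, not the patent's implementation.

```python
from collections import deque


def periodic_windows(identifiers, window_size):
    """Yield overlapping windows that advance by one identifier at a time."""
    window = deque(maxlen=window_size)   # the oldest identifier drops out automatically
    for identifier in identifiers:
        window.append(identifier)
        if len(window) == window_size:
            yield list(window)


windows = list(periodic_windows("abcdefg", 4))
```

After the first full window, each step brings exactly one new identifier into range, so only the aggregate value for that identifier has to be read again.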
Further, the embodiment of the present invention also includes: selecting, according to the business scenario, the behavioral features, the first preset window length and the second preset window length, a program engine for storing and reading aggregate values for the scan-type sliding interval mode and for the periodic specific-range mode.
In this embodiment of the present invention, to optimize the efficiency and accuracy with which behavioral data are identified through the scan-type sliding interval mode and the periodic specific-range mode, different program engines may be selected for the two modes to perform the actual reading and storage. The program engines include a search aggregation engine and a span sorting engine. The criteria for selecting an engine according to the business scenario, the behavioral features, the first preset window length and the second preset window length may be set as follows: for scenarios with a high time frequency, the span sorting algorithm is preferred over the search aggregation algorithm; window units of second, minute, hour and day preferentially use search aggregation, whereas window units of month and year preferentially use span sorting; a window step m > 30 preferentially uses span sorting, and otherwise search aggregation is preferred. For example, the span sorting algorithm suits data analysis with a large window and a short step, such as the number of orders accepted by the property service in the 40 minutes before the current time. Search aggregation suits cumulative data queries with a small window and a long step, such as the property fees paid from March to May before the current business.
It should be noted that the process by which behavioral data are stored differs between engines. For example, redis storage with the search aggregation engine includes: creating a Storm real-time computation topology stream; assembling, through a hash table, a key that represents a certain business characteristic, for example, for the crowd of new property owners entitled to a major e-commerce promotion discount, with code name RQ-001 and unified internal ID 80001, the key is set to 80001RQ-001; assembling the key-value pairs inside the stored map, where the field is the time converted to the configured sliding unit (unit-dimension conversion), for example, if the statistical dimension is the registration time of new owners who register through the Hengteng Mimi APP and the registration time is 20:00:14 on September 24, 2017, then the field converts to 1506254400000 when the step unit is set to hour and to 1506182400000 when the step unit is set to day; accumulating the total by increment with the redis command HINCRBYFLOAT KEY_NAME FIELD_NAME INCR_BY_NUMBER, for which the corresponding redis API is hincrbyfloat(key, insertion, value); deleting the data in the invalid time range, where all data under the key are queried with the redis command HGETALL [key] and the invalid data are deleted with the redis command HDEL key field1 [field2]; and setting the expiry time of the map so that memory is released, with the redis command EXPIRE [key seconds]. MongoDB storage with the search aggregation engine includes: using BSON as the data type for the hash map; finding the values within the effective time through a range condition, which MongoDB supports; and retrieving by find conditions, which is implemented as follows: query operations on the MongoDB database use the find() or findOne() function and can also query according to different conditions.
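The two field values quoted above are consistent with truncating the registration time to the hour or to the day in UTC+8 and expressing the result in epoch milliseconds. The sketch below reproduces that conversion and emulates the HINCRBYFLOAT accumulation with a plain dictionary. The time zone is an assumption inferred from the numbers, and `bucket_field`, `hincrbyfloat` and the sample increments are hypothetical names and values rather than the patent's code.

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_8 = timezone(timedelta(hours=8))   # assumed time zone, inferred from the quoted values


def bucket_field(moment, unit):
    """Truncate a timestamp to the step unit and return epoch milliseconds."""
    if unit == "hour":
        truncated = moment.replace(minute=0, second=0, microsecond=0)
    elif unit == "day":
        truncated = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError(f"unsupported unit: {unit}")
    return int(truncated.timestamp() * 1000)


def hincrbyfloat(store, key, field, increment):
    """In-memory stand-in for HINCRBYFLOAT: add to a hash field and return the new value."""
    bucket = store.setdefault(key, {})
    bucket[field] = bucket.get(field, 0.0) + increment
    return bucket[field]


registered = datetime(2017, 9, 24, 20, 0, 14, tzinfo=UTC_PLUS_8)
hour_field = bucket_field(registered, "hour")
day_field = bucket_field(registered, "day")

store = {}
hincrbyfloat(store, "80001RQ-001", hour_field, 12.5)
running_total = hincrbyfloat(store, "80001RQ-001", hour_field, 7.5)
```

Every event that falls into the same hour or day maps to the same field, so repeated increments accumulate into a single value per bucket.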
In addition, redis storage with the span sorting engine includes: checking whether a value already exists at the score to be inserted, with the redis command ZCOUNT [key min max], for which the corresponding redis API is zrangeByScore(key, insertion, insertion); deleting before updating, because a zset does not allow duplicate behavioral data to appear in one set, so the value corresponding to a time identifier cannot be modified directly and must first be deleted; the data under the same time dimension are therefore deleted first and the largest member is returned, with the delete command ZREM key member [member...], for which the corresponding redis API is zrem(key, lo); repairing conflicting data, because in some business situations data conflicts may arise and must be resolved first: the insertion time is not necessarily the latest time, so all data after the insertion time must be repaired (for example, when computing a reversed order), that is, every subsequent aggregate value must be accumulated accordingly, with the commands ZREM key member [member...] followed by ZADD key score1 member1 [score2 member2], for which the corresponding redis APIs are zrem(key, lo1) and zadd(key, lo2); inserting the accumulated data with the redis command ZADD key score1 member1 [score2 member2], for which the corresponding redis API is zadd(key, insertion, String.valueOf(insertionValueOf)); deleting the data in the invalid time range, where all data under the key are queried with the redis command HGETALL [key] and the invalid data are deleted with the redis command HDEL key field1 [field2]; and setting the expiry time of the map so that memory is released, with the redis command EXPIRE [key seconds].
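The repair rule above (an insertion that is not the latest must also raise every later aggregate value) can be illustrated with an in-memory series. This is a sketch under the assumption that each score holds the running total up to that time; the class `CumulativeSeries` and its list-based storage are hypothetical and stand in for the ZREM and ZADD round trips on a real sorted set.

```python
import bisect


class CumulativeSeries:
    """Running totals keyed by time score, kept in ascending score order."""

    def __init__(self):
        self.scores = []
        self.totals = []

    def add(self, score, amount):
        idx = bisect.bisect_left(self.scores, score)
        exists = idx < len(self.scores) and self.scores[idx] == score
        if not exists:
            # A new score starts from the running total of the entry just before it.
            previous = self.totals[idx - 1] if idx > 0 else 0.0
            self.scores.insert(idx, score)
            self.totals.insert(idx, previous)
        # The entry at this score and every later entry must include the new amount.
        for i in range(idx, len(self.totals)):
            self.totals[i] += amount


series = CumulativeSeries()
series.add(1, 10.0)
series.add(5, 20.0)
series.add(3, 5.0)    # late arrival: the entry at score 5 is repaired as well
series.add(3, 2.0)    # existing score: the value is replaced, not duplicated
```

Replacing the stored total in place corresponds to the delete-then-insert sequence that the sorted set requires, since a member cannot be modified directly.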
Through the above technical solution, in the further method for identifying behavioral data provided by the present invention, the storage characteristics of redis and MongoDB are used, and different program engines execute the scan-type sliding interval mode and the periodic specific-range mode to read the behavioral data under time identifiers. This meets the requirements of multiple scenarios, multiple dimensions, user-defined time ranges, real-time data and efficient response, and has supported a large number of marketing operation scenarios in real environments with high accuracy. Diversified behavioral data queries are thereby achieved and changing business requirements are met without technical staff having to develop corresponding programs again, which reduces resource consumption. Moreover, the preset non-relational database can be scaled out to the greatest extent, which reduces system stalls and increases system running speed, thereby improving the efficiency of identifying behavioral data.
Further, as an implementation of the method shown in Figure 1 above, an embodiment of the present invention provides a device for identifying behavioral data. As shown in Figure 8, the device includes a determination unit 31 and a reading unit 32.
Determination unit 31, configured to determine the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios;
Reading unit 32, configured to read the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
In a specific application scenario, as shown in Figure 9, the device further includes:
Statistic unit 33, configured to count the aggregate values of the behavioral data under different time identifiers;
First judging unit 34, configured to judge whether the time identifier is in the invalid time range and, if it is, to retain the largest aggregate value and the corresponding time identifier in the invalid time range and delete the remaining aggregate values and corresponding time identifiers in the invalid time range;
Storage unit 35, configured to store the aggregate values of the behavioral data under different time identifiers into the preset non-relational database according to time series.
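The pruning performed by the first judging unit can be sketched as follows. The sketch assumes that the invalid time range is everything earlier than a cutoff, which the text does not state explicitly; the function name `prune_invalid_range` and the dictionary layout are likewise hypothetical.

```python
def prune_invalid_range(aggregates, cutoff):
    """Keep valid entries; of the invalid ones keep only the largest aggregate value.

    aggregates: dict mapping a time identifier to its aggregate value.
    cutoff: time identifiers below this value are treated as the invalid range (assumption).
    """
    expired = {t: v for t, v in aggregates.items() if t < cutoff}
    kept = {t: v for t, v in aggregates.items() if t >= cutoff}
    if expired:
        # Retain the largest aggregate value together with its time identifier.
        largest_time = max(expired, key=expired.get)
        kept[largest_time] = expired[largest_time]
    return kept


pruned = prune_invalid_range({1: 10.0, 2: 25.0, 3: 40.0, 8: 70.0, 10: 95.0}, cutoff=5)
```

Keeping one baseline value from the invalid range means later windowed differences can still be computed after the older entries are deleted.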
In a specific application scenario, the device further includes:
Searching unit 36, configured to, if the time identifier to be reversed is within the effective time range, search, according to the business flow, for the behavioral data whose aggregate value under the time identifiers within the effective time range is less than or equal to a preset threshold;
Updating unit 37, configured to add the current reversal amount to the aggregate value of the behavioral data and update the result into the preset non-relational database;
Second judging unit 38, configured to judge whether the aggregate values of the behavioral data under different time identifiers in the preset non-relational database exist in a preset distributed database;
Processing unit 39, configured to, if they do not exist, update the aggregate values into the preset distributed database and carry out reversal processing on the behavioral data under the different time identifiers.
In a specific application scenario, the reading unit 32 includes:
Sliding read module 3201, configured to read, according to the scan-type sliding interval mode, the aggregate values of the behavioral data in time order by sliding a window of the first preset window length; or,
Timed read module 3202, configured to read, according to the periodic specific-range mode, the aggregate values of the behavioral data in time order at timed intervals using the second preset window length.
In a specific application scenario, the statistic unit 33 includes:
Acquisition module 3301, configured to obtain the behavioral data under different time identifiers;
Determining module 3302, configured to sort the aggregate values of the different behavioral data under the same time identifier and determine the largest aggregate value as the aggregate value of the behavioral data under that time identifier.
In a specific application scenario, the device further includes:
Selection unit 310, configured to select, according to the business scenario, the behavioral features, the first preset window length and the second preset window length, a program engine for storing and reading aggregate values for the scan-type sliding interval mode and for the periodic specific-range mode.
Through the above technical solution, in the device for identifying behavioral data provided by the present invention, the storage characteristics of redis and MongoDB are used, and different program engines execute the scan-type sliding interval mode and the periodic specific-range mode to read the behavioral data under time identifiers. This meets the requirements of multiple scenarios, multiple dimensions, user-defined time ranges, real-time data and efficient response, and has supported a large number of marketing operation scenarios in real environments with high accuracy. Diversified behavioral data queries are thereby achieved and changing business requirements are met without technical staff having to develop corresponding programs again, which reduces resource consumption. Moreover, the preset non-relational database can be scaled out to the greatest extent, which reduces system stalls and increases system running speed, thereby improving the efficiency of identifying behavioral data.
Based on the method shown in Figure 1 above, an embodiment of the present invention correspondingly further provides a storage device on which a computer program is stored. When executed by a processor, the program implements the following steps: determining the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios; and reading the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
Based on the embodiments of the method shown in Figure 1 and the device shown in Figure 8, an embodiment of the present invention further provides the physical apparatus of a terminal device. As shown in Figure 10, the terminal device includes a processor 41, a storage device 42, and a computer program stored on the storage device 42 and executable on the processor. When executing the program, the processor 41 implements the following steps: collecting the short-message feature information of pseudo-base-station short messages sent by different clients; determining, according to the short-message feature information, the base station location information corresponding to the pseudo base station; and determining, according to the base station location information, the distribution and historical track of pseudo base stations. The terminal device further includes a bus 43 configured to couple the processor 41 and the storage device 42.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description above of a specific language is given to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the method and device for identifying behavioral data according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words may be interpreted as names.

Claims (10)

1. A method for identifying behavioral data, characterized by comprising:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
2. The method according to claim 1, characterized in that, before the determining of the time identifier of the behavioral data to be identified, the method further comprises:
counting the aggregate values of the behavioral data under different time identifiers;
judging whether the time identifier is in the invalid time range and, if it is, retaining the largest aggregate value and the corresponding time identifier in the invalid time range and deleting the remaining aggregate values and corresponding time identifiers in the invalid time range;
storing the aggregate values of the behavioral data under different time identifiers into the preset non-relational database according to time series.
3. The method according to claim 2, characterized in that, before the determining of the time identifier of the behavioral data to be identified, the method further comprises:
if the time identifier to be reversed is within the effective time range, searching, according to the business flow, for the behavioral data whose aggregate value under the time identifiers within the effective time range is less than or equal to a preset threshold;
adding the current reversal amount to the aggregate value of the behavioral data and updating the result into the preset non-relational database.
4. The method according to claim 3, characterized in that, after the storing of the aggregate values of the behavioral data under different time identifiers into the preset non-relational database according to time series, the method further comprises:
judging whether the aggregate values of the behavioral data under different time identifiers in the preset non-relational database exist in a preset distributed database;
if they do not exist, updating the aggregate values into the preset distributed database and carrying out reversal processing on the behavioral data under the different time identifiers.
5. The method according to any one of claims 1 to 4, characterized in that the reading of the aggregate values of the behavioral data corresponding to the time identifier through the preset time window scheme in the preset non-relational database comprises:
reading, according to the scan-type sliding interval mode, the aggregate values of the behavioral data in time order by sliding a window of the first preset window length; or,
reading, according to the periodic specific-range mode, the aggregate values of the behavioral data in time order at timed intervals using the second preset window length.
6. The method according to claim 2, characterized in that the counting of the aggregate values of the behavioral data under different time identifiers comprises:
obtaining the behavioral data under different time identifiers;
sorting the aggregate values of the different behavioral data under the same time identifier and determining the largest aggregate value as the aggregate value of the behavioral data under that time identifier.
7. The method according to claim 5, characterized in that the method further comprises:
selecting, according to the business scenario, the behavioral features, the first preset window length and the second preset window length, a program engine for storing and reading aggregate values for the scan-type sliding interval mode and for the periodic specific-range mode.
8. A device for identifying behavioral data, characterized by comprising:
a determination unit, configured to determine the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios;
a reading unit, configured to read the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
9. A storage device on which a computer program is stored, characterized in that the program, when executed by a processor, implements the following steps:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
10. A terminal device, comprising a storage device, a processor and a computer program stored on the storage device and executable on the processor, characterized in that the processor, when executing the program, implements the following steps:
determining the time identifier of the behavioral data to be identified, where the time identifier identifies the time point of the behavioral data corresponding to each behavioral feature when a user transacts in different business scenarios;
reading the behavioral data corresponding to the time identifier through a preset time window scheme in a preset non-relational database, where the preset time window scheme is a manner of identifying the aggregate values of multiple consecutive behavioral data stored in the preset non-relational database according to time series.
CN201711388582.5A 2017-12-20 2017-12-20 The recognition methods of behavioral data and device Pending CN108108438A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711388582.5A CN108108438A (en) 2017-12-20 2017-12-20 The recognition methods of behavioral data and device

Publications (1)

Publication Number Publication Date
CN108108438A true CN108108438A (en) 2018-06-01

Family

ID=62210650

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711388582.5A Pending CN108108438A (en) 2017-12-20 2017-12-20 The recognition methods of behavioral data and device

Country Status (1)

Country Link
CN (1) CN108108438A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109905292A (en) * 2019-03-12 2019-06-18 北京奇虎科技有限公司 A kind of terminal device recognition methods, system and storage medium
CN110866033A (en) * 2018-08-28 2020-03-06 北京国双科技有限公司 Feature determination method and device for predicting query resource occupancy
CN111064789A (en) * 2019-12-18 2020-04-24 北京三快在线科技有限公司 Data migration method and system
CN112084219A (en) * 2020-09-16 2020-12-15 京东数字科技控股股份有限公司 Method, apparatus, electronic device, and medium for processing data
WO2021073510A1 (en) * 2019-10-15 2021-04-22 深圳前海微众银行股份有限公司 Statistical method and device for database
CN116955738A (en) * 2023-09-19 2023-10-27 北京华鑫杰瑞计算机系统工程有限公司 User behavior prediction system based on network footprint analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101082940A (en) * 2006-05-31 2007-12-05 株式会社日立制作所 Work movement analysis method, work movement analysis apparatus, and work movement analysis program
US20130339302A1 (en) * 2012-06-18 2013-12-19 Actifio, Inc. System and method for intelligent database backup
CN104699718A (en) * 2013-12-10 2015-06-10 阿里巴巴集团控股有限公司 Method and device for rapidly introducing business data
CN106227765A (en) * 2016-07-13 2016-12-14 广州唯品会网络技术有限公司 The implementation method that time window is accumulative

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866033A (en) * 2018-08-28 2020-03-06 北京国双科技有限公司 Feature determination method and device for predicting query resource occupancy
CN110866033B (en) * 2018-08-28 2022-06-21 北京国双科技有限公司 Feature determination method and device for predicting query resource occupancy
CN109905292A (en) * 2019-03-12 2019-06-18 北京奇虎科技有限公司 Terminal device identification method, system and storage medium
CN109905292B (en) * 2019-03-12 2021-08-10 北京奇虎科技有限公司 Terminal equipment identification method, system and storage medium
WO2021073510A1 (en) * 2019-10-15 2021-04-22 深圳前海微众银行股份有限公司 Statistical method and device for database
CN111064789A (en) * 2019-12-18 2020-04-24 北京三快在线科技有限公司 Data migration method and system
CN111064789B (en) * 2019-12-18 2022-09-20 北京三快在线科技有限公司 Data migration method and system
CN112084219A (en) * 2020-09-16 2020-12-15 京东数字科技控股股份有限公司 Method, apparatus, electronic device, and medium for processing data
CN116955738A (en) * 2023-09-19 2023-10-27 北京华鑫杰瑞计算机系统工程有限公司 User behavior prediction system based on network footprint analysis
CN116955738B (en) * 2023-09-19 2023-12-08 北京华鑫杰瑞计算机系统工程有限公司 User behavior prediction system based on network footprint analysis

Similar Documents

Publication Publication Date Title
CN108108438A (en) The recognition methods of behavioral data and device
US7660459B2 (en) Method and system for predicting customer behavior based on data network geography
US7080052B2 (en) Method and system for sample data selection to test and train predictive algorithms of customer behavior
US20180341898A1 (en) Demand forecast
CN107689008A (en) Method and device for predicting user insurance behavior
CN104508694B (en) Revenue targeting systems and methods based on mobile application usage
Çavdar et al. Airline customer lifetime value estimation using data analytics supported by social network information
CN111090822A (en) Business object pushing method and device
CN109213771A (en) Method and apparatus for updating portrait labels
CN108932625A (en) Method, device, medium and electronic equipment for analyzing user behavior data
JP2018139036A (en) Analysis device
CN110069676A (en) Keyword recommendation method and device
CN108154311A (en) Method and device for identifying top-tier customers based on random forest and decision tree
Turkmen et al. Intermittent demand forecasting with deep renewal processes
US11222073B2 (en) System and method of creating different relationships between various entities using a graph database
CN115330300A (en) Purchase order processing method, device, equipment and storage medium
CN110427545B (en) Information pushing method and system
US7197471B2 (en) System and method for assessing demographic data accuracy
US20140136280A1 (en) Predictive Tool Utilizing Correlations With Unmeasured Factors Influencing Observed Marketing Activities
KR20220073262A (en) System and method for recommending linked discount rates and advertisement profit models optimized for stores using AI
Laughery et al. Simulation of service systems
CN110378701A (en) Control method and system for novel customer relationship management
US20150006342A1 (en) Generating a Simulated Invoice
CN114969486B (en) Corpus recommendation method, apparatus, device and storage medium
Amadi et al. Analysis of online discount sales and price optimization using cognitive learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 2022-07-19)