Detailed Description of Embodiments
To make the purpose, technical solution, and advantages of the present application clearer, the technical solution of the application is described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in the application, without creative effort, fall within the scope of protection of the application.
The solution of the application uses a Bloom filter. For ease of understanding, the Bloom filter is explained first.
A Bloom filter consists of a binary vector and multiple different hash functions. Initially, all bits of the binary vector are 0. Each element of an element set can be mapped by each hash function of the Bloom filter to one bit position in the binary vector of the Bloom filter, and after each mapping the bit at the mapped position is set to 1.
To judge whether a given element exists in the element set, it suffices to map the element through each hash function to a bit position in the binary vector and check whether all of the mapped bits are 1. If so, the element is determined to be very likely present in the element set; otherwise, the element is determined to be absent from the element set. This process may be described as filtering the element with the Bloom filter: determining that the element is very likely present in the element set means the element has not been filtered out, and determining that the element is absent from the element set means the element has been filtered out.
It follows that the binary vector of a Bloom filter stores only the membership relation of the elements, not the elements themselves. The storage overhead required is therefore small compared with that of storing the elements themselves.
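The add and filter operations described above can be sketched as follows. This is a minimal in-memory illustration, not the persisted form used later in the application; the class name `BloomFilter` and the way the k hash functions are derived (SHA-256 salted with the hash index) are assumptions made only for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit vector of length m plus k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # every bit starts at 0

    def _positions(self, element: str) -> list[int]:
        # Assumption: derive k hash functions by salting SHA-256 with the index.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{element}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def add(self, element: str) -> None:
        # Set every mapped bit to 1.
        for pos in self._positions(element):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, element: str) -> bool:
        # False: definitely absent (filtered out). True: very likely present.
        return all((self.bits[pos // 8] >> (pos % 8)) & 1 for pos in self._positions(element))
```

Only the bit vector is kept; the elements themselves are never stored, which is the source of the storage saving noted above.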
Fig. 1 is a schematic diagram of the principle of a Bloom filter. In Fig. 1, {x, y, z} denotes the hash functions of the Bloom filter (three hash functions x, y, z are taken as an example). Through the hash-function mappings, the binary vector of the Bloom filter holds the membership relation of an element set. w is an element whose presence in the element set is to be judged. As can be seen, of the 3 bit positions in the binary vector to which w is mapped by the 3 hash functions, one is 0. It can therefore be determined that w is not in the element set; that is, the Bloom filter has filtered out w.
It should be noted that a Bloom filter has a small probability of misjudging an absent element as present in the element set (a false positive). In the scenario of the solution of the application, however, the message to be sent is not required to reach every target user, so such misjudgments can be tolerated, as described in detail later in the solution of the application.
Let k be the number of hash functions of a Bloom filter, n the number of elements to be filtered, and m the number of bits of the binary vector of the Bloom filter. The false-positive probability of the Bloom filter is then as shown in Fig. 2. Fig. 2 is a schematic diagram of the false-positive probability of a Bloom filter (only a commonly used part is shown); specifically, it shows the false-positive probability of the Bloom filter for k = [1, 8] over the value range given in the figure.
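Fig. 2 is not reproduced here. As a rough cross-check on such figures, the widely used approximation p ≈ (1 - e^(-kn/m))^k can be evaluated directly. The sketch below is not taken from the application: the function name is hypothetical, and the formula is the standard textbook approximation rather than the exact source of Fig. 2.

```python
import math


def false_positive_rate(k: int, bits_per_element: float) -> float:
    """Approximate false-positive probability (1 - e^(-k*n/m))^k.

    bits_per_element is the ratio m/n: vector bits available per stored element.
    """
    return (1.0 - math.exp(-k / bits_per_element)) ** k
```

For k = 8 hash functions and 20 bits per element, this approximation gives roughly 0.00014.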
Fig. 3 is a schematic flowchart of a message sending determination method provided by an embodiment of the application. The flow may be executed by a server or a client. Devices that can host the server or client include but are not limited to: personal computers, medium and large computers, computer clusters, mobile phones, tablet computers, smart wearable devices, in-vehicle devices, and the like.
The flow in Fig. 3 may include the following steps:
S301: Obtain the identifier of a target user of a message to be sent.
In the embodiments of the application, the message to be sent may be one specific message (for example, one marketing message) or one class of messages (for example, the class of marketing messages related to a certain business). Of course, the message to be sent is not limited to marketing messages and may be a message with other content; a marketing message is merely one example of a message to be sent when the solution of the application is applied to the scenario described in the background.
In the embodiments of the application, a target user may be a user to whom the message to be sent has already been sent (referred to as a sent user of the message to be sent), or a user to whom the message to be sent has not yet been sent (referred to as an unsent user of the message to be sent).
For the identifier of any target user of the message to be sent, the flow in Fig. 3 may be executed one or more times. For example, a set containing the identifiers of multiple target users may be determined in advance, and each time one identifier is randomly selected from the set for executing the flow in Fig. 3, and so on.
In the embodiments of the application, the identifier includes but is not limited to: an email address, a mobile phone number, an instant messaging account, and the like.
S302: Determine the data table corresponding to the message to be sent, where the data table stores the binary vector of at least one Bloom filter, and the Bloom filter is used to filter out the identifiers of unsent users of the message to be sent.
It should be emphasized that, in the embodiments of the application, the Bloom filter used differs from an existing Bloom filter. An existing Bloom filter is held temporarily in memory and is lost on power failure; moreover, because memory capacity is very limited, the binary vector of an existing Bloom filter cannot be very long. In the solution of the application, by contrast, the binary vector of the Bloom filter is stored in a data table (Database Table), so the binary vector is persisted and is not lost on power failure. In addition, the storage (usually a hard disk) on which the data table resides has far greater capacity than memory, so the binary vector can be very long, reaching millions or even tens of millions of bits, which helps reduce the false-positive probability of the Bloom filter. Unless otherwise specified, the Bloom filters mentioned below are Bloom filters with persisted binary vectors, not existing Bloom filters.
In the embodiments of the application, user identifiers are the elements described above. The binary vector of a Bloom filter stored in the data table corresponding to the message to be sent holds the membership relation of the identifiers of sent users of the message to be sent; that is, it reflects the identifier set formed by the identifiers of sent users of the message to be sent. The Bloom filter is therefore used to filter out the identifiers of unsent users of the message to be sent.
In the embodiments of the application, "sent user of the message to be sent" and "unsent user of the message to be sent" can be further subdivided by time range. For the same target user, if the message to be sent was sent to the user within one time range but was not sent to the user within another time range, the user is a "sent user of the message to be sent within the one time range" and an "unsent user of the message to be sent within the other time range".
If the data table corresponding to the message to be sent stores the binary vector of only one Bloom filter, multiple time ranges cannot be distinguished. For any user, whether the message to be sent has been sent to the user once or many times, the user is a "sent user of the message to be sent"; if the message to be sent has never been sent to the user, the user is an "unsent user of the message to be sent".
If the data table corresponding to the message to be sent stores the binary vectors of multiple Bloom filters, multiple time ranges can be distinguished: each binary vector corresponds to one time range, and each Bloom filter is used to filter out the unsent users of the message to be sent within the time range corresponding to its binary vector.
S303: Determine the hash functions of the Bloom filter, and filter the identifier of the target user using the Bloom filter formed by the binary vector and the hash functions.
In the embodiments of the application, step S303 may be performed separately for one or more of the at least one Bloom filter. The identifier of the target user is filtered in order to judge whether the identifier exists in the identifier set of sent users of the message to be sent for one or more time ranges, that is, whether the target user is a sent user of the message to be sent within one or more time ranges.
In the embodiments of the application, the hash functions of a Bloom filter may be stored in the data table or elsewhere (for example, in memory or in a configuration file); the application does not limit this. In addition, "determining the hash functions of the Bloom filter" may be performed in advance and need not be performed within step S303.
S304: Determine, according to the filtering result, whether to send the message to be sent to the target user.
In the embodiments of the application, the filtering result can be used to determine how the message to be sent has previously been sent to the target user, and on that basis (possibly in combination with further rules) whether to send the message to be sent to the target user this time.
In the embodiments of the application, a Bloom filter can by itself express a fairly basic predetermined message fatigue level, which limits sending of the message to be sent to at most once per unit time (for example, 1 day) for each target user. In this case, each Bloom filter may correspond to one unit time. Suppose the current time falls within a certain unit time: if the message to be sent was already sent to the target user earlier within that unit time, then filtering with the Bloom filter corresponding to that unit time does not filter out the target user, and it can be determined that the message to be sent is not to be sent to the target user.
Further, as can be seen from the background, a more complex predetermined message fatigue level may also exist. For example, in addition to limiting sending of the message to be sent to at most once per unit time for each target user, a more complex predetermined message fatigue level also limits the maximum number of times the message to be sent may be sent to the target user over multiple unit times. In this case, step S304 needs to combine the filtering result with the more complex predetermined message fatigue level to determine whether to send the message to be sent to the target user.
In the embodiments of the application, the steps in Fig. 3 may all be executed by the same device, or different devices may execute different steps. For example, steps S301 to S304 are all executed by device 1; or steps S301 and S302 are executed by device 1 and steps S303 and S304 are executed by device 2; and so on.
In addition, the steps in Fig. 3 need not be executed in the order S301 to S304. For example, step S302 may be executed first and then step S301; and so on.
With the above method, whether to send the message to be sent to a given target user can be determined on the basis of the binary vector of the Bloom filter stored in the data table corresponding to the message to be sent and the corresponding hash functions, without storing detailed records such as the identifiers of sent users, where the Bloom filter is used to filter out the identifiers of unsent users of the message to be sent. When the message to be sent has many target users, the storage overhead of the binary vector and the hash functions is much smaller than that of storing each detailed record. The problems in the prior art can therefore be solved partially or completely.
Based on the method in Fig. 3, the embodiments of the application further provide some specific implementations of the method, as well as extended solutions, which are described below.
In the embodiments of the application, because the binary vector of the Bloom filter is stored in a data table, the data in the binary vector can be operated on (mainly by data query operations) through Structured Query Language (SQL) statements to carry out the filtering process. Specifically, for step S303, filtering the identifier of the target user using the Bloom filter formed by the binary vector and the hash functions may include: mapping the identifier of the target user to a bit position according to each hash function; performing a data query operation on the binary vector stored in the data table according to the mapped bit positions (specifically, this can be done with a Select statement in SQL); and determining, according to the query result, the filtering result obtained by filtering the identifier of the target user with the Bloom filter formed by the binary vector and the hash functions. When there are multiple Bloom filters, the process in this paragraph may be performed for each Bloom filter separately.
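The query-based filtering described in this paragraph can be sketched as follows, using Python's standard `sqlite3` module as a stand-in database. Everything named here is an assumption for illustration: the table `bloom_vector`, its columns `bit_tag` and `segment`, the salted SHA-256 hash functions, and the 32-bit segment width (chosen only to keep values comfortably inside SQLite's signed integer range).

```python
import hashlib
import sqlite3

SEG_BITS = 32   # bits per row in this sketch (hypothetical segment width)
NUM_ROWS = 64   # vector length is m = SEG_BITS * NUM_ROWS bits
K = 3           # number of hash functions (hypothetical)


def positions(user_id: str) -> list[int]:
    """Map an identifier to K bit positions using salted SHA-256 (an assumption)."""
    m = SEG_BITS * NUM_ROWS
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{user_id}".encode()).digest()[:8], "big") % m
        for i in range(K)
    ]


def probably_sent(conn: sqlite3.Connection, user_id: str) -> bool:
    """True if all mapped bits are 1, i.e. the identifier is not filtered out."""
    pos = positions(user_id)
    rows = sorted({p // SEG_BITS for p in pos})
    marks = ",".join("?" for _ in rows)
    # One SELECT fetches only the rows that hold the mapped bits.
    found = dict(
        conn.execute(
            f"SELECT bit_tag, segment FROM bloom_vector WHERE bit_tag IN ({marks})", rows
        )
    )
    return all((found[p // SEG_BITS] >> (p % SEG_BITS)) & 1 for p in pos)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bloom_vector (bit_tag INTEGER PRIMARY KEY, segment INTEGER NOT NULL)")
conn.executemany("INSERT INTO bloom_vector VALUES (?, 0)", [(i,) for i in range(NUM_ROWS)])
```

With an all-zero vector, `probably_sent(conn, "a@example.com")` returns False, meaning the identifier is filtered out as an unsent user.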
The advantage of the solution in the preceding paragraph is that most of the work required by the filtering process (for example, data query operations on the binary vector) can be completed directly with SQL statements supported by the data table, and most of that work is done in the database in which the data table resides. The executing entity may merely send SQL statements, and the database performs the corresponding data operations, which reduces the processing load on the executing entity and can also reduce cost. In practical applications, step S303 may also be implemented in other ways. For example, the binary vector stored in the data table may be read out in full and stored a second time elsewhere (for example, in memory), and filtering is then performed on that second copy, which reduces the processing load on the database; and so on. The following embodiments are mainly based on the former implementation.
In the embodiments of the application, suppose it is only necessary to judge whether the message to be sent has ever been sent to the target user, regardless of how many times; the entire time range can then be treated as a single unit time. Accordingly, the data table may store the binary vector of only one Bloom filter. In this case, for step S304, determining according to the filtering result whether to send the message to be sent to the target user may specifically include: determining, according to the filtering result, whether the target user is an unsent user of the message to be sent; if so, determining to send the message to be sent to the target user; otherwise, determining not to send the message to be sent to the target user. Control of the target user's message fatigue level is thereby achieved.
More specifically, the identifier of the target user may be mapped to a bit position by each hash function of the Bloom filter, and the binary vector of the Bloom filter is then queried to check whether each of these bits is 1. If not, the identifier of the target user has been filtered out, and the target user can be determined to be an unsent user of the message to be sent. If so, the identifier of the target user has not been filtered out, and the target user can be determined to be a sent user of the message to be sent. (There is a small false-positive probability, but it has little effect on the solution of the application: even if a misjudgment causes the message to be sent to be withheld from some unsent user, this neither annoys that user nor exceeds the predetermined message fatigue level, and so does not give rise to the problems in the prior art.)
Further, for step S304, after it is determined to send the message to be sent to the target user, the following may also be performed: sending the message to be sent to the target user; and updating the binary vector stored in the data table according to the identifier of the target user, so that the membership relation held in the binary vector stays up to date.
Like the data query operation performed in the filtering process, the update may be performed with an SQL data update operation (specifically, with an Update statement in SQL). For example, the update may proceed as follows: perform a data update operation on the bits of the binary vector to which the identifier of the target user was mapped in the filtering process, setting to 1 every such bit that is not already 1. The updated binary vector then reflects that the target user is a sent user.
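The update step can be sketched with a bitwise OR inside an SQL Update statement. As before, this is only an illustration under assumptions: `sqlite3` stands in for the database, the table and column names are hypothetical, and a 32-bit segment width is used instead of 64 to stay well inside SQLite's signed integer range.

```python
import sqlite3

SEG_BITS = 32  # bits per row in this sketch (hypothetical segment width)


def mark_sent(conn: sqlite3.Connection, bit_positions: list[int]) -> None:
    """Set each mapped bit to 1 with a bitwise OR; bits already 1 stay unchanged."""
    for p in bit_positions:
        conn.execute(
            "UPDATE bloom_vector SET segment = segment | ? WHERE bit_tag = ?",
            (1 << (p % SEG_BITS), p // SEG_BITS),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bloom_vector (bit_tag INTEGER PRIMARY KEY, segment INTEGER NOT NULL)")
conn.executemany("INSERT INTO bloom_vector VALUES (?, 0)", [(i,) for i in range(4)])
mark_sent(conn, [0, 33, 70])  # bit positions a target user's identifier mapped to
```

Because OR is idempotent, repeating the same update leaves the vector unchanged, so no read is needed before writing.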
A more complex predetermined message fatigue level was mentioned above. In this case, to control the target user's message fatigue level, the data table may store the binary vectors of multiple Bloom filters, each binary vector corresponding to one specified time range. The Bloom filter corresponding to each binary vector is used to filter out the identifiers of unsent users of the message to be sent within the specified time range corresponding to that binary vector.
For example, suppose the unit time is 1 day and the predetermined message fatigue level for a target user is defined as follows: the message to be sent can be sent to the target user at most once within 1 day, and at most M times within N days, where N and M are integers.
The data table may store the binary vectors of N Bloom filters, each binary vector corresponding to one of the N days. The current time generally belongs to the latest of the N days, and the N days roll forward as the current time advances. For example, suppose N = 7 and the current time is this Sunday. The N days may then be this Sunday plus the 6 consecutive days before it (that is, this Monday through this Saturday). When the current time advances from this Sunday to the next day (next Monday), the N days roll forward accordingly to this Tuesday through this Sunday plus next Monday, and the data for this Monday can be cleared after the roll.
Further, the current time belongs to at least one of the specified time ranges. For step S303, determining the hash functions of the Bloom filter and filtering the identifier of the target user using the Bloom filter formed by the binary vector and the hash functions may specifically include: determining a predetermined rule corresponding to the message to be sent, the predetermined rule limiting the maximum number of times (for example, M in the example above) the message to be sent may be sent to the same user within a certain time range; determining, among the binary vectors, those whose corresponding specified time ranges fall within the certain time range; and, for each binary vector so determined, determining the hash functions of the Bloom filter corresponding to the binary vector and filtering the identifier of the target user using the Bloom filter formed by that binary vector and those hash functions. The predetermined rule is part or all of the predetermined message fatigue level.
Further, for step S304, determining according to the filtering result whether to send the message to be sent to the target user may specifically include: determining, according to the filtering result and the predetermined rule, whether the number of times the message to be sent has been sent to the target user within the certain time range is less than the maximum number of times; if so, determining to send the message to be sent to the target user; otherwise, determining not to send the message to be sent to the target user. Control of the target user's message fatigue level is thereby achieved.
It should be noted that, before determining to send the message to be sent to the target user, the following also needs to be performed: determining the time range, among the specified time ranges, to which the current time belongs; and determining that the target user is an unsent user of the message to be sent within that time range. This ensures that within any one of the specified time ranges, the message to be sent is sent to the target user at most once.
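The two checks described above (at most once in the current time range, and fewer than the maximum number of times over the whole window) can be combined as in the following sketch. The function name `should_send` and the representation of the filtering results as a list of booleans are assumptions for illustration only.

```python
def should_send(sent_flags: list[bool], max_times: int) -> bool:
    """Decide whether to send, given one filter result per day in the window.

    sent_flags[0] is the result for the current day and the remaining entries
    are for the earlier days. True means the day's Bloom filter did not filter
    the user out, i.e. the message was (very likely) sent on that day.
    """
    if sent_flags[0]:
        return False  # at most one message per unit time
    # Each day contributes at most one send, so the count is the number of True flags.
    return sum(sent_flags) < max_times
```

For example, with a 7-day window and a maximum of 3 sends, a user who was sent the message on 3 earlier days is not sent it again today.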
In the embodiments of the application, for step S304, after it is determined to send the message to be sent to the target user, the following may also be performed: sending the message to be sent to the target user; determining the time range, among the specified time ranges, to which the sending time belongs; and updating, according to the identifier of the target user, the binary vector stored in the data table that corresponds to that time range.
Based on the above description, the embodiments of the application further provide a schematic diagram of the data operations involved for the binary vectors stored in the data table in a practical application scenario (the scenario of the N = 7 example above), as shown in Fig. 4.
In Fig. 4, suppose that initially the N days are the 7 consecutive days Day1 to Day7, where Day1 to Day7 each correspond in turn to one day (and each corresponds to the binary vector of one Bloom filter in the database), and the current time is in Day7; Day0 is used for transfer. For ease of description, Day1 to Day7 also denote the identifiers (for example, fields or records) of the storage areas in the data table that can each store the binary vector of one Bloom filter.
When the current time moves from Day7 to the next day, the N days can roll forward by one day. Specifically, the data from 6 days before Day7 (that is, the binary vector of Day1) can be cleared, and a Bloom filter binary vector corresponding to the next day can be added in Day0, which was reserved for transfer.
Thereafter, while the current time remains within that next day, Day0 is no longer used for transfer; instead, the cleared Day1 is used for (the next) transfer. During this period, determining from the identifier of a target user whether to send the message to be sent to the target user requires read operations (data query operations) on the binary vectors corresponding to Day2 to Day7 and Day0. If it is determined to send the message to be sent to the target user, a write operation (data update operation) on the binary vector corresponding to Day0 is also required after sending.
The above explains the first record in the table of Fig. 4. Similarly, the new N days continue to roll forward day by day, and the Day slots that are read, written, and cleared after each one-day roll are shown in the records following the first record. They are not explained one by one, and are summarized as follows:
Write Day(d%8), clear Day((d%8+1)%8), read Day[(d%8+8-j)%8];
where j = [0, 6] and d is the natural day number counted from a certain day.
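The slot arithmetic summarized above can be sketched as follows. The function name `slots_for_day` is hypothetical, and the clear slot is wrapped modulo 8 on the assumption that only the slots Day0 to Day7 exist, so that Day7 is followed by Day0.

```python
NUM_SLOTS = 8  # Day0..Day7: a 7-day window plus one transfer slot


def slots_for_day(d: int) -> tuple[int, int, list[int]]:
    """Return (write_slot, clear_slot, read_slots) for natural day number d."""
    write_slot = d % NUM_SLOTS
    clear_slot = (d % NUM_SLOTS + 1) % NUM_SLOTS  # assumed wrap-around to Day0
    read_slots = [(d % NUM_SLOTS + NUM_SLOTS - j) % NUM_SLOTS for j in range(7)]
    return write_slot, clear_slot, read_slots
```

On the day after Day7 (d % 8 == 0), this gives a write to Day0, a clear of Day1, and reads of Day0 and Day7 down to Day2, matching the first record described above. The cleared slot is never among the slots being read, which is what lets clearing proceed without blocking queries.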
It should be noted that, in practical applications, it is also possible not to reserve a separate day for transfer, and instead to wait until the day to be cleared has been fully cleared before using it as the day to be written. This has the following drawback, however: when the binary vector is long, clearing takes a long time, and during clearing the binary vector corresponding to the next day cannot be updated normally, so the target user's message fatigue level cannot be controlled normally.
In the embodiments of the application, the binary vector of a Bloom filter can be stored in the data table in many ways. For example, each binary vector may be stored in one field, or each binary vector may be stored in one database record, and so on.
The following description mainly takes fields as an example.
At present, most data tables support a bit type for storing bit strings. A single row of a field can typically store a bit string of at most 64 bits, the corresponding bit type being Bit(64). When there are many target users, however, the length of the binary vector may far exceed 64 bits.
In this case, under the solution of the application, the binary vector can be split into segments of at most 64 bits each, and the segments are stored one per row across multiple rows of the column of the field corresponding to the binary vector. With this storage layout, the relation between bit X of the binary vector and bit b of the corresponding field in row i of the data table is as follows:
bit b of row i is bit i*64+b of the binary vector;
bit X of the binary vector is bit X%64 of row X/64 (integer division);
where X, i, and b all start from 0.
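The two relations above are inverses of each other, as the following sketch shows. The function names are hypothetical; the 64-bit segment width follows the Bit(64) type described in the text.

```python
SEG_BITS = 64  # bits per row, as in Bit(64)


def to_row_bit(x: int) -> tuple[int, int]:
    """Bit X of the binary vector -> (row i, bit b within the row)."""
    return divmod(x, SEG_BITS)  # (X // 64, X % 64)


def to_vector_bit(i: int, b: int) -> int:
    """(row i, bit b within the row) -> bit index in the binary vector."""
    return i * SEG_BITS + b
```

For instance, bit 130 of the binary vector is bit 2 of row 2, and converting back gives 130 again.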
In the embodiments of the application, the specified time range corresponding to each binary vector may correspond to one field in the data table, and the binary vector for that specified time range is stored in one or more rows of the column of that field. Because the field corresponds to a specified time range, it is referred to, for ease of description, as a time field.
The specified time ranges may be contiguous or non-contiguous. In practical applications, message fatigue is usually controlled over one continuous time interval; otherwise data may be missed, impairing precise control of the message fatigue level. The embodiments of the application are mainly based on this scenario, in which the specified time ranges do not overlap and can be joined in sequence to form one continuous time interval.
Further, the N = 7 example above mentions a day, Day0, that can be used for transfer. In practical applications, this day may correspond to a transfer field in the data table. Specifically, the specified time ranges all have the same duration, and the data table further includes a transfer field corresponding to a particular time range that immediately follows the continuous time interval, the particular time range having the same duration as the specified time ranges.
For the method in Fig. 3, the following may also be performed: adding the binary vector of a Bloom filter to the data table, stored in one or more rows of the column of the transfer field, the Bloom filter being used to filter out the identifiers of unsent users of the message to be sent within the particular time range; after the current time has passed all of the specified time ranges, clearing the binary vector stored in the data table for the earliest of the specified time ranges; and ceasing to use the transfer field as the transfer field, and instead using the time field corresponding to the earliest specified time range as the transfer field.
The embodiments of the application further provide a field description table and a sample table for the data table corresponding to the message to be sent in a practical application scenario (the scenario of the N = 7 example above), shown in Table 1 and Table 2 below, respectively.
Table 1
Fatigue_BITMAP_EMAIL (an example name for the data table)
Table 1 is the field description table.
Table 2
bitTag | Day0 | Day1 | Day2 | Day3 | Day4 | Day5 | Day6 | Day7
0 | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64)
1 | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64)
2 | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64)
3 | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64) | Bit(64)
… | | | | | | | |
… | | | | | | | |
Table 2 is the sample table.
Taking the data table in Table 2 as an example, the rows needed to store the binary vectors must first be zeroed and initialized (for example, setting every row of each time field to the Bit(64) type, determining how many rows are needed, and determining the value range of bitTag), so that correct data can subsequently be obtained when bits are read by row.
As for update operations, the binary vector corresponding to each field can be updated with SQL's OR operation, which is efficient and helps increase update speed.
As for clear operations, the corresponding binary vector (that is, a column in the data table) can be cleared by a scheduled task. For example, to clear the binary vector corresponding to DayX, an SQL statement such as the following may be executed:
"update Fatigue_BITMAP_EMAIL set DayX=0 where Id=i;
i<=max(BitTag)".
Based on the data table of Table 1 and Table 2, the embodiments of the application provide an implementation of the above message sending determination method in a practical application scenario (the scenario of the N = 7 example above), in which each Bloom filter has k hash functions, the identifier of a target user is an email address, and the message to be sent is a marketing email:
At 00:00 every day, empty Day(d%8+1);
Before the message to be sent is sent to any target user, read (query) Day[(d%8+8-j)%8], j=[0,6], to
determine the total number of times the message to be sent has been sent to that target user in the
most recent 7 days (at most one send is recorded per day, and the counts accumulate across days);
Here, the specific query method is: according to the identifier of the target user, use the k hash
functions of the corresponding Bloom filter to map it to k positions (L1, L2, L3, ..., Lk), then
execute a SQL statement such as the following:
"select DayN from Fatigue_BITMAP_EMAIL where BitTag in (L1/64, L2/64, L3/64, ..., Lk/64)",
and, according to whether the k corresponding bits of DayN are all 1, judge whether the message to be
sent was sent to the target user on the corresponding DayN;
If the total is less than the predetermined message fatigue limit, the message to be sent can be sent
to the target user; otherwise it cannot be sent;
After sending, update the corresponding binary vector, for example by writing Day(d%8).
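The steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not the implementation of the application: each Day field is modeled as one Python integer instead of rows of Bit(64), the k hash functions are a hypothetical family built from salted SHA-256, and the class and method names are invented for the sketch. The slot emptied at midnight is taken to be (d + 1) % 8 so that the index stays within the eight fields; that reading of Day(d%8+1) is also an assumption.

```python
import hashlib

SLOTS = 8        # 7 time fields plus 1 transfer field (the N=7 scenario)
WINDOW_DAYS = 7  # fatigue is counted over the most recent 7 days


class FatigueControl:
    """In-memory stand-in for the Day columns; each int is one binary vector."""

    def __init__(self, num_bits, k, limit):
        self.num_bits = num_bits
        self.k = k
        self.limit = limit
        self.days = [0] * SLOTS

    def positions(self, user_id):
        # Hypothetical hash family: SHA-256 salted with the hash index.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{user_id}".encode()).digest()[:8], "big"
            ) % self.num_bits
            for i in range(self.k)
        ]

    def maybe_sent(self, slot, positions):
        # True only if all k bits are 1 (subject to Bloom false positives).
        return all((self.days[slot] >> p) & 1 for p in positions)

    def start_day(self, d):
        # Assumption: the slot to empty is the one outside the 7-day window.
        self.days[(d + 1) % SLOTS] = 0

    def sent_count(self, d, positions):
        # Read Day[(d%8+8-j)%8] for j = 0..6 and count the days with a hit.
        return sum(
            self.maybe_sent((d % SLOTS + SLOTS - j) % SLOTS, positions)
            for j in range(WINDOW_DAYS)
        )

    def try_send(self, d, user_id):
        positions = self.positions(user_id)
        today = d % SLOTS
        if self.maybe_sent(today, positions):
            return False  # at most one send per day
        if self.sent_count(d, positions) >= self.limit:
            return False  # fatigue limit reached
        for p in positions:
            self.days[today] |= 1 << p  # same effect as the SQL OR update
        return True
```

A caller would invoke start_day(d) once at the beginning of day d and try_send(d, address) before each candidate send; a return value of True means the message may be sent and the filter for the current day has been updated.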
To make it easier to understand how much storage overhead the scheme of the present application
saves compared with the prior art, the example above is analyzed concretely. Assume that 6,000,000
marketing emails are to be sent per day on average, and that the email address of a target user is
15 bytes (Byte, B) long on average.
In the prior art, about 42,000,000 marketing emails are then sent every 7 days. Accordingly, the
detailed information that must be recorded includes at least 42,000,000 email addresses (one for each
email sent), and the storage overhead is: 15B * 42,000,000 = 630MB.
With the scheme of the present application, assume k=8 and 20 bits of binary vector per element;
according to Fig. 2, the false positive rate of the Bloom filter is then 0.00014. Each binary vector
stored in the data table is (6,000,000 * 20) bit = 15MB, and at most 8 binary vectors need to be stored
(corresponding to the 7 time fields plus one transfer field), that is, 15MB * 8 = 120MB, which is
clearly less than the storage overhead of the prior art.
Similarly, assuming that 10,000,000 marketing emails are to be sent per day on average, the storage
overhead required by the prior art is about 1GB, while the storage overhead required by the scheme of
the present application is about 200MB.
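The arithmetic in this comparison can be reproduced with a short script. The function names are chosen for illustration, MB is taken as 10^6 bytes, and the false positive rate uses the standard Bloom filter approximation (1 - e^(-k/b))^k, where b is the number of bits per element; the application itself reads this value from Fig. 2, so the formula here is only a cross-check.

```python
import math


def prior_art_bytes(emails_per_day, address_bytes=15, days=7):
    """Bytes needed to record one address per email sent over the window."""
    return emails_per_day * days * address_bytes


def bloom_scheme_bytes(emails_per_day, bits_per_element=20, vectors=8):
    """Bytes needed for 8 binary vectors (7 time fields plus 1 transfer field)."""
    return emails_per_day * bits_per_element * vectors // 8


def false_positive_rate(k, bits_per_element):
    """Standard approximation of the Bloom filter false positive rate."""
    return (1.0 - math.exp(-k / bits_per_element)) ** k


if __name__ == "__main__":
    for daily in (6_000_000, 10_000_000):
        print(daily, prior_art_bytes(daily) / 1e6, bloom_scheme_bytes(daily) / 1e6)
    print(round(false_positive_rate(8, 20), 5))
```

For 6,000,000 emails per day the two functions give 630MB and 120MB; for 10,000,000 they give 1050MB (about 1GB) and 200MB, matching the figures in the text.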
In practical applications, a class of marketing messages may correspond to a company or a line of
business. Multiple mutually independent data tables can be created for different companies and
different lines of business, each used for message fatigue control of the corresponding class of
marketing messages, so that they do not interfere with one another.
It should be noted that the message sending determination method provided by the embodiment of the
present application (which may also be called the message fatigue control method) has been described
in detail above mainly using the N=7 days scenario as an example. In practical applications, the
scheme of the present application is highly extensible: it can control fatigue in different domain
dimensions, and the number of time ranges can be increased by adding fields to the data table.
Whether there are one or more time ranges, whatever the length of each time range, and whether or
not the time ranges are contiguous with one another, the scheme of the present application can be
applied for message fatigue control, and it can also empty expired data periodically based on a timer.
Based on the same idea as the message sending determination method provided above, the embodiment of
the present application also provides a data table creation method, as shown in Fig. 5. The data
table is the data table used in the message sending determination method. The executing entity of
this method may be the executing entity of the message sending determination method and/or a
database, and this method can provide support for the message sending determination method.
Fig. 5 is a schematic flowchart of a data table creation method provided by the embodiment of the
present application. The method may include the following steps:
S501: Determine the message to be sent.
S502: Create a data table corresponding to the message to be sent. The data table is used to store
the binary vector of at least one Bloom filter, and the Bloom filter is used to filter out the
identifiers of users to whom the message to be sent has not been sent.
In the embodiment of the present application, the data table is used to store the binary vectors of
multiple Bloom filters, and each binary vector corresponds to one specified time range.
In the embodiment of the present application, the specified time ranges do not overlap and can be
joined in sequence to form one continuous time interval.
In the embodiment of the present application, the data table contains multiple time fields in
one-to-one correspondence with the specified time ranges, and the column of each time field stores,
in one or more rows, the binary vector corresponding to its specified time range.
In the embodiment of the present application, the specified time ranges all have the same duration.
The data table also contains a transfer field corresponding to a specific time range that immediately
follows the continuous time interval, and the specific time range has the same duration as a
specified time range. The transfer field is used to store, in one or more rows of its column, the
binary vector of a specific Bloom filter, and the specific Bloom filter is used to filter out the
identifiers of users to whom the message to be sent has not been sent within the specific time range.
Once every specified time range is earlier than the current time, the binary vector stored in the
data table for the earliest of the specified time ranges is emptied.
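Steps S501 and S502 can be sketched as follows, again using Python's built-in sqlite3 module as a stand-in database. The column names Day0 to Day7, the BitTag key, and the function name are assumptions made for the sketch; the table name is interpolated into the SQL text and is therefore assumed to come from trusted configuration rather than user input.

```python
import sqlite3


def create_fatigue_table(conn, table_name, num_bits, num_ranges=7):
    """Create one column per specified time range plus one transfer column.

    Each row holds one 64-bit word of every binary vector, so a vector of
    num_bits bits needs ceil(num_bits / 64) rows, all initialized to zero.
    """
    # The last column starts out as the transfer field (an assumption).
    columns = [f"Day{i}" for i in range(num_ranges + 1)]
    column_sql = ", ".join(f"{c} INTEGER NOT NULL DEFAULT 0" for c in columns)
    conn.execute(
        f"CREATE TABLE {table_name} (BitTag INTEGER PRIMARY KEY, {column_sql})"
    )
    rows = (num_bits + 63) // 64
    conn.executemany(
        f"INSERT INTO {table_name} (BitTag) VALUES (?)",
        [(tag,) for tag in range(rows)],
    )
    return columns


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(create_fatigue_table(conn, "Fatigue_BITMAP_EMAIL", 1000))
```

Calling the function once per class of message gives each class its own independent table, in line with the separate tables per company or line of business described earlier.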
The above describes a message sending determination method and a data table creation method provided
by the embodiment of the present application. Based on the same idea, the embodiment of the present
application also provides apparatuses corresponding to these methods, as shown in Fig. 6 and Fig. 7.
Fig. 6 is a schematic structural diagram of a message sending determination apparatus, corresponding
to Fig. 3, provided by the embodiment of the present application. The apparatus includes:
Obtaining module 601, which obtains the identifier of the target user of the message to be sent;
Data table determining module 602, which determines the data table corresponding to the message to be
sent, the data table storing the binary vector of at least one Bloom filter, and the Bloom filter being
used to filter out the identifiers of users to whom the message to be sent has not been sent;
Filtering module 603, which determines the hash functions of the Bloom filter and filters the
identifier of the target user using the Bloom filter formed by the binary vector and the hash
functions;
Sending determination module 604, which determines, according to the filter result, whether to send
the message to be sent to the target user.
Optionally, the filtering module 603 maps the identifier of the target user to one position according
to each hash function respectively; performs a data query operation on the binary vector stored in
the data table according to each mapped position; and determines, according to the query result, the
filter result obtained by filtering the identifier of the target user with the Bloom filter formed by
the binary vector and the hash functions.
Optionally, the data table stores the binary vector of only one Bloom filter. The sending
determination module 604 determines, according to the filter result, whether the target user is a
user to whom the message to be sent has not been sent; if so, it determines to send the message to be
sent to the target user; otherwise, it determines not to send the message to be sent to the target
user.
Optionally, the apparatus also includes:
Sending update module 605, which, after the sending determination module 604 determines to send the
message to be sent to the target user, sends the message to be sent to the target user, and updates
the binary vector stored in the data table according to the identifier of the target user.
Optionally, the data table stores the binary vectors of multiple Bloom filters, each binary vector
corresponds to one specified time range, and the Bloom filter corresponding to each binary vector is
used to filter out the identifiers of users to whom the message to be sent has not been sent within
the specified time range corresponding to that binary vector.
Optionally, the current time belongs to at least one of the specified time ranges. The filtering
module 603 determines a predetermined rule corresponding to the message to be sent, the predetermined
rule limiting the maximum number of times the message to be sent may be sent to the same user within a
certain time range; determines, among the binary vectors, those whose corresponding specified time
ranges fall within the certain time range; and, for each binary vector so determined, determines the
hash functions of the Bloom filter corresponding to that binary vector and filters the identifier of
the target user using the Bloom filter formed by that binary vector and those hash functions.
Optionally, the sending determination module 604 determines, according to the filter result and the
predetermined rule, whether the number of times the message to be sent has been sent to the target
user within the certain time range is less than the maximum number of times; if so, it determines to
send the message to be sent to the target user; otherwise, it determines not to send the message to be
sent to the target user.
Optionally, before determining to send the message to be sent to the target user, the sending
determination module 604 determines the time range, among the specified time ranges, to which the
current time belongs, and determines that the target user is a user to whom the message to be sent
has not been sent within that time range.
Optionally, the sending update module 605, after the sending determination module 604 determines to
send the message to be sent to the target user, sends the message to be sent to the target user;
determines the time range, among the specified time ranges, to which the sending time belongs; and
updates, according to the identifier of the target user, the binary vector stored in the data table
that corresponds to that time range.
Optionally, the specified time ranges do not overlap and can be joined in sequence to form one
continuous time interval.
Optionally, the data table contains a time field corresponding to each specified time range, and the
column of that time field stores, in one or more rows, the binary vector corresponding to that
specified time range.
Optionally, the specified time ranges all have the same duration, the data table also contains a
transfer field corresponding to a specific time range that immediately follows the continuous time
interval, and the specific time range has the same duration as a specified time range;
The apparatus also includes:
Transfer emptying module 606, which adds the binary vector of one Bloom filter to the data table,
stored in one or more rows of the column of the transfer field, the Bloom filter being used to filter
out the identifiers of users to whom the message to be sent has not been sent within the specific
time range; which, once every specified time range is earlier than the current time, empties the
binary vector stored in the data table for the earliest of the specified time ranges; and which stops
using the transfer field as the transfer field and instead uses the time field corresponding to the
earliest specified time range as the transfer field.
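The role change performed by the transfer emptying module can be sketched as a simple rotation. The class below is a hypothetical in-memory model: field names are plain strings, each binary vector is one Python integer, and the names FieldRotation and rotate are invented for the illustration.

```python
from collections import deque


class FieldRotation:
    """Tracks which field is the transfer field and which are time fields."""

    def __init__(self, time_fields, transfer_field):
        self.time_fields = deque(time_fields)  # earliest time range first
        self.transfer_field = transfer_field
        self.vectors = {name: 0 for name in [*time_fields, transfer_field]}

    def rotate(self):
        """Run once every specified time range is earlier than the current time.

        The earliest time field is emptied and becomes the new transfer field;
        the old transfer field becomes the newest time field.
        """
        earliest = self.time_fields.popleft()
        self.vectors[earliest] = 0
        self.time_fields.append(self.transfer_field)
        self.transfer_field = earliest
        return earliest


if __name__ == "__main__":
    rotation = FieldRotation([f"Day{i}" for i in range(7)], "Day7")
    print(rotation.rotate(), list(rotation.time_fields))
```

Because only the roles of the fields change, no binary vector has to be copied between columns when the window advances.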
Fig. 7 is a schematic structural diagram of a data table creating apparatus, corresponding to Fig. 5,
provided by the embodiment of the present application. The apparatus includes:
Determining module 701, which determines the message to be sent;
Creation module 702, which creates a data table corresponding to the message to be sent, the data
table being used to store the binary vector of at least one Bloom filter, and the Bloom filter being
used to filter out the identifiers of users to whom the message to be sent has not been sent.
Optionally, the data table is used to store the binary vectors of multiple Bloom filters, and each
binary vector corresponds to one specified time range.
Optionally, the specified time ranges do not overlap and can be joined in sequence to form one
continuous time interval.
Optionally, the data table contains multiple time fields in one-to-one correspondence with the
specified time ranges, and the column of each time field stores, in one or more rows, the binary
vector corresponding to its specified time range.
Optionally, the specified time ranges all have the same duration, the data table also contains a
transfer field corresponding to a specific time range that immediately follows the continuous time
interval, and the specific time range has the same duration as a specified time range;
The transfer field is used to store, in one or more rows of its column, the binary vector of a
specific Bloom filter, and the specific Bloom filter is used to filter out the identifiers of users to
whom the message to be sent has not been sent within the specific time range;
The apparatus also includes:
Transfer emptying module 703, which, once every specified time range is earlier than the current
time, empties the binary vector stored in the data table for the earliest of the specified time
ranges.
The apparatuses and methods provided by the embodiment of the present application correspond to one
another, so the apparatuses have advantageous technical effects similar to those of the corresponding
methods. Since the advantageous technical effects of the methods have been described in detail above,
those of the apparatuses are not repeated here.
Those skilled in the art should understand that embodiments of the present invention may be provided
as a method, a system, or a computer program product. Accordingly, the present invention may take the
form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining
software and hardware aspects. Moreover, the present invention may take the form of a computer
program product implemented on one or more computer-usable storage media (including but not limited
to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program
code.
The present invention is described with reference to flowcharts and/or block diagrams of methods,
devices (systems), and computer program products according to embodiments of the present invention.
It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and
combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by
computer program instructions. These computer program instructions may be provided to the processor
of a general-purpose computer, a special-purpose computer, an embedded processor, or another
programmable data processing device to produce a machine, so that the instructions executed by the
processor of the computer or other programmable data processing device produce an apparatus for
implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks
of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct
a computer or other programmable data processing device to work in a particular manner, so that the
instructions stored in the computer-readable memory produce an article of manufacture including an
instruction apparatus, the instruction apparatus implementing the functions specified in one or more
flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data
processing device, so that a series of operation steps are performed on the computer or other
programmable device to produce computer-implemented processing, whereby the instructions executed on
the computer or other programmable device provide steps for implementing the functions specified in
one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPU), an input/output
interface, a network interface, and memory.
Memory may include volatile memory in computer-readable media, such as random access memory (RAM),
and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an
example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which
can store information by any method or technology. The information may be computer-readable
instructions, data structures, program modules, or other data. Examples of computer storage media
include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM),
dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory
(ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory
technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical
storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices,
or any other non-transmission medium that can be used to store information accessible by a computing
device. As defined herein, computer-readable media do not include transitory computer-readable media
(transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are
intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that
includes a series of elements includes not only those elements but also other elements not expressly
listed, or elements inherent to such a process, method, commodity, or device. Without further
limitation, an element defined by the phrase "including a ..." does not exclude the existence of
other identical elements in the process, method, commodity, or device that includes the element.
The foregoing is merely embodiments of the present application and is not intended to limit the
present application. For those skilled in the art, the present application may have various
modifications and variations. Any modification, equivalent replacement, improvement, and the like
made within the spirit and principle of the present application shall fall within the scope of the
claims of the present application.