CN109729054A - Access data monitoring method and relevant device - Google Patents


Info

Publication number
CN109729054A
Authority
CN
China
Prior art keywords
access data
distribution characteristics
promotion message
field
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711048801.5A
Other languages
Chinese (zh)
Other versions
CN109729054B (en)
Inventor
贺海军
刘新颖
畅恒宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201711048801.5A
Publication of CN109729054A
Application granted
Publication of CN109729054B
Legal status: Active

Landscapes

  • Information Transfer Between Computers (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This application provides a method for monitoring access data. The method obtains, according to first promotion information, the access data corresponding to that first promotion information, and computes the distribution feature of this access data over second promotion information. The first promotion information is information that is likely to be modified during hijacking, whereas the second promotion information is unlikely to be modified. Therefore, after the access data is obtained according to the first promotion information, checking whether the distribution feature of this access data over the second promotion information meets a preset condition makes it possible to determine whether traffic hijacking is present in the access data, that is, whether the first promotion information has been modified. To ensure that the method can be applied and implemented in practice, this application also provides devices related to the access data monitoring method.

Description

Access data monitoring method and relevant device
Technical field
This application relates to the technical field of traffic monitoring and, more specifically, to an access data monitoring method and related devices.
Background
In e-commerce, sellers' products are sold on e-commerce platforms. To attract more buyers to browse product information, one promotion approach is for the seller to have its goods promoted by a promoting entity. As shown in Fig. 1, a promoting entity can publish access entrances to the goods on various network platforms such as Weibo (a social application), WeChat (a social application), and Baidu (a search engine), so that users of these platforms browse the product information, and possibly buy the goods, through those entrances. In return, the seller pays the promoting entity a commission based on the transaction amount. This promotion model can be called pay per sale (Pay for Sale).
In this promotion model, different promoting entities have different identifiers. The access address used to enter a seller's product page from a network platform carries the identifier of the promoting entity, and the seller determines whom to pay according to the identifier carried in the access address.
At present, a form of traffic hijacking exists in which promoting-entity identifier A in the access address is tampered with and replaced by promoting-entity identifier B, so that the commission the seller should pay to promoting entity A is mistakenly paid to promoting entity B. To avoid this, e-commerce platforms need a technique for monitoring hijacking in access traffic.
Summary of the invention
In view of this, this application provides a kind of access data monitoring method, for determining whether access data quilt occur The case where abduction.In addition, present invention also provides access data monitoring devices, to guarantee the application of the method in practice And it realizes.
In order to achieve the object, technical solution provided by the present application is as follows:
In a first aspect, this application provides a kind of monitoring methods for accessing data, comprising:
According to the first promotion message, access data corresponding with first promotion message are obtained;
Determine the distribution characteristics of the second promotion message of the access data;
Whether meet preset condition according to the distribution characteristics, determines that the access data whether there is and pushed away to described first The modification of Guangxin breath.
Second aspect, this application provides a kind of monitoring devices for accessing data, comprising:
Data obtaining module is accessed, for obtaining visit corresponding with first promotion message according to the first promotion message Ask data;
Distribution characteristics determining module, the distribution characteristics of the second promotion message for determining the access data;
Distribution characteristics detection module determines the access number for whether meeting preset condition according to the distribution characteristics According to the presence or absence of the modification to first promotion message.
The third aspect, this application provides a kind of monitoring devices for accessing data, comprising: processor and memory, it is described Software program, calling storage data in the memory of the processor by operation storage in the memory, at least Execute following steps:
According to the first promotion message, access data corresponding with first promotion message are obtained;
Determine the distribution characteristics of the second promotion message of the access data;
Whether meet preset condition according to the distribution characteristics, determines that the access data whether there is and pushed away to described first The modification of Guangxin breath.
From the above technical scheme, this application provides a kind of monitoring method for accessing data, this method can basis First promotion message obtains access data corresponding with the first promotion message, and counts these access data and promote letter second Distribution characteristics on breath, since the first promotion message is the larger information that may be modified in abduction behavior, and second promotes Information can't be modified to a greater extent, therefore, after obtaining corresponding access data according to the first promotion message, further according to Whether these distribution characteristics of access data on the second promotion message meet preset condition, and can determine in access data is No there are flows to kidnap behavior, i.e., whether the first promotion message is modified.
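The three steps of the method can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not specified by this application: the record layout (dictionaries), the field names `pid` and `channel`, the choice of "share of the most common value" as the distribution feature, and the threshold `min_share` used as the preset condition.

```python
from collections import Counter


def dominant_share(values):
    """Share of the most common value; 1.0 for an empty list (nothing to flag)."""
    counts = Counter(values)
    if not counts:
        return 1.0
    return counts.most_common(1)[0][1] / len(values)


def has_suspected_modification(records, first_value,
                               first_field="pid", second_field="channel",
                               min_share=0.8):
    """Return True if the selected access data looks hijacked.

    Field names and the threshold are hypothetical illustration values.
    """
    # Step 1: obtain the access data corresponding to the first promotion information.
    selected = [r for r in records if r.get(first_field) == first_value]
    # Step 2: distribution feature of the second promotion information.
    values = [r[second_field] for r in selected if second_field in r]
    feature = dominant_share(values)
    # Step 3: assumed preset condition: the second information is too diverse.
    return feature < min_share
```

With this sketch, a group of records that share one identifier but carry many different channel values would be flagged, while a group whose channel values agree would not.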
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a prior-art scenario in which an object is promoted;
Fig. 2 is a flowchart of an access data monitoring method provided by this application;
Fig. 3 is another flowchart of the access data monitoring method provided by this application;
Fig. 4 is yet another flowchart of the access data monitoring method provided by this application;
Fig. 5 is a schematic structural diagram of an access data monitoring device provided by this application;
Fig. 6 is a schematic structural diagram of an access data monitoring device provided by this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of this application.
Goods that a seller sells on an e-commerce platform can be promoted through network platforms other than the e-commerce platform. Such a network platform can be called a medium or a channel; examples include navigation websites and social application groups. A network platform can provide an entrance on its own platform through which users access the seller's product page, thereby promoting the seller's goods. A user's access to a product page can be called access traffic, access data, or an access request; the entrance provided on the network platform can be called a web entrance or an access entrance.
A network platform is one concrete form of a promoting entity for goods. In this implementation, different network platforms have different identifiers; a network platform promotes goods on its own platform, and the seller rewards the network platform. Of course, the promoting entity need not be a network platform; it can instead be a promoting user. A promoting user registers an account on the e-commerce platform, and the account name is the unique identifier of the promoting user. A promoting user can promote goods on various network platforms, and the seller rewards the promoting user.
A promoting entity for goods can take various forms. For ease of description, it is referred to simply as a promoting entity, a target entity, or a target user. A promoting entity has a unique identifier, which can be called an entity identifier or a user identifier. As described above, the entity identifier can be the identifier of a network platform, the account name of a promoting user, and so on.
Different promoting entities have different identifiers. The traffic generated when a user accesses a target product can carry the promoting entity's identifier, and the e-commerce platform uses this identifier to tell from which promoting entity's access entrance the user reached the product page. Currently the promoting-entity identifier is a publish identifier (publish identification, PID), which is contained in the uniform resource locator (Uniform Resource Locator, URL). For ease of understanding, the uniform resource locator is called the access address here. In other words, the user's access address contains the promoting-entity identifier, and the e-commerce platform determines from this identifier which promoting entity the user came from.
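As an illustration of how a promoting-entity identifier might be read from an access address, the following Python sketch parses a URL with the standard library. The text only states that the PID is contained in the URL; carrying it as a query parameter named `pid`, and the example host and values, are assumptions made here.

```python
from urllib.parse import urlparse, parse_qs


def extract_pid(access_url, param="pid"):
    """Return the promoting-entity identifier carried in the URL, or None.

    The query-parameter name `pid` is a hypothetical choice for illustration.
    """
    query = parse_qs(urlparse(access_url).query)
    values = query.get(param)
    return values[0] if values else None
```

For a hypothetical address such as `https://shop.example.com/item?id=42&pid=mm_123_456`, the function returns the value of the `pid` parameter; for an address without that parameter it returns `None`.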
A traffic hijacking phenomenon currently exists, and this hijacking technique is a form of link-layer hijacking. Link-layer hijacking means that a third party, such as a network service operator or a hacker, plants a malicious program between the user and the server, or controls a network device between them, in order to eavesdrop on or tamper with the network data exchanged between the user and the server. Specifically, in this application, the malicious program changes promoting-entity identifier A in the access address to promoting-entity identifier B, so that the seller mistakenly treats promoting entity B as the source of the access traffic.
Generally, the seller pays a commission to the entity that promotes its goods, so traffic hijacking causes the seller an economic loss. The e-commerce platform collects data about access traffic; hijacked traffic is inaccurate, which makes the platform's collected data inaccurate. A technique is therefore needed to monitor whether traffic hijacking is occurring.
One existing monitoring scheme detects hijacking based on the concentration of regions and network service operators.
Specifically, users access the e-commerce platform through network services provided by network service operators. An operator builds networks in multiple regions, so that users in each region are covered and can use the network to access the e-commerce platform. Normally, the users who access an e-commerce platform are geographically dispersed, and the types of network service they use are also dispersed. If a malicious program hijacks access traffic using the resources of a particular network operator, it not only changes the promoting-entity identifier but also modifies the access source address and/or the type of network service used, replacing them with data associated with that operator's resources.
For example, if a malicious program hijacks traffic using network resources provided by Shandong Unicom, it changes the access source address in the access data to Shandong (a region) and the network service used to the Unicom network (a type of network provided by a network service provider). As a result, the access source addresses in the access traffic concentrate in Shandong, and the network service types concentrate on the Unicom network.
One prior-art approach therefore collects statistics on the access source addresses and/or network service types across all access traffic to a target product. If the share of the most common access source address and/or the most common network service type reaches a certain threshold, the access traffic to the target product is concentrated in one region and/or on one network service type, and hijacking is determined to be present in that traffic.
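The concentration check used by this prior-art scheme can be sketched as follows. The field names `region` and `isp` and the threshold of 0.9 are hypothetical; the text does not give concrete values.

```python
from collections import Counter


def concentration(records, field):
    """Return (most common value, its share) for `field`; (None, 0.0) if absent."""
    counts = Counter(r[field] for r in records if field in r)
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    value, count = counts.most_common(1)[0]
    return value, count / total


def prior_art_flags_hijack(records, threshold=0.9):
    """Flag traffic whose source region or network service type is too concentrated."""
    _, region_share = concentration(records, "region")
    _, isp_share = concentration(records, "isp")
    return region_share >= threshold or isp_share >= threshold
```

A check of this kind also fires on legitimately concentrated traffic, which is the weakness discussed next.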
However, the results of this monitoring method are inaccurate, because promoting entities promote goods in many ways, such as self-promotion with self-purchase and promotion through social applications. Self-promotion with self-purchase means that the promoting entity itself buys the goods it promotes; promotion through social applications means, for example, posting product links in a WeChat group. In the self-purchase scenario, access traffic is concentrated mainly on a single user; in the social-application scenario, users in the same WeChat group may be in the same region and/or use the same type of network service. For these normal promotion methods, the access source addresses and/or network service types in the access traffic are inherently concentrated, so the above scheme may wrongly conclude that the traffic has been hijacked.
Another existing monitoring approach tracks changes of the promoting-entity identifier in the access address and uses those changes to judge whether hijacking is present in the access traffic. Specifically, the e-commerce platform records a user's access data for a target product over a period of time, so the recorded access data is continuous in time. If the promoting-entity identifier in the access data changes within a short time interval, hijacking is determined to be present in that access data.
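A minimal sketch of this second prior-art check is shown below. It assumes the records belong to one user and one target product, that each record has a numeric `timestamp` in seconds and a `pid` field, and that "a short time interval" means at most `max_gap_seconds`; all of these are illustration choices.

```python
def pid_changes_within(records, max_gap_seconds=60):
    """Return consecutive record pairs whose identifier changed within a short gap."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur["timestamp"] - prev["timestamp"]
        # A change of identifier inside the short interval is treated as suspicious.
        if cur["pid"] != prev["pid"] and gap <= max_gap_seconds:
            changes.append((prev, cur))
    return changes
```

Because the check only compares neighbouring records, it sees nothing when the earlier identifier never reaches the log at all.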
This second approach thus determines hijacking by monitoring abnormal changes of the promoting entity in the access data. It has two main problems. First, in some scenarios the change of the promoting-entity identifier in the access traffic is triggered by the user; such a change is normal but is wrongly judged to be hijacking. Second, when hijacking traffic, a malicious program can delete the promoting-entity identifier already present in the access traffic, and subsequent traffic uses the changed identifier. The monitoring program then cannot observe any change between earlier and later identifiers and therefore cannot detect the hijacking.
A further existing approach monitors, on the user equipment, the time-to-live (TTL) value in the access traffic. Specifically, a user sends access data to the server through user equipment, and the server returns response data to the user equipment. The TTL value is contained in the network-layer protocol header of the response data, and it is decremented by one each time the response passes through a router. If a malicious program hijacks the access data at the link layer, it cannot replicate the TTL value in the network-layer header of the response data. A network packet capture program can therefore be installed on the user equipment to capture the response data returned by the server. Each time the capture program receives a response, it checks whether the difference between that response's TTL value and the previous response's TTL value reaches a certain threshold. If it does, the access data corresponding to that response is determined to have been maliciously hijacked.
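The TTL comparison can be sketched in a few lines of Python. The sketch assumes the capture program has already extracted the TTL of each response into a list, and the threshold of 5 is a hypothetical value; packet capture itself is outside the sketch.

```python
def ttl_anomalies(ttls, threshold=5):
    """Return indexes of responses whose TTL differs from the previous one
    by at least `threshold`."""
    return [
        i for i in range(1, len(ttls))
        if abs(ttls[i] - ttls[i - 1]) >= threshold
    ]
```

Note that a naive pairwise comparison like this flags both the forged response and the first genuine response that follows it, since each differs sharply from its predecessor.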
However, this approach runs on the user-equipment side. The users whose traffic is hijacked are not fixed; any user's access data may be hijacked by a malicious program. Because users are so dispersed, the monitoring program cannot be installed on every user device, so this approach has a limited range of application and poor applicability.
All three prior-art approaches above have problems. This application therefore provides an access data monitoring method. The method can be applied on a server, which can be a server of the e-commerce platform.
As shown in Fig. 2, one flow of the access data monitoring method includes the following steps S201 to S203.
S201: According to first promotion information, obtain access data corresponding to the first promotion information.
The first promotion information is information that is easily modified during traffic hijacking. Experience from analyzing traffic hijacking shows that certain data is easily modified during hijacking; a malicious program achieves hijacking by modifying this data.
This step obtains the corresponding access data according to the first promotion information, for example by extracting, from the access data to be monitored, the access data that contains the first promotion information. Note that because the first promotion information in access data may have been modified, the obtained access data may include records whose first promotion information was modified; that is, traffic hijacking may be present in the access data.
The first promotion information can indicate the identity of the promoting entity, and includes but is not limited to the promoting entity's identifier. The following description takes the promoting-entity identifier as an example of the first promotion information.
A promoting entity is an entity that promotes some object, and can also be called a target entity. The object can be a product on an e-commerce platform or an object of another form; any object whose information can be displayed on a web page can be regarded as an object in this application. The object can also be called a promoted object or a target object.
A promoting entity can provide access entrances that point to the object, such as website links and QR codes, so that users access the promoted object through those entrances. This step can therefore be regarded as obtaining the access data generated through access entrances provided by the promoting entity. An access request enters through an access entrance and accesses the promoted object that the entrance points to; the data generated during this process is called access data.
Access data can be obtained according to the promoting entity's identifier. Specifically, access data contains a promoting-entity identifier field, and a malicious program that hijacks traffic can modify this field. That is, the promoting-entity identifier may be one substituted by the malicious program, in which case it is the identifier of another promoting entity, and that other promoting entity is often the one that uses the malicious program to hijack traffic.
A malicious program can modify the promoting-entity identifier field, and it may also modify other fields that can indicate the promoting entity's identity. Therefore, any information that can indicate the promoting entity and may be modified can serve as the basis on which this step obtains access data. To distinguish it from the information described later, which can indicate the promoting entity but is not modified, the information here is called the first promotion information for indicating the promoting entity.
Note that the information described later as not being modified is not information that can never be modified. This is relative: it only needs to be modified less often than the first promotion information.
Web pages containing information about the promoted object are stored on a server, and the server can record access data for the target product page. The server that records access data can be the server that executes the monitoring method or another server; this application imposes no specific limitation. If the server executing the monitoring method is not the one that records access data, this step needs to obtain the access data from the other server. The device that records access data need not be a server and can be another kind of device, distributed or non-distributed. If access data is recorded on distributed devices, the data is more secure, and reading it in a distributed manner is more efficient.
Take access data recorded on a server as an example. The accesses the server records take various forms, including accesses made by visiting the server directly and accesses made through access entrances provided by promoting entities. This application is concerned with accesses made through access entrances provided by promoting entities, so the access data generated through promotion access entrances needs to be extracted from the access data the server records.
For example, suppose a promoted product on an e-commerce website is a handbag. A user can visit the e-commerce website directly and browse the handbag's information, generating one kind of access data for the handbag page. In addition, a promoting entity posts a link to the handbag in a WeChat group, and users in the group can browse the handbag's information through that link, generating another kind of access data for the handbag page. In this application, the access data to extract is the latter.
Access data contains multiple fields. If the access data was generated through a promotion access entrance, one of its fields is the promoting-entity identifier. Whether access data contains a promoting-entity identifier therefore indicates whether it was generated through a promotion access entrance, and access data that contains a promoting-entity identifier is extracted as the required access data.
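The extraction described here amounts to keeping the records that carry a promoting-entity identifier. A minimal sketch, assuming each record is a dictionary and the identifier is stored under a hypothetical key `pid`:

```python
def promoted_access_records(records, id_field="pid"):
    """Keep only access records that carry a non-empty promoting-entity identifier."""
    return [r for r in records if r.get(id_field)]
```

In the handbag example, a direct visit would have no identifier and be dropped, while a visit through the link posted in the WeChat group would be kept.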
Note that a promoting entity may promote multiple objects. In one implementation, this application is not concerned with which specific objects it promotes: any access data generated through a web entrance provided by the promoting entity can be collected, regardless of which object the entrance points to. The reason is that the main purpose of a malicious program in hijacking traffic is to change the promoting-entity identifier in the access data to a specific promoting entity's identifier. In typical scenarios, the malicious program therefore hijacks traffic indiscriminately, without regard to which object is being accessed: whenever it intercepts access traffic, it modifies the promoting-entity identifier in that traffic.
In other scenarios, hijacking may target only the access data for one or more specific objects of a promoting entity; these objects can be called target objects. When obtaining access data, then, in addition to the criterion of whether a promoting-entity identifier is present, the criterion of whether a target object's identifier is present can also be used. The target object's identifier is one selection condition in this scenario; in other scenarios, other selection conditions can be used to extract access data. In any scenario, the access data obtained can be the access data within a period of time. The period can be a specified period or any period, and this application does not limit its length; it can be, for example, one day or one month.
A selection condition can be expressed as a field in the access data. Different fields describe different attributes of the access data, so a field can also be called an attribute field. In other words, access data that has the target attributes can be extracted from the recorded access data. Whatever condition or attribute is used for selection, the extracted access data can be called target access data.
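Selection by time period and, optionally, by target object can be sketched as below. The keys `time` (a `datetime`) and `item_id` are hypothetical names for the corresponding attribute fields.

```python
from datetime import datetime, timedelta


def select_window(records, start, end, target_items=None):
    """Keep records with start <= time < end, optionally limited to target objects."""
    selected = []
    for r in records:
        if not (start <= r["time"] < end):
            continue
        if target_items is not None and r.get("item_id") not in target_items:
            continue
        selected.append(r)
    return selected


# Example: one day of access data, restricted to a single (hypothetical) object.
day_start = datetime(2017, 10, 1)
day_end = day_start + timedelta(days=1)
```

Passing `target_items=None` corresponds to the general scenario in which the promoted object is not considered.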
Note that different promoting entities correspond to different access data. This step can group access data by promoting-entity identifier, so that access data corresponding to different identifiers falls into different groups. As described above, the promoting-entity identifier is a field of the access data, so access data can be grouped by the value of that field. Access data can then be obtained in any of the following ways: obtain the access data of every group separately (and perform the following steps on each group separately); obtain the access data of any one group; obtain the access data of one specified group; or obtain the access data of several specified groups. In short, this step extracts access data that belongs to the same promoting entity.
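Grouping by the value of the identifier field can be sketched as follows; the key name `pid` is again a hypothetical choice.

```python
from collections import defaultdict


def group_by_promoter(records, id_field="pid"):
    """Map each promoting-entity identifier to the list of its access records."""
    groups = defaultdict(list)
    for r in records:
        pid = r.get(id_field)
        if pid is not None:
            groups[pid].append(r)
    return dict(groups)
```

The later steps can then be run on every group, on one chosen group, or on several specified groups, as listed above.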
S202: Determine the distribution feature of the second promotion information of the access data.
In practice, a malicious program that hijacks traffic modifies one or more pieces of information in the access data that can indicate the promoting entity's identity, but some other information that can also indicate the promoting entity's identity is relatively unlikely to be modified by the malicious program. This information is called the second promotion information, and it can indicate the promoting entity to which the access data actually corresponds.
Generally, the access data obtained in step S201 contains multiple records, and each record has fields (also called parameters or attributes) that reflect the promoting entity. During hijacking, a malicious program modifies the promoting-entity identifier, which is a field that directly reflects who the promoting entity is, but other fields that also reflect the promoting entity are not necessarily modified. This application therefore uses these fields to determine whether traffic hijacking is present.
Specifically, the fields in the access data that indicate the promoting entity to which the access data actually corresponds are determined; these fields are one concrete form of the second promotion information. For ease of description, they are called target fields, target parameters, or target attributes. The distribution feature of the values of the target fields is then computed. The distribution feature can be computed in several ways, which are described below.
In one example, the fields used in this application directly reflect or indicate who the promoting entity is.
For example, access data contains a channel field. Like the promoting-entity identifier, the channel differs between promoting entities. For example, the channel of promoting entity 1 is 1-23155155, that of promoting entity 2 is 1-23177841, that of promoting entity 3 is 1-23260440, and that of promoting entity 4 is 1-23262200. Comparing channel values therefore shows whether two channels are the same, and different channels indicate different promoting entities.
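One simple way to look at the distribution of channel values within a group of access data is to compute the relative frequency of each value. The sketch below assumes dictionary records with a `channel` key; representing the distribution as a frequency table is an illustration choice rather than something prescribed by this application.

```python
from collections import Counter


def channel_distribution(records, field="channel"):
    """Return {channel value: relative frequency} for the given records."""
    counts = Counter(r[field] for r in records if field in r)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}
```

If all records in a group really come from promoting entity 1, the table would be expected to be dominated by channel 1-23155155; a noticeable share of other channels in the same group suggests that the identifier was changed on traffic belonging to other entities.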
In another example, these fields used in this application may not be able to directly reflect or indicate to promote main body Whom is, also can reflect or indicate to promote specific characteristic of the main body in terms of certain.
For example, comprising field being user agent (user-agent) in access data.User agent can indicate to visit Ask the network platform used by a user.Access entrance is provided in some network platform it is understood that only promoting main body On, access user can just access to target object by the access entrance.However, a popularization main body generally can't It is promoted, is concentrated mainly in several network platforms of negligible amounts in the diversified network platform.As it can be seen that user's generation Reason can reflect that the feature for promoting main body, this feature refer specifically to promote access entrance for promoting main body to a certain extent The network platform it is relatively simple, can't diversification.
In scenarios where the promoter is itself a network platform, the network platform is the promoter, and the network platform providing the access entry is at its most uniform. For example, when Weibo (a social application) is the promoter, the network platform used to promote the access entry is generally only Weibo. When Ali Mobile Taobao (a mobile shopping application) is the promoter, the network platform used is generally only Ali Mobile Taobao. When Ali Baichuan promotion (a promotion application) is the promoter, the network platform used is generally only Ali Baichuan. Note that network platforms belonging to the same provider must still be treated as different network platforms; for example, Ali Mobile Taobao and Ali Baichuan promotion both belong to Alibaba but count as two platforms.
In scenarios where the promoter is a promoting user, the network platforms that user uses for promotion are generally also fairly uniform. This limit can be imposed at registration, by restricting the number of kinds of network platform a registered promoting user may use. For example, when a user registers as a promoting user, the number of network platforms the user may use is limited to two. The user may further specify which two network platforms these are.
As can be seen from the above, the second promotion information in the access data can indicate a characteristic of the promoter, such as who the promoter is or which network platform the promoter uses. It can be understood that, in the access data corresponding to the same promotion information, the second promotion information is relatively uniform and shows no diversity. The distribution feature of the second promotion information can therefore be counted. The distribution feature indicates how diverse the distribution of the second promotion information is. The result reflecting this diversity may be a numerical value or data other than a numerical value; more specifically, the numerical value may be an entropy or another value. Whatever its form, the result is referred to as the distribution feature.
It should be noted that the diversity can be determined in several ways; see the detailed description below.
S203: According to whether the distribution feature meets a preset condition, determine whether the access data contain a modification of the first promotion information.
It should be noted that the distribution feature determined in step S202 may be a feature that indicates distribution diversity. It can indicate one of two situations: the first promotion information has diversity, or it does not. Correspondingly, the preset condition may specifically be a diversity condition, that is, the condition that diversity must satisfy. This step then judges whether the distribution feature reaches the condition required for diversity; if it does, the access data are considered to contain hijacked access data, that is, traffic hijacking is present in the access data.
In practice, one concrete form of the diversity condition is a preset threshold: if the distribution feature determined from the access data reaches the preset threshold, the access data are considered to contain a modification of the first promotion information, that is, traffic hijacking is present. The diversity condition may of course take other forms, which this application does not limit.
As can be seen from the above technical solution, the method can extract, according to the first promotion information, the access data corresponding to the same promotion information. These access data also carry the second promotion information, which reflects the actual first promotion information. It can be understood that, during traffic hijacking, a malicious program modifies the first promotion information but for the most part does not modify the second promotion information. Under normal circumstances, therefore, the second promotion information in the access data corresponding to the same first promotion information is relatively uniform, and if its distribution feature reaches the preset condition, this indicates that hijacking has occurred in these access data. The distribution feature of the second promotion information can thus be used to judge whether these access data contain hijacked traffic.
Compared with the prior art, the access data monitoring scheme provided by this application uses a completely different judgment. This application determines whether traffic hijacking exists according to whether promotion information has diversity; it relies neither on the distribution of regions and network service operators nor on changes in the promoter identifier, and so avoids the inaccurate or impossible judgments to which those approaches lead. In addition, the scheme can be applied directly on the server side and does not need to be deployed on each hijacked client, so its range of application is wider.
In practice, one concrete way of determining the distribution feature of the access data is to use the user agent field. The process of implementing the access data monitoring method based on the user agent field is described below with reference to the flow shown in Fig. 3.
As shown in Fig. 3, one flow of the access data monitoring method provided by this application includes steps S301 to S303.
S301: According to the first promotion information, determine the access data corresponding to the first promotion information.
This step is the same as step S201 shown in Fig. 2; see the description above, which is not repeated here.
S302: According to the user agent field in the access data, determine the distribution feature of the network platforms corresponding to the access data.
The access data include the user agent field. The field value of this field can be examined to identify which network platform it belongs to.
For example, if the user agent field value in an access record contains QQ, the network platform corresponding to that record can be judged to be QQ (a social application); if it contains weibo, the platform can be judged to be Weibo (a social application); if it contains baiduboxapp, Baidu (a search engine); if it contains aliapptb, Ali Taobao (an e-commerce application); if it contains aliappbc, Ali Baichuan (a promotion application); and if it contains aliapp but neither bc nor tb, another Ali promotion application; and so on.
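The substring matching described above can be sketched as follows. This is only an illustration: the function name and the returned labels are hypothetical, the marker strings are taken literally from the text, and case-insensitive matching is an assumption not stated in the source.

```python
def classify_platform(user_agent: str) -> str:
    """Map a user-agent string to a platform label by substring markers.

    The labels and the case-insensitive comparison are assumptions made
    for this sketch; only the marker strings come from the description.
    """
    ua = user_agent.lower()
    # Check the specific Ali markers before the generic "aliapp" marker,
    # so that "aliapp without tb or bc" falls through to the last Ali case.
    if "aliapptb" in ua:
        return "ali_taobao"
    if "aliappbc" in ua:
        return "ali_baichuan"
    if "aliapp" in ua:
        return "ali_other"
    if "qq" in ua:
        return "qq"
    if "weibo" in ua:
        return "weibo"
    if "baiduboxapp" in ua:
        return "baidu"
    return "unknown"
```

A real deployment would likely need stricter patterns than bare substrings, since short markers such as QQ can appear inside unrelated tokens.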
Because there are multiple access data records, multiple network platforms are determined from the user agent field. The total number of network platforms can be counted, the number of types to which they belong can be counted, the number of network platforms of each type can be counted, and the proportion of each type's count in the total can be calculated.
From these statistics and the entropy formula, the distribution entropy of the network platforms is calculated. The distribution entropy may be called the distribution situation, or the distribution feature of the network platforms.
Specifically, the distribution entropy is calculated as
H(X) = -\sum_{i=1}^{n} P_i \log P_i
where H(X) denotes the distribution feature of the network platforms; n denotes the number of types of network platform corresponding to the access data; i = 1, 2, ..., n indexes the different types of network platform; and P_i is the proportion of the total number of network platforms accounted for by the network platforms of the i-th type.
It should be noted that a lower distribution entropy indicates that the network platforms are more concentrated, and a higher distribution entropy indicates that they are more dispersed. In other words, the higher the distribution entropy, the higher the degree of diversity of the network platforms.
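The counting and entropy calculation described for step S302 can be sketched as below. The sketch assumes the distribution entropy has the standard Shannon form and uses the natural logarithm; the source does not state the logarithm base, and the function name is hypothetical.

```python
import math
from collections import Counter


def distribution_entropy(values):
    """Shannon entropy of the observed values (natural log, an assumption).

    `values` holds one label per access record, for example the network
    platform identified from each record's user agent field.
    """
    counts = Counter(values)          # number of records per type
    total = sum(counts.values())      # total number of records
    if total == 0:
        return 0.0
    # P_i is each type's share of the total; H = -sum(P_i * log(P_i)).
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

With this definition a list containing a single platform gives an entropy of zero, and the value grows as the records spread over more platforms more evenly.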
S303: According to whether the distribution feature of the network platforms meets the diversity condition, determine whether the access data contain a modification of the first promotion information.
Step S302 yields the distribution entropy. Only when the distribution entropy is high enough can the network platforms be qualitatively determined to have diversity, so a diversity condition can be preset for judging the distribution feature.
The diversity condition may be a threshold, with which the calculated distribution entropy is compared. If the distribution entropy reaches the threshold, the distribution feature of the network platforms is considered to meet the diversity condition, and traffic hijacking is determined to be present in the access data. Conversely, if the distribution entropy does not reach the threshold, the distribution feature is considered not to meet the diversity condition, and traffic hijacking is determined to be absent from the access data. To distinguish it from the thresholds introduced later, this threshold is called the feature threshold.
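The comparison in step S303 can be illustrated with a short sketch. The function names and the threshold value used in the usage note are hypothetical, "reaches" is read here as greater than or equal to, and the entropy is again assumed to be Shannon entropy with the natural logarithm.

```python
import math
from collections import Counter


def platform_entropy(platforms):
    """Shannon entropy (natural log, assumed) of the platform labels."""
    counts = Counter(platforms)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def hijacking_suspected(platforms, feature_threshold):
    """True when the entropy reaches the feature threshold (>=, assumed)."""
    return platform_entropy(platforms) >= feature_threshold
```

For instance, with a hypothetical feature threshold of 0.5, ten records that all come from one platform would not be flagged, while records spread evenly over four platforms would be. How the threshold is chosen is not specified in the source.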
In the above technical solution, the access data include the user agent field. Under normal circumstances, a malicious program that carries out traffic hijacking does not modify this field, so the field can indicate which network platform the access data actually come from. If the field values of the user agent field have diversity, the network platforms from which the access data actually come have diversity. Under normal circumstances the network platforms from which access data come should not have diversity, so if they do, traffic hijacking is present in the access data.
It can be understood that the access data include the channel field and that, under normal circumstances, the field value of this field is not modified by a malicious program that carries out traffic hijacking. The channel can faithfully indicate the actual promoter corresponding to the access data; different channels indicate different actual promoters. The values of the channel field in the access data can therefore be used to judge the distribution feature of the actual promoters corresponding to the access data, and then to determine from the distribution feature whether the promoters have diversity.
As shown in Fig. 4, one flow of the access data monitoring method provided by this application includes steps S401 to S403.
S401: According to the first promotion information, determine the access data corresponding to the first promotion information.
This step is the same as step S201 shown in Fig. 2; see the description above, which is not repeated here.
S402: According to the channel field in the access data, determine the distribution feature of the promoters corresponding to the access data.
As described above, under normal circumstances the field values of the channel field in the access data are relatively uniform. If the channel field values have diversity, the promoters corresponding to the access data have diversity, and these access data very likely contain hijacked access data.
Therefore, after the access data are obtained, the field values of the channel field can be extracted. For example, the extracted channel field values include 1-23155155, 1-23177841, 1-23260440, and 1-23262200.
Because there are multiple access data records, multiple field values are extracted. Each field value indicates one promoter. The total number of promoters can be counted, the number of types to which they belong can be counted, the number of promoters of each type can be counted, and the proportion of each type's count in the total can be calculated. The distribution entropy (or distribution feature) of the promoters is then calculated with the formula from step S302.
Specifically, the distribution entropy is calculated as
H(X) = -\sum_{i=1}^{n} P_i \log P_i
where H(X) denotes the distribution feature of the promoters; n denotes the number of types of promoter corresponding to the access data; i = 1, 2, ..., n indexes the different types of promoter; and P_i is the proportion of the total number of promoters accounted for by the promoters of the i-th type.
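As a worked illustration only, suppose 8 access records carry three distinct channel values with counts 4, 2, and 2. These counts are hypothetical, and the natural logarithm is assumed because the source does not state the base. The proportions are then 1/2, 1/4, and 1/4, and the distribution entropy follows from the Shannon form:

```latex
\[
H(X) = -\sum_{i=1}^{3} P_i \ln P_i
     = -\left(\tfrac{1}{2}\ln\tfrac{1}{2} + \tfrac{1}{4}\ln\tfrac{1}{4} + \tfrac{1}{4}\ln\tfrac{1}{4}\right)
     = \tfrac{3}{2}\ln 2 \approx 1.04
\]
```

If all 8 records carried the same channel value, the single proportion would be 1 and the entropy would be 0, which corresponds to the uniform case expected when no hijacking is present.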
S403: According to whether the distribution feature of the promoters meets the diversity condition, determine whether the access data contain a modification of the first promotion information.
This step is the same as step S303 shown in Fig. 3; see the description above, which is not repeated here.
In the above technical solution, the channel field in the access data is used to determine the distribution feature of the promoters. Because a malicious program does not, under normal circumstances, modify the field value of the channel field, the field value can reflect the actual promoter corresponding to the access data. If the field values of the channel field have diversity, the actual promoters corresponding to the access data have diversity. Under normal circumstances the actual promoters corresponding to the access data are relatively uniform, so if the field values of the channel field have diversity, traffic hijacking can be determined to be present in the access data.
It should be noted that, when determining the distribution feature of the network platforms above, approaches other than the distribution entropy may also be used. For example, after obtaining the number of kinds of network platform (denoted M1) and the number of network platforms included in each kind (denoted N1), M1 and N1 can be used as the distribution feature. When determining from the distribution feature whether diversity exists, if M1 reaches a certain threshold and N1 also reaches a certain threshold, the network platforms are determined to have diversity, and the access data are further determined to contain traffic hijacking. The two thresholds need not have the same value.
Similarly, when determining the distribution feature of the promoters above, approaches other than the distribution entropy may also be used. For example, after obtaining the number of kinds of promoter (denoted M2) and the number of promoters included in each kind (denoted N2), M2 and N2 can be used as the distribution feature. When determining from the distribution feature whether diversity exists, if M2 reaches a certain threshold and N2 also reaches a certain threshold, the promoters are determined to have diversity, and the access data are further determined to contain traffic hijacking. The two thresholds need not have the same value; for ease of distinction, the first is called the type threshold and the second the number threshold.
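The count-based alternative can be sketched as follows. The source does not say how the per-kind number is compared with the number threshold when there are several kinds; this sketch assumes every kind must reach it, and that reading, the function name, and the parameter names are all assumptions made for illustration.

```python
from collections import Counter


def has_diversity_by_counts(values, type_threshold, number_threshold):
    """Count-based diversity check using the number of kinds and per-kind counts.

    Assumption: diversity holds when the number of distinct kinds reaches
    the type threshold and every kind's count reaches the number threshold.
    """
    counts = Counter(values)
    kinds = len(counts)               # M: number of distinct field values
    if kinds == 0:
        return False
    smallest = min(counts.values())   # smallest per-kind count N
    return kinds >= type_threshold and smallest >= number_threshold
```

A different reading, such as requiring only some kinds to reach the number threshold, would change the last line; the description leaves this open.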
The structure of the access data monitoring device provided by this application is described below. As shown in Fig. 5, the access data monitoring device may include an access data obtaining module 501, a distribution feature determining module 502, and a distribution feature detection module 503.
The access data obtaining module 501 is configured to obtain, according to first promotion information, access data corresponding to the first promotion information;
the distribution feature determining module 502 is configured to determine the distribution feature of the second promotion information of the access data;
the distribution feature detection module 503 is configured to determine, according to whether the distribution feature meets a preset condition, whether the access data contain a modification of the first promotion information.
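The cooperation of the three modules can be sketched as three small functions. Everything named here is hypothetical: the record layout (dictionaries with a `promoter_id` key and a target field such as `channel`), the function names, and the use of Shannon entropy with the natural logarithm as the distribution feature.

```python
import math
from collections import Counter


def get_access_data(records, promoter_id):
    """Module 501 (sketch): select records carrying the given promoter ID."""
    return [r for r in records if r.get("promoter_id") == promoter_id]


def distribution_feature(records, target_field):
    """Module 502 (sketch): entropy of the target field's values."""
    counts = Counter(r[target_field] for r in records if target_field in r)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def modification_detected(feature, feature_threshold):
    """Module 503 (sketch): the feature reaches the threshold (>=, assumed)."""
    return feature >= feature_threshold


def monitor(records, promoter_id, target_field, feature_threshold):
    """Run the three steps in order for one promoter ID."""
    selected = get_access_data(records, promoter_id)
    feature = distribution_feature(selected, target_field)
    return modification_detected(feature, feature_threshold)
```

Because the check runs over logged records grouped by promoter ID, a sketch like this would sit on the server side, in line with the description's point that no client-side deployment is needed.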
In one example, the distribution feature is a feature that indicates distribution diversity, and the preset condition is a diversity condition.
In one example, the access data obtaining module 501 includes an access data obtaining submodule, configured to obtain, according to the identifier of a promoter, the access data corresponding to the identifier of the promoter.
In one example, the distribution feature determining module 502 includes a target field determining submodule and a distribution feature statistics submodule. The target field determining submodule is configured to determine the target field in the access data, the target field being used to indicate the first promotion information that has not been modified; the distribution feature statistics submodule is configured to count the distribution feature of the field values of the target field.
In one example, the target field determining submodule includes a target field determining unit, configured to determine the user agent field or the channel field in the access data as the target field.
In one example, the distribution feature statistics submodule includes a type and number statistics unit and a distribution feature statistics unit. The type and number statistics unit is configured to count the number of kinds of field value of the target field and the number of field values included in each kind; the distribution feature statistics unit is configured to determine the distribution feature of the field values of the target field according to the number of kinds and the number of field values.
In one example, the distribution feature statistics unit includes a proportion determining subunit and a distribution feature determining subunit. The proportion determining subunit is configured to determine, according to the number of field values included in each kind, the proportion of the total number of field values accounted for by each kind; the distribution feature determining subunit is configured to determine the distribution feature of the field values of the target field according to the number of kinds of field value of the target field and the proportion of the total accounted for by each kind.
In one example, the distribution feature detection module includes a first detection submodule and a second detection submodule. The first detection submodule is configured to determine, if the distribution feature reaches a preset feature threshold, that the access data contain a modification of the first promotion information; the second detection submodule is configured to determine, if the distribution feature does not reach the feature threshold, that the access data do not contain a modification of the first promotion information.
In one example, the distribution feature statistics unit includes a distribution feature statistics subunit, configured to determine the number of kinds and the number of field values as the distribution feature of the field values of the target field.
In one example, the distribution feature detection module includes a third detection submodule and a fourth detection submodule. The third detection submodule is configured to determine, if the number of kinds reaches a preset type threshold and the number of field values reaches a preset number threshold, that the access data contain a modification of the first promotion information; the fourth detection submodule is configured to determine, if the number of kinds does not reach the preset type threshold or the number of field values does not reach the preset number threshold, that the access data do not contain a modification of the first promotion information.
Fig. 6 shows the structure of an access data monitoring device provided by this application. As shown in Fig. 6, the access data monitoring device may specifically include a memory 601, a processor 602, and a bus 603.
The memory 601 is configured to store program instructions and/or data.
The processor 602 is configured to perform the following operations by reading the instructions and/or data stored in the memory 601: obtaining, according to first promotion information, access data corresponding to the first promotion information; determining the distribution feature of the second promotion information of the access data; and determining, according to whether the distribution feature meets a preset condition, whether the access data contain a modification of the first promotion information.
The bus 603 is configured to couple the hardware components of the access data monitoring device together.
It should be noted that, when performing these functions, the processor 602 may follow the specific implementations of the access data monitoring methods described above, which are not repeated here.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another.
It should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The above description of the disclosed embodiments enables those skilled in the art to implement or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. This application is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A method for monitoring access data, characterized by comprising:
obtaining, according to first promotion information, access data corresponding to the first promotion information;
determining a distribution feature of second promotion information of the access data;
determining, according to whether the distribution feature meets a preset condition, whether the access data contain a modification of the first promotion information.
2. The method for monitoring access data according to claim 1, characterized in that the distribution feature is a feature that indicates distribution diversity, and the preset condition is a diversity condition.
3. The method for monitoring access data according to claim 1, characterized in that obtaining, according to first promotion information, access data corresponding to the first promotion information comprises:
obtaining, according to an identifier of a promoter, access data corresponding to the identifier of the promoter.
4. The method for monitoring access data according to claim 1, characterized in that determining the distribution feature of the second promotion information of the access data comprises:
determining a target field in the access data, the target field being promotion information that has not been modified;
counting the distribution feature of the field values of the target field.
5. The method for monitoring access data according to claim 4, characterized in that determining the target field in the access data comprises:
determining a user agent field or a channel field in the access data as the target field.
6. The method for monitoring access data according to claim 4, characterized in that counting the distribution feature of the field values of the target field comprises:
counting the number of kinds of field value of the target field and the number of field values included in each kind;
determining the distribution feature of the field values of the target field according to the number of kinds and the number of field values.
7. The method for monitoring access data according to claim 6, characterized in that determining the distribution feature of the field values of the target field according to the number of kinds and the number of field values comprises:
determining, according to the number of field values included in each kind, the proportion of the total number of field values accounted for by each kind;
determining the distribution feature of the field values of the target field according to the number of kinds of field value of the target field and the proportion of the total accounted for by each kind.
8. The method for monitoring access data according to claim 7, characterized in that determining, according to whether the distribution feature meets a preset condition, whether the access data contain a modification of the first promotion information comprises:
if the distribution feature reaches a preset feature threshold, determining that the access data contain a modification of the first promotion information;
if the distribution feature does not reach the feature threshold, determining that the access data do not contain a modification of the first promotion information.
9. The method for monitoring access data according to claim 6, characterized in that determining the distribution feature of the field values of the target field according to the number of kinds and the number of field values comprises:
determining the number of kinds and the number of field values as the distribution feature of the field values of the target field.
10. The method for monitoring access data according to claim 9, characterized in that determining, according to whether the distribution feature meets a preset condition, whether the access data contain a modification of the first promotion information comprises:
if the number of kinds reaches a preset type threshold and the number of field values reaches a preset number threshold, determining that the access data contain a modification of the first promotion information;
if the number of kinds does not reach the preset type threshold or the number of field values does not reach the preset number threshold, determining that the access data do not contain a modification of the first promotion information.
11. a kind of monitoring device for accessing data characterized by comprising
Data obtaining module is accessed, for obtaining access number corresponding with first promotion message according to the first promotion message According to;
Distribution characteristics determining module, the distribution characteristics of the second promotion message for determining the access data;
Distribution characteristics detection module determines that the access data are for whether meeting preset condition according to the distribution characteristics The no modification existed to first promotion message.
12. the monitoring device of access data according to claim 11, which is characterized in that the distribution characteristics is for table Show that the multifarious feature of distribution, the preset condition are diversity condition.
13. the monitoring device of access data according to claim 11, which is characterized in that the foundation first promotes letter Breath obtains access data corresponding with first promotion message, comprising:
According to the mark for promoting main body, access data corresponding with the popularization mark of main body are obtained.
14. the monitoring device of access data according to claim 11, which is characterized in that the determination access data The second promotion message distribution characteristics, comprising:
Determine that the aiming field in the access data, the aiming field are used for the first promotion message for indicating not modified;
Count the distribution characteristics of the field value of the aiming field.
15. A monitoring device for access data, characterized by comprising: a processor and a memory, wherein the processor, by running a software program stored in the memory and calling data stored in the memory, performs at least the following steps:
obtaining, according to a first promotion message, access data corresponding to the first promotion message;
determining a distribution characteristic of a second promotion message of the access data;
determining, according to whether the distribution characteristic meets a preset condition, whether the access data contains a modification to the first promotion message.
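
The three steps recited in the claims above (obtain access data by the first promotion message, count the distribution of a target field, compare it with preset thresholds) can be illustrated with a minimal Python sketch. It is not the patented implementation. The record layout, the key names `promoter_id` and `origin`, and all function names are hypothetical, and the "number of field values" is assumed here to mean the total count of non-empty target-field values.

```python
from collections import Counter


def select_access_data(records, promoter_id, id_key="promoter_id"):
    """Keep the records whose first promotion message (assumed here to be
    a promotion-subject identifier stored under id_key) matches."""
    return [r for r in records if r.get(id_key) == promoter_id]


def field_value_distribution(records, target_field):
    """Count how often each non-empty value of the target field occurs."""
    return Counter(r[target_field] for r in records if r.get(target_field))


def is_modified(distribution, type_threshold, count_threshold):
    """Report a modification only when both thresholds are reached.

    If either the number of distinct values or the total number of
    values stays below its threshold, the result is False.
    """
    type_count = len(distribution)            # number of distinct values
    value_count = sum(distribution.values())  # assumed meaning of "field value number"
    return type_count >= type_threshold and value_count >= count_threshold


def monitor(records, promoter_id, target_field, type_threshold, count_threshold):
    """Run the three steps in order and return the detection result."""
    selected = select_access_data(records, promoter_id)
    distribution = field_value_distribution(selected, target_field)
    return is_modified(distribution, type_threshold, count_threshold)
```

Under these assumptions, a call such as `monitor(records, "p1", "origin", 3, 3)` flags promotion subject `p1` only if its access data shows at least three distinct `origin` values across at least three records. Suitable thresholds depend on the traffic being monitored and are not specified here.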
CN201711048801.5A 2017-10-31 2017-10-31 Access data monitoring method and related equipment Active CN109729054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711048801.5A CN109729054B (en) 2017-10-31 2017-10-31 Access data monitoring method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711048801.5A CN109729054B (en) 2017-10-31 2017-10-31 Access data monitoring method and related equipment

Publications (2)

Publication Number Publication Date
CN109729054A true CN109729054A (en) 2019-05-07
CN109729054B CN109729054B (en) 2021-08-13

Family

ID=66293091

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711048801.5A Active CN109729054B (en) 2017-10-31 2017-10-31 Access data monitoring method and related equipment

Country Status (1)

Country Link
CN (1) CN109729054B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030069824A1 (en) * 2001-03-23 2003-04-10 Restaurant Services, Inc. ("RSI") System, method and computer program product for bid proposal processing using a graphical user interface in a supply chain management framework
CN105631361A (en) * 2014-10-28 2016-06-01 中国移动通信集团终端有限公司 Application program channel source identification method and device
CN106301979A (en) * 2015-05-27 2017-01-04 腾讯科技(北京)有限公司 The method and system of the abnormal channel of detection
CN106933905A (en) * 2015-12-31 2017-07-07 北京国双科技有限公司 The monitoring method and device of web page access data
CN107153971A (en) * 2017-05-05 2017-09-12 北京京东尚科信息技术有限公司 Method and device for recognizing equipment cheating in APP popularizations
CN107274212A (en) * 2017-05-26 2017-10-20 北京小度信息科技有限公司 Cheating recognition methods and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796516A (en) * 2019-09-26 2020-02-14 北京瑞卓喜创科技发展有限公司 Commodity promotion method and device
CN110995532A (en) * 2019-11-19 2020-04-10 上海易点时空网络有限公司 Data processing method and system for resource bit and server
CN111510429A (en) * 2020-03-11 2020-08-07 南京大学 Analysis and detection method and system for flow hijacking in android system application and popularization
CN111510429B (en) * 2020-03-11 2021-07-09 南京大学 Analysis and detection method and system for flow hijacking in android system application and popularization

Also Published As

Publication number Publication date
CN109729054B (en) 2021-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant