CN109947803A - Data processing method, system and storage medium - Google Patents
- Publication number
- CN109947803A CN201910186327.5A
- Authority
- CN
- China
- Prior art keywords
- characteristic information
- apparatus characteristic
- information
- rule
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Debugging And Monitoring (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a data processing method, system and storage medium. The method comprises: obtaining a device feature information set, wherein each element of the set is a piece of device feature information, and the device feature information is used to identify the device information of a terminal device; determining the occurrence count of each piece of device feature information in the set; obtaining the top N pieces of device feature information in descending order of occurrence count; sending the top N pieces of device feature information and a processing request instruction to a data processing client interface; and receiving a processing result returned by the data processing client interface, the processing result being the result of processing the top N pieces of device feature information as indicated by the processing request instruction. The data processing method provided by the embodiments of the present invention has higher processing efficiency.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method, system and storage medium.
Background
For purposes such as security protection and the creation of user profiles, devices such as routers often perform device identification on the terminal devices that access them.
Current device identification methods mainly collect the host name (hostname) information of a terminal device in real time, match the host name information against a previously obtained regular expression set, and identify the device according to the matching result.
The regular expression set, and the other rule sets needed for device identification, are obtained by manually analyzing and processing large amounts of data, which is inefficient.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a device identification method, system and storage medium that overcome the above problems or at least partially solve them.
In a first aspect, the embodiments of the present invention provide a data processing method, comprising:
obtaining a device feature information set, wherein each element of the set is a piece of device feature information, and the device feature information is used to identify the device information of a terminal device;
determining the occurrence count of each piece of device feature information in the device feature information set;
obtaining the top N pieces of device feature information in descending order of occurrence count;
sending the top N pieces of device feature information and a processing request instruction to a data processing client interface;
receiving a processing result returned by the data processing client interface, the processing result being a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
In the course of implementing the present invention, the inventors found that a device feature information set used for extracting regular expressions contains a large amount of repeated device feature information as well as a large amount of device feature information that occurs only once. The more often a piece of device feature information is repeated in the set, the greater the probability that it will occur, whereas device feature information that occurs only once is less likely to recur. Repeated device feature information in the set can therefore be treated as a single piece of device feature information, the device feature information can then be sorted by occurrence count, and the top N pieces can be sent to the data processing client interface for generating rules. The method provided by the embodiments of the present invention thus uses an automated data processing procedure to screen out, from a large amount of data, the portion of the data to be used for generating rules. Even if the screened data is still analyzed manually to generate rules, the workload is greatly reduced and processing efficiency is improved. Screening data with this method also effectively ensures the coverage of rule matching in the subsequent device identification process.
With reference to the first aspect, in a first implementation of the first aspect of the embodiments of the present invention, before obtaining the top N pieces of device feature information in descending order of occurrence count, the method further comprises:
according to the functional relationship between the target statistic of the top i pieces of device feature information and the rank value i, obtaining the rank value i = N corresponding to a predetermined inflection-point target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set that belong to the top i pieces of device feature information to the total number of elements in the set, or the probability density of the top i pieces of device feature information in the set.
The method provided by the embodiments of the present invention uses the marginal effect to determine the value of N, covering as much of the repeated device feature information as possible while excluding device feature information that occurs only once.
With reference to the first aspect or the first implementation of the first aspect, in a second implementation of the first aspect of the embodiments of the present invention, the processing request instruction is a feature information rule creation instruction, and the rules are regular expressions describing the matching relationship between device feature information and device information; the method further comprises:
adding the regular expressions to a feature information rule set.
With reference to the first aspect or the first implementation of the first aspect, in a third implementation of the first aspect of the embodiments of the present invention, the processing request instruction is a mapping rule creation instruction, and the method further comprises:
obtaining the device information identified from the top N pieces of device feature information;
sending the device information to the processing client interface, so that mapping rules describing the mapping between the device information and standard device information are determined according to the device information, the rules being the mapping rules.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect of the embodiments of the present invention, before the obtaining of the device feature information set, the method further comprises:
obtaining device feature information, and identifying device information from the device feature information;
matching the device information using a pre-established mapping rule set;
if no standard device information is matched, adding the device feature information to the device feature information set.
In the method provided by the embodiments of the present invention, device feature information is screened before the device feature information set is obtained: device feature information already covered by the existing mapping rule set is filtered out, which further reduces the data volume and improves processing efficiency.
With reference to the first aspect or the first implementation of the first aspect, in a fifth implementation of the first aspect of the embodiments of the present invention, the device feature information comprises user agent information.
With reference to the first aspect or the first implementation of the first aspect, in a sixth implementation of the first aspect of the embodiments of the present invention, the device feature information comprises host name information.
In a second aspect, the embodiments of the present invention provide a data processing system, comprising:
an information set obtaining unit, configured to obtain a device feature information set, wherein each element of the set is a piece of device feature information, and the device feature information is used to identify the device information of a terminal device;
a device feature information occurrence count statistics unit, configured to determine the occurrence count of each piece of device feature information in the device feature information set;
an occurrence count sorting unit, configured to obtain the top N pieces of device feature information in descending order of occurrence count;
a request instruction sending unit, configured to send the top N pieces of device feature information and a processing request instruction to a data processing client interface;
a processing result receiving unit, configured to receive a processing result returned by the data processing client interface, the processing result being a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
In the course of implementing the present invention, the inventors found that a device feature information set used for extracting regular expressions contains a large amount of repeated device feature information as well as a large amount of device feature information that occurs only once. The more often a piece of device feature information is repeated in the set, the greater the probability that it will occur, whereas device feature information that occurs only once is less likely to recur. Repeated device feature information in the set can therefore be treated as a single piece of device feature information, the device feature information can then be sorted by occurrence count, and the top N pieces can be sent to the data processing client interface for generating rules. The system provided by the embodiments of the present invention thus uses an automated data processing procedure to screen out, from a large amount of data, the portion of the data to be used for generating rules. Even if the screened data is still analyzed manually to generate rules, the workload is greatly reduced and processing efficiency is improved. Screening data with this system also effectively ensures the coverage of rule matching in the subsequent device identification process.
With reference to the second aspect, in a first implementation of the second aspect of the embodiments of the present invention, the system further comprises a threshold determination unit, configured to:
before the top N pieces of device feature information are obtained in descending order of occurrence count, obtain, according to the functional relationship between the target statistic of the top i pieces of device feature information and the rank value i, the rank value i = N corresponding to a predetermined inflection-point target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set that belong to the top i pieces of device feature information to the total number of elements in the set, or the probability density of the top i pieces of device feature information in the set.
The method provided by the embodiments of the present invention uses the marginal effect to determine the value of N, covering as much of the repeated device feature information as possible while excluding device feature information that occurs only once.
With reference to the second aspect or the first implementation of the second aspect, in a second implementation of the second aspect of the embodiments of the present invention, the processing request instruction is a feature information rule creation instruction, and the processing result is regular expressions describing the matching relationship between device feature information and device information; the system further comprises an information adding unit, configured to:
add the regular expressions to a feature information rule set.
With reference to the second aspect or the first implementation of the second aspect, in a third implementation of the second aspect of the embodiments of the present invention, the processing request instruction is a mapping rule creation instruction, and the system further comprises a device information sending unit, configured to:
obtain the device information identified from the top N pieces of device feature information;
send the device information to the processing client interface, so that mapping rules describing the mapping between the device information and standard device information are determined according to the device information, the rules being the mapping rules.
With reference to the third implementation of the second aspect, in a fourth implementation of the second aspect of the embodiments of the present invention, the system further comprises a device feature information set updating unit, configured to:
obtain device feature information, and identify device information from the device feature information;
match the device information using a pre-established mapping rule set;
if no standard device information is matched, add the device feature information to the device feature information set.
In the system provided by the embodiments of the present invention, device feature information is screened before the device feature information set is obtained: device feature information already covered by the existing mapping rule set is filtered out, which further reduces the data volume and improves processing efficiency.
With reference to the second aspect or the first implementation of the second aspect, in a fifth implementation of the second aspect of the embodiments of the present invention, the device feature information comprises user agent information.
With reference to the second aspect or the first implementation of the second aspect, in a sixth implementation of the second aspect of the embodiments of the present invention, the device feature information comprises host name information.
In a third aspect, the embodiments of the present invention provide a computer system, comprising:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, the embodiments of the present invention provide a computer-readable storage medium for storing the instructions of the application programs used by the computer system described in the third aspect.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features and advantages of the present invention are more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a system architecture diagram according to an embodiment of the present invention;
Fig. 2 shows a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 3a shows a hostname marginal effect curve generated according to an embodiment of the present invention;
Fig. 3b shows a hostname marginal effect curve generated according to another embodiment of the present invention;
Fig. 4a shows a UA marginal effect curve generated according to an embodiment of the present invention;
Fig. 4b shows a UA marginal effect curve generated according to another embodiment of the present invention;
Fig. 5 shows a block diagram of a data processing system according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
The data processing method provided by the embodiments of the present invention, and the subsequent device identification method, may be applied to, but are not limited to, the system shown in Fig. 1. In this system, during the data processing provided by the embodiments of the present invention, the cloud server 103 screens device feature information and sends the screened device feature information to the data processing client 105; the data processing client 105 generates corresponding rules from the screened device feature information, and the cloud server 103 receives the rules returned by the data processing client 105. During device identification, the router 101 collects and reports information about the accessing terminal device 102; the cloud server 103 identifies the terminal device 102 from the information reported by the router 101 to obtain its device information, and outputs the device information of the terminal device 102 connected to the router 101 through the terminal device 104 on which a target application is installed.
The router 101 in Fig. 1 may be replaced by another IoT (Internet of Things) device or by an intelligent mobile terminal (such as a smartphone or tablet computer).
The accessing terminal device 102 is a terminal device that connects to the router 101 in order to reach a local area network or the Internet, such as an intelligent mobile terminal (smartphone, tablet computer), a smart home appliance, smart office equipment, or a wearable smart device.
The target application is an application that can communicate with the router 101 and control it.
The data processing client is a computer device equipped with a display screen.
It should be noted that in other application scenarios or implementations, the functions of the above cloud server may instead be performed by a standalone server on the Internet or by a device in the local area network; the embodiments of the present invention place no limitation on this.
The method provided by the embodiments of the present invention is described in detail below with reference to Fig. 2.
As shown in Fig. 2, the data processing method provided by the embodiments of the present invention includes the following operations:
Step 201: obtain a device feature information set, wherein each element of the set is a piece of device feature information, and the device feature information is used to identify the device information of a terminal device.
The embodiments of the present invention do not limit the data source or the data storage format of the device feature information set.
By way of example and not limitation, device feature information may be captured from the Internet by a web crawler tool and added to the device feature information set; device feature information may also be obtained through the interface of a third-party data platform and added to the set; and device feature information that satisfies a preset condition during device identification may also be added to the set.
In practical applications, the preset condition can be determined according to the needs of the actual scenario. By way of example and not limitation, the preset condition may be that the identified device information does not include target information (such as the device model), in which case the corresponding device feature information satisfies the preset condition; and/or that device feature information which the rule set used for device identification cannot match successfully satisfies the preset condition.
By way of example and not limitation, the device feature information set may be stored in a database in the form of a data table, although it is not limited to this form.
Step 202: determine the occurrence count of each piece of device feature information in the device feature information set.
The device feature information set may contain repeated device feature information; the number of repetitions is the occurrence count of that device feature information in the set.
The embodiments of the present invention do not limit the definition of repeated device feature information, which can be defined as needed in practical applications. For example, pieces of device feature information with identical content may be treated as repeated; as another example, pieces of device feature information whose target field has the same value may be treated as repeated.
In the embodiments of the present invention, step 202 can be implemented in many ways. By way of example and not limitation, in one implementation, target elements are determined one after another in a predetermined order; a counter is created for the target element and the elements after it are traversed; whenever an element duplicates the target element, that element is deleted and the counter of the target element is incremented. In another implementation, the elements of the device feature information set are compared pairwise in a predetermined order and grouped according to the comparison results, with duplicate elements placed in the same group; the occurrence count of each piece of device feature information can then be determined by counting the elements in each group.
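The counting in step 202 can be sketched as follows. This is a minimal illustration rather than the implementation described in the patent: it uses a hash-based counter in place of the traversal or pairwise comparison described above, and the function name `count_occurrences`, the `key` parameter and the sample hostnames are all assumptions introduced for this example.

```python
from collections import Counter
from typing import Callable, Iterable


def count_occurrences(
    items: Iterable[str],
    key: Callable[[str], str] = lambda item: item,
) -> Counter:
    """Count occurrences of each device feature information item.

    `key` defines what "repeated" means: the identity function treats items
    with identical content as repeated, while a field extractor treats items
    that share a target field value as repeated.
    """
    return Counter(key(item) for item in items)


# Made-up hostnames, for illustration only.
hostnames = [
    "android-1a2b", "iPhone", "android-1a2b",
    "DESKTOP-X", "iPhone", "android-9f9f",
]

# Definition 1: identical content counts as a repetition.
by_content = count_occurrences(hostnames)

# Definition 2 (hypothetical target field): the text before the first "-".
by_prefix = count_occurrences(hostnames, key=lambda h: h.split("-")[0].lower())
```

Passing the duplicate definition in as a function keeps the counting logic unchanged when the definition of "repeated" changes, which matches the text's statement that the definition is left open.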
Step 203: obtain the top N pieces of device feature information in descending order of occurrence count.
Step 204: send the top N pieces of device feature information and a processing request instruction to the data processing client interface.
The data processing client interface may, but is not limited to, display the top N pieces of device feature information through a human-computer interaction interface for the user to analyze, and obtain the user's control instructions through the human-computer interaction interface in order to create rules.
Step 205: receive the processing result returned by the data processing client interface, the processing result being a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
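Steps 203 and 204 can be sketched as below. The patent does not specify a wire format for the data processing client interface, so the request layout, the field names `instruction`, `items`, `feature` and `count`, and the instruction string are hypothetical; only the descending sort and the top-N cut come from the text.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_n(counts: Counter, n: int) -> List[Tuple[str, int]]:
    """Return the n most frequent items in descending order of count.

    Ties are broken by the item text so that the result is deterministic.
    """
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:n]


def build_request(items: List[Tuple[str, int]], instruction: str) -> Dict:
    """Package the top-N items with a processing request instruction.

    The dictionary layout is an assumption made for this sketch.
    """
    return {
        "instruction": instruction,
        "items": [{"feature": f, "count": c} for f, c in items],
    }


# Made-up occurrence counts, for illustration only.
counts = Counter({"host-a": 3, "host-b": 5, "host-c": 1})
request = build_request(top_n(counts, 2), "create_feature_rules")
```

How the request is transported and how the returned rules are encoded are left open here, as they are in the source.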
In the course of implementing the present invention, the inventors found that a device feature information set used for extracting regular expressions contains a large amount of repeated device feature information as well as a large amount of device feature information that occurs only once. The more often a piece of device feature information is repeated in the set, the greater the probability that it will occur, whereas device feature information that occurs only once is less likely to recur. Repeated device feature information in the set can therefore be treated as a single piece of device feature information, the device feature information can then be sorted by occurrence count, and the top N pieces can be sent to the data processing client interface for generating rules. The method provided by the embodiments of the present invention thus uses an automated data processing procedure to screen out, from a large amount of data, the portion of the data to be used for generating rules. Even if the screened data is still analyzed manually to generate rules, the workload is greatly reduced and processing efficiency is improved. Screening data with this method also effectively ensures the coverage of rule matching in the subsequent device identification process.
In the embodiments of the present invention, N may be a predetermined fixed value, or a value that is adjusted dynamically in a predetermined manner.
If N is adjusted dynamically, the value of N can preferably be determined using the marginal effect of a statistic. Specifically: according to the functional relationship between the target statistic of the top i pieces of device feature information and the rank value i, the rank value i = N corresponding to the inflection-point target statistic is obtained, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set that belong to the top i pieces of device feature information to the total number of elements in the set, or the probability density of the top i pieces of device feature information in the set.
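The ratio form of the target statistic can be written out as follows. The notation is an assumption introduced here, not taken from the source: c_k is the occurrence count of the k-th piece of device feature information after sorting in descending order, I is read as the number of distinct pieces, R(i) is the target statistic, and R* is the inflection-point target statistic. Choosing the smallest i that reaches R* is one possible reading of "the rank value corresponding to the inflection-point target statistic".

```latex
c_1 \ge c_2 \ge \dots \ge c_I, \qquad
R(i) = \frac{\sum_{k=1}^{i} c_k}{\sum_{k=1}^{I} c_k}, \quad 1 \le i \le I, \qquad
N = \min\{\, i : R(i) \ge R^{*} \,\}
```

Because the counts are sorted in descending order, each increment R(i) - R(i-1) = c_i / \sum_k c_k is no larger than the previous one, so the curve rises quickly at first and then flattens. This diminishing gain is the marginal effect that the text relies on.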
The inflection-point target statistic may be a predetermined value or may be determined in other ways. By way of example and not limitation, a coordinate system is established with the target statistic and the rank value as its axes, the curve of the functional relationship between the target statistic of the top i pieces of device feature information and the rank value i (that is, the marginal effect curve) is drawn in the coordinate system, and the coordinate system and the curve are shown through a human-computer interaction interface. In one implementation, a display control that can move along the curve is generated and displayed; the display control shows the coordinate information of its current position and also reports detected control events (such as clicks and double-clicks); after the target control event and the corresponding coordinate point reported by the display control are received, the inflection-point target statistic and the value of N are determined from the coordinate point. In another implementation, after a target control event is detected, the cursor position corresponding to the event is obtained, the coordinate point corresponding to the cursor position is determined according to a predetermined mapping relationship, and the inflection-point target statistic and the value of N are determined from that coordinate point.
The method provided by the embodiments of the present invention uses the marginal effect to determine the value of N, covering as much of the repeated device feature information as possible while excluding device feature information that occurs only once.
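A minimal sketch of choosing N from a predetermined inflection-point value is given below, using the ratio form of the target statistic. The function names `coverage_curve` and `choose_n`, the rule "smallest rank whose cumulative ratio reaches the given value", and the sample counts are assumptions made for this example; the interactive selection on a plotted curve described above is not modeled.

```python
from itertools import accumulate
from typing import List


def coverage_curve(sorted_counts: List[int]) -> List[float]:
    """Cumulative share of all elements covered by the top i items, i = 1..I."""
    total = sum(sorted_counts)
    if total == 0:
        return []
    return [running / total for running in accumulate(sorted_counts)]


def choose_n(counts: List[int], inflection_value: float) -> int:
    """Smallest rank i whose cumulative share reaches `inflection_value`.

    Falls back to the number of distinct items if the value is never reached.
    """
    ordered = sorted(counts, reverse=True)
    for rank, ratio in enumerate(coverage_curve(ordered), start=1):
        if ratio >= inflection_value:
            return rank
    return len(ordered)


# Made-up occurrence counts (100 elements in total), for illustration only.
example_counts = [50, 30, 10, 5, 3, 1, 1]
n = choose_n(example_counts, 0.85)  # the top 3 items cover 90 of 100 elements
```

In this made-up example the last four items together add only 10 percent of the coverage, which is the kind of tail the marginal effect criterion is meant to exclude.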
The method provided by the embodiments of the present invention can be applied in a variety of scenarios.
For example, the method can be used to update a feature information rule set. Correspondingly, the processing request instruction is a feature information rule creation instruction, and the rules are regular expressions describing the matching relationship between device feature information and device information. After step 205, the regular expressions are also added to the feature information rule set.
The device feature information may be UA (User Agent) information.
The device feature information may also be hostname (host name) information. Taking hostname as an example, a large amount of hostname information in the database cannot be resolved to a brand and model by the parser, so this hostname information needs to be analyzed and the corresponding regular expressions added. In this example, the hostname data to be sorted through amounts to about 20 million records. The occurrence frequency of the hostname data is counted and arranged in descending order, a marginal effect curve such as the one shown in Fig. 3a or Fig. 3b is drawn in the manner described above, and the hostname information is screened according to the marginal effect graph, so that only about 2000 pieces of hostname information need to be sorted through manually.
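Adding returned regular expressions to a feature information rule set, and then using the set to identify a hostname, could look like the sketch below. The rule layout (pattern, brand, model), the helper names `add_rule` and `identify`, and the pattern and device names are all hypothetical; the source only states that the rules are regular expressions relating device feature information to device information.

```python
import re
from typing import List, Optional, Tuple

# A rule pairs a compiled pattern with the device information it implies.
Rule = Tuple[re.Pattern, str, str]  # (pattern, brand, model)


def add_rule(rule_set: List[Rule], pattern: str, brand: str, model: str) -> None:
    """Compile a regular expression and append it to the rule set."""
    rule_set.append((re.compile(pattern, re.IGNORECASE), brand, model))


def identify(rule_set: List[Rule], hostname: str) -> Optional[Tuple[str, str]]:
    """Return (brand, model) from the first matching rule, or None."""
    for pattern, brand, model in rule_set:
        if pattern.search(hostname):
            return brand, model
    return None


rules: List[Rule] = []
# Hypothetical rule and device names, for illustration only.
add_rule(rules, r"^examplephone[-_]x1", "ExampleBrand", "X1")

matched = identify(rules, "ExamplePhone-X1-livingroom")
unmatched = identify(rules, "unknown-host")
```

A hostname for which `identify` returns `None` is the kind of unresolved record that, per the text, would be fed back into the screening process.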
As another example, the method provided in the embodiments of the present invention can be used to update a mapping rule set. The mapping rule set is used to standardize identified device information by mapping it to standard device information. Correspondingly, the processing request instruction is a mapping rule creation instruction, and the method further comprises: obtaining the device information identified from the top N pieces of device feature information; and sending the device information to the data processing client interface, so that a mapping rule describing the mapping between the device information and standard device information is determined according to the device information, the rule being the mapping rule.
Further, before the device feature information set is obtained, device feature information is obtained and device information is identified from it; the device information is matched against a pre-established mapping rule set; and if no standard device information is matched, the device feature information is added to the device feature information set.
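The pre-filtering step above can be sketched as follows. This is a hedged illustration: `collect_unmapped`, `toy_identify`, the `brand/model` feature format, and the dictionary-based mapping rule set are all assumptions introduced for the example, not details given in the source.

```python
def collect_unmapped(features, identify, mapping_rules):
    """Keep only features whose identified device info has no standard mapping."""
    unmapped = []
    for feature in features:
        device_info = identify(feature)        # e.g. (brand, model), or None
        if device_info is None or device_info not in mapping_rules:
            unmapped.append(feature)           # duplicates are kept for later counting
    return unmapped

def toy_identify(feature):
    # Hypothetical parser: treats "brand/model" strings as already structured.
    brand, _, model = feature.partition("/")
    return (brand, model) if model else None

# Hypothetical pre-established mapping rule set.
mapping_rules = {("acme", "x1"): ("Acme", "X1")}
pending = collect_unmapped(
    ["acme/x1", "acme/x2", "garbled", "acme/x2"], toy_identify, mapping_rules
)
```

Duplicates are deliberately preserved in `pending`, because the later frequency count depends on how often each unmapped feature occurs.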
Taking UA information as an example, a device information triple (MAC address, brand, model) is obtained by a UA parser. The brand and model in the triple serve as a composite key that is associated with the brand and model in the knowledge base (i.e., the mapping rule set); results that cannot be associated enter an automatic evaluation mechanism, where they are processed with the data screening method described above. In a specific example, 7 million UA records to be sorted out are deduplicated and their frequencies of occurrence counted, then arranged in descending order of frequency, and the marginal effect curves shown in Fig. 4a and Fig. 4b are drawn. The axes of Fig. 4a show the number of distinct UAs and the share of the total UA volume that they account for. As the number of distinct UAs increases, the share of the total UA volume grows rapidly; once the UA volume reaches a certain magnitude, the share levels off and approaches 1. Fig. 4b further shows that the probability density covered by the first 1,000 UAs is almost 100%. This means that combing the first 1,000 distinct UAs and extracting rule information from them covers almost all of the UA data. With this method, the maintenance workload for millions of UAs is reduced to 1,000.
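The cumulative share behind a curve such as Fig. 4a can be computed directly from the frequency counts. The sketch below is one reading of that curve, with the assumed name `coverage_curve` and toy data in place of the 7 million UA records.

```python
from collections import Counter
from itertools import accumulate

def coverage_curve(features):
    """For each rank i (1-based), return the share of all records covered
    by the i most frequent distinct values."""
    counts = sorted(Counter(features).values(), reverse=True)
    total = sum(counts)
    return [covered / total for covered in accumulate(counts)]

# Toy data: one dominant UA, one common UA, one rare UA.
uas = ["ua-a"] * 6 + ["ua-b"] * 3 + ["ua-c"]
curve = coverage_curve(uas)
```

Here the first two distinct values already cover nine tenths of the records, which is the flattening behaviour the text describes at much larger scale.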
With the method provided in the embodiments of the present invention, the device feature information is screened before the device feature information set is obtained: device feature information already covered by the existing mapping rule set is filtered out, which further reduces the data volume and improves processing efficiency.
An embodiment of the present invention provides a data processing system which, as shown in Fig. 5, comprises:
an information set acquiring unit 501, configured to obtain a device feature information set, where the elements of the device feature information set are device feature information, and the device feature information is used to identify device information of a terminal device;
a device feature information frequency counting unit 502, configured to determine the number of occurrences of each piece of device feature information in the device feature information set;
a frequency sorting unit 503, configured to obtain the top N pieces of device feature information in descending order of number of occurrences;
a request instruction sending unit 504, configured to send the top N pieces of device feature information and a processing request instruction to a data processing client interface;
a processing result receiving unit 505, configured to receive the processing result returned by the data processing client interface, where the processing result is a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
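The division of labour among units 501 to 505 can be sketched as a single function. This is an assumption-laden outline: the source does not specify the signature of the data processing client interface, so `ClientInterface`, `run_units`, and `stub_client` are hypothetical names, and the client is modelled as a plain callable.

```python
from collections import Counter
from typing import Callable, Iterable, List

# Hypothetical shape of the data processing client interface:
# it takes the top-N features plus an instruction and returns rules.
ClientInterface = Callable[[List[str], str], List[str]]

def run_units(features: Iterable[str], n: int, instruction: str,
              client: ClientInterface) -> List[str]:
    feature_set = list(features)                    # unit 501: obtain the set
    counts = Counter(feature_set)                   # unit 502: count occurrences
    top_n = [f for f, _ in counts.most_common(n)]   # unit 503: top N, descending
    rules = client(top_n, instruction)              # unit 504 sends, unit 505 receives
    return rules

def stub_client(top_n, instruction):
    # Stand-in for manual or automated rule authoring behind the interface.
    return [f"{instruction}:{feature}" for feature in top_n]

rules = run_units(
    ["ua-a", "ua-b", "ua-a", "ua-c", "ua-a", "ua-b"], 2, "create_regex", stub_client
)
```

Passing the client in as a callable keeps the screening logic independent of whether rules are produced by a person or by another program.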
In implementing the present invention, the inventors found that the device feature information set used to extract the regular-expression set contains a large amount of repeated device feature information as well as a large amount of device feature information that occurs only once. The more often a piece of device feature information is repeated in the set, the more likely it is to occur; device feature information that occurs only once is less likely to recur. Therefore, repeated device feature information in the set can be treated as a single piece of device feature information, the device feature information can then be sorted by number of occurrences, and the top N pieces sent to the data processing client interface for rule generation. The system provided in the embodiments of the present invention thus uses an automated data processing procedure to screen out, from a massive amount of data, the subset used for rule generation. Even if the screened data are analyzed manually to generate rules, the workload is greatly reduced and processing efficiency is improved. Screening data with the system also effectively guarantees the coverage of rule matching in the subsequent device identification process.
Optionally, the system further comprises a threshold determining unit, configured to:
before the top N pieces of device feature information are obtained in descending order of number of occurrences, obtain, from the functional relationship between the target statistic of the first i pieces of device feature information and the rank i, the rank i = N corresponding to a predetermined inflection-point value of the target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set accounted for by the first i pieces of device feature information to the total number of elements in the set, or the probability density of the first i pieces of device feature information in the set.
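Written out under one reading of this definition, with $c_k$ (an assumed symbol, not from the source) denoting the number of occurrences of the k-th most frequent piece of device feature information and $F^{*}$ the predetermined inflection-point value, the ratio form of the target statistic and the resulting N are:

```latex
F(i) = \frac{\sum_{k=1}^{i} c_k}{\sum_{k=1}^{I} c_k}, \qquad 1 \le i \le I, \qquad
N = \min \{\, i : F(i) \ge F^{*} \,\}
```

Taking the smallest such i is an assumption; the source only states that N is the rank corresponding to the inflection-point value.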
The method provided in the embodiments of the present invention determines the value of N using the marginal effect, so that repeated device feature information is covered as far as possible and one-off device feature information is excluded.
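One way to pick N from the marginal effect is to walk down the ranked counts until the cumulative share reaches a chosen value. The sketch below assumes the ratio form of the target statistic and a caller-supplied inflection value `knee_value`; both the name `choose_n` and the "first rank that reaches the value" rule are assumptions, since the source does not say how the inflection point is located.

```python
from collections import Counter
from itertools import accumulate

def choose_n(features, knee_value):
    """Return the smallest rank i whose cumulative share of all records
    is at least knee_value (0 if there are no records)."""
    counts = sorted(Counter(features).values(), reverse=True)
    total = sum(counts)
    for i, covered in enumerate(accumulate(counts), start=1):
        if covered / total >= knee_value:
            return i
    return len(counts)   # knee_value never reached, e.g. a value above 1

features = ["ua-a"] * 6 + ["ua-b"] * 3 + ["ua-c"]
n = choose_n(features, 0.85)
```

With these toy counts the two most frequent values cover 90% of the records, so the one-off value `ua-c` falls outside the top N.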
Optionally, the processing request instruction is a feature information regular-expression creation instruction, and the processing result is a regular expression describing the matching relationship between device feature information and device information; the system further comprises an information adding unit, configured to:
add the regular expression to the feature information rule set.
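Assuming the rule is a regular expression paired with the device information it implies, the rule set and the add step could look like the sketch below. The list-based `rule_set`, the helper names, and the `acme-x1` pattern are illustrative assumptions only.

```python
import re

# Feature information rule set: (compiled pattern, (brand, model)) pairs.
rule_set = []

def add_rule(pattern, brand, model):
    """Append a newly authored regular expression to the rule set."""
    rule_set.append((re.compile(pattern, re.IGNORECASE), (brand, model)))

def identify(feature):
    """Return the device info of the first matching rule, or None."""
    for pattern, info in rule_set:
        if pattern.search(feature):
            return info
    return None

# Hypothetical rule returned by the data processing client interface.
add_rule(r"^acme-x1", "Acme", "X1")
matched = identify("Acme-X1-of-Li")
unmatched = identify("printer-42")
```

A hostname that no rule matches returns `None`, which is the case that feeds back into the screening process for further rule authoring.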
Optionally, the processing request instruction is a mapping rule creation instruction, and the system further comprises a device information sending unit, configured to:
obtain the device information identified from the top N pieces of device feature information;
send the device information to the data processing client interface, so that a mapping rule describing the mapping between the device information and standard device information is determined according to the device information, the rule being the mapping rule.
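A mapping rule of this kind can be modelled as a lookup from identified (brand, model) pairs to their standard form. The sketch below is an assumption: the dictionary representation, the case-insensitive key, and the sample values are not specified in the source.

```python
# Mapping rule set: identified (brand, model) -> standard (brand, model).
mapping_rules = {}

def add_mapping(raw_brand, raw_model, std_brand, std_model):
    """Record a mapping rule, keyed case-insensitively on brand and model."""
    mapping_rules[(raw_brand.lower(), raw_model.lower())] = (std_brand, std_model)

def standardize(brand, model):
    """Return the standard device info, or None when no rule applies."""
    return mapping_rules.get((brand.lower(), model.lower()))

# Hypothetical rule returned by the data processing client interface.
add_mapping("ACME", "x1 pro", "Acme", "X1 Pro")
standard = standardize("acme", "X1 PRO")
missing = standardize("acme", "x2")
```

Keying on the brand and model together mirrors the composite key used to associate parsed device information with the knowledge base.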
Optionally, the system further comprises a device feature information set updating unit, configured to:
obtain device feature information, and identify device information from the device feature information;
match the device information against a pre-established mapping rule set;
if no standard device information is matched, add the device feature information to the device feature information set.
With the system provided in the embodiments of the present invention, the device feature information is screened before the device feature information set is obtained: device feature information already covered by the existing mapping rule set is filtered out, which further reduces the data volume and improves processing efficiency.
Optionally, the device feature information includes user agent information.
With reference to the second aspect or the first implementation of the second aspect, in a sixth implementation of the second aspect of the embodiments of the present invention, the device feature information includes hostname information.
An embodiment of the present invention provides a computer system, comprising:
one or more processors;
a memory;
one or more application programs, where the one or more application programs are stored in the memory and configured to be executed by the one or more processors to implement the method described in any of the above implementations.
An embodiment of the present invention provides a computer-readable storage medium for storing the instructions of the application programs used by the above computer system.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In the description provided here, numerous specific details are set forth. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the system according to the embodiments of the invention. The invention may also be implemented as a device or device program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
The invention discloses:
A1. A data processing method, comprising:
obtaining a device feature information set, where the elements of the device feature information set are device feature information, and the device feature information is used to identify device information of a terminal device;
determining the number of occurrences of each piece of device feature information in the device feature information set;
obtaining the top N pieces of device feature information in descending order of number of occurrences;
sending the top N pieces of device feature information and a processing request instruction to a data processing client interface;
receiving a processing result returned by the data processing client interface, where the processing result is a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
A2. The method according to A1, wherein before the top N pieces of device feature information are obtained in descending order of number of occurrences, the method further comprises:
obtaining, from the functional relationship between the target statistic of the first i pieces of device feature information and the rank i, the rank i = N corresponding to a predetermined inflection-point value of the target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set accounted for by the first i pieces of device feature information to the total number of elements in the set, or the probability density of the first i pieces of device feature information in the set.
A3. The method according to A1 or A2, wherein the processing request instruction is a feature information regular-expression creation instruction, and the rule is a regular expression describing the matching relationship between device feature information and device information; the method further comprises:
adding the regular expression to the feature information rule set.
A4. The method according to A1 or A2, wherein the processing request instruction is a mapping rule creation instruction, and the method further comprises:
obtaining the device information identified from the top N pieces of device feature information;
sending the device information to the data processing client interface, so that a mapping rule describing the mapping between the device information and standard device information is determined according to the device information, the rule being the mapping rule.
A5. The method according to A4, wherein before the device feature information set is obtained, the method further comprises:
obtaining device feature information, and identifying device information from the device feature information;
matching the device information against a pre-established mapping rule set;
if no standard device information is matched, adding the device feature information to the device feature information set.
A6. The method according to A1 or A2, wherein the device feature information includes user agent information.
A7. The method according to A1 or A2, wherein the device feature information includes hostname information.
B8. A data processing system, comprising:
an information set acquiring unit, configured to obtain a device feature information set, where the elements of the device feature information set are device feature information, and the device feature information is used to identify device information of a terminal device;
a device feature information frequency counting unit, configured to determine the number of occurrences of each piece of device feature information in the device feature information set;
a frequency sorting unit, configured to obtain the top N pieces of device feature information in descending order of number of occurrences;
a request instruction sending unit, configured to send the top N pieces of device feature information and a processing request instruction to a data processing client interface;
a processing result receiving unit, configured to receive the processing result returned by the data processing client interface, where the processing result is a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
B9. The system according to B8, further comprising a threshold determining unit, configured to:
before the top N pieces of device feature information are obtained in descending order of number of occurrences, obtain, from the functional relationship between the target statistic of the first i pieces of device feature information and the rank i, the rank i = N corresponding to a predetermined inflection-point value of the target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set accounted for by the first i pieces of device feature information to the total number of elements in the set, or the probability density of the first i pieces of device feature information in the set.
B11. The system according to B8 or B9, wherein the processing request instruction is a feature information regular-expression creation instruction, and the rule is a regular expression describing the matching relationship between device feature information and device information; the system further comprises an information adding unit, configured to:
add the regular expression to the feature information rule set.
B12. The system according to B8 or B9, wherein the processing request instruction is a mapping rule creation instruction, and the system further comprises a device information sending unit, configured to:
obtain the device information identified from the top N pieces of device feature information;
send the device information to the data processing client interface, so that a mapping rule describing the mapping between the device information and standard device information is determined according to the device information, the rule being the mapping rule.
B13. The system according to B12, further comprising a device feature information set updating unit, configured to:
obtain device feature information, and identify device information from the device feature information;
match the device information against a pre-established mapping rule set;
if no standard device information is matched, add the device feature information to the device feature information set.
B14. The system according to B8 or B9, wherein the device feature information includes user agent information.
B15. The system according to B8 or B9, wherein the device feature information includes hostname information.
C16. A computer system, comprising:
one or more processors;
a memory;
one or more application programs, where the one or more application programs are stored in the memory and configured to be executed by the one or more processors to implement the method according to any one of A1 to A7.
D17. A computer-readable storage medium for storing the instructions of the application programs used by the computer system according to C16.
Claims (10)
1. A data processing method, characterized by comprising:
obtaining a device feature information set, where the elements of the device feature information set are device feature information, and the device feature information is used to identify device information of a terminal device;
determining the number of occurrences of each piece of device feature information in the device feature information set;
obtaining the top N pieces of device feature information in descending order of number of occurrences;
sending the top N pieces of device feature information and a processing request instruction to a data processing client interface;
receiving a processing result returned by the data processing client interface, where the processing result is a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
2. The method according to claim 1, characterized in that before the top N pieces of device feature information are obtained in descending order of number of occurrences, the method further comprises:
obtaining, from the functional relationship between the target statistic of the first i pieces of device feature information and the rank i, the rank i = N corresponding to a predetermined inflection-point value of the target statistic, where 1 ≤ i ≤ I, I is the total number of pieces of device feature information in the device feature information set, and the target statistic is either the ratio of the number of elements in the set accounted for by the first i pieces of device feature information to the total number of elements in the set, or the probability density of the first i pieces of device feature information in the set.
3. The method according to claim 1 or 2, characterized in that the processing request instruction is a feature information regular-expression creation instruction, and the rule is a regular expression describing the matching relationship between device feature information and device information; the method further comprises:
adding the regular expression to the feature information rule set.
4. The method according to claim 1 or 2, characterized in that the processing request instruction is a mapping rule creation instruction, and the method further comprises:
obtaining the device information identified from the top N pieces of device feature information;
sending the device information to the data processing client interface, so that a mapping rule describing the mapping between the device information and standard device information is determined according to the device information, the rule being the mapping rule.
5. The method according to claim 4, characterized in that before the device feature information set is obtained, the method further comprises:
obtaining device feature information, and identifying device information from the device feature information;
matching the device information against a pre-established mapping rule set;
if no standard device information is matched, adding the device feature information to the device feature information set.
6. The method according to claim 1 or 2, characterized in that the device feature information includes user agent information.
7. The method according to claim 1 or 2, characterized in that the device feature information includes hostname information.
8. A data processing system, characterized by comprising:
an information set acquiring unit, configured to obtain a device feature information set, where the elements of the device feature information set are device feature information, and the device feature information is used to identify device information of a terminal device;
a device feature information frequency counting unit, configured to determine the number of occurrences of each piece of device feature information in the device feature information set;
a frequency sorting unit, configured to obtain the top N pieces of device feature information in descending order of number of occurrences;
a request instruction sending unit, configured to send the top N pieces of device feature information and a processing request instruction to a data processing client interface;
a processing result receiving unit, configured to receive the processing result returned by the data processing client interface, where the processing result is a plurality of rules obtained by processing the top N pieces of device feature information as indicated by the processing request instruction.
9. A computer system, characterized by comprising:
one or more processors;
a memory;
one or more application programs, where the one or more application programs are stored in the memory and configured to be executed by the one or more processors to implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it is used for storing the instructions of the application programs used by the computer system according to claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910186327.5A CN109947803B (en) | 2019-03-12 | 2019-03-12 | Data processing method, system and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910186327.5A CN109947803B (en) | 2019-03-12 | 2019-03-12 | Data processing method, system and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109947803A true CN109947803A (en) | 2019-06-28 |
CN109947803B CN109947803B (en) | 2021-11-19 |
Family
ID=67009687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910186327.5A Active CN109947803B (en) | 2019-03-12 | 2019-03-12 | Data processing method, system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109947803B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110345056A (en) * | 2019-07-12 | 2019-10-18 | 四川虹美智能科技有限公司 | SCM Based data processing method, driver, controller and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101145157A (en) * | 2007-06-14 | 2008-03-19 | 中兴通讯股份有限公司 | XML format embedded type apparatus characteristic information analysis method |
CN105162888A (en) * | 2015-09-30 | 2015-12-16 | 北京奇虎科技有限公司 | Remote tracking method for intelligent wearable device, terminal and server |
CN106407768A (en) * | 2015-07-29 | 2017-02-15 | 阿里巴巴集团控股有限公司 | Methods and devices for determining device fingerprint and identifying target device |
CN106603510A (en) * | 2016-11-28 | 2017-04-26 | 深圳市金立通信设备有限公司 | Data processing method and terminal |
US20180004815A1 (en) * | 2015-12-01 | 2018-01-04 | Huawei Technologies Co., Ltd. | Stop word identification method and apparatus |
US20180157712A1 (en) * | 2015-05-06 | 2018-06-07 | Örjan Vestgöte Technology AB | Method, system and computer program product for performing numeric searches |
CN108959585A (en) * | 2018-07-10 | 2018-12-07 | 维沃移动通信有限公司 | A kind of expression picture acquisition methods and terminal device |
Non-Patent Citations (1)
Title |
---|
陆泽橼等: "基于雷达脉冲压缩信号的辐射源个体识别技术", 《电脑知识与技术》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109947803B (en) | 2021-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108776934B (en) | Distributed data calculation method and device, computer equipment and readable storage medium | |
US9973521B2 (en) | System and method for field extraction of data contained within a log stream | |
US11915104B2 (en) | Normalizing text attributes for machine learning models | |
CN106649831B (en) | Data filtering method and device | |
CN109951354A (en) | A kind of terminal device recognition methods, system and storage medium | |
CN104036004B (en) | Search for error correction method and search error correction device | |
CN114861910B (en) | Compression method, device, equipment and medium of neural network model | |
CN112463859B (en) | User data processing method and server based on big data and business analysis | |
CN113051308A (en) | Alarm information processing method, equipment, storage medium and device | |
CN113536770B (en) | Text analysis method, device and equipment based on artificial intelligence and storage medium | |
CN104933096B (en) | Abnormal key identification method and device for a database, and data system | |
CN109947803A (en) | A kind of data processing method, system and storage medium | |
CN107330031B (en) | Data storage method and device and electronic equipment | |
CN111368128B (en) | Target picture identification method, device and computer readable storage medium | |
WO2019024238A1 (en) | Range value data statistical method and system, electronic device, and computer readable storage medium | |
CN109040089B (en) | Network policy auditing method, equipment and computer readable storage medium | |
US11663184B2 (en) | Information processing method of grouping data, information processing system for grouping data, and non-transitory computer readable storage medium | |
CN110532267A (en) | Method, apparatus, storage medium and electronic device for determining a field | |
US10438695B1 (en) | Semi-automated clustered case resolution system | |
CN114356712A (en) | Data processing method, device, equipment, readable storage medium and program product | |
CN110083357B (en) | Interface construction method, device, server and storage medium | |
CN112508518A (en) | RPA flow generation method combining RPA and AI, corresponding device and readable storage medium | |
US11132235B2 (en) | Data processing method, distributed data processing system and storage medium | |
CN109559139A (en) | Item object processing method, device, medium and electronic equipment | |
CN110119406B (en) | Method and device for checking real-time task records |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2021-09-07. Address after: No. 1201, 12/F, Building 6, No. 599, Shijicheng South Road, Chengdu Hi-tech Zone, China (Sichuan) Pilot Free Trade Zone, Chengdu, Sichuan 610094. Applicant after: Chengdu panorama Intelligent Technology Co.,Ltd. Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park). Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | ||