CN114116172A - Flow data acquisition method, device, equipment and storage medium - Google Patents

Publication number
CN114116172A
CN114116172A CN202111454126.2A CN202111454126A CN114116172A CN 114116172 A CN114116172 A CN 114116172A CN 202111454126 A CN202111454126 A CN 202111454126A CN 114116172 A CN114116172 A CN 114116172A
Authority
CN
China
Prior art keywords
data acquisition
data
strategy
strategies
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111454126.2A
Other languages
Chinese (zh)
Inventor
袁堂岭
邹学强
苗玲玲
刘中金
鲁睿
王元杰
庞韶敏
尚程
傅强
梁彧
蔡琳
田野
王杰
杨满智
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Eversec Beijing Technology Co Ltd
Original Assignee
National Computer Network and Information Security Management Center
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center and Eversec Beijing Technology Co Ltd
Priority to CN202111454126.2A
Publication of CN114116172A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5021 Priority
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/508 Monitor

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a traffic data acquisition method, device, equipment and storage medium. The method comprises: receiving a plurality of data acquisition strategies, and transmitting each data acquisition strategy to the corresponding hierarchical processing module according to the communication layer corresponding to that strategy, wherein the data acquisition strategies are preset according to service requirements; executing the plurality of data acquisition strategies through the hierarchical processing modules, and monitoring in real time the resources occupied by the Deep Packet Inspection (DPI) system during operation to obtain a resource occupation result; and, if the resource occupation result exceeds a preset threshold, determining in turn, through each hierarchical processing module and according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed among the plurality of data acquisition strategies, and executing the target acquisition strategies in turn. The technical scheme of the embodiments of the invention enables on-demand acquisition of traffic data in the mobile internet and balances service requirements against resource occupation.

Description

Flow data acquisition method, device, equipment and storage medium
Technical Field
Embodiments of the present invention relate to the field of computer technology, and in particular to a traffic data acquisition method, device, equipment and storage medium.
Background
With the development of the mobile internet, analyzing the traffic data of mobile devices has become an important task in the mobile internet industry. Such analysis reveals how often, during which time periods, and in what proportions by type users access websites and software, which is a valuable reference for the improvement decisions of mobile internet enterprises and for anticipating future trends.
At present, Deep Packet Inspection (DPI) systems are widely used as an effective data acquisition tool for mobile internet traffic data. Existing DPI systems usually acquire mobile internet traffic data in either a full acquisition mode or a sampling acquisition mode.
However, the full acquisition mode is costly, its deployment and capacity expansion lag noticeably, and it does not support flexible service changes. The sampling acquisition mode reduces the volume of acquired data, but the acquired data are random, so it cannot fundamentally achieve on-demand data acquisition and can hardly meet the diverse service requirements of a business analysis department.
Disclosure of Invention
Embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for acquiring traffic data, which can realize acquisition of traffic data in a mobile internet on demand, and satisfy a balance between a service demand and resource occupation.
In a first aspect, an embodiment of the present invention provides a traffic data acquisition method, which is applied to a Deep Packet Inspection (DPI) system and includes:
receiving a plurality of data acquisition strategies, and respectively transmitting each data acquisition strategy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition strategy; the data acquisition strategy is preset according to the service requirement;
executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring resources occupied by a DPI system in the running process in real time to obtain a resource occupation result;
and if the resource occupation result exceeds a preset threshold, sequentially determining target acquisition strategies to be executed in the plurality of data acquisition strategies through each hierarchical processing module according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
In a second aspect, an embodiment of the present invention further provides a traffic data acquisition device, which is applied to a Deep Packet Inspection (DPI) system and includes:
the strategy receiving module, which is used for receiving a plurality of data acquisition strategies and transmitting each data acquisition strategy to the corresponding hierarchical processing module according to the communication layer corresponding to that strategy; the data acquisition strategy is preset according to the service requirement;
the resource monitoring module, which is used for executing the plurality of data acquisition strategies through each hierarchical processing module and monitoring in real time the resources occupied by the DPI system during operation to obtain a resource occupation result;
and the strategy execution module, which is used for, if the resource occupation result exceeds a preset threshold, determining in turn, through each hierarchical processing module and according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed among the plurality of data acquisition strategies, and executing the target acquisition strategies in turn.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement a method for collecting traffic data according to any of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program implements a flow data collection method provided in any embodiment of the present invention.
In the technical scheme of the embodiments of the invention, a plurality of data acquisition strategies are received and each is transmitted to the corresponding hierarchical processing module according to its communication layer; the hierarchical processing modules execute the strategies while the resources occupied by the DPI system during operation are monitored in real time to obtain a resource occupation result; and if the resource occupation result exceeds a preset threshold, each hierarchical processing module determines in turn, according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed and executes them in turn.
Drawings
Fig. 1 is a flowchart of a traffic data acquisition method according to the first embodiment of the present invention;
Fig. 2 is a flowchart of a traffic data acquisition method according to the second embodiment of the present invention;
Fig. 3a is a flowchart of a traffic data acquisition method according to the third embodiment of the present invention;
Fig. 3b is a schematic diagram of a DPI system according to the third embodiment of the present invention;
Fig. 4 is a structural diagram of a traffic data acquisition device according to the fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a computer device according to the fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a traffic data acquisition method according to the first embodiment of the present invention. This embodiment is applicable to the case where a DPI system acquires traffic data in the mobile internet. The method may be executed by a traffic data acquisition device, which may be implemented in software and/or hardware and is generally integrated in a terminal or server with data processing capability. The method specifically includes the following steps:
Step 110, receiving a plurality of data acquisition strategies, and transmitting each data acquisition strategy to the corresponding hierarchical processing module according to the communication layer corresponding to that strategy.
In this embodiment, the data acquisition strategies are preset according to service requirements. Specifically, a data acquisition strategy may be configured manually in advance according to a specific service requirement (for example, acquiring the traffic data of a target interface in a target service).
In this step, after the plurality of data acquisition strategies are received, the communication layer corresponding to each strategy may be determined from the source of the data to be acquired as specified in that strategy. For example, if the data acquisition strategy is "acquire Domain Name System (DNS) raw packet data whose requested domain name is xxxx", the communication layer corresponding to the strategy may be determined to be the network layer.
In this embodiment, hierarchical processing modules corresponding to the respective communication layers are deployed in the DPI system in advance. After the communication layer corresponding to a data acquisition strategy is determined as described above, the strategy can be transmitted to the corresponding hierarchical processing module. For example, if the communication layer corresponding to a data acquisition strategy is the network layer, the strategy may be transmitted to the network layer processing module in the DPI system.
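The routing in step 110 can be pictured with a minimal Python sketch. The `Strategy` fields, the three layer names, and the `dispatch` helper are assumptions made for illustration; the source only states that each communication layer has a matching hierarchical processing module.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical layer names; the source does not enumerate the layers.
LAYERS = ("network", "transport", "application")


@dataclass(frozen=True)
class Strategy:
    name: str
    layer: str      # communication layer the strategy targets
    priority: int   # assumed convention: larger value = more important


def dispatch(strategies):
    """Group strategies by the processing module of their communication layer."""
    modules = defaultdict(list)
    for strategy in strategies:
        if strategy.layer not in LAYERS:
            raise ValueError(f"unknown communication layer: {strategy.layer}")
        modules[strategy.layer].append(strategy)
    return dict(modules)


strategies = [
    Strategy("dns-by-domain", "network", 2),
    Strategy("tcp-five-tuple", "transport", 1),
    Strategy("http-raw", "application", 3),
]
modules = dispatch(strategies)
```

Here `modules["network"]` holds only the DNS strategy, mirroring the network layer example in the text.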
Step 120, executing the plurality of data acquisition strategies through each hierarchical processing module, and monitoring in real time the resources occupied by the DPI system during operation to obtain a resource occupation result.
In this embodiment, after receiving the matched data acquisition strategies, the hierarchical processing modules may execute the plurality of data acquisition strategies in parallel and, during execution, monitor in real time the resources occupied by the DPI system to obtain a resource occupation result.
In a specific embodiment, each hierarchical processing module may monitor the resources occupied by the DPI system during operation, such as Central Processing Unit (CPU) computing power, memory, and bandwidth, to obtain the resource occupation result.
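As a rough illustration of comparing a resource occupation result against a preset threshold, the following Python sketch uses one limit per resource. The `ResourceUsage` type, the threshold values, and the "any resource over its limit" rule are assumptions; the source does not say how the threshold is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceUsage:
    cpu: float        # fraction of CPU computing power in use, 0.0 to 1.0
    memory: float     # fraction of memory in use
    bandwidth: float  # fraction of bandwidth in use


# Hypothetical preset thresholds, one per monitored resource.
THRESHOLDS = ResourceUsage(cpu=0.85, memory=0.80, bandwidth=0.90)


def exceeds_threshold(usage, limits=THRESHOLDS):
    """Return True if any monitored resource is above its preset limit."""
    return (
        usage.cpu > limits.cpu
        or usage.memory > limits.memory
        or usage.bandwidth > limits.bandwidth
    )


normal = ResourceUsage(cpu=0.40, memory=0.55, bandwidth=0.30)
overloaded = ResourceUsage(cpu=0.95, memory=0.55, bandwidth=0.30)
```

In a real deployment the usage values would come from a monitoring agent; here they are hard-coded sample readings.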
Step 130, if the resource occupation result exceeds a preset threshold, sequentially determining target acquisition strategies to be executed in the plurality of data acquisition strategies through each hierarchical processing module according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
In this step, if the resource occupation result exceeds the preset threshold, each hierarchical processing module may use an optimal resource occupation reduction algorithm to determine, according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed among the plurality of data acquisition strategies, and stop executing the other data acquisition strategies, so as to balance service requirements against resource occupation.
In a specific embodiment, the resource occupation reduction algorithm may include at least one of the following: a First-Come, First-Served (FCFS) scheduling algorithm, a Shortest Job First (SJF) scheduling algorithm, a priority scheduling algorithm, a highest response ratio next scheduling algorithm, a round-robin (time slice) scheduling algorithm, a multilevel feedback queue scheduling algorithm, and the like.
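One way to picture priority-based selection under a resource budget is the greedy Python sketch below. It is only an illustration of the priority scheduling option, not the algorithm of the invention: the per-strategy cost estimate, the integer budget, and the `select_targets` name are all assumptions.

```python
def select_targets(strategies, budget):
    """Pick strategies in descending priority while they fit the budget.

    strategies: list of (name, priority, estimated_cost) tuples, where a
    larger priority is more important and cost is an assumed integer
    percentage of system resources.
    Returns (targets, stopped): names to keep executing and names to stop.
    """
    targets, stopped, used = [], [], 0
    for name, _priority, cost in sorted(strategies, key=lambda s: -s[1]):
        if used + cost <= budget:
            targets.append(name)
            used += cost
        else:
            stopped.append(name)
    return targets, stopped


strategies = [("a", 3, 40), ("b", 1, 30), ("c", 2, 50)]
targets, stopped = select_targets(strategies, budget=80)
```

With this input, strategy `a` is kept first, `c` no longer fits, and the lower-priority `b` still fits in the remaining budget. A stricter variant could stop at the first strategy that does not fit; which behaviour is appropriate is a design choice the source leaves open.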
In this embodiment, traffic data can be acquired on demand by receiving a plurality of data acquisition strategies formulated according to service requirements. Deploying a hierarchical processing module for each communication layer in the DPI system makes the data acquisition strategies hierarchical, so that the interfaces from which traffic data can be acquired are richer and the diverse service requirements of business departments are met. In addition, determining the target acquisition strategies to be executed among the plurality of data acquisition strategies, in turn and according to the priority corresponding to each strategy, through an optimal resource occupation reduction algorithm balances service requirements against resource occupation to the greatest extent.
In the technical scheme of this embodiment of the invention, a plurality of data acquisition strategies are received and each is transmitted to the corresponding hierarchical processing module according to its communication layer; the hierarchical processing modules execute the strategies while the resources occupied by the DPI system during operation are monitored in real time to obtain a resource occupation result; and if the resource occupation result exceeds a preset threshold, each hierarchical processing module determines in turn, according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed and executes them in turn.
Example two
This embodiment is a further refinement of the above embodiment. Terms that are the same as or correspond to those of the above embodiment are not explained again here. Fig. 2 is a flowchart of a traffic data acquisition method provided in the second embodiment. The technical solution of this embodiment may be combined with one or more of the methods in the solutions of the foregoing embodiments. As shown in Fig. 2, the method provided in this embodiment may further include:
Step 210, receiving a plurality of data acquisition strategies.
Step 220, converting each data acquisition strategy into a data acquisition instruction suitable for execution by the DPI system.
In this step, each data acquisition strategy may be converted, in a unified conversion format, into a data acquisition instruction suitable for execution by the DPI system, based on the communication layer corresponding to the strategy and on the service type, the data type of the data to be acquired, and the time period specified in the strategy.
In this embodiment, the data acquisition strategies that need to be converted generally correspond to the following scenarios: network layer raw packet data; transport layer raw packet data (Internet Protocol (IP) five-tuple); transport layer raw packet data (payload at a specific location); application layer raw packet data; and log data meeting specific conditions.
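The conversion in step 220 can be sketched in Python as mapping a loosely structured strategy onto one fixed record. The four fields follow the items named in the text (communication layer, service type, data type, time period); the `Instruction` class, the dictionary keys, and the sample values are assumptions, since the source does not describe the unified format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    layer: str          # communication layer, normalized to lower case
    service_type: str   # service the strategy belongs to
    data_type: str      # kind of data to acquire
    time_period: tuple  # (start, end), assumed "HH:MM" strings


def to_instruction(strategy):
    """Convert a strategy dictionary into the unified instruction record."""
    return Instruction(
        layer=strategy["layer"].strip().lower(),
        service_type=strategy["service_type"],
        data_type=strategy["data_type"],
        time_period=tuple(strategy["time_period"]),
    )


example = to_instruction({
    "layer": "Transport",
    "service_type": "video",
    "data_type": "ip_five_tuple",
    "time_period": ["08:00", "20:00"],
})
```

A strategy that omits one of the four items raises `KeyError` in this sketch, which keeps incomplete strategies from reaching the processing modules.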
Step 230, judging, according to the relevance between the data acquisition instructions, whether a conflict exists between them; if so, executing steps 240 to 260; if not, executing step 270.
In this embodiment, when two data acquisition instructions are determined to be related, the relationship between them and the actions they specify may be compared to judge whether a coverage conflict, a correlation conflict, a redundancy conflict, a generalization conflict, an irrelevance anomaly, or the like exists between them.
In a specific embodiment, it is judged whether the paths and coverage areas of two data acquisition instructions in the policy tree are consistent. If the path and coverage area of the current data acquisition instruction are consistent with those of the previous data acquisition instruction, a conflict is likely to exist between the two instructions; if the paths of the two instructions do not coincide, no conflict exists between them.
In one implementation of this embodiment, the relationships between data acquisition instructions include:
Exact match: the values of all strategy items in the two data acquisition instructions are equal, so the two instructions match exactly.
Inclusive match: data acquisition instruction A and data acquisition instruction B do not match exactly, and the value of each strategy item of A is a subset of, or equal to, the value of the corresponding strategy item of B, so A and B are an inclusive match.
Completely unrelated: no strategy item of data acquisition instruction A is equal to the corresponding strategy item of data acquisition instruction B, and no subset or superset relationship exists between them, so A and B are completely unrelated.
Partial match: the value of at least one strategy item of data acquisition instruction A is equal to the value of the corresponding strategy item of data acquisition instruction B or has a subset or superset relationship with it, and the value of at least one other strategy item of A is neither equal to the corresponding strategy item of B nor in a subset or superset relationship with it, so A and B partially match.
Correlated: the values of some strategy items of data acquisition instruction A are subsets of, or equal to, the values of the corresponding strategy items of data acquisition instruction B, while the values of the other strategy items of A are supersets of the values of the corresponding strategy items of B, so A and B are correlated.
In a specific embodiment, assume that data acquisition instruction A collects DNS raw packet data (the transport layer uses the UDP protocol, port number 53) and data acquisition instruction B collects DNS raw packet data whose requested domain name is xxxx.
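The five relationships can be checked mechanically if each strategy item is modelled as a set of allowed values. The Python sketch below is one possible reading of the definitions, not the patented procedure: the dictionary-of-sets representation, the `relation` function, and the decision to report an inclusive match in either direction are assumptions.

```python
def relation(a, b):
    """Classify two instructions given as dicts mapping item name -> set of values.

    Both instructions are assumed to define the same strategy items.
    """
    keys = list(a)
    equal = [a[k] == b[k] for k in keys]
    subset = [a[k] <= b[k] for k in keys]    # A's item is a subset of or equal to B's
    superset = [a[k] >= b[k] for k in keys]  # A's item is a superset of or equal to B's
    related = [s or p for s, p in zip(subset, superset)]

    if all(equal):
        return "exact"
    if all(subset) or all(superset):
        return "inclusive"      # one instruction lies entirely inside the other
    if not any(related):
        return "unrelated"
    if all(related):
        return "correlated"     # some items narrower, the others broader
    return "partial"            # some items related, at least one unrelated


# A finite set of domain names stands in for "any domain" (an assumption).
a = {"protocol": {"UDP"}, "port": {53}, "domain": {"xxxx", "yyyy"}}
b = {"protocol": {"UDP"}, "port": {53}, "domain": {"xxxx"}}
result = relation(b, a)
```

Under this modelling the narrower DNS instruction is reported as an inclusive match with the broader one, because every item of `b` is a subset of or equal to the corresponding item of `a`.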
Step 240, acquiring the first data acquisition instruction and the second data acquisition instruction that conflict with each other.
Step 250, determining an invalid data acquisition instruction according to the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction, and unloading the invalid data acquisition instruction.
In a specific embodiment, if the priority of the first data acquisition instruction is higher than that of the second data acquisition instruction, the first data acquisition instruction may be executed and the second data acquisition instruction unloaded as an invalid data acquisition instruction.
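Steps 240 and 250 can be sketched in Python as follows, assuming the conflicting pairs and numeric priorities are already known. The function name, the instruction names, and the rule that the first instruction wins a tie are illustrative assumptions.

```python
def resolve_conflicts(priorities, conflicts):
    """Unload the lower-priority instruction of every conflicting pair.

    priorities: dict mapping instruction name -> priority (larger wins).
    conflicts: list of (first, second) name pairs that conflict.
    Returns (remaining, invalid): names kept in order, and the unloaded set.
    """
    invalid = set()
    for first, second in conflicts:
        if first in invalid or second in invalid:
            continue  # one side is already unloaded, so the conflict is gone
        # Assumed tie-break: on equal priority the first instruction is kept.
        loser = second if priorities[first] >= priorities[second] else first
        invalid.add(loser)
    remaining = [name for name in priorities if name not in invalid]
    return remaining, invalid


priorities = {"udp53-all": 2, "dns-by-domain": 1, "city-logs": 3}
remaining, invalid = resolve_conflicts(priorities, [("udp53-all", "dns-by-domain")])
```

The remaining instructions are what would then be passed on to the hierarchical processing modules.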
In one implementation of this embodiment, before the invalid data acquisition instruction is determined according to the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction, the method further includes:
Step 241, if the first data acquisition instruction and the second data acquisition instruction are both raw packet acquisition instructions, determining the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction according to the communication layers corresponding to the two instructions;
In a specific embodiment, assume that the first data acquisition instruction collects DNS raw packet data (the transport layer uses the UDP protocol, port number 53) and the second data acquisition instruction collects DNS raw packet data whose requested domain name is xxxx.com. Because the second data acquisition instruction falls within the range of the first, and the communication layer corresponding to the second data acquisition instruction is above that of the first, the first data acquisition instruction has the higher priority.
In another specific embodiment, assume that the first data acquisition instruction collects all Transmission Control Protocol (TCP) raw packet data and the second data acquisition instruction collects Hypertext Transfer Protocol (HTTP) raw packet data. Because the second data acquisition instruction falls within the range of the first, and the communication layer corresponding to the second data acquisition instruction is above that of the first, the first data acquisition instruction has the higher priority.
Step 242, if the first data acquisition instruction and the second data acquisition instruction belong to a log acquisition instruction, determining priorities corresponding to the first data acquisition instruction and the second data acquisition instruction according to data ranges covered by the first data acquisition instruction and the second data acquisition instruction respectively.
In a specific embodiment, it is assumed that the first data acquisition instruction is to acquire log data of a certain city, and the second data acquisition instruction is to acquire data of N base stations, and if the N base stations are all within the range of the city, it may be determined that the data range covered by the first data acquisition instruction is larger, that is, the priority of the first data acquisition instruction is higher.
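The two priority rules in steps 241 and 242 can be sketched together in Python. The layer ranking, the dictionary fields, and the use of set size to compare log coverage are assumptions inferred from the examples in the text, where the lower-layer or broader instruction is given the higher priority.

```python
# Hypothetical ranking: a smaller number means a lower communication layer.
LAYER_RANK = {"network": 0, "transport": 1, "application": 2}


def pick_higher_priority(first, second):
    """Return whichever of two conflicting instructions has the higher priority."""
    kinds = (first["kind"], second["kind"])
    if kinds == ("raw_packet", "raw_packet"):
        # Step 241: the instruction on the lower layer wins (first wins a tie).
        return min(first, second, key=lambda i: LAYER_RANK[i["layer"]])
    if kinds == ("log", "log"):
        # Step 242: the instruction covering the larger data range wins.
        return max(first, second, key=lambda i: len(i["coverage"]))
    raise ValueError("mixed instruction kinds are not covered by the text")


udp53 = {"name": "udp53-all", "kind": "raw_packet", "layer": "transport"}
dns_domain = {"name": "dns-by-domain", "kind": "raw_packet", "layer": "application"}
city = {"name": "city-logs", "kind": "log", "coverage": {"bs1", "bs2", "bs3"}}
stations = {"name": "station-logs", "kind": "log", "coverage": {"bs1", "bs2"}}

raw_winner = pick_higher_priority(udp53, dns_domain)
log_winner = pick_higher_priority(city, stations)
```

Comparing coverage by set size is a simplification; a stricter check would require one coverage set to contain the other, as in the city and base station example.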
Step 260, transmitting the corresponding data acquisition strategies to the matched hierarchical processing modules according to the communication layers corresponding to the remaining data acquisition instructions.
Step 270, transmitting the corresponding data acquisition strategies to the matched hierarchical processing modules according to the communication layer corresponding to each data acquisition instruction.
Step 280, executing the plurality of data acquisition strategies through each hierarchical processing module, and monitoring in real time the resources occupied by the DPI system during operation to obtain a resource occupation result.
In one implementation of this embodiment, executing the plurality of data acquisition strategies through each hierarchical processing module and monitoring in real time the resources occupied by the DPI system during operation to obtain a resource occupation result includes:
Step 281, executing the plurality of data acquisition strategies through each hierarchical processing module, and monitoring in real time the hardware resources occupied by the DPI system during operation to obtain a hardware resource occupation result;
In this step, the CPU resources, CPU load, storage occupation, data reporting bandwidth, and key module load of the DPI system during operation may be monitored in real time to obtain the hardware resource occupation result.
Step 282, acquiring the threads invoked by the DPI system while executing the plurality of data acquisition strategies, and monitoring the software resources occupied by all the threads to obtain a software resource occupation result.
In this step, the software resources occupied by all the threads, for example the invocation frequency, data volume, and response time of each thread, may be monitored to obtain the software resource occupation result.
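A software resource occupation result could be assembled from per-thread measurements roughly as in the Python sketch below. The `ThreadStats` fields mirror the three metrics named in the text; the aggregation into totals and a worst-case response time is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadStats:
    name: str
    calls_per_sec: float   # invocation frequency of the thread
    bytes_handled: int     # data volume processed by the thread
    response_ms: float     # observed response time in milliseconds


def summarize(threads):
    """Aggregate per-thread metrics into one software resource occupation result."""
    return {
        "total_calls_per_sec": sum(t.calls_per_sec for t in threads),
        "total_bytes": sum(t.bytes_handled for t in threads),
        "max_response_ms": max((t.response_ms for t in threads), default=0.0),
    }


threads = [
    ThreadStats("parser", calls_per_sec=120.0, bytes_handled=4096, response_ms=3.5),
    ThreadStats("reporter", calls_per_sec=30.0, bytes_handled=1024, response_ms=12.5),
]
software_result = summarize(threads)
```

The sample readings are hard-coded; a real system would collect them from its own instrumentation.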
In this embodiment, the resource occupation result may be displayed on a dashboard to give an overall view of the performance and operating condition of the DPI system, and the data volume of each data acquisition strategy and its impact on system performance may be displayed visually in multi-dimensional reports.
In a specific embodiment, the traffic data may be mined and analyzed to predict the resources that acquiring the traffic data will occupy, and the prediction may be combined with the resource occupation result to update the target acquisition strategies to be executed.
Step 290, if the resource occupation result exceeds a preset threshold, determining in turn, through each hierarchical processing module and according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed among the plurality of data acquisition strategies, and executing the target acquisition strategies in turn.
In the technical scheme of this embodiment of the invention, a plurality of data acquisition strategies are received and converted into data acquisition instructions, and whether conflicts exist between the data acquisition instructions is judged according to the relevance between them. If a conflict exists, the conflicting first and second data acquisition instructions are acquired, an invalid data acquisition instruction is determined according to their respective priorities and unloaded, and the corresponding data acquisition strategies are transmitted to the matched hierarchical processing modules according to the communication layers corresponding to the remaining data acquisition instructions. If no conflict exists, the corresponding data acquisition strategies are transmitted to the matched hierarchical processing modules according to the communication layer corresponding to each data acquisition instruction. The hierarchical processing modules then execute the plurality of data acquisition strategies while the resources occupied by the DPI system during operation are monitored in real time to obtain a resource occupation result; if the resource occupation result exceeds a preset threshold, each hierarchical processing module determines in turn, according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed and executes them in turn.
Example three
This embodiment further refines the above embodiments; terms that are the same as or correspond to those of the above embodiments are not explained again here. Fig. 3a is a flowchart of a flow data acquisition method provided in the third embodiment. The technical solution of this embodiment may be combined with one or more of the solutions of the foregoing embodiments. As shown in fig. 3a, the method provided in this embodiment may further include:
Step 310, receiving a plurality of data acquisition strategies, and respectively transmitting each data acquisition strategy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition strategy.
Step 320, executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring resources occupied by the DPI system in the running process in real time to obtain a resource occupation result.
Step 330, if the resource occupation result exceeds a preset threshold, sequentially determining, through each hierarchical processing module, the target acquisition strategies to be executed among the plurality of data acquisition strategies according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
Step 340, obtaining target traffic data corresponding to a plurality of target acquisition strategies, and determining a data sending mode matched with each target traffic data according to a data type corresponding to each target traffic data.
In this step, each target traffic data may be divided into the following two data types according to the data transmission frequency and the file capacity: non-real-time bulk data and low-volume real-time data. The non-real-time bulk data may be data with larger capacity and lower frequency, such as service log data or original code stream data; the low-volume real-time data may be data with small capacity, high frequency and strong real-time requirements, such as real-time alarm data, location change data, real-time tracking data, and the like.
In this embodiment, if the data type corresponding to the target traffic data is non-real-time bulk data, the data sending mode matched with the target traffic data may be a sending mode based on a Web Service protocol and the File Transfer Protocol (FTP);
if the data type corresponding to the target traffic data is low-volume real-time data, the data sending mode matched with the target traffic data may be a sending mode based on a fast interaction protocol. The fast interaction protocol may be a Message Queue (MQ) protocol, a Socket protocol, or the like.
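The matching between data type and data sending mode described in this step can be illustrated with a minimal sketch. The enum names, the `TrafficData` class and the `choose_send_mode` function are hypothetical names introduced only for illustration; the embodiment does not prescribe any particular data structure, and the actual transmission over Web Service/FTP or MQ/Socket is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    # Larger capacity, lower frequency (e.g. service logs, original code streams).
    NON_REAL_TIME_BULK = "non_real_time_bulk"
    # Small capacity, high frequency (e.g. real-time alarms, location changes).
    REAL_TIME_LOW_VOLUME = "real_time_low_volume"


class SendMode(Enum):
    WEB_SERVICE_FTP = "web_service_ftp"    # Web Service protocol plus FTP
    FAST_INTERACTION = "fast_interaction"  # e.g. MQ or Socket


@dataclass(frozen=True)
class TrafficData:
    name: str            # hypothetical label for the collected data
    data_type: DataType


# Mapping taken from the description above; the identifiers are assumptions.
SEND_MODE_BY_TYPE = {
    DataType.NON_REAL_TIME_BULK: SendMode.WEB_SERVICE_FTP,
    DataType.REAL_TIME_LOW_VOLUME: SendMode.FAST_INTERACTION,
}


def choose_send_mode(data: TrafficData) -> SendMode:
    """Return the data sending mode matched with the given traffic data."""
    return SEND_MODE_BY_TYPE[data.data_type]
```

For example, `choose_send_mode(TrafficData("service_log", DataType.NON_REAL_TIME_BULK))` selects the Web Service and FTP based mode, while real-time alarm data would be routed to the fast interaction mode.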
In an implementation manner of this embodiment, after obtaining target traffic data corresponding to each of the plurality of target collection policies, the method further includes: verifying each target flow data according to at least one data verification standard to obtain a verification result corresponding to each target flow data; and determining invalid flow data in the plurality of target flow data according to the verification result corresponding to each target flow data, and removing the invalid flow data.
In this step, the data verification criteria may include field accuracy, field integrity, and field reliability criteria. If the target traffic data fails to be verified under the data verification criteria, the target traffic data may be regarded as invalid traffic data.
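The verification and removal of invalid traffic data can be sketched as follows. The field names (`src_ip`, `dst_ip`, `timestamp`) and the two checks are assumptions chosen for illustration; the embodiment only names field accuracy, field integrity and field reliability as possible criteria and does not define how each one is evaluated.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, object]
Check = Callable[[Record], bool]

# Hypothetical set of fields a traffic record is expected to carry.
REQUIRED_FIELDS = ("src_ip", "dst_ip", "timestamp")


def fields_complete(record: Record) -> bool:
    """Field integrity: every required field is present and non-empty."""
    return all(record.get(name) not in (None, "") for name in REQUIRED_FIELDS)


def timestamp_plausible(record: Record) -> bool:
    """Field accuracy (simplified): the timestamp is a positive number."""
    ts = record.get("timestamp")
    return isinstance(ts, (int, float)) and not isinstance(ts, bool) and ts > 0


def remove_invalid(
    records: Sequence[Record], checks: Sequence[Check]
) -> Tuple[List[Record], List[Record]]:
    """Split records into (valid, invalid); a record is invalid if any check fails."""
    valid: List[Record] = []
    invalid: List[Record] = []
    for record in records:
        if all(check(record) for check in checks):
            valid.append(record)
        else:
            invalid.append(record)
    return valid, invalid
```

Only the records in the first returned list would be passed on to the sending step; the second list is kept here so that removed data can be inspected or counted.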
Step 350, sending each target traffic data to an application end according to a data sending mode matched with each target traffic data.
In the technical scheme of this embodiment, a plurality of data acquisition strategies are received and transmitted to the corresponding hierarchical processing modules according to the communication layer corresponding to each data acquisition strategy; the data acquisition strategies are executed through the hierarchical processing modules while the resources occupied during operation of the DPI system are monitored in real time to obtain a resource occupation result; if the resource occupation result exceeds a preset threshold, each hierarchical processing module sequentially determines, according to the priority corresponding to each data acquisition strategy, the target acquisition strategies to be executed among the plurality of data acquisition strategies and sequentially executes them; target traffic data corresponding to the target acquisition strategies are then obtained, a data sending mode matched with each target traffic data is determined according to the data type corresponding to that target traffic data, and each target traffic data is sent to the application end in the matched data sending mode. By these technical means, traffic data in the mobile internet can be acquired on demand, and business requirements can be balanced against resource occupation.
On the basis of the foregoing embodiment, fig. 3b is a schematic structural diagram of the DPI system in this embodiment. As shown in fig. 3b, the DPI system includes a strategy issuing module, an elastic strategy preprocessing module, and a data reporting module. The strategy issuing module is used for receiving a plurality of data acquisition strategies preset according to business requirements and transmitting the data acquisition strategies to the corresponding hierarchical processing modules respectively; the elastic strategy preprocessing module is deployed in each hierarchical processing module and is used for executing each distributed data acquisition strategy; the data reporting module is used for acquiring the traffic data obtained after the execution of each data acquisition strategy and sending the traffic data to the application end.
In this embodiment, the strategy issuing module includes:
the strategy receiving unit is used for receiving a plurality of data acquisition strategies;
the strategy translation unit is used for converting each data acquisition strategy into a data acquisition instruction suitable for being executed by a DPI system;
the strategy conflict detection unit is used for judging whether conflicts exist among the data acquisition instructions according to the relevance among the data acquisition instructions;
the strategy execution path selection unit is used for determining a communication layer corresponding to a data acquisition strategy and a hierarchy processing module matched with the data acquisition strategy, and transmitting each data acquisition strategy to the corresponding hierarchy processing module;
the strategy life cycle management unit is used for managing the whole life cycle of a data acquisition strategy, including loading, unloading, execution duration and the like; the strategy life cycle management unit is also used for verifying and sorting out each data acquisition strategy, optimizing hidden strategies, redundant strategies, empty strategies and the like so as to avoid the service risks and security risks brought by non-compliant strategies, and promptly cleaning up temporarily opened or expired data acquisition strategies so as to avoid strategy redundancy.
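The work of the strategy execution path selection unit, namely transmitting each data acquisition strategy to the hierarchical processing module matched with its communication layer, might be sketched as below. The layer labels (`"L3"`, `"L4"`, `"L7"`), the `LayeredStrategy` class and the `dispatch_by_layer` function are hypothetical; the embodiment does not specify how layers are identified or how the modules are addressed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence


@dataclass(frozen=True)
class LayeredStrategy:
    name: str   # hypothetical strategy identifier
    layer: str  # communication layer label, e.g. "L3", "L4", "L7" (assumed)


def dispatch_by_layer(
    strategies: Iterable[LayeredStrategy], module_layers: Sequence[str]
) -> Dict[str, List[LayeredStrategy]]:
    """Group strategies into one queue per hierarchical processing module.

    Each module is represented only by its layer label; a strategy whose
    layer has no matching module is rejected instead of silently dropped.
    """
    queues: Dict[str, List[LayeredStrategy]] = {layer: [] for layer in module_layers}
    for strategy in strategies:
        if strategy.layer not in queues:
            raise ValueError(f"no processing module for layer {strategy.layer!r}")
        queues[strategy.layer].append(strategy)
    return queues
```

Keeping one queue per layer preserves the order in which strategies were received, so later priority handling inside each module starts from a predictable state.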
The elastic strategy preprocessing module includes:
the strategy receiving unit is used for receiving a plurality of data acquisition strategies;
the strategy analysis unit is used for analyzing the data acquisition strategy to obtain a data type, a service type and a specific time period corresponding to the data to be acquired;
the performance monitoring unit is used for monitoring resources (such as CPU load increment) occupied by the DPI system in the operation process to obtain a resource monitoring result;
the resource evaluation unit is used for predicting the resources that will be occupied by the traffic data during acquisition, and judging, by combining the prediction result with the resource monitoring result, whether the predicted resource occupation result exceeds a preset threshold;
the strategy priority determining unit is used for determining the priority matched with each data acquisition strategy;
the order-reducing algorithm unit is used for determining a target acquisition strategy to be executed in a plurality of data acquisition strategies through an optimal resource occupation order-reducing algorithm according to the priority corresponding to each data acquisition strategy;
the strategy updating unit is used for updating the data acquisition strategy to be executed currently into the target acquisition strategy;
and the strategy issuing unit is used for issuing the target acquisition strategy to a corresponding processor so as to complete the execution of the target acquisition strategy.
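The interplay of the resource evaluation unit, the strategy priority determining unit and the order-reducing algorithm unit can be illustrated with a simple greedy sketch. This is not the optimal resource occupation order-reducing algorithm itself, whose details are not given here; it is a stand-in under stated assumptions: resource usage is expressed as an integer percentage, a larger `priority` value means higher priority, and `estimated_cost` is the predicted additional occupation of a strategy. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CandidateStrategy:
    name: str
    priority: int        # assumption: larger value = higher priority
    estimated_cost: int  # assumption: predicted extra resource use, in percent


def select_target_strategies(
    strategies: Sequence[CandidateStrategy],
    monitored_usage: int,
    threshold: int,
) -> List[CandidateStrategy]:
    """Pick the strategies to execute without exceeding the threshold.

    If the predicted total stays within the threshold, everything runs.
    Otherwise strategies are admitted in descending priority order, and a
    strategy that does not fit is skipped (greedy choice, an assumption).
    """
    predicted = monitored_usage + sum(s.estimated_cost for s in strategies)
    if predicted <= threshold:
        return list(strategies)

    targets: List[CandidateStrategy] = []
    usage = monitored_usage
    for strategy in sorted(strategies, key=lambda s: s.priority, reverse=True):
        if usage + strategy.estimated_cost <= threshold:
            targets.append(strategy)
            usage += strategy.estimated_cost
    return targets
```

With a monitored usage of 50, a threshold of 80 and three strategies costing 20, 25 and 10 in descending priority, the sketch keeps the first and third strategies and skips the second, since admitting it would push the predicted usage to 95.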
The data reporting module includes:
the data subscription unit is used for intercepting corresponding data fields in the flow data according to actual requirements;
a data transfer unit for transferring the intercepted data field to a filter;
the data filtering unit is used for filtering the data fields through a filter according to a preset rule;
the data management unit is used for verifying each flow data according to at least one data verification standard to obtain a verification result corresponding to each flow data; determining invalid flow data in the plurality of flow data according to a verification result corresponding to each flow data, and removing the invalid flow data;
and the data quality unit is used for optimizing the flow data.
Example four
Fig. 4 is a structural diagram of a flow data acquisition device according to a fourth embodiment of the present invention. The flow data acquisition device is applied to a deep packet inspection (DPI) system and includes: a policy receiving module 410, a resource monitoring module 420, and a policy enforcement module 430.
The policy receiving module 410 is configured to receive a plurality of data acquisition policies, and transmit each data acquisition policy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition policy; the data acquisition strategy is preset according to the service requirement;
the resource monitoring module 420 is configured to execute a plurality of data acquisition strategies through each of the hierarchical processing modules, and monitor resources occupied by the DPI system in the running process in real time to obtain a resource occupation result;
a policy executing module 430, configured to, if the resource occupation result exceeds a preset threshold, sequentially determine, by each of the hierarchical processing modules, a target acquisition policy to be executed among the multiple data acquisition policies according to the priority corresponding to each of the data acquisition policies, and sequentially execute the target acquisition policy.
The technical scheme of the embodiment of the invention comprises the steps of receiving a plurality of data acquisition strategies, respectively transmitting the data acquisition strategies to corresponding hierarchical processing modules according to communication layers corresponding to the data acquisition strategies, executing the data acquisition strategies through the hierarchical processing modules, monitoring resources occupied by a DPI system in the operation process in real time to obtain a resource occupation result, and if the resource occupation result exceeds a preset threshold value, sequentially determining target acquisition strategies to be executed in the data acquisition strategies through the hierarchical processing modules according to priorities corresponding to the data acquisition strategies and sequentially executing the target acquisition strategies.
On the basis of the foregoing embodiments, the policy receiving module 410 may include:
the strategy conversion unit is used for converting each data acquisition strategy into a data acquisition instruction suitable for being executed by a DPI system;
the conflict judging unit is used for judging whether conflicts exist among the data acquisition instructions according to the relevance among the data acquisition instructions;
the strategy transmission unit is used for transmitting each corresponding data acquisition strategy to the matched hierarchical processing module according to the communication layer corresponding to each data acquisition instruction when no conflict exists among the data acquisition instructions;
the acquisition instruction acquisition unit is used for acquiring a first data acquisition instruction and a second data acquisition instruction which are in conflict when conflicts exist among the data acquisition instructions;
the invalid instruction determining unit is used for determining an invalid data acquisition instruction according to the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction respectively and unloading the invalid data acquisition instruction;
the residual instruction processing unit is used for respectively transmitting each corresponding data acquisition strategy to the matched hierarchical processing module according to the communication layer corresponding to each residual data acquisition instruction;
a first priority determining unit, configured to determine, according to communication layers corresponding to the first data acquisition instruction and the second data acquisition instruction, priorities corresponding to the first data acquisition instruction and the second data acquisition instruction, respectively, if the first data acquisition instruction and the second data acquisition instruction belong to an original message acquisition instruction;
and the second priority determining unit is used for determining the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction respectively according to the data ranges covered by the first data acquisition instruction and the second data acquisition instruction respectively if the first data acquisition instruction and the second data acquisition instruction belong to the log acquisition instruction.
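The conflict handling performed by the units above can be sketched as follows. The text states only that priority is derived from the communication layer for original message acquisition instructions and from the covered data range for log acquisition instructions; the direction of both comparisons (higher layer number wins, wider data range wins) is an assumption made for this sketch, as are all class, field and function names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Instruction:
    name: str
    kind: str        # "raw_message" or "log" (hypothetical labels)
    layer: int       # communication layer number, e.g. 3, 4 or 7
    data_range: int  # size of the covered data range, in an arbitrary unit


def priority_of(instruction: Instruction) -> int:
    """Derive a comparable priority value for one instruction."""
    if instruction.kind == "raw_message":
        return instruction.layer       # assumption: higher layer = higher priority
    if instruction.kind == "log":
        return instruction.data_range  # assumption: wider range = higher priority
    raise ValueError(f"unknown instruction kind {instruction.kind!r}")


def resolve_conflict(
    first: Instruction, second: Instruction
) -> Tuple[Instruction, Instruction]:
    """Return (kept, invalid) for two conflicting instructions of the same kind.

    The invalid instruction is the one that would be unloaded. On equal
    priority the first instruction is kept (an arbitrary tie-break).
    """
    if first.kind != second.kind:
        raise ValueError("this sketch only compares instructions of the same kind")
    if priority_of(second) > priority_of(first):
        return second, first
    return first, second
```

After `resolve_conflict` returns, only the kept instruction would be forwarded to the hierarchical processing module matched with its communication layer.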
The resource monitoring module 420 may include:
the hardware resource monitoring unit is used for executing a plurality of data acquisition strategies through each hierarchical processing module, monitoring hardware equipment resources occupied in the operation process of the DPI system in real time and obtaining a hardware resource occupation result;
and the software resource monitoring unit is used for acquiring threads called by the DPI system in the process of executing the multiple data acquisition strategies, monitoring software resources occupied by all the threads and obtaining a software resource occupation result.
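How a hardware resource occupation result and a software resource occupation result might be combined and compared with the preset threshold is sketched below. The sampling function is injected as a callable so that the sketch does not depend on any particular operating system interface; the percentage unit, the per-thread usage mapping and the rule that either result exceeding the threshold counts as an overrun are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class OccupationResult:
    hardware_percent: int  # e.g. CPU load of the device (assumed unit: percent)
    software_percent: int  # summed usage of the threads called by the DPI system


def collect_occupation(
    read_hardware: Callable[[], int],
    thread_usage: Mapping[str, int],
) -> OccupationResult:
    """Take one monitoring sample.

    read_hardware is a hypothetical probe returning device-level usage;
    thread_usage maps a thread name to the usage attributed to that thread.
    """
    return OccupationResult(
        hardware_percent=read_hardware(),
        software_percent=sum(thread_usage.values()),
    )


def exceeds_threshold(result: OccupationResult, threshold_percent: int) -> bool:
    """Assumed rule: an overrun in either dimension triggers degradation."""
    return (
        result.hardware_percent > threshold_percent
        or result.software_percent > threshold_percent
    )
```

In a real deployment the probe and the per-thread figures would come from the platform's monitoring facilities; here they are plain values so that the threshold logic can be exercised in isolation.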
The policy enforcement module 430 may include:
the sending mode determining unit is used for acquiring target flow data corresponding to a plurality of target acquisition strategies respectively and determining a data sending mode matched with each target flow data according to the data type corresponding to each target flow data;
the data sending unit is used for sending each target flow data to an application end according to a data sending mode matched with each target flow data;
the data verification unit is used for verifying each target flow data according to at least one data verification standard to obtain a verification result corresponding to each target flow data;
and the data removing unit is used for determining invalid flow data in the target flow data according to the verification result corresponding to each target flow data and removing the invalid flow data.
The flow data acquisition device provided by the embodiment of the invention can execute the flow data acquisition method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 5 is a schematic structural diagram of a computer apparatus according to a fifth embodiment of the present invention. As shown in fig. 5, the computer apparatus includes a processor 510, a memory 520, an input device 530, and an output device 540. The number of processors 510 in the computer apparatus may be one or more, and one processor 510 is taken as an example in fig. 5. The processor 510, the memory 520, the input device 530 and the output device 540 in the computer apparatus may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 5. The memory 520 is a computer-readable storage medium that may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the traffic data collection method in any embodiment of the present invention (e.g., the policy receiving module 410, the resource monitoring module 420, and the policy executing module 430 in the traffic data collection device). The processor 510 performs the various functional applications and data processing of the computer apparatus by running the software programs, instructions and modules stored in the memory 520, thereby implementing the traffic data collection method described above. That is, the program, when executed by the processor, implements:
receiving a plurality of data acquisition strategies, and respectively transmitting each data acquisition strategy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition strategy; the data acquisition strategy is preset according to the service requirement;
executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring resources occupied by a DPI system in the running process in real time to obtain a resource occupation result;
and if the resource occupation result exceeds a preset threshold, sequentially determining target acquisition strategies to be executed in the plurality of data acquisition strategies through each hierarchical processing module according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 520 may further include memory located remotely from the processor 510, which may be connected to the computer apparatus through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the computer apparatus, and may include a keyboard, a mouse, and the like. The output device 540 may include a display device such as a display screen.
Example six
The sixth embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any embodiment of the present invention. The computer program stored on the computer-readable storage medium may also perform related operations in the flow data collection method provided in any embodiment of the present invention. That is, the program, when executed by the processor, implements:
receiving a plurality of data acquisition strategies, and respectively transmitting each data acquisition strategy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition strategy; the data acquisition strategy is preset according to the service requirement;
executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring resources occupied by a DPI system in the running process in real time to obtain a resource occupation result;
and if the resource occupation result exceeds a preset threshold, sequentially determining target acquisition strategies to be executed in the plurality of data acquisition strategies through each hierarchical processing module according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the flow data acquisition device, the units and modules included in the embodiment are only divided according to functional logic, but are not limited to the above division, as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A flow data acquisition method is applied to a data packet deep inspection (DPI) system and comprises the following steps:
receiving a plurality of data acquisition strategies, and respectively transmitting each data acquisition strategy to a corresponding hierarchical processing module according to a communication layer corresponding to each data acquisition strategy; the data acquisition strategy is preset according to the service requirement;
executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring resources occupied by a DPI system in the running process in real time to obtain a resource occupation result;
and if the resource occupation result exceeds a preset threshold, sequentially determining target acquisition strategies to be executed in the plurality of data acquisition strategies through each hierarchical processing module according to the priority corresponding to each data acquisition strategy, and sequentially executing the target acquisition strategies.
2. The method of claim 1, wherein transmitting each of the data acquisition strategies to a corresponding hierarchical processing module according to the communication layer corresponding to each of the data acquisition strategies comprises:
converting each data acquisition strategy into a data acquisition instruction suitable for a DPI system to execute;
judging whether conflicts exist among the data acquisition instructions or not according to the relevance among the data acquisition instructions;
and if not, respectively transmitting the corresponding data acquisition strategies to the matched hierarchical processing module according to the communication layer corresponding to each data acquisition instruction.
3. The method of claim 2, wherein after determining whether a conflict exists between the data acquisition instructions based on the relevance between the data acquisition instructions, further comprising:
if so, acquiring a first data acquisition instruction and a second data acquisition instruction which are in conflict;
determining an invalid data acquisition instruction according to the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction respectively, and unloading the invalid data acquisition instruction;
and respectively transmitting the corresponding data acquisition strategies to the matched hierarchical processing module according to the communication layer corresponding to the remaining data acquisition instructions.
4. The method according to claim 3, before determining an invalid data acquisition instruction according to the priorities corresponding to the first data acquisition instruction and the second data acquisition instruction, respectively, further comprising:
if the first data acquisition instruction and the second data acquisition instruction belong to an original message acquisition instruction, determining priorities corresponding to the first data acquisition instruction and the second data acquisition instruction respectively according to communication layers corresponding to the first data acquisition instruction and the second data acquisition instruction respectively;
and if the first data acquisition instruction and the second data acquisition instruction belong to log acquisition instructions, determining priorities corresponding to the first data acquisition instruction and the second data acquisition instruction respectively according to data ranges covered by the first data acquisition instruction and the second data acquisition instruction respectively.
5. The method of claim 1, wherein the step of executing a plurality of data acquisition strategies through each hierarchical processing module and monitoring resources occupied by a DPI system in real time during operation to obtain a resource occupation result comprises:
executing a plurality of data acquisition strategies through each hierarchical processing module, and monitoring hardware equipment resources occupied in the operation process of the DPI system in real time to obtain a hardware resource occupation result;
and acquiring threads called by the DPI system in the process of executing a plurality of data acquisition strategies, and monitoring software resources occupied by all the threads to obtain software resource occupation results.
6. The method of claim 1, wherein after sequentially determining a target acquisition strategy to be executed among a plurality of data acquisition strategies and sequentially executing the target acquisition strategy, further comprising:
acquiring target flow data corresponding to a plurality of target acquisition strategies respectively, and determining a data sending mode matched with each target flow data according to a data type corresponding to each target flow data;
and sending each target flow data to an application end according to a data sending mode matched with each target flow data.
7. The method according to claim 6, further comprising, after obtaining target traffic data corresponding to each of the plurality of target acquisition strategies:
verifying each target flow data according to at least one data verification standard to obtain a verification result corresponding to each target flow data;
and determining invalid flow data in the plurality of target flow data according to the verification result corresponding to each target flow data, and removing the invalid flow data.
8. A flow data acquisition device is applied to a data packet deep inspection (DPI) system and comprises:
the strategy receiving module is used for receiving a plurality of data acquisition strategies and respectively transmitting each data acquisition strategy to the corresponding level processing module according to the communication layer corresponding to each data acquisition strategy; the data acquisition strategy is preset according to the service requirement;
the resource monitoring module is used for executing a plurality of data acquisition strategies through each hierarchical processing module and monitoring resources occupied by the DPI system in the running process in real time to obtain a resource occupation result;
and the strategy execution module is used for sequentially determining a target acquisition strategy to be executed in a plurality of data acquisition strategies according to the priority corresponding to each data acquisition strategy through each hierarchical processing module and sequentially executing the target acquisition strategy if the resource occupation result exceeds a preset threshold.
9. A computer device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs when executed by the one or more processors cause the one or more processors to implement the flow data collection method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out a method for flow data acquisition according to any one of claims 1 to 7.
CN202111454126.2A 2021-12-01 2021-12-01 Flow data acquisition method, device, equipment and storage medium Pending CN114116172A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111454126.2A CN114116172A (en) 2021-12-01 2021-12-01 Flow data acquisition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111454126.2A CN114116172A (en) 2021-12-01 2021-12-01 Flow data acquisition method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114116172A true CN114116172A (en) 2022-03-01

Family

ID=80369587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111454126.2A Pending CN114116172A (en) 2021-12-01 2021-12-01 Flow data acquisition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114116172A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114844763A (en) * 2022-04-19 2022-08-02 北京快乐茄信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN115499722A (en) * 2022-08-23 2022-12-20 重庆长安汽车股份有限公司 Fusing control method and device for vehicle data acquisition
CN116360301A (en) * 2022-12-02 2023-06-30 国家工业信息安全发展研究中心 Industrial control network flow acquisition and analysis system and method
CN116360301B (en) * 2022-12-02 2023-12-12 国家工业信息安全发展研究中心 Industrial control network flow acquisition and analysis system and method

Similar Documents

Publication Publication Date Title
CN114116172A (en) Flow data acquisition method, device, equipment and storage medium
US10404556B2 (en) Methods and computer program products for correlation analysis of network traffic in a network device
US10318366B2 (en) System and method for relationship based root cause recommendation
US7926099B1 (en) Computer-implemented method and system for security event transport using a message bus
US8352790B2 (en) Abnormality detection method, device and program
US6643614B2 (en) Enterprise management system and method which indicates chaotic behavior in system resource usage for more accurate modeling and prediction
US6665716B1 (en) Method of analyzing delay factor in job system
US20060074946A1 (en) Point of view distributed agent methodology for network management
CN109144813B (en) System and method for monitoring server node fault of cloud computing system
CN109597837B (en) Time sequence data storage method, time sequence data query method and related equipment
CN108234189B (en) Alarm data processing method and device
GB2553784A (en) Management of log data in electronic devices
CN112350854A (en) Flow fault positioning method, device, equipment and storage medium
CN114090366A (en) Method, device and system for monitoring data
CN112559285A (en) Distributed service architecture-based micro-service monitoring method and related device
CN112084180A (en) Method, device, equipment and medium for monitoring vehicle-mounted application quality
Grant et al. Overtime: A tool for analyzing performance variation due to network interference
Appleby et al. Yemanja-a layered event correlation engine for multi-domain server farms
Sandur et al. Jarvis: Large-scale server monitoring with adaptive near-data processing
TWI448975B (en) Dispersing-type algorithm system applicable to image monitoring platform
EP3011456B1 (en) Sorted event monitoring by context partition
US7783752B2 (en) Automated role based usage determination for software system
US20060053021A1 (en) Method for monitoring and managing an information system
CN115080363B (en) System capacity evaluation method and device based on service log
CN114356625A (en) Distributed system redundancy diagnosis method, device, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination