CN108683569A - Business monitoring method and system for cloud service infrastructure - Google Patents
- Publication number: CN108683569A
- Application number: CN201810585690.XA
- Authority: CN (China)
- Prior art keywords: testing, server, data, log, task
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0811—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking connectivity
-
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/02—Standardisation; Integration
- H04L41/0246—Exchanging or transporting network management information using the Internet; Embedding network management web servers in network elements; Web-services-based protocols
- H04L41/0273—Exchanging or transporting network management information using the Internet; Embedding network management web servers in network elements; Web-services-based protocols using web services for network management, e.g. simple object access protocol [SOAP]
-
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/069—Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
-
- H04L41/14—Network analysis or design
-
- H04L43/16—Threshold monitoring
-
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4505—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
- H04L61/4511—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
-
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
Abstract
The present invention proposes a business monitoring method and system for cloud service infrastructure, belonging to the field of cloud service infrastructure. The system comprises a control center server and dial-testing servers deployed in different regions. A cloud service dial-testing module is deployed on each dial-testing server; a dial-testing task dispatch module, a data acquisition module, a dial-testing data analysis module, a dial-testing alarm module, and a database are deployed on the control center server. Dial-testing servers are set up in different regions, and monitoring tasks are configured in the web interface of the control center server; the dial-testing task dispatch module verifies the correctness of each task's destination IP and issues the configuration file of the monitoring tasks to each dial-testing server. Using wide-range asynchronous dial testing aimed at routers and at the DNS recursive servers that provide service, the invention monitors the cloud service infrastructure, reduces packet loss, achieves load balancing, and offers high robustness and stability.
Description
Technical field
The invention belongs to the field of cloud service infrastructure and relates to a method and system for testing and verifying the network connectivity of cloud service infrastructure.
Background art
With the rapid development of cloud computing and the mobile Internet, more and more enterprise business is carried out in the cloud. Cloud services have become indispensable to our work, and the infrastructure underlying them is the top priority: only a sound foundation can guarantee the robustness of the system and the accuracy of the data.
Cloud service is a commercial delivery model that arose as computing developed into the Internet era. It mainly takes the form of services at many levels, such as computing and networking, and the storage, reading, downloading, searching, and analysis of information resources. Because cloud services are secure, stable, and offer mass storage, they are increasingly favored by enterprises and individuals. Consequently, how to ensure the stability of cloud services has become a focus, and an important part of that is how to ensure connectivity and to verify the security, stability, and data integrity of cloud services in the overall cloud environment.
Cloud service vendors operate large-scale networks across multiple sites, with many servers in different regions. Poor network conditions or poor lines can make the cloud service unstable and cause data loss, which is a serious loss for both users and enterprises.
Most current monitoring methods are black-box tests: they can only tell from the final result that a problem exists, but cannot determine what the problem is. Different application scenarios call for different methods; when data goes wrong, locating the problem efficiently and accurately is the key to resolving a cloud service failure. The network connectivity of a cloud service vendor's infrastructure can therefore be tested by dial testing its routers, DNS (Domain Name System) servers, and business modules.
Existing techniques that monitor the running state of different business systems by active probing with specially constructed TCP/IP (Transmission Control Protocol/Internet Protocol) network packets mainly include:
1) The National Laboratory for Applied Network Research (NLANR), a distributed research effort focused on high-performance connections [Reference 1: McGregor T, Braun H W, Brown J. The NLANR Network Analysis Infrastructure [J]. IEEE Communications Magazine, 2000, 38(5): 122-128]. NLANR supports three monitoring modes: passive traffic monitoring, active measurement, and control information monitoring. AMP (Active Measurement Program) is its active measurement project; it mainly measures round-trip time (RTT), packet loss, topology, and throughput between sites. In AMP, the monitors send ICMP (Internet Control Message Protocol) messages to each other every minute and run traceroute to the other monitors every ten minutes. Throughput can be measured with bulk TCP transfers, bulk UDP (User Datagram Protocol) transfers, ping -F, and treno.
2) Ethernet OAM detection [Reference 2: Zhang Hao. Implementing fault detection and fault isolation in Ethernet OAM [C]. China Association for Science and Technology Annual Meeting, sub-venue on informatization and social development, 2008]. It uses Continuity Check messages as heartbeat signals to detect the connection state between terminals; Link Trace messages to record the hop-by-hop path between two ends, similar to the IP-layer Traceroute tool; and Loopback messages, similar to the Ping function of ICMP, to detect connectivity between terminals.
3) Cisco Service Assurance Agent / Cisco IOS IP Service Level Agreements, which is built into Cisco IOS devices and allows active probing and active monitoring. A large number of options can be configured, such as UDP/TCP port numbers, the ToS field, the VRF instance, source IP, destination IP, and web URL. The tool can measure the following performance parameters: one-way delay, round-trip delay, delay variation, packet loss, packet ordering, voice quality score, network resource availability, application performance, and server response time.
None of the above techniques meets the requirements of dial testing cloud service infrastructure. Because cloud service networks are large, wide-ranging, and spread over many sites, a cloud service vendor needs wide-range dial testing across multiple sites, with accurate asynchronous tests for different protocols (UDP, TCP, HTTP), different sites, and different operators. The dial-testing system should be separated from the data analysis system, so that the cause of an anomaly can be found accurately and staff can be helped to locate the problem in the cloud service system.
Summary of the invention
To meet the above demand and to monitor the business of cloud service infrastructure in a cloud service environment, the present invention provides a business monitoring method and system for cloud service infrastructure. It monitors the data of routers and DNS recursive servers and, in view of the wide range and many sites of cloud services, implements asynchronous, wide-range, time-sharing dial testing.
The present invention provides a business monitoring system for cloud service infrastructure, comprising a control center server and dial-testing servers deployed in different regions. A cloud service dial-testing module is deployed on each dial-testing server; a dial-testing task dispatch module, a data acquisition module, a dial-testing data analysis module, a dial-testing alarm module, and a database are deployed on the control center server.
The user configures monitoring tasks through the dial-testing task dispatch module, which issues the configuration file to the corresponding dial-testing servers. The configuration file describes the user-configured monitoring tasks, which cover multiple source IPs, multiple destination IPs, and multiple protocols. The dial-testing task dispatch module verifies the destination IPs in the configuration file and does not issue tasks that fail verification.
After receiving the configuration file, the cloud service dial-testing module traverses the tasks in it and verifies the correctness of each task's destination IP. For tasks that pass verification, it configures valid data packets and dial tests in asynchronous time-sharing mode. The module sends two kinds of dial-testing data: in the first, it sends packets carrying domain names in a specified format to the DNS recursive servers that provide service in the cloud service; in the second, it sets the number of dial-testing packets according to the obtained list of routers that the cloud service traffic passes through and the sampling ratios of those routers, and sends the packets to the destination IP. After sending packets to the destination IP and to the DNS recursive servers, the module records the packet-sending log S-Log and sends it to the dial-testing data analysis module of the control center server.
The data acquisition module obtains data fingerprint information from the packet-sending log S-Log and traverses the site databases. Before querying, it first verifies the connection to each site database: if the connection fails or the query times out, the problem is recorded in the problem log E-Log; if the connection succeeds, the stored data of that site database is queried. When the traversal of the site databases is complete, a collection file and the collection-file log R-Log are generated. The data fingerprint information is a six-tuple (source IP, destination IP, source port, destination port, protocol number, rule ID).
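As a minimal sketch of how such a six-tuple fingerprint could be represented, the following Python fragment assumes a comma-separated S-Log line; the field names and the log line format are illustrative assumptions and are not specified by the patent.

```python
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """Six-tuple identifying one probe packet (field names are illustrative)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number, e.g. 6 = TCP, 17 = UDP
    rule_id: int


def parse_fingerprint(line: str) -> Fingerprint:
    """Parse a hypothetical log line: 'src_ip,dst_ip,src_port,dst_port,proto,rule_id'."""
    src_ip, dst_ip, src_port, dst_port, protocol, rule_id = line.strip().split(",")
    return Fingerprint(src_ip, dst_ip, int(src_port), int(dst_port),
                       int(protocol), int(rule_id))


# Named tuples are hashable, so fingerprints can serve as set members or dict keys
# when sent records are later matched against collected records.
fp = parse_fingerprint("192.0.2.10,198.51.100.7,40001,53,17,42")
```

The addresses above come from the documentation ranges and stand in for real source and destination IPs.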
The dial-testing data analysis module obtains, for a given task, the collection-file log R-Log, the problem log E-Log, and the packet-sending log S-Log. It first traverses E-Log, marks the problematic site databases in the database, and records the corresponding site database problems. It then traverses S-Log and compares it with R-Log. If the dial-testing data is flow data sent to a destination IP, it calculates, from the data fingerprint information, the average sampling ratio of the routers passed through and then the storage rate of the task. If the dial-testing data is data sent to DNS recursive servers, it compares records by data fingerprint information and calculates the storage rate of the task.
The dial-testing alarm module has preset thresholds both for flow monitoring of routers and for monitoring of the DNS recursive servers that provide service. It compares the storage rate of each task, as calculated by the dial-testing data analysis module, with the threshold and raises an alarm prompt for tasks whose storage rate is below the threshold.
The business monitoring method for cloud service infrastructure proposed by the present invention comprises the following steps:
Step 1: Set up dial-testing servers in different regions and configure them to use asynchronous time-sharing dial testing. Asynchronous time-sharing dial testing means that dial-testing data is sent asynchronously, with a set number of data packets sent at each set time interval;
Step 2: The user configures monitoring tasks in the web interface of the control center server. The dial-testing task dispatch module verifies the correctness of each task's destination IP and, if it is correct, generates the configuration file of the monitoring tasks. The destination IP is verified by querying the IP library and checking whether the IP's information (country, province, city, operator) is correct;
Step 3: The dial-testing task dispatch module issues the configuration file of the monitoring tasks to each dial-testing server. Each dial-testing server verifies the correctness of the destination IPs in the configuration file; if a destination IP is wrong, it is not dial tested and the error information is fed back to the control center server;
Step 4: Obtain the list of routers that the cloud service traffic passes through and the sampling ratio of each router. According to the router sampling ratio and the required probability that the flow log is sampled, set the number of dial-testing packets that the dial-testing server sends to the destination IP. After each transmission of dial-testing packets, the dial-testing server generates the packet-sending log S-Log and sends it to the control center server.
If the router's sampling ratio is 1/X, the probability that the flow log is sampled is required to exceed G%, and the number of dial-testing packets is Y, then the following relationship holds: when Y packets are sent, the probability that the flow log is sampled is $P = 1 - (1 - 1/X)^{Y}$, and Y is set so that $P$ exceeds G%.
Step 5: The dial-testing server obtains the list of DNS recursive servers that provide service in the cloud service, sends packets carrying domain names in the specified format to the recursive DNS servers, and captures the packets to generate a PCAP file;
Step 6: The dial-testing server sends the PCAP file to the intermediate robot, which verifies the packets to ensure security. The dial-testing server sends the packets out through the intermediate robot, generates the packet-sending log S-Log, and sends it to the control center server;
Step 7: The control center server verifies the connection status of each site database; if the connection fails or the query times out, the problem is recorded in the problem log E-Log;
Step 8: The control center server obtains the fingerprint information of the data packets from the received packet-sending log, queries each site database, and generates a collection file. The fingerprint information is (source IP, destination IP, source port, destination port, protocol number, rule ID).
Step 9: For a given task, the control center server obtains the collection-file log R-Log and the problem log E-Log and looks up the corresponding packet-sending log S-Log. By traversing E-Log, it marks the problematic site databases in the control center database and records the problems found;
Step 10: The control center server traverses the packet-sending log S-Log and, for normal site databases, compares S-Log with R-Log. If the dial-testing data is flow data sent to a destination IP, it calculates, from the data fingerprint information, the average sampling ratio of the routers passed through and then the storage rate of the task. If the dial-testing data is data sent to DNS recursive servers, it compares records by data fingerprint information and calculates the storage rate of the task;
Step 11: Compare the calculated storage rate with the corresponding preset storage rate threshold; if it is below the threshold, a prompt is shown in the web interface.
Compared with traditional business monitoring techniques, the method and system of the present invention have the following advantages and positive effects:
(1) The method and system of the invention provide a data monitoring scheme for routers and DNS recursive servers. Using the sampling behavior of routers and the domain name resolution function of recursive servers, they identify, compare, and analyze specific data and, as a black-box test, run fault tests on the business system without affecting the existing business system. By flow monitoring of routers and calculation of the storage rate, problems on cloud service links can be identified. By monitoring and analyzing the DNS data forwarded by the intermediate robot, with identity verification, safe DNS monitoring is provided and the resolution function of the DNS servers is verified.
(2) The method and system of the invention use asynchronous, wide-range, time-sharing monitoring. In view of the wide range and many sites of cloud services, they carry out asynchronous wide-range time-sharing dial testing and separate dial testing from the monitoring and analysis platform. This reduces the pressure on servers, reduces packet loss, greatly increases the flow data collection rate, achieves load balancing, has little impact on the system under test, and improves the robustness, stability, and effectiveness of the business monitoring system.
(3) The system of the invention is deployed in a distributed manner, with the cloud service dial-testing module separated from the dial-testing data analysis module, so that accurate asynchronous tests can be run against business sites at different locations and of different operators. The invention introduces load balancing into the method, processes data in parallel, and achieves asynchronous monitoring in which the dial-testing module is separated from the statistics module, so that statistics are not affected by the dial-testing module. This enhances the availability of the system and improves its performance and scalability.
Brief description of the drawings
Fig. 1 is an overall structure diagram of the business monitoring system for cloud service infrastructure of the present invention;
Fig. 2 is a flowchart of the functions of the cloud service dial-testing module in the system of the present invention;
Fig. 3 is a flowchart of the functions of the data acquisition module in the system of the present invention;
Fig. 4 is a flowchart of the functions of the dial-testing data analysis module in the system of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described below with reference to the accompanying drawings and embodiments.
The business monitoring method and system for cloud service infrastructure provided by the invention use wide-range asynchronous dial testing, aimed at routers and at the DNS recursive servers that provide service, to monitor the cloud service infrastructure. They provide real-time monitoring and real-time prompting, are accurate down to the individual infrastructure component, and effectively support troubleshooting of cloud services.
As shown in Fig. 1, the invention discloses a business monitoring system for cloud service infrastructure, comprising dial-testing servers deployed in different regions and a control center server. A cloud service dial-testing module is deployed on each dial-testing server; a dial-testing task dispatch module, a data acquisition module, a dial-testing data analysis module, a dial-testing alarm module, and a database are deployed on the control center server.
The control center server is a service cluster. Each of the above modules deployed on it can be implemented on a separate server or, where resources are limited, several modules can be integrated on one server.
The user configures detection tasks through the dial-testing task dispatch module, which issues the configuration file to the corresponding dial-testing servers. In this module the user configures and dispatches tasks through the web interface: dial-testing tasks with multiple source IPs (different provinces), destination IPs at multiple sites, and multiple protocols (TCP, UDP, HTTP). Different tasks are issued to the corresponding dial-testing servers, which then carry out the dial testing. The module also provides IP verification: it verifies destination IPs and does not issue tasks whose IPs fail verification.
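The dispatch of tasks to the corresponding dial-testing servers can be sketched as follows. This is only an illustration under stated assumptions: the `MonitorTask` fields, the `split_by_server` helper, and the idea of keying each server's configuration by source IP are hypothetical and not taken from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MonitorTask:
    """One monitoring task as it might appear in a configuration file (assumed layout)."""
    rule_id: int
    source_ip: str   # address of the dial-testing server that runs the task
    dest_ip: str
    dest_port: int
    protocol: str    # "TCP", "UDP" or "HTTP"


def split_by_server(tasks: List[MonitorTask]) -> Dict[str, List[MonitorTask]]:
    """Group tasks by source IP so that each server receives only its own tasks."""
    per_server: Dict[str, List[MonitorTask]] = defaultdict(list)
    for task in tasks:
        per_server[task.source_ip].append(task)
    return dict(per_server)


tasks = [
    MonitorTask(1, "192.0.2.10", "198.51.100.7", 53, "UDP"),
    MonitorTask(2, "192.0.2.10", "198.51.100.8", 80, "HTTP"),
    MonitorTask(3, "192.0.2.20", "198.51.100.7", 443, "TCP"),
]
configs = split_by_server(tasks)  # one task list per dial-testing server
```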
After receiving the configuration file issued by the dial-testing task dispatch module, the cloud service dial-testing module verifies the destination IP information and then dial tests in a time-shared, intermittent manner to keep the dial testing stable. It sends a large number of data packets to the corresponding destination IPs and to the recursive servers that provide service, records the packet-sending log, compresses it, and sends it to the dial-testing data analysis server. As shown in Fig. 2, the cloud service dial-testing module reads the configuration file, traverses the tasks in it, and verifies the correctness of each task's destination IP address. When verification passes, it configures the rules for the task, sends packets to the destination IP, and appends the sending record to the packet-sending log S-Log. After all tasks in the configuration file have been processed, it returns S-Log to the control center server.
Both the dial-testing task dispatch module and the cloud service dial-testing module verify the correctness of a destination IP address by querying the IP library and checking whether the destination IP's information (country, province, city, operator) is correct; if it is, verification passes, otherwise verification fails.
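A minimal sketch of this check, assuming the IP library can be modeled as a mapping from networks to (country, province, city, operator) tuples; the `IP_LIBRARY` contents, the operator names, and the function names are hypothetical.

```python
import ipaddress
from typing import Optional, Tuple

Location = Tuple[str, str, str, str]  # (country, province, city, operator)

# Hypothetical IP library; a real deployment would query a maintained database.
IP_LIBRARY = {
    ipaddress.ip_network("198.51.100.0/24"): ("CN", "Beijing", "Beijing", "OperatorA"),
    ipaddress.ip_network("203.0.113.0/24"): ("CN", "Guangdong", "Shenzhen", "OperatorB"),
}


def lookup(ip: str) -> Optional[Location]:
    """Return the library entry covering `ip`, or None if it is malformed or unknown."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None
    for network, info in IP_LIBRARY.items():
        if addr in network:
            return info
    return None


def verify_destination(ip: str, expected: Location) -> bool:
    """A destination passes only if the library knows it and the information matches."""
    info = lookup(ip)
    return info is not None and info == expected
```

Running the same function on the control center server and again on each dial-testing server mirrors the two verification points described above.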
The cloud service dial-testing module uses asynchronous time-sharing dial testing and sets the number of dial-testing packets according to the obtained list of routers that the cloud service traffic passes through and the sampling ratios of those routers. It sends packets carrying domain names in the specified format to the DNS recursive servers that provide service in the cloud service. The packets that the module sends in a time-shared, intermittent manner can be configured flexibly for the different network environments of each region, to avoid affecting the normal business of the destination server and consuming local network bandwidth.
The data acquisition module obtains data fingerprint information from the packet-sending log S-Log. As shown in Fig. 3, it traverses the site databases and, before querying, first verifies the connection to each site database: if the connection fails or the query times out, the problem is recorded in the problem log E-Log; if the connection succeeds, the stored data of that site database is queried. When the traversal of the site databases is complete and the query task is finished, a collection file with the suffix .ok is generated, together with the collection-file log R-Log. The data fingerprint information is a six-tuple (source IP, destination IP, source port, destination port, protocol number, rule ID). The fingerprint also serves as "dyeing" (marking) information that uniquely identifies one dial-testing data packet. The dyeing method and the marks are composed from the module information of the test system, which gives great flexibility and operability.
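The traversal with error logging can be sketched roughly as below. The sketch makes simplifying assumptions: each site database is modeled as a callable that either returns matching records or raises `ConnectionError` or `TimeoutError`, so the separate connection check and query described above are folded into one call; the function name and the in-memory log structures are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Tuple  # a collected row, e.g. a six-tuple fingerprint


def collect_from_sites(
    sites: Dict[str, Callable[[List[Record]], List[Record]]],
    fingerprints: Iterable[Record],
) -> Tuple[List[Tuple[str, Record]], List[Tuple[str, str]]]:
    """Query every site database; return (R-Log entries, E-Log entries)."""
    wanted = list(fingerprints)
    r_log: List[Tuple[str, Record]] = []
    e_log: List[Tuple[str, str]] = []
    for name, query in sites.items():
        try:
            rows = query(wanted)
        except (ConnectionError, TimeoutError) as exc:
            # A failed connection or a timed-out query goes to the problem log,
            # and the traversal continues with the next site database.
            e_log.append((name, f"{type(exc).__name__}: {exc}"))
            continue
        r_log.extend((name, row) for row in rows)
    return r_log, e_log
```

Keeping failures in a separate log lets the later analysis step distinguish "site database unreachable" from "probe data missing".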
The dial-testing data analysis module obtains, for a given task, the collection-file log R-Log, the problem log E-Log, and the packet-sending log S-Log. As shown in Fig. 4, it first marks the problematic site databases according to E-Log; then, for each task, it traverses the task's S-Log and compares R-Log with S-Log. When it encounters a problematic site database, it marks it in the database of the control center server and records the corresponding site database problem. For normal site databases, if the dial-testing data is flow data sent to a destination IP, it calculates, from the data fingerprint information, the average sampling ratio of the routers passed through and then the storage rate of each task; if the dial-testing data is data sent to DNS recursive servers, it compares records by data fingerprint information and calculates the storage rate of each task. The module also computes how the storage rate changes in each time period and reports the trend. In Fig. 4, once the traversal of a task's packet-sending log has finished, the R-Log, E-Log, and S-Log that took part in the analysis are moved from the current analysis directory to the backup directory.
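The patent does not give a formula for the storage rate, so the following is only one plausible reading, stated as an assumption: the storage rate is the number of sent fingerprints found among the collected records, divided by the number expected to be collected. For DNS probes every sent record is expected; for router flow probes the expected count is scaled by the average sampling ratio. The function name and parameters are hypothetical.

```python
from typing import Hashable, Iterable


def storage_rate(sent: Iterable[Hashable],
                 collected: Iterable[Hashable],
                 avg_sampling_ratio: float = 1.0) -> float:
    """Fraction of expected probe records found in the collected data.

    ASSUMPTION: expected = (number of distinct sent fingerprints) * avg_sampling_ratio.
    Use avg_sampling_ratio = 1.0 for DNS probes, where no sampling applies.
    """
    if not 0.0 < avg_sampling_ratio <= 1.0:
        raise ValueError("avg_sampling_ratio must be in (0, 1]")
    sent_set = set(sent)
    if not sent_set:
        return 0.0
    matched = len(sent_set & set(collected))
    expected = len(sent_set) * avg_sampling_ratio
    # Sampling is random, so more records than expected may arrive; cap at 1.0.
    return min(1.0, matched / expected)
```

Matching on set membership relies on each fingerprint uniquely identifying one probe, as the six-tuple description requires.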
The dial-testing alarm module has preset thresholds both for flow monitoring of routers and for monitoring of the DNS recursive servers that provide service. It compares the storage rate of each task, as calculated by the dial-testing data analysis module, with the corresponding threshold and raises an alarm prompt for tasks whose storage rate is below the threshold.
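The threshold comparison is simple enough to show in a few lines. The threshold values and the task kinds `"router_flow"` and `"dns"` below are hypothetical placeholders; the patent only states that both kinds of monitoring have preset thresholds.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical preset thresholds, one per kind of monitoring.
THRESHOLDS: Dict[str, float] = {"router_flow": 0.90, "dns": 0.95}

Result = Tuple[int, str, float]  # (task id, kind, storage rate)


def tasks_to_alert(results: Iterable[Result],
                   thresholds: Dict[str, float] = THRESHOLDS) -> List[Result]:
    """Return the tasks whose storage rate is strictly below their kind's threshold."""
    return [(task_id, kind, rate)
            for task_id, kind, rate in results
            if rate < thresholds[kind]]


alerts = tasks_to_alert([
    (1, "router_flow", 0.97),
    (2, "router_flow", 0.42),
    (3, "dns", 0.95),
    (4, "dns", 0.80),
])
```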
The database is the local database of the control center server and stores the collected flow log information. Besides the fingerprint information (source IP, destination IP, source port, destination port, protocol number, rule ID), the flow log information also contains part of the packet content.
The business monitoring method for cloud service infrastructure provided by the invention comprises the following Steps 1 to 11.
Step 1: Use asynchronous, wide-range, time-sharing dial testing to implement an asynchronous test mode: separate dial testing from monitoring analysis and dial test in a time-shared manner, which reduces the pressure on the destination IP servers and reduces packet loss.
In this step, multiple dial-testing servers are provided: for P different provinces, each province has a dial-testing server. This implements load balancing and ensures the robustness and convergence of the algorithm. P is an integer greater than 2.
Each dial-testing server uses asynchronous time-sharing dial testing: dial-testing data is sent asynchronously, and a dial-testing interval t is inserted after every K data packets, where K is a positive integer. Time-sharing dial testing reduces the pressure on the destination IP server, reduces packet loss, and can largely improve the storage rate. In the embodiment of the present invention, a dial-testing interval of 1 s is set for every 1000 data packets.
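The batch-and-pause sending pattern can be sketched with `asyncio`. This is an illustration rather than the patented implementation: the `send` callable is injected so that the sketch makes no claim about how packets are actually transmitted, and the function name is hypothetical. The defaults follow the embodiment (K = 1000 packets, t = 1 s).

```python
import asyncio
from typing import Callable, Iterable


async def send_in_batches(send: Callable[[bytes], None],
                          packets: Iterable[bytes],
                          batch_size: int = 1000,
                          interval: float = 1.0) -> int:
    """Send packets, pausing `interval` seconds after every `batch_size` packets.

    The pause yields control to the event loop, so probes to other destinations
    can run concurrently (for example via asyncio.gather).
    """
    sent = 0
    for packet in packets:
        send(packet)
        sent += 1
        if sent % batch_size == 0:
            await asyncio.sleep(interval)
    return sent
```

A caller could run one such coroutine per destination IP and await them together, which keeps each destination's sending rate bounded while the overall dial test proceeds in parallel.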
The asynchronous mode of the method is also reflected in the following: because loading the dial-testing data into the database is delayed (10 min to 30 min), cloud service dial testing is separated from dial-testing data analysis. That is, data analysis does not run on the same platform as the dial-testing servers, and the two do not affect each other.
Step 2: The user issues monitoring tasks through the web interface of the control center server. The dial-testing task dispatch module verifies the correctness of each monitoring task's destination IP by querying the IP library and checking whether the IP's information (country, province, city, operator) is correct. If it is not correct, a prompt is given; if all destination IPs are correct, the monitoring task configuration file is generated and the method proceeds to Step 3.
Step 3: The dial-testing task dispatch module issues the configuration file of the monitoring tasks to the dial-testing servers. Each dial-testing server likewise verifies the correctness of the destination IPs in the configuration file by querying the IP library and checking whether the IP's information (country, province, city, operator) is correct. If a destination IP is wrong, it is not dial tested and an error prompt is fed back to the control center server.
Step 4: Set the number of data packets that each dial-testing server sends to the destination IP, so that the flow records formed by the packets can be sampled and collected by the routers on the path to the destination IP address. Obtain the list of routers that the cloud service traffic passes through and the sampling ratio of each router. If a router has a sampling ratio of 1/X, its logs are likewise sampled data. Because the flow logs generated by a router are sampled, a flow test of the router must ensure that the flow log generated by the test packets can still be captured under sampling.
If the sampling of router is compared for 1/X, it is desirable that the probability that stream daily record is sampled is more than G%, the quantity of testing data packet
For Y, that is, the stream quantity sent is Y.X, Y is positive integer, and G is the positive number less than 100.
When Y = 1, the probability that the flow log is collected is $P = 1/X$. If X = 1000, the collection probability is 1/1000.
When Y > 1, the probability that at least one flow log is sampled is $P = 1 - (1 - 1/X)^{Y}$.
Thus, at a sampling ratio of 1:1000, for the probability that a flow log is sampled to exceed 99.99%, Y is taken as 10000, which gives a probability of 99.995483%;
at a sampling ratio of 1:2000, Y is taken as 20000, and the probability that a flow log is sampled is 99.995471%;
at a sampling ratio of 1:5000, Y is taken as 50000, and the probability that a flow log is sampled is 99.995465%.
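The figures above can be reproduced with a short calculation, assuming each of the Y flows is sampled independently with probability 1/X, so that the probability of at least one sampled flow log is 1 - (1 - 1/X)^Y. The function names are illustrative, and `min_packets` is an added convenience for finding the smallest Y that exceeds a target G%; it is not part of the patent text.

```python
import math


def sample_probability(x: int, y: int) -> float:
    """Probability that at least one of y flows is sampled at ratio 1/x."""
    if x < 1 or y < 1:
        raise ValueError("x and y must be positive integers")
    return 1.0 - (1.0 - 1.0 / x) ** y


def min_packets(x: int, g_percent: float) -> int:
    """Smallest y such that sample_probability(x, y) > g_percent / 100."""
    if not 0 < g_percent < 100:
        raise ValueError("g_percent must be between 0 and 100 (exclusive)")
    if x == 1:
        return 1  # every flow is sampled
    target = g_percent / 100.0
    # Closed-form estimate, then adjust for strictness and rounding error.
    y = max(1, math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / x)))
    while sample_probability(x, y) <= target:
        y += 1
    while y > 1 and sample_probability(x, y - 1) > target:
        y -= 1
    return y


if __name__ == "__main__":
    for x, y in [(1000, 10000), (2000, 20000), (5000, 50000)]:
        print(f"1:{x}, Y={y}: {sample_probability(x, y) * 100:.6f}%")
```

Under this model the values of Y chosen in the text (ten times X) sit above the minimum needed for 99.99%, which leaves some margin for packet loss.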
In addition, because the router must output flow logs, the sending interval t must be taken into account to keep packet reception stable. For large-scale testing of a cloud service, 50000 data packets are sent to each destination IP. Each time a testing server finishes sending testing data packets, it generates a packet-sending log S-Log and sends it to the control center server.
Step 5: The testing server obtains the list of DNS recursive servers that provide service in the cloud service, sends data packets carrying domain names in a specified format to the recursive DNS servers, and captures the packets to obtain PCAP files.
PCAP is a common packet storage format; mainstream packet-capture software, including Wireshark, can generate data packets in this format.
Step 6: The testing server sends the PCAP files to the intermediate robot, and the intermediate robot verifies the data packets to ensure security. The intermediate robot checks whether the source IP of each data packet is an IP in the whitelist of the control center and whether the domain name is a standard domain name. If the IP is not in the whitelist or the domain name is not a standard domain name, it parses the PCAP file and reassembles the data packet, forging the source IP as a whitelisted IP, forging a specific standard domain name, and so on. The intermediate robot sends out both the data packets that pass verification and the reassembled data packets. Each time the testing server has sent testing data packets through the intermediate robot, it generates a packet-sending log S-Log and sends it to the control center server.
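The check performed by the intermediate robot can be sketched on a simplified packet record. This sketch does not parse PCAP files; `ProbePacket`, `verify_or_rewrite`, the fallback values, and the definition of a "standard" domain name (letter-digit-hyphen labels, at least two labels) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, replace
from typing import FrozenSet

# Assumed notion of a standard name: letter-digit-hyphen labels of 1 to 63
# characters that neither start nor end with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


@dataclass(frozen=True)
class ProbePacket:
    """Hypothetical reduced view of one captured DNS test packet."""
    src_ip: str
    qname: str


def is_standard_domain(name: str) -> bool:
    name = name.rstrip(".")
    labels = name.split(".")
    return (
        0 < len(name) <= 253
        and len(labels) >= 2
        and all(_LABEL.fullmatch(label) for label in labels)
    )


def verify_or_rewrite(
    pkt: ProbePacket,
    whitelist: FrozenSet[str],
    fallback_ip: str,
    fallback_qname: str,
) -> ProbePacket:
    """Return pkt unchanged if it passes both checks, else a rewritten copy."""
    src_ok = pkt.src_ip in whitelist
    name_ok = is_standard_domain(pkt.qname)
    if src_ok and name_ok:
        return pkt
    return replace(
        pkt,
        src_ip=pkt.src_ip if src_ok else fallback_ip,
        qname=pkt.qname if name_ok else fallback_qname,
    )
```

Only the failing field is replaced, so a packet with a whitelisted source but a malformed name keeps its original source IP.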
Step 7: Verify the connection status of the office point databases. Because the data are stored in the office point databases of multiple locations, a database verification function is added. The control center server verifies the connection status of each database; if a connection fails or a query times out, the problem is recorded in the problem log E-Log.
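A minimal sketch of this connection check follows. The `probe` callable stands in for whatever database driver is actually used, and the E-Log record layout (a list of dictionaries) is an assumption; the text does not specify either.

```python
from typing import Callable, Dict, List, Tuple


def check_sites(
    sites: Dict[str, str],
    probe: Callable[[str], None],
) -> Tuple[List[str], List[dict]]:
    """Probe each office point database and collect problems as E-Log entries.

    sites maps a site name to a connection string. probe is expected to raise
    TimeoutError on a query timeout and ConnectionError on a failed connection.
    """
    healthy: List[str] = []
    e_log: List[dict] = []
    for name, dsn in sites.items():
        try:
            probe(dsn)
        except TimeoutError as exc:
            e_log.append({"site": name, "problem": "query timeout", "detail": str(exc)})
        except ConnectionError as exc:
            e_log.append({"site": name, "problem": "connection failed", "detail": str(exc)})
        else:
            healthy.append(name)
    return healthy, e_log
```

The list of healthy sites is what later steps would query; sites that appear in the E-Log entries are the ones to be marked as problematic.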
Step 8: The control center obtains the fingerprint information of the data from the packet-sending log sent by the testing server. The fingerprint is a six-tuple (source IP, destination IP, source port, destination port, protocol number, rule ID). Using the fingerprint information, the control center queries the data of the multiple office points and generates the data acquisition .ok file;
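The six-tuple fingerprint maps naturally onto an immutable record. In the sketch below, the comma-separated S-Log line format and the function name `parse_s_log_line` are hypothetical, since the text does not define the log layout.

```python
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """Six-tuple fingerprint as listed in the text."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    rule_id: str


def parse_s_log_line(line: str) -> Fingerprint:
    """Parse one line of an assumed CSV-style S-Log into a Fingerprint."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    src_ip, dst_ip, src_port, dst_port, protocol, rule_id = fields
    return Fingerprint(src_ip, dst_ip, int(src_port), int(dst_port), int(protocol), rule_id)
```

Because a `NamedTuple` is hashable, fingerprints can be placed in sets or used as dictionary keys when the sent and stored records are compared later.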
Step 9: The control center server obtains the log R-Log of the task's data acquisition file, looks up the corresponding packet-sending log S-Log, and traverses the problem log E-Log to determine the problematic office point databases and the problems present, which it marks in the control center database;
Step 10: Traverse the packet-sending log S-Log and, for the normal office point databases, compare it with the data acquisition log R-Log by fingerprint information. If the testing data are flow data, calculate the average sampling ratio of the routers that the flows pass through, then derive the stored quantity and calculate the storage rate of the task. If the testing data are DNS data, compare directly by fingerprint and calculate the storage rate of the task;
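The text does not give an explicit formula for the storage rate, so the sketch below rests on two stated assumptions: for flow data, the expected number of stored records is the number of packets sent multiplied by the average sampling ratio of the routers on the path, and for DNS data, the rate is the fraction of sent fingerprints that are found in storage. Function names are illustrative.

```python
from typing import Hashable, Iterable, Sequence


def flow_storage_rate(sent: int, stored: int, sampling_ratios: Sequence[float]) -> float:
    """Storage rate for flow tests (assumed formula, see lead-in).

    sampling_ratios holds the ratio 1/X of each router on the path.
    """
    if sent < 1 or not sampling_ratios:
        raise ValueError("need at least one sent packet and one router")
    average_ratio = sum(sampling_ratios) / len(sampling_ratios)
    expected = sent * average_ratio
    if expected <= 0:
        return 0.0
    # Sampling is random, so stored can exceed the expectation; cap at 1.
    return min(1.0, stored / expected)


def dns_storage_rate(sent: Iterable[Hashable], stored: Iterable[Hashable]) -> float:
    """Fraction of distinct sent fingerprints that also appear in storage."""
    sent_set = set(sent)
    if not sent_set:
        raise ValueError("no fingerprints were sent")
    return len(sent_set & set(stored)) / len(sent_set)
```

For example, 50000 packets through routers sampling at 1:5000 give an expectation of about 10 stored records, so 8 stored records correspond to a rate of roughly 0.8 under this model.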
Step 11: Compare the calculated storage rate with the preset threshold; if it is below the threshold, prompt on the WEB interface and give the corresponding problem prompt. In the embodiment of the present invention, the storage rate thresholds in this step are all set to 65%.
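The threshold comparison amounts to a simple filter. In this sketch the task identifiers and the function name `tasks_to_alert` are made up; the default of 0.65 is the threshold stated for the embodiment.

```python
from typing import Dict, List


def tasks_to_alert(rates: Dict[str, float], threshold: float = 0.65) -> List[str]:
    """Return the task IDs whose storage rate is strictly below the threshold."""
    return sorted(task for task, rate in rates.items() if rate < threshold)
```

A task whose rate equals the threshold is not reported, which matches the "less than the threshold" wording of the step.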
Compared with the prior art, the method of the present invention calculates the storage rate through flow monitoring for routers and thereby locates problems in the cloud service links. Through the DNS data monitoring and analysis method with intermediary forwarding, it performs identity verification, provides secure DNS monitoring, and verifies the resolution function of the DNS servers. The asynchronous, large-scale, time-sharing monitoring method provided by the present invention reduces packet loss through time-sharing testing, realizes load balancing, has little influence on the system under test, and offers high robustness and stability.
Claims (5)
1. A business monitoring system for cloud service infrastructure, characterized by comprising a control center server and testing servers arranged in each region; wherein a cloud service testing module is deployed on each testing server, and a testing mission dispatching module, a data acquisition module, a testing data analysis module, a testing alarm module and a database are deployed on the control center server;
The user configures monitoring tasks through the testing mission dispatching module, and the configuration file is issued to the corresponding testing servers; the configuration file describes the user-configured monitoring tasks with multiple source IPs, multiple destination IPs and multiple protocols; the testing mission dispatching module verifies the destination IPs in the configuration file, and tasks that fail verification are not issued;
After receiving the configuration file, the cloud service testing module traverses the tasks in the configuration file, verifies the correctness of the destination IP of each task, constructs legal data packets for the tasks that pass verification, and performs data testing in the asynchronous time-sharing testing mode; the cloud service testing module handles two kinds of testing data: one is data packets carrying domain names in a specified format, sent to the DNS recursive servers that provide service in the cloud service; the other is data packets sent to destination IPs, the number of which is set according to the obtained list of routers that the cloud service traffic passes through and the sampling ratios of the routers; after sending data packets to the destination IPs and to the DNS recursive servers that provide service, the cloud service testing module records a packet-sending log S-Log and sends it to the testing data analysis module of the control center server;
The data acquisition module obtains the data fingerprint information from the packet-sending log S-Log and traverses the office point databases; before querying data, it first verifies the connection to the office point database; if the connection fails or the query times out, the problem is recorded in the problem log E-Log; if the connection succeeds, it queries the stored data of the office point database; after all office point databases have been traversed, it generates the data acquisition file and the data acquisition file log R-Log;
For a given task, the testing data analysis module obtains the log R-Log of the corresponding data acquisition file, the problem log E-Log and the packet-sending log S-Log; it first traverses the problem log E-Log, marks the problematic office point databases in the database, and records the corresponding office point database problems; it then traverses the packet-sending log S-Log and compares it with the log R-Log; if the testing data are flow data sent to a destination IP, it calculates, according to the data fingerprint information, the average sampling ratio of the routers passed through and calculates the storage rate of the task; if the testing data are data sent to the DNS recursive servers, it compares according to the data fingerprint information and calculates the storage rate of the task;
The testing alarm module has preset thresholds for both the flow monitoring of routers and the monitoring of the DNS recursive servers that provide service; it compares the storage rate of each task calculated by the testing data analysis module with the threshold and raises an alarm prompt for tasks whose storage rate is below the threshold.
2. The business monitoring system for cloud service infrastructure according to claim 1, characterized in that the control center server is a service cluster, and each module deployed on the control center server is implemented by a separate server.
3. The business monitoring system for cloud service infrastructure according to claim 1, characterized in that the testing mission dispatching module checks, for each destination IP of a monitoring task, whether the IP information (country, province, city, operator) obtained by querying the IP library is correct, and the destination IP passes verification if it is correct.
4. The business monitoring system for cloud service infrastructure according to claim 1, characterized in that the data fingerprint information is expressed as a six-tuple (source IP, destination IP, source port, destination port, protocol number, rule ID).
5. A business monitoring method for cloud service infrastructure, characterized by comprising the following steps:
Step 1: Testing servers are set up in different regions and configured to use an asynchronous time-sharing testing mode; the asynchronous time-sharing testing mode means that testing data are sent asynchronously and that a set number of data packets is sent at every set time interval;
Step 2: The user configures a monitoring task on the WEB interface of the control center server, and the testing mission dispatching module verifies the correctness of the destination IPs of the task; if they are correct, the configuration file of the monitoring task is generated;
Step 3: The testing mission dispatching module issues the configuration file of the monitoring task to each testing server, and each testing server verifies the correctness of the destination IPs in the configuration file; if a destination IP is wrong, it is not tested, and the error is fed back to the control center server;
Step 4: Obtain the list of routers that the cloud service traffic passes through and the sampling ratios of the routers, and set the testing data packets that the testing server sends to each destination IP according to the sampling ratios of the routers and the set probability condition for a flow log being sampled; each time the testing server finishes sending testing data packets, it generates a packet-sending log S-Log and sends it to the control center server;
If the sampling ratio of the router is 1/X, the probability that a flow log is sampled is required to exceed G%, and the number of testing data packets is Y, then the following relationship holds: when Y data packets are sent, the probability that a flow log is sampled is $1 - (1 - 1/X)^{Y}$;
Step 5: The testing server obtains the list of DNS recursive servers that provide service in the cloud service, sends data packets carrying domain names in a specified format to the recursive DNS servers, and captures the packets to generate PCAP files;
Step 6: The testing server sends the PCAP files to the intermediate robot, and the intermediate robot verifies the data packets to ensure security; the testing server sends out the data packets through the intermediate robot, generates a packet-sending log S-Log and sends it to the control center server;
Step 7: The control center server verifies the connection status of the office point databases; if a connection fails or a query times out, the problem is recorded in the problem log E-Log;
Step 8: The control center server obtains fingerprint information from the packet-sending log sent by the testing server, queries the data of the multiple office points according to the fingerprint information, and generates the data acquisition file; the fingerprint information is (source IP, destination IP, source port, destination port, protocol number, rule ID);
Step 9: For a given task, the control center server obtains the log R-Log of the data acquisition file and the problem log E-Log, and looks up the corresponding packet-sending log S-Log; by traversing the problem log E-Log, it marks the problematic office point databases and their problems in the control center database;
Step 10: The control center server traverses the packet-sending log S-Log and, for the normal office point databases, compares the packet-sending log S-Log with the log R-Log; if the testing data are flow data sent to a destination IP, it calculates, according to the data fingerprint information, the average sampling ratio of the routers passed through and calculates the storage rate of the task; if the testing data are data sent to the DNS recursive servers, it compares according to the data fingerprint information and calculates the storage rate of the task;
Step 11: The calculated storage rate is compared with the corresponding preset storage rate threshold; if it is below the threshold, a prompt is given on the WEB interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810585690.XA CN108683569B (en) | 2018-06-06 | 2018-06-06 | Service monitoring method and system for cloud service infrastructure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810585690.XA CN108683569B (en) | 2018-06-06 | 2018-06-06 | Service monitoring method and system for cloud service infrastructure |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108683569A (en) | 2018-10-19 |
CN108683569B (en) | 2020-06-09 |
Family
ID=63810284
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810585690.XA Active CN108683569B (en) | 2018-06-06 | 2018-06-06 | Service monitoring method and system for cloud service infrastructure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108683569B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101727389A (en) * | 2009-11-23 | 2010-06-09 | 中兴通讯股份有限公司 | Automatic test system and method of distributed integrated service |
CN201601833U (en) * | 2009-12-28 | 2010-10-06 | 福建邮科通信技术有限公司 | Automatic detecting system for wireless network |
CN102546269A (en) * | 2010-12-07 | 2012-07-04 | 中国移动通信集团广东有限公司 | Method and system capable of fast monitoring internet protocol (IP) network |
KR20140039686A (en) * | 2012-09-25 | 2014-04-02 | 에스케이텔레콤 주식회사 | Apparatus and method for providing quality analysis of data service |
CN104753735A (en) * | 2013-12-31 | 2015-07-01 | 中国移动通信集团上海有限公司 | Dialing testing system and method |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109921925A (en) * | 2019-02-15 | 2019-06-21 | 北京奇艺世纪科技有限公司 | A kind of dial testing method and device |
CN112463572A (en) * | 2019-09-06 | 2021-03-09 | 福建天泉教育科技有限公司 | Cross-border multi-service dial testing software testing system and method thereof |
CN112463572B (en) * | 2019-09-06 | 2023-09-15 | 福建天泉教育科技有限公司 | Cross-border multi-service dial testing software testing system and method thereof |
CN110519303A (en) * | 2019-09-30 | 2019-11-29 | 北京市天元网络技术股份有限公司 | Communication means and system across xegregating unit |
CN112100133A (en) * | 2020-11-04 | 2020-12-18 | 广州市玄武无线科技股份有限公司 | Distributed log processing system |
CN112866053A (en) * | 2020-12-31 | 2021-05-28 | 天翼物联科技有限公司 | Internet of things testing method, system and device and storage medium |
CN113572644A (en) * | 2021-07-26 | 2021-10-29 | 武汉众邦银行股份有限公司 | Internet cloud dial-up test automatic monitoring method and device |
CN113572644B (en) * | 2021-07-26 | 2024-01-23 | 武汉众邦银行股份有限公司 | Internet cloud dial testing automatic monitoring method and device |
Also Published As
Publication number | Publication date |
---|---|
CN108683569B (en) | 2020-06-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108683569A (en) | A kind of the business monitoring method and system of cloud service-oriented infrastructure | |
Dhamdhere et al. | Inferring persistent interdomain congestion | |
US8443074B2 (en) | Constructing an inference graph for a network | |
Cinque et al. | Microservices monitoring with event logs and black box execution tracing | |
US7076547B1 (en) | System and method for network performance and server application performance monitoring and for deriving exhaustive performance metrics | |
US9210050B2 (en) | System and method for a testing vector and associated performance map | |
Carneiro et al. | Flowmonitor: a network monitoring framework for the network simulator 3 (ns-3) | |
US8578017B2 (en) | Automatic correlation of service level agreement and operating level agreement | |
US9712415B2 (en) | Method, apparatus and communication network for root cause analysis | |
CN107113203B (en) | Apparatus, system and method for debugging network connectivity | |
US9172593B2 (en) | System and method for identifying problems on a network | |
US20030005145A1 (en) | Network service assurance with comparison of flow activity captured outside of a service network with flow activity captured in or at an interface of a service network | |
CN108900640A (en) | Node calls link generation method, device, computer equipment and storage medium | |
US20140280904A1 (en) | Session initiation protocol testing control | |
CN101656642B (en) | Method, device and system for testing authentication performance of network access equipment | |
Khanna et al. | Automated online monitoring of distributed applications through external monitors | |
Cinque et al. | An exploratory study on zeroconf monitoring of microservices systems | |
CN114389792B (en) | WEB log NAT (network Address translation) front-back association method and system | |
Viipuri | Traffic analysis and modeling of IP core networks | |
Madariaga et al. | PePa ping dataset: Comprehensive contextualization of periodic passive ping in wireless networks | |
Bocchi et al. | Statistical network monitoring: Methodology and application to carrier-grade NAT | |
Toll et al. | IoTreeplay: Synchronous Distributed Traffic Replay in IoT Environments | |
Flach et al. | Diagnosing slow web page access at the client side | |
US11899568B2 (en) | Enriched application outage insights | |
Putra | Cloud-based Distributed Internet Measurement Platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||