CN107229695A - Multi-platform aviation electronics big data system and method - Google Patents

Info

Publication number
CN107229695A
Authority
CN
China
Prior art keywords
data
module
storage
relation analysis
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710367759.7A
Other languages
Chinese (zh)
Inventor
毛睿
陆敏华
李荣华
王毅
廖好
周明洋
商烁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201710367759.7A
Publication of CN107229695A
Priority to PCT/CN2017/106322 (WO2018214388A1)
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Abstract

The invention discloses a multi-platform aviation electronics big data system comprising a data acquisition module, a data storage module, a data relation analysis module and a data relation analysis application module. The data acquisition module obtains pcap packet files from data source 1, classifies the acquired data and passes them to the data storage module, which completes the data storage process. The data relation analysis module obtains training data from data source 2 and builds a data correlation model. The model is supplied to the data relation analysis application module, which performs real-time prediction and displays the results on screen. The data relation analysis application module uses the cloud storage function implemented by the data storage module to store data in real time. The invention also discloses a method for implementing the system. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores them in real time on a "resource cloud" platform, which ensures that the data are up to date.

Description

Multi-platform aviation electronics big data system and method
Technical field
The invention belongs to the computer field and relates to an aviation flight data system, in particular to a multi-platform aviation electronics big data system. It further relates to a method for implementing the multi-platform aviation electronics big data system.
Background art
Aviation flight operation is a large integrated system. Throughout a flight, large volumes of varied data must be passed between departments and posts, such as crew information, meteorological conditions, navigation information, route risk assessments, manifest information, takeoff data and contingency plans for special situations. Because of limits in technology and management, data have traditionally been passed by telephone, paper documents, manuals and the like. These traditional support methods have many shortcomings and have even become a bottleneck limiting the continued development of the civil aviation industry. Aeronautical data have an extremely important influence on the safe takeoff and economic benefit of every flight. Aeronautical data are also multi-source, complex and large in scale, and existing single-platform data systems have limited applicability. A multi-platform aviation electronics big data system for such multi-source, large-scale flight data is therefore urgently needed.
The present invention is built on existing distributed frameworks and database platforms. The commonly used ones are described below.
1. Hadoop
Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. Users can develop distributed programs with it without knowing the low-level details of distribution, making full use of a cluster for high-speed computation and storage. Put simply, Hadoop is a software platform that makes it easier to develop and run applications that process large-scale data. The platform is implemented in the object-oriented programming language Java and is therefore highly portable.
The core of Hadoop is HDFS and MapReduce.
HDFS (Hadoop Distributed File System) is a distributed file system that hides lower-level details such as load balancing and redundant replication and presents a unified file system API to upper-level programs. HDFS is specially optimized for the characteristics of massive data, including access to very large files, read operations far outnumbering write operations, and node failures caused by failure-prone commodity PCs. HDFS divides a file into 64 MB blocks, distributes them across the machines of the cluster and stores them using the Linux file system. Each block is stored redundantly in at least three copies. At the center is a NameNode, which locates file blocks according to the file index.
MapReduce is a programming model that extracts and analyzes elements from massive data and finally returns a result set. Most distributed computations can be abstracted as MapReduce operations. Map decomposes the input into intermediate key-value pairs. Reduce merges the key-value pairs output by Map according to their keys and produces the final result. The programmer supplies these two functions to the system, and the underlying infrastructure distributes the Map and Reduce operations across the cluster and stores the results on HDFS.
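The Map and Reduce roles described above can be illustrated with a minimal single-process sketch. It is not Hadoop code and does not use any Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: decompose one input record into intermediate (key, value) pairs."""
    return [(word, 1) for word in record.split()]


def shuffle(pairs):
    """Group intermediate values by key (in Hadoop the framework does this)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: merge all values that share a key into one result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    intermediate = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


counts = run_job(["radar track radar", "track update"])
```

In a real cluster the map and reduce calls would run on different machines and the results would be written to HDFS. Here everything runs in one process so the data flow is easy to follow.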
Hadoop has the following advantages, which let users easily develop and run applications that process massive data.
High reliability: Hadoop automatically maintains replicated copies of data and automatically redeploys computing tasks after a task fails.
High scalability: Hadoop distributes data and computing tasks across the available computer clusters, which can easily be extended to thousands of nodes. Provided low latency is not required, Hadoop therefore offers very high throughput and is well suited to computation over massive data.
High efficiency: Hadoop can move data dynamically between nodes and keeps the nodes dynamically balanced, so processing is fast.
High fault tolerance: Hadoop automatically keeps multiple copies of data and automatically reassigns failed tasks.
Low cost: Hadoop distributes and processes data on server clusters built from ordinary machines. Such clusters can total thousands of nodes, each running the open-source operating system Linux, so hardware cost is greatly reduced. In addition, Hadoop is open source, so software cost is also greatly reduced compared with all-in-one appliances, commercial data warehouses and the like.
2. HBase
HBase, short for Hadoop Database, is a highly reliable, high-performance, column-oriented, scalable distributed storage system. Its main function is to store massive structured data in column-oriented form on top of Hadoop HDFS.
Tables stored in HBase have the following main features.
Large tables: a table can have billions of rows and millions of columns.
Schema-free: each row has a sortable primary key and any number of columns. Columns can be added dynamically as needed, and different rows in the same table can have entirely different columns.
Column-oriented: storage and access control are organized by column (family), and each column (family) is retrieved independently.
Sparse: empty (null) columns occupy no storage space, so tables can be designed to be very sparse.
Multiple data versions: the data in each cell can have multiple versions. By default the version number is assigned automatically and is the timestamp at which the cell was inserted.
Single data type: all data in HBase are character strings, without types.
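The table features listed above (sortable row key, dynamically added columns, sparse cells, timestamped versions) can be pictured with a small in-memory model. This is only a teaching sketch and does not use or imitate the HBase client API. The class name `SparseTable`, its methods and the sample row keys and column names are hypothetical.

```python
import time


class SparseTable:
    """Toy model: row key -> "family:qualifier" -> {timestamp: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, column, value, timestamp=None):
        # By default the version number is the insertion timestamp.
        ts = time.time_ns() if timestamp is None else timestamp
        self.rows.setdefault(row_key, {}).setdefault(column, {})[ts] = value

    def get(self, row_key, column, timestamp=None):
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None  # an absent cell is simply not stored (sparse table)
        if timestamp is None:
            return versions[max(versions)]  # newest version wins
        return versions.get(timestamp)

    def scan(self):
        # Rows are returned in row-key order.
        for key in sorted(self.rows):
            yield key, self.rows[key]


table = SparseTable()
table.put("node-02", "nav:alt", "9800", timestamp=1)
table.put("node-02", "nav:alt", "9900", timestamp=2)
table.put("node-01", "radar:range", "42", timestamp=1)
latest = table.get("node-02", "nav:alt")
```

Each row holds only the columns that were actually written, so `node-01` has no `nav:alt` cell at all, and both versions of `nav:alt` for `node-02` remain readable by timestamp.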
HBase is mainly suited to the following scenarios:
● Highly concurrent reads and writes
● Table structures whose column families need frequent adjustment
● Storage of structured or semi-structured data
● Highly concurrent key-value storage
● Keys written in random order but stored in sorted order
● A fixed-size collection kept for each key
HBase also has some shortcomings and unsuitable scenarios:
● Because only row-level locks are provided, HBase supports distributed transactions poorly
● HBase performs poorly on queries involving join, group by and similar operations
● Queries that do not use the row key perform very poorly because they trigger a full table scan, while building a secondary index or multiple indexes requires maintaining an additional index table
● Support for highly concurrent random writes is limited.
In a system-of-systems confrontation environment, perceiving the data of the data sources in real time is a critical problem. These data usually come from multiple sensors, and efficiently managing the heterogeneous data they produce is one of the difficulties. To address these problems, the present invention studies existing distributed frameworks and related data analysis methods in an attempt to find an effective way to process and analyze multi-source, large-scale flight data. To date there has been no report of a multi-platform aviation electronics big data system that applies distributed frameworks and database platforms.
Summary of the invention
The technical problem to be solved by the present invention is to provide a multi-platform aviation electronics big data system that processes and analyzes large-scale flight data. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores these data in real time on a "resource cloud" platform. Client nodes of the "resource cloud" platform obtain data from the cloud in real time, and the cloud platform ensures that the data are up to date. On this basis, the system supports building a correlation model from historical data and uses real-time data together with the correlation model to make real-time predictions, giving the pilot a degree of guidance for decision-making. Specifically, the system implements the following functions: flight data acquisition, real-time flight data sharing, flight data relation analysis and real-time decision support. The present invention also provides a method for implementing the multi-platform aviation electronics big data system.
To solve the above technical problem, the present invention provides a multi-platform aviation electronics big data system comprising a data acquisition module, a data storage module, a data relation analysis module and a data relation analysis application module.
The data acquisition module obtains pcap packet files from data source 1, classifies the acquired data and passes them to the data storage module, which completes the data storage process. The data relation analysis module obtains training data from data source 2 and builds a data correlation model. The model is supplied to the data relation analysis application module, which performs real-time prediction and displays the results on screen. The data relation analysis application module uses the cloud storage function implemented by the data storage module to store data in real time.
In a preferred technical solution of the present invention, the data acquisition module includes an input folder path unit, an output folder path unit and a data block selection unit. The input folder path unit and the output folder path unit read the input and output folder paths selected by the user, and the data block selection unit reads the data block types selected by the user. The data acquisition module acquires data according to what these units read. From the pcap packets captured from the network with the libpcap package, the data acquisition module extracts the key fields: the time information, the source IP and destination IP of the packet, and the data field that carries the information. These are the time, sourceIP, destIP and data fields respectively. The combination of destIP and sourceIP is used to reconstruct who sent the data to whom in the simulated scenario, from which the data block carried by the packet is preliminarily determined. The different data blocks are then distinguished and parsed according to their respective formats to obtain independent data block structures, which are written back to hard disk as text for use by the next stage.
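The field extraction described above can be sketched without libpcap by reading the classic pcap file layout directly. The sketch assumes a little-endian, microsecond-resolution capture of Ethernet frames carrying IPv4, and it treats everything after the IPv4 header as the data field. These are simplifying assumptions, not details from the patent. The names `read_packets` and `sample_capture` are hypothetical, and the capture is synthesized in memory so the example is self-contained.

```python
import io
import socket
import struct

PCAP_GLOBAL = struct.Struct("<IHHiIII")  # magic, major, minor, zone, sigfigs, snaplen, linktype
PCAP_RECORD = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len
ETH_LEN = 14                             # Ethernet header length in bytes


def read_packets(stream):
    """Yield time, sourceIP, destIP and data for each IPv4 frame in a pcap stream."""
    magic = PCAP_GLOBAL.unpack(stream.read(PCAP_GLOBAL.size))[0]
    if magic != 0xA1B2C3D4:
        raise ValueError("only little-endian microsecond pcap is handled here")
    while True:
        record = stream.read(PCAP_RECORD.size)
        if len(record) < PCAP_RECORD.size:
            break  # end of capture
        ts_sec, ts_usec, incl_len, _ = PCAP_RECORD.unpack(record)
        frame = stream.read(incl_len)
        if len(frame) < ETH_LEN + 20 or frame[12:14] != b"\x08\x00":
            continue  # skip frames that are not IPv4
        ip = frame[ETH_LEN:]
        header_len = (ip[0] & 0x0F) * 4  # IHL is counted in 32-bit words
        yield {
            "time": ts_sec + ts_usec / 1_000_000,
            "sourceIP": socket.inet_ntoa(ip[12:16]),
            "destIP": socket.inet_ntoa(ip[16:20]),
            "data": ip[header_len:],
        }


def sample_capture():
    """Build a one-packet capture in memory (synthetic test data)."""
    ip_header = bytes([0x45, 0, 0, 24, 0, 0, 0, 0, 64, 17, 0, 0])
    ip_header += socket.inet_aton("10.0.0.1") + socket.inet_aton("10.0.0.2")
    frame = b"\x00" * 12 + b"\x08\x00" + ip_header + b"DATA"
    return (PCAP_GLOBAL.pack(0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
            + PCAP_RECORD.pack(100, 500000, len(frame), len(frame)) + frame)


packets = list(read_packets(io.BytesIO(sample_capture())))
```

A later stage could key each record on the `(sourceIP, destIP)` pair to decide which data block format to apply before parsing the `data` bytes.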
In a preferred technical solution of the present invention, the data storage module includes a file path reading unit and a demonstration control unit. The file path reading unit reads the storage path of the data source files selected by the user. The demonstration control unit demonstrates how the data are stored: it periodically reads the stored records and displays them on a panel. The data storage module uses the Hadoop distributed storage platform and the HBase distributed database. It obtains data from multiple aircraft in real time, stores the data back onto the multiple aircraft by means of cloud storage, and obtains and shares the data of the multiple aircraft in real time.
In a preferred technical solution of the present invention, the data relation analysis module includes a training data path unit, a training parameter selection unit and a data splitting mode selection unit. The training data path unit reads the training data storage path selected by the user, the training parameter selection unit reads each training parameter value selected by the user, and the data splitting mode selection unit reads the data splitting mode selected by the user. The data relation analysis module builds and trains the model according to what these units read. The data relation analysis module uses an SVM classifier, corresponding to the SVM package in the code, and classifies the existing data and analysis results by the SVM method. Its core components are a data splitter and the libsvm classifier package that it calls. The splitter divides the records of the data source whose result is 0 into N parts, where N is entered by the user, and combines each part with the records whose result is 1 to form N training data sets. Training with libsvm outputs N models. At prediction time, the results of the N models are combined with an AND or OR operation to produce the output prediction. The data correlation model in the data relation analysis module is built from input parameters specified by the user. The SVM classifier is preferably a nonlinear SVM classifier with an RBF kernel, and is preferably a two-split classifier.
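The splitting and combining steps of the data relation analysis module can be sketched as follows. The sketch covers only the data splitter and the AND/OR combination. It does not call libsvm, and the round-robin way of cutting the result-0 records into N parts is an assumption, since the text leaves the splitting mode to the user. The function names `split_training_sets` and `combine` are hypothetical.

```python
def split_training_sets(records, n):
    """Split result-0 records into n parts and pair each part with all result-1 records.

    Each record is a (features, label) tuple with label 0 or 1.
    """
    negatives = [r for r in records if r[1] == 0]
    positives = [r for r in records if r[1] == 1]
    parts = [negatives[i::n] for i in range(n)]  # round-robin split (assumed)
    return [part + positives for part in parts]


def combine(predictions, mode="or"):
    """Merge the per-model predictions (0 or 1) with an AND or OR operation."""
    if mode == "and":
        return int(all(predictions))
    return int(any(predictions))


# Six misses (label 0) and two hits (label 1): an imbalanced toy data set.
records = [((i, i + 1), 0) for i in range(6)] + [((9, 9), 1), ((8, 8), 1)]
training_sets = split_training_sets(records, n=3)
```

Each of the three training sets now holds two misses and two hits, so every model is trained on balanced data. At prediction time `combine([0, 1, 0])` reports a hit under OR and a miss under AND.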
In a preferred technical solution of the present invention, the data relation analysis application module includes a model path selection unit, a file path reading unit and a demonstration control unit. The model path selection unit reads the storage path of the trained model selected by the user, the file path reading unit reads the storage path of the data source files selected by the user, and the demonstration control unit analyzes the data with the loaded model and displays the prediction results on a panel.
In addition, the present invention provides a method for implementing the above system, comprising: data acquisition by the data acquisition module, data storage by the data storage module, building of the data correlation model by the data relation analysis module, and display of real-time prediction results by the data relation analysis application module.
In a preferred technical solution of the present invention, data acquisition by the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user operation;
3) obtain the parameters and call the processing routine;
4) determine whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
5) determine whether the file still contains data; if so, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
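The control flow of steps 4) to 7) amounts to two nested loops with a filter. The sketch below shows that shape only. It assumes the files have already been read into memory as lists of `(block_type, raw_bytes)` pairs, and the names `acquire`, `files`, `wanted_types` and `parse` are hypothetical.

```python
def acquire(files, wanted_types, parse):
    """files maps a file name to its list of (block_type, raw_bytes) blocks."""
    output = []
    for name, blocks in files.items():          # step 4: another unread file?
        for block_type, raw in blocks:          # step 5: more data in this file?
            if block_type not in wanted_types:  # step 6: does the user need it?
                continue
            output.append((name, block_type, parse(raw)))  # step 7: parse and output
    return output


files = {
    "a.pcap": [("radar", b"\x01\x02"), ("photoelectric", b"\x03")],
    "b.pcap": [("radar", b"\x04")],
}
result = acquire(files, {"radar"}, parse=lambda raw: raw.hex())
```

Here only the radar blocks are kept, and each one is rendered as a hexadecimal string in place of a real format-specific parser.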
In a preferred technical solution of the present invention, data storage by the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
In a preferred technical solution of the present invention, building of the data correlation model by the data relation analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value;
2) scan the data again, scale the data with the bounds, then call the read_prob function to produce an svm_problem;
3) run cross-validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate and store the model;
5) end.
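Steps 1) and 2) describe a two-pass min-max scaling of the attributes before training. A minimal sketch of that scaling is given below. It does not call libsvm, the target range of [-1, 1] is an assumption, and the function names `attribute_bounds` and `scale_rows` are hypothetical.

```python
def attribute_bounds(rows):
    """First pass: find the (lower, upper) bound of every attribute column."""
    return [(min(column), max(column)) for column in zip(*rows)]


def scale_rows(rows, bounds, lower=-1.0, upper=1.0):
    """Second pass: map each attribute linearly from its bounds to [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, bounds):
            if hi == lo:
                out.append(0.0)  # constant attribute: no spread to scale
            else:
                out.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(out)
    return scaled


rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
bounds = attribute_bounds(rows)
scaled = scale_rows(rows, bounds)
```

Scaling keeps attributes with large numeric ranges from dominating those with small ranges, which matters for kernels such as RBF that depend on distances between samples. The same bounds must be reused when scaling data at prediction time.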
In a preferred technical solution of the present invention, display of real-time prediction results by the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time, then predict the results in real time with the SVM algorithm;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
With the technical solution provided above, the multi-platform aviation electronics big data system of the present invention has the following advantages over the prior art:
1. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores these data in real time on a "resource cloud" platform. Client nodes of the "resource cloud" platform obtain data from the cloud in real time, and the cloud platform ensures that the data are up to date. On this basis, the system supports building a correlation model from historical data and uses real-time data together with the correlation model to make real-time predictions, giving the pilot a degree of guidance for decision-making. Specifically, the system implements the following functions: flight data acquisition, real-time flight data sharing, flight data relation analysis and real-time decision support.
2. The present invention optimizes the Hadoop distributed storage platform and the HBase distributed database and applies them to an aviation electronics big data system, which is a first in this field. The invention integrates large-scale avionics data and stores them in a distributed manner. It can acquire, store and share data in real time and, by analyzing historical data, predict strike outcomes for real-time data, thereby providing effective decision guidance to the pilot. The prediction success rate reaches 94%.
3. The present invention uses a classification algorithm from machine learning to predict the outcome of in-flight strikes. Compared with the earlier approach of obtaining the outcome directly by simulating the flight process in software, this method is many times faster while maintaining a certain accuracy, and therefore improves the decision-making efficiency of the system-of-systems confrontation system. Because hits occur far less often than misses in strikes, the training data are imbalanced, which affects decision accuracy. We therefore innovatively apply data splitting on top of SVM to improve accuracy. With the decision support function integrated into the avionics system, the stored data can be used to train a classifier, the trained classifier can be used to predict strike outcomes in real time, and decision recommendations can be given to the aircraft in real time according to the predictions.
4. Experiments show that, in the system of the present invention, the nonlinear SVM classifier with an RBF kernel gives the highest accuracy and the two-split classifier gives the highest F1 value, so both are preferred.
5. Experiments show that the system of the present invention supports static removal of nodes and dynamic addition of nodes.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments.
Fig. 1 is a framework structure diagram of the data storage module in the system of the present invention.
Fig. 2 is an example diagram of the nonlinear SVM in the data relation analysis module of the system of the present invention.
Fig. 3 and Fig. 4 are example diagrams of data splitting in the data relation analysis module of the system of the present invention.
Fig. 5 is an overall framework diagram of the multi-platform aviation electronics big data system of the present invention.
Fig. 6 is a functional structure diagram of the multi-platform aviation electronics big data system of the present invention.
Fig. 7 is a main program flow chart of the multi-platform aviation electronics big data system of the present invention.
Fig. 8 is an example diagram of the data relation analysis application module in the system of the present invention.
Fig. 9 is a logic flow chart of the data acquisition module in the system of the present invention.
Fig. 10 is a logic flow chart of the data storage module in the system of the present invention.
Fig. 11 is a logic flow chart of the data relation analysis module in the system of the present invention.
Fig. 12 is a logic flow chart of the data relation analysis application module in the system of the present invention.
Detailed description of the embodiments
The present invention is now explained in further detail with reference to the accompanying drawings. The drawings are simplified schematic diagrams that illustrate only the basic structure of the invention, and therefore show only the components relevant to the invention.
In system-of-systems confrontation, the reference data used by the decision-maker come from multiple sensors and multiple data sources in different aircraft systems and on different platforms. Obtaining these data in real time, storing them reliably and applying them to decisions in time is the basis of a successful confrontation. To simulate this environment, the present invention uses several dedicated test devices to simulate a cluster of flight nodes, captures the data produced by sensors in a real flight environment as the data source, and connects the dedicated test devices through a switch to form a local area network that simulates data communication in the confrontation environment. Through any dedicated test device, the decision-maker can view the real-time information of every device node in the cluster and make decisions from these data. In this simulated hypothetical combat scenario, to ensure that the decision-maker obtains data in real time and reliably, the present invention proposes a multi-platform avionics big data system based on a "resource cloud".
The core of the multi-platform aviation electronics big data system of the present invention is the "resource cloud" platform built in the hypothetical combat environment: a data sharing platform built across the dedicated test devices. The platform is built on existing open-source cloud software (Hadoop, HBase) and mainly provides real-time information sharing, reliable storage and information processing between flight nodes. Its data source is the data produced by data acquisition and classification: the data acquisition module collects and classifies the raw data and then passes them to the "resource cloud" platform. Finally, the data analysis module on each node obtains the information of all nodes from the "resource cloud" platform in real time, analyzes it with the data correlation model built from historical data, and presents each node's analysis results to the decision-maker as decision guidance.
The technical solutions of the modules in the multi-platform aviation electronics big data system of the present invention are as follows:
1. Data acquisition and classification scheme
The data of the hypothetical combat environment come from different sensors and are heterogeneous, which directly makes the data of an avionics big data system complex in system-of-systems combat. The collected data need to be analyzed from multiple angles, covering both platforms and data, by big data classification techniques based on the airborne cloud platform, so as to strengthen the associations among the data and reduce the data complexity of the avionics big data system for system-of-systems combat. In a concrete implementation, different data acquisition and preprocessing modes need to be built for the multi-modal situation space. For the hypothetical combat environment, the present invention captures packets and then parses them one by one according to the data protocol to collect and classify the data in the environment, which serve as the data source of the "resource cloud" platform.
In practice, there are two commonly used schemes for system data acquisition:
(1) Packet capture
The packet capture program Wireshark obtains the packets. Wireshark displays the binary data captured from the network in its Packet Details panel according to the packet structure specifications of the different protocols. The display mainly covers the physical-layer frame summary, the data-link-layer Ethernet frame header, the internet-layer IP packet header, the transport-layer segment header and the application-layer information. The process uses the libpcap library, a powerful network packet capture function library that intercepts packets by network interface, port and protocol.
(2) Crawler
Network data collection obtains data from websites by means of web crawlers, public website APIs and similar methods. This approach extracts unstructured data from web pages and stores it in a structured way as unified local data files. It supports collecting files such as pictures, audio, video and attachments, and attachments can be associated automatically with the body text. Besides the content carried on the network, network traffic itself can be collected with bandwidth management techniques such as DPI or DFI.
Because communication between combat nodes in the hypothetical combat environment is costly, the present invention does not use a crawler to fetch data actively. Instead, it preferably uses the Wireshark software to capture, at the network switch of the actual combat environment, the packets on the system-of-systems confrontation platform as source data.
The data acquisition module parses the packets obtained from the switch and, according to each packet's source IP, destination IP and protocol, divides the data into structured and unstructured data. The content is parsed item by item from the binary files according to the communication protocol of the hypothetical combat environment. The resulting data fall into the following basic classes:
(1) Structured data
Structured data are row data that can be stored in a relational database and are logically expressed through a two-dimensional table structure. The data are distributed according to their receiving targets, such as unmanned aerial vehicles, radar simulation, electro-optics, electronic stations, three-dimensional voice warning and the cockpit.
The acquisition result divides the data into multiple data blocks. A data block carries basic information such as data type, sending source, number of targets, block length, update period, virtual link, maximum delay time and receiving port. Apart from the basic information, the main content of the different data blocks can be summarized in structured form.
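The basic information carried by a data block could be represented as a small record type. The sketch below is one possible shape only. The field names, their types and the units (bytes and milliseconds) are assumptions, because the text lists the fields without specifying formats. `DataBlockHeader` and `group_by_type` are hypothetical names, and the sample values are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBlockHeader:
    data_type: str         # e.g. the receiving target class of the block
    source: str            # sending source
    target_count: int      # number of targets
    block_length: int      # block length in bytes (assumed unit)
    update_period_ms: int  # update period (assumed unit)
    virtual_link: int      # virtual link identifier
    max_delay_ms: int      # maximum delay time (assumed unit)
    receive_port: int      # receiving port


def group_by_type(headers):
    """Collect block headers by data type so each type can be parsed by its own format."""
    groups = defaultdict(list)
    for header in headers:
        groups[header.data_type].append(header)
    return dict(groups)


headers = [
    DataBlockHeader("radar", "10.0.0.1", 2, 128, 50, 7, 20, 5001),
    DataBlockHeader("uav", "10.0.0.2", 1, 64, 100, 3, 40, 5002),
    DataBlockHeader("radar", "10.0.0.3", 4, 256, 50, 7, 20, 5001),
]
groups = group_by_type(headers)
```

A fixed record like this maps naturally onto one row of a two-dimensional table, which is what makes the data structured in the sense used above.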
(2) Unstructured data
In the big data management platform of the multi-platform aviation electronics big data system of the present invention, unstructured data mainly take forms such as pictures, audio, video and hypermedia, for example radar weather images, geographical distribution images, audio graphs of detected enemy aircraft and video streams. These data have no fixed structure. Unlike structured data, unstructured data are not easily represented with the two-dimensional logical tables of a database, but a non-relational database on the distributed cloud storage platform can store unstructured data efficiently and stably. For unstructured data, we reserve interfaces to complete these functions.
2. Data storage scheme
In a system-of-systems confrontation environment, obtaining the real-time changes of relevant information in the confrontation system accurately and in real time is a key factor in accomplishing the confrontation. Each combat node generates key avionics information in real time, including the node's own carrier aircraft data and target information. Other nodes obtain this information in real time and feed it into the confrontation decision system in real time. To achieve this, we build a "resource cloud" on the nodes of the hypothetical combat environment. After the avionics information generated by each node is collected, it is uploaded to the "resource cloud" in real time, and the other nodes query the data changes in real time. The high fault tolerance, real-time performance and reliability of the cloud platform ensure that all avionics information is available in real time and is hard to lose.
(1) " resource cloud " platform
Traditional "resource cloud" architectures fall into two types. In the first, raw data are collected at a client, which then sends the data to each storage node for distributed storage. In the second, data are collected locally at the nodes and then distributed to all storage nodes. Compared with existing big data management frameworks, the distinguishing feature of the invention is that the data source and the storage destination are the same: data are obtained in real time from multiple aircraft, stored back onto those aircraft through cloud storage, and the data of all aircraft are obtained and shared in real time. The invention therefore adopts the second architecture type.
The hypothetical platform as a whole uses a master/slave structure (as shown in Figure 1), consisting of one master node and several slave nodes. The master node acts as the master server and manages the file system namespace and client access to files. The slave nodes act as slave servers and are responsible for storing data. The system uses a write-once-read-many model, which relaxes concurrency control requirements, simplifies data consistency and supports high-throughput access.
(1) reliability
The "resource cloud" platform splits a large file into small files of fixed size and records the split in a segmentation table. Multiple copies of each small file are made and stored on different nodes. When a file is read, its parts are read according to the segmentation table, spliced together and returned to the user.
After collection and classification, the data source is written to a temporary cache on the local node's hard disk. Because nodes are complex, the information files stored by some nodes are large. When a file exceeds the default file size of the cloud platform, it is split. Splitting saves line bandwidth on the one hand and increases the fault tolerance of the system on the other.
In physical storage, the "resource cloud" of the system splits a file into multiple blocks and distributes them over multiple nodes of the cluster using an algorithm such as hashing. This allows the distributed storage system to hold sufficiently large files. Compared with backing up an unsplit file to a designated machine, splitting saves point-to-point communication bandwidth and balances the system load to some extent. In addition, if a single node fails and its data cannot be read, the file can be recovered by splicing the segments backed up on other nodes.
(2) fault-tolerance
After the "resource cloud" platform splits each file, the data blocks are backed up redundantly to other nodes by a hash algorithm. The redundancy and fault tolerance of the cloud platform are based on the fault-tolerance mechanisms of HDFS, mainly the following:
The master node marks the file parts produced by splitting and records them in the segmentation table, which serves as the basis for replication decisions. The entry for the current block is recorded, and the block is backed up redundantly to the corresponding nodes by the hash algorithm according to the table content. When a file is accessed and the current node does not hold the required segment, the request is sent to the nearest redundant backup.
Backup of the master node is handled through ZooKeeper. All nodes elect one master node and one backup-master node. The backup-master periodically takes a snapshot of the master, so that it never falls far behind the master. When the heartbeat mechanism detects that the master has crashed, the backup-master replaces it, and another backup-master is chosen through ZooKeeper's election mechanism to back up the content of the current master.
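The splitting, hash-based placement and fallback to a redundant copy described above can be illustrated with a minimal sketch. It is not the HDFS implementation: the function names, the tiny block size, the SHA-256 based placement and the in-memory `stores` dictionary are all assumptions made for illustration.

```python
import hashlib

def split_file(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def replica_nodes(file_name, index, nodes, replicas):
    """Pick `replicas` distinct nodes by hashing the block identity.
    Assumes replicas <= len(nodes)."""
    digest = hashlib.sha256(f"{file_name}:{index}".encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(replicas)]

def store_file(file_name, data, nodes, block_size=8, replicas=3):
    """Return (segmentation_table, stores); stores maps node -> blocks held."""
    stores = {node: {} for node in nodes}
    table = []
    for index, block in enumerate(split_file(data, block_size)):
        targets = replica_nodes(file_name, index, nodes, replicas)
        for node in targets:
            stores[node][(file_name, index)] = block
        table.append({"index": index, "nodes": targets})
    return table, stores

def read_file(file_name, table, stores, failed=()):
    """Reassemble the file, using another replica when a node has failed."""
    parts = []
    for entry in table:
        alive = [n for n in entry["nodes"] if n not in failed]
        if not alive:
            raise IOError(f"block {entry['index']} has no reachable replica")
        parts.append(stores[alive[0]][(file_name, entry["index"])])
    return b"".join(parts)

nodes = ["node1", "node2", "node3", "node4"]
payload = b"helicopter carrier aircraft data block"
table, stores = store_file("flight.txt", payload, nodes)
print(read_file("flight.txt", table, stores, failed={"node2"}) == payload)
```

With three replicas per block, the file can still be reassembled when any single node is unreachable, which is the recovery behaviour the text attributes to file splitting with redundant backup.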
(3) Other features of the "resource cloud"
Periodic snapshots: a snapshot stores a copy of the data at a specific time, so that a failed cluster can be rolled back to the previous normal point in time.
Staging: when data are created, the client first buffers the file data in a local temporary file, and the application's write operations are transparently redirected to this file. When the local file has accumulated one block of data, the client notifies the master node. The master node inserts the file name into the file system hierarchy and allocates a data block for it. It then replies to the client with a message containing the data node IDs (possibly several, since the nodes that store replica blocks are included) and the target data block identifier. On receiving the reply, the client flushes the local temporary file to the data block on the specified data node. Staging is used because writing directly to the remote file without any local caching would severely affect network speed and throughput. When the file is closed, any data remaining in the local temporary file are transferred to the data node. The client then notifies the master node that the file is closed, and the master node commits the file creation to persistent storage. If the master node dies before the file is closed, the file is lost.
Pipelined replication: when the client writes data to a file, the data are first written to a local file as described above. Suppose the replication factor of the file is 3. When the local file has accumulated one block of data, the client obtains a list of data nodes from the master node. The list contains the data nodes that will store replicas of the block. The client then flushes the block to the first data node. The first data node receives the data in units of 4 KB, writes each unit to its local repository and at the same time forwards it to the second data node in the list. Likewise, the second data node writes each unit to its local repository while forwarding it to the third data node, which simply writes it to its local repository. A data node can therefore receive data from the previous node while forwarding them to the next, so the data flow from one data node to the next like a pipeline.
Scalability: a large body of practical applications has shown that the distributed platform is highly scalable and can easily be extended to clusters of hundreds of nodes.
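The pipelined replication described above can be sketched as a chain of nodes that each store a 4 KB unit and pass it on. This is a simplified, single-threaded illustration: the `DataNode` class and its method names are hypothetical and do not correspond to the HDFS API.

```python
CHUNK = 4 * 1024  # units of 4 KB, as in the text

class DataNode:
    """Hypothetical data node that stores a chunk and forwards it downstream."""
    def __init__(self, name):
        self.name = name
        self.local = bytearray()   # stands in for the node's local repository
        self.downstream = None     # next node in the pipeline, if any

    def receive(self, chunk):
        self.local.extend(chunk)             # write to the local repository
        if self.downstream is not None:
            self.downstream.receive(chunk)   # forward the same unit onward

def build_pipeline(names):
    """Link the nodes in list order and return them."""
    nodes = [DataNode(name) for name in names]
    for current, following in zip(nodes, nodes[1:]):
        current.downstream = following
    return nodes

def flush_block(block, first_node):
    """Client side: push one block to the first node in 4 KB units."""
    for offset in range(0, len(block), CHUNK):
        first_node.receive(block[offset:offset + CHUNK])

pipeline = build_pipeline(["dn1", "dn2", "dn3"])   # replication factor 3
block = bytes(range(256)) * 40                      # 10240 bytes of sample data
flush_block(block, pipeline[0])
print([len(node.local) for node in pipeline])
```

In a real cluster the forwarding overlaps with receiving over the network; here the overlap is only modelled by forwarding each unit before the next one arrives.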
3. Data analysis scheme
In a system-of-systems confrontation decision system, historical data are the most valuable resource. Analyzing and refining historical information enables many functions; for example, historical strike information can support decision-making. By analyzing the outcomes of a set of historical flights and strikes, we can obtain a classifier model of the flight state and use it to predict the strike result of a node. Once this prediction model is introduced on the "resource cloud" platform, decisions can be made according to the predicted strike result of each node, which improves the decision efficiency of the confrontation system.
Given the existing flight-state data set and strike results, the problem can be treated approximately as a binary classifier whose input is the avionics information of the aircraft at missile launch together with the absolute position of the target, and whose output is hit or miss. Commonly used binary classifiers are compared, and the classifier model with the best results is applied in the decision system.
(1) classifier algorithm
The problem to be solved is a binary classification problem with labels 0 and 1. The classifier therefore has to find a surface that separates all sample points onto its two sides. That is, for any sample $x = (b_1, b_2, \dots, b_m)$, the classifier decision function $F$ is:
$F(x) = g(f(x))$
A. Linearly separable SVM
In the decision function of the linearly separable SVM classifier, $f(x) = w^T x + b$. In essence, the classifier looks for the hyperplane that separates the sample points by label and has the maximum margin, where the margin is the minimum geometric distance from the data points to the hyperplane. From a statistical point of view, positive and negative samples can be regarded as random draws from two different distributions. The larger the distance between the classification boundary and the two distributions, the smaller the probability that a sampled point falls on the wrong side of the boundary. Maximizing the margin therefore minimizes the worst-case generalization error and gives the classifier higher confidence.
With $f(x) = w^T x + b$ in the decision function, the hyperplane is $w^T x + b = 0$.
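Given the training set $T$ and the hyperplane $w^T x + b = 0$, and taking the class label in the margin formulas as $y_i \in \{-1, +1\}$ (the usual SVM encoding, assumed here; the data labels 0 and 1 map to $-1$ and $+1$), the functional margin of a sample point $(x_i, y_i)$ with respect to the hyperplane is defined as:
$\hat{\gamma}_i = y_i (w^T x_i + b)$
The geometric margin is:
$\gamma_i = \dfrac{y_i (w^T x_i + b)}{\lVert w \rVert} = \dfrac{\hat{\gamma}_i}{\lVert w \rVert}$
Let $N$ be the number of sample points. The minimum functional margin over all sample points in $T$ is defined as:
$\hat{\gamma} = \min_{i = 1, \dots, N} \hat{\gamma}_i$
The margin of the hyperplane is the minimum geometric margin over all sample points in $T$:
$\gamma = \min_{i = 1, \dots, N} \gamma_i = \dfrac{\hat{\gamma}}{\lVert w \rVert}$
Maximizing the margin can be expressed as:
$\max_{w, b} \ \gamma \quad \text{s.t.} \quad \dfrac{y_i (w^T x_i + b)}{\lVert w \rVert} \ge \gamma, \quad i = 1, \dots, N$
which can be rewritten as:
$\max_{w, b} \ \dfrac{\hat{\gamma}}{\lVert w \rVert} \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge \hat{\gamma}, \quad i = 1, \dots, N$
Scaling $w$ and $b$ by the same factor affects neither the hyperplane nor the geometric margin, while the functional margin scales proportionally. We can therefore set $\hat{\gamma} = 1$ and substitute it into the expression above. Maximizing $\dfrac{1}{\lVert w \rVert}$ is equivalent to minimizing $\dfrac{1}{2} \lVert w \rVert^2$, which gives the optimization problem of the linearly separable SVM:
$\min_{w, b} \ \dfrac{1}{2} \lVert w \rVert^2 \quad \text{s.t.} \quad y_i (w^T x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$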
This is a convex quadratic programming problem. Using Lagrangian duality, the optimal solution can be obtained by solving the dual problem; the solution process is not repeated here.
B. Nonlinear SVM
For nonlinear classification problems the decision surface is a curved surface. Through a suitable mapping, this surface becomes a hyperplane in a higher-dimensional space, so the problem can be solved with the linearly separable SVM method.
For example, when the two classes of data are distributed as two circles (as shown in Figure 2), the data are not linearly separable, and the ideal boundary is a circle rather than a line (hyperplane).
If $x_1$ and $x_2$ denote the coordinates of this two-dimensional plane, the decision surface can be written in the following form:
$a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2 = 0$
If we construct a five-dimensional space with coordinates $z_1 = x_1$, $z_2 = x_2$, $z_3 = x_1^2$, $z_4 = x_2^2$, $z_5 = x_1 x_2$, the decision surface equation above can be written in the new space as:
$a_0 + a_1 z_1 + a_2 z_2 + a_3 z_3 + a_4 z_4 + a_5 z_5 = 0$
This is exactly the equation of a hyperplane. If the data are mapped into the five-dimensional space in this way, the originally nonlinear data become linearly separable in the new space and can be handled with the linear SVM algorithm.
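As a small worked example of this mapping, the sketch below maps two-dimensional points into the five-dimensional space and evaluates a hyperplane there. The circle of radius 2 and the sample points are illustrative assumptions, not data from the invention.

```python
def feature_map(x1, x2):
    """Map (x1, x2) to (z1, ..., z5) = (x1, x2, x1^2, x2^2, x1*x2)."""
    return (x1, x2, x1 * x1, x2 * x2, x1 * x2)

def hyperplane_value(z, a):
    """Evaluate a0 + a1*z1 + ... + a5*z5 for coefficients a = (a0, ..., a5)."""
    return a[0] + sum(ai * zi for ai, zi in zip(a[1:], z))

# The circle x1^2 + x2^2 = 4 corresponds to a0 = -4, a3 = a4 = 1, others 0.
coefficients = (-4.0, 0.0, 0.0, 1.0, 1.0, 0.0)

inner_point = feature_map(1.0, 0.0)   # inside the circle
outer_point = feature_map(3.0, 0.0)   # outside the circle
print(hyperplane_value(inner_point, coefficients))   # negative side
print(hyperplane_value(outer_point, coefficients))   # positive side
```

Points inside and outside the circle land on opposite sides of a single hyperplane in the mapped space, which is why a linear classifier suffices there.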
In the solution of the linearly separable SVM, the data vectors always appear in the form of inner products. We therefore define the function that computes the inner product of two vectors in the mapped space as the kernel function, and use it to simplify inner product computation in the mapped space.
For the nonlinear case, the approach is thus to select a kernel function that maps the data into a high-dimensional space, where the problem that was linearly inseparable in the original space becomes linearly separable and can be handled with the linearly separable SVM algorithm. SVMs commonly use four kernel functions: the linear kernel (equivalent to the linearly separable SVM), the polynomial kernel, the RBF kernel and the sigmoid kernel. Their forms are given in Table 1 below.
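Table 1
| Type | Function expression |
|---|---|
| Linear kernel | $u^T v$ |
| Polynomial kernel | $(g \, u^T v + \mathrm{coef0})^{\mathrm{degree}}$ |
| RBF kernel | $\exp(-g \, \lVert u - v \rVert^2)$ |
| Sigmoid kernel | $\tanh(g \, u^T v + \mathrm{coef0})$ |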
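The four kernels of Table 1 can be written out directly. The sketch below uses plain Python lists as vectors; the default parameter values are assumptions chosen for illustration. The last lines check, for one pair of vectors, that a degree-2 polynomial kernel equals an ordinary inner product after an explicit feature mapping.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, g=1.0, coef0=0.0, degree=3):
    return (g * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, g=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * squared_distance)

def sigmoid_kernel(u, v, g=1.0, coef0=0.0):
    return math.tanh(g * dot(u, v) + coef0)

def quadratic_map(x):
    """Explicit mapping whose inner product equals (u^T v)^2 in two dimensions."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

u, v = [1.0, 2.0], [3.0, 4.0]
kernel_value = polynomial_kernel(u, v, g=1.0, coef0=0.0, degree=2)
mapped_value = dot(quadratic_map(u), quadratic_map(v))
print(kernel_value, mapped_value)
```

The kernel computes the same quantity as the inner product in the mapped space without constructing the mapped vectors, which is the simplification the text refers to.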
Data splitting
The two classes in the sample data set are present in very different proportions, which causes a class imbalance problem. We therefore split the class with the larger share of the training set into several pieces and combine each piece with the samples of the other class to form a sub-training set. Each sub-training set is trained separately to obtain a sub-classification model. The sub-models are then combined by logical operations into a new classifier that predicts on the data. This alleviates the data imbalance problem to some extent.
For example, the samples with label=0 are split into four pieces, and each piece is combined with the samples with label=1 to form four sub-training sets, which are trained to obtain four sub-classification models. Each sub-model predicts on the input data, giving four outputs. Combining these four outputs with an AND operation gives the final output, which is equivalent to a new classifier. Schematic diagrams are shown in Figure 3 and Figure 4.
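A minimal sketch of this splitting step follows. It assumes samples are held in Python lists and that the majority class is label 0; the function name, the random shuffle with a fixed seed and the round-robin partition are illustrative choices, not details given in the text.

```python
import random

def make_sub_training_sets(samples, labels, n_parts, seed=0):
    """Split the label-0 samples into n_parts pieces and pair each piece
    with all label-1 samples, giving n_parts sub-training sets."""
    majority = [s for s, y in zip(samples, labels) if y == 0]
    minority = [s for s, y in zip(samples, labels) if y == 1]
    rng = random.Random(seed)
    rng.shuffle(majority)                      # random division of label 0
    sub_sets = []
    for k in range(n_parts):
        piece = majority[k::n_parts]           # every n_parts-th sample
        features = piece + minority
        targets = [0] * len(piece) + [1] * len(minority)
        sub_sets.append((features, targets))
    return sub_sets

# 26 misses and 2 hits, roughly the 13:1 imbalance reported in the experiments.
samples = [[float(i)] for i in range(28)]
labels = [0] * 26 + [1] * 2
sub_sets = make_sub_training_sets(samples, labels, n_parts=4)
print([(targets.count(0), targets.count(1)) for _, targets in sub_sets])
```

Each sub-training set sees all minority samples but only a fraction of the majority samples, so the class ratio inside every sub-set is closer to balanced than in the full data.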
The system architecture and flow of the present invention are further elaborated with below in conjunction with the accompanying drawings:
(1) program architecture and flow scheme design
As shown in Figure 5, the multi-platform avionics big data system of the invention is divided into four modules: the data acquisition module, the data storage module, the data relation analysis module and the data relation analysis application module. The data acquisition module obtains pcap packet files from data source 1. After acquisition and classification, the data enter the data storage module, which completes the storage process. The data relation analysis module obtains training data from data source 2, with input parameters that can be specified by the user, and builds the data relation model. The model is provided to the data relation analysis application module, which performs real-time prediction and displays the result on the screen. The application module uses the cloud storage function of the data storage module for real-time storage.
Because the system is developed on a distributed platform, a complete Hadoop and HBase distributed environment must first be set up on multiple devices (six were used during development). Each device corresponds to one flight node. One of them serves as the master node and performs scheduling, display and similar operations.
1. data acquisition module
The libpcap package is used to extract the key fields from the pcap packets captured from the network: the timestamp, the source IP of the packet, the destination IP, and the data field that carries the payload. These are the time, sourceIP, destIP and data fields respectively. Messages in the simulated scenario are sent using combinations of destIP and sourceIP, from which the data block type of a packet can be preliminarily determined.
The different data blocks are then distinguished and parsed according to their formats to obtain independent data block structures, which are written back to the hard disk as text for use in the next stage.
As shown in Figure 6 and Figure 7, the data acquisition module comprises an import folder path unit, an export folder path unit and a data block selection unit. The import and export folder path units read the input and output folder paths selected by the user, and the data block selection unit reads the data block types selected by the user. The data acquisition module acquires data according to the content read by these units.
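The field extraction described above can be sketched without libpcap by reading the classic pcap file layout directly with `struct`. This is an assumption-laden illustration: it handles only little-endian pcap files with Ethernet, IPv4 and UDP framing, the helper functions that build a synthetic capture are hypothetical, and classifying a packet by its destination IP (addresses taken from Table 2) is one possible reading of the text.

```python
import socket
import struct

# Addresses from Table 2; matching on destIP is an assumption of this sketch.
BLOCK_BY_DEST_IP = {
    "224.224.0.110": "networking command",
    "224.224.0.89": "integrated target data block",
    "224.224.0.140": "helicopter carrier aircraft data block",
}

def make_frame(src_ip, dst_ip, payload):
    """Build a minimal Ethernet + IPv4 + UDP frame (checksums left at 0)."""
    eth = b"\x00" * 12 + struct.pack("!H", 0x0800)
    udp = struct.pack("!HHHH", 5000, 5000, 8 + len(payload), 0)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + 8 + len(payload), 0, 0,
                     64, 17, 0, socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return eth + ip + udp + payload

def make_pcap(records):
    """records: iterable of (ts_sec, ts_usec, frame_bytes)."""
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    for ts_sec, ts_usec, frame in records:
        out += struct.pack("<IIII", ts_sec, ts_usec, len(frame), len(frame)) + frame
    return out

def parse_pcap(buf):
    """Extract the time, sourceIP, destIP and data fields of every packet."""
    if struct.unpack_from("<I", buf, 0)[0] != 0xA1B2C3D4:
        raise ValueError("unsupported pcap magic number")
    packets, offset = [], 24                      # 24-byte global header
    while offset + 16 <= len(buf):                # 16-byte record header
        ts_sec, ts_usec, incl_len, _ = struct.unpack_from("<IIII", buf, offset)
        frame = buf[offset + 16: offset + 16 + incl_len]
        offset += 16 + incl_len
        ip_header_len = (frame[14] & 0x0F) * 4    # IPv4 header follows Ethernet
        dest_ip = socket.inet_ntoa(frame[30:34])
        packets.append({
            "time": (ts_sec, ts_usec),
            "sourceIP": socket.inet_ntoa(frame[26:30]),
            "destIP": dest_ip,
            "data": frame[14 + ip_header_len + 8:],   # skip the UDP header
            "block": BLOCK_BY_DEST_IP.get(dest_ip, "unrecognised"),
        })
    return packets

capture = make_pcap([
    (1479263176, 323000, make_frame("10.0.0.1", "224.224.0.140", b"\x01\x02")),
    (1479263176, 363000, make_frame("10.0.0.2", "224.224.0.89", b"\x03")),
])
for packet in parse_pcap(capture):
    print(packet["time"], packet["sourceIP"], packet["destIP"], packet["block"])
```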
As shown in Figure 9, the logic flow of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user operation;
3) obtain the parameters and call the processing routine;
4) determine whether the folder still contains unread files; if so, go to step 5), otherwise terminate the program;
5) determine whether the file still contains data; if so, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
2. data memory module
(1) distributed storage platform
To store data reliably, and following the design in the technical scheme, the data storage function is implemented on HDFS using an existing distributed cloud platform. The HDFS server is deployed on six dedicated test devices. Once all simulated nodes are in place (devices powered on), the HDFS start-all.sh command is run on any node, which combines the six test devices into a unified data sharing platform, each listening on the ports of its functions. When a data storage or query request arrives, the data are transmitted over the corresponding port.
The data reliability and fault tolerance of the platform are provided by the redundant backup function of HDFS.
(2) distributed data base
On top of the stable storage provided by HDFS, and in order to manage all data in a standardized way, the project implements a distributed database based on HBase. It uses Hadoop HDFS for reliable storage and the Hadoop MapReduce framework to accelerate data query operations.
The HBase table design is as follows:
In actual storage, each packet corresponds to one rowKey, and each rowKey contains the information of only one data block. HBase's column-oriented storage ensures the reliability of the system data.
(3) operational process
The operation of the module comprises two steps: data storage and data display.
Data storage: one record is emitted every 40 ms and stored in HBase. Because the sample data set is small, emission restarts from the first record once all records have been read.
Data display: a separate thread performs the reading. Every 10 ms it queries HBase in real time for all records between the previous timestamp and the current timestamp, reads the last of these records and displays it on the screen in real time.
As shown in Figure 6 and Figure 7, the data storage module comprises a read file path unit and a demonstration control unit, which are used for the data storage demonstration. The read file path unit reads the storage path of the data source file selected by the user. The demonstration control unit demonstrates the storage status of the data: it periodically reads the stored records and displays them on the panel.
As shown in Figure 10, the logic flow of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time;
6) determine whether to end the demonstration; if so, terminate, otherwise return to step 4).
3. data relation analysis module
This part mainly uses an SVM classifier (the SVM package in the code) to classify the existing data analysis results. Its core modules are the data splitter and the libsvm classifier package it calls. The splitter divides the records whose result is 0 into N parts (N is entered by the user) and combines each part with the records whose result is 1 to form N training data sets. Training with libsvm then outputs N models. At prediction time, the predictions of the N models are combined by AND/OR operations to produce the output prediction.
The operation mainly comprises the following three steps.
Data normalization: scan the data set, extract the upper and lower bounds, and normalize the data so that every variable has a balanced effect on the result.
Data splitting: because of the nature of the data, records whose result is 0 far outnumber records whose result is 1. The invention therefore adopts the splitting strategy of the technical scheme: the data whose result is 0 are divided into N parts, and each part is combined with the data whose result is 1 to form N data sources. This part is implemented in the read_prob function.
Data training: call the functions in the libsvm package (including svm_scale, svm_train and others) to train each svm_problem, generate an svm_model and dump it to the hard disk.
As shown in Figure 6 and Figure 7, the data relation analysis module comprises a training data path unit, a training parameter selection unit and a data splitting mode selection unit, which are used to build and train the model. The training data path unit reads the training data storage path selected by the user, the training parameter selection unit reads the training parameter values selected by the user, and the data splitting mode selection unit reads the data splitting mode selected by the user. The data relation analysis module builds and trains the model according to the content read by these units.
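The normalization step above can be sketched as per-attribute min-max scaling. The target range of [-1, 1] and the handling of constant columns are assumptions of this sketch; the text only states that the bounds are extracted and the data scaled.

```python
def column_bounds(rows):
    """First scan: (lower, upper) bound of each attribute."""
    return [(min(column), max(column)) for column in zip(*rows)]

def scale_rows(rows, bounds, lower=-1.0, upper=1.0):
    """Second scan: map every attribute linearly into [lower, upper]."""
    scaled = []
    for row in rows:
        scaled_row = []
        for value, (low, high) in zip(row, bounds):
            if high == low:
                scaled_row.append(lower)   # constant attribute: assumed choice
            else:
                ratio = (value - low) / (high - low)
                scaled_row.append(lower + (upper - lower) * ratio)
        scaled.append(scaled_row)
    return scaled

# Two of the seven attributes, e.g. height and speed (values are made up).
rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
bounds = column_bounds(rows)
print(scale_rows(rows, bounds))
```

The same bounds must be reused at prediction time, so that a new record is scaled exactly as the training data were.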
As shown in Figure 11, the logic flow of the data relation analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value; the 7 attributes are longitude, latitude, height, roll angle, heading angle, pitch angle and speed;
2) scan the data again and scale them with the bounds (scaling speeds up data processing during training and prediction), then call the read_prob function to produce an svm_problem;
3) run cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate and store the model;
5) end.
4. data relation analysis application module
The overall design principle of the application module is to use the data storage module for storage and to use the optimal model output by the data relation analysis module as the input model for real-time prediction on arbitrary data, as shown in Figure 8.
The predictions of multiple sub-models are combined according to the following rules:
Two-way split:
OR model: n1|n2
AND model: n1&n2
Four-way split:
AND then OR: (n1&n2)|(n3&n4)
OR then AND: (n1|n2)&(n3|n4)
Eight-way split:
AND then OR: (n1&n2&n3&n4)|(n5&n6&n7&n8)
OR then AND: (n1|n2|n3|n4)&(n5|n6|n7|n8)
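The combination rules above can be written as four small functions over a list of 0/1 sub-model outputs. The function names are hypothetical; grouping the outputs into a first half and a second half follows the four-way and eight-way formulas listed above.

```python
def all_and(outputs):
    """n1 & n2 & ... & nk"""
    return int(all(outputs))

def all_or(outputs):
    """n1 | n2 | ... | nk"""
    return int(any(outputs))

def and_then_or(outputs):
    """AND within each half, then OR between the halves."""
    half = len(outputs) // 2
    return int(all(outputs[:half]) or all(outputs[half:]))

def or_then_and(outputs):
    """OR within each half, then AND between the halves."""
    half = len(outputs) // 2
    return int(any(outputs[:half]) and any(outputs[half:]))

four_outputs = [1, 1, 0, 0]   # predictions n1..n4 of four sub-models
print(all_and(four_outputs), all_or(four_outputs),
      and_then_or(four_outputs), or_then_and(four_outputs))
```

For two sub-models only the AND model and the OR model apply, and they correspond to `all_and` and `all_or`.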
The operation mainly comprises the following three steps.
Initialization: initialize the HBase connection, create the table and the column families, and read from the hard disk the content of the file to be stored.
Data generation: one record is emitted every 40 ms and stored in HBase. Because the sample data set is small, emission restarts from the first record once all records have been read.
Data display: a separate thread performs the reading. Every 10 ms it queries HBase in real time for all records between the previous timestamp and the current timestamp and reads the last of these records. The SVM is called on this record to make a real-time prediction, and the result is displayed on the screen.
As shown in Figure 6 and Figure 7, the data relation analysis application module comprises a model path selection unit, a read file path unit and a demonstration control unit, which are used for the data analysis demonstration. The model path selection unit reads the storage path of the trained model selected by the user, the read file path unit reads the storage path of the data source file selected by the user, and the demonstration control unit analyzes the data with the loaded model and displays the prediction on the panel.
As shown in Figure 12, the logic flow of the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time and predicting the result in real time with the SVM algorithm;
6) determine whether to end the demonstration; if so, terminate, otherwise return to step 4).
(2) Interface design
1. data acquisition module
Data acquisition is the data foundation of the "resource cloud" platform of the multi-platform avionics big data system and provides the software with the data sources needed for data analysis. The data source must be data captured with the wireshark software at the switch in the actual operating environment. The data must be in pcap format, and the destination IP and source IP of the packets must meet the following requirements:
Table 2
| Data block name | Destination IP / source IP |
|---|---|
| Networking command | 224.224.0.110 |
| Demonstration scene information | 224.224.0.107 / 224.224.0.108 |
| Integrated target data block | 224.224.0.89 |
| Helicopter carrier aircraft data block | 224.224.0.140 |
| Demonstration control information | |
The data block in the data field of each packet conforms to the protocol "XX-type demonstration system data interface protocol".
2. data memory module
The data storage module is the core of the multi-platform avionics big data system and provides the data storage function of the system. It receives the output of the data acquisition module. The input is a complete txt text file in which each line holds the content of one parsed packet, with fields separated by commas. The fields of each packet are as follows:
Helicopter carrier aircraft data block: packet timestamp, data block ID, data block time, longitude, latitude, height, pitch angle, roll angle, true heading angle, angle of attack, ground speed, north velocity, east velocity, vertical velocity
Integrated target data block: packet timestamp, data block ID, data block time, number of targets, target 1 attribute, target 1 longitude, target 1 latitude, target 1 height, target 1 azimuth, target 1 pitch angle, target 1 north velocity, target 1 east velocity, target 1 vertical velocity, target 2 attribute, ..., target 20 vertical velocity
3. data relation analysis module
The main function of the data relation analysis module is to build a data model by analyzing historical data. The module takes a set of training data as input and builds the model using the SVM classifier and the split-classification strategy. The training data must be data collected by the STK simulation software, consisting of 7 input variables and one 0/1 result value, with all fields separated by a tab ("\t"). The field information is as follows:
Longitude	Latitude	Height	Parameter 4	Parameter 5	Parameter 6	Parameter 7	0
4. data relation analysis application module
This module uses the output file of the STK simulation software as the prediction data source and the model output by the data relation analysis part as the input model. It performs real-time prediction based on the data storage module and displays the result on the interface in real time. The field information of the input data format (the output file of the STK simulation software) is as follows:
Longitude	Latitude	Height	Parameter 4	Parameter 5	Parameter 6	Parameter 7
(3) global data structures are designed
1. physical arrangement
The data structures used in the software implementation correspond to structures in the data protocol. They are defined to store the data parsed from the binary files. The implemented data structures are:
FrameHeader // stores the header information of each packet
Helicopt_Carrier_Parm // stores the field information of the carrier aircraft data block
SingleTargetParameter // stores all information of a single target
Interrgrated_Target_Parm // stores the field information of the integrated target data block; may contain multiple SingleTargetParameter
Demo_Scene_Info // demonstration scene information data block
Demo_Ctrl_Info // demonstration control information data block
Build_Net_Cmd // networking command data block
Record // once identified, each data block is stored in the system as one row of data, i.e. one Record; the length and type fields distinguish its category
2. table structure
Based on the existing data categories and on HBase's column-oriented storage, the table we design consists of multiple column families (ColumnFamily). Each column family consists of multiple attributes, and each attribute corresponds to one field in a data block. The column families in the table are:
CF_HelicoptCarrierParm // column family for the helicopter carrier aircraft data block
CF_IntegeratedTargetParm // column family for the integrated target data block
CF_STK // column family for STK data
CF_EmptyString // column family storing unstructured packets
CF_Unrecognised // column family storing unrecognized data blocks
The columns of each column family are the fields of the corresponding block. CF_EmptyString contains the packet information stored as text, for example "87a34b2345f86544e", and CF_Unrecognised contains type information.
The row keys of the table use a custom format: "row" + system time + helicopterId. For example, the row key "row147926317632301" means that at system time 1479263176323 (milliseconds since 00:00 on January 1, 1970), the node numbered 01 stored the data on the cloud platform.
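The row key format above can be sketched as a pair of helper functions. The two-digit, zero-padded node number is an assumption inferred from the single example in the text, and the function names are hypothetical.

```python
def make_row_key(system_time_ms, helicopter_id):
    """Build "row" + system time (ms) + two-digit node number."""
    return f"row{system_time_ms}{helicopter_id:02d}"

def parse_row_key(row_key):
    """Split a row key back into (system_time_ms, helicopter_id)."""
    if not row_key.startswith("row"):
        raise ValueError("row key must start with 'row'")
    body = row_key[len("row"):]
    return int(body[:-2]), int(body[-2:])

key = make_row_key(1479263176323, 1)
print(key)
print(parse_row_key(key))
```

Because the time comes first in the key, the rows written by all nodes sort chronologically as long as the millisecond timestamps have the same number of digits, which suits the time-window queries used by the display thread.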
3. class formation
The main classes in the implementation are the Record class, which records row information, the HBaseEngine class, which calls the underlying HBase, and the SVMEngine class, which calls the SVM classifier. Each class carries out its calls through its own member variables.
4. constant
The constants involved in the design are mainly field names. They are numerous and are not listed in detail here.
The effect of the present invention is verified below by way of specific experiment:
1. classifier algorithm evaluation and test experiment
(1) data set
The experiment uses 4,497,432 original flight data samples, of which 316,768 are hits (label=1) and 4,180,664 are misses (label=0). The original data are divided evenly into three sets, the train set, the validation set and the test set, in proportions of 50%, 25% and 25%. The train set is used to train the classifiers. The validation set is used to test the performance of the different classifiers and to determine the structure of the classification model or the parameters that control model complexity. The test set is used to check the performance of the optimal classification model finally chosen.
(2) experimental result
The different classifier algorithms are tested, the experimental results are evaluated, and the optimal classifier model is chosen and verified with the test set.
A. Linearly separable SVM
The linearly separable SVM is implemented with Liblinear. The test results are given in Table 3 below:
Table 3
| Accuracy | Precision | Recall | F1 |
|---|---|---|---|
| 92.9669% | 0 | 0 | 0 |
Because the number of examples with label=1 in the data set is far smaller than the number with label=0 (the ratio is about 1:13), the linear SVM predicts 0 for every sample, which is clearly meaningless.
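The evaluation measures used in these tables can be computed as in the following sketch, which also shows why a classifier that always predicts 0 scores high accuracy but zero precision, recall and F1. The toy label list with a 13:1 imbalance is illustrative; returning 0 for an undefined precision or recall is a convention assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) for the positive class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall, f1_score(precision, recall)

# Thirteen misses for every hit, and a classifier that always answers 0.
y_true = [0] * 13 + [1]
always_zero = [0] * 14
print(evaluate(y_true, always_zero))
```

Accuracy alone is therefore a poor guide on this data set, which is why the later tables also report precision, recall and F1.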
B. Nonlinear SVM
The different types of nonlinear SVM are implemented with Libsvm. The test results are given in Table 4 below:
Table 4
| Kernel function | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| Linear kernel | 92.9669% | 0 | 0 | 0 |
| Polynomial kernel | 92.9669% | 0 | 0 | 0 |
| RBF kernel | 94.3549% | 0.599 | 0.596 | 0.597 |
| Sigmoid kernel | 85.9684% | 0 | 0 | 0 |
The RBF kernel gives the best result: its accuracy reaches 94.4%, and its prediction rate for label 1 also exceeds 50%.
C. data are split
Sub- training set is trained with above-mentioned libsvm RBF core types, because its effect is best.
I. two segmentation
By label=0 training data random division into two pieces, two sub- training sets are constituted for 1 data with label, Training obtain two model, validation set are predicted respectively, two output are obtained, by with and/or two kinds pass System processing output obtains final classification result.Test result such as table 5 below:
Table 5
Relation  accuracy  precision  recall  F1
AND       94.1015%  0.556      0.806   0.658
OR        94.0866%  0.554      0.811   0.659
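The splitting scheme can be sketched without any SVM library: the label=0 samples are divided into N blocks, each block is paired with all label=1 samples, and the per-model outputs are merged with AND or OR. The function names and the round-robin assignment of shuffled samples to blocks are assumptions; training each sub-set with libsvm is left out.

```python
import random


def build_sub_training_sets(samples, labels, n_parts, seed=0):
    """Split label=0 samples into n_parts blocks and pair each block
    with all label=1 samples, giving n_parts sub-training sets."""
    rng = random.Random(seed)
    negatives = [s for s, y in zip(samples, labels) if y == 0]
    positives = [s for s, y in zip(samples, labels) if y == 1]
    rng.shuffle(negatives)

    subsets = []
    for k in range(n_parts):
        block = negatives[k::n_parts]  # every n_parts-th shuffled sample
        xs = positives + block
        ys = [1] * len(positives) + [0] * len(block)
        subsets.append((xs, ys))
    return subsets


def combine_and(outputs):
    """Predict 1 only where every model predicts 1."""
    return [int(all(votes)) for votes in zip(*outputs)]


def combine_or(outputs):
    """Predict 1 where at least one model predicts 1."""
    return [int(any(votes)) for votes in zip(*outputs)]
```

Each sub-training set is far less imbalanced than the original data, which is consistent with the higher recall reported for the split classifiers.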
ii. Four-way split
The label=0 training data is randomly divided into four blocks, and each block is combined with the label=1 data to form four sub-training sets. Training yields four models, each of which predicts the validation set, giving four outputs; the outputs are combined by one of four relations (all AND, all OR, AND then OR, OR then AND) to obtain the final classification result. The test results are shown in Table 6 below:
Table 6
iii. Eight-way split
The label=0 training data is randomly divided into eight blocks, and each block is combined with the label=1 data to form eight sub-training sets. Training yields eight models, each of which predicts the validation set, giving eight outputs; the outputs are combined by one of four relations (all AND, all OR, AND then OR, OR then AND) to obtain the final classification result. The test results are shown in Table 7 below:
Table 7
Relation      accuracy  precision  recall  F1
All AND       91.5268%  0.453      0.984   0.620
All OR        91.2967%  0.446      0.987   0.615
AND then OR   91.4762%  0.451      0.985   0.619
OR then AND   91.3750%  0.449      0.986   0.617
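The four combination relations can be sketched as follows. The text does not say how the outputs are grouped for the two mixed relations, so the grouping into adjacent pairs (`group_size=2`) is purely an assumption, as are the function names.

```python
def all_and(outputs):
    """1 only where every model output is 1."""
    return [int(all(votes)) for votes in zip(*outputs)]


def all_or(outputs):
    """1 where at least one model output is 1."""
    return [int(any(votes)) for votes in zip(*outputs)]


def _groups(outputs, group_size):
    """Split the list of model outputs into consecutive groups."""
    return [outputs[i:i + group_size]
            for i in range(0, len(outputs), group_size)]


def and_then_or(outputs, group_size=2):
    """AND inside each group, then OR across the group results.
    The grouping itself is an assumption, not taken from the source."""
    return all_or([all_and(g) for g in _groups(outputs, group_size)])


def or_then_and(outputs, group_size=2):
    """OR inside each group, then AND across the group results."""
    return all_and([all_or(g) for g in _groups(outputs, group_size)])
```

Here `outputs` is a list with one 0/1 prediction list per model, all of the same length.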
iv. Two-thirds split
The label=0 training data is randomly divided into three blocks, and every pair of blocks is combined with the label=1 data to form three sub-training sets. Training yields three models, each of which predicts the validation set, giving three outputs; the outputs are combined by an AND relation or an OR relation to obtain the final classification result. The test results are shown in Table 8 below:
Table 8
Relation  accuracy  precision  recall  F1
AND       94.3033%  0.575      0.729   0.643
OR        94.2959%  0.574      0.734   0.644
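In the two-thirds split each sub-training set sees two of the three label=0 blocks. A minimal sketch, assuming a round-robin assignment of shuffled samples to blocks and the illustrative name `two_thirds_subsets`:

```python
import random
from itertools import combinations


def two_thirds_subsets(negatives, positives, seed=0):
    """Split label=0 samples into three blocks and build one
    sub-training set per pair of blocks, each with all label=1 samples."""
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    blocks = [shuffled[k::3] for k in range(3)]

    subsets = []
    for a, b in combinations(range(3), 2):  # (0,1), (0,2), (1,2)
        part = blocks[a] + blocks[b]
        xs = list(positives) + part
        ys = [1] * len(positives) + [0] * len(part)
        subsets.append((xs, ys))
    return subsets
```

Every label=0 sample therefore appears in exactly two of the three sub-training sets.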
D. Verification experiment
From the experiments above, it can be seen that the plain nonlinear SVM classifier with the RBF kernel has the highest accuracy, and the two-way split classifier has the highest F1 value. A verification experiment is carried out on these two optimal classification models with the test set; the results are shown in Table 9 below:
Table 9
The verification shows that the performance of both classifiers is basically consistent with the test results above, confirming that they are indeed optimal.
2. Multi-platform aviation electronics big data system test
A. Data acquisition module test
Run the software system, enter the data acquisition module, set the parameters, and start acquisition. The output data block files are checked and found to be correct, which shows that the acquisition function works normally. Different block selection parameters are then set and the sizes of the output data are checked; they differ, which shows that the acquisition module can acquire data blocks of multiple formats.
B. Data storage module test
Run the software system, enter the data acquisition module, and start the demonstration. Observe the data on the Dashboard panel: as the program runs, the panel shows the status information of each node in the cluster in real time, and the flight data can be seen being stored, which shows that the module can store the data of each node in real time.
C. Data relation analysis module test
Run the software system and enter the data relation analysis module. Different kernel function parameters and split parameters are selected to train on the input data set, and classification models are obtained successfully, which shows that the module can perform data analysis in different ways.
D. Data relation analysis application module test
Run the software system, enter the data relation analysis application module, select the parameters, and start the demonstration. The interface displays the flight data of all nodes and the predicted Strike results in real time, which shows that the module can store and predict flight data in real time.
E. System node static reduction test
Following the corresponding method, the system nodes are statically reduced from 6 to 4. The number of Hadoop and HBase nodes in the cluster is checked and has become 4, which shows that the system supports static reduction of nodes.
F. System node dynamic increase test
Following the corresponding method, the system nodes are dynamically increased from the 4 of the previous test to 6, and the system software is run on the newly added nodes. The node information on the data storage function interface of the system is checked and has changed from the original 4 to 6, which shows that the system supports dynamic addition of nodes.
Taking the above preferred embodiments of the present invention as guidance, and based on the above description, persons skilled in the relevant art can make various changes and modifications without departing from the scope of the technical idea of the present invention. The technical scope of the present invention is not limited to the content of the specification and must be determined according to the scope of the claims.

Claims (10)

1. A multi-platform aviation electronics big data system, characterized in that it comprises a data acquisition module, a data storage module, a data relation analysis module, and a data relation analysis application module;
The data acquisition module obtains pcap data packet files from data source 1, classifies the acquired data, and passes it to the data storage module, completing the data storage process; the data relation analysis module obtains training data from data source 2 and completes the establishment of the data association model, and the model is provided to the data relation analysis application module, which completes real-time prediction and displays the results on screen; the data relation analysis application module uses the cloud storage function implemented by the data storage module to complete real-time storage.
2. The system as claimed in claim 1, characterized in that the data acquisition module comprises an input folder path unit, an output folder path unit, and a data block selection unit; the input folder path unit and the output folder path unit are used to read the input and output folder paths selected by the user, the data block selection unit is used to read the data block type selected by the user, and the data acquisition module performs data acquisition according to the content read by the above units;
The data acquisition module uses the libpcap package to obtain, from the pcap packets captured from the network, the key time information field, the source IP and destination IP information of the packet, and the data field that stores the information, namely the time field, the sourceIP field, the destIP field, and the data field; it uses the combination of destIP and sourceIP to simulate the sending of data in the scenario and preliminarily determines the packet information data blocks; it distinguishes the different data blocks, parses them in their respective formats to obtain independent data block structures, and writes the data structures back to the hard disk in text form for use by the next stage.
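Extraction of the time, sourceIP, destIP, and data fields from a pcap file can be sketched with the Python standard library alone. This is not the libpcap-based implementation the claim refers to: it assumes the classic pcap file format, an Ethernet link layer, and IPv4, and it treats everything after the IPv4 header as the data field. The function names are illustrative.

```python
import io
import socket
import struct


def read_pcap_records(stream):
    """Yield dicts with time, sourceIP, destIP and data for each
    Ethernet/IPv4 packet in a classic pcap stream (assumed format)."""
    header = stream.read(24)  # pcap global header is 24 bytes
    if len(header) < 24:
        return
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic pcap file")

    while True:
        rec = stream.read(16)  # per-packet record header
        if len(rec) < 16:
            break
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", rec
        )
        frame = stream.read(incl_len)
        if len(frame) < incl_len:
            break
        # Need a 14-byte Ethernet header plus a minimal 20-byte IPv4
        # header, and the EtherType must be IPv4 (0x0800).
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip_header_len = (frame[14] & 0x0F) * 4
        yield {
            "time": ts_sec + ts_usec / 1_000_000,
            "sourceIP": socket.inet_ntoa(frame[26:30]),
            "destIP": socket.inet_ntoa(frame[30:34]),
            "data": frame[14 + ip_header_len:],
        }


def parse_pcap_bytes(raw):
    """Convenience wrapper for an in-memory pcap file."""
    return list(read_pcap_records(io.BytesIO(raw)))
```

Grouping the records by the (sourceIP, destIP) pair would then give a first approximation of the data blocks described in the claim.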
3. The system as claimed in claim 1, characterized in that the data storage module comprises a read file path unit and a demonstration control unit; the read file path unit is used to read the data source file storage path selected by the user; the demonstration control unit is used to demonstrate the storage status of the data, periodically reading the stored records and displaying them on the panel; the data storage module uses the Hadoop distributed storage platform and the HBase distributed database, obtains data from multiple aircraft in real time, stores it on the multiple aircraft by cloud storage, and obtains and shares the data of the multiple aircraft in real time.
4. The system as claimed in claim 1, characterized in that the data relation analysis module comprises a training data path unit, a training parameter selection unit, and a data split mode selection unit; the training data path unit is used to read the training data storage path selected by the user, the training parameter selection unit is used to read each training parameter value selected by the user, the data split mode selection unit is used to read the data split mode selected by the user, and the data relation analysis module establishes and trains the model according to the content read by the above units;
The data relation analysis module uses an SVM classifier and the corresponding SVM code package to classify existing data and analysis results by the SVM method; its core modules are the data splitting program and the libsvm classifier package that it calls. The splitting program splits the records in the data source whose result is 0 into N parts, where N is input by the user, and combines each part with the records whose result is 1 to form N training data sets; N models are output after training with libsvm, and during prediction the results predicted by the N models are combined by AND/OR operations to output the prediction result; the establishment of the data association model in the data relation analysis module is completed with input parameters specified by the user.
5. The system as claimed in claim 1, characterized in that the data relation analysis application module comprises a model path selection unit, a read file path unit, and a demonstration control unit; the model path selection unit is used to read the training model storage path selected by the user, the read file path unit is used to read the data source file storage path selected by the user, and the demonstration control unit analyzes the data with the model that was read and displays the prediction results on the panel.
6. An implementation method of the system as claimed in any one of claims 1-5, characterized in that it comprises the data acquisition implementation of the data acquisition module, the data storage implementation of the data storage module, the implementation by which the data relation analysis module establishes the data association model, and the real-time prediction result display implementation of the data relation analysis application module.
7. The method as claimed in claim 6, characterized in that the data acquisition implementation of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user operation;
3) obtain the parameters and call the processing program;
4) judge whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
5) judge whether the file still contains data; if so, go to step 6), otherwise return to step 4);
6) judge whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
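Steps 4) to 7) amount to two nested loops with a filter. The sketch below shows that control flow only; the one-block-per-line text format handled by `read_text_blocks`, the output file naming, and the callback signatures are hypothetical stand-ins for the real block formats.

```python
from pathlib import Path


def read_text_blocks(path):
    """Hypothetical block reader: one block per line, 'TYPE payload'."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            block_type, _, payload = line.rstrip("\n").partition(" ")
            if block_type:
                yield block_type, payload


def acquire(input_dir, output_dir, wanted_types, read_blocks, parse_block):
    """Loop over files and blocks, keep wanted blocks, write them as text.

    Returns the number of blocks written.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for path in sorted(Path(input_dir).iterdir()):  # step 4: next unread file
        if not path.is_file():
            continue
        out_path = out_dir / (path.stem + ".txt")
        with out_path.open("w", encoding="utf-8") as out:
            for block_type, payload in read_blocks(path):  # step 5: more data?
                if block_type not in wanted_types:         # step 6: wanted?
                    continue
                out.write(parse_block(block_type, payload) + "\n")  # step 7
                written += 1
    return written
```

Passing the reader and parser in as functions keeps the loop independent of any particular block format.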
8. The method as claimed in claim 6, characterized in that the data storage implementation of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) import the local data into memory;
4) start the demonstration;
5) upload the real-time data to HBase, while obtaining all node data from HBase in real time;
6) judge whether to end the demonstration; if so, end, otherwise return to step 4).
9. The method as claimed in claim 6, characterized in that the implementation by which the data relation analysis module establishes the data association model comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value;
2) scan the data again, scale the data using the bounds, and call the read_prob function to produce an svm_problem;
3) perform cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem, generate the model, and store it;
5) end.
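Steps 1) and 2) describe a two-pass scaling of the attributes before the data is handed to libsvm. A minimal sketch of that scaling follows; the target range [-1, 1] and the handling of constant attributes are assumptions, and the libsvm calls themselves are not shown.

```python
def attribute_bounds(rows):
    """First pass: lower and upper bound of every attribute."""
    lows = list(rows[0])
    highs = list(rows[0])
    for row in rows[1:]:
        for j, value in enumerate(row):
            lows[j] = min(lows[j], value)
            highs[j] = max(highs[j], value)
    return lows, highs


def scale_rows(rows, lows, highs, lower=-1.0, upper=1.0):
    """Second pass: map each attribute linearly onto [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, lo, hi in zip(row, lows, highs):
            if hi == lo:
                # Constant attribute: the midpoint is an arbitrary choice.
                out.append((lower + upper) / 2)
            else:
                out.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(out)
    return scaled
```

The same bounds computed on the training data would normally be reused when scaling the data to be predicted, so that both are mapped consistently.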
10. The method as claimed in claim 6, characterized in that the real-time prediction result display implementation of the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) import the local data into memory;
4) start the demonstration;
5) upload the real-time data to HBase, while obtaining all node data from HBase in real time and then using the SVM algorithm to predict the results in real time;
6) judge whether to end the demonstration; if so, end, otherwise return to step 4).
CN201710367759.7A 2017-05-23 2017-05-23 Multi-platform aviation electronics big data system and method Pending CN107229695A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710367759.7A CN107229695A (en) 2017-05-23 2017-05-23 Multi-platform aviation electronics big data system and method
PCT/CN2017/106322 WO2018214388A1 (en) 2017-05-23 2017-10-16 Multi-platform big data system and method for aviation electronics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710367759.7A CN107229695A (en) 2017-05-23 2017-05-23 Multi-platform aviation electronics big data system and method

Publications (1)

Publication Number Publication Date
CN107229695A true CN107229695A (en) 2017-10-03

Family

ID=59933807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710367759.7A Pending CN107229695A (en) 2017-05-23 2017-05-23 Multi-platform aviation electronics big data system and method

Country Status (2)

Country Link
CN (1) CN107229695A (en)
WO (1) WO2018214388A1 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108052616A (en) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 Aviation big data intelligent analysis method based on remote embedded data acquisition
CN108052617A (en) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 Aviation big data intelligent analysis system based on remote embedded data acquisition
CN108092802A (en) * 2017-12-04 2018-05-29 中国船舶重工集团公司第七〇九研究所 The numerical prediction maintenance system and method for ocean nuclear power platform nuclear power unit
CN108228378A (en) * 2018-01-05 2018-06-29 中车青岛四方机车车辆股份有限公司 The data processing method and device of train groups failure predication
CN108650229A (en) * 2018-04-03 2018-10-12 国家计算机网络与信息安全管理中心 A kind of network application behavior parsing restoring method and system
CN108762225A (en) * 2018-04-24 2018-11-06 中国商用飞机有限责任公司北京民用飞机技术研究中心 A kind of failure in flight control system copes with decision making device
WO2018214388A1 (en) * 2017-05-23 2018-11-29 深圳大学 Multi-platform big data system and method for aviation electronics
WO2018214387A1 (en) * 2017-05-23 2018-11-29 深圳大学 Distributed mining system and method for aviation-oriented electronic data
CN109408694A (en) * 2018-09-25 2019-03-01 广东中标数据科技股份有限公司 A kind of customs reaches a standard shipping bill analysis method, system and device
CN109828988A (en) * 2019-01-25 2019-05-31 重庆科技学院 A kind of big data statistical method and the system for big data statistics
CN110472122A (en) * 2019-07-31 2019-11-19 重庆古扬科技有限公司 A kind of dynamic distributed academic resources acquisition method of multichannel
CN111078687A (en) * 2019-11-14 2020-04-28 青岛民航空管实业发展有限公司 Flight operation data fusion method, device and equipment
CN111190992A (en) * 2019-12-10 2020-05-22 华能集团技术创新中心有限公司 Mass storage method and storage system for unstructured data
CN111753926A (en) * 2020-07-07 2020-10-09 广州驰兴通用技术研究有限公司 Data sharing method and system for smart city
CN111881213A (en) * 2020-07-28 2020-11-03 东航技术应用研发中心有限公司 System for storing, processing and using flight big data
CN112084148A (en) * 2020-09-18 2020-12-15 陕西千山航空电子有限责任公司 Comprehensive application platform for aviation objective information
CN112182094A (en) * 2019-07-01 2021-01-05 成都启英泰伦科技有限公司 Big data distributed storage method in voice data character text form
CN112416753A (en) * 2020-11-02 2021-02-26 中关村科学城城市大脑股份有限公司 Method, system and equipment for standardized management of urban brain application scene data
CN113159371A (en) * 2021-01-27 2021-07-23 南京航空航天大学 Unknown target feature modeling and demand prediction method based on cross-modal data fusion
CN114168243A (en) * 2021-11-23 2022-03-11 广西电网有限责任公司 Dashbird multi-chart-based system and method for dynamically merging data
CN115378674A (en) * 2022-08-11 2022-11-22 国网湖南综合能源服务有限公司 Application method and system of shared energy storage cloud platform based on block chain technology
CN115857899A (en) * 2022-11-16 2023-03-28 电子科技大学 Heterogeneous data packet-oriented analysis software automatic construction method

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807552B (en) * 2019-10-30 2023-07-25 合肥工业大学 Urban electric bus driving condition construction method based on improved K-means
WO2021195943A1 (en) * 2020-03-31 2021-10-07 深圳市大疆创新科技有限公司 Flight record data storage method, flight record data acquisition method, and unmanned aerial vehicle
CN111737529B (en) * 2020-07-23 2020-12-18 北京东方通科技股份有限公司 Multi-source heterogeneous data acquisition method
CN113177022A (en) * 2021-04-29 2021-07-27 东北大学 Full-process big data storage method for aluminum/copper plate strip production
CN113656370B (en) * 2021-08-16 2024-04-30 南方电网数字电网集团有限公司 Data processing method and device for electric power measurement system and computer equipment
CN113688100B (en) * 2021-09-06 2023-07-18 北京普睿德利科技有限公司 Meteorological data processing method, device, terminal and storage medium
CN115225730A (en) * 2022-07-05 2022-10-21 北京赛思信安技术股份有限公司 High-concurrency offline data packet analysis method supporting multiple tasks
CN115168396A (en) * 2022-07-15 2022-10-11 全图通位置网络有限公司 Comprehensive intelligent platform data management method and system based on spatio-temporal system
CN115474021B (en) * 2022-07-19 2023-08-08 北京普利永华科技发展有限公司 Satellite transponder data processing method and system under multi-component combined control
CN115269704B (en) * 2022-08-02 2023-08-18 贵州财经大学 Multi-element heterogeneous agricultural data management system
CN116303729B (en) * 2023-05-17 2023-08-01 北京煜象软件技术有限公司 Information acquisition method, device, equipment and medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573068A (en) * 2015-01-23 2015-04-29 四川中科腾信科技有限公司 Information processing method based on megadata

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10644950B2 (en) * 2014-09-25 2020-05-05 At&T Intellectual Property I, L.P. Dynamic policy based software defined network mechanism
CN104394211A (en) * 2014-11-21 2015-03-04 浪潮电子信息产业股份有限公司 Design and implementation method for user behavior analysis system based on Hadoop
CN107229695A (en) * 2017-05-23 2017-10-03 深圳大学 Multi-platform aviation electronics big data system and method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘进军: "基于惩罚的SVM和集成学习的非平衡数据分类算法研究", 《计算机应用与软件》 *
勒加雷: "《嵌入式协议栈μC\TCP-IP 基于STM32微控制器》", 31 January 2013 *
高红旭,等: "大数据技术在民航空管监控系统中的应用", 《现代导航》 *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214388A1 (en) * 2017-05-23 2018-11-29 深圳大学 Multi-platform big data system and method for aviation electronics
WO2018214387A1 (en) * 2017-05-23 2018-11-29 深圳大学 Distributed mining system and method for aviation-oriented electronic data
CN108092802A (en) * 2017-12-04 2018-05-29 中国船舶重工集团公司第七〇九研究所 The numerical prediction maintenance system and method for ocean nuclear power platform nuclear power unit
CN108052616A (en) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 Aviation big data intelligent analysis method based on remote embedded data acquisition
CN108052617A (en) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 Aviation big data intelligent analysis system based on remote embedded data acquisition
CN108228378A (en) * 2018-01-05 2018-06-29 中车青岛四方机车车辆股份有限公司 The data processing method and device of train groups failure predication
CN108650229A (en) * 2018-04-03 2018-10-12 国家计算机网络与信息安全管理中心 A kind of network application behavior parsing restoring method and system
CN108650229B (en) * 2018-04-03 2021-07-16 国家计算机网络与信息安全管理中心 Network application behavior analysis and restoration method and system
CN108762225A (en) * 2018-04-24 2018-11-06 中国商用飞机有限责任公司北京民用飞机技术研究中心 A kind of failure in flight control system copes with decision making device
CN109408694A (en) * 2018-09-25 2019-03-01 广东中标数据科技股份有限公司 A kind of customs reaches a standard shipping bill analysis method, system and device
CN109828988A (en) * 2019-01-25 2019-05-31 重庆科技学院 A kind of big data statistical method and the system for big data statistics
CN112182094A (en) * 2019-07-01 2021-01-05 成都启英泰伦科技有限公司 Big data distributed storage method in voice data character text form
CN110472122A (en) * 2019-07-31 2019-11-19 重庆古扬科技有限公司 A kind of dynamic distributed academic resources acquisition method of multichannel
CN111078687A (en) * 2019-11-14 2020-04-28 青岛民航空管实业发展有限公司 Flight operation data fusion method, device and equipment
CN111190992A (en) * 2019-12-10 2020-05-22 华能集团技术创新中心有限公司 Mass storage method and storage system for unstructured data
CN111190992B (en) * 2019-12-10 2023-09-08 华能集团技术创新中心有限公司 Mass storage method and storage system for unstructured data
CN111753926A (en) * 2020-07-07 2020-10-09 广州驰兴通用技术研究有限公司 Data sharing method and system for smart city
CN111881213A (en) * 2020-07-28 2020-11-03 东航技术应用研发中心有限公司 System for storing, processing and using flight big data
CN111881213B (en) * 2020-07-28 2021-03-19 东航技术应用研发中心有限公司 System for storing, processing and using flight big data
CN112084148A (en) * 2020-09-18 2020-12-15 陕西千山航空电子有限责任公司 Comprehensive application platform for aviation objective information
CN112416753A (en) * 2020-11-02 2021-02-26 中关村科学城城市大脑股份有限公司 Method, system and equipment for standardized management of urban brain application scene data
CN113159371A (en) * 2021-01-27 2021-07-23 南京航空航天大学 Unknown target feature modeling and demand prediction method based on cross-modal data fusion
CN114168243A (en) * 2021-11-23 2022-03-11 广西电网有限责任公司 Dashbird multi-chart-based system and method for dynamically merging data
CN114168243B (en) * 2021-11-23 2024-04-02 广西电网有限责任公司 Data system and method based on dashboard multi-chart dynamic merging
CN115378674A (en) * 2022-08-11 2022-11-22 国网湖南综合能源服务有限公司 Application method and system of shared energy storage cloud platform based on block chain technology
CN115378674B (en) * 2022-08-11 2023-11-03 国网湖南综合能源服务有限公司 Application method and system of shared energy storage cloud platform based on blockchain technology
CN115857899A (en) * 2022-11-16 2023-03-28 电子科技大学 Heterogeneous data packet-oriented analysis software automatic construction method
CN115857899B (en) * 2022-11-16 2023-12-15 电子科技大学 Heterogeneous data packet-oriented automatic construction method for analysis software

Also Published As

Publication number Publication date
WO2018214388A1 (en) 2018-11-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171003