CN107229695A - Multi-platform aviation electronics big data system and method - Google Patents
Multi-platform aviation electronics big data system and method
- Publication number: CN107229695A (application CN201710367759.7A)
- Authority: CN (China)
- Prior art keywords: data, module, storage, relation analysis, real
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Abstract
The invention discloses a multi-platform aviation electronics big data system comprising a data acquisition module, a data storage module, a data relation analysis module and a data relation analysis application module. The data acquisition module obtains pcap packet files from data source 1, classifies the captured data and passes it to the data storage module, which completes the storage process. The data relation analysis module obtains training data from data source 2 and builds a data association model. The model is provided to the data relation analysis application module, which performs real-time prediction, displays the results on screen, and uses the cloud storage function of the data storage module to store data in real time. The invention also discloses a method for implementing the system. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores it in real time on a "resource cloud" platform, which ensures that the data is available in real time.
Description
Technical field
The invention belongs to the field of computing and relates to an aviation flight data system, in particular to a multi-platform aviation electronics big data system. It also relates to a method for implementing the multi-platform aviation electronics big data system.
Background
Aviation flight operation is a large integrated system. Throughout a flight, large quantities of varied data must be passed between departments and posts, such as crew information, meteorological conditions, navigation information, route risk assessments, manifest information, takeoff data and contingency plans for special situations. Because of limits in technology and management practice, data has traditionally been passed by telephone, paper documents, handbooks and the like. These traditional methods have many shortcomings and have even become a bottleneck for the continued development of the civil aviation industry. Aeronautical data has a critical influence on the safe takeoff and the economic benefit of every flight. Aeronautical data is multi-source, complex and large in scale, and existing single-platform data systems are of limited use for it. A multi-platform aviation electronics big data system for this large-scale multi-source flight data is therefore urgently needed.
The present invention is built on existing distributed frameworks and database platforms. The commonly used distributed frameworks and database platforms are described below.
1. Hadoop
Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs with it without knowing the low-level details of distribution, making full use of a cluster's capacity for high-speed computation and storage. Put simply, Hadoop is a software platform that makes it easier to develop and run applications that process large-scale data. The platform is implemented in the object-oriented programming language Java and is therefore highly portable.
The core of Hadoop is HDFS and MapReduce.
HDFS (Hadoop Distributed File System) is a distributed file system. It hides lower-level details such as load balancing and redundant replication, and offers upper-layer programs a unified file system API. HDFS is specially optimized for the characteristics of massive data: access to very large files, read operations that far outnumber write operations, and node failures caused by commodity machines that break down easily. HDFS splits files into 64 MB blocks, distributes them across the machines of the cluster, and stores them using the Linux file system. Each block is kept in at least three redundant copies. At the center is a NameNode, which locates file blocks according to the file index.
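As a rough illustration of the block and replica figures above, the following sketch computes how many blocks a file occupies and how much raw storage its copies consume. The 64 MB block size and the replication factor of 3 are taken from the text; the function name `hdfs_footprint` is hypothetical, and real clusters may be configured with other values.

```python
import math

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the block size quoted in the text
REPLICATION = 3                # the text states at least three copies per block


def hdfs_footprint(file_size_bytes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (blocks, stored block replicas, raw bytes stored) for one file."""
    blocks = math.ceil(file_size_bytes / block_size) if file_size_bytes > 0 else 0
    # The last block only holds the remainder, so raw usage scales with file size.
    return blocks, blocks * replication, file_size_bytes * replication


blocks, replicas, raw_bytes = hdfs_footprint(200 * 1024 * 1024)
print(blocks, replicas, raw_bytes)
```

Under these assumptions a 200 MB file is cut into four blocks, the last of which is only partly filled.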
MapReduce is a programming model that extracts and analyzes elements from massive data and finally returns a result set. Most distributed computations can be abstracted as MapReduce operations. Map decomposes the input into intermediate key-value pairs. Reduce combines the key-value pairs output by Map according to their keys and produces the final output. The programmer supplies these two functions to the system, and the underlying infrastructure distributes the Map and Reduce operations across the cluster and stores the results on HDFS.
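The Map and Reduce roles described above can be sketched in a single process. This is only an illustration of the programming model, not Hadoop's API: `run_mapreduce`, `map_phase` and `reduce_phase` are hypothetical names, and the example job (counting packets per source IP) is chosen to match the data handled later in this document.

```python
from collections import defaultdict


def map_phase(record):
    """Map: decompose one input record into intermediate (key, value) pairs."""
    yield record["sourceIP"], 1


def reduce_phase(key, values):
    """Reduce: combine all values that share a key into one output pair."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Group mapper output by key, then reduce each group (single process)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


packets = [
    {"sourceIP": "10.0.0.1"},
    {"sourceIP": "10.0.0.2"},
    {"sourceIP": "10.0.0.1"},
]
print(run_mapreduce(packets, map_phase, reduce_phase))
```

In a real cluster the grouping step is the shuffle performed by the framework, and the mapper and reducer run on different machines.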
Hadoop has the following advantages, which let users easily develop and run applications that process massive data.
High reliability: Hadoop automatically maintains backup copies of data and automatically redeploys computing tasks after a task fails.
High scalability: Hadoop distributes data and completes computing tasks across the available computer clusters, which can easily be extended to thousands of nodes. Although it does not guarantee low latency, Hadoop therefore has very high throughput and is well suited to computation over massive data.
High efficiency: Hadoop can move data dynamically between nodes and keeps the nodes dynamically balanced, so processing is very fast.
High fault tolerance: Hadoop automatically keeps multiple copies of data and automatically reassigns failed tasks.
Low cost: Hadoop distributes and processes data on server clusters built from ordinary machines. Such clusters can total thousands of nodes, and every node runs the open-source operating system Linux, so hardware cost is greatly reduced. In addition, compared with all-in-one appliances, commercial data warehouses and the like, Hadoop is open source, so software cost is also greatly reduced.
2. HBase
HBase is short for Hadoop Database. It is a highly reliable, high-performance, column-oriented, scalable distributed storage system. Its main function is to store massive structured data in column-oriented form on top of Hadoop HDFS.
Tables stored in HBase have the following main features.
Large tables: a table can have billions of rows and millions of columns.
Schemaless: each row has a sortable primary key and any number of columns. Columns can be added dynamically as needed, and different rows in the same table can have completely different columns.
Column-oriented: storage and access control are organized by column (family), and column families are retrieved independently.
Sparse: empty (null) columns take up no storage space, so tables can be designed to be very sparse.
Multiple data versions: the data in each cell can have multiple versions. By default the version number is assigned automatically and is the timestamp at which the cell was inserted.
Single data type: all data in HBase is stored as untyped strings.
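The sparse, multi-version table model listed above can be illustrated with a small in-memory stand-in. This is not the HBase client API: the class `SparseTable`, its methods and the row and column names are hypothetical, and timestamps are passed explicitly here, whereas the text notes that HBase assigns the insertion timestamp by default.

```python
from collections import defaultdict


class SparseTable:
    """Toy in-memory model of a sparse, multi-version, column-family table."""

    def __init__(self):
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value, timestamp):
        versions = self._rows[row_key].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda item: item[0], reverse=True)

    def get(self, row_key, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        return self._rows.get(row_key, {}).get(column, [])[:max_versions]

    def columns(self, row_key):
        """Only cells that were written exist; empty cells take no space."""
        return sorted(self._rows.get(row_key, {}))


table = SparseTable()
table.put("node-01", "carrier:altitude", "8200", timestamp=1)
table.put("node-01", "carrier:altitude", "8350", timestamp=2)
table.put("node-02", "target:bearing", "045", timestamp=1)
print(table.get("node-01", "carrier:altitude", max_versions=2))
print(table.columns("node-02"))
```

The two rows hold entirely different columns, and the older altitude value remains readable as an earlier version of the same cell.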
HBase is mainly suited to the following scenarios:
● highly concurrent reads and writes
● table structures whose column families need frequent adjustment
● storage of structured or semi-structured data
● highly concurrent key-value storage
● keys written in random order but stored in sorted order
● keeping a fixed-size collection for each key
HBase also has some shortcomings and scenarios it does not suit:
● because it only provides row-level locks, HBase supports distributed transactions poorly
● for queries such as join and group by, HBase performs poorly
● if a query does not use the row key, performance can be very poor because a full table scan is carried out; building a secondary index or multiple indexes requires maintaining an index table at the same time
● support for highly concurrent random writes is limited.
In a system-of-systems confrontation environment, perceiving the data of the data sources in real time is a key problem. These data sources are usually multiple sensors, and efficiently managing the heterogeneous data they produce is one of the difficulties. To address these problems, the present invention studies existing distributed frameworks and related data analysis methods and attempts to find an effective way to process and analyze large-scale multi-source flight data. At present, no multi-platform aviation electronics big data system that applies distributed frameworks and database platforms has been reported.
Summary of the invention
The technical problem to be solved by the present invention is to provide a multi-platform aviation electronics big data system that processes and analyzes large-scale flight data. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores the data in real time on a "resource cloud" platform. Client nodes of the "resource cloud" platform obtain data from the cloud in real time, and the cloud platform ensures that the data is available in real time. On this basis, the system supports building an association model from historical data and performs real-time prediction using real-time data and the association model, providing guidance for pilots' decisions. Specifically, the system needs to realize the following functions: flight data acquisition, real-time flight data sharing, flight data association analysis and real-time decision support. The present invention also provides a method for implementing the multi-platform aviation electronics big data system.
To solve the above technical problem, the present invention provides a multi-platform aviation electronics big data system comprising a data acquisition module, a data storage module, a data relation analysis module and a data relation analysis application module.
The data acquisition module obtains pcap packet files from data source 1, classifies the captured data and passes it to the data storage module, which completes the storage process. The data relation analysis module obtains training data from data source 2 and builds the data association model. The model is provided to the data relation analysis application module, which performs real-time prediction and displays the results on screen. The data relation analysis application module uses the cloud storage function realized by the data storage module to store data in real time.
In a preferred technical solution of the present invention, the data acquisition module includes an input folder path unit, an output folder path unit and a data block selection unit. The input folder path unit and the output folder path unit read the input and output folder paths selected by the user, and the data block selection unit reads the data block types selected by the user. The data acquisition module performs data acquisition according to what these units read. Using the libpcap library, the data acquisition module extracts the key fields from the pcap packets captured from the network: the time information, the source IP and destination IP of the packet, and the data field that holds the payload. These are the time, sourceIP, destIP and data fields respectively. The combination of destIP and sourceIP represents the sender and receiver of the data in the simulated scenario, from which the data block carried by the packet is preliminarily determined. Different data blocks are then distinguished and parsed according to their respective formats to obtain independent data block structures, which are written back to disk as text for use in the next stage.
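The field extraction described above can be sketched with the Python standard library alone. The patent itself uses libpcap; this sketch instead reads the classic pcap file layout directly and, as a simplifying assumption, handles only little-endian captures of Ethernet frames carrying IPv4. The function names `parse_pcap` and `build_demo_pcap` and the sample addresses are hypothetical; the output keys follow the field names given in the text.

```python
import io
import socket
import struct

PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")  # classic pcap, little-endian
PCAP_RECORD_HEADER = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len


def parse_pcap(stream):
    """Yield time, sourceIP, destIP and data for each IPv4 packet in the stream."""
    magic, _, _, _, _, _, linktype = PCAP_GLOBAL_HEADER.unpack(
        stream.read(PCAP_GLOBAL_HEADER.size))
    if magic != 0xA1B2C3D4 or linktype != 1:
        raise ValueError("sketch handles little-endian Ethernet captures only")
    while True:
        record = stream.read(PCAP_RECORD_HEADER.size)
        if len(record) < PCAP_RECORD_HEADER.size:
            break
        ts_sec, ts_usec, incl_len, _ = PCAP_RECORD_HEADER.unpack(record)
        frame = stream.read(incl_len)
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue  # skip truncated or non-IPv4 frames
        ip = frame[14:]                       # strip the 14-byte Ethernet header
        transport = ip[(ip[0] & 0x0F) * 4:]   # IHL gives the IP header length
        if ip[9] == 17:                       # UDP: fixed 8-byte header
            payload = transport[8:]
        elif ip[9] == 6 and len(transport) >= 20:   # TCP: data offset field
            payload = transport[(transport[12] >> 4) * 4:]
        else:
            payload = transport
        yield {
            "time": ts_sec + ts_usec / 1e6,
            "sourceIP": socket.inet_ntoa(ip[12:16]),
            "destIP": socket.inet_ntoa(ip[16:20]),
            "data": payload,
        }


def build_demo_pcap():
    """Build a one-packet capture in memory so the parser can be exercised."""
    payload = b"hello"
    udp = struct.pack("!HHHH", 5000, 6000, 8 + len(payload), 0) + payload
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 0, 0, 64, 17, 0,
                     socket.inet_aton("192.168.1.10"), socket.inet_aton("192.168.1.20"))
    frame = b"\x00" * 12 + b"\x08\x00" + ip + udp
    return (PCAP_GLOBAL_HEADER.pack(0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
            + PCAP_RECORD_HEADER.pack(1495000000, 500000, len(frame), len(frame))
            + frame)


for packet in parse_pcap(io.BytesIO(build_demo_pcap())):
    print(packet)
```

The (sourceIP, destIP) pair of each record could then serve as the key that selects which data block format to apply, as the text describes.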
In a preferred technical solution of the present invention, the data storage module includes a file path reading unit and a demonstration control unit. The file path reading unit reads the storage path of the data source files selected by the user. The demonstration control unit demonstrates how the data is stored: it periodically reads the stored records and displays them on the panel. The data storage module uses the Hadoop distributed storage platform and the HBase distributed database. It obtains data in real time from multiple aircraft, stores the data back onto the multiple aircraft by cloud storage, and obtains and shares the data of the multiple aircraft in real time.
In a preferred technical solution of the present invention, the data relation analysis module includes a training data path unit, a training parameter selection unit and a data split mode selection unit. The training data path unit reads the storage path of the training data selected by the user, the training parameter selection unit reads the value of each training parameter selected by the user, and the data split mode selection unit reads the data split mode selected by the user. The data relation analysis module builds and trains the model according to what these units read. The data relation analysis module uses an SVM classifier, which corresponds to the SVM package in the code, and classifies the existing data and analysis results by the SVM method. Its core components are a data splitter and the libsvm classifier package that it calls. The splitter divides the records in the data source whose result is 0 into N parts, where N is entered by the user. Each part is combined with the records whose result is 1, giving N training data sets. Training with libsvm then outputs N models. At prediction time, the results of the N models are combined by an AND or OR operation to give the output prediction. The data association model in the data relation analysis module is built from input parameters specified by the user. The SVM classifier is preferably a nonlinear SVM classifier with an RBF kernel, and is preferably a two-way split classifier.
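The data split scheme described above can be sketched as follows. The splitting, the N training sets and the AND/OR combination follow the text; everything else is an assumption for illustration. In particular `train_threshold` is a deliberately trivial stand-in for libsvm training so that the sketch runs with the standard library only, the function names are hypothetical, and the sketch assumes N does not exceed the number of result-0 records.

```python
def split_majority(records, n_parts):
    """Split result-0 records into n_parts and pair each part with all result-1 records."""
    negatives = [r for r in records if r[1] == 0]
    positives = [r for r in records if r[1] == 1]
    return [negatives[i::n_parts] + positives for i in range(n_parts)]


def train_threshold(dataset):
    """Stand-in trainer (NOT libsvm): threshold midway between the class means of feature 0."""
    neg = [features[0] for features, label in dataset if label == 0]
    pos = [features[0] for features, label in dataset if label == 1]
    threshold = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2
    return lambda features: int(features[0] > threshold)


def predict_combined(models, features, mode="or"):
    """Combine the N model outputs with a logical OR or AND, as the text describes."""
    votes = [model(features) for model in models]
    return int(any(votes)) if mode == "or" else int(all(votes))


# Imbalanced toy data: six result-0 records and two result-1 records.
records = [([float(x)], 0) for x in range(6)] + [([9.0], 1), ([10.0], 1)]
datasets = split_majority(records, n_parts=3)
models = [train_threshold(dataset) for dataset in datasets]
print(predict_combined(models, [6.2], mode="or"), predict_combined(models, [6.2], mode="and"))
```

Each training set now holds two result-0 and two result-1 records, which is the balancing effect the split is meant to achieve. OR favors recall of the rare class, while AND favors precision.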
In a preferred technical solution of the present invention, the data relation analysis application module includes a model path selection unit, a file path reading unit and a demonstration control unit. The model path selection unit reads the storage path of the trained model selected by the user, the file path reading unit reads the storage path of the data source files selected by the user, and the demonstration control unit analyzes the data using the loaded model and displays the predictions on the panel.
In addition, the present invention provides a method for implementing the above system, comprising the data acquisition of the data acquisition module, the data storage of the data storage module, the building of the data association model by the data relation analysis module, and the display of real-time predictions by the data relation analysis application module.
In a preferred technical solution of the present invention, the data acquisition of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user operation;
3) obtain the parameters and call the processing routine;
4) determine whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
5) determine whether the current file still contains data; if so, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
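The control flow of these steps amounts to two nested loops with a filter. The sketch below shows only that flow; `acquire`, `read_blocks` and `parse_block` are hypothetical names, and the way blocks are read and parsed is passed in as functions because the text does not specify it at this point.

```python
def acquire(files, wanted_types, read_blocks, parse_block):
    """Run the acquisition loop over the given files and return the parsed blocks."""
    outputs = []
    for path in files:                          # are there unread files left?
        for block in read_blocks(path):         # is there data left in this file?
            if block["type"] not in wanted_types:
                continue                        # not a block type the user selected
            outputs.append(parse_block(block))  # parse and output
    return outputs


# Toy stand-ins for files on disk: file name -> list of data blocks.
demo_files = {
    "a.txt": [{"type": "radar", "raw": "r1"}, {"type": "uav", "raw": "u1"}],
    "b.txt": [{"type": "radar", "raw": "r2"}],
}
result = acquire(sorted(demo_files), {"radar"}, demo_files.__getitem__,
                 lambda block: block["raw"].upper())
print(result)
```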
In a preferred technical solution of the present invention, the data storage of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
In a preferred technical solution of the present invention, the building of the data association model by the data relation analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value;
2) scan the data again, scale it using the bounds, and call the read_prob function to produce an svm_problem;
3) run cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate the model and store it;
5) end.
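The first two steps, finding per-attribute bounds and scaling the data with them, can be sketched as min-max scaling. The target range of [-1, 1] and the handling of constant attributes are assumptions, since the text does not state them, and the function names `attribute_bounds` and `scale_row` are hypothetical. Building the svm_problem and calling svm_train are left to libsvm and are not reproduced here.

```python
def attribute_bounds(rows):
    """First pass: lower and upper bound of every attribute (column)."""
    return [(min(column), max(column)) for column in zip(*rows)]


def scale_row(row, bounds, lower=-1.0, upper=1.0):
    """Second pass: map each attribute linearly from [lo, hi] to [lower, upper]."""
    scaled = []
    for value, (lo, hi) in zip(row, bounds):
        if hi == lo:
            scaled.append(0.0)  # assumption: a constant attribute carries no information
        else:
            scaled.append(lower + (upper - lower) * (value - lo) / (hi - lo))
    return scaled


rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
bounds = attribute_bounds(rows)
print(bounds)
print([scale_row(row, bounds) for row in rows])
```

The same bounds must be stored and reused when scaling real-time data at prediction time, otherwise the model sees inputs on a different scale from the one it was trained on.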
In a preferred technical solution of the present invention, the display of real-time predictions by the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column family;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining the data of all nodes from HBase in real time, then use the SVM algorithm to produce real-time predictions;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
Compared with the prior art, the multi-platform aviation electronics big data system provided by the above technical solution has the following advantages:
1. The system integrates data acquisition, data classification management, data storage and data analysis. It collects and classifies multi-source heterogeneous data and stores the data in real time on a "resource cloud" platform. Client nodes of the "resource cloud" platform obtain data from the cloud in real time, and the cloud platform ensures that the data is available in real time. On this basis, the system supports building an association model from historical data and performs real-time prediction using real-time data and the association model, providing guidance for pilots' decisions. Specifically, the system realizes the following functions: flight data acquisition, real-time flight data sharing, flight data association analysis and real-time decision support.
2. The present invention optimizes the Hadoop distributed storage platform and the HBase distributed database and applies them to an aviation electronics big data system, which is a first in this field. The invention integrates large-scale avionics data and stores it in a distributed manner. It can collect, store and share data in real time, and uses the analysis of historical data to predict strike hits from real-time data, thereby providing effective decision guidance for pilots. Prediction accuracy reaches 94%.
3. The present invention uses a classification algorithm from machine learning to predict the outcome of in-flight strikes. Compared with the earlier approach of obtaining the result directly by simulating the flight process in software, this method is many times faster while maintaining a certain accuracy, and therefore improves the decision-making efficiency of the system-of-systems confrontation. Because strikes that hit are far rarer than strikes that miss, the training data is imbalanced, which affects decision accuracy. The invention therefore applies data splitting on top of SVM to improve accuracy. With the decision support function integrated into the avionics system, the stored data can be used to train the classifier, the trained classifier can be used for real-time strike hit prediction, and decision recommendations can be given to the aircraft in real time according to the predictions.
4. Experiments show that in the present system the nonlinear SVM classifier with an RBF kernel gives the highest accuracy, and the two-way split classifier gives the highest F1 value.
5. Experiments show that the present system supports static removal of nodes and dynamic addition of nodes.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments.
Fig. 1 is a framework diagram of the data storage module in the system of the invention.
Fig. 2 is an example of the nonlinear SVM in the data relation analysis module of the system.
Fig. 3 and Fig. 4 are examples of data splitting in the data relation analysis module of the system.
Fig. 5 is the overall framework diagram of the multi-platform aviation electronics big data system of the invention.
Fig. 6 is the functional structure diagram of the multi-platform aviation electronics big data system of the invention.
Fig. 7 is the main program flow chart of the multi-platform aviation electronics big data system of the invention.
Fig. 8 is an example of the data relation analysis application module in the system.
Fig. 9 is the logic flow chart of the data acquisition module in the system.
Fig. 10 is the logic flow chart of the data storage module in the system.
Fig. 11 is the logic flow chart of the data relation analysis module in the system.
Fig. 12 is the logic flow chart of the data relation analysis application module in the system.
Detailed description of the embodiments
The present invention is explained in further detail below with reference to the accompanying drawings. The drawings are simplified schematic diagrams that illustrate only the basic structure of the invention, and therefore show only the components relevant to the invention.
The reference data that decision makers use in a system-of-systems confrontation comes from multiple sensors and multiple data sources on different aircraft systems and different platforms. Obtaining these data in real time, storing them reliably and applying them promptly in decision making is the basis of a successful confrontation. To simulate this environment, the present invention uses several special test equipment units to simulate a cluster of flight nodes, captures the data produced by sensors in a real flight environment as the data source, and connects the test equipment units with a switch to form a local area network that simulates data communication in the confrontation environment. From any test equipment unit, a decision maker can view the real-time information of every device node in the cluster and make decisions based on it. In this simulated hypothetical combat scenario, to ensure that decision makers obtain data in real time and reliably, the present invention proposes a multi-platform avionics big data system based on the "resource cloud".
The core of the multi-platform aviation electronics big data system of the invention is the "resource cloud" platform built in the hypothetical combat environment. A data sharing platform is built across the special test equipment units. It is based on existing open-source cloud software (Hadoop, HBase) and mainly provides real-time information sharing, reliable storage and information processing between flight nodes. The platform's data source is the data produced by acquisition and classification: the data acquisition module collects and classifies the raw data and then passes it to the "resource cloud" platform. Finally, the data analysis module on each node obtains the information of all nodes from the "resource cloud" platform in real time, analyzes it with the data association model built from historical data, and presents the analysis results for each node to the decision maker as decision guidance.
The technical solutions of the modules in the multi-platform aviation electronics big data system of the invention are as follows:
1. Data acquisition and classification scheme
The data in the hypothetical combat environment comes from different sensors and is heterogeneous, which directly causes the data complexity of an avionics big data system for system-of-systems combat. The collected data must be analyzed from multiple angles, across platforms and data, by big data classification techniques based on the airborne cloud platform. This strengthens the association between data and so reduces the data complexity of the avionics big data system for system-of-systems combat. In a concrete implementation, different data acquisition and preprocessing modes must be built for a multi-mode situation space. For the hypothetical combat environment, the present invention captures packets and then parses them one by one according to the data protocol, collecting and classifying the data in the environment as the data source of the "resource cloud" platform.
In practice, there are two common schemes for system data acquisition:
(1) Packet capture
The packet capture program Wireshark obtains packets. Wireshark displays the binary data captured from the network in its Packet Details panel according to the packet structure specifications of the different protocols. This mainly includes the frame summary of the physical layer, the Ethernet frame header of the data link layer, the IP packet header of the internet layer, the segment header of the transport layer, and the application layer information. The process uses the libpcap library, a powerful network packet capture library that intercepts packets by network interface, port and protocol.
(2) Web crawling
Network data acquisition obtains data from websites by means of web crawlers, public website APIs and similar methods. This approach can extract unstructured data from web pages and store it as uniform local data files in a structured manner. It supports the collection of files such as pictures, audio, video and attachments, and attachments can be associated automatically with the body text. Apart from the content carried on the network, network traffic itself can be collected using bandwidth management techniques such as DPI or DFI.
Because communication between combat nodes in the hypothetical combat environment is costly, the present invention does not use crawlers that actively fetch data. Instead, it preferably uses the Wireshark software at the network switch of the actual combat environment to capture the packets on the system-of-systems confrontation platform as source data.
The data acquisition module parses the packets obtained from the switch and, according to the source IP and destination IP of each packet and the protocol of the packet, divides the data into structured and unstructured data. The data content is parsed from the binary file item by item according to the communication protocol of the hypothetical combat environment. The resulting data is classified as follows:
(1) Structured data
Structured data, that is, row data, can be stored in a relational database and is expressed logically by two-dimensional tables. The data is distributed according to the target that receives it, such as unmanned aerial vehicle, radar simulation, electro-optics, electronic station, three-dimensional voice warning and cockpit.
The acquisition result divides the data into multiple data blocks. Each data block includes basic information such as data type, transmission source, target number, block length, update cycle, virtual link, maximum delay time and receiving port. Beyond the basic information, the main content of each kind of data block can be summarized in structured form.
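One way to hold the basic information of a data block is a small record type that can also be written out as a line of text, which is how the acquisition module passes blocks to the next stage. The sketch below is an assumption throughout: the class `BlockInfo`, the field names, their types, the units and the tab-separated text layout are illustrative and are not specified by the source.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BlockInfo:
    """Basic information of one data block (field types and units are assumed)."""
    data_type: str
    source: str
    target_number: int
    block_length: int
    update_cycle_ms: int
    virtual_link: int
    max_delay_ms: int
    receive_port: int


def to_text(info):
    """Serialize the block information as one tab-separated line of key=value pairs."""
    return "\t".join(f"{key}={value}" for key, value in asdict(info).items())


info = BlockInfo("radar", "192.168.1.10", 3, 128, 50, 7, 20, 6000)
print(to_text(info))
```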
(2) Unstructured data
In the big data management platform of the multi-platform aviation electronics big data system of the invention, unstructured data mainly takes forms such as pictures, audio, video and hypermedia, for example radar meteorological images, geographic distribution images, audio of detected enemy aircraft, and video streams. These data have no fixed structure. Unlike structured data, unstructured data is awkward to express in the two-dimensional logical tables of a database, but a non-relational database on a distributed cloud storage platform can store unstructured data efficiently and stably. For unstructured data, the system reserves interfaces to provide these functions.
2. Data storage scheme
In a system-of-systems confrontation environment, obtaining the changes in relevant information in real time and accurately is a key factor in carrying out the confrontation. Each combat node generates key avionics information in real time, including the node's carrier aircraft data and target information. This information is obtained by the other nodes in real time and fed into the decision system of the confrontation in real time. To achieve this, a "resource cloud" is built on the nodes of the hypothetical combat environment. After the avionics information generated by each node is collected, it is uploaded to the "resource cloud" in real time, and the other nodes query the data changes in real time. The high fault tolerance, real-time behavior and reliability of the cloud platform ensure that all avionics information is available in real time and is hard to lose.
(1) " resource cloud " platform
Traditional "resource cloud" frameworks fall into several types. In the first, raw data is collected at the client, and the client then sends the data to each storage node for distributed storage. In the second, data is collected locally at the node and then distributed to all storage nodes. Compared with existing big data management frameworks, the distinguishing characteristic of the invention is that the data source and the data storage destination are the same: data is obtained in real time from multiple aircraft, stored back on those aircraft through cloud storage, and obtained and shared among them in real time. The invention therefore adopts the second framework type.
The envisaged platform as a whole uses a master/slave structural model (as shown in Figure 1), consisting of one master node and several slave nodes. The master node acts as the master server, managing the file system namespace and client access to files. The slave nodes act as slave servers and are responsible for storing data. The system uses a "write-once-read-many" model, which relaxes concurrency control requirements, simplifies data aggregation, and supports high-throughput access.
(1) reliability
By means of file splitting, the "resource cloud" platform cuts a large file into small files of fixed size and stores a split table. Multiple copies of each small file are made and stored on different nodes. When a file is read, the parts are read according to the split table, spliced together, and the file is returned to the user.
After collection and classification, the data source is written to the hard disk on the local node as a temporary cache. Because of the complexity of the nodes, some nodes store message files that are larger than the default file size of the cloud platform, which triggers file splitting. File splitting saves line bandwidth on the one hand and increases the fault tolerance of the system on the other.
In physical storage, the "resource cloud" of the system divides a file into multiple blocks and distributes them across multiple nodes of the cluster by hashing or a similar algorithm. This allows the distributed storage system to hold sufficiently large files. Compared with backing up an unsplit file to a designated machine, file splitting saves point-to-point communication bandwidth and balances the system load to some extent. In addition, if a single node fails and its information cannot be read, the file can be recovered by splicing together the segments backed up on other nodes.
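The splitting, hashed placement, and recovery described above can be sketched as follows. This is only a minimal in-memory illustration, not the platform's implementation: the block size, the replica count, the use of SHA-256 for placement, and all function names are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow (assumed)
REPLICAS = 2    # number of copies kept of every block (assumed)


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a file into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def place(name, index, nodes, replicas=REPLICAS):
    """Choose `replicas` distinct nodes for one block by hashing its identity."""
    digest = hashlib.sha256(f"{name}:{index}".encode()).digest()
    start = int.from_bytes(digest[:4], "big") % len(nodes)
    return [nodes[(start + k) % len(nodes)] for k in range(replicas)]


def store(name, data, nodes):
    """Return (split_table, storage); the split table maps block index -> holder nodes."""
    storage = {node: {} for node in nodes}
    split_table = {}
    for index, block in enumerate(split_blocks(data)):
        split_table[index] = place(name, index, nodes)
        for node in split_table[index]:
            storage[node][(name, index)] = block
    return split_table, storage


def read(name, split_table, storage, failed=frozenset()):
    """Splice the file back together, using a replica when a holder node has failed."""
    parts = []
    for index in sorted(split_table):
        # Raises StopIteration if every holder of a block has failed.
        holder = next(n for n in split_table[index] if n not in failed)
        parts.append(storage[holder][(name, index)])
    return b"".join(parts)
```

With two replicas on distinct nodes, the file can still be read after any single node fails, which is the recovery property the paragraph describes.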
(2) fault-tolerance
After the "resource cloud" platform splits each file, the data blocks are redundantly backed up to other nodes by a hash algorithm. The redundancy-based fault tolerance of the cloud platform builds on the fault tolerance mechanism of HDFS and mainly covers the following points:
The Master node marks the file parts produced by file splitting and records the split table as the basis for replication decisions. It records the split table entry of the current block and, according to the split table content, backs the block up redundantly to the corresponding other nodes by a hash algorithm. If the current node does not hold the required file segment when a file is accessed, the request is sent to the nearest redundant backup.
Backup of the Master node is handled through ZooKeeper. All nodes elect one master node and one backup-master node. The backup-master node periodically takes a snapshot of the master node, ensuring that the backup-master's information does not lag far behind the master's. When the heartbeat mechanism detects that the master node has crashed, the backup-master replaces the master node, and another backup-master node is chosen through the ZooKeeper election mechanism to back up the content of the current master node.
(3) " resource cloud " other features
Timed snapshots: Snapshots support storing a copy of the data at a specific time, so that a failed cluster can be rolled back to a previous normal point in time.
Staging: When data is created, the client first buffers the file data in a local temporary file, and the application's write operations are transparently redirected to this temporary local file. When the local file accumulates one block's worth of data, the client notifies the master node. The master node inserts the file name into the file system hierarchy and allocates a data block for it. The master node then replies to the client's request with a message containing the data node ID (possibly several, since the nodes that store replicas of the data block are included) and the target data block identifier. On receiving the reply, the client flushes the local temporary file to the data block on the specified data node. This approach is used because, if the client wrote directly to the remote file without any local caching, network speed and network throughput would be heavily affected. When the file is closed, the remaining data in the local temporary file that has not yet been uploaded is transferred to the data node. The client then notifies the master node that the file has been closed. At this point, the master node commits the file creation operation to persistent storage. If the master node dies before the file is closed, the file is lost.
Pipelined replication: When a client writes data to a file, the data is first written to a local file, as described above. Suppose the replication factor of the file is 3. When the local file accumulates one block of data, the client obtains a list of data nodes from the master node; the list contains the data nodes that will store replicas of the data block. The client then flushes the data block to the first data node. The first data node receives the data in units of 4 KB, writes each portion to its local repository, and at the same time sends each portion to the second data node in the list. Likewise, the second data node writes each portion to its local repository while forwarding it to the third data node, and the third data node writes the data directly to its local repository. A data node can thus receive data from the previous node while passing it on to the next node, so the data is pipelined from one data node to the next.
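The forwarding chain can be illustrated with a small sketch. It is a simplified, single-threaded model in which each node stores a 4 KB portion and immediately hands it to the next node; the class and function names are hypothetical, and real data nodes would receive and forward concurrently over the network.

```python
PORTION = 4 * 1024  # the text forwards data in 4 KB units


class DataNode:
    """Toy data node: stores each portion and forwards it down the pipeline."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream   # next node in the list, or None for the last one
        self.repository = bytearray()  # stands in for the node's local repository

    def receive(self, portion):
        self.repository += portion                # write the portion locally
        if self.downstream is not None:
            self.downstream.receive(portion)      # and pass it on to the next node


def build_pipeline(names):
    """Chain the nodes in list order and return the first one."""
    node = None
    for name in reversed(names):
        node = DataNode(name, downstream=node)
    return node


def flush_block(block, head):
    """Client side: push one data block into the first node, portion by portion."""
    for offset in range(0, len(block), PORTION):
        head.receive(block[offset:offset + PORTION])
```

With a replication factor of 3, `build_pipeline(["dn1", "dn2", "dn3"])` gives a chain in which every node ends up holding the full block after `flush_block` returns.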
Scalability: A large body of application practice has shown that the distributed platform scales very well and can easily be expanded to clusters of hundreds of nodes.
3. Data analysis scheme
In the decision-making system of a system-of-systems confrontation, historical data is the most valuable resource. Analyzing and refining historical information enables many functions; for example, historical strike information can be used to assist decision-making. By analyzing a set of historical flight processes and their strike results, we can obtain a classifier model of flight state and use it to predict the strike result of a node. Once the prediction model is introduced on the "resource cloud" platform, we can perform decision functions according to the predicted strike result of each node and improve the decision efficiency of the confrontation system.
Given the existing flight state data set and strike results, the problem can be approximated as a binary classifier model whose input is the avionics information of the aircraft at missile launch together with the absolute position of the target, and whose output is one of two classes: target hit or target missed. Commonly used binary classifiers are analyzed and compared, and the classifier model with the best results is applied in the decision-making system.
(1) classifier algorithm
The problem to be solved is a binary classification problem with labels 0 and 1. The classifier therefore has to find a surface that separates all sample points onto its two sides. That is, for any sample $x = (b_1, b_2, \dots, b_m)$, the classifier decision function $F$ is:
$$F(x) = g(f(x))$$
A. Linearly separable SVM
In the decision function of the linearly separable SVM classifier, $f(x) = w^T x + b$. In essence, the classifier looks for a hyperplane that separates the sample points by label onto its two sides and has the maximum margin, where the margin is the minimum geometric distance from any data point to the hyperplane. From a statistical point of view, positive and negative samples can be regarded as random draws from two different distributions; the larger the distance between the classification boundary and the two distributions, the smaller the probability that a drawn sample falls on the wrong side of the boundary. Maximizing the margin therefore minimizes the generalization error in the worst case and gives the classifier higher confidence.
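With $f(x) = w^T x + b$ in the classifier decision function, its hyperplane is $w^T x + b = 0$.
Given a training set $T$ and the hyperplane $w^T x + b = 0$, and encoding the two labels as $y_i \in \{-1, +1\}$ for this derivation, the functional margin of a sample point $(x_i, y_i)$ with respect to the hyperplane is defined as:
$$\hat{\gamma}_i = y_i (w^T x_i + b)$$
The geometric margin is:
$$\gamma_i = \frac{y_i (w^T x_i + b)}{\lVert w \rVert}$$
Let $N$ be the number of sample points. The minimum functional margin over all sample points in $T$ is defined as:
$$\hat{\gamma} = \min_{i=1,\dots,N} \hat{\gamma}_i$$
The margin of the hyperplane is the minimum geometric margin over all sample points in $T$:
$$\gamma = \min_{i=1,\dots,N} \gamma_i$$
Maximizing the margin can be expressed as:
$$\max_{w,b} \gamma \quad \text{s.t.} \quad \frac{y_i (w^T x_i + b)}{\lVert w \rVert} \ge \gamma, \quad i = 1, \dots, N$$
which can be rewritten as:
$$\max_{w,b} \frac{\hat{\gamma}}{\lVert w \rVert} \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge \hat{\gamma}, \quad i = 1, \dots, N$$
It can be seen that scaling $w$ and $b$ by the same factor affects neither the hyperplane nor the geometric margin, while the functional margin scales proportionally. So let $\hat{\gamma} = 1$ and substitute it into the formula above; maximizing $\frac{1}{\lVert w \rVert}$ is equivalent to minimizing $\frac{1}{2}\lVert w \rVert^2$, which yields the optimization problem of the linearly separable SVM:
$$\min_{w,b} \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.} \quad y_i (w^T x_i + b) - 1 \ge 0, \quad i = 1, \dots, N$$
This is a convex quadratic programming problem. Using Lagrangian duality, the optimal solution can be obtained by solving the dual problem; the solution process is not repeated here.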
B. Nonlinear SVM
For a nonlinear classification problem, the decision surface is a curved surface. Through a suitable mapping, the curved surface becomes a hyperplane in a higher-dimensional space, so the problem can be solved with the linearly separable SVM method.
For example, suppose the two classes of data are distributed as two circles (as shown in Figure 2). Such data is inherently not linearly separable; the ideal decision boundary is a circle rather than a line (hyperplane).
If $x_1$ and $x_2$ denote the coordinates of this two-dimensional plane, the decision surface can be written in the form:
$$a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1^2 + a_4 x_2^2 + a_5 x_1 x_2 = 0$$
If we construct a five-dimensional space with coordinates $z_1 = x_1$, $z_2 = x_2$, $z_3 = x_1^2$, $z_4 = x_2^2$, $z_5 = x_1 x_2$, the decision surface equation above can be written in the new space as:
$$a_0 + a_1 z_1 + a_2 z_2 + a_3 z_3 + a_4 z_4 + a_5 z_5 = 0$$
This is exactly the equation of a hyperplane. If we map the data into the five-dimensional space in this way, the originally nonlinear data becomes linearly separable in the new space and can be handled with the linear SVM algorithm.
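The effect of this mapping can be checked numerically. The sketch below uses made-up data: two rings of radius 1 and 3, and a circular boundary of radius 2, which in the mapped space is the hyperplane $z_3 + z_4 - 4 = 0$. The radii and function names are assumptions for illustration only.

```python
import math


def phi(x1, x2):
    """Map a 2-D point to (z1, z2, z3, z4, z5) = (x1, x2, x1^2, x2^2, x1*x2)."""
    return (x1, x2, x1 * x1, x2 * x2, x1 * x2)


def hyperplane(z, a0, a):
    """Evaluate a0 + a1*z1 + ... + a5*z5 in the mapped space."""
    return a0 + sum(ai * zi for ai, zi in zip(a, z))


def ring(radius, n=12):
    """n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


# The circle x1^2 + x2^2 = R^2 becomes the hyperplane z3 + z4 - R^2 = 0.
R = 2.0
A0, A = -R * R, (0.0, 0.0, 1.0, 1.0, 0.0)
inner, outer = ring(1.0), ring(3.0)
```

Every point of `inner` gives a negative value of `hyperplane(phi(*p), A0, A)` and every point of `outer` a positive one, so a single linear function of $z$ separates two classes that no straight line separates in the original plane.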
In the solution process of the linearly separable SVM, the data vectors always appear in the form of inner products. We therefore define the function that computes the inner product of two vectors in the mapped space as the kernel function, and use the kernel function to simplify inner product operations in the mapped space.
For the nonlinear case, then, the approach is to select a kernel function and use it to map the data into a higher-dimensional space, where the problem that is not linearly separable in the original space becomes linearly separable and can be handled with the linearly separable SVM algorithm. SVMs commonly use four kernel functions: the linear kernel (equivalent to the linearly separable SVM), the polynomial kernel, the RBF kernel, and the sigmoid kernel. Their concrete forms are given in Table 1.
Table 1
Type | Function expression |
Linear kernel | $u^T v$ |
Polynomial kernel | $(g \, u^T v + \mathrm{coef0})^{\mathrm{degree}}$ |
RBF kernel | $\exp(-g \, \lVert u - v \rVert^2)$ |
Sigmoid kernel | $\tanh(g \, u^T v + \mathrm{coef0})$ |
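For reference, the four expressions in Table 1 can be written directly as functions of two vectors. This is a plain-Python sketch rather than the libsvm code; the parameter names follow the table, and the default values shown are placeholders chosen for the example.

```python
import math


def dot(u, v):
    """Inner product u^T v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, g=1.0, coef0=0.0, degree=3):
    return (g * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, g=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * squared_distance)


def sigmoid_kernel(u, v, g=1.0, coef0=0.0):
    return math.tanh(g * dot(u, v) + coef0)
```

Each function returns the inner product of the two inputs in the corresponding mapped space without ever constructing the mapped vectors, which is the simplification the kernel function provides.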
C. Data splitting
The two classes in the sample data set differ greatly in size, which causes an imbalance problem. We try splitting the samples of the more frequent class in the training set into several blocks, combining each block with the samples of the other class to form a sub-training set, and training on every sub-training set to obtain sub-classification models. The sub-classification models are combined through certain operations into a new classifier that makes predictions on data. Handling the data in this way alleviates the imbalance problem to some extent.
For example, the samples with label=0 are split into four blocks, each of which forms a sub-training set with the samples with label=1. Training on the four sub-training sets yields four sub-classification models. Each sub-classification model makes a prediction for the input data, giving four outputs; these four outputs can be combined with an AND operation to obtain the final output, which is equivalent to a new classifier. Schematic diagrams are shown in Figure 3 and Figure 4.
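The construction of the sub-training sets can be sketched as below. The function name, the random shuffling with a fixed seed, and the strided split into blocks are assumptions for illustration; the text only specifies that the majority class is cut into blocks and each block is paired with the whole minority class.

```python
import random


def make_sub_training_sets(samples, n_blocks, majority_label=0, seed=0):
    """Split the majority class into n_blocks and pair each block with all other samples.

    `samples` is a list of (features, label) pairs.
    """
    majority = [s for s in samples if s[1] == majority_label]
    minority = [s for s in samples if s[1] != majority_label]
    random.Random(seed).shuffle(majority)                      # random division into blocks
    blocks = [majority[i::n_blocks] for i in range(n_blocks)]  # near-equal block sizes
    return [block + minority for block in blocks]
```

Each returned list would then be handed to the trainer separately, giving one sub-classification model per block.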
The system architecture and flow of the invention are further described below with reference to the accompanying drawings:
(1) program architecture and flow scheme design
As shown in Figure 5, the multi-platform aviation electronics big data system of the invention is divided into four modules: a data acquisition module, a data storage module, a data relation analysis module, and a data relation analysis application module. The data acquisition module obtains pcap packet files from data source 1; after acquisition and classification, the data enters the data storage module, which completes data storage. The data relation analysis module obtains training data from data source 2, with input parameters that can be specified by the user, and builds the data relation model. The model is provided to the data relation analysis application module, which performs real-time prediction and displays the results on the screen. The data relation analysis application module uses the cloud storage function implemented by the data storage module for real-time storage.
Because the system is developed on top of a distributed platform, building it first requires setting up a complete Hadoop and HBase distributed environment on multiple devices (six were used during development). Each device corresponds to one flight node; one of them serves as the master node and performs operations such as scheduling and display.
1. data acquisition module
Using the libpcap library, the key fields are extracted from the pcap packets captured from the network: the time information, the source IP and destination IP of the packet, and the data field carrying the stored information. These are the time, sourceIP, destIP, and data fields, respectively. In the simulated scenario, data is sent using combinations of destIP and sourceIP, so these two fields allow the data block type of a packet to be determined preliminarily.
The different data blocks are distinguished and parsed according to their formats to obtain independent data block structures, which are written back to the hard disk in text form for use in the next stage.
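As an illustration of the field extraction, the sketch below reads the classic pcap file layout directly with the Python standard library instead of calling libpcap. It assumes a little-endian pcap file, Ethernet framing, IPv4, and a UDP payload; these assumptions, and the function name, are not taken from the source. The dictionary keys reuse the field names time, sourceIP, destIP, and data.

```python
import socket
import struct

# magic, version major, version minor, thiszone, sigfigs, snaplen, link type
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
# ts_sec, ts_usec, captured length, original length
PCAP_RECORD_HEADER = struct.Struct("<IIII")
ETHERNET_LEN = 14
UDP_LEN = 8


def parse_pcap(buf):
    """Yield the time, sourceIP, destIP and data fields of each IPv4 packet in `buf`."""
    magic = PCAP_GLOBAL_HEADER.unpack_from(buf, 0)[0]
    if magic != 0xA1B2C3D4:
        raise ValueError("this sketch handles only little-endian classic pcap files")
    offset = PCAP_GLOBAL_HEADER.size
    while offset + PCAP_RECORD_HEADER.size <= len(buf):
        ts_sec, ts_usec, incl_len, _ = PCAP_RECORD_HEADER.unpack_from(buf, offset)
        offset += PCAP_RECORD_HEADER.size
        frame = buf[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < ETHERNET_LEN + 20 or frame[12:14] != b"\x08\x00":
            continue                          # skip frames that are not IPv4
        ip = frame[ETHERNET_LEN:]
        ihl = (ip[0] & 0x0F) * 4              # IPv4 header length in bytes
        yield {
            "time": ts_sec + ts_usec / 1_000_000,
            "sourceIP": socket.inet_ntoa(ip[12:16]),
            "destIP": socket.inet_ntoa(ip[16:20]),
            "data": bytes(ip[ihl + UDP_LEN:]),  # payload after the assumed UDP header
        }
```

A caller would read the capture file in binary mode, pass its bytes to `parse_pcap`, and then select a block parser from the destIP and sourceIP of each packet.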
As shown in Figure 6 and Figure 7, the data acquisition module includes an input folder path unit, an output folder path unit, and a data block selection unit. The input and output folder path units read the input and output folder paths selected by the user, and the data block selection unit reads the data block types selected by the user. The data acquisition module performs data acquisition according to what these units read.
As shown in Figure 9, the logic flow of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user operation;
3) get the parameters and call the processing routine;
4) determine whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
5) determine whether the file still contains data; if so, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
2. Data storage module
(1) distributed storage platform
To store data reliably, and following the design in the technical scheme, the data storage function is implemented on HDFS using the existing distributed cloud platform. The HDFS server side is deployed on six dedicated test devices. Once all simulated nodes are in place (devices powered on), the HDFS start-all.sh command is run on any node, which joins the six test devices into a unified data sharing platform, each listening on the ports of its corresponding functions. When a data storage or query request arrives, the data is transferred over the corresponding port.
The data reliability and fault tolerance of the platform are provided by the redundant backup function of HDFS.
(2) distributed data base
On top of the stable storage provided by HDFS, and in order to manage all data in a standardized way, the project implements a distributed database based on HBase. Hadoop HDFS provides reliable storage, and the Hadoop MapReduce framework is used to accelerate system data query operations.
The HBase table design is as follows:
In actual storage, each packet corresponds to one rowKey, and each rowKey contains the information of only one data block. HBase's column-oriented storage ensures the reliability of system data.
(3) operational process
The running process of the module includes two steps: data storage and data display.
Data storage: One data record is emitted every 40 ms and stored in HBase. Because the sample data volume is small, emission restarts from the first record once all records have been read.
Data display: A separate thread handles reading. Every 10 ms it queries HBase in real time for all records between the previous timestamp and the current timestamp, takes the last of these records, and displays it on the screen in real time.
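The timing of the two steps can be illustrated with a simulation. The sketch below replaces HBase with an in-memory list sorted by timestamp and replaces the two threads with a single loop over a simulated millisecond clock; the class name, function name, and parameters are all hypothetical.

```python
import bisect
import itertools


class TimeSeriesStore:
    """In-memory stand-in for the HBase table: rows kept sorted by timestamp (ms)."""

    def __init__(self):
        self._stamps, self._rows = [], []

    def put(self, ts_ms, row):
        i = bisect.bisect_right(self._stamps, ts_ms)
        self._stamps.insert(i, ts_ms)
        self._rows.insert(i, row)

    def scan(self, start_ms, end_ms):
        """Rows whose timestamp t satisfies start_ms < t <= end_ms."""
        lo = bisect.bisect_right(self._stamps, start_ms)
        hi = bisect.bisect_right(self._stamps, end_ms)
        return self._rows[lo:hi]


def run_demo(sample, duration_ms, emit_ms=40, poll_ms=10):
    """Emit one record every emit_ms and poll every poll_ms on a simulated clock."""
    store, shown, last_poll = TimeSeriesStore(), [], -1
    source = itertools.cycle(sample)      # restart from the first record when exhausted
    for now in range(duration_ms + 1):
        if now % emit_ms == 0:
            store.put(now, next(source))
        if now % poll_ms == 0:
            rows = store.scan(last_poll, now)
            if rows:
                shown.append(rows[-1])    # display only the newest record in the window
            last_poll = now
    return shown
```

Because polling is four times as frequent as emission, most polls find no new record; only the polls that coincide with a newly stored record contribute to the displayed sequence.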
As shown in Figure 6 and Figure 7, the data storage module includes a read file path unit and a demonstration control unit, used for the data storage demonstration. The read file path unit reads the storage path of the data source file selected by the user; the demonstration control unit demonstrates the storage status of the data by periodically reading the stored records and showing them on the panel.
As shown in Figure 10, the logic flow of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining all node data from HBase in real time;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
3. Data relation analysis module
This part mainly uses an SVM classifier and the corresponding SVM package to classify the existing data analysis results. Its core components are the data splitter and the libsvm classifier package it calls. The splitter divides the records whose result is 0 in the data source into N parts (N is entered by the user), and each part is combined with the records whose result is 1 to form N training data sets. Training with libsvm produces N models; at prediction time, the predictions of the N models are combined with AND/OR operations to output the final prediction.
The running process mainly includes the following three steps.
Data normalization: The data set is scanned to find the upper and lower bounds of each variable, and the data is normalized so that every variable has a balanced influence on the result.
Data splitting: Owing to the nature of the data, records whose result is 0 far outnumber those whose result is 1. The invention therefore adopts the splitting strategy of the technical scheme: the records whose result is 0 are divided into N parts, and each part is combined with the records whose result is 1 to form N data sources. This part is implemented in the read_prob function.
Data training: The functions in the libsvm software package (including svm_scale, svm_train, and so on) are called to train each svm_problem, generating an svm_model that is dumped to the hard disk.
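The normalization step amounts to a linear rescaling of every attribute between its observed bounds. The sketch below is a plain-Python illustration rather than the libsvm scaling code; the target range of [-1, 1] and the handling of constant attributes are assumptions.

```python
def column_bounds(rows):
    """Scan the data set once and return (lower, upper) for every attribute."""
    return [(min(column), max(column)) for column in zip(*rows)]


def scale(rows, bounds, lower=-1.0, upper=1.0):
    """Linearly map each attribute from its [min, max] range onto [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, bounds):
            if hi == lo:
                out.append(0.0)  # constant attribute: no spread to rescale (assumed choice)
            else:
                out.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(out)
    return scaled
```

The bounds found on the training data should be reused when scaling data at prediction time, so that both pass through the same transformation.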
As shown in Figure 6 and Figure 7, the data relation analysis module includes a training data path unit, a training parameter selection unit, and a data splitting mode selection unit, used for building and training the model. The training data path unit reads the storage path of the training data selected by the user, the training parameter selection unit reads the values of the training parameters selected by the user, and the data splitting mode selection unit reads the data splitting mode selected by the user. The data relation analysis module builds and trains the model according to what these units read.
As shown in Figure 11, the logic flow of the data relation analysis module comprises the following steps:
1) read the data and find the upper and lower bounds of each attribute value, covering seven attributes: longitude, latitude, height, roll angle, true heading angle, pitch angle, and speed;
2) scan the data again, scale it using the bounds (scaling speeds up processing during training and prediction), and then call the read_prob function to produce the svm_problem;
3) run cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate and store the model;
5) end.
4. Data relation analysis application module
The overall design principle of the application module is to use the data storage module for storage and to use the optimal model output by the data relation analysis module as the input model for real-time prediction on arbitrary data, as shown in Figure 8.
When several sub-models are used, their predictions are combined according to the following rules:
Two-way split:
OR model: n1|n2
AND model: n1&n2
Four-way split:
AND first, then OR: (n1&n2)|(n3&n4)
OR first, then AND: (n1|n2)&(n3|n4)
Eight-way split:
AND first, then OR: (n1&n2&n3&n4)|(n5&n6&n7&n8)
OR first, then AND: (n1|n2|n3|n4)&(n5|n6|n7|n8)
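These rules can be expressed compactly by treating the sub-model outputs as a list of 0/1 values and splitting the list into two halves. The function names below are hypothetical.

```python
def all_and(preds):
    """n1 & n2 & ... : predict 1 only if every sub-model predicts 1."""
    return int(all(preds))


def all_or(preds):
    """n1 | n2 | ... : predict 1 if any sub-model predicts 1."""
    return int(any(preds))


def and_then_or(preds):
    """AND within each half, then OR the halves, e.g. (n1&n2)|(n3&n4)."""
    half = len(preds) // 2
    return int(all(preds[:half]) or all(preds[half:]))


def or_then_and(preds):
    """OR within each half, then AND the halves, e.g. (n1|n2)&(n3|n4)."""
    half = len(preds) // 2
    return int(any(preds[:half]) and any(preds[half:]))
```

For four outputs `[1, 1, 0, 0]`, `and_then_or` gives 1 while `or_then_and` gives 0, which shows that the two mixed rules are not interchangeable.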
The running process mainly includes the following three steps.
Initialization: The HBase connection is initialized, the table and column families are created, and the file content to be stored is read from the hard disk.
Data generation: One data record is emitted every 40 ms and stored in HBase. Because the sample data volume is small, emission restarts from the first record once all records have been read.
Data display: A separate thread handles reading. Every 10 ms it queries HBase in real time for all records between the previous timestamp and the current timestamp, takes the last of these records, calls the SVM on it to make a real-time prediction, and displays the result on the screen.
As shown in Figure 6 and Figure 7, the data relation analysis application module includes a model path selection unit, a read file path unit, and a demonstration control unit, used for the data analysis demonstration. The model path selection unit reads the storage path of the trained model selected by the user, the read file path unit reads the storage path of the data source file selected by the user, and the demonstration control unit analyzes the data with the loaded model and shows the prediction results on the panel.
As shown in Figure 12, the logic flow of the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while obtaining all node data from HBase in real time, then use the SVM algorithm to predict results in real time;
6) determine whether to end the demonstration; if so, end, otherwise return to step 4).
(2) Interface design
1. data acquisition module
Data acquisition is the data foundation of the "resource cloud" platform of the multi-platform avionics big data system and provides the software with the data sources it needs for analysis. The data source must be data captured with the Wireshark software at the switch in the actual running environment, in pcap format, and the destination IP and source IP of the packets must meet the following requirements:
Table 2
Data block name | Destination IP / source IP |
Networking command | 224.224.0.110 |
Demonstration scene information | 224.224.0.107/224.224.0.108 |
Integrated target data block | 224.224.0.89 |
Helicopter carrier aircraft data block | 224.224.0.140 |
Demonstration control information |
The data block in the data field of each packet conforms to the protocol "XX Type Demonstration System Data Interface Protocol".
2. Data storage module
The data storage module is the core of the multi-platform avionics big data system; it provides the system's data storage function. This module receives the output of the data acquisition module. The input format is a plain txt text file in which each line holds the content of one parsed packet, with fields separated by commas. The fields of each packet are as follows:
Helicopter carrier aircraft data block: packet timestamp, data block ID, data block time, longitude, latitude, height, pitch angle, roll angle, true heading angle, angle of attack, ground speed, north velocity, east velocity, up velocity
Integrated target data block: packet timestamp, data block ID, data block time, number of targets, target 1 attribute, target 1 longitude, target 1 latitude, target 1 height, target 1 azimuth, target 1 pitch angle, target 1 north velocity, target 1 east velocity, target 1 up velocity, target 2 attribute, ..., target 20 up velocity
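A line of the helicopter carrier aircraft data block can be read into named fields as sketched below. The Python field names and the choice of types (the first two fields kept as text, the rest read as floating-point numbers) are assumptions; the actual protocol document is not reproduced in the text.

```python
from collections import namedtuple

# Field order follows the helicopter carrier aircraft data block listing above.
CarrierRecord = namedtuple("CarrierRecord", [
    "packet_timestamp", "block_id", "block_time",
    "longitude", "latitude", "height",
    "pitch", "roll", "true_heading", "angle_of_attack",
    "ground_speed", "north_velocity", "east_velocity", "up_velocity",
])


def parse_carrier_line(line):
    """Parse one comma-separated text line into a CarrierRecord."""
    fields = [f.strip() for f in line.strip().split(",")]
    if len(fields) != len(CarrierRecord._fields):
        raise ValueError(
            f"expected {len(CarrierRecord._fields)} fields, got {len(fields)}")
    # Assumed typing: timestamp and block ID stay as text, everything else is numeric.
    return CarrierRecord(fields[0], fields[1], *map(float, fields[2:]))
```

The integrated target data block could be handled the same way, with the per-target group of nine fields repeated according to the number of targets.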
3. Data relation analysis module
The main function of the data relation analysis module is to build a data model by analyzing historical data. The module takes a set of training data as input and completes data modeling using the SVM classifier and the split-classification strategy. The training data must be data collected by the STK simulation software, formatted as seven input variables followed by one 0/1 result value, with all fields separated by tab characters ("\t"). The fields are as follows:
longitude, latitude, height, parameter 4, parameter 5, parameter 6, parameter 7, 0
4. Data relation analysis application module
This module uses the output file of the STK simulation software as the prediction data source and the model output by the data relation analysis module as the input model. Based on the data storage module, it predicts results in real time and displays them on the interface. The field information of the input data format (that is, the STK simulation software output file) is as follows:
longitude, latitude, height, parameter 4, parameter 5, parameter 6, parameter 7
(3) Global data structure design
1. Physical structure
The data structures used in the software implementation mainly correspond to the data structures in the data protocol. They are defined to hold the data parsed from the binary files. The implemented data structures are as follows:
FrameHeader // stores the header information of each packet
Helicopt_Carrier_Parm // stores the field information of the carrier aircraft data block
SingleTargetParameter // stores all information of a single target
Interrgrated_Target_Parm // stores the field information of the integrated target data block; may contain multiple SingleTargetParameter
Demo_Scene_Info // demonstration scene information data block
Demo_Ctrl_Info // demonstration control information data block
Build_Net_Cmd // networking command data block
Record // once a data block is recognized by the system as a row of data it is recorded as a Record; the length and type fields are used to distinguish its category
2. Table structure
Based on the classification of the existing data and HBase's column-oriented storage, we designed the following table structure. The table consists of multiple column families (ColumnFamily); each column family consists of multiple attributes, and each attribute corresponds to one field in a data block. The table has the following column families:
CF_HelicoptCarrierParm // column family for the helicopter carrier aircraft data block
CF_IntegeratedTargetParm // column family for the integrated target data block
CF_STK // column family for STK data
CF_EmptyString // column family storing unstructured packets
CF_Unrecognised // column family storing unrecognized data blocks
The columns in each column family are the fields of the corresponding block. CF_EmptyString holds the packet information in text form, for example "87a34b2345f86544e", and CF_Unrecognised holds the type information.
The row key in the table uses a custom format: "row" + system time + helicopterId. For example, the row key "row147926317632301" means that at system time 1479263176323 (milliseconds since 00:00 on 1 January 1970), the node numbered 01 stored the data in the cloud platform.
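The row key format can be reproduced with two small helper functions. The function names are hypothetical, and the parser assumes that the node number always occupies the last two digits, as in the example key.

```python
def make_row_key(system_time_ms, helicopter_id):
    """Build "row" + system time in ms + two-digit node number."""
    return f"row{system_time_ms}{helicopter_id:02d}"


def parse_row_key(key):
    """Split a row key back into (system_time_ms, helicopter_id).

    Assumes the node number is exactly the last two digits of the key.
    """
    if not key.startswith("row") or not key[3:].isdigit() or len(key) < 6:
        raise ValueError(f"not a row key: {key!r}")
    return int(key[3:-2]), int(key[-2:])
```

For the example in the text, `make_row_key(1479263176323, 1)` yields `"row147926317632301"`, and `parse_row_key` recovers the time and node number from it.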
3. Class structure
The classes involved in the implementation are mainly the Record class, which records row information, the HBaseEngine class, which calls the underlying HBase, and the SVMEngine class, which calls the SVM classifier. Each class carries out its calls through its own member variables.
4. Constants
The constants involved in the implementation are mainly field names. There are many of them, so they are not listed in detail here.
The effect of the present invention is verified below by way of specific experiment:
1. classifier algorithm evaluation and test experiment
(1) data set
The experiment uses 4497432 original flight data samples, of which 316768 are hits (label=1) and 4180664 are misses (label=0). The original data is divided evenly at a ratio of 50%, 25%, 25% into three sets: a train set, a validation set, and a test set. The train set is used to train the classifiers; the validation set is used to test the performance of the different classifiers and to determine the network structure of the classification model or the parameters that control model complexity; the test set is used to check the performance of the finally selected optimal classification model.
(2) experimental result
Different classifier algorithms are tested, the experimental results are evaluated, the optimal classifier model is selected, and it is verified with the test set.
A. Linearly separable SVM
The linearly separable SVM is implemented with Liblinear. The test results are shown in Table 3:
Table 3
accuracy | precision | recall | F1 |
92.9669% | 0 | 0 | 0 |
Because the number of examples with label=1 in the data set is far lower than the number with label=0 (the ratio is about 1:13), the linear SVM predicts 0 for every sample, which is clearly meaningless.
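The four reported metrics, and the reason an always-0 classifier scores high accuracy with zero precision, recall, and F1, can be illustrated with a small sketch. The toy labels below only mimic the roughly 1:13 class ratio; they are not the experimental data, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Toy labels at a 1:13 ratio, and a classifier that always predicts 0.
y_true = [1] + [0] * 13
always_zero = [0] * 14
```

On these toy labels the always-0 prediction is correct for 13 of 14 samples, yet it never identifies a hit, so precision, recall, and F1 are all 0. This is why accuracy alone is not a sufficient criterion for the unbalanced data set.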
B. Nonlinear SVM
Different types of nonlinear SVM are implemented with Libsvm. The test results are shown in Table 4:
Table 4
Kernel function | accuracy | precision | recall | F1 |
Linear kernel | 92.9669% | 0 | 0 | 0 |
Polynomial kernel | 92.9669% | 0 | 0 | 0 |
RBF kernel | 94.3549% | 0.599 | 0.596 | 0.597 |
Sigmoid kernel | 85.9684% | 0 | 0 | 0 |
The RBF kernel gives the best results: accuracy reaches 94.4%, and the prediction rate for label 1 also exceeds 50%.
C. Data splitting
The sub-training sets are trained with the libsvm RBF kernel type described above, because it performs best.
i. Two-way split
The training data with label=0 is randomly divided into two blocks, each of which forms a sub-training set with the data with label=1. Training yields two models, each of which predicts on the validation set, giving two outputs. The outputs are combined with either AND or OR to obtain the final classification result. The test results are shown in Table 5:
Table 5
Combination | accuracy | precision | recall | F1 |
AND | 94.1015% | 0.556 | 0.806 | 0.658 |
OR | 94.0866% | 0.554 | 0.811 | 0.659 |
ii. Four-way split
The training data with label=0 is randomly divided into four blocks, each of which forms a sub-training set with the data with label=1. Training yields four models, each of which predicts on the validation set, giving four outputs. The outputs are combined in one of four ways (all AND, all OR, AND first then OR, OR first then AND) to obtain the final classification result. The test results are shown in Table 6:
Table 6
iii. Eight-way split
The label=0 training data is randomly divided into eight blocks, and each block is combined with the label=1 data to form eight sub-training sets. Training yields eight models; each predicts the validation set, giving eight outputs, which are combined by one of four relations (all AND, all OR, AND first then OR, OR first then AND) to obtain the final classification result. The test results are shown in Table 7:
Table 7
Combination | accuracy | precision | recall | F1 |
---|---|---|---|---|
All AND | 91.5268% | 0.453 | 0.984 | 0.620 |
All OR | 91.2967% | 0.446 | 0.987 | 0.615 |
AND first, then OR | 91.4762% | 0.451 | 0.985 | 0.619 |
OR first, then AND | 91.3750% | 0.449 | 0.986 | 0.617 |
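The four ways of combining the model outputs can be sketched as below. The source does not spell out how the outputs are grouped in the two mixed rules, so this sketch assumes that adjacent outputs are paired (`group=2`), combined within each pair first, and then combined across pairs; the function names are hypothetical.

```python
def all_and(votes):
    """Predict 1 only if every model predicts 1."""
    return int(all(votes))


def all_or(votes):
    """Predict 1 if at least one model predicts 1."""
    return int(any(votes))


def _groups(votes, group):
    """Cut the vote list into consecutive groups of the given size."""
    return [votes[i:i + group] for i in range(0, len(votes), group)]


def and_then_or(votes, group=2):
    """AND within each group first, then OR across the group results."""
    return int(any(all(g) for g in _groups(votes, group)))


def or_then_and(votes, group=2):
    """OR within each group first, then AND across the group results."""
    return int(all(any(g) for g in _groups(votes, group)))


# Hypothetical outputs of eight models for one validation sample.
votes = [1, 1, 0, 0, 0, 1, 0, 0]
for rule in (all_and, all_or, and_then_or, or_then_and):
    print(rule.__name__, rule(votes))
```

Under this grouping, every sample accepted by all-AND is also accepted by AND-then-OR, and every sample accepted by OR-then-AND is also accepted by all-OR, which matches the ordering of the precision and recall values in Table 7.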
iv. Two-thirds split
The label=0 training data is randomly divided into three blocks, and every two blocks are combined with the label=1 data to form three sub-training sets. Training yields three models; each predicts the validation set, giving three outputs, which are combined by either an AND or an OR relation to obtain the final classification result. The test results are shown in Table 8:
Table 8
Combination | accuracy | precision | recall | F1 |
---|---|---|---|---|
AND | 94.3033% | 0.575 | 0.729 | 0.643 |
OR | 94.2959% | 0.574 | 0.734 | 0.644 |
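The construction of the three sub-training sets in the two-thirds split can be sketched as follows. This is an illustration only: the function name and the toy records are hypothetical, and the sketch assumes that "every two blocks" means each unordered pair of the three blocks.

```python
import random
from itertools import combinations


def two_thirds_sub_training_sets(negatives, positives, seed=0):
    """Split the label=0 records into three random blocks and build one
    sub-training set from each pair of blocks plus all label=1 records."""
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    blocks = [shuffled[i::3] for i in range(3)]
    return [(a + b, list(positives)) for a, b in combinations(blocks, 2)]


# Hypothetical records: 12 negatives and 2 positives.
negatives = list(range(12))
positives = [100, 101]

sub_sets = two_thirds_sub_training_sets(negatives, positives)
for neg, pos in sub_sets:
    print(len(neg), len(pos))
```

Each sub-training set holds two thirds of the label=0 records, so every negative record is seen by exactly two of the three models.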
D. Verification experiment
The experiments above show that the plain nonlinear SVM classifier with the RBF kernel has the highest accuracy, and that the two-way split classifier has the highest F1 value. Both of these best classification models are verified on the test set; the results are shown in Table 9:
Table 9
The verification shows that the performance of both classifiers is basically consistent with the test results above, confirming that they are indeed the best.
2. Multi-platform aviation electronics big data system test
A. Data acquisition module test
Run the software system, enter the data acquisition module, set the parameters and start acquisition. The output data block files are checked and found to be correct, proving that the acquisition function works normally. Different block selection parameters are then set and the sizes of the output data are checked; they differ, proving that the acquisition module can acquire data blocks of multiple formats.
B. Data storage module test
Run the software system, enter the data acquisition module, and start the demonstration. Observe the data on the Dashboard panel: as the program runs, the panel shows the status information of each node in the cluster in real time, and the flight data can be seen being stored, proving that the module can store the data of each node in real time.
C. Data relation analysis module test
Run the software system, enter the data relation analysis module, and train on the input data set with different kernel function selection parameters and split parameters. A classification model is obtained successfully in each case, proving that the module can perform data analysis in different ways.
D. Data relation analysis application module test
Run the software system, enter the data relation analysis application module, select the parameters and start the demonstration. The interface shows the flight data and the predicted strike results of all nodes in real time, proving that the module can store flight data and make predictions in real time.
E. Static node reduction test
Following the corresponding procedure, the system nodes are statically reduced from 6 to 4. Checking the number of Hadoop and HBase nodes in the cluster shows that it has become 4, which indicates that the system supports static node reduction.
F. Dynamic node addition test
Following the corresponding procedure, the system nodes are dynamically increased from the 4 of the previous test to 6, and the system software is run on the newly added nodes. Checking the node information on the data storage function interface of the system shows that the node count has changed from the original 4 to 6, which indicates that the system supports dynamic node addition.
Taking the above preferred embodiments of the present invention as inspiration, and through the above description, the relevant personnel can make various changes and modifications without departing from the scope of the technical idea of the present invention. The technical scope of this invention is not limited to the content of the specification; its technical scope must be determined according to the claims.
Claims (10)
1. A multi-platform aviation electronics big data system, characterized by comprising a data acquisition module, a data storage module, a data relation analysis module and a data relation analysis application module;
the data acquisition module obtains pcap data packet files from data source 1 and, after acquisition and classification, passes them to the data storage module, completing the data storage process; the data relation analysis module obtains training data from data source 2, completes the establishment of the data correlation model, and provides the model to the data relation analysis application module for use; the data relation analysis application module completes real-time prediction and displays the results on screen, and uses the cloud storage function implemented by the data storage module to complete real-time storage.
2. The system as claimed in claim 1, characterized in that the data acquisition module comprises an input folder path unit, an output folder path unit and a data block selection unit; the input folder path unit and the output folder path unit are used to read the input and output folder paths selected by the user, the data block selection unit is used to read the data block type selected by the user, and the data acquisition module carries out data acquisition according to the content read by the above units;
the data acquisition module uses the libpcap package to extract, from the pcap packets captured from the network, the key time information field, the source IP and destination IP information of the packet, and the data field that stores the information, namely the time field, the sourceIP field, the destIP field and the data field; the combination of destIP and sourceIP is used to simulate the data sending information in the scenario and to preliminarily determine the data block of the packet; different data blocks are distinguished and parsed in their respective formats to obtain independent data block structures, and the data structures are written back to the hard disk in text form for use in the next stage.
3. The system as claimed in claim 1, characterized in that the data storage module comprises a file path reading unit and a demonstration control unit; the file path reading unit is used to read the data source file storage path selected by the user; the demonstration control unit is used to demonstrate the storage status of the data by periodically reading the stored records and displaying them on the panel; the data storage module uses the Hadoop distributed storage platform and the HBase distributed database to obtain data from multiple aircraft in real time, store it on the multiple aircraft by means of cloud storage, and obtain and share the data of the multiple aircraft in real time.
4. The system as claimed in claim 1, characterized in that the data relation analysis module comprises a training data path unit, a training parameter selection unit and a data split mode selection unit; the training data path unit is used to read the training data storage path selected by the user, the training parameter selection unit is used to read each training parameter value selected by the user, the data split mode selection unit is used to read the data split mode selected by the user, and the data relation analysis module establishes and trains the model according to the content read by the above units;
the data relation analysis module uses an SVM classifier and the corresponding SVM code package to classify the existing data and analysis results by the SVM method; its core modules are a data splitting program and the invoked libsvm classifier package; the splitting program splits the records of the data source whose result is 0 into N parts, N being input by the user, and combines each part with the records whose result is 1 to form N training data sets; N models are output after training with libsvm, and during prediction the results predicted by the N models are combined by AND/OR operations to output the prediction result; the establishment of the data correlation model in the data relation analysis module is completed through input parameters specified by the user.
5. The system as claimed in claim 1, characterized in that the data relation analysis application module comprises a model path selection unit, a file path reading unit and a demonstration control unit; the model path selection unit is used to read the storage path of the trained model selected by the user, the file path reading unit is used to read the data source file storage path selected by the user, and the demonstration control unit analyzes the data using the model that was read and displays the prediction results on the panel.
6. An implementation method of the system as claimed in any one of claims 1-5, characterized by comprising the data acquisition implementation of the data acquisition module, the data storage implementation of the data storage module, the data correlation model establishment implementation of the data relation analysis module, and the real-time prediction result display implementation of the data relation analysis application module.
7. The method as claimed in claim 6, characterized in that the data acquisition implementation of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for a user operation;
3) obtain the parameters and call the processing routine;
4) judge whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
5) judge whether the file still contains data; if so, go to step 6), otherwise return to step 4);
6) judge whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).
8. The method as claimed in claim 6, characterized in that the data storage implementation of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and the column family;
3) import the local data into memory;
4) start the demonstration;
5) upload the real-time data to HBase while obtaining the data of all nodes from HBase in real time;
6) judge whether to end the demonstration; if so, end, otherwise return to step 4).
9. The method as claimed in claim 6, characterized in that the data correlation model establishment implementation of the data relation analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value;
2) scan the data again, scale it using the bounds, and then call the read_prob function to produce an svm_problem;
3) perform cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate the model and store it;
5) end.
10. The method as claimed in claim 6, characterized in that the real-time prediction result display implementation of the data relation analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and the column family;
3) import the local data into memory;
4) start the demonstration;
5) upload the real-time data to HBase, obtain the data of all nodes from HBase in real time, and then use the SVM algorithm to predict the result in real time;
6) judge whether to end the demonstration; if so, end, otherwise return to step 4).
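As an illustration of the field extraction recited in claim 2, the following is a minimal sketch that reads the time, sourceIP, destIP and data fields from a capture file. It is not the libpcap-based implementation of the claims: it parses the classic pcap file layout directly with the Python standard library, assumes Ethernet II frames carrying IPv4, and treats everything after the IPv4 header as the data field. The function name `read_pcap_records` is hypothetical.

```python
import socket
import struct


def read_pcap_records(raw):
    """Yield (time, sourceIP, destIP, data) for each Ethernet/IPv4 packet
    found in the bytes of a classic pcap file."""
    magic = struct.unpack("<I", raw[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"  # file written in little-endian byte order
    elif magic == 0xD4C3B2A1:
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(raw):
        # Per-packet record header: seconds, microseconds, captured and original length.
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", raw[offset:offset + 16])
        offset += 16
        frame = raw[offset:offset + incl_len]
        offset += incl_len
        # Ethernet II header is 14 bytes; EtherType 0x0800 marks IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip = frame[14:]
        header_len = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
        source_ip = socket.inet_ntoa(ip[12:16])
        dest_ip = socket.inet_ntoa(ip[16:20])
        yield ts_sec + ts_usec / 1e6, source_ip, dest_ip, ip[header_len:]
```

A caller would typically read a capture file in binary mode and iterate over `read_pcap_records(raw_bytes)`, using the (sourceIP, destIP) pair to decide which data block format applies before parsing the data field.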
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710367759.7A CN107229695A (en) | 2017-05-23 | 2017-05-23 | Multi-platform aviation electronics big data system and method |
PCT/CN2017/106322 WO2018214388A1 (en) | 2017-05-23 | 2017-10-16 | Multi-platform big data system and method for aviation electronics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710367759.7A CN107229695A (en) | 2017-05-23 | 2017-05-23 | Multi-platform aviation electronics big data system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107229695A true CN107229695A (en) | 2017-10-03 |
Family
ID=59933807
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710367759.7A Pending CN107229695A (en) | 2017-05-23 | 2017-05-23 | Multi-platform aviation electronics big data system and method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107229695A (en) |
WO (1) | WO2018214388A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807552B (en) * | 2019-10-30 | 2023-07-25 | 合肥工业大学 | Urban electric bus driving condition construction method based on improved K-means |
WO2021195943A1 (en) * | 2020-03-31 | 2021-10-07 | 深圳市大疆创新科技有限公司 | Flight record data storage method, flight record data acquisition method, and unmanned aerial vehicle |
CN111737529B (en) * | 2020-07-23 | 2020-12-18 | 北京东方通科技股份有限公司 | Multi-source heterogeneous data acquisition method |
CN113177022A (en) * | 2021-04-29 | 2021-07-27 | 东北大学 | Full-process big data storage method for aluminum/copper plate strip production |
CN113656370B (en) * | 2021-08-16 | 2024-04-30 | 南方电网数字电网集团有限公司 | Data processing method and device for electric power measurement system and computer equipment |
CN113688100B (en) * | 2021-09-06 | 2023-07-18 | 北京普睿德利科技有限公司 | Meteorological data processing method, device, terminal and storage medium |
CN115225730A (en) * | 2022-07-05 | 2022-10-21 | 北京赛思信安技术股份有限公司 | High-concurrency offline data packet analysis method supporting multiple tasks |
CN115168396A (en) * | 2022-07-15 | 2022-10-11 | 全图通位置网络有限公司 | Comprehensive intelligent platform data management method and system based on spatio-temporal system |
CN115474021B (en) * | 2022-07-19 | 2023-08-08 | 北京普利永华科技发展有限公司 | Satellite transponder data processing method and system under multi-component combined control |
CN115269704B (en) * | 2022-08-02 | 2023-08-18 | 贵州财经大学 | Multi-element heterogeneous agricultural data management system |
CN116303729B (en) * | 2023-05-17 | 2023-08-01 | 北京煜象软件技术有限公司 | Information acquisition method, device, equipment and medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104573068A (en) * | 2015-01-23 | 2015-04-29 | 四川中科腾信科技有限公司 | Information processing method based on megadata |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10644950B2 (en) * | 2014-09-25 | 2020-05-05 | At&T Intellectual Property I, L.P. | Dynamic policy based software defined network mechanism |
CN104394211A (en) * | 2014-11-21 | 2015-03-04 | 浪潮电子信息产业股份有限公司 | Design and implementation method for user behavior analysis system based on Hadoop |
CN107229695A (en) * | 2017-05-23 | 2017-10-03 | 深圳大学 | Multi-platform aviation electronics big data system and method |
Non-Patent Citations (3)
Title |
---|
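刘进军: "Research on an imbalanced-data classification algorithm based on penalized SVM and ensemble learning", 《计算机应用与软件》 (Computer Applications and Software) * |
勒加雷: "Embedded Protocol Stack μC/TCP-IP, Based on the STM32 Microcontroller" (《嵌入式协议栈μC\TCP-IP 基于STM32微控制器》), 31 January 2013 * |
高红旭, et al.: "Application of big data technology in civil aviation air traffic control monitoring systems", 《现代导航》 (Modern Navigation) * |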
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018214388A1 (en) * | 2017-05-23 | 2018-11-29 | 深圳大学 | Multi-platform big data system and method for aviation electronics |
WO2018214387A1 (en) * | 2017-05-23 | 2018-11-29 | 深圳大学 | Distributed mining system and method for aviation-oriented electronic data |
CN108092802A (en) * | 2017-12-04 | 2018-05-29 | 中国船舶重工集团公司第七〇九研究所 | The numerical prediction maintenance system and method for ocean nuclear power platform nuclear power unit |
CN108052616A (en) * | 2017-12-15 | 2018-05-18 | 四川汉科计算机信息技术有限公司 | Aviation big data intelligent analysis method based on remote embedded data acquisition |
CN108052617A (en) * | 2017-12-15 | 2018-05-18 | 四川汉科计算机信息技术有限公司 | Aviation big data intelligent analysis system based on remote embedded data acquisition |
CN108228378A (en) * | 2018-01-05 | 2018-06-29 | 中车青岛四方机车车辆股份有限公司 | The data processing method and device of train groups failure predication |
CN108650229A (en) * | 2018-04-03 | 2018-10-12 | 国家计算机网络与信息安全管理中心 | A kind of network application behavior parsing restoring method and system |
CN108650229B (en) * | 2018-04-03 | 2021-07-16 | 国家计算机网络与信息安全管理中心 | Network application behavior analysis and restoration method and system |
CN108762225A (en) * | 2018-04-24 | 2018-11-06 | 中国商用飞机有限责任公司北京民用飞机技术研究中心 | A kind of failure in flight control system copes with decision making device |
CN109408694A (en) * | 2018-09-25 | 2019-03-01 | 广东中标数据科技股份有限公司 | A kind of customs reaches a standard shipping bill analysis method, system and device |
CN109828988A (en) * | 2019-01-25 | 2019-05-31 | 重庆科技学院 | A kind of big data statistical method and the system for big data statistics |
CN112182094A (en) * | 2019-07-01 | 2021-01-05 | 成都启英泰伦科技有限公司 | Big data distributed storage method in voice data character text form |
CN110472122A (en) * | 2019-07-31 | 2019-11-19 | 重庆古扬科技有限公司 | A kind of dynamic distributed academic resources acquisition method of multichannel |
CN111078687A (en) * | 2019-11-14 | 2020-04-28 | 青岛民航空管实业发展有限公司 | Flight operation data fusion method, device and equipment |
CN111190992A (en) * | 2019-12-10 | 2020-05-22 | 华能集团技术创新中心有限公司 | Mass storage method and storage system for unstructured data |
CN111190992B (en) * | 2019-12-10 | 2023-09-08 | 华能集团技术创新中心有限公司 | Mass storage method and storage system for unstructured data |
CN111753926A (en) * | 2020-07-07 | 2020-10-09 | 广州驰兴通用技术研究有限公司 | Data sharing method and system for smart city |
CN111881213A (en) * | 2020-07-28 | 2020-11-03 | 东航技术应用研发中心有限公司 | System for storing, processing and using flight big data |
CN111881213B (en) * | 2020-07-28 | 2021-03-19 | 东航技术应用研发中心有限公司 | System for storing, processing and using flight big data |
CN112084148A (en) * | 2020-09-18 | 2020-12-15 | 陕西千山航空电子有限责任公司 | Comprehensive application platform for aviation objective information |
CN112416753A (en) * | 2020-11-02 | 2021-02-26 | 中关村科学城城市大脑股份有限公司 | Method, system and equipment for standardized management of urban brain application scene data |
CN113159371A (en) * | 2021-01-27 | 2021-07-23 | 南京航空航天大学 | Unknown target feature modeling and demand prediction method based on cross-modal data fusion |
CN114168243A (en) * | 2021-11-23 | 2022-03-11 | 广西电网有限责任公司 | Dashbird multi-chart-based system and method for dynamically merging data |
CN114168243B (en) * | 2021-11-23 | 2024-04-02 | 广西电网有限责任公司 | Data system and method based on dashboard multi-chart dynamic merging |
CN115378674A (en) * | 2022-08-11 | 2022-11-22 | 国网湖南综合能源服务有限公司 | Application method and system of shared energy storage cloud platform based on block chain technology |
CN115378674B (en) * | 2022-08-11 | 2023-11-03 | 国网湖南综合能源服务有限公司 | Application method and system of shared energy storage cloud platform based on blockchain technology |
CN115857899A (en) * | 2022-11-16 | 2023-03-28 | 电子科技大学 | Heterogeneous data packet-oriented analysis software automatic construction method |
CN115857899B (en) * | 2022-11-16 | 2023-12-15 | 电子科技大学 | Heterogeneous data packet-oriented automatic construction method for analysis software |
Also Published As
Publication number | Publication date |
---|---|
WO2018214388A1 (en) | 2018-11-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171003 |