WO2018214388A1 - Multi-platform avionics big data system and method - Google Patents

Multi-platform avionics big data system and method

Info

Publication number
WO2018214388A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
module
storage
association analysis
real
Prior art date
Application number
PCT/CN2017/106322
Other languages
English (en)
French (fr)
Inventor
毛睿
陆敏华
李荣华
王毅
廖好
周明洋
商烁
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学
Publication of WO2018214388A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Definitions

  • The invention belongs to the field of computers and relates to an aviation flight data system, in particular to a multi-platform avionics big data system; the invention also relates to a method for implementing the multi-platform avionics big data system.
  • Aviation flight operations are a large integrated system.
  • Throughout a flight, a large amount of data must be transmitted between departments, such as crew information, meteorological conditions, navigation information, route risk coefficient assessments, manifest information, takeoff data, special-circumstance plans, and other data.
  • Owing to technical and management constraints, data has traditionally been transferred by telephone, paper documents, manuals, and so on. These traditional methods have many shortcomings and have even become a bottleneck restricting the continued development of the civil aviation industry.
  • Aviation data has an extremely important impact on the safe take-off and economic benefits of each flight.
  • Aeronautical data is multi-source, complex, and large-scale, and existing single-platform data systems are of limited use for it. A multi-platform avionics big data system is therefore urgently needed for this multi-source, large-scale flight data.
  • The invention is built on existing distributed frameworks and database platforms.
  • The commonly used distributed frameworks and database platforms are described below.
  • Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. Users can develop distributed programs with it without knowing the underlying details of the distribution, taking advantage of the power of clusters for high-speed computing and storage. Simply put, Hadoop is a software platform that makes it easier to develop and run applications that process large-scale data. The platform is implemented in the object-oriented programming language Java and has good portability.
  • The core of Hadoop is HDFS and MapReduce.
  • HDFS (Hadoop Distributed File System)
  • HDFS has been specially optimized for the characteristics of massive data: files are large, read operations far outnumber write operations, and the commodity PCs used as nodes are prone to failure.
  • HDFS divides files into 64 MB blocks, which are distributed across the machines of the cluster and stored using the Linux file system. Each file is also stored with at least 3 redundant copies.
  • At the center is a NameNode, which locates file blocks based on the file index.
  • MapReduce is a programming model that extracts the elements to be analyzed from massive data and returns a result set. Most distributed operations can be abstracted into MapReduce operations. Map decomposes the input into intermediate key-value pairs. Reduce combines the key-value pairs output by Map according to their keys and outputs the final result. The programmer supplies these two functions to the system, and the underlying infrastructure distributes the Map and Reduce operations across the cluster and stores the results on HDFS.
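  • The Map and Reduce roles described above can be illustrated with a minimal single-process sketch. It is not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase`, and `run_mapreduce` are hypothetical, and the word-count task is chosen only as an example of decomposing input into key-value pairs and merging them by key.

```python
from collections import defaultdict


def map_phase(record):
    """Map: decompose one input record into intermediate (key, value) pairs."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group intermediate values by key (in Hadoop the framework does this)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that share a key into one result."""
    return key, sum(values)


def run_mapreduce(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_mapreduce(["takeoff data", "navigation data"]))
```

  • In a real cluster the map and reduce calls would run on different nodes and the result would be written to HDFS; here everything runs in one process for clarity.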
  • Hadoop has several advantages that make it easy for users to develop and run applications that process massive amounts of data.
  • High reliability: Hadoop automatically maintains multiple copies of data and automatically redeploys compute tasks after a task fails.
  • Hadoop can dynamically move data between nodes and keep the nodes dynamically balanced, so processing is very fast.
  • Hadoop can distribute and process data on a server group built from ordinary machines. Such a group can reach thousands of nodes, and each node runs the open source operating system Linux, so hardware costs are greatly reduced. In addition, unlike all-in-one appliances and commercial data warehouses, Hadoop is open source, so software costs are greatly reduced as well.
  • HBase is the abbreviation of Hadoop Database. It is a high-reliability, high-performance, column-oriented and scalable distributed storage system. Its main function is to store massive amounts of structured data in the form of column storage on the basis of Hadoop HDFS.
  • Tables stored in HBase mainly have the following characteristics.
  • A table can have billions of rows and millions of columns.
  • Each row has a sortable primary key and any number of columns. Columns can be dynamically added as needed. Different rows in the same table can have distinct columns.
  • Multiple versions of data: each cell can hold multiple versions of its data. By default, the version number is assigned automatically and is the timestamp at which the cell was inserted.
  • Data in HBase is stored as strings; there are no types.
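  • The table characteristics listed above (sortable row keys, per-row columns, timestamped cell versions, untyped string values) can be mimicked with a toy in-memory model. This is only an illustration and does not use the HBase API; the class `VersionedTable` and its methods are hypothetical names.

```python
import time


class VersionedTable:
    """Toy in-memory model of a sparse, versioned, string-valued table."""

    def __init__(self):
        # row key -> column name -> {timestamp: value}
        self._rows = {}

    def put(self, row, column, value, timestamp=None):
        # By default the version is the insertion time in milliseconds.
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        cell = self._rows.setdefault(row, {}).setdefault(column, {})
        cell[timestamp] = str(value)  # values are kept untyped, as strings

    def get(self, row, column, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        cell = self._rows.get(row, {}).get(column, {})
        return sorted(cell.items(), reverse=True)[:versions]

    def scan(self):
        """Yield (row key, column names) in sorted row-key order."""
        for row in sorted(self._rows):
            yield row, sorted(self._rows[row])
```

  • Different rows may carry different columns, and a column only exists for the rows that wrote to it, which is what makes the table sparse.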
  • HBase also has some drawbacks and scenarios to which it is not suited:
  • HBase does not support distributed transactions because it can only provide row locks.
  • HBase performs poorly for query operations such as join and group by.
  • The technical problem to be solved by the present invention is to provide a multi-platform avionics big data system that processes and analyzes large-scale flight data. The system integrates data collection, data classification management, data storage, and data analysis: it collects, classifies, and manages multi-source heterogeneous data and stores the data in real time on the “resource cloud” platform.
  • Client nodes of the “resource cloud” platform obtain data from the cloud in real time, and the cloud platform is used to keep the data up to date.
  • The system supports building correlation models from historical data and making real-time predictions from real-time data and the correlation models, providing guidance for pilots' decision-making.
  • The system needs to implement the following functions: flight data acquisition, real-time sharing of flight data, flight data correlation analysis, and real-time assisted decision making.
  • The present invention also provides a method for implementing the multi-platform avionics big data system.
  • The present invention provides a multi-platform avionics big data system comprising a data acquisition module, a data storage module, a data association analysis module, and a data association analysis application module.
  • The data acquisition module obtains the pcap packet file from data source 1 and, after collecting and sorting the data, passes it to the data storage module to complete the data storage process. The data association analysis module acquires training data from data source 2 and builds the data association model; the model is provided to the data association analysis application module, which performs real-time prediction and displays the result on the screen. The data association analysis application module uses the cloud storage function implemented by the data storage module for real-time storage.
  • The data collection module includes an input folder path unit, an output folder path unit, and a data block selection unit. The input folder path unit and the output folder path unit read the input and output folder paths selected by the user, and the data block selection unit reads the data block type selected by the user; the data collection module collects data according to the content read by these units. The data collection module uses the libpcap package to extract the key fields from the pcap packets captured from the network: the time information, source IP address, destination IP address, and data, stored as the time field, sourceIP field, destIP field, and data field, respectively. Using destIP and sourceIP together with the data transmission information of the simulation scenario, it makes an initial determination of which data block a packet belongs to. It then distinguishes the different data blocks, parses each according to its own format to obtain an independent data block data structure, and writes that data structure back to the hard disk as text for use in the next stage.
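  • The field extraction described above can be sketched as follows. The text names libpcap; this sketch instead reads the classic pcap file format directly with the Python standard library, which is a substitution made only to keep the example self-contained. It assumes Ethernet framing and IPv4, treats everything after the IP header as the data field, and does not handle pcapng files. The table `BLOCK_BY_ENDPOINTS` and its addresses are hypothetical.

```python
import socket
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800

# Hypothetical mapping from (sourceIP, destIP) to a data block name.
BLOCK_BY_ENDPOINTS = {("10.0.0.1", "10.0.0.2"): "radar"}


def parse_pcap(raw):
    """Yield time, sourceIP, destIP and data for each IPv4 packet in `raw`."""
    magic = struct.unpack("<I", raw[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(raw):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", raw[offset:offset + 16])
        offset += 16
        frame = raw[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < ETH_HEADER_LEN + 20:
            continue  # too short to hold Ethernet and IPv4 headers
        if struct.unpack("!H", frame[12:14])[0] != ETHERTYPE_IPV4:
            continue  # not IPv4
        ip = frame[ETH_HEADER_LEN:]
        ihl = (ip[0] & 0x0F) * 4  # IPv4 header length in bytes
        yield {
            "time": ts_sec + ts_usec / 1_000_000,
            "sourceIP": socket.inet_ntoa(ip[12:16]),
            "destIP": socket.inet_ntoa(ip[16:20]),
            "data": ip[ihl:],
        }


def classify(packet):
    """Initial data block decision from the source and destination IPs."""
    key = (packet["sourceIP"], packet["destIP"])
    return BLOCK_BY_ENDPOINTS.get(key, "unknown")
```

  • Each parsed record could then be decoded according to the format of its data block and written out as text, as the paragraph above describes.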
  • The data storage module includes a read file path unit and a presentation control unit. The read file path unit is configured to read the data source file storage path selected by the user; the presentation control unit, while storing the demo data, periodically reads the storage records and displays them on the panel. The data storage module uses the Hadoop distributed storage platform and the HBase distributed database to acquire data from multiple aircraft in real time and then stores the data across the multiple aircraft by means of cloud storage, so that the data of multiple aircraft is acquired and shared in real time.
  • The data association analysis module includes a training data path unit, a training parameter selection unit, and a data segmentation mode selection unit.
  • The training data path unit is configured to read the training data storage path selected by the user.
  • The training parameter selection unit is configured to read each training parameter value selected by the user, and the data segmentation mode selection unit is configured to read the data segmentation mode selected by the user. The data association analysis module builds and trains the model according to the content read by these units.
  • The data association analysis module uses an SVM classifier, with the corresponding code using an SVM package, to classify the existing data and analysis results by the SVM method. Its core consists of a data splitting program and calls to the libsvm classifier package. The splitting program divides the records whose result is 0 into N parts, where N is input by the user, and combines each part with the records whose result is 1 to form N training data sets; training with libsvm then outputs N models. Prediction applies the N models and combines their results with an AND/OR operation to output the prediction result. The data association analysis module builds the association model from input parameters specified by the user.
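  • The splitting step described above can be sketched as follows. The function name `split_training_sets`, the `(features, label)` record layout, and the seeded shuffle before splitting are assumptions made for illustration; the text only states that the records with result 0 are divided into N parts and that each part is combined with the records with result 1.

```python
import random


def split_training_sets(records, n, seed=0):
    """Split label-0 records into n parts and pair each part with all
    label-1 records, giving n training sets.

    `records` is a list of (features, label) tuples with label 0 or 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    majority = [r for r in records if r[1] == 0]
    minority = [r for r in records if r[1] == 1]
    # Shuffling is an assumption; it avoids order effects in the split.
    random.Random(seed).shuffle(majority)
    # Striding distributes the label-0 records as evenly as possible.
    parts = [majority[i::n] for i in range(n)]
    return [part + minority for part in parts]
```

  • Each of the N returned sets would then be passed to the SVM trainer separately, producing the N models mentioned above.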
  • The SVM classifier is preferably a nonlinear SVM classifier using an RBF kernel, and is preferably a two-segment classifier.
  • The data association analysis application module includes a model path selection unit, a read file path unit, and a presentation control unit.
  • The model path selection unit is configured to read the training model storage path selected by the user.
  • The read file path unit is configured to read the data source file storage path selected by the user, and the presentation control unit analyzes the data using the loaded model and displays the predicted result on the panel.
  • The present invention also provides a method for implementing the above system, comprising the data collection implementation of the data collection module, the data storage implementation of the data storage module, the establishment of the data association model of the data association analysis module, and the real-time prediction result display implementation of the data association analysis application module.
  • the data collection implementation of the data collection module includes the following steps:
  • step 5 judging whether there is still data in the file, if yes, proceed to step 5), otherwise return to step 4);
  • step 6 judging whether the data block is required by the user, if yes, proceed to step 7), otherwise return to step 5);
  • the data storage implementation of the data storage module includes the following steps:
  • the establishing the data association model of the data association analysis module includes the following steps:
  • the real-time prediction result display implementation of the data association analysis application module includes the following steps:
  • Compared with the prior art, the multi-platform avionics big data system provided by the present invention has the following beneficial effects:
  • The system integrates data collection, data classification management, data storage, and data analysis. It collects, classifies, and manages multi-source heterogeneous data and stores the data in real time on the “resource cloud” platform; client nodes of the “resource cloud” platform obtain data from the cloud in real time, and the cloud platform is used to keep the data up to date.
  • The system supports building correlation models from historical data and making real-time predictions from real-time data and the correlation models, providing guidance for pilots' decision-making.
  • The system needs to implement the following functions: flight data acquisition, real-time sharing of flight data, flight data correlation analysis, and real-time assisted decision making.
  • The present invention is the first in the field to optimize the Hadoop distributed storage platform and the HBase distributed database and apply them to an avionics big data system.
  • The present invention collects, stores, and shares large-scale avionics data in a real-time, distributed manner, and uses analysis of historical data to predict fire-strike results from real-time data, providing effective decision-making guidance for pilots with a prediction success rate of 94%.
  • The present invention uses a machine learning classification algorithm to predict the result of a fire strike in flight. Compared with the previous approach of directly simulating the flight process in software, the method is many times faster while maintaining a certain accuracy, which improves the decision-making efficiency of the systematic confrontation system. Because hits in a strike are far fewer than misses, the training data is unbalanced, which affects the accuracy of the decision; a data segmentation method is therefore applied on top of the SVM to improve accuracy. With the decision support function integrated into the avionics system, the stored data can be used to train the classifier, the trained classifier can be used for real-time fire-strike prediction, and the aircraft can provide decision suggestions in real time according to the prediction result.
  • In the system of the present invention, the nonlinear SVM classifier using the RBF kernel has the highest accuracy and the two-segment classifier has the highest F1 value, so both are preferred.
  • The system of the present invention supports static reduction of nodes and dynamic addition of nodes.
  • Figure 1 is a block diagram showing the structure of a data storage module in the system of the present invention.
  • Figure 2 is a diagram showing an example of a nonlinear SVM in a data association analysis module in the system of the present invention.
  • Figures 3 and 4 are diagrams showing an example of data segmentation in a data association analysis module in the system of the present invention.
  • Figure 5 is a general framework diagram of a multi-platform avionics big data system of the present invention.
  • Figure 6 is a functional block diagram of a multi-platform avionics big data system of the present invention.
  • Figure 7 is a flow chart showing the main routine of the multi-platform avionics big data system of the present invention.
  • Figure 8 is a diagram showing an example of a data association analysis application module in the system of the present invention.
  • Figure 9 is a logic flow diagram of a data acquisition module in the system of the present invention.
  • Figure 10 is a logic flow diagram of a data storage module in the system of the present invention.
  • Figure 11 is a logic flow diagram of a data association analysis module in the system of the present invention.
  • Figure 12 is a logic flow diagram of a data association analysis application module in the system of the present invention.
  • The present invention simulates a flight node group with multiple dedicated test devices, captures the data generated by sensors in the actual flight environment as the data source, and connects the dedicated test devices through a switch to build a local area network that simulates the system.
  • A decision maker can view the real-time information of every device node in the node group through any dedicated test device and make decisions based on that data.
  • The present invention proposes a multi-platform avionics big data system based on a “resource cloud”.
  • The core of the multi-platform avionics big data system of the present invention is to build a “resource cloud” platform in a virtual combat environment, that is, a data sharing platform on multiple dedicated test devices. It is built on existing open source cloud software (Hadoop, HBase) and mainly provides real-time information sharing between flight nodes, reliable storage, and information processing.
  • The platform's data source is the data produced by data collection and classification. The original data is collected and classified by the data acquisition module and then transmitted to the “resource cloud” platform.
  • The data analysis module on each node obtains the information of all nodes from the “resource cloud” platform in real time, analyzes the data using the data association model built from historical data, and presents each node's analysis results to the decision makers to guide their decisions.
  • The data of the hypothetical combat environment comes from different sensors, and the heterogeneity of the data directly causes the data complexity of an avionics big data system for system-level operations.
  • The collected data must be analyzed from multiple angles, for both platform and data, using big data classification technology based on the air cloud platform, strengthening the associations within the data and thereby reducing the data complexity of the avionics big data system for system-level operations.
  • Different data acquisition and preprocessing modes need to be constructed for the multi-mode potential space.
  • The present invention collects data in the classification environment by capturing data packets one by one according to the data protocol, and uses the captured data as the data source of the “resource cloud” platform.
  • The packet capture program Wireshark obtains the packets. Wireshark decodes the binary data captured from the network according to the structure of each protocol and displays it in the Packet Details panel. The display mainly includes a data frame summary for the physical layer, the Ethernet frame header of the data link layer, the IP packet header of the Internet layer, the segment header of the transport layer, and the application layer information.
  • The process uses libpcap, a powerful network packet capture library that can intercept packets by network interface, port, and protocol.
  • Network data collection obtains data from websites through web crawlers or public website APIs.
  • That method extracts unstructured data from web pages and stores it in a structured manner as unified local data files. It supports the collection of files, audio, video, and other files or attachments, and attachments can be automatically associated with the text.
  • The collection of network traffic can be handled using bandwidth management techniques such as DPI or DFI.
  • The present invention does not use crawlers to actively acquire data; it preferably uses the Wireshark software to capture, from the network switch in the actual combat environment, the data packets on the systematic confrontation platform.
  • The data acquisition module parses the data packets obtained from the switch and divides the data into structured and unstructured data according to each packet's source IP address, destination IP address, and protocol.
  • The basic data obtained is classified as follows:
  • Structured data, that is, row data
  • The data is distributed to different receiving targets, such as drones, radar simulation, optoelectronics, electronic stations, 3D voice alarms, cockpits, etc.
  • Data collection divides the data into multiple data blocks.
  • Each data block includes basic information such as data type, source, destination number, block length, update period, virtual link, maximum delay time, and receiving port.
  • The main content of the different data blocks can be structured and summarized.
  • Unstructured data mainly includes pictures, audio, video, hypermedia, and the like, such as radar weather images, geographical distribution images, detected enemy sound waves, and video streams. Such data has no fixed structure. Unlike structured data, unstructured data is not conveniently represented in the two-dimensional logical tables of a database; however, a non-relational database on a distributed cloud storage platform can store unstructured data efficiently and stably. For unstructured data, an interface is reserved for this processing.
  • Each combat node generates key avionics information in real time, including node carrier data, target information, etc.; other nodes acquire this information in real time and add it to the systemized decision-making system.
  • The cloud platform's high fault tolerance, real-time performance, and reliability ensure that all avionics information is accessible in real time and is not lost.
  • The traditional “resource cloud” framework comes in several types. In the first, the original data is collected on the client, and the client then transmits the data to each storage node for distributed storage. In the second, the data is collected locally on a node and then distributed to all storage nodes.
  • The present invention is characterized in that the data source is the same as the data storage destination: data is acquired from multiple aircraft in real time, stored across the multiple aircraft by means of cloud storage, and acquired and shared among the aircraft in real time. The present invention therefore uses the second type of framework.
  • the hypothetical platform generally adopts a master/slave structure model (as shown in Figure 1), consisting of one master node and several slave nodes.
  • the master node acts as the primary server, managing file system namespaces and client access to files.
  • the slave node acts as the slave server and is responsible for the storage of data.
  • the system uses a "write-once-read-many" model that reduces concurrency control requirements, simplifies data aggregation, and supports high-throughput access.
  • The “resource cloud” platform divides large files into fixed-size blocks by file segmentation, stores the partition table, and makes multiple copies of each block, which are stored on different nodes. When a file is read, the blocks are spliced back together one by one according to the partition table and the file is returned to the user.
  • After the data source is collected and classified, it is temporarily cached on the local node. Because of the complexity of the nodes, the information files stored by some nodes are larger than the default file size of the cloud platform. File segmentation saves line bandwidth on the one hand and increases system fault tolerance on the other.
  • The “resource cloud” of the system of the present invention divides a file into multiple blocks in physical storage and uses a hash calculation to distribute them across multiple nodes of the cluster, which allows the distributed storage system to hold sufficiently large files. Compared with sending an undivided file to one specified machine, file segmentation saves point-to-point communication bandwidth and balances the system load to some extent. Moreover, if a single node fails and its information cannot be read, the file can still be recovered by splicing together the blocks held on other nodes.
  • Each data block is redundantly backed up to other nodes by means of a hash algorithm.
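  • The segmentation and hash-based redundant placement described above can be sketched as follows. The text does not specify the hash algorithm or the placement rule, so the SHA-256 hash of the file name and block index, the choice of consecutive nodes for replicas, and all function names here are assumptions for illustration.

```python
import hashlib


def split_blocks(data, block_size):
    """Divide a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_block(file_name, index, nodes, replicas=3):
    """Pick distinct nodes for one block from a hash of its identity."""
    digest = hashlib.sha256(f"{file_name}:{index}".encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + k) % len(nodes)] for k in range(count)]


def build_partition_table(file_name, data, nodes, block_size=64 * 1024 * 1024):
    """Map each block index to the nodes that hold a copy of that block."""
    blocks = split_blocks(data, block_size)
    return {i: place_block(file_name, i, nodes) for i in range(len(blocks))}


def readable_after_failure(table, failed_node):
    """True if every block still has a copy on some surviving node."""
    return all(any(node != failed_node for node in holders)
               for holders in table.values())
```

  • With more than one copy per block, the loss of any single node leaves every block readable, which is the recovery property the paragraphs above rely on.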
  • The redundancy fault tolerance of the cloud platform is based on the fault-tolerance mechanism of HDFS. The main points are as follows:
  • The master node divides the file, marks each part, and records the partition table of the blocks as the basis for replica decisions; according to the content of the partition table, each block is redundantly backed up to the corresponding other nodes through the hash algorithm.
  • Backup of the master node is handled by ZooKeeper. All the nodes elect a master node and a backup-master node.
  • The backup-master node periodically snapshots the master node so that the backup-master's information does not fall behind the master's. When the heartbeat mechanism detects that the master node has crashed, the backup-master replaces the master node, and another backup-master node is selected through ZooKeeper's election mechanism to back up the content of the current master node.
  • Timed snapshots store a copy of the data at a specific time. A snapshot can roll the failed cluster back to a previous normal point in time.
  • When creating data, the client initially caches the file data in a local temporary file. The application's write operations are transparently redirected to this temporary local file. When the local file accumulates to one block in size, the client notifies the master node. The master node inserts the file name into the file system hierarchy and allocates a data block for it.
  • The master node replies to the client's request with the data node ID (possibly several, the nodes where the replica data blocks are stored) and the identifier of the target data block. After receiving the reply, the client flushes the local temporary file to the data block on the specified data node. The local cache is used because, if the client wrote directly to the remote file system without any local cache, network speed and network throughput would be badly affected.
  • When the file is closed, the residual data in the local temporary file that has not yet been uploaded is forwarded to the data node.
  • The client can then notify the master node that the file has been closed.
  • The master node then adds the file creation operation to the persistent store. If the master node dies before the file is closed, the file is lost.
  • Pipelined replication: when the client writes data to a file, the data is first written to a local file, as described above. Assuming that the file's replication factor is 3, when the local file accumulates to one data block, the client obtains a list of data nodes from the master node. This list contains the data nodes that will hold copies of the data block. The client then flushes the data block to the first data node. The first data node receives the data in small units of 4 KB, writes each unit to its local store, and transmits it to the second data node in the list. Similarly, the second data node writes each unit to its local store and forwards it to the third data node, and the third data node writes it to its local store. A data node can thus receive data from the previous node while streaming it to the next node, so the data is pipelined from one data node to the next.
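  • The pipelined replication described above can be simulated in a few lines. This is a sequential, single-process sketch rather than HDFS code: real data nodes receive and forward concurrently over the network, whereas here each 4 KB unit is simply handed down the list in order. The function name `pipeline_write` and the node names are hypothetical.

```python
def pipeline_write(block, pipeline, chunk_size=4 * 1024):
    """Stream one block through the listed data nodes, unit by unit.

    Returns a dict mapping each node name to the bytes it stored.
    """
    stores = {node: bytearray() for node in pipeline}
    for offset in range(0, len(block), chunk_size):
        chunk = block[offset:offset + chunk_size]
        # Each node writes the unit locally, then passes it to the next node.
        for node in pipeline:
            stores[node].extend(chunk)
    return {node: bytes(buf) for node, buf in stores.items()}


if __name__ == "__main__":
    copies = pipeline_write(b"\x01" * 10000, ["dn1", "dn2", "dn3"])
    print({node: len(data) for node, data in copies.items()})
```

  • Because the unit is forwarded as soon as it is written, the first node does not need to hold the whole block before the second node starts receiving it.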
  • historical firepower information can be used to assist decision-making.
  • a classifier model of flight state which can be used to predict node firepower strike results.
  • the problem can be approximated as an input.
  • the classifier model analyzes the commonly used two-class classifiers and derives an optimal classifier model for application to the decision system.
  • The margin is the minimum, over all data points, of the geometric margin to the hyperplane. From a statistical point of view, since positive and negative samples can be regarded as random samples from two different distributions, the larger the distance between the classification boundary and the two distributions, the smaller the probability that a sample falls on the wrong side of the classification boundary. Maximizing the margin therefore minimizes the worst-case generalization error and makes the classifier more confident.
  • For the hyperplane $w^T x + b = 0$,
  • the functional margin of the sample point $(x_i, y_i)$ with respect to the hyperplane is defined as: $\hat{\gamma}_i = y_i (w^T x_i + b)$
  • and the geometric margin is: $\gamma_i = \dfrac{y_i (w^T x_i + b)}{\lVert w \rVert}$
  • where $i = 1, \dots, N$ and $N$ is the number of sample points.
  • The minimum functional margin over all sample points in the training set $T$ is defined as: $\hat{\gamma} = \min_{i = 1, \dots, N} \hat{\gamma}_i$
  • The margin of the hyperplane is the minimum geometric margin over all sample points in $T$: $\gamma = \min_{i = 1, \dots, N} \gamma_i$
  • When the decision surface is a curved surface, a suitable mapping turns it into a hyperplane in a high-dimensional space, so that the problem can be solved with the linearly separable SVM method.
  • For example, suppose the two classes of data are distributed in the shape of two circles (as shown in Figure 2).
  • Such data is itself linearly inseparable.
  • The ideal decision boundary is a circle rather than a line (hyperplane).
  • The approach is to select a kernel function that maps the data into a high-dimensional space, where the problem becomes linearly separable; this resolves the linear inseparability in the original space, and the data can then be handled by the linearly separable SVM algorithm.
  • The common kernel functions in SVM are the linear kernel (equivalent to the linearly separable SVM), the polynomial kernel, the RBF kernel, and the sigmoid kernel. Their specific forms are shown in Table 1.
  • the proportions of the two classes in the sample data set differ greatly, which causes an imbalance problem. The approach tried is to split the class that makes up the larger share of the training set into several blocks; each block is combined with the samples of the other class to form a sub-training set, and each sub-training set is trained to obtain a sub-classification model.
  • the sub-classification models can be combined through certain operations into a new classifier that predicts the data. This treatment can mitigate the data imbalance problem to some extent.
  • each sub-classification model predicts the input data, giving four outputs.
  • the four outputs can be ANDed to obtain the final output, which is equivalent to a new classifier.
  • the schematic diagram is shown in Figures 3 and 4.
  • the multi-platform avionics big data system of the present invention is generally divided into four modules, a data acquisition module, a data storage module, a data association analysis module, and a data association analysis application module.
  • the data acquisition module obtains pcap packet files from data source 1.
  • after collection and classification, the data is passed to the data storage module, completing the data storage process.
  • the data association analysis module obtains training data from data source 2, builds the data association model from input parameters specified by the user, and provides the model to the data association analysis application module, which performs real-time prediction and displays the result on the screen.
  • the data association analysis application module uses the cloud storage function implemented by the data storage module to store data in real time.
  • the libpcap package is used to extract from the captured pcap packets the key time field, the packet's source IP and destination IP, and the data field carrying the payload, namely the time, sourceIP, destIP and data fields; destIP and sourceIP, combined with the data transmission information of the simulated scenario, are used to make an initial determination of the data block carried by the packet.
  • the different data blocks are distinguished and parsed according to their respective formats to obtain independent data block structures, which are written back to the hard disk as text for use in the next stage.
  • the data collection module includes an input folder path unit, an output folder path unit, and a data block selection unit.
  • the input folder path unit and the output folder path unit read the input and output folder paths selected by the user, the data block selection unit reads the data block type selected by the user, and the data acquisition module collects data according to what these units have read.
  • the data collection module logic flow includes the following steps:
  • step 5) determine whether the file still contains data; if yes, proceed to step 6), otherwise return to step 4);
  • step 6) determine whether the data block is one the user requested; if yes, proceed to step 7), otherwise return to step 5);
  • the data reliability and fault tolerance of the platform is achieved by the redundant backup function of HDFS.
  • the project manages all data in a standardized way.
  • a distributed database is implemented on top of HBase.
  • Hadoop's HDFS is used to complete reliable storage.
  • Hadoop's MapReduce framework is used to accelerate system data query operations.
  • the HBase table is designed as follows:
  • each data packet corresponds to one rowKey, and each rowKey contains the information of only one data block.
  • HBase uses column-oriented storage to ensure the reliability of the system data.
  • the module's run consists of two steps, data storage and data display.
  • Data storage: a record is emitted every 40 ms and stored in HBase. Because the sample data set is small, once it has been read through, emission starts again from the first record.
  • Data display: a separate thread handles reading; every 10 ms it queries HBase for all records between the previous query's timestamp and the current timestamp, takes the last of those records, and displays it on the screen in real time.
  • the data storage module includes a read file path unit and a presentation control unit for data storage presentation.
  • the read file path unit is used to read the data source file storage path selected by the user, and the demo control unit is used to demonstrate the storage of the data. It periodically reads the storage record and displays it on the panel.
  • the data storage module logic flow includes the following steps:
  • This part mainly uses an SVM classifier (the SVM package in the code), which classifies the existing data and analysis results with the SVM method.
  • its core components are the data splitting program and the libsvm classifier package that it calls.
  • the splitting program divides the records whose result is 0 into N parts (N is entered by the user) and combines each part with the records whose result is 1 to form N training data sets; training with libsvm outputs N models, and all N models are used at prediction time.
  • the outputs of the N models are combined with AND/OR operations to produce the final prediction.
  • the running process mainly includes the following three steps.
  • Data normalization: scan the data set, extract the upper and lower bounds, and normalize the data so that each variable has a balanced influence on the result.
  • the data association analysis module includes a training data path unit, a training parameter selection unit, and a data segmentation mode selection unit for establishing a model and performing model training.
  • the training data path unit is configured to read the training data storage path selected by the user
  • the training parameter selection unit is configured to read each training parameter value selected by the user
  • the data segmentation mode selection unit is configured to read the data segmentation mode selected by the user
  • the data association analysis module builds and trains the model based on what these units have read.
  • the data association analysis module logic flow includes the following steps:
  • the overall design principle of the application module is to use the data storage module to complete the storage, and use the optimal model output by the data association analysis module as the input model to predict any data in real time, as shown in FIG. 8 .
  • the running process mainly includes the following three steps.
  • Initialization: initialize the HBase connection, create the table and the column families, and read from the hard disk the contents of the file to be stored.
  • Data generation: a record is emitted every 40 ms and stored in HBase. Because the sample data set is small, once it has been read through, emission starts again from the first record.
  • Data display: a separate thread handles reading; every 10 ms it queries HBase for all records between the previous query's timestamp and the current timestamp, takes the last of those records, uses it to call the SVM for real-time prediction, and displays the result on the screen.
  • the data association analysis application module includes a model path selection unit, a read file path unit, and a presentation control unit for data analysis and demonstration.
  • the model path selection unit is configured to read the training model storage path selected by the user
  • the read file path unit is configured to read the data source file storage path selected by the user
  • the demonstration control unit analyzes the data with the loaded model and displays the prediction result on the panel.
  • the data association analysis application module logic flow includes the following steps:
  • Data collection is the data foundation of the "resource cloud" platform of the multi-platform avionics big data system and supplies the software with its data source for analysis.
  • the data source must be data captured at the switch in the actual operating environment with the Wireshark software.
  • the data must be in pcap format, and the destination IP and source IP of the packets must satisfy the following requirements:
  • Data block name                  Destination IP/Source IP
    Networking instruction           224.224.0.110
    Demo scene information           224.224.0.107/224.224.0.108
    Integrated target data block     224.224.0.89
    Helicopter carrier data block    224.224.0.140
    Demo control information
  • the data block in the data field of each packet conforms to the "XX type demo system data interface protocol".
  • the data storage module is the core of the multi-platform avionics big data system, and the module completes the system data storage function.
  • This module accepts the output of the "Data Acquisition" module. The input is a complete txt text file in which each line is the content of one parsed packet, with fields separated by commas. The fields of each packet are as follows:
  • Helicopter carrier data block: packet timestamp, data block ID, data block time, longitude, latitude, altitude, pitch angle, roll angle, true heading angle, angle of attack, ground speed, north speed, east speed, vertical speed
  • Integrated target data block: packet timestamp, data block ID, data block time, number of targets, target 1 attribute, target 1 longitude, target 1 latitude, target 1 altitude, target 1 azimuth, target 1 pitch angle, target 1 north speed, target 1 east speed, target 1 vertical speed, target 2 attribute, ..., target 20 vertical speed
  • the main function of the data association analysis module is to build a data model by analyzing historical data.
  • This module takes a set of training data as input and completes the data modeling process with the SVM classifier and the partition classification strategy.
  • the training data must be data collected by the STK simulation software.
  • the format is seven input variables followed by one 0/1 result value. All fields are separated by tab characters ("\t").
  • the field information is as follows:
  • This module uses the output file of the STK simulation software as the prediction data source and the model output by the "data association analysis part" as the input model.
  • on top of the data storage module, it predicts results in real time and displays them on the interface in real time.
  • the field information of this part's input data format (i.e. the STK simulation software output file) is as follows:
  • the data structure mainly used in software implementation corresponds to some data structures in the data protocol.
  • the definition data structure stores the data parsed from the binary file.
  • the implemented data structure has the following:
  • FrameHeader / / store the data header information of each packet
  • SingleTargetParameter// stores all information for a single target
  • Interrgrated_Target_Parm// stores integrated target data block field information, which may contain multiple SingleTargetParameters internally
  • Each data block is recognized as row data once it enters the system and is recorded as a Record; the length and type fields are used to distinguish the categories.
  • each column family is composed of multiple attributes, each attribute corresponding to a field in the data block.
  • the column families in the table are as follows:
  • the columns in each column family are the fields of the corresponding block; CF_EmptyString contains packet information stored as text, such as "87a34b2345f86544e", and CF_Unrecognised contains only type information.
  • the row key uses a custom format, "row" + system time + helicopterId; for example, the row key "row147926317632301" denotes data stored to the cloud platform by the node numbered 01 at system time 1479263176323 (the number of milliseconds since 0:00 on January 1, 1970).
  • the classes involved in the implementation are mainly the Record class that records row information, the HBaseEngine class that calls the underlying HBase, and the SVMEngine class that calls the SVM classifier; each carries out its calls through the member variables of the class.
  • the constants involved in implementing the design are mainly field name information, which is large in number and will not be listed in detail here.
  • the original data is evenly divided into three sets of train set, validation set, and test set according to the ratio of 50%, 25%, and 25%.
  • the train set is used to train the classifier;
  • the validation set is used to test the performance of different classifiers, determine the network structure of the classification model or the parameters that control the complexity of the model;
  • the test set is used to test the performance of the final selected optimal classification model.
  • The different classifier algorithms are tested experimentally, the results are evaluated, the best classifier model is selected, and it is verified with the test set.
  • Kernel function      Accuracy    Precision  Recall  F1
    Linear kernel        92.9669%    0  0  0  0
    Polynomial kernel    92.9669%    0  0  0  0
    RBF kernel           94.3549%    0.599  0.596  0.597
    Sigmoid kernel       85.9684%    0  0  0
  • the RBF kernel function performs best: its accuracy is 94.4%, and its prediction rate for class 1 is above 50%.
  • the sub-training sets are trained with the RBF kernel type of libsvm mentioned earlier because it works best.
  • Table 5 The test results are shown in Table 5 below:
  • the final classification result is obtained by combining the outputs with AND-then-OR or OR-then-AND.
  • Table 6 The test results are shown in Table 6 below:
  • the final classification result is obtained by combining the outputs with AND-then-OR or OR-then-AND.
  • Table 7 The test results are shown in Table 7 below:
  • Table 8 The test results are shown in Table 8 below:
  • Run the software system, enter the data acquisition module, set the parameters and start collecting. Check the output data block files; all are correct, which shows that the collection function works. Set different data block selection parameters and check the output data size; the sizes differ, which shows that the acquisition module can collect each individual data block type.
  • Run the software system, enter the data acquisition module, and start the demo. Observe the data on the Dashboard panel.
  • the panel displays the status information of each node in the cluster in real time, and the flight data can be seen being stored, which shows that the module can store the data of each node in real time.
  • Run the software system, enter the data association analysis module, choose different kernel function parameters and segmentation parameters, and train on the input data set; a classification model is obtained successfully, which shows that the module can analyze the data by different methods.
  • Run the software system, enter the data association analysis application module, select the parameters, and then start the demonstration.
  • the interface displays the flight data and predicted firepower results of all nodes in real time, which shows that the module can store and predict flight data in real time.
  • the number of system nodes is statically reduced from 6 to 4.
  • the number of Hadoop and HBase nodes in the cluster changes to 4, indicating that the system supports static reduction of nodes.
  • the system node is dynamically increased from 4 in the previous test to 6 and the system software is run on the newly added node.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

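The invention discloses a multi-platform avionics big data system comprising a data acquisition module, a data storage module, a data association analysis module, and a data association analysis application module. The data acquisition module obtains pcap packet files from data source 1 and, after collection and classification, passes them to the data storage module, completing the data storage process. The data association analysis module obtains training data from data source 2, builds the data association model, and provides the model to the data association analysis application module, which performs real-time prediction and displays the results on the screen; the data association analysis application module uses the cloud storage function implemented by the data storage module to store data in real time. The invention also discloses a method for implementing the system. The system integrates data acquisition, data classification management, data storage, and data analysis; it collects and classifies multi-source heterogeneous data and stores it in real time on the "resource cloud" platform, ensuring that the data is current.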

Description

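Multi-platform avionics big data system and method
Technical field
The invention belongs to the field of computing and relates to an aviation flight data system, in particular to a multi-platform avionics big data system; the invention further relates to a method for implementing this system.
Background
Flight operations form a large, integrated system. Throughout a flight, large volumes of many kinds of data must be passed between departments and posts, such as crew information, weather conditions, navigation information, route risk assessments, load sheet information, takeoff data, and contingency plans. Constrained by technology and management practice, such data has traditionally been passed by telephone or by distributing paper documents and manuals. These traditional means have many shortcomings and have even become a bottleneck limiting the further development of civil aviation. Aviation data has an extremely important effect on the safe departure and the economics of every flight. Aviation data is multi-source, complex, and large in scale, and existing single-platform data systems are of limited use; a multi-platform avionics big data system is therefore urgently needed for such multi-source, large-scale flight data.
The invention is built on existing distributed frameworks and database platforms. The commonly used distributed frameworks and database platforms are described below.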
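1. Hadoop
Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed programs with it without understanding the low-level details of distribution, making full use of a cluster for high-speed computation and storage. Put simply, Hadoop is a software platform that makes it easier to develop and run applications that process large-scale data. The platform is implemented in the object-oriented programming language Java and is highly portable.
The core of Hadoop is HDFS and MapReduce.
HDFS (Hadoop Distributed File System) is a distributed file system that hides details such as load balancing and redundant replication in the lower layers and offers upper-layer programs a unified file system API. HDFS is specifically optimized for the characteristics of massive data, including access to very large files, read operations far outnumbering writes, and commodity PCs that fail easily and cause node outages. HDFS divides files into 64 MB blocks, distributes them over the machines of the cluster, and stores them using the Linux file system. Each block also has at least 3 redundant copies. At the center is a NameNode, which locates file blocks according to the file index.
MapReduce is a programming model for extracting elements for analysis from massive data and finally returning a result set; most distributed computations can be abstracted as MapReduce operations. Map decomposes the input into intermediate key-value pairs, and Reduce merges the key-value pairs output by Map according to their keys and emits the final result. The programmer supplies these two functions to the system, and the underlying infrastructure distributes the Map and Reduce operations across the cluster and stores the results on HDFS.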
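To make the Map and Reduce roles concrete, the following is a minimal single-process sketch in Python. It is not Hadoop code: `run_map_reduce`, `block_type_mapper`, and `count_reducer` are hypothetical names, and the records are invented `(block_type, payload)` tuples used only to show how pairs are grouped by key.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Single-process illustration of the Map -> group-by-key -> Reduce flow."""
    intermediate = defaultdict(list)
    for record in records:
        # Map: each input record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            intermediate[key].append(value)  # group values that share a key
    # Reduce: merge all values of one key into a single result.
    return {key: reducer(key, values) for key, values in intermediate.items()}


def block_type_mapper(record):
    # record is a hypothetical (block_type, payload) tuple.
    block_type, _payload = record
    yield block_type, 1


def count_reducer(_key, values):
    return sum(values)


records = [("helicopter", "..."), ("target", "..."), ("helicopter", "...")]
counts = run_map_reduce(records, block_type_mapper, count_reducer)
```

In a real cluster the two user-supplied functions stay the same in spirit, while the framework handles distribution, grouping across machines, and writing results to HDFS.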
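Hadoop has the following advantages, which let users easily develop and run applications that process massive data.
High reliability: Hadoop automatically maintains multiple copies of the data and automatically redeploys computing tasks after a task fails.
High scalability: Hadoop distributes data and carries out computing tasks across the available computer clusters, which can easily be extended to thousands of nodes. Consequently, although low latency is not guaranteed, Hadoop offers considerable throughput and is well suited to computation on massive data.
High efficiency: Hadoop can move data dynamically between nodes and keeps the nodes dynamically balanced, so processing is very fast.
High fault tolerance: Hadoop automatically keeps multiple replicas of the data and automatically reassigns failed tasks.
Low cost: Hadoop distributes and processes data on server clusters built from ordinary machines; such clusters can total thousands of nodes, and every node runs the open-source Linux operating system, which greatly reduces hardware cost. Moreover, compared with all-in-one appliances and commercial data warehouses, Hadoop is open source, so software cost is greatly reduced as well.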
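2. HBase
HBase, short for Hadoop Database, is a highly reliable, high-performance, column-oriented, scalable distributed storage system; its main function is to store massive structured data in column-oriented form on top of Hadoop's HDFS.
Tables stored in HBase have the following main characteristics.
Large tables: a table can have billions of rows and millions of columns.
Schema-free: every row has a sortable primary key and any number of columns; columns can be added dynamically as needed, and different rows in the same table can have entirely different columns.
Column-oriented: storage and access control are organized by column (family), and columns (families) are retrieved independently.
Sparse: empty (null) columns take up no storage space, so tables can be designed to be very sparse.
Multiple data versions: the data in each cell can have multiple versions; by default the version number is assigned automatically and is the timestamp at which the cell was inserted.
Single data type: all data in HBase is stored as strings, without types.
HBase is mainly suited to the following scenarios:
· highly concurrent reads and writes
· column families of the table structure need frequent adjustment
· storage of structured or semi-structured data
· highly concurrent key-value storage
· keys written in random order and stored in sorted order
· a fixed-size collection kept for each key
HBase also has some drawbacks and unsuitable scenarios:
· because it provides only row locks, HBase supports distributed transactions poorly
· HBase performs very poorly on query operations such as join and group by
· queries that do not use the row key perform very poorly because they trigger a full table scan; building a secondary or multi-level index requires maintaining an additional index table
· limited support for highly concurrent random reads.
In a system-of-systems confrontation environment, sensing data from the data sources in real time is a key problem. The data sources are usually many kinds of sensors, and managing the heterogeneous data they produce efficiently is one of the difficulties. To address these problems, the invention studies existing distributed frameworks and related data analysis methods and attempts to find effective ways to process and analyze multi-source, large-scale flight data. To date there has been no report of a multi-platform avionics big data system that applies distributed frameworks and database platforms.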
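Summary of the invention
The technical problem to be solved by the invention is to provide a multi-platform avionics big data system that processes and analyzes large-scale flight data. The system integrates data acquisition, data classification management, data storage, and data analysis; it collects and classifies multi-source heterogeneous data and stores it in real time on the "resource cloud" platform, whose client nodes retrieve data from the cloud in real time, relying on the cloud platform to keep the data current. On that basis, the system supports building association models from historical data and uses real-time data together with the association model to make real-time predictions that give pilots a degree of guidance in their decisions. Specifically, the system must provide flight data acquisition, real-time flight data sharing, flight data association analysis, and real-time decision support. To this end, the invention also provides a method for implementing the multi-platform avionics big data system.
To solve the above technical problem, the invention provides a multi-platform avionics big data system comprising a data acquisition module, a data storage module, a data association analysis module, and a data association analysis application module;
The data acquisition module obtains pcap packet files from data source 1 and, after collection and classification, passes them to the data storage module, completing the data storage process; the data association analysis module obtains training data from data source 2, builds the data association model, and provides the model to the data association analysis application module, which performs real-time prediction and displays the results on the screen; the data association analysis application module uses the cloud storage function implemented by the data storage module to store data in real time.
As a preferred technical solution of the invention, the data acquisition module comprises an input folder path unit, an output folder path unit, and a data block selection unit; the input folder path unit and the output folder path unit read the input and output folder paths selected by the user, the data block selection unit reads the data block type selected by the user, and the data acquisition module collects data according to what these units have read; the data acquisition module uses the libpcap package to extract from the pcap packets captured on the network the key time field, the packet's source IP and destination IP, and the data field carrying the payload, namely the time, sourceIP, destIP, and data fields, and uses destIP and sourceIP, combined with the data transmission information of the simulated scenario, to make an initial determination of the data block carried by the packet; the different data blocks are distinguished and parsed according to their respective formats to obtain independent data block structures, which are written back to the hard disk as text for use in the next stage.
As a preferred technical solution of the invention, the data storage module comprises a read file path unit and a demonstration control unit; the read file path unit reads the storage path of the data source file selected by the user; the demonstration control unit demonstrates how the data is stored by periodically reading the stored records and displaying them on the panel; the data storage module uses the Hadoop distributed storage platform and the HBase distributed database to obtain data from multiple aircraft in real time, store it back onto the multiple aircraft by means of cloud storage, and retrieve and share the data of the multiple aircraft in real time.
As a preferred technical solution of the invention, the data association analysis module comprises a training data path unit, a training parameter selection unit, and a data segmentation mode selection unit; the training data path unit reads the training data storage path selected by the user, the training parameter selection unit reads the training parameter values selected by the user, the data segmentation mode selection unit reads the data segmentation mode selected by the user, and the data association analysis module builds and trains the model according to what these units have read; the data association analysis module uses an SVM classifier (the SVM package in the code) to classify the existing data and analysis results with the SVM method. Its core components are the data splitting program and the libsvm classifier package that it calls: the splitting program divides the records whose result is 0 into N parts, N being entered by the user, and combines each part with the records whose result is 1 to form N training data sets; training with libsvm outputs N models, and at prediction time the outputs of the N models are combined with AND/OR operations to produce the prediction; the data association model in the data association analysis module is built from input parameters specified by the user. The SVM classifier is preferably a nonlinear SVM classifier with an RBF kernel; the SVM classifier is preferably a two-split classifier.
As a preferred technical solution of the invention, the data association analysis application module comprises a model path selection unit, a read file path unit, and a demonstration control unit; the model path selection unit reads the storage path of the trained model selected by the user, the read file path unit reads the storage path of the data source file selected by the user, and the demonstration control unit analyzes the data with the loaded model and displays the prediction result on the panel.
In addition, the invention provides a method for implementing the above system, comprising the data acquisition implementation of the data acquisition module, the data storage implementation of the data storage module, the data association model building implementation of the data association analysis module, and the real-time prediction result display implementation of the data association analysis application module.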
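As a preferred technical solution of the invention, the data acquisition implementation of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user action;
3) obtain the parameters and call the handler;
4) determine whether the folder still contains unread files; if yes, go to step 5), otherwise end the program;
5) determine whether the file still contains data; if yes, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user requested; if yes, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).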
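As a preferred technical solution of the invention, the data storage implementation of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while retrieving the data of all nodes from HBase in real time;
6) determine whether to stop the demonstration; if yes, end, otherwise return to step 4).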
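As a preferred technical solution of the invention, the data association model building implementation of the data association analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value;
2) scan the data again, scale it with the upper and lower bounds, and call the read_prob function to produce an svm_problem;
3) run cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate and store the model;
5) end.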
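As a preferred technical solution of the invention, the real-time prediction result display implementation of the data association analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while retrieving the data of all nodes from HBase in real time, then predict the result in real time with the SVM algorithm;
6) determine whether to stop the demonstration; if yes, end, otherwise return to step 4).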
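Based on the technical solution above, compared with the prior art, the multi-platform avionics big data system provided by the invention has the following beneficial effects:
1. The system integrates data acquisition, data classification management, data storage, and data analysis; it collects and classifies multi-source heterogeneous data and stores it in real time on the "resource cloud" platform, whose client nodes retrieve data from the cloud in real time, relying on the cloud platform to keep the data current. On that basis, the system supports building association models from historical data and uses real-time data together with the association model to make real-time predictions that give pilots a degree of guidance in their decisions. Specifically, the system must provide flight data acquisition, real-time flight data sharing, flight data association analysis, and real-time decision support.
2. The invention optimizes the Hadoop distributed storage platform and the HBase distributed database and applies them to an avionics big data system, a first in this field. The invention integrates large-scale avionics data and stores it in a distributed manner, can collect, store, and share data in real time, and uses analysis of historical data to predict the outcome of fire strikes from real-time data, thereby providing pilots with effective decision guidance; the prediction success rate is as high as 94%.
3. The invention uses a classification algorithm from machine learning to predict the outcome of fire strikes in flight. Compared with the earlier approach of obtaining the result by simulating the flight process directly in software, this method is many times faster while maintaining a certain accuracy, and so improves the decision efficiency of the system-of-systems confrontation system. Because hits are far rarer than misses, the training data is imbalanced, which harms decision accuracy. We therefore build on SVM and innovatively use data splitting to improve accuracy. With the decision support function integrated into the avionics system, the stored data can be used to train the classifier, the trained classifier can predict fire strikes in real time, and decision suggestions can be given to the aircraft in real time based on the predictions.
4. Tests verify that, in the system of the invention, the nonlinear SVM classifier with an RBF kernel has the highest accuracy, and the two-split classifier has the highest F1 score.
5. Tests verify that the system of the invention supports statically removing nodes and dynamically adding nodes.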
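Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Figure 1 is a framework diagram of the data storage module in the system of the invention.
Figure 2 is an example diagram of the nonlinear SVM in the data association analysis module of the system of the invention.
Figures 3 and 4 are example diagrams of data splitting in the data association analysis module of the system of the invention.
Figure 5 is the overall framework diagram of the multi-platform avionics big data system of the invention.
Figure 6 is the functional structure diagram of the multi-platform avionics big data system of the invention.
Figure 7 is the main program flowchart of the multi-platform avionics big data system of the invention.
Figure 8 is an example diagram of the data association analysis application module in the system of the invention.
Figure 9 is the logic flowchart of the data acquisition module in the system of the invention.
Figure 10 is the logic flowchart of the data storage module in the system of the invention.
Figure 11 is the logic flowchart of the data association analysis module in the system of the invention.
Figure 12 is the logic flowchart of the data association analysis application module in the system of the invention.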
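Detailed description
The invention is now described in further detail with reference to the drawings. The drawings are simplified schematics that illustrate only the basic structure of the invention, and therefore show only the components relevant to it.
In system-of-systems confrontation, the reference data used by decision makers comes from multiple sensors and multiple data sources on different aircraft systems and different platforms; acquiring this data in real time, storing it reliably, and feeding it promptly into the decision system is the foundation of a successful operation. To simulate such an environment, the invention uses several dedicated test devices to simulate a group of flight nodes, captures the data produced by sensors in a real flight environment as the data source, and connects the test devices through a switch to form a local area network that simulates data communication in the confrontation environment. A decision maker can view the real-time information of every device node in the group from any test device and make decisions based on this data. To ensure that the decision maker obtains data in a timely and reliable way in this simulated hypothetical combat scenario, the invention proposes a multi-platform avionics big data system based on a "resource cloud".
The core of the multi-platform avionics big data system of the invention is the "resource cloud" platform built in the hypothetical combat environment: a data sharing platform set up on several dedicated test devices. The platform is built on existing open-source cloud software (Hadoop, HBase) and mainly provides real-time information sharing between flight nodes, reliable storage, and information processing. The platform's data source is the data produced by data collection and classification: raw data is collected and classified by the data acquisition module and then transmitted to the "resource cloud" platform. Finally, the data analysis module on each node retrieves the information of all nodes from the "resource cloud" platform in real time, analyzes it with the data association model built from historical data, and presents the analysis results for each node to the decision maker as decision guidance.
The technical solutions of the individual modules of the multi-platform avionics big data system of the invention are as follows: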
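1. Data collection and classification scheme
Data in the hypothetical combat environment comes from different sensors and is heterogeneous, which directly makes the data of an avionics big data system under system-of-systems operations complex. The collected data needs to be analyzed, together with the platform, from multiple angles using big data classification techniques based on the airborne cloud platform, strengthening the associations within the data and so reducing the data complexity of the avionics big data system. In a concrete implementation, different data collection and preprocessing modes need to be built for the multi-modal situation space. For the hypothetical combat environment, the invention collects and classifies the data by capturing packets and then parsing them one by one according to the data protocol, and uses the result as the data source of the "resource cloud" platform.
In practice, two system data collection schemes are commonly used:
(1) Packet capture
The packet capture program Wireshark obtains the packets. Wireshark displays the binary data captured from the network in its Packet Details panel according to the structure of the various protocols. This mainly covers a summary of the physical-layer frame, the Ethernet frame header of the data link layer, the IP packet header of the internet layer, the segment header of the transport layer, and the application-layer information. The process uses the libpcap library, a powerful network packet capture library that intercepts packets by network interface, port, and protocol.
(2) Crawling
Web data collection obtains data from websites by means of web crawlers or the sites' public APIs. This method can extract unstructured data from web pages and store it in a structured way as uniform local data files. It supports collecting files and attachments such as images, audio, and video, and attachments can be associated automatically with the body text. Beyond the content carried on the network, network traffic itself can be collected with bandwidth management technologies such as DPI or DFI.
Because communication between combat nodes in the hypothetical combat environment is costly, the invention does not use a crawler to fetch data actively; it preferably uses the Wireshark software at the network switch of the actual combat environment to capture the packets on the confrontation platform as source data.
The data acquisition module parses the packets obtained from the switch and, based on each packet's source IP, destination IP, and protocol, divides the data into structured and unstructured data. The content is parsed out of the binary file item by item according to the communication protocol of the hypothetical combat environment. The resulting data falls into the following basic categories:
(1) Structured data:
Structured data is row data: data that can be stored in a relational database and expressed logically as a two-dimensional table. The data is distributed to the different receiving targets, such as the UAV, radar simulation, electro-optics, electronic station, 3D voice warning, and cockpit.
The collection result divides the data into multiple data blocks. Each data block includes basic information such as the data type, sending source, target number, block length, update period, virtual link, maximum delay, and receiving port. Apart from this basic information, the main content of the different data blocks can be summarized in structured form.
(2) Unstructured data
In the big data management platform of the multi-platform avionics big data system of the invention, unstructured data mainly takes the form of images, audio, video, and hypermedia, for example radar weather images, geographic distribution images, sonograms from enemy aircraft detection, and video streams. Such data has no fixed structure and, unlike structured data, is not conveniently represented in two-dimensional database tables; however, a non-relational database on the distributed cloud storage platform can store unstructured data efficiently and stably. For unstructured data, we leave an interface open to provide these functions.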
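2. Data storage scheme
In a system-of-systems confrontation environment, obtaining real-time changes in the relevant information of the confrontation system promptly and accurately is an important factor in making the confrontation systematic. Every combat node generates key avionics information in real time, including the node's own-aircraft data and target information; other nodes obtain this information in real time and feed it into the decision system. To achieve this, we build a "resource cloud" across the nodes of the hypothetical combat environment: once the avionics information generated by a node has been collected, it is uploaded to the "resource cloud" in real time, and other nodes query for data changes in real time. The high fault tolerance, timeliness, and reliability of the cloud platform ensure that all avionics information can be obtained in real time and is hard to lose.
(1) "Resource cloud" platform
Traditional "resource cloud" frameworks come in different types: in the first, raw data is collected at a client, which then transmits it to the storage nodes for distributed storage; in the second, data is collected locally at a node and then distributed to all storage nodes. Compared with existing big data management architectures, the distinguishing feature of the invention is that the data source and the storage destination are the same: data is obtained in real time from multiple aircraft, stored back onto the multiple aircraft by means of cloud storage, and retrieved and shared among them in real time. The invention therefore adopts the second type of framework.
The hypothetical platform as a whole uses a master/slave structure (see Figure 1), consisting of one master node and several slave nodes. The master node acts as the main server and manages the file system namespace and client access to files. The slave nodes act as subordinate servers and are responsible for storing data. The system uses a write-once-read-many model, which relaxes concurrency control requirements, simplifies data coherence, and supports high-throughput access.
(1) Reliability
The "resource cloud" platform splits large files into small files of fixed size and stores a split table; several replicas of each small file are made and stored on different nodes, and when a file is read, the pieces are read one by one according to the split table, reassembled, and returned to the user.
After collection and classification, the data is written to disk and cached temporarily on the local node. Because nodes differ in complexity, some nodes store large information files; when a file exceeds the cloud platform's default file size, it is split. Splitting saves link bandwidth on the one hand and increases the fault tolerance of the system on the other.
The "resource cloud" of the system physically divides a file into multiple blocks and distributes them across several nodes of the cluster using an algorithm such as hashing, which lets the distributed storage system hold sufficiently large files. Compared with backing up a whole file to a designated machine, splitting saves point-to-point communication bandwidth and balances the load of the system to some extent; moreover, if a single node fails and its information cannot be read, the file can be recovered by reassembling the pieces backed up on other nodes.
(2) Fault tolerance
After splitting each file, the "resource cloud" platform redundantly backs the data blocks up to other nodes using a hash algorithm. The redundancy and fault tolerance of the cloud platform are based on the fault tolerance mechanism of HDFS and mainly involve the following:
The master node splits the file and records the split table, which guides replication; it marks each part of the file, records the split table of the current block, and, following the split table, redundantly backs the parts up to the corresponding other nodes using the hash algorithm. When a file is accessed and the current node does not hold the required piece, the request goes to the nearest redundant copy.
The master node is backed up by means of ZooKeeper: all nodes elect a master node and a backup-master node, and the backup-master periodically snapshots the master so that its information does not lag far behind. When the heartbeat mechanism detects that the master has crashed, the backup-master takes over as master, and ZooKeeper's election mechanism selects another backup-master to back up the current master.
(3) Other features of the "resource cloud"
Periodic snapshots: a snapshot stores a copy of the data at a specific point in time, so a failed cluster can be rolled back to an earlier healthy point.
Staging: when data is created, the client first caches the file data in a local temporary file, and the application's writes are transparently redirected to it. Only when the local file has accumulated one block's worth of data does the client notify the master node. The master node inserts the file name into the file system hierarchy and allocates a data block for it. It then replies to the client's request with a message containing the data node ID (possibly several, including the nodes that hold replicas) and the identifier of the target data block. On receiving it, the client flushes the local temporary file to the data block on the specified data node. The local cache is used because writing directly to the remote file system without it would considerably affect network speed and throughput. When the file is closed, the remaining data in the local temporary file that has not yet been uploaded is transferred to the data node, and the client then notifies the master node that the file is closed. At this point the master node commits the file creation operation to persistent storage. If the master node dies before the file is closed, the file is lost.
Pipelined replication: when a client writes data to a file, the data is first written to a local file as described above. Suppose the file's replication factor is 3: when the local file has accumulated one block of data, the client obtains from the master node a list of data nodes, which also includes the data nodes that will hold replicas of the block. The client then flushes the block to the first data node. The first data node receives the data in 4 KB units, writes each unit to its local store, and at the same time forwards it to the second data node in the list. Likewise, the second data node writes each unit to its local store while passing it to the third data node, which writes it directly to its local store. A data node can thus receive data from the previous node while forwarding it to the next, so the data flows in a pipeline from one data node to the next.
Scalability: extensive practical use has shown that this distributed platform is highly scalable and can easily be extended to clusters of hundreds of nodes.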
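3. Data analysis scheme
In the decision system of system-of-systems confrontation, historical data is a valuable resource; analyzing and distilling it serves many purposes, for example historical fire strike information can support decision making. By analyzing a set of historical flight processes and their fire strike results, we can obtain a classifier model of the flight state, with which the fire strike result of a node can be predicted. Once the prediction model is deployed on the "resource cloud" platform, the predicted fire strike result of each node can support decisions and improve the decision efficiency of the confrontation system.
Given the existing data set of flight state information and strike results, the problem can be approximated as a two-class classifier model whose input is the avionics information at the moment the aircraft launches a missile together with the absolute position of the target, and whose output is whether the target is hit or missed. Commonly used two-class classifiers are analyzed and compared, and the classifier model with the best results is applied to the decision system.
(1) Classifier algorithm
The problem to be solved is a two-class problem with labels 0 and 1. The classifier must therefore find a surface that separates all sample points onto its two sides. That is, for any sample x = (b1, b2, …, bm), the classifier decision function F is:
F(x) = g(f(x))
Figure PCTCN2017106322-appb-000001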
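a. Linearly separable SVM
In the decision function of the linearly separable SVM classifier, f(x) = w^T x + b. In essence it seeks the hyperplane with maximum margin that separates the sample points by label onto its two sides, where the margin is the minimum geometric distance from all data points to the hyperplane. From a statistical point of view, since positive and negative samples can be regarded as random draws from two different distributions, the larger the distance between the classification boundary and the two distributions, the smaller the probability that a drawn sample falls on the wrong side of the boundary. Maximizing the margin therefore minimizes the worst-case generalization error and makes the classifier more confident.
With f(x) = w^T x + b in the classifier decision function, the hyperplane is w^T x + b = 0.
Given a training set T and the hyperplane w^T x + b = 0, the functional margin of a sample point (x_i, y_i) with respect to the hyperplane is defined as:
Figure PCTCN2017106322-appb-000002
The geometric margin is:
Figure PCTCN2017106322-appb-000003
Let N be the number of sample points; the minimum functional margin over all sample points in T is defined as:
Figure PCTCN2017106322-appb-000004
The margin of the hyperplane is the minimum geometric margin over all sample points in T:
Figure PCTCN2017106322-appb-000005
Maximizing the margin can be expressed as:
Figure PCTCN2017106322-appb-000006
which can be rewritten as:
Figure PCTCN2017106322-appb-000007
It can be seen that scaling w and b by the same factor affects neither the hyperplane nor the geometric margin, whereas the functional margin scales by the same factor. So let
Figure PCTCN2017106322-appb-000008
Substituting into the expression above, maximizing
Figure PCTCN2017106322-appb-000009
is equivalent to minimizing
Figure PCTCN2017106322-appb-000010
This gives the optimization problem of the linearly separable SVM:
Figure PCTCN2017106322-appb-000011
This is a convex quadratic programming problem; applying Lagrangian duality, the optimal solution can be obtained by solving the dual problem. The solution process is not repeated here.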
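The formulas in this derivation exist only as figure references in this text. For orientation, the block below gives the standard textbook form of each step, following the definitions in the prose. It is a reconstruction under assumptions, not a transcription of the patent figures: in particular it assumes the labels are encoded as y_i in {-1, +1} (the document's 0/1 labels would be mapped accordingly), and the symbols \hat{\gamma} and \gamma are chosen here for illustration.

```latex
\begin{align*}
\hat{\gamma}_i &= y_i\,(w^{T}x_i + b) && \text{functional margin of } (x_i, y_i)\\
\gamma_i &= \frac{y_i\,(w^{T}x_i + b)}{\lVert w\rVert} && \text{geometric margin of } (x_i, y_i)\\
\hat{\gamma} &= \min_{i=1,\dots,N}\hat{\gamma}_i, \qquad
\gamma = \min_{i=1,\dots,N}\gamma_i = \frac{\hat{\gamma}}{\lVert w\rVert}\\
&\max_{w,b}\ \gamma \quad \text{s.t.}\quad \frac{y_i\,(w^{T}x_i + b)}{\lVert w\rVert}\ \ge\ \gamma,\quad i=1,\dots,N\\
&\max_{w,b}\ \frac{\hat{\gamma}}{\lVert w\rVert} \quad \text{s.t.}\quad y_i\,(w^{T}x_i + b)\ \ge\ \hat{\gamma},\quad i=1,\dots,N\\
&\text{with } \hat{\gamma}=1:\qquad \max_{w,b}\ \frac{1}{\lVert w\rVert}\ \Longleftrightarrow\ \min_{w,b}\ \tfrac{1}{2}\lVert w\rVert^{2}\\
&\min_{w,b}\ \tfrac{1}{2}\lVert w\rVert^{2} \quad \text{s.t.}\quad y_i\,(w^{T}x_i + b) - 1\ \ge\ 0,\quad i=1,\dots,N
\end{align*}
```

Setting \hat{\gamma} = 1 is permitted precisely because rescaling w and b changes the functional margin but not the hyperplane or the geometric margin.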
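b. Nonlinear SVM
For a nonlinear classification problem the decision surface is a curved surface; under a suitable mapping it becomes a hyperplane in a high-dimensional space, so the problem can be solved with the method of the linearly separable SVM.
For example, when the two classes of data are distributed as two circles (see Figure 2), the data itself is not linearly separable, and the ideal boundary is a circle rather than a line (hyperplane).
If x1 and x2 denote the coordinates of this two-dimensional plane, the decision surface can be written in the form:
a0 + a1*x1 + a2*x2 + a3*x1^2 + a4*x2^2 + a5*x1*x2 = 0
If we construct a five-dimensional space with coordinates z1 = x1, z2 = x2, z3 = x1^2, z4 = x2^2, z5 = x1*x2, the decision surface equation above can be written in the new space as:
Figure PCTCN2017106322-appb-000012
This is precisely the equation of a hyperplane. If the data is mapped into the five-dimensional space in this way, the originally nonlinear data becomes linearly separable in the new space and can be handled with the linear SVM algorithm.
In solving the linearly separable SVM, the data vectors always appear in the computation as inner products. We therefore define the function that computes the inner product of two vectors in the mapped space as the kernel function, and use it to simplify inner product computations in the mapped space.
For the nonlinear case, then, the approach is to choose a kernel function that maps the data into a high-dimensional space, where the problem becomes linearly separable, thereby solving the problem that was not linearly separable in the original space; the linearly separable SVM algorithm is then applied. Four kernel functions are commonly used with SVM: the linear kernel (equivalent to the linearly separable SVM), the polynomial kernel, the RBF kernel, and the sigmoid kernel. Their forms are given in Table 1 below.
Table 1
Type                 Function expression
Linear kernel        u^T * v
Polynomial kernel    (g * u^T * v + coef0)^degree
RBF kernel           exp(-g * ||u - v||^2)
Sigmoid kernel       tanh(g * u^T * v + coef0)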
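The four expressions in Table 1 can be written out directly. The sketch below is a plain-Python illustration, not the libsvm implementation; the parameter names `g`, `coef0`, and `degree` follow the table, while the default values are placeholders chosen only for this example.

```python
import math


def dot(u, v):
    """Inner product u^T * v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, g=1.0, coef0=0.0, degree=3):
    return (g * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, g=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * squared_distance)


def sigmoid_kernel(u, v, g=1.0, coef0=0.0):
    return math.tanh(g * dot(u, v) + coef0)
```

Each function returns the inner product of the two inputs in the mapped space without ever constructing the mapped vectors, which is the simplification the kernel function is introduced for.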
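Data splitting
Because the proportions of the two classes in the sample data set differ greatly, an imbalance problem arises. The approach tried is to split the class that makes up the larger share of the training set into several blocks, combine each block with the samples of the other class to form a sub-training set, and train on each sub-training set to obtain a sub-classification model. The sub-classification models can then be combined through certain operations into a new classifier that predicts the data. This can mitigate the data imbalance problem to some extent.
For example, the samples with label=0 are split into four blocks, each combined with the samples with label=1 to form four sub-training sets, which are trained to give four sub-classification models. Each sub-classification model predicts the input data, giving four outputs; ANDing these four outputs gives the final output, which amounts to a new classifier. Schematic diagrams are shown in Figures 3 and 4.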
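The splitting step can be sketched as follows. This is an illustration under assumptions: `split_majority` is a hypothetical name, samples are represented as `(features, label)` tuples, and the majority class is divided round-robin, since the text does not say how the blocks are drawn.

```python
def split_majority(samples, n_parts):
    """Build n_parts sub-training sets from (features, label) samples.

    The label-0 (majority) samples are divided into n_parts blocks, and
    every block is paired with all label-1 (minority) samples.
    """
    majority = [s for s in samples if s[1] == 0]
    minority = [s for s in samples if s[1] == 1]
    subsets = []
    for k in range(n_parts):
        block = majority[k::n_parts]  # round-robin split: an assumption
        subsets.append(block + minority)
    return subsets


# Eight misses (label 0) and two hits (label 1), split four ways.
samples = [([float(i)], 0) for i in range(8)] + [([100.0], 1), ([101.0], 1)]
subsets = split_majority(samples, 4)
```

Each sub-training set here holds two label-0 and two label-1 samples, so the class ratio seen by each sub-model is far more balanced than the original 8:2.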
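The system architecture and flows of the invention are described in more detail below with reference to the drawings:
(I) Program architecture and flow design
As shown in Figure 5, the multi-platform avionics big data system of the invention is divided into four modules: the data acquisition module, the data storage module, the data association analysis module, and the data association analysis application module. The data acquisition module obtains pcap packet files from data source 1 and, after collection and classification, passes them to the data storage module, completing the data storage process. The data association analysis module obtains training data from data source 2, builds the data association model from input parameters specified by the user, and provides the model to the data association analysis application module, which performs real-time prediction and displays the results on the screen; the data association analysis application module uses the cloud storage function implemented by the data storage module to store data in real time.
Because the system is developed on a distributed platform, setting it up first requires a fully distributed Hadoop and HBase environment on several devices (6 were used during development). Each device corresponds to one flight node, and one of them serves as the master node, handling scheduling, display, and similar operations.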
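1. Data acquisition module
The libpcap package is used to extract from the pcap packets captured on the network the key time field, the packet's source IP and destination IP, and the data field carrying the payload, namely the time, sourceIP, destIP, and data fields; destIP and sourceIP, combined with the data transmission information of the simulated scenario, allow an initial determination of the data block carried by the packet.
The different data blocks are distinguished and parsed according to their respective formats to obtain independent data block structures, which are written back to the hard disk as text for use in the next stage.
As shown in Figures 6 and 7, the data acquisition module comprises an input folder path unit, an output folder path unit, and a data block selection unit. The input folder path unit and the output folder path unit read the input and output folder paths selected by the user, the data block selection unit reads the data block type selected by the user, and the data acquisition module collects data according to what these units have read.
As shown in Figure 9, the logic flow of the data acquisition module comprises the following steps:
1) initialize the interface program;
2) wait for user action;
3) obtain the parameters and call the handler;
4) determine whether the folder still contains unread files; if yes, go to step 5), otherwise end the program;
5) determine whether the file still contains data; if yes, go to step 6), otherwise return to step 4);
6) determine whether the data block is one the user requested; if yes, go to step 7), otherwise return to step 5);
7) parse and output the data, then return to step 5).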
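Steps 4) to 7) amount to two nested loops with a filter. The sketch below shows that control flow only; it does not read pcap files. `collect_blocks`, the in-memory `files` mapping, and the `parse` callback are hypothetical stand-ins for the folder of capture files and the protocol parser.

```python
def collect_blocks(files, wanted_types, parse):
    """Walk every file and every block, keeping only the requested types.

    files:        mapping of file name -> list of (block_type, raw_payload)
    wanted_types: set of block types selected by the user
    parse:        callable turning (block_type, raw_payload) into output text
    """
    output = []
    for name in sorted(files):              # step 4: any unread file left?
        for block_type, raw in files[name]: # step 5: any data left in file?
            if block_type not in wanted_types:
                continue                    # step 6: not requested, next block
            output.append(parse(block_type, raw))  # step 7: parse and output
    return output


files = {
    "a.pcap": [("helicopter", "h1"), ("target", "t1")],
    "b.pcap": [("helicopter", "h2")],
}
lines = collect_blocks(files, {"helicopter"}, lambda kind, raw: kind + ":" + raw)
```

The outer loop ends the program when no files remain, and the inner loop returns to the next file when the current one is exhausted, which matches the branch targets of steps 4) and 5).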
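2. Data storage module
(1) Distributed storage platform
To store data reliably, following the design in the technical solution, the data storage function is implemented on HDFS using the existing distributed cloud platform. The HDFS server side is deployed on six dedicated test devices; once the simulated pilots of all nodes are in position (the devices are powered on), the HDFS start-all.sh command is run on any node, and the six test devices form a unified data sharing platform, each listening on the ports for its functions. When a data storage or query request arrives, the data is transmitted over the corresponding port.
The platform's data reliability and fault tolerance rely on the redundant backup function of HDFS.
(2) Distributed database
On top of the stable storage already provided by HDFS, the project implements a distributed database on HBase to manage all data in a standardized way, using Hadoop's HDFS for reliable storage and Hadoop's MapReduce framework to speed up the system's data queries.
The HBase table is designed as follows:
Figure PCTCN2017106322-appb-000013
In actual storage, each packet corresponds to one rowKey, and each rowKey contains the information of only one data block; HBase uses column-oriented storage to ensure the reliability of the system's data.
(3) Operating flow
The module's run consists of two steps, data storage and data display.
Data storage: a record is emitted every 40 ms and stored in HBase. Because the sample data set is small, once it has been read through, emission starts again from the first record.
Data display: a separate thread handles reading; every 10 ms it queries HBase for all records between the previous query's timestamp and the current timestamp, takes the last of those records, and displays it on the screen in real time.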
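The interplay of the 40 ms producer and the 10 ms poller can be illustrated with a deterministic simulation. This sketch uses a plain list in place of HBase and a simulated millisecond clock in place of threads; `latest_in_window` and `simulate` are hypothetical names, and treating the query window as half-open, (last_ts, now_ts], is an assumption about how "from the last query timestamp to the current timestamp" is meant.

```python
def latest_in_window(store, last_ts, now_ts):
    """Return the newest record with last_ts < timestamp <= now_ts, or None."""
    window = [(ts, rec) for ts, rec in store if last_ts < ts <= now_ts]
    if not window:
        return None
    return max(window, key=lambda item: item[0])[1]


def simulate(samples, duration_ms, emit_every=40, poll_every=10):
    """Emit one sample every emit_every ms (wrapping around) and poll the
    store every poll_every ms; return the (time, record) pairs displayed."""
    store, shown, last_ts = [], [], -1
    for now in range(duration_ms + 1):
        if now % emit_every == 0:
            index = (now // emit_every) % len(samples)  # restart from the first
            store.append((now, samples[index]))
        if now % poll_every == 0:
            record = latest_in_window(store, last_ts, now)
            if record is not None:
                shown.append((now, record))
            last_ts = now
    return shown


shown = simulate(["a", "b"], duration_ms=120)
```

Because polling is four times as frequent as emission, three out of four polls find an empty window and display nothing new.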
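As shown in Figures 6 and 7, the data storage module comprises a read file path unit and a demonstration control unit, used for the data storage demonstration. The read file path unit reads the storage path of the data source file selected by the user, and the demonstration control unit demonstrates how the data is stored by periodically reading the stored records and displaying them on the panel.
As shown in Figure 10, the logic flow of the data storage module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while retrieving the data of all nodes from HBase in real time;
6) determine whether to stop the demonstration; if yes, end, otherwise return to step 4).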
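3. Data association analysis module
This part mainly uses an SVM classifier (the SVM package in the code) to classify the existing data and analysis results with the SVM method. Its core components are the data splitting program and the libsvm classifier package that it calls: the splitting program divides the records whose result is 0 into N parts (N is entered by the user) and combines each part with the records whose result is 1 to form N training data sets; training with libsvm outputs N models, and at prediction time the outputs of the N models are combined with AND/OR operations to produce the prediction.
The run mainly consists of the following three steps.
Data normalization: scan the data set, extract the upper and lower bounds, and normalize the data so that each variable has a balanced influence on the result.
Data splitting: because of the nature of the data, records whose result is 0 far outnumber records whose result is 1, so the invention adopts the partition strategy of the technical solution: the records whose result is 0 are divided into N parts, and each part is combined with the records whose result is 1, forming N data sources. This part is implemented in the read_prob function.
Data training: the functions of the libsvm package (including svm_scale, svm_train, etc.) are called to train on each svm_problem, generating an svm_model that is dumped to the hard disk.
As shown in Figures 6 and 7, the data association analysis module comprises a training data path unit, a training parameter selection unit, and a data segmentation mode selection unit, used for building and training the model. The training data path unit reads the training data storage path selected by the user, the training parameter selection unit reads the training parameter values selected by the user, the data segmentation mode selection unit reads the data segmentation mode selected by the user, and the data association analysis module builds and trains the model according to what these units have read.
As shown in Figure 11, the logic flow of the data association analysis module comprises the following steps:
1) read the data and extract the upper and lower bounds of each attribute value, covering seven attributes: longitude, latitude, altitude, roll angle, heading angle, pitch angle, and speed;
2) scan the data again, scale it with the upper and lower bounds (scaling speeds up data processing during training and prediction), and call the read_prob function to produce an svm_problem;
3) run cross validation on the svm_problem to obtain the training accuracy;
4) call the svm_train function on the svm_problem to generate and store the model;
5) end.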
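Steps 1) and 2) describe a two-pass min-max scaling. The sketch below illustrates that idea in plain Python and is not the libsvm scaling code: `column_bounds` and `scale_rows` are hypothetical names, the target range [-1, 1] is an assumption chosen to mirror the default range of libsvm's svm-scale tool, and mapping a constant column to 0.0 is an arbitrary choice made here.

```python
def column_bounds(rows):
    """First pass: lower and upper bound of every attribute column."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_rows(rows, lows, highs, lower=-1.0, upper=1.0):
    """Second pass: map each attribute linearly from [low, high] to [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, low, high in zip(row, lows, highs):
            if high == low:
                out.append(0.0)  # constant column: arbitrary choice in this sketch
            else:
                out.append(lower + (upper - lower) * (value - low) / (high - low))
        scaled.append(out)
    return scaled


rows = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
lows, highs = column_bounds(rows)
scaled = scale_rows(rows, lows, highs)
```

The bounds computed on the training data would also have to be reused when scaling data at prediction time, so that both are mapped with the same transformation.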
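4. Data association analysis application module
The overall design principle of the application module is to use the data storage module for storage and to take the best model output by the data association analysis module as the input model for predicting any record in real time, as shown in Figure 8.
Prediction with multi-split models follows these rules:
2 splits:
OR model: n1|n2
AND model: n1&n2
4 splits:
AND then OR: (n1&n2)|(n3&n4)
OR then AND: (n1|n2)&(n3|n4)
8 splits:
AND then OR: (n1&n2&n3&n4)|(n5&n6&n7&n8)
OR then AND: (n1|n2|n3|n4)&(n5|n6|n7|n8)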
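The combination rules listed above can be expressed with two small functions that split the sub-model outputs into a first and a second half. This is a sketch with hypothetical function names; it assumes the outputs are 0/1 integers in the order n1, n2, ... and that their count is even, as in the 2-, 4-, and 8-split cases.

```python
def and_then_or(outputs):
    """(n1 & ... & nk) | (nk+1 & ... & n2k) for a list of 0/1 outputs."""
    half = len(outputs) // 2
    return int(all(outputs[:half]) or all(outputs[half:]))


def or_then_and(outputs):
    """(n1 | ... | nk) & (nk+1 | ... | n2k) for a list of 0/1 outputs."""
    half = len(outputs) // 2
    return int(any(outputs[:half]) and any(outputs[half:]))
```

With two outputs, `and_then_or` reduces to the OR model n1|n2 and `or_then_and` reduces to the AND model n1&n2, so the two-split rules are the smallest case of the same pattern.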
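The run consists of the following three steps.
Initialization: initialize the HBase connection, create the table and column families, and read from disk the contents of the file to be stored.
Data generation: one record is emitted every 40 ms and stored in HBase. Because the sample data set is small, emission restarts from the first record once the whole sample has been read.
Data display: a separate thread performs the reading. Every 10 ms it queries HBase for all records with timestamps between the previous query and the current time, takes the last of those records, calls the SVM with it to make a real-time prediction, and displays the result on the screen.
As shown in Figures 6 and 7, the data association analysis application module comprises a model-path selection unit, a read-file-path unit, and a demonstration control unit, and is used to demonstrate data analysis. The model-path selection unit reads the storage path of the trained model selected by the user, the read-file-path unit reads the storage path of the data source file selected by the user, and the demonstration control unit analyzes the data with the loaded model and displays the prediction results on the panel.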
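As shown in Figure 12, the logic flow of the data association analysis application module comprises the following steps:
1) initialize the HBase connection;
2) create the table and column families;
3) load the local data into memory;
4) start the demonstration;
5) upload real-time data to HBase while fetching the data of all nodes from HBase in real time, then predict the result in real time with the SVM algorithm;
6) determine whether to stop the demonstration; if so, end; otherwise return to step 4).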
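(II) Interface design
1. Data acquisition module
Data acquisition is the data foundation of the "resource cloud" platform of the multi-platform avionics big data system and provides the software with a data source for analysis. The data source must be captured with wireshark at the switch in the actual operating environment, in pcap format, and the destination and source IPs of the packets must satisfy the following:
Table 2
Data block name                    Destination IP/source IP
Network setup command              224.224.0.110
Demonstration scene information    224.224.0.107/224.224.0.108
Integrated target data block       224.224.0.89
Helicopter carrier data block      224.224.0.140
Demonstration control information  
The data block in the data field of each packet conforms to the protocol "Data Interface Protocol for the XX-Type Demonstration System".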
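A first-pass classification of packets by address, based on Table 2, might look like the sketch below. Several points are assumptions: the entry "224.224.0.107/224.224.0.108" is read here as two alternative addresses for the same block, the demonstration control information is omitted because Table 2 lists no address for it, and the block names and the function `classify_packet` are hypothetical.

```python
# Address-to-block mapping taken from Table 2 (block names are illustrative).
BLOCK_BY_IP = {
    "224.224.0.110": "network_setup_command",
    "224.224.0.107": "demo_scene_info",
    "224.224.0.108": "demo_scene_info",
    "224.224.0.89": "integrated_target",
    "224.224.0.140": "helicopter_carrier",
}


def classify_packet(source_ip, dest_ip):
    """Return the data-block name for a packet, or None if neither address is listed."""
    for ip in (dest_ip, source_ip):
        if ip in BLOCK_BY_IP:
            return BLOCK_BY_IP[ip]
    return None
```

A packet that matches no entry would still need to be kept somewhere, which is consistent with the table design reserving a column family for unrecognized blocks.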
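2. Data storage module
The data storage module is the core of the multi-platform avionics big data system and performs the system's data storage. It accepts the output of the data acquisition module. The input is a complete txt text file in which each line holds the content of one parsed data packet, with fields separated by commas. The fields of each packet are as follows:
Helicopter carrier data block: packet timestamp, data block ID, data block time, longitude, latitude, altitude, pitch angle, roll angle, true heading angle, angle of attack, ground speed, north velocity, east velocity, upward velocity
Integrated target data block: packet timestamp, data block ID, data block time, number of targets, target 1 attribute, target 1 longitude, target 1 latitude, target 1 altitude, target 1 azimuth, target 1 pitch angle, target 1 north velocity, target 1 east velocity, target 1 upward velocity, target 2 attribute, ..., target 20 upward velocity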
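Reading one line of the helicopter carrier data block can be sketched as below. The English field names are hypothetical labels for the fourteen fields listed above, and values are kept as strings because the source does not state their types or units.

```python
# Hypothetical English names for the 14 comma-separated fields, in the listed order.
HELICOPTER_FIELDS = [
    "packet_timestamp", "block_id", "block_time",
    "longitude", "latitude", "altitude",
    "pitch", "roll", "true_heading", "angle_of_attack",
    "ground_speed", "north_velocity", "east_velocity", "up_velocity",
]


def parse_helicopter_line(line):
    """Split one text line into a field-name -> string-value mapping."""
    values = [v.strip() for v in line.strip().split(",")]
    if len(values) != len(HELICOPTER_FIELDS):
        raise ValueError(
            f"expected {len(HELICOPTER_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(HELICOPTER_FIELDS, values))
```

The integrated target data block would need a different parser, since its length depends on the target count field and it repeats a group of nine fields per target.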
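3. Data association analysis module
The main function of the data association analysis module is to build a data model by analyzing historical data. The module takes a set of training data as input and builds the model with an SVM classifier and the split-classification strategy. The training data must be collected from the STK simulation software. Each record has seven input variables and one 0/1 result, with all fields separated by tab characters ("\t"). The fields are as follows:
Longitude Latitude Altitude Parameter4 Parameter5 Parameter6 Parameter7 0
4. Data association analysis application module
This module takes the output file of the STK simulation software as the prediction data source and the model output by the data association analysis part as the input model. It performs real-time prediction on top of the data storage module and displays the results on the interface in real time. The fields of its input data (the STK output file) are as follows:
Longitude Latitude Altitude Parameter4 Parameter5 Parameter6 Parameter7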
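(III) Global data structure design
1. Physical structure
The data structures used in the software mainly mirror structures in the data protocol; they are defined to hold the data parsed from the binary files. The following structures are implemented:
FrameHeader // stores the header information of each packet
Helicopt_Carrier_Parm // stores the fields of the carrier data block
SingleTargetParameter // stores all information of a single target
Interrgrated_Target_Parm // stores the fields of the integrated target data block; may contain several SingleTargetParameter entries
Demo_Scene_Info // demonstration scene information data block
Demo_Ctrl_Info // demonstration control information data block
Build_Net_Cmd // network setup command data block
Record // every data block entering the system is treated as row data and recorded as a Record; the length and type fields distinguish the categories.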
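2. Table structure
Based on the existing data categories and HBase's column-oriented storage, we designed the following table structure. It consists of several column families (ColumnFamily); each column family consists of several attributes, and each attribute corresponds to one field of a data block. The column families in the table are:
CF_HelicoptCarrierParm // column family for the helicopter carrier data block
CF_IntegeratedTargetParm // column family for the integrated target data block
CF_STK // column family for STK data
CF_EmptyString // column family for unstructured packets
CF_Unrecognised // column family for unrecognized data blocks
The columns in each column family are the fields of the corresponding block. CF_EmptyString holds packet information stored as text, such as "87a34b2345f86544e"; CF_Unrecognised holds only type information.
Row keys use a custom format: "row" + system time + helicopterId. For example, the row key "row147926317632301" denotes data stored to the cloud platform by node 01 at system time 1479263176323 (milliseconds since 00:00 on January 1, 1970).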
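The row key format can be illustrated with a pair of helper functions. This is a sketch: the names `make_row_key` and `parse_row_key` are hypothetical, and the node identifier is assumed to be always two digits wide, as in the example "01".

```python
def make_row_key(system_time_ms, node_id):
    """Build "row" + system time in ms + two-digit node id."""
    return f"row{system_time_ms}{node_id:02d}"


def parse_row_key(row_key):
    """Split a row key back into (system_time_ms, node_id)."""
    if not row_key.startswith("row") or len(row_key) < 6:
        raise ValueError(f"not a valid row key: {row_key!r}")
    body = row_key[3:]
    # The last two characters are the node id; everything before is the timestamp.
    return int(body[:-2]), int(body[-2:])
```

Because the timestamp comes first, keys written at the same millisecond by different nodes sort next to each other. That suits time-range queries, but it also concentrates new writes at the end of the key space.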
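3. Class structure
The main classes in the implementation are the Record class, which holds row information; the HBaseEngine class, which calls the underlying HBase; and the SVMEngine class, which calls the SVM classifier. Each carries out its calls through its own member variables.
4. Constants
The constants in the design are mainly field names. They are numerous and are not listed here.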
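The effect of the invention is verified by the following experiments:
1. Classifier algorithm evaluation
(1) Data set
The experiment uses 4497432 original flight data samples, of which 316768 are hits (label=1) and 4180664 are misses (label=0). The original data is divided evenly into a train set, a validation set, and a test set in the proportions 50%, 25%, and 25%. The train set is used to train the classifiers. The validation set is used to compare the performance of different classifiers and to determine the network structure of the classification model or the parameters that control model complexity. The test set is used to check the performance of the best classification model finally selected.
(2) Experimental results
Different classifier algorithms are tested and evaluated, the best classifier model is selected, and it is verified on the test set.
a. Linear SVM
A linear SVM is implemented with Liblinear and tested; the results are shown in Table 3:
Table 3
accuracy  precision  recall  F1
92.9669%  0          0       0
Because label=1 instances in the data set are far fewer than label=0 instances (a ratio of about 1:13), the linear SVM predicts 0 for every sample, which is clearly useless.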
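The effect of class imbalance on these metrics can be reproduced from the data set counts alone. The sketch below uses the standard definitions of accuracy, precision, recall, and F1; the function name `binary_metrics` is hypothetical. It evaluates an all-zero predictor on the full data set counts given in the data set description, so the accuracy is close to, but not identical with, the validation-set figure in Table 3.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, F1) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Precision and recall are defined as 0 when their denominators are 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# A classifier that always predicts 0: every miss is a true negative,
# every hit is a false negative.
acc, prec, rec, f1 = binary_metrics(tp=0, fp=0, tn=4180664, fn=316768)
print(f"accuracy={acc:.4f} precision={prec} recall={rec} F1={f1}")
```

The accuracy comes out at roughly 0.93 even though no hit is ever detected, which is why the later comparisons rely on precision, recall, and F1 rather than accuracy alone.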
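b. Non-linear SVM
Several types of non-linear SVM are implemented with Libsvm and tested; the results are shown in Table 4:
Table 4
Kernel             accuracy  precision  recall  F1
Linear kernel      92.9669%  0          0       0
Polynomial kernel  92.9669%  0          0       0
RBF kernel         94.3549%  0.599      0.596   0.597
Sigmoid kernel     85.9684%  0          0       0
The RBF kernel gives the best result: accuracy reaches 94.4%, and both precision and recall for label 1 exceed 50%.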
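c. Data splitting
The training subsets are trained with the libsvm RBF kernel described above, because it performs best.
i. Two-way split
The label=0 training data is randomly split into two blocks, each combined with the label=1 data to form two training subsets. Training yields two models, each of which predicts on the validation set to give two outputs. The outputs are combined with AND or with OR to obtain the final classification. The results are shown in Table 5:
Table 5
Combination  accuracy  precision  recall  F1
AND          94.1015%  0.556      0.806   0.658
OR           94.0866%  0.554      0.811   0.659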
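ii. Four-way split
The label=0 training data is randomly split into four blocks, each combined with the label=1 data to form four training subsets. Training yields four models, each of which predicts on the validation set to give four outputs. The outputs are combined in four ways (all AND, all OR, AND then OR, OR then AND) to obtain the final classification. The results are shown in Table 6:
Table 6
Combination  accuracy  precision  recall  F1
All AND      93.1026%  0.505      0.926   0.654
All OR       93.0137%  0.502      0.931   0.652
AND then OR  93.0717%  0.504      0.928   0.653
OR then AND  93.0503%  0.503      0.929   0.653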
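iii. Eight-way split
The label=0 training data is randomly split into eight blocks, each combined with the label=1 data to form eight training subsets. Training yields eight models, each of which predicts on the validation set to give eight outputs. The outputs are combined in four ways (all AND, all OR, AND then OR, OR then AND) to obtain the final classification. The results are shown in Table 7:
Table 7
Combination  accuracy  precision  recall  F1
All AND      91.5268%  0.453      0.984   0.620
All OR       91.2967%  0.446      0.987   0.615
AND then OR  91.4762%  0.451      0.985   0.619
OR then AND  91.3750%  0.449      0.986   0.617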
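iv. Two-thirds split
The label=0 training data is randomly split into three blocks; each pair of blocks is combined with the label=1 data to form three training subsets. Training yields three models, each of which predicts on the validation set to give three outputs. The outputs are combined with AND or with OR to obtain the final classification. The results are shown in Table 8:
Table 8
Combination  accuracy  precision  recall  F1
AND          94.3033%  0.575      0.729   0.643
OR           94.2959%  0.574      0.734   0.644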
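d. Verification experiment
The tests above show that the non-linear SVM classifier with the RBF kernel alone has the highest accuracy, while the two-way split classifier has the highest F1. These two best classification models are verified on the test set; the results are shown in Table 9:
Table 9
Classifier          accuracy  precision  recall  F1
RBF kernel SVM      94.3391%  0.599      0.595   0.597
Two-way split, AND  94.0945%  0.555      0.807   0.658
Two-way split, OR   94.0772%  0.554      0.812   0.659
The verification shows that the performance of both classifiers is essentially consistent with the earlier validation results, confirming that they are the best.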
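2. Testing of the multi-platform avionics big data system
a. Data acquisition module test
Run the software system, enter the data acquisition module, set the parameters, and start acquisition. The output data block files were checked and were all correct, showing that acquisition works. With different Block selection parameters, the output data sizes all differed, showing that the acquisition module can acquire each of the different individual data blocks.
b. Data storage module test
Run the software system, enter the data storage module, and start the demonstration. As the program runs, the Dashboard panel displays the status of every node in the cluster in real time and shows that flight data is being stored, demonstrating that the module can store the data of each node in real time.
c. Data association analysis module test
Run the software system, enter the data association analysis module, and train on the input data set with different kernel-function parameters and splitting parameters. A classification model was obtained successfully in every case, showing that the module can analyze data with different methods.
d. Data association analysis application module test
Run the software system, enter the data association analysis application module, select the parameters, and start the demonstration. The interface displays the flight data of all nodes and the predicted fire-strike results in real time, showing that the module can store flight data and make predictions in real time.
e. Static node reduction test
Following the corresponding procedure, the number of system nodes was statically reduced from 6 to 4. The node counts of Hadoop and HBase in the cluster both became 4, showing that the system supports static node reduction.
f. Dynamic node addition test
Following the corresponding procedure, the number of system nodes was dynamically increased from the 4 of the previous test to 6, and the system software was run on the newly added nodes. The node information on the data storage interface changed from 4 to 6, showing that the system supports dynamic node addition.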
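Taking the above preferred embodiments of the invention as guidance, those skilled in the relevant art can, on the basis of the above description, make various changes and modifications without departing from the technical idea of the invention. The technical scope of the invention is not limited to the content of the description and must be determined according to the scope of the claims.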

Claims (10)

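  1. A multi-platform avionics big data system, characterized by comprising a data acquisition module, a data storage module, a data association analysis module, and a data association analysis application module;
    the data acquisition module obtains pcap packet files from data source 1, and after acquisition and classification the data goes to the data storage module, completing the data storage process; the data association analysis module obtains training data from data source 2, builds the data association model, and provides the model to the data association analysis application module, which performs real-time prediction and displays the results on the screen; the data association analysis application module uses the cloud storage function implemented by the data storage module to perform real-time storage.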
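  2. The system according to claim 1, characterized in that the data acquisition module comprises an input-folder-path unit, an output-folder-path unit, and a data block selection unit; the input-folder-path unit and the output-folder-path unit read the input and output folder paths selected by the user, the data block selection unit reads the data block type selected by the user, and the data acquisition module performs data acquisition according to what these units read;
    the data acquisition module uses the libpcap package to extract, from the pcap packets captured from the network, the key time field, the source IP and destination IP of the packet, and the data field carrying the stored information, namely the time, sourceIP, destIP, and data fields; it uses destIP and sourceIP together with the data transmission information of the simulated scenario to make an initial determination of the data block carried by the packet; it distinguishes the different data blocks, parses each according to its format to obtain an independent data block structure, and writes the structure back to disk as text for use in the next stage.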
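  3. The system according to claim 1, characterized in that the data storage module comprises a read-file-path unit and a demonstration control unit; the read-file-path unit reads the storage path of the data source file selected by the user; the demonstration control unit demonstrates how data is being stored by periodically reading stored records and displaying them on the panel; the data storage module uses the Hadoop distributed storage platform and the HBase distributed database to acquire data from multiple aircraft in real time, store it back onto the multiple aircraft via cloud storage, and obtain and share the data of the multiple aircraft in real time.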
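  4. The system according to claim 1, characterized in that the data association analysis module comprises a training-data-path unit, a training-parameter selection unit, and a data-splitting-mode selection unit; the training-data-path unit reads the storage path of the training data selected by the user, the training-parameter selection unit reads the training parameter values selected by the user, the data-splitting-mode selection unit reads the splitting mode selected by the user, and the data association analysis module builds and trains the model according to what these units read;
    the data association analysis module uses an SVM classifier (the SVM package in the code) to classify the existing data and analysis results; its core components are the data splitting program and the libsvm classifier package it calls; the splitting program divides the records whose result is 0 into N parts, N being entered by the user, and combines each part with the records whose result is 1 to form N training data sets; training with libsvm yields N models, and at prediction time the outputs of the N models are combined with AND/OR operations to produce the final prediction; the data association model in the data association analysis module is built from input parameters specified by the user.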
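  5. The system according to claim 1, characterized in that the data association analysis application module comprises a model-path selection unit, a read-file-path unit, and a demonstration control unit; the model-path selection unit reads the storage path of the trained model selected by the user, the read-file-path unit reads the storage path of the data source file selected by the user, and the demonstration control unit analyzes the data with the loaded model and displays the prediction results on the panel.
  6. A method for implementing the system according to any one of claims 1-5, characterized by comprising data acquisition by the data acquisition module, data storage by the data storage module, building of the data association model by the data association analysis module, and display of real-time prediction results by the data association analysis application module.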
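  7. The method according to claim 6, characterized in that the data acquisition by the data acquisition module comprises the following steps:
    1) initialize the interface program;
    2) wait for user operation;
    3) obtain the parameters and call the handler;
    4) determine whether the folder still contains unread files; if so, go to step 5), otherwise end the program;
    5) determine whether the file still contains data; if so, go to step 6), otherwise return to step 4);
    6) determine whether the data block is one the user needs; if so, go to step 7), otherwise return to step 5);
    7) parse and output the data, then return to step 5).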
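  8. The method according to claim 6, characterized in that the data storage by the data storage module comprises the following steps:
    1) initialize the HBase connection;
    2) create the table and column families;
    3) load the local data into memory;
    4) start the demonstration;
    5) upload real-time data to HBase while fetching the data of all nodes from HBase in real time;
    6) determine whether to stop the demonstration; if so, end; otherwise return to step 4).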
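  9. The method according to claim 6, characterized in that the building of the data association model by the data association analysis module comprises the following steps:
    1) read the data and extract the upper and lower bounds of each attribute;
    2) scan the data again, scale it using the bounds, then call the read_prob function to produce the svm_problem;
    3) run cross validation on the svm_problem to obtain the training accuracy;
    4) call the svm_train function on the svm_problem to generate and store the model;
    5) end.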
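  10. The method according to claim 6, characterized in that the display of real-time prediction results by the data association analysis application module comprises the following steps:
    1) initialize the HBase connection;
    2) create the table and column families;
    3) load the local data into memory;
    4) start the demonstration;
    5) upload real-time data to HBase while fetching the data of all nodes from HBase in real time, then predict the result in real time with the SVM algorithm;
    6) determine whether to stop the demonstration; if so, end; otherwise return to step 4).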
PCT/CN2017/106322 2017-05-23 2017-10-16 多平台航空电子大数据系统及方法 WO2018214388A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710367759.7A CN107229695A (zh) 2017-05-23 2017-05-23 多平台航空电子大数据系统及方法
CN201710367759.7 2017-05-23

Publications (1)

Publication Number Publication Date
WO2018214388A1 true WO2018214388A1 (zh) 2018-11-29

Family

ID=59933807

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/106322 WO2018214388A1 (zh) 2017-05-23 2017-10-16 多平台航空电子大数据系统及方法

Country Status (2)

Country Link
CN (1) CN107229695A (zh)
WO (1) WO2018214388A1 (zh)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107229234A (zh) * 2017-05-23 2017-10-03 深圳大学 面向航空电子数据的分布式挖掘系统及方法
CN107229695A (zh) * 2017-05-23 2017-10-03 深圳大学 多平台航空电子大数据系统及方法
CN108092802B (zh) * 2017-12-04 2021-02-09 中国船舶重工集团公司第七一九研究所 海洋核动力平台核动力装置的数值预测维修系统及方法
CN108052616A (zh) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 基于远程嵌入式数据采集的航空大数据智能分析方法
CN108052617A (zh) * 2017-12-15 2018-05-18 四川汉科计算机信息技术有限公司 基于远程嵌入式数据采集的航空大数据智能分析系统
CN108228378A (zh) * 2018-01-05 2018-06-29 中车青岛四方机车车辆股份有限公司 列车组故障预测的数据处理方法及装置
CN108650229B (zh) * 2018-04-03 2021-07-16 国家计算机网络与信息安全管理中心 一种网络应用行为解析还原方法及系统
CN108762225B (zh) * 2018-04-24 2020-11-10 中国商用飞机有限责任公司北京民用飞机技术研究中心 一种飞行控制系统中的故障应对时的机下设备决策方法
CN109408694A (zh) * 2018-09-25 2019-03-01 广东中标数据科技股份有限公司 一种海关过关舱单分析方法、系统及装置
CN109828988A (zh) * 2019-01-25 2019-05-31 重庆科技学院 一种大数据统计方法及用于大数据统计的系统
CN112182094A (zh) * 2019-07-01 2021-01-05 成都启英泰伦科技有限公司 一种语音数据文字文本形式的大数据分布式存储方法
CN110472122A (zh) * 2019-07-31 2019-11-19 重庆古扬科技有限公司 一种多通道动态分布式学术资源采集方法
CN111078687B (zh) * 2019-11-14 2023-07-25 青岛民航空管实业发展有限公司 航班运行数据融合方法、装置及设备
CN111190992B (zh) * 2019-12-10 2023-09-08 华能集团技术创新中心有限公司 一种非结构化数据的海量存储方法及存储系统
CN111753926B (zh) * 2020-07-07 2021-03-16 中国生态城市研究院有限公司 一种用于智慧城市的数据共享方法及系统
CN111881213B (zh) * 2020-07-28 2021-03-19 东航技术应用研发中心有限公司 一种储存、加工、使用飞行大数据的系统
CN112084148A (zh) * 2020-09-18 2020-12-15 陕西千山航空电子有限责任公司 一种航空客观信息的综合应用平台
CN112416753A (zh) * 2020-11-02 2021-02-26 中关村科学城城市大脑股份有限公司 一种城市大脑应用场景数据规范化管理方法、系统及设备
CN113159371B (zh) * 2021-01-27 2022-05-20 南京航空航天大学 基于跨模态数据融合的未知目标特征建模与需求预测方法
CN114168243B (zh) * 2021-11-23 2024-04-02 广西电网有限责任公司 基于dashboard多图表动态合并数据系统及方法
CN115378674B (zh) * 2022-08-11 2023-11-03 国网湖南综合能源服务有限公司 一种基于区块链技术的共享储能云平台的应用方法及系统
CN115857899B (zh) * 2022-11-16 2023-12-15 电子科技大学 一种面向异构数据包的解析软件自动构建方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104394211A (zh) * 2014-11-21 2015-03-04 浪潮电子信息产业股份有限公司 一种基于Hadoop用户行为分析系统设计与实现方法
US20160094395A1 (en) * 2014-09-25 2016-03-31 At&T Intellectual Property I, L.P. Dynamic policy based software defined network mechanism
CN107229695A (zh) * 2017-05-23 2017-10-03 深圳大学 多平台航空电子大数据系统及方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573068A (zh) * 2015-01-23 2015-04-29 四川中科腾信科技有限公司 一种基于大数据的信息处理方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JI, HUAYI: "Thinking of Development of Information Countermeasures Combat System Based on the Big Data and Cloud Computing", AEROSPACE ELECTRONIC WARFARE, 28 December 2015 (2015-12-28), ISSN: 1673-2421 *
MA, YANQING: "Internet Traffic Identification Based on Machine Learning and Implementation", CHINA MASTER'S THESES FULL-TEXT DATABASEELECTRONIC TECHNOLOGY & INFORMATION SCIENCE, 15 October 2014 (2014-10-15), ISSN: 1674-0246 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807552A (zh) * 2019-10-30 2020-02-18 合肥工业大学 一种基于改进K-means的城市电动客车行驶工况构建方法
CN110807552B (zh) * 2019-10-30 2023-07-25 合肥工业大学 一种基于改进K-means的城市电动客车行驶工况构建方法
CN113767353A (zh) * 2020-03-31 2021-12-07 深圳市大疆创新科技有限公司 飞行记录数据存储方法、获取方法及无人飞行器
CN111737529A (zh) * 2020-07-23 2020-10-02 北京东方通科技股份有限公司 一种多源异构数据采集方法
CN113177022A (zh) * 2021-04-29 2021-07-27 东北大学 铝/铜板带材生产全流程大数据存储方法
CN113656370A (zh) * 2021-08-16 2021-11-16 南方电网数字电网研究院有限公司 电力量测系统数据处理方法、装置和计算机设备
CN113656370B (zh) * 2021-08-16 2024-04-30 南方电网数字电网集团有限公司 电力量测系统数据处理方法、装置和计算机设备
CN113688100A (zh) * 2021-09-06 2021-11-23 北京普睿德利科技有限公司 一种气象数据的处理方法、装置、终端及存储介质
CN113688100B (zh) * 2021-09-06 2023-07-18 北京普睿德利科技有限公司 一种气象数据的处理方法、装置、终端及存储介质
CN115225730A (zh) * 2022-07-05 2022-10-21 北京赛思信安技术股份有限公司 一种支持多任务的高并发离线数据包分析方法
CN115225730B (zh) * 2022-07-05 2024-05-31 北京赛思信安技术股份有限公司 一种支持多任务的高并发离线数据包分析方法
WO2024011829A1 (zh) * 2022-07-15 2024-01-18 全图通位置网络有限公司 一种基于时空体系的综合智能平台数据管理方法及系统
CN115474021A (zh) * 2022-07-19 2022-12-13 北京普利永华科技发展有限公司 一种多组件联控下的卫星转发器数据处理方法及系统
CN115474021B (zh) * 2022-07-19 2023-08-08 北京普利永华科技发展有限公司 一种多组件联控下的卫星转发器数据处理方法及系统
CN115269704B (zh) * 2022-08-02 2023-08-18 贵州财经大学 一种多元异构农业数据管理系统
CN115269704A (zh) * 2022-08-02 2022-11-01 贵州财经大学 一种多元异构农业数据管理系统
CN116303729A (zh) * 2023-05-17 2023-06-23 北京煜象软件技术有限公司 一种信息获取方法、装置、设备及介质

Also Published As

Publication number Publication date
CN107229695A (zh) 2017-10-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17911068

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC , EPO FORM 1205A DATED 27.02.2020.

122 Ep: pct application non-entry in european phase

Ref document number: 17911068

Country of ref document: EP

Kind code of ref document: A1