CN108009290A - Data modeling and storage method for line-network big data of a rail transit command center
- Publication number
- CN108009290A (application number CN201711426597.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- modeling
- line network
- storage
- rail transit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention discloses a data modeling and storage method for the line-network big data of a rail transit command center, covering both data modeling and storage organization. For the structured data in the line-network big data, a suitable data model is selected according to the characteristics of the Hadoop platform and its components and the way data are used in the rail transit industry, and the data are then modeled and stored. The logical modeling method is described in particular detail: temporary data files are stored in HBase; historical data are modeled uniformly and then stored in HBase; and the tables of the intermediate data layer and the data mart layer are built by denormalization and stored in Hive. Unstructured data are stored on the Hadoop platform as small files classified by application and by time, which enables full-text retrieval and analysis of the unstructured data. The line-network big data are thereby stored in a standardized and efficient manner.
Description
Technical field
The present invention relates to a data modeling and storage method for the line-network big data of a rail transit command center, and belongs to the technical field of rail transit monitoring systems.
Background
With the accelerating construction of urban rail transit, the metro lines of each city are gradually developing into networks. The line-network data of rail transit are growing in both variety and volume, and massive amounts of data converge on the rail transit command center. In the rail transit field, the industry's focus has become how to effectively collect, organize, store, process, and analyze these structured and unstructured data, and how to mine them in depth for valuable information, so as to raise the operating level of rail transit, improve scientific decision-making, increase efficiency and reduce cost, and strengthen information services and safety assurance.
At present there are very few data warehouse deployments at rail transit command centers, and all of them store structured data in a data warehouse built on an MPP DB architecture. As data volumes grow and data types multiply, the weaknesses of MPP DB have been exposed: it is expensive, hard to scale, cannot store unstructured data, and cannot perform stream processing.
Hadoop is a relatively inexpensive and capable framework. A Hadoop platform used for data storage, however, has an architecture and data processing model different from those of MPP DB: its structure is more flexible, it stores data at a larger scale, it supports high concurrency and real-time processing, and it is easy to scale. On the other hand, data stored in Hadoop are not indexed and the data blocks are much larger than in MPP DB, so exact-match queries, single-table queries, and multi-table join queries are slower than in MPP DB. The normal-form-based data modeling methods conventionally used with MPP DB therefore cannot simply be copied to a Hadoop platform. Instead, suitable components must be selected according to the characteristics of the Hadoop platform, and a new data modeling method that combines dimensional analysis with normal-form modeling must be designed for the structured data in the rail transit line-network big data. Such a method avoids the shortcomings of the Hadoop platform and makes the most of its advantages, so that the line-network big data can be stored and accessed safely and efficiently.
Until now, the unstructured data of a rail transit command center, such as video, images, voice, log files, and crawled web pages, have all been stored on disk arrays. This provides only storage and backup; full-text retrieval of the unstructured data, and hence further analysis, is not possible. Meanwhile, the rail transit industry's demand for analyzing and mining unstructured data is rising steadily, which requires content retrieval and processing of unstructured data. This, too, calls for a Hadoop platform to replace the disk-array storage model.
Summary of the invention
Purpose: To overcome the deficiencies of the prior art, the present invention provides a data modeling and storage method for the line-network big data of a rail transit command center.
Technical solution: To solve the above technical problems, the present invention adopts the following technical solution:
A data modeling and storage method for the line-network big data of a rail transit command center, in which data are stored on a Hadoop platform, comprises the following steps:
Step 1: The structured data in the metro line-network big data, including the time-series data of each subsystem collected on each line, are aggregated on the big data platform of the line-network command center; the platform uses the Hadoop architecture;
Step 2: The structured data are modeled, including conceptual modeling, logical modeling, and physical modeling;
Step 3: The large volume of unstructured data of the rail transit command center is compressed and then stored in HBase, and a table is created in HBase to store the associated metadata.
Preferably, in the conceptual modeling, a conceptual model is built according to the applications of the rail transit line-network big data and the actual condition of the data, and the subject elements of rail transit are associated with one another. The associated subject elements are as follows: application-subsystem, line network-line-station, passenger information-passenger flow-satisfaction, signal-train operation-train maintenance-driver, device-point-device running status record, linkage-sequence-alarm-event.
Preferably, logical modeling is performed on each data area of the data warehouse and comprises the following steps:
Step 2.2.1: Logical modeling of the temporary data area;
Step 2.2.2: Logical modeling of the historical data area;
Step 2.2.3: Logical modeling of the data mart.
Preferably, in the logical modeling of the temporary data area, the data of each line sent up by the different line integrators are stored in the temporary data area in HBase as files smaller than 1 MB, and are retained for no more than 1 month.
Preferably, the logical modeling of the historical data area comprises the following steps:
Step 2.2.2.1: The time-series data in the line-network big data fall into two classes: time-series data of device point changes, and time-series data of passenger flow;
Step 2.2.2.2: For the time-series data of device point changes, the HBase RowKey consists of the network-wide unique index of the input data point and the change time of that data point. The network-wide unique index is modeled as a character-string key, and the index is organized as shown in Table 1:
Line | Station | Application | Device | Point type | Point | Change time |
Table 1;
Step 2.2.2.3: For the time-series data of passenger flow, the index is organized as shown in Table 2:
Card number | Entry/exit time |
Table 2.
Preferably, the logical modeling of the data mart comprises the following steps:
Step 2.2.3.1: The data mart is designed according to the specific data applications of the rail transit command center, and the data mart tables are stored in Hive. The data model of the Hive tables is designed according to the characteristics of Hadoop itself, using dimensional analysis, and the data are denormalized. Different attributes are aggregated into a single entity for storage; that is, wide tables with many attribute columns are used, and attributes related in content and function are kept in the same table;
Step 2.2.3.2: For ISCS data, the table design used at line level, which is centered on the points that characterize device attributes, is changed at line-network level to a device-centered table design; that is, all output characterization points that the applications need for each device, together with the attributes of the device, are gathered in one table. The device-centered line-network table contains n output characterization points, as shown in Table 3:
Table 3;
Step 2.2.3.3: The annual passenger flow analysis table is a data cube with three dimensions: month, station, and passenger flow attribute information. The three dimensions are combined into Table 4:
Table 4;
Step 2.2.3.4: For train operation data, the actual and planned train diagrams are unfolded, and the train status information, load factor, running time, and dwell time are combined in Table 5:
Table 5.
Preferably, the physical modeling comprises the following steps:
Step 2.3.1: The input data imported into the Hadoop platform are organized as small files of less than 64 MB in Avro format and stored in the temporary data area in HBase;
Step 2.3.2: For data storage in the HBase historical data area, multiple Regions are configured, and data are stored in the corresponding Region according to the RowKey design. The RowKey contains the change time, which is random over the long term; this ensures that data are stored randomly across the Regions, i.e., that data are distributed evenly over the data nodes;
Step 2.3.3: Hive tables are partitioned by month, i.e., 12 partitions per year. The tables are reviewed so that int replaces the string type, timestamp replaces the date type, and the Decimal type replaces the Float or Double type.
Preferably, step 3 comprises the following step:
Step 3.1: Because the number of files is large, the storage directories are partitioned by the application from which the files originate, in five categories: historical archives, emergency command, system logs, reports, and crawled web content, as shown in Table 6. Files are then classified by specific application, and finally the latter four categories are archived in half-year periods. Files are compressed with Snappy when stored, ready for further retrieval and content analysis of the unstructured data;
Table 6.
Beneficial effects: In the data modeling and storage method for the line-network big data of a rail transit command center provided by the invention, a suitable data model is selected for the structured data in the line-network big data according to the characteristics of the Hadoop platform and its components and the way data are used in the rail transit industry, and the data are then modeled and stored: temporary data files are stored in HBase; historical data are modeled uniformly and then stored in HBase; and the tables of the intermediate data layer and the data mart layer are built by denormalization and stored in Hive. Unstructured data are stored on the Hadoop platform as small files classified by application and by time, which enables full-text retrieval and analysis of the unstructured data.
The method provides standardized and efficient storage of the line-network big data, supports ad hoc retrieval and further data mining and data analysis, and thereby helps guide rail transit operations.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data organization and storage structure of the line-network big data.
Detailed description
The present invention is further described below with reference to the accompanying drawing.
The invention processes the structured data and the unstructured data in the line-network big data separately. As shown in Fig. 1, for structured data the business process of data handling is divided into levels according to the technical characteristics of the Hadoop platform, and different components are selected at each stage to organize the storage and structure of the data and to carry out the data modeling design. Unstructured data are stored on the Hadoop platform as small files.
A data modeling and storage method for the line-network big data of a rail transit command center, in which data are stored on a Hadoop platform, comprises the following steps:
Step 1: The structured data in the metro line-network big data, including the time-series data of each subsystem collected on each line, are aggregated on the big data platform of the line-network command center; the platform uses the Hadoop architecture.
Step 2: The structured data are modeled, including conceptual modeling, logical modeling, and physical modeling.
Step 3: The rail transit command center holds a large volume of unstructured data. In the rail transit industry these files are characteristically small, each under 100 MB. The files are compressed and then stored in HBase, and a table is created in HBase to store the associated metadata.
Step 2.1: Conceptual modeling: a conceptual model is built according to the applications of the rail transit line-network big data and the actual condition of the data, and the subject elements of rail transit are associated with one another. The associated subject elements are as follows: application-subsystem, line network-line-station, passenger information-passenger flow-satisfaction, signal-train operation-train maintenance-driver, device-point-device running status record, linkage-sequence-alarm-event.
Step 2.2: Logical modeling is performed on each data area of the data warehouse and comprises the following steps:
Step 2.2.1: Logical modeling of the temporary data area:
The data of each line sent up by the different line integrators are stored in the temporary data area in HBase as files smaller than 1 MB, and are retained for no more than 1 month.
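The two constraints of the temporary data area in step 2.2.1 (file size and retention) can be sketched as simple checks. This is a minimal illustration, not the patented implementation: the function names are hypothetical, "1M" is read as 1 MiB, and "1 month" is approximated as 30 days.

```python
from datetime import datetime, timedelta

# Assumptions for illustration: "1M" is read as 1 MiB, "1 month" as 30 days.
MAX_FILE_BYTES = 1 * 1024 * 1024
RETENTION = timedelta(days=30)

def accept_for_staging(payload: bytes) -> bool:
    """A file qualifies for the temporary data area only if it is under the size limit."""
    return len(payload) < MAX_FILE_BYTES

def is_expired(stored_at: datetime, now: datetime) -> bool:
    """A staged file older than the retention window may be purged."""
    return now - stored_at > RETENTION
```

A loader could call accept_for_staging before writing a file and run is_expired in a periodic cleanup job.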
Step 2.2.2: Logical modeling of the historical data area comprises the following steps:
Step 2.2.2.1: The time-series data in the line-network big data fall into two classes: time-series data of device point changes, and time-series data of passenger flow.
Step 2.2.2.2: For the time-series data of device point changes, the HBase RowKey consists of the network-wide unique index of the input data point and the change time of that data point. The network-wide unique index is modeled as a character-string key, and the index is organized as shown in Table 1:
Line | Station | Application | Device | Point type | Point | Change time |
Table 1
The character string formed by combining line, station, application, device, point type, and point is the index value of each data point, and the change time is the time at which the data point changed. This hierarchical composition is easy to understand, is uniform across the whole line network, and makes it convenient to extend the data and to bring new lines online. Over the long term the change time is evenly distributed, which makes it easier to spread the data evenly across different data nodes and keeps the load balanced among them. The indicator analysis performed at the rail transit command center frequently needs to read, in bulk, the changes of a point over a period of time; with this storage format, multiple changes of one point can be read in a single pass, giving convenient and efficient data access.
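The composite key of Table 1 can be sketched as follows. The field order comes from Table 1; the "|" delimiter, the fixed-width timestamp format, and all function names are assumptions made for illustration. A fixed-width timestamp keeps lexicographic order equal to chronological order, so the changes of one point over a time window fall in one contiguous key range.

```python
from datetime import datetime

SEP = "|"                  # hypothetical delimiter between key fields
TS_FORMAT = "%Y%m%d%H%M%S"  # fixed width, so string order equals time order

def point_index(line, station, application, device, point_type, point):
    """Network-wide unique index of a data point (field order from Table 1)."""
    return SEP.join([line, station, application, device, point_type, point])

def device_row_key(line, station, application, device, point_type, point,
                   change_time: datetime) -> str:
    """Row key = point index followed by the change time."""
    index = point_index(line, station, application, device, point_type, point)
    return index + SEP + change_time.strftime(TS_FORMAT)

def scan_range(index: str, start: datetime, end: datetime):
    """Start row (inclusive) and stop row (exclusive) covering one point's changes."""
    return (index + SEP + start.strftime(TS_FORMAT),
            index + SEP + end.strftime(TS_FORMAT))
```

With such keys, a batch read of one point's changes over a period reduces to a single range scan between the two bounds returned by scan_range.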
Step 2.2.2.3: For the time-series data of passenger flow, the index is organized as shown in Table 2:
Card number | Entry/exit time |
Table 2
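The passenger-flow key of Table 2 can be sketched in the same way. The two fields come from Table 2; the delimiter, timestamp format, and function names are hypothetical. Including the delimiter in the prefix prevents card number "100" from matching keys of card "1001".

```python
from datetime import datetime

def passenger_row_key(card_number: str, event_time: datetime) -> str:
    """Row key = card number followed by the entry/exit time (fields from Table 2)."""
    return f"{card_number}|{event_time:%Y%m%d%H%M%S}"

def keys_for_card(row_keys, card_number: str):
    """All keys of one card, in chronological order."""
    prefix = card_number + "|"
    return sorted(k for k in row_keys if k.startswith(prefix))
```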
Step 2.2.3: Logical modeling of the data mart comprises the following steps:
Step 2.2.3.1: The data mart is designed according to the specific data applications of the rail transit command center, and the data mart tables are stored in Hive. The data model of the Hive tables is designed according to the characteristics of Hadoop itself, using dimensional analysis, and the data are denormalized. Different attributes are aggregated into a single entity for storage; that is, wide tables with many attribute columns are used, and attributes related in content and function are kept in the same table. At line-network level, tables that had fewer than 30 columns in the original line-level data model are now merged, according to specific requirements, into large tables of several hundred columns with repeated storage, trading extra storage space for substantially shorter read times. In the logical modeling of the data mart, each dimension of the data cube is unfolded and the dimensions are compressed, so that multiple dimensions are merged into one. To facilitate year-over-year, period-over-period, and drill-down analysis, the data model may contain many redundant columns and need not follow normal-form modeling principles.
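The denormalization described in step 2.2.3.1 can be illustrated with a small sketch that copies dimension attributes into every fact row, so that a later read needs no join. The table contents, column names, and function name are hypothetical; only the idea of redundant wide rows comes from the text.

```python
# Hypothetical normalized dimension tables.
lines = {"L1": {"line_name": "Line 1"}}
stations = {"S05": {"station_name": "Central", "line_id": "L1"}}

def denormalize(fact_rows, stations, lines):
    """Copy station and line attributes into each fact row (redundant wide rows)."""
    wide = []
    for row in fact_rows:
        station = stations[row["station_id"]]
        line = lines[station["line_id"]]
        wide.append({
            **row,
            "station_name": station["station_name"],
            "line_id": station["line_id"],
            "line_name": line["line_name"],
        })
    return wide
```

The station and line names are stored once per fact row, which costs space but lets a query read one table instead of joining three.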
Step 2.2.3.2: For ISCS data, the table design used at line level, which is centered on the points that characterize device attributes, is changed at line-network level to a device-centered table design; that is, all output characterization points that the applications need for each device, together with the attributes of the device, are gathered in one table. The device-centered line-network table contains n output characterization points, as shown in Table 3:
Table 3
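The change from a point-centered to a device-centered layout in step 2.2.3.2 amounts to a pivot. The sketch below assumes hypothetical column names ("device", "point", "value") and is not a reconstruction of Table 3; it only shows one row per point being folded into one row per device.

```python
def pivot_to_device_rows(point_rows, device_attrs):
    """Fold one-row-per-point records into one wide row per device."""
    devices = {}
    for r in point_rows:
        name = r["device"]
        if name not in devices:
            # Start the wide row with the device's own attributes.
            devices[name] = {"device": name, **device_attrs.get(name, {})}
        # Each characterization point becomes a column of the device row.
        devices[name][r["point"]] = r["value"]
    return list(devices.values())
```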
Step 2.2.3.3: The annual passenger flow analysis table is a data cube with three dimensions: month, station, and passenger flow attribute information. The three dimensions are combined into Table 4.
Table 4
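One way to combine the three cube dimensions into a single table is to keep one row per station and turn every (attribute, month) pair into a column. The layout below is an assumption chosen for illustration, since the content of Table 4 is not reproduced here; the column naming scheme and function name are hypothetical.

```python
def flatten_cube(cube):
    """Flatten {(month, station, attribute): value} into one wide row per station."""
    rows = {}
    for (month, station, attribute), value in cube.items():
        row = rows.setdefault(station, {"station": station})
        row[f"{attribute}_m{month:02d}"] = value
    return [rows[s] for s in sorted(rows)]
```

With all months of a station in one row, a month-to-month comparison reads two columns of the same row rather than two separate records.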
Step 2.2.3.4: For train operation data, the actual and planned train diagrams are unfolded, and the train status information, load factor, running time, and dwell time are combined in Table 5.
Table 5
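The combination of planned and actual train diagrams in step 2.2.3.4 can be sketched as merging one planned stop and one actual stop into a single wide row with derived measures. The field names, the delay measure, and the load factor formula (passengers divided by capacity) are assumptions for illustration; the content of Table 5 is not reproduced here.

```python
from datetime import datetime

def combine_stop(plan, actual, passengers, capacity):
    """Merge a planned and an actual stop record into one wide row."""
    if (plan["train"], plan["station"]) != (actual["train"], actual["station"]):
        raise ValueError("plan and actual records describe different stops")
    return {
        "train": plan["train"],
        "station": plan["station"],
        "plan_arrive": plan["arrive"],
        "plan_depart": plan["depart"],
        "actual_arrive": actual["arrive"],
        "actual_depart": actual["depart"],
        "arrival_delay_s": (actual["arrive"] - plan["arrive"]).total_seconds(),
        "dwell_s": (actual["depart"] - actual["arrive"]).total_seconds(),
        "load_factor": passengers / capacity,
    }

# Example stop: planned 08:00:00-08:00:40, actual 08:00:30-08:01:15.
plan = {"train": "T101", "station": "S05",
        "arrive": datetime(2017, 12, 25, 8, 0, 0), "depart": datetime(2017, 12, 25, 8, 0, 40)}
actual = {"train": "T101", "station": "S05",
          "arrive": datetime(2017, 12, 25, 8, 0, 30), "depart": datetime(2017, 12, 25, 8, 1, 15)}
row = combine_stop(plan, actual, passengers=960, capacity=1200)
```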
Step 2.3: The physical modeling comprises the following steps:
Step 2.3.1: The input data imported into the Hadoop platform are organized as small files of less than 64 MB in Avro format and stored in the temporary data area in HBase.
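Keeping each input file under a size limit, as in step 2.3.1, can be sketched as size-bounded batching. This sketch uses JSON lines from the standard library as a stand-in for Avro encoding, which is a deliberate substitution; the function name and the per-record size rule are assumptions.

```python
import json

def batch_by_size(records, max_bytes):
    """Yield lists of encoded records whose total size stays below max_bytes."""
    batch, size = [], 0
    for rec in records:
        # JSON lines stand in for Avro here (assumption for a stdlib-only sketch).
        encoded = json.dumps(rec).encode("utf-8") + b"\n"
        if len(encoded) >= max_bytes:
            raise ValueError("single record exceeds the file size limit")
        if size + len(encoded) >= max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(encoded)
        size += len(encoded)
    if batch:
        yield batch
```

For the limit in the text, max_bytes would be 64 * 1024 * 1024 if "64M" is read as 64 MiB.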
Step 2.3.2: For data storage in the HBase historical data area, multiple Regions are configured, and data are stored in the corresponding Region according to the RowKey design. The RowKey contains the change time, which is random over the long term; this ensures that data are stored randomly across the Regions, i.e., that data are distributed evenly over the data nodes.
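Whether a given RowKey design spreads data evenly can be checked offline by mapping sample keys to key ranges. The sketch below models Regions as sorted split points, with each Region covering a start-inclusive, end-exclusive key range; the split points and function names are hypothetical, and the result depends entirely on the sample keys and split points chosen.

```python
from bisect import bisect_right
from collections import Counter

def region_for(row_key: str, split_points) -> int:
    """Index of the key range holding row_key; split_points must be sorted.

    A key equal to a split point belongs to the range that starts at it.
    """
    return bisect_right(split_points, row_key)

def region_histogram(row_keys, split_points) -> Counter:
    """Count how many sample keys land in each key range."""
    return Counter(region_for(k, split_points) for k in row_keys)
```

A strongly skewed histogram on representative keys would suggest revisiting the split points or the key layout.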
Step 2.3.3: In the partition design of Hive, to optimize Hive's efficiency, tables are partitioned by month, i.e., 12 partitions per year. The tables from the physical modeling stage are reviewed so that int replaces the string type, timestamp replaces the date type, and the Decimal type replaces the Float or Double type.
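Two of the choices in step 2.3.3 can be illustrated briefly: deriving a month partition from a timestamp, and the exactness that motivates a decimal type over binary floating point. The "year=/month=" directory naming and the function names are assumptions for illustration.

```python
from datetime import datetime
from decimal import Decimal

def month_partition(ts: datetime) -> str:
    """Partition path for a record; each year yields 12 such partitions."""
    return f"year={ts.year}/month={ts.month:02d}"

def to_decimal(text: str) -> Decimal:
    """Parse a numeric string exactly, avoiding binary floating-point rounding."""
    return Decimal(text)

# Decimal arithmetic is exact for decimal fractions; binary floats are not.
exact = to_decimal("0.1") + to_decimal("0.2") == to_decimal("0.3")
inexact = (0.1 + 0.2) == 0.3
```

Here exact evaluates to True and inexact to False, which is why sums over fares or similar amounts are safer in a decimal column.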
Step 3 comprises the following step:
Step 3.1: Because the number of files is large, the storage directories are partitioned by the application from which the files originate, in five categories: historical archives, emergency command, system logs, reports, and crawled web content, as shown in Table 6. Files are then classified by specific application, and finally the latter four categories are archived in half-year periods. Files are compressed with Snappy when stored, ready for further retrieval and content analysis of the unstructured data.
Table 6.
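The directory organization of step 3.1 can be sketched as a path builder: category first, then the specific application, then a half-year period for the four categories that are archived by period. The category identifiers, the "2017H2" period label, and the path layout are hypothetical; compression is left out of the sketch.

```python
from datetime import date

# The five source categories named in step 3.1 (identifiers are hypothetical).
CATEGORIES = ("historical_archive", "emergency_command", "system_log",
              "report", "web_crawl")

def half_year_period(d: date) -> str:
    """Label of the half-year period containing d, e.g. 2017H2."""
    return f"{d.year}H{1 if d.month <= 6 else 2}"

def storage_dir(category: str, application: str, d: date) -> str:
    """Directory for a file: category, application, then half-year period.

    Only the latter four categories are split into half-year periods.
    """
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    parts = [category, application]
    if category != "historical_archive":
        parts.append(half_year_period(d))
    return "/" + "/".join(parts)
```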
Claims (8)
1. A data modeling and storage method for the line-network big data of a rail transit command center, in which data are stored on a Hadoop platform, characterized in that the method comprises the following steps:
Step 1: The structured data in the metro line-network big data, including the time-series data of each subsystem collected on each line, are aggregated on the big data platform of the line-network command center; the platform uses the Hadoop architecture;
Step 2: The structured data are modeled, including conceptual modeling, logical modeling, and physical modeling;
Step 3: The large volume of unstructured data of the rail transit command center is compressed and then stored in HBase, and a table is created in HBase to store the associated metadata.
2. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 1, characterized in that, in the conceptual modeling, a conceptual model is built according to the applications of the rail transit line-network big data and the actual condition of the data, and the subject elements of rail transit are associated with one another; the associated subject elements are as follows: application-subsystem, line network-line-station, passenger information-passenger flow-satisfaction, signal-train operation-train maintenance-driver, device-point-device running status record, linkage-sequence-alarm-event.
3. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 1, characterized in that logical modeling is performed on each data area of the data warehouse and comprises the following steps:
Step 2.2.1: Logical modeling of the temporary data area;
Step 2.2.2: Logical modeling of the historical data area;
Step 2.2.3: Logical modeling of the data mart.
4. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 3, characterized in that, in the logical modeling of the temporary data area, the data of each line sent up by the different line integrators are stored in the temporary data area in HBase as files smaller than 1 MB, and are retained for no more than 1 month.
5. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 3, characterized in that the logical modeling of the historical data area comprises the following steps:
Step 2.2.2.1: The time-series data in the line-network big data fall into two classes: time-series data of device point changes, and time-series data of passenger flow;
Step 2.2.2.2: For the time-series data of device point changes, the HBase RowKey consists of the network-wide unique index of the input data point and the change time of that data point; the network-wide unique index is modeled as a character-string key, and the index is organized as shown in Table 1:
Table 1;
Step 2.2.2.3: For the time-series data of passenger flow, the index is organized as shown in Table 2:
Table 2.
6. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 3, characterized in that the logical modeling of the data mart comprises the following steps:
Step 2.2.3.1: The data mart is designed according to the specific data applications of the rail transit command center, and the data mart tables are stored in Hive; the data model of the Hive tables is designed according to the characteristics of Hadoop itself, using dimensional analysis, and the data are denormalized; different attributes are aggregated into a single entity for storage, that is, wide tables with many attribute columns are used, and attributes related in content and function are kept in the same table;
Step 2.2.3.2: For ISCS data, the table design used at line level, which is centered on the points that characterize device attributes, is changed at line-network level to a device-centered table design, that is, all output characterization points that the applications need for each device, together with the attributes of the device, are gathered in one table; the device-centered line-network table contains n output characterization points, as shown in Table 3:
Table 3;
Step 2.2.3.3: The annual passenger flow analysis table is a data cube with three dimensions: month, station, and passenger flow attribute information; the three dimensions are combined into Table 4:
Table 4;
Step 2.2.3.4: For train operation data, the actual and planned train diagrams are unfolded, and the train status information, load factor, running time, and dwell time are combined in Table 5:
Table 5.
7. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 1, characterized in that the physical modeling comprises the following steps:
Step 2.3.1: The input data imported into the Hadoop platform are organized as small files of less than 64 MB in Avro format and stored in the temporary data area in HBase;
Step 2.3.2: For data storage in the HBase historical data area, multiple Regions are configured, and data are stored in the corresponding Region according to the RowKey design; the RowKey contains the change time, which is random over the long term, ensuring that data are stored randomly across the Regions, i.e., that data are distributed evenly over the data nodes;
Step 2.3.3: Hive tables are partitioned by month, i.e., 12 partitions per year; the tables are reviewed so that int replaces the string type, timestamp replaces the date type, and the Decimal type replaces the Float or Double type.
8. The data modeling and storage method for the line-network big data of a rail transit command center according to claim 1, characterized in that step 3 comprises the following step:
Step 3.1: Because the number of files is large, the storage directories are partitioned by the application from which the files originate, in five categories: historical archives, emergency command, system logs, reports, and crawled web content, as shown in Table 6; files are then classified by specific application, and finally the latter four categories are archived in half-year periods; files are compressed with Snappy when stored, ready for further retrieval and content analysis of the unstructured data;
Table 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711426597.6A CN108009290B (en) | 2017-12-25 | 2017-12-25 | Data modeling and storage method for large data of rail transit command center line network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711426597.6A CN108009290B (en) | 2017-12-25 | 2017-12-25 | Data modeling and storage method for large data of rail transit command center line network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108009290A true CN108009290A (en) | 2018-05-08 |
CN108009290B CN108009290B (en) | 2022-03-15 |
Family
ID=62061202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711426597.6A Active CN108009290B (en) | 2017-12-25 | 2017-12-25 | Data modeling and storage method for large data of rail transit command center line network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108009290B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109947896A (en) * | 2019-03-11 | 2019-06-28 | 浙江邦盛科技有限公司 | A kind of unstructured flow data real-time storage method of rail traffic |
CN111144696A (en) * | 2019-11-28 | 2020-05-12 | 国电南瑞科技股份有限公司 | Rail transit data analysis method based on big data |
CN112693502A (en) * | 2019-10-23 | 2021-04-23 | 上海宝信软件股份有限公司 | Urban rail transit monitoring system and method based on big data architecture |
CN112905571A (en) * | 2021-01-07 | 2021-06-04 | 中车工业研究院有限公司 | Train rail transit sensor data management method and device |
CN113704310A (en) * | 2021-09-08 | 2021-11-26 | 北京城建设计发展集团股份有限公司 | Rail transit driving command system and driving data processing method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7325010B1 (en) * | 1999-12-22 | 2008-01-29 | Chungtae Kim | Information modeling method and database searching method using the information modeling method |
CN102779186A (en) * | 2012-06-29 | 2012-11-14 | 浙江大学 | Whole process modeling method of unstructured data management |
CN104217003A (en) * | 2014-09-15 | 2014-12-17 | 国家电网公司 | Data modeling system |
CN107133273A (en) * | 2017-04-07 | 2017-09-05 | 青岛海信网络科技股份有限公司 | A kind of transit's routes data processing method and server cluster based on big data |
2017-12-25: CN application CN201711426597.6A filed; patent CN108009290B, status Active
Non-Patent Citations (1)
Title |
---|
孙乐: "海洋环境监测数据建模及索引技术研究", 《中国优秀硕士学位论文全文数据库基础科学辑》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109947896A (en) * | 2019-03-11 | 2019-06-28 | 浙江邦盛科技有限公司 | A kind of unstructured flow data real-time storage method of rail traffic |
CN112693502A (en) * | 2019-10-23 | 2021-04-23 | 上海宝信软件股份有限公司 | Urban rail transit monitoring system and method based on big data architecture |
CN111144696A (en) * | 2019-11-28 | 2020-05-12 | 国电南瑞科技股份有限公司 | Rail transit data analysis method based on big data |
CN112905571A (en) * | 2021-01-07 | 2021-06-04 | 中车工业研究院有限公司 | Train rail transit sensor data management method and device |
CN112905571B (en) * | 2021-01-07 | 2024-03-19 | 中车工业研究院有限公司 | Train rail transit sensor data management method and device |
CN113704310A (en) * | 2021-09-08 | 2021-11-26 | 北京城建设计发展集团股份有限公司 | Rail transit driving command system and driving data processing method |
Also Published As
Publication number | Publication date |
---|---|
CN108009290B (en) | 2022-03-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108009290A (en) | A kind of data modeling and storage method of track traffic command centre gauze big data | |
CN105631003B (en) | Support intelligent index construct, inquiry and the maintaining method of mass data classified statistic | |
CN104361018B (en) | Electronic archives information reorganization method and device | |
CN102929901B (en) | The method and apparatus improving data warehouse performance | |
CN104102737B (en) | A kind of historical data storage method and system | |
CN106611046A (en) | Big data technology-based space data storage processing middleware framework | |
CN104318376A (en) | Library information management system | |
CN106708993A (en) | Spatial data storage processing middleware framework realization method based on big data technology | |
CN103631909A (en) | System and method for combined processing of large-scale structured and unstructured data | |
CN104239377A (en) | Platform-crossing data retrieval method and device | |
CN108446391A (en) | Processing method, device, electronic equipment and the computer-readable medium of data | |
CN107506477A (en) | A kind of archive management system | |
CN105378730A (en) | Social media content analysis and output | |
CN107644050A (en) | A kind of querying method and device of the Hbase based on solr | |
CN106599040A (en) | Layered indexing method and search method for cloud storage | |
CN107391502A (en) | The data query method, apparatus and index structuring method of time interval, device | |
CN103198150A (en) | Big data indexing method and system | |
CN104809252A (en) | Internet data extraction system | |
CN110990676A (en) | Social media hotspot topic extraction method and system | |
CN104750826A (en) | Structural data resource metadata automatically-identifying and dynamically-registering method | |
CN101963993B (en) | Method for fast searching database sheet table record | |
CN108038188A (en) | A kind of document handling method and device | |
CN103455896A (en) | Paperless assembling quality control method based on internet of things | |
CN104834739A (en) | Internet information storage system | |
CN107480235A (en) | A kind of database framework of data platform |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 2022-11-28; Address after: 210006 Building 2, No. 19, Chengxin Avenue, Jiangning Economic and Technological Development Zone, Nanjing, Jiangsu Province; Patentee after: NARI TECHNOLOGY Co.,Ltd.; Patentee after: NARI Rail Transit Technology Co.,Ltd.; Address before: No. 19, Jiangning District, Jiangning District, Nanjing, Jiangsu; Patentee before: NARI TECHNOLOGY Co.,Ltd.; Patentee before: NARI NANJING CONTROL SYSTEM Co.,Ltd. |