CN110290179A - A distributed mobile base station data storage system based on Hadoop - Google Patents
- Publication number
- CN110290179A (application number CN201910469125.1A)
- Authority
- CN
- China
- Prior art keywords
- hadoop
- node
- layer
- base station
- mobile base
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Abstract
The present invention relates to a distributed mobile base station data storage system based on Hadoop. The system comprises an interface layer, a functional layer, a data layer and a physical layer connected in sequence. The physical layer includes at least one application server, a backup server and a core layer switch; each application server, each backup server and the data layer are each connected to the core layer switch. The data layer includes a Linux storage cluster that uses a Hadoop cluster platform. The Hadoop cluster platform includes YARN, an index database, an HBase database, a MySQL database and ZooKeeper, which provides the distributed coordination service; at the bottom of the platform is HDFS, which stores the files on all nodes. Compared with the prior art, the invention has advantages such as increased capacity and improved data compatibility.
Description
Technical field
The present invention relates to the technical field of mobile base station data, and more particularly to a distributed mobile base station data storage system based on Hadoop.
Background
The key points of mobile base station data access technology include the following. 1. Flexible scaling of data retrieval performance: the data volume is large and the update frequency is high, and short-term peaks occur that place a heavy load on platform components. Because a distributed message queue is deployed in cluster mode, its hardware resources can be scaled out horizontally on demand, so applying a distributed message queue can effectively absorb these peaks. 2. Creation and tuning of distributed message queue topics: to ingest massive, high-frequency data in real time, a data distribution strategy must be designed around the generation frequency, collection period and measuring-point scale of the time series data; monitoring data is distributed to the distributed message queue by data category; parameters such as the number of partitions, the replication factor and the topic distribution of each data category are adjusted according to system load; the storage structure of time series data in the distributed message queue is configured to achieve high-speed writes and reduce conversion overhead; and a recovery mechanism based on the distributed message queue must be implemented to ensure that no data is lost.
The data storage stage implements distributed storage of the collected data. In principle, collected measurement data is stored in the big data platform's distributed column-oriented database (HBase), and recent data (the most recent half day or one day) is cached in the big data platform's distributed in-memory database (Redis), so that applications with stricter real-time requirements can process it.
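The split between a full column store and a cache of recent data can be sketched as follows. This is only an illustration under stated assumptions: two in-memory dictionaries stand in for HBase and Redis, and the class name `TieredStore`, the 12-hour window and the method names are hypothetical, not part of the described platform.

```python
from datetime import datetime, timedelta

# Assumed retention for the hot cache ("half a day"); the text also allows one day.
HOT_WINDOW = timedelta(hours=12)


class TieredStore:
    """Toy model: every record goes to the column store, recent ones also to the cache."""

    def __init__(self):
        self.column_store = {}  # stand-in for HBase (keeps everything)
        self.hot_cache = {}     # stand-in for Redis (keeps recent data only)

    def write(self, key, ts, value, now):
        self.column_store[(key, ts)] = value
        if now - ts <= HOT_WINDOW:
            self.hot_cache[(key, ts)] = value

    def evict(self, now):
        # Drop cached records that have aged out of the hot window.
        stale = [k for k in self.hot_cache if now - k[1] > HOT_WINDOW]
        for k in stale:
            del self.hot_cache[k]

    def read(self, key, ts):
        # Serve from the cache when possible, otherwise fall back to the store.
        if (key, ts) in self.hot_cache:
            return self.hot_cache[(key, ts)], "cache"
        return self.column_store.get((key, ts)), "store"
```

A real deployment would more likely rely on the cache's own expiry facility than on an explicit `evict` pass; the explicit pass is used here only to keep the sketch self-contained.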
However, as the number of mobile terminals keeps growing, the volume of mobile base station data keeps increasing, and conventional methods cannot cope with mobile base station data at this scale. Existing commercial private-cloud GIS platforms are expensive to deploy, offer poor data compatibility, and make it difficult to customize back-end service interfaces.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art by providing a distributed mobile base station data storage system based on Hadoop.
The object of the present invention can be achieved through the following technical solution:
A distributed mobile base station data storage system based on Hadoop comprises an interface layer, a functional layer, a data layer and a physical layer connected in sequence. The physical layer includes at least one application server, a backup server and a core layer switch; each application server, each backup server and the data layer are each connected to the core layer switch. The data layer includes a Linux storage cluster that uses a Hadoop cluster platform. The Hadoop cluster platform includes YARN, an index database, an HBase database, a MySQL database and ZooKeeper, which provides the distributed coordination service; at the bottom of the Hadoop cluster platform is HDFS, which stores the files on all nodes.
The Hadoop cluster platform has a tree structure comprising internal nodes and leaf nodes. Each internal node represents a router or a core layer switch, and each leaf node represents a machine on which a DataNode is deployed. A DataNode responds to read and write requests from HDFS, and to block creation, deletion and replication commands from the NameNode.
The administrator of the Hadoop cluster platform specifies a script file by setting a configuration parameter. After the NameNode starts successfully, it automatically loads and executes this script, which translates the IP address of each DataNode in the cluster into the corresponding rack name. If the parameter is not set, the IP address of every DataNode is resolved to the default rack. The NameNode receives the periodic heartbeat messages of each DataNode.
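A rack-mapping script of the kind described might look like the following minimal sketch. The IP addresses and rack names in `RACK_TABLE` are made up for illustration; the sketch assumes the usual Hadoop convention that the script receives one or more addresses as command-line arguments and prints one rack name per address, and it uses `/default-rack` as the fallback named in the detailed description.

```python
#!/usr/bin/env python3
import sys

DEFAULT_RACK = "/default-rack"

# Hypothetical mapping maintained by the cluster administrator.
RACK_TABLE = {
    "192.168.1.11": "/dc1/rack1",
    "192.168.1.12": "/dc1/rack1",
    "192.168.2.21": "/dc1/rack2",
}


def resolve(addresses):
    """Return one rack name per address, falling back to the default rack."""
    return [RACK_TABLE.get(addr, DEFAULT_RACK) for addr in addresses]


if __name__ == "__main__":
    # Print the rack names separated by spaces, in argument order.
    print(" ".join(resolve(sys.argv[1:])))
```

Keeping the table in the script itself is the simplest choice; a larger cluster would more plausibly load it from a file that operations staff maintain.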
In the Hadoop cluster platform, each DataNode actively contacts the NameNode once every heartbeat interval. The heartbeat interval is set by a configuration parameter, and a maximum duration can also be set by a configuration parameter. If the NameNode finds that a node has still not contacted it after the maximum duration has elapsed, it considers that node dead and marks it as a DeadNode.
The MySQL database is synchronized with the HBase database through Sqoop.
The interface layer uses the Java API programming interface.
The functional layer includes a point-of-interest attribute query unit, a point-of-interest spatial query unit and a point-of-interest management unit.
Compared with the prior art, the invention has the following advantages:
(1) Using the Hadoop distributed architecture, the system can combine multiple inexpensive machines into a cluster in which the physical disks of the machines form one large logical store. This greatly increases capacity, improves data compatibility, and resolves the difficulty of customizing back-end service interfaces.
(2) The distributed data storage of the invention uses the Java API as the interface layer; the physical layer is equipped with at least one application server and a backup server; and the data layer includes YARN, an index database, an HBase database, a MySQL database and ZooKeeper, a key component that provides distributed services for Hadoop and HBase. The data layer is sufficiently reliable to store data safely and completely. In this system the number of replicas kept on the DataNodes is set to three, so that even if a failure occurs, data can quickly be re-replicated and backed up from other nodes.
Description of the drawings
Fig. 1 is a structural schematic diagram of the system of the present invention.
Detailed description
The present invention is described in detail below with reference to the accompanying drawing and specific embodiments. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
As shown in Fig. 1, the present invention relates to a distributed mobile base station data storage system based on Hadoop, comprising an interface layer, a functional layer, a data layer and a physical layer. The interface layer uses the Java API programming interface. The functional layer is provided with a point-of-interest attribute query unit, a point-of-interest spatial query unit, a point-of-interest management unit and the like. The physical layer is provided with at least one application server, a backup server and at least one core layer switch.
The data layer includes a Linux storage cluster, built from PCs running CentOS 6.5 Linux. The cluster uses Hadoop as its underlying platform. Hadoop is a software framework for the distributed processing of massive data in a reliable, efficient and scalable way: it works in parallel, which speeds up processing, and it scales to petabyte-level data. The Hadoop framework consists of many elements. At the bottom is HDFS (Hadoop Distributed File System), which stores the files on all storage nodes in the Hadoop cluster; through HDFS, a client user can create, delete, move or rename files. The data layer further includes YARN (Yet Another Resource Negotiator), an index database, an HBase database, a MySQL database and ZooKeeper, a key component that provides distributed services for Hadoop and HBase. The MySQL database is synchronized with the HBase database through Sqoop. The HBase database is linked to the index database through Lucene to implement retrieval. The entire data layer uses MapReduce as its distributed computing framework.
The present invention uses the distributed file system HDFS as the storage layer for HBase, Hive and other application data. YARN, as the cluster's resource manager, is responsible for resource management and scheduling. The Hive data warehouse supports operations such as HQL queries on the data. HBase is a distributed column-oriented database used for structured data. ZooKeeper provides the distributed coordination service and is responsible for coordinating the various services in the system.
HDFS rack topology. A Hadoop cluster is organized as a tree made up of internal nodes and leaf nodes: an internal node generally represents a router or switch, and a leaf node represents a machine on which a DataNode is deployed. By default, HDFS cannot determine the rack topology, that is, the topology of the DataNodes, by itself. The NameNode receives the periodic heartbeat messages of each DataNode. A DataNode responds to read and write requests from HDFS clients, and to block creation, deletion and replication commands from the NameNode. The cluster administrator can, however, specify a script file through the topology.script.file.name configuration parameter. After the NameNode starts successfully, it automatically loads and executes this script, which translates the IP address of each DataNode in the cluster into the corresponding rack name; if the parameter is not set, the IP address of every DataNode is resolved to /default-rack. On this topology a measure called network distance is defined: the distance from a node to its parent node is 1, and the distance between any two nodes is the sum of their distances to their nearest common ancestor. In general, to make network communication faster, the distance between communicating nodes should be kept as small as possible. Clearly, network communication within a rack is faster than communication between racks.
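The network distance defined above can be computed directly from tree paths. The sketch below assumes that each node is identified by a slash-separated path such as `/dc1/rack1/node1`; that path format and the function names are illustrative choices, not something the text prescribes.

```python
def path_parts(path):
    """Split '/dc1/rack1/node1' into ['dc1', 'rack1', 'node1']."""
    return [part for part in path.split("/") if part]


def network_distance(a, b):
    """Sum of the hops from each node up to their nearest common ancestor."""
    pa, pb = path_parts(a), path_parts(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Each remaining path component is one hop (distance 1) to a parent.
    return (len(pa) - common) + (len(pb) - common)
```

With these paths, two nodes in the same rack are at distance 2 and two nodes in different racks under the same parent are at distance 4, which matches the observation that communication within a rack is the cheaper case.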
Heartbeat mechanism. The NameNode does not actively initiate interaction with the DataNodes, so every connection between them is initiated by a DataNode. The main purpose is to reduce the load on the NameNode, which also helps keep the cluster stable; in addition, dynamically adding nodes to or removing nodes from the cluster does not significantly affect the NameNode. A heartbeat mechanism is therefore needed, in which each DataNode actively contacts the NameNode once at regular intervals. The heartbeat interval is set through the dfs.heartbeat.interval configuration parameter, and a maximum duration can also be set. When the NameNode finds that a node has not contacted it for longer than this maximum duration, it considers the node dead and marks it as a DeadNode.
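The dead-node rule can be illustrated with a small bookkeeping class. This is a sketch, not NameNode code: `HeartbeatMonitor` and its methods are hypothetical names, and times are plain seconds supplied by the caller. The helper `expiry_seconds` reflects our understanding of how stock HDFS derives the maximum duration (twice the recheck interval plus ten heartbeat intervals); that formula is an assumption added here and is not stated in the text above.

```python
def expiry_seconds(recheck_interval_s, heartbeat_interval_s):
    # Assumed stock-HDFS rule: 2 * recheck interval + 10 * heartbeat interval.
    return 2 * recheck_interval_s + 10 * heartbeat_interval_s


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes silent for too long."""

    def __init__(self, max_silence_s):
        self.max_silence_s = max_silence_s
        self.last_seen = {}

    def record_heartbeat(self, node_id, now_s):
        self.last_seen[node_id] = now_s

    def dead_nodes(self, now_s):
        # A node is dead once its silence strictly exceeds the maximum duration.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now_s - seen > self.max_silence_s
        )
```

Under that assumed rule, a 300-second recheck interval and a 3-second heartbeat interval give a maximum duration of 2 × 300 + 10 × 3 = 630 seconds.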
After the cluster starts, the offerService method follows the configured heartbeat interval: if it is set to 5 seconds, the method calls the NameNode's sendHeartbeat method through RPC every 5 seconds. When the NameNode starts, it creates an RPC server that listens for RPC requests from the DataNodes, and the NameNode's sendHeartbeat method in turn calls the handleHeartbeat method. Through this heartbeat mechanism, the NameNode can send instructions to a DataNode, such as instructions to add or delete data.
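Because the NameNode never opens the connection itself, its instructions travel back in the reply to a heartbeat. The following sketch models that exchange with ordinary method calls in place of RPC; the stub classes, the command tuples and the snake_case method names (which mirror offerService, sendHeartbeat and handleHeartbeat) are illustrative assumptions rather than Hadoop's actual classes.

```python
from collections import defaultdict, deque


class NameNodeStub:
    """Queues commands per DataNode and hands them out in heartbeat replies."""

    def __init__(self):
        self.pending = defaultdict(deque)

    def schedule(self, node_id, command):
        # Commands wait here until the target DataNode next reports in.
        self.pending[node_id].append(command)

    def send_heartbeat(self, node_id):
        # Entry point invoked by the DataNode (RPC in a real cluster).
        return self.handle_heartbeat(node_id)

    def handle_heartbeat(self, node_id):
        commands = list(self.pending[node_id])
        self.pending[node_id].clear()
        return commands


class DataNodeStub:
    def __init__(self, node_id, namenode):
        self.node_id = node_id
        self.namenode = namenode
        self.executed = []

    def offer_service_once(self):
        # One heartbeat round: report in, then carry out whatever came back.
        for command in self.namenode.send_heartbeat(self.node_id):
            self.executed.append(command)
```

In a running system `offer_service_once` would be called in a loop, sleeping for the configured heartbeat interval between rounds.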
Rack awareness. Rack awareness is implemented on the basis of the network topology. Because replica placement in HDFS is critical to data reliability and cluster performance, the rack awareness policy is intended to improve data reliability and safety and to make better use of network bandwidth. To guard against the failure of an entire rack, replicas are placed in different racks; this also makes full use of each rack's bandwidth and improves the overall performance of the cluster. If each file has the default of 3 replicas, the first replica is placed in the local rack, the second replica is placed in another, randomly selected rack, and the third replica is placed on a different machine in the same rack as the second. If there are more than 3 replicas, the nodes for the remaining replicas are selected at random. In this simple way the reliability and safety of the data are greatly improved. Before a replica is stored on a node, the node is first checked to determine whether it is in a usable state. The isGoodTarget method in the NameNode class first calculates whether the disk has enough space to write the current replica; if not, another node is selected. It then counts the operations the node is currently executing: if the node's current operation count exceeds 2 times the cluster's current average operation count, the node is considered overloaded, no new replica is stored on it, and another node is checked instead. This policy not only ensures a certain level of write performance but also, within limits, ensures load balancing and the reliability and safety of the data.
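The placement rules and the node check described above can be sketched together. This is a simplified model under stated assumptions: nodes are plain dictionaries with hypothetical `name`, `rack`, `free` and `load` fields, and the fallback behaviour when a rule cannot be satisfied (pick any remaining good node) is our own choice rather than something the text specifies.

```python
import random


def is_good_target(node, block_size, avg_load):
    """Reject nodes without enough space or with more than twice the average load."""
    if node["free"] < block_size:
        return False
    if node["load"] > 2 * avg_load:
        return False
    return True


def choose_targets(nodes, writer_rack, block_size, replicas=3, rng=random):
    avg_load = sum(n["load"] for n in nodes) / len(nodes)
    good = [n for n in nodes if is_good_target(n, block_size, avg_load)]
    chosen = []

    def pick(predicate):
        if len(chosen) >= replicas:
            return
        candidates = [n for n in good if n not in chosen and predicate(n)]
        if candidates:
            chosen.append(rng.choice(candidates))

    # 1st replica: a node in the writer's own rack.
    pick(lambda n: n["rack"] == writer_rack)
    # 2nd replica: a node in some other rack.
    pick(lambda n: n["rack"] != writer_rack)
    # 3rd replica: a different machine in the same rack as the 2nd.
    if len(chosen) == 2:
        second_rack = chosen[1]["rack"]
        pick(lambda n: n["rack"] == second_rack)
    # Further replicas, and any slot left unfilled above: any remaining good node.
    while len(chosen) < replicas and len(chosen) < len(good):
        pick(lambda n: True)
    return [n["name"] for n in chosen]
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient when checking the policy against a small hand-built cluster.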
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and such modifications or substitutions shall be covered by the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of protection of the claims.
Claims (7)
1. A distributed mobile base station data storage system based on Hadoop, characterized in that the system comprises an interface layer, a functional layer, a data layer and a physical layer connected in sequence; the physical layer includes at least one application server, a backup server and a core layer switch; each application server, each backup server and the data layer are each connected to the core layer switch; the data layer includes a Linux storage cluster; the Linux storage cluster uses a Hadoop cluster platform; the Hadoop cluster platform includes YARN, an index database, an HBase database, a MySQL database and ZooKeeper for providing the distributed coordination service; and the bottom of the Hadoop cluster platform is provided with HDFS for storing the files on all nodes.
2. The distributed mobile base station data storage system based on Hadoop according to claim 1, characterized in that the Hadoop cluster platform has a tree structure comprising internal nodes and leaf nodes; each internal node represents a router or a core layer switch; each leaf node represents a machine on which a DataNode is deployed; and the DataNode responds to read and write requests from HDFS, and to block creation, deletion and replication commands from the NameNode.
3. The distributed mobile base station data storage system based on Hadoop according to claim 2, characterized in that the administrator of the Hadoop cluster platform specifies a script file by setting a configuration parameter; after the NameNode starts successfully, it automatically loads and executes the script, which translates the IP address of each DataNode in the cluster into the corresponding rack name; if the parameter is not set, the IP address of every DataNode is resolved to the default rack; and the NameNode receives the periodic heartbeat messages of each DataNode.
4. The distributed mobile base station data storage system based on Hadoop according to claim 3, characterized in that, in the Hadoop cluster platform, each DataNode actively contacts the NameNode once every heartbeat interval; the heartbeat interval is set by a configuration parameter, and a maximum duration can also be set by a configuration parameter; and if the NameNode finds that a node has still not contacted it after the maximum duration has elapsed, it considers that node dead and marks it as a DeadNode.
5. The distributed mobile base station data storage system based on Hadoop according to claim 1, characterized in that the MySQL database is synchronized with the HBase database through Sqoop.
6. The distributed mobile base station data storage system based on Hadoop according to claim 1, characterized in that the interface layer uses the Java API programming interface.
7. The distributed mobile base station data storage system based on Hadoop according to claim 1, characterized in that the functional layer includes a point-of-interest attribute query unit, a point-of-interest spatial query unit and a point-of-interest management unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910469125.1A CN110290179A (en) | 2019-05-31 | 2019-05-31 | A kind of distributed mobile base station data storage system based on Hadoop |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110290179A true CN110290179A (en) | 2019-09-27 |
Family
ID=68003021
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910469125.1A Pending CN110290179A (en) | 2019-05-31 | 2019-05-31 | A kind of distributed mobile base station data storage system based on Hadoop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110290179A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106709003A (en) * | 2016-12-23 | 2017-05-24 | 长沙理工大学 | Hadoop-based mass log data processing method |
CN107800808A (en) * | 2017-11-15 | 2018-03-13 | 广东奥飞数据科技股份有限公司 | A kind of data-storage system based on Hadoop framework |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190927 |