CN106484857A - Data collecting system and its method - Google Patents
- Publication number: CN106484857A (application CN201610881306.1A)
- Authority: CN (China)
- Prior art keywords: business data, data, module, business, buffer unit
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
- G06F16/25—Integrating or interfacing systems involving database management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention discloses a data collecting system and a method thereof. The data collecting system includes at least one slave computer, a host computer, and a database connected to the host computer. The host computer includes a communication device and an information collecting device. The communication device includes a communication unit connected to the slave computer cluster and a data preprocessing unit connected to the communication unit. The information collecting device includes a first buffer unit connected to the data preprocessing unit, a buffer processing unit connected to the first buffer unit, a second buffer unit connected to the buffer processing unit, and an information storage unit connected to the second buffer unit. The invention can meet the demand for collecting large volumes of high-frequency concurrent data while effectively reducing computing resources and hardware cost.
Description
Technical field
The present invention relates to the technical field of data acquisition, and in particular to a data collecting system and a method thereof.
Background art
In recent years, with the rapid development of computing, the Internet of Things, and related fields, large numbers of sensors and other data-producing software and hardware devices have come into use across social activity, including production and daily life. The data they produce bring great value to the national economy and to everyday life. The number of data-producing devices keeps growing, and some Internet of Things devices in particular run for long periods, generate data at high frequency, and produce very large volumes of real-time data. This creates serious technical difficulties for data collection, real-time display, and subsequent data analysis, and existing data collection technology cannot solve them effectively.
Faced with the demand for collecting large volumes of high-frequency concurrent data, existing data collection technology generally takes one of two directions. In the hardware direction, a computer cluster architecture is used and a large number of data processing devices are added. This approach requires continual investment in more hardware, raises the energy consumption and the management and operations cost of the system, and greatly increases the overall cost of using the data. In the combined software and hardware direction, software is used to load-balance the concurrent data, and some dedicated load-balancing hardware is also required. With this approach the system environment is complex to configure, the system is highly specialized and difficult to use, and the equipment and software are expensive to purchase. The technical skill demanded of operations staff also indirectly leads to high operations cost and greater stability risk.
Summary of the invention
In view of the above problems, the present invention provides a data collecting system and a method thereof that can meet the demand for collecting large volumes of high-frequency concurrent data while effectively reducing computing resources and hardware cost.
The technical solution of the present invention is as follows:
A data collecting system includes at least one slave computer, a host computer, and a database connected to the host computer. The host computer includes a communication device and an information collecting device.
The communication device includes:
A communication unit connected to the slave computer. The communication unit receives the business data reported by the slave computer and passes it to the data preprocessing unit.
A data preprocessing unit connected to the communication unit. The data preprocessing unit includes a convergence module and a distribution module connected to the convergence module. The convergence module adds the device identifier of the corresponding slave computer and a data reception timestamp to each item of business data and then sends the item to the distribution module. The distribution module saves the business data processed by the convergence module to the first buffer unit.
The information collecting device includes:
A first buffer unit connected to the data preprocessing unit.
A buffer processing unit connected to the first buffer unit. The buffer processing unit includes a filtering module, a classification module connected to the filtering module, and a cleaning module. The filtering module periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed. It also checks the validity of the extracted business data, marks each item as valid data or invalid data according to the result, and forms a valid data queue from the items marked as valid. The classification module classifies the business data in the valid data queue according to preset classification rules and stores the classified business data in the second buffer unit. The cleaning module checks, at a preset cleaning cycle, the mark of each item of business data stored in the first buffer unit and removes from the first buffer unit the items marked as processed.
A second buffer unit connected to the buffer processing unit.
An information storage unit connected to the second buffer unit. The information storage unit includes a compression module and a writing module connected to the compression module. The compression module extracts business data from the second buffer unit at a preset extraction cycle, forms a pending business data queue from the extracted business data that meets the data sampling space requirement, and performs duplicate business data processing on the pending queue to generate a non-duplicate business data queue. The writing module packs the non-duplicate business data queue at a preset packing cycle and saves the packed business data to the database in batches.
Further, the communication unit also detects the communication state between the host computer and the slave computer, and detects and updates the working state of the slave computer.
Further, the data preprocessing unit also includes a real-time display module connected to the distribution module, which displays business data at a preset display cycle. The distribution module distributes the business data processed by the convergence module to the real-time display module at a preset distribution cycle.
Further, the filtering module checks the validity of the extracted business data as follows: it determines whether the data identification number of the extracted business data is a data identification number of interest. If so, the business data is marked as valid data; otherwise, it is marked as invalid data.
Further, the compression module performs duplicate business data processing on the pending business data queue as follows. According to a duplicate data rule, the compression module determines whether adjacent data in the pending business data queue are duplicate business data. If so, it records the data value and data reception timestamp of one of the duplicate items; otherwise, it records the data value and data reception timestamp of the non-duplicate item.
The compression module generates the non-duplicate business data queue from the recorded data values and data reception timestamps. For each item of business data whose data value it has recorded, the compression module marks the corresponding item stored in the second buffer unit as processed. After the writing module has saved the packed non-duplicate business data queue to the database in a batch, it removes from the second buffer unit the items marked as processed.
Further, the workflow of the communication unit is as follows:
A1: Detect the communication state between the host computer and the slave computer at a preset detection cycle; execute A2.
A2: Determine whether the host computer and the slave computer can communicate normally. If so, execute A4; otherwise, execute A3.
A3: Update the working state of the slave computer to offline; execute A7.
A4: Determine whether the working state of the slave computer is normal. If so, execute A5; otherwise, execute A6.
A5: Update the working state of the slave computer to normal; execute A7.
A6: Update the working state of the slave computer to faulty; execute A7.
A7: Wait for the next preset detection cycle.
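The decision logic of A2 to A6 can be sketched in a few lines of Python. This is only an illustration: the names `SlaveState`, `detect_state`, and `run_detection_cycle` are hypothetical, and the two boolean probe results stand in for whatever communication and health checks a real host computer would perform.

```python
from enum import Enum


class SlaveState(Enum):
    OFFLINE = "offline"
    NORMAL = "normal"
    FAULT = "fault"


def detect_state(can_communicate: bool, working_normally: bool) -> SlaveState:
    """One pass of A2 to A6 for a single slave computer."""
    if not can_communicate:      # A2 fails, so A3 applies
        return SlaveState.OFFLINE
    if working_normally:         # A4 passes, so A5 applies
        return SlaveState.NORMAL
    return SlaveState.FAULT      # A4 fails, so A6 applies


def run_detection_cycle(probes: dict) -> dict:
    """A1: check every slave once; the caller waits for the next cycle (A7)."""
    return {slave_id: detect_state(*probe) for slave_id, probe in probes.items()}


# Hypothetical probe results: (can_communicate, working_normally) per slave.
states = run_detection_cycle({
    "slave-1": (True, True),
    "slave-2": (True, False),
    "slave-3": (False, False),
})
```

A caller would invoke `run_detection_cycle` once per preset detection cycle and store the returned states.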
Further, the workflow of the data preprocessing unit is as follows:
B1: The convergence module adds the device identifier of the corresponding slave computer and a data reception timestamp to each item of business data received and then sends the item to the distribution module; execute B2.
B2: The distribution module saves the business data processed by the convergence module to the first buffer unit; execute B3.
B3: The distribution module distributes the business data processed by the convergence module to the real-time display module at a preset distribution cycle; execute B4.
B4: The real-time display module displays the business data at a preset display cycle.
Before the workflow of the data preprocessing unit, the following also takes place:
The host computer enters the data receiving state, and the slave computer reports business data to the host computer.
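As a minimal sketch of B1 to B3, the convergence and distribution modules could look like the following. The record layout (`data_id`, `value`), the `BusinessRecord` class, and the use of plain Python lists for the first buffer unit and the display queue are assumptions made for illustration; the preset distribution cycle is omitted, so records reach the display queue immediately.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class BusinessRecord:
    data_id: str               # data identification number
    value: float               # reported data value
    device_id: str = ""        # added by the convergence module
    received_at: float = 0.0   # reception timestamp, added by the convergence module


def converge(raw: dict, device_id: str, now: Optional[float] = None) -> BusinessRecord:
    """B1: tag a reported item with its slave's device identifier and a reception timestamp."""
    ts = time.time() if now is None else now
    return BusinessRecord(raw["data_id"], raw["value"], device_id, ts)


def distribute(record: BusinessRecord, first_buffer: list, display_queue: list) -> None:
    """B2 and B3: save to the first buffer and hand a reference to the display side."""
    first_buffer.append(record)
    display_queue.append(record)


first_buffer: list = []
display_queue: list = []
record = converge({"data_id": "temp", "value": 21.5}, device_id="slave-1", now=100.0)
distribute(record, first_buffer, display_queue)
```

Passing `now` explicitly keeps the example deterministic; in normal use the timestamp would come from the host computer's clock.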
Further, the workflow of the buffer processing unit is as follows:
C1: The filtering module periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed; execute C2.
C2: The filtering module determines whether the data identification number of the extracted business data is a data identification number of interest. If so, execute C3; otherwise, execute C4.
C3: The filtering module marks the business data as valid data; execute C5.
C4: The filtering module marks the business data as invalid data.
C5: The filtering module forms a valid data queue from the items of business data marked as valid; execute C6.
C6: The classification module classifies the business data in the valid data queue according to preset classification rules and stores the classified business data in the second buffer unit.
After step C1, the workflow of the buffer processing unit also includes the following step:
The cleaning module checks, at a preset cleaning cycle, the mark of each item of business data stored in the first buffer unit and removes from the first buffer unit the items marked as processed.
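The filtering, classification, and cleaning steps can be sketched as follows. The set `INTERESTING_IDS`, the dictionary record layout, and the choice of "group by data identification number" as the classification rule are all assumptions for illustration; the source leaves the concrete rules as presets.

```python
from collections import defaultdict

# Assumed set of data identification numbers of interest.
INTERESTING_IDS = {"temp", "power"}


def filter_batch(first_buffer: list, batch_size: int) -> list:
    """C1 to C5: take unprocessed items, mark them processed, keep the valid ones."""
    batch = [r for r in first_buffer if not r["processed"]][:batch_size]
    valid_queue = []
    for record in batch:
        record["processed"] = True                               # C1
        record["valid"] = record["data_id"] in INTERESTING_IDS   # C2 to C4
        if record["valid"]:
            valid_queue.append(record)                           # C5
    return valid_queue


def classify(valid_queue: list, second_buffer: dict) -> None:
    """C6: store copies in the second buffer, grouped by the assumed rule."""
    for record in valid_queue:
        # The copy gets its own 'processed' mark for use in the second buffer.
        second_buffer[record["data_id"]].append(dict(record, processed=False))


def clean(first_buffer: list) -> None:
    """Cleaning module: drop items marked processed from the first buffer."""
    first_buffer[:] = [r for r in first_buffer if not r["processed"]]


first_buffer = [
    {"data_id": "temp", "value": 21.5, "processed": False},
    {"data_id": "debug", "value": 0.0, "processed": False},
    {"data_id": "power", "value": 3.2, "processed": False},
]
second_buffer = defaultdict(list)
classify(filter_batch(first_buffer, batch_size=2), second_buffer)
clean(first_buffer)
```

With a batch size of 2, only the first two items are extracted in this pass; the third stays in the first buffer until the next extraction.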
Further, the workflow of the information storage unit is as follows:
D1: The compression module extracts business data from the second buffer unit at a preset extraction cycle; execute D2.
D2: The compression module forms a pending business data queue from the extracted business data that meets the data sampling space requirement; execute D3.
D3: According to the duplicate data rule, the compression module determines whether adjacent data in the pending business data queue are duplicate business data. If so, execute D4; otherwise, execute D5.
D4: Record the data value and data reception timestamp of one of the duplicate items; execute D6.
D5: Record the data value and data reception timestamp of the non-duplicate item; execute D6.
D6: The compression module generates the non-duplicate business data queue from the recorded data values and data reception timestamps; execute D7.
D7: For each item of business data whose data value it has recorded, the compression module marks the corresponding item stored in the second buffer unit as processed. Meanwhile, the writing module packs the non-duplicate business data queue at a preset packing cycle and saves the packed business data to the database in batches; execute D8.
D8: The writing module removes from the second buffer unit the items marked as processed.
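A rough sketch of D3 to D8 follows, using Python's standard `sqlite3` module as a stand-in database. Several points are assumptions rather than the patent's specification: the duplicate rule (same identification number and same value as the previous item), the table name `business_data`, and the choice to mark every examined item as processed so that skipped duplicates are also cleared from the second buffer.

```python
import sqlite3


def compress(pending: list) -> list:
    """D3 to D6: keep one (id, value, timestamp) row per run of adjacent duplicates."""
    kept = []
    previous = None
    for record in pending:
        key = (record["data_id"], record["value"])   # assumed duplicate rule
        if key != previous:
            kept.append((record["data_id"], record["value"], record["received_at"]))
        record["processed"] = True   # D7 mark; assumption: duplicates are marked too
        previous = key
    return kept


def write_batch(conn: sqlite3.Connection, rows: list, second_buffer: list) -> None:
    """D7 and D8: save the packed rows in one batch, then clear processed items."""
    with conn:   # commits the whole batch as a single transaction
        conn.executemany(
            "INSERT INTO business_data (data_id, value, received_at) VALUES (?, ?, ?)",
            rows,
        )
    second_buffer[:] = [r for r in second_buffer if not r["processed"]]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business_data (data_id TEXT, value REAL, received_at REAL)")
second_buffer = [
    {"data_id": "temp", "value": 21.5, "received_at": 1.0, "processed": False},
    {"data_id": "temp", "value": 21.5, "received_at": 2.0, "processed": False},
    {"data_id": "temp", "value": 22.0, "received_at": 3.0, "processed": False},
]
rows = compress(second_buffer)
write_batch(conn, rows, second_buffer)
stored = conn.execute(
    "SELECT value, received_at FROM business_data ORDER BY received_at"
).fetchall()
```

Writing the rows with a single `executemany` call inside one transaction reflects the batch save of D7; the repeated reading at timestamp 2.0 is dropped before it reaches the database.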
A data collecting method comprises the following steps:
Step 1: The slave computer reports business data to the host computer at a given business data reporting cycle; execute Step 2.
Step 2: The host computer receives the business data reported by the slave computer through the communication unit and passes it to the data preprocessing unit; execute Step 3.
Step 3: The data preprocessing unit of the host computer, through the convergence module, adds the device identifier of the corresponding slave computer and a data reception timestamp to each item of business data and then sends the item to the distribution module of the data preprocessing unit; execute Step 4.
Step 4: The distribution module saves the business data processed by the convergence module to the first buffer unit of the host computer; execute Step 5.
Step 5: The buffer processing unit of the host computer, through the filtering module, periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed; execute Step 6.
Step 6: The filtering module checks the validity of the extracted business data, marks each item as valid data or invalid data according to the result, and forms a valid data queue from the items marked as valid; execute Step 7.
Step 7: The buffer processing unit of the host computer, through the classification module, classifies the business data in the valid data queue according to preset classification rules and stores the classified business data in the second buffer unit of the host computer; execute Step 8.
Step 8: The buffer processing unit of the host computer, through the cleaning module, checks at a preset cleaning cycle the mark of each item of business data stored in the first buffer unit and removes from the first buffer unit the items marked as processed; execute Step 9.
Step 9: The information storage unit of the host computer, through the compression module, extracts business data from the second buffer unit at a preset extraction cycle and forms a pending business data queue from the extracted business data that meets the data sampling space requirement; execute Step 10.
Step 10: The compression module performs duplicate business data processing on the pending business data queue to generate a non-duplicate business data queue; execute Step 11.
Step 11: The information storage unit of the host computer, through the writing module, packs the non-duplicate business data queue at a preset packing cycle and saves the packed business data to the database in batches.
By adopting the above technical solution, the data collecting system and method provided by the present invention are suitable for, but not limited to, industries such as energy, construction, and agriculture that need to collect large volumes of high-frequency data. With the system and method, the business data reported by the slave computers undergo a series of operations: preprocessing, validity checking, classification, compression, and batch insertion into the database. Large volumes of data are collected and processed efficiently in this way, so efficient data collection and processing can be achieved with no additional hardware investment, or with only a small amount.
Brief description of the drawings
Fig. 1 is a structural block diagram of the data collecting system of the present invention;
Fig. 2 is a structural block diagram of the data preprocessing unit of the present invention;
Fig. 3 is a structural block diagram of the buffer processing unit of the present invention;
Fig. 4 is a structural block diagram of the information storage unit of the present invention;
Fig. 5 is a workflow diagram of the communication unit of the present invention;
Fig. 6 is a workflow diagram of the data preprocessing unit of the present invention;
Fig. 7 is a workflow diagram of the buffer processing unit of the present invention;
Fig. 8 is a workflow diagram of the information storage unit of the present invention;
Fig. 9 is a flow chart of the method of the present invention.
Detailed description of the embodiments
As shown in Figs. 1 to 8, a data collecting system includes at least one slave computer, a host computer, and a database connected to the host computer. The host computer includes a communication device and an information collecting device. The structure and workflows of both devices are as set out above.
A data collecting method, as shown in Fig. 9, comprises Steps 1 to 11 as set out above.
Step 1: The slave computer reports business data to the host computer according to a given business data reporting cycle; execute step 2;
Step 2: The host computer receives the business data reported by the slave computer through the communication unit and transmits it to the data pre-processing unit; execute step 3;
Step 3: Through the convergence module, the data pre-processing unit of the host computer adds the corresponding slave computer device identifier information and data reception timestamp information to each item of business data, then sends it to the distribution module included in the data pre-processing unit; execute step 4;
Step 4: The distribution module saves the business data processed by the convergence module to the first buffer unit of the host computer; execute step 5;
Step 5: Through the filtering module, the caching process unit of the host computer periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed; execute step 6;
Step 6: The filtering module judges the validity of the extracted business data, marks each item as valid data or invalid data according to the judgment result, and forms a valid data queue from the items marked as valid data; execute step 7;
Step 7: Through the sort module, the caching process unit of the host computer classifies the business data in the valid data queue according to the preset classification rules and stores the classified business data in the second buffer unit of the host computer; execute step 8;
Step 8: Through the cleaning module, the caching process unit of the host computer checks the mark of each item of business data stored in the first buffer unit according to the preset cleaning cycle and clears the business data marked as processed from the first buffer unit; execute step 9;
Step 9: Through the compression module, the information memory cell of the host computer extracts business data from the second buffer unit according to the preset extraction cycle and forms a pending business data queue from the extracted business data that meets the data sampling interval requirement; execute step 10;
Step 10: The compression module performs duplicate business data processing on the pending business data queue and generates a non-duplicate business data queue; execute step 11;
Step 11: Through the writing module, the information memory cell of the host computer packs the non-duplicate business data queue according to the preset packing cycle and saves the packed business data to the database in batches.
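Steps 5 to 8 can be sketched in a few lines. This is a hedged illustration only: the set `INTERESTED_IDS`, the dictionary keys (`data_id`, `device_id`, `processed`, `valid`) and the choice of classifying by device and data identification number are all assumptions made for the example, not details taken from the patent.

```python
from collections import defaultdict

# Hypothetical data identification numbers of interest, e.g. loaded from a configuration file.
INTERESTED_IDS = {"T01", "P02"}


def filter_batch(first_buffer):
    """Steps 5-6: mark each unprocessed record as processed and collect the valid ones."""
    valid_queue = []
    for rec in first_buffer:
        if rec.get("processed"):
            continue
        rec["processed"] = True
        rec["valid"] = rec["data_id"] in INTERESTED_IDS
        if rec["valid"]:
            valid_queue.append(rec)
    return valid_queue


def classify(valid_queue, second_buffer):
    """Step 7: store valid records in the second buffer, grouped by (device, data ID)."""
    for rec in valid_queue:
        # Copy the record without the first-buffer bookkeeping marks.
        clean_rec = {k: v for k, v in rec.items() if k not in ("processed", "valid")}
        second_buffer[(rec["device_id"], rec["data_id"])].append(clean_rec)


def clean(first_buffer):
    """Step 8: remove records marked as processed from the first buffer, in place."""
    first_buffer[:] = [rec for rec in first_buffer if not rec.get("processed")]
```

In a running system `filter_batch` and `clean` would be driven by separate timers (the extraction timing and the preset cleaning cycle); here they are plain functions so the data flow stays visible.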
Before the communication unit executes step A1, the following workflow is also performed:
1. Add the slave computer list and record the attribute data of each slave computer; execute 2;
2. Allocate a host computer to each slave computer; execute 3;
3. Judge whether the host computer can bear the load of the slave computers it manages; if yes, execute 4, otherwise execute 5;
4. Send a communication instruction to each slave computer; execute 6;
5. Add or reallocate host computers for the host computer that cannot bear the load of its slave computers; return to 3;
6. For each slave computer, perform A1 to A7.
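The load check in items 2, 3 and 5 amounts to a loop that keeps adding host computers until none is overloaded. The sketch below assumes a fixed per-host capacity of 8 slave computers (the description elsewhere mentions 6 to 8 per host) and a simple "move the excess to a new host" policy; the patent does not specify either, and `allocate` is a hypothetical name.

```python
def allocate(slave_ids, capacity=8):
    """Assign slave computers to host computers so that no host exceeds `capacity`."""
    hosts = [list(slave_ids)]                      # item 2: initial allocation to one host
    while any(len(h) > capacity for h in hosts):   # item 3: can every host bear its load?
        overloaded = max(hosts, key=len)
        hosts.append(overloaded[capacity:])        # item 5: add a host and move the excess to it
        del overloaded[capacity:]
    return hosts
```

With 20 slave computers and the assumed capacity of 8, this yields three hosts carrying 8, 8 and 4 slave computers.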
The communication unit of the present invention is also used to record the attribute data of the slave computers, set the business data reporting cycle of the slave computers, allocate a host computer to each slave computer, control whether the host computer is in the data receiving state or the data non-receiving state, and set the business data display cycle. The attribute data of a slave computer includes at least its device identifier (ID), IP address and port number. The communication unit allocates host computers according to the load capacity of each host computer.

The system of the present invention includes at least one slave computer. When there are multiple slave computers, a slave computer cluster can be used directly. A slave computer cluster refers to the data acquisition devices of various business systems, which generally appear as large device clusters with no upper limit on their number.

The slave computer reports business data to the host computer according to a given business data reporting cycle, which can be 1 ms, 20 ms, 500 ms, 1 s, and so on. Each item of business data carries a unique data identification number before it is reported and includes at least a field name, a data value, the device it belongs to, the associated component and the module it belongs to. Such business data is usually reported at high frequency and in very large quantities.

The convergence module adds the corresponding slave computer device identifier information and data reception timestamp information to each item of business data and then sends it to the distribution module. The data reception timestamp information is the time at which the host computer, that is, the communication unit, receives the business data. The real-time exhibition module displays business data according to the preset display cycle; specifically, the business data is shown in visual form on a display or other monitoring equipment. The first buffer unit and the second buffer unit are high-speed storage devices capable of temporarily storing data.

The filtering module judges whether the data identification number of the extracted business data is a data identification number of interest; if so, the business data is marked as valid data, otherwise it is marked as invalid data. The rule for determining the data identification numbers of interest can be preset, for example defined in a configuration file.

The sort module classifies the business data in the valid data queue according to the preset classification rules and stores the classified business data in the second buffer unit; specifically, the sort module can add a classification identifier to the classified business data. The preset classification rules can be used to distinguish which specific kind of business data was reported by which slave computer, and serve to classify business data by application purpose. Here the preset classification rules can be different coding rules.

The data sampling interval requirement can be set by the user through the human-computer interaction interface of the host computer. The compression module extracts business data from the second buffer unit according to the preset extraction cycle, from which the time interval between adjacent extraction operations is known. When that interval is less than the data sampling interval requirement, the business data extracted by the compression module is considered to meet the requirement, and the compression module stores the business data in the pending business data queue in the order of extraction. The repeated data judgment rule can be whether the values of adjacent business data are equal, or whether adjacent business data follow a sine or cosine law. Through the processing of the compression module, the business data achieves lossless compression of the data structure.
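The record layout described above, and the two fields the convergence module adds, can be illustrated with a small data structure. This is a sketch under stated assumptions: the class name `BusinessData`, the function `converge` and all field names are hypothetical, chosen only to mirror the fields listed in the text.

```python
import time
from dataclasses import dataclass, asdict


@dataclass
class BusinessData:
    data_id: str      # unique data identification number
    field_name: str
    value: float
    device: str       # device the data belongs to
    component: str    # associated component
    module: str       # module the data belongs to


def converge(raw, device_id, now=None):
    """Attach the slave computer device identifier and the data reception timestamp."""
    record = asdict(raw)
    record["device_id"] = device_id
    # Reception time on the host; `now` can be injected for reproducible tests.
    record["received_at"] = time.time() if now is None else now
    return record
```

A call such as `converge(BusinessData("T01", "temperature", 21.5, "boiler", "pump", "heating"), "slave-07")` returns a plain dictionary that the distribution module could then save to the first buffer unit.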
The data collection system and method of the present invention belong to the field of Internet of Things big data collection and are suitable for, but not limited to, industries such as energy, construction and agriculture that need to collect large volumes of high-frequency data. With the system and method, the business data reported by the slave computers undergoes a series of operations including pre-processing, validity checking, classification, compression and batch insertion into the database, so that massive data is collected and processed efficiently. The system allows one host computer to access 6 to 8 slave computers concurrently, meeting the demand for efficient data acquisition and processing with no or only a small increase in hardware investment.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (10)
1. A data collection system, characterized in that the system includes: at least one slave computer, a host computer, and a database connected to the host computer; the host computer includes a communication device and an information collecting device;
The communication device includes:
A communication unit connected to the slave computer; the communication unit is used to receive the business data reported by the slave computer and transmit it to the data pre-processing unit;
A data pre-processing unit connected to the communication unit; the data pre-processing unit includes a convergence module and a distribution module connected to the convergence module; the convergence module is used to add the corresponding slave computer device identifier information and data reception timestamp information to each item of business data and then send it to the distribution module; the distribution module is used to save the business data processed by the convergence module to the first buffer unit;
The information collecting device includes:
A first buffer unit connected to the data pre-processing unit;
A caching process unit connected to the first buffer unit; the caching process unit includes a filtering module, and a sort module and a cleaning module connected to the filtering module; the filtering module is used to periodically extract business data in batches from the first buffer unit, mark the extracted business data held in the first buffer unit as processed, judge the validity of the extracted business data, mark each item as valid data or invalid data according to the judgment result, and form a valid data queue from the items marked as valid data; the sort module is used to classify the business data in the valid data queue according to the preset classification rules and store the classified business data in the second buffer unit; the cleaning module is used to check the mark of each item of business data stored in the first buffer unit according to the preset cleaning cycle and clear the business data marked as processed from the first buffer unit;
A second buffer unit connected to the caching process unit;
An information memory cell connected to the second buffer unit; the information memory cell includes a compression module and a writing module connected to the compression module; the compression module is used to extract business data from the second buffer unit according to the preset extraction cycle, form a pending business data queue from the extracted business data that meets the data sampling interval requirement, and generate a non-duplicate business data queue after performing duplicate business data processing on the pending business data queue; the writing module packs the non-duplicate business data queue according to the preset packing cycle and saves the packed business data to the database in batches.
2. The data collection system according to claim 1, characterized in that the communication unit is also used to detect the communication state between the host computer and the slave computer, and to detect and update the working state of the slave computer.
3. The data collection system according to claim 1, characterized in that the data pre-processing unit also includes a real-time exhibition module, connected to the distribution module, for displaying business data according to the preset display cycle; the distribution module distributes the business data processed by the convergence module to the real-time exhibition module according to the preset distribution cycle.
4. The data collection system according to claim 1, characterized in that the filtering module judges the validity of the extracted business data as follows: it judges whether the data identification number of the extracted business data is a data identification number of interest; if so, the business data is marked as valid data, otherwise the business data is marked as invalid data.
5. The data collection system according to claim 1, characterized in that
The compression module performs duplicate business data processing on the pending business data queue as follows: according to the repeated data judgment rule, the compression module judges whether adjacent data in the pending business data queue are duplicate business data; if so, it records the data value and data reception timestamp information of one of the duplicate business data, otherwise it records the data value and data reception timestamp information of the non-duplicate business data;
The compression module generates the non-duplicate business data queue from the recorded data values and data reception timestamp information of the business data; for the business data whose data values have been recorded by the compression module, the compression module marks the corresponding business data stored in the second buffer unit as processed; after saving the packed non-duplicate business data queue to the database in batches, the writing module clears the business data marked as processed from the second buffer unit.
6. The data collection system according to claim 2, characterized in that the workflow of the communication unit is as follows:
A1: Detect the communication state between the host computer and the slave computer according to the preset detection cycle; execute A2;
A2: Judge whether the host computer and the slave computer can communicate normally; if yes, execute A4, otherwise execute A3;
A3: Update the working state of the slave computer to the offline state; execute A7;
A4: Judge whether the working state of the slave computer is normal; if yes, execute A5, otherwise execute A6;
A5: Update the working state of the slave computer to the normal state; execute A7;
A6: Update the working state of the slave computer to the fault state; execute A7;
A7: Wait for the next preset detection cycle.
7. The data collection system according to claim 3, characterized in that the workflow of the data pre-processing unit is as follows:
B1: The convergence module adds the corresponding slave computer device identifier information and data reception timestamp information to each item of received business data and then sends it to the distribution module; execute B2;
B2: The distribution module saves the business data processed by the convergence module to the first buffer unit; execute B3;
B3: The distribution module distributes the business data processed by the convergence module to the real-time exhibition module according to the preset distribution cycle; execute B4;
B4: The real-time exhibition module displays the business data according to the preset display cycle;
Before the workflow of the data pre-processing unit, the following workflow is also performed:
The host computer enters the data receiving state, and the slave computer reports business data to the host computer.
8. The data collection system according to claim 4, characterized in that the workflow of the caching process unit is as follows:
C1: The filtering module periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed; execute C2;
C2: The filtering module judges whether the data identification number of the extracted business data is a data identification number of interest; if yes, execute C3, otherwise execute C4;
C3: The filtering module marks the business data as valid data; execute C5;
C4: The filtering module marks the business data as invalid data;
C5: The filtering module forms a valid data queue from the items of business data marked as valid data; execute C6;
C6: The sort module classifies the business data in the valid data queue according to the preset classification rules and stores the classified business data in the second buffer unit;
After step C1, the workflow of the caching process unit also includes the following step:
The cleaning module checks the mark of each item of business data stored in the first buffer unit according to the preset cleaning cycle and clears the business data marked as processed from the first buffer unit.
9. The data collection system according to claim 5, characterized in that the workflow of the information memory cell is as follows:
D1: The compression module extracts business data from the second buffer unit according to the preset extraction cycle; execute D2;
D2: The compression module forms a pending business data queue from the extracted business data that meets the data sampling interval requirement; execute D3;
D3: According to the repeated data judgment rule, the compression module judges whether adjacent data in the pending business data queue are duplicate business data; if yes, execute D4, otherwise execute D5;
D4: Record the data value and data reception timestamp information of one of the duplicate business data; execute D6;
D5: Record the data value and data reception timestamp information of the non-duplicate business data; execute D6;
D6: The compression module generates a non-duplicate business data queue from the recorded data values and data reception timestamp information of the business data; execute D7;
D7: For the business data whose data values have been recorded by the compression module, the compression module marks the corresponding business data stored in the second buffer unit as processed; meanwhile, the writing module packs the non-duplicate business data queue according to the preset packing cycle and saves the packed business data to the database in batches; execute D8;
D8: The writing module clears the business data marked as processed from the second buffer unit.
10. A data collection method, characterized in that the method comprises the following steps:
Step 1: The slave computer reports business data to the host computer according to a given business data reporting cycle; execute step 2;
Step 2: The host computer receives the business data reported by the slave computer through the communication unit and transmits it to the data pre-processing unit; execute step 3;
Step 3: Through the convergence module, the data pre-processing unit of the host computer adds the corresponding slave computer device identifier information and data reception timestamp information to each item of business data, then sends it to the distribution module included in the data pre-processing unit; execute step 4;
Step 4: The distribution module saves the business data processed by the convergence module to the first buffer unit of the host computer; execute step 5;
Step 5: Through the filtering module, the caching process unit of the host computer periodically extracts business data in batches from the first buffer unit and marks the extracted business data held in the first buffer unit as processed; execute step 6;
Step 6: The filtering module judges the validity of the extracted business data, marks each item as valid data or invalid data according to the judgment result, and forms a valid data queue from the items marked as valid data; execute step 7;
Step 7: Through the sort module, the caching process unit of the host computer classifies the business data in the valid data queue according to the preset classification rules and stores the classified business data in the second buffer unit of the host computer; execute step 8;
Step 8: Through the cleaning module, the caching process unit of the host computer checks the mark of each item of business data stored in the first buffer unit according to the preset cleaning cycle and clears the business data marked as processed from the first buffer unit; execute step 9;
Step 9: Through the compression module, the information memory cell of the host computer extracts business data from the second buffer unit according to the preset extraction cycle and forms a pending business data queue from the extracted business data that meets the data sampling interval requirement; execute step 10;
Step 10: The compression module performs duplicate business data processing on the pending business data queue and generates a non-duplicate business data queue; execute step 11;
Step 11: Through the writing module, the information memory cell of the host computer packs the non-duplicate business data queue according to the preset packing cycle and saves the packed business data to the database in batches.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610881306.1A CN106484857A (en) | 2016-10-09 | 2016-10-09 | Data collecting system and its method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106484857A true CN106484857A (en) | 2017-03-08 |
Family
ID=58269249
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610881306.1A Pending CN106484857A (en) | 2016-10-09 | 2016-10-09 | Data collecting system and its method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106484857A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101673100A (en) * | 2009-10-19 | 2010-03-17 | 北京北方微电子基地设备工艺研究中心有限责任公司 | Acquisition method and system of parameters of technique process |
CN102053136A (en) * | 2010-11-18 | 2011-05-11 | 北京科技大学 | Plateau non-coal mine underground air environment parameter real time monitor |
CN102209118A (en) * | 2011-06-10 | 2011-10-05 | 成都勤智数码科技有限公司 | Distributed mass data gathering method |
CN104811809A (en) * | 2014-01-23 | 2015-07-29 | 中国科学院声学研究所 | Set-top box user behavior acquisition method |
CN105868071A (en) * | 2016-03-23 | 2016-08-17 | 乐视网信息技术(北京)股份有限公司 | Monitoring data processing method and device |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107070748A (en) * | 2017-04-13 | 2017-08-18 | 周发辉 | A kind of processing system and method for the big data that communicates |
CN107979650A (en) * | 2017-12-15 | 2018-05-01 | 广东迈科医学科技股份有限公司 | The transmission method of plasma data, device and system |
CN108171596A (en) * | 2017-12-28 | 2018-06-15 | 广州华夏职业学院 | A kind of multi task process analysis system and method for finance data |
CN109460401A (en) * | 2018-09-30 | 2019-03-12 | 中铁隧道局集团有限公司 | A kind of intelligentized shield TBM data acquisition and complementing method |
CN109460401B (en) * | 2018-09-30 | 2021-09-24 | 中铁隧道局集团有限公司 | Intelligent shield TBM data acquisition and completion method |
CN109639785A (en) * | 2018-12-03 | 2019-04-16 | 上海熙菱信息技术有限公司 | A kind of data convergence cluster management system and method |
CN109902599B (en) * | 2019-02-01 | 2021-08-10 | 初速度(苏州)科技有限公司 | High-precision vehicle data quality inspection method and system |
CN109902599A (en) * | 2019-02-01 | 2019-06-18 | 初速度(苏州)科技有限公司 | A kind of high-precision car data quality detecting method and system |
CN110109409A (en) * | 2019-05-07 | 2019-08-09 | 江苏高和智能装备股份有限公司 | A kind of steel cord plying device data acquisition system |
CN110633280A (en) * | 2019-09-11 | 2019-12-31 | 北京亚信数据有限公司 | Batch data acquisition method and device, readable storage medium and computing equipment |
CN111614786A (en) * | 2020-06-05 | 2020-09-01 | 易盼红 | System and method for processing data at high speed by remote server based on block chain |
CN113778502A (en) * | 2020-06-29 | 2021-12-10 | 北京沃东天骏信息技术有限公司 | Data processing method, device, system and storage medium |
CN115456217A (en) * | 2022-09-14 | 2022-12-09 | 中远海运科技股份有限公司 | Intelligent ship Internet of things data asset management method and system |
CN116155844A (en) * | 2023-04-21 | 2023-05-23 | 天津帕克耐科技有限公司 | IDC resource management method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106484857A (en) | Data collecting system and its method | |
CN106131158A (en) | Resource scheduling device based on cloud tenant's credit rating under a kind of cloud data center environment | |
CN104063458B (en) | A kind of method and device that correspondence solution is provided terminal fault problem | |
CN106095639A (en) | A kind of cluster subhealth state method for early warning and system | |
CN107577771A (en) | A kind of big data digging system | |
CN106371975A (en) | Automatic operation and maintenance early-warning method and system | |
CN105897457A (en) | Service upgrade method and system of server group | |
CN106294222A (en) | A kind of method and device determining PCIE device and slot corresponding relation | |
CN102750367A (en) | Big data checking system and method thereof on cloud platform | |
CN101772760A (en) | Database management program and database management device | |
CN105631612A (en) | System and method of evaluating individual performance and capability of public servant based on big data | |
CN106646169A (en) | Electrical device partial discharge detection data collection cloud strategy | |
CN104750826A (en) | Structural data resource metadata automatically-identifying and dynamically-registering method | |
CN103345533B (en) | A kind of date storage method and device | |
CN111123873B (en) | Production data acquisition method and system based on stream processing technology | |
CN106844588A (en) | A kind of analysis method and system of the user behavior data based on web crawlers | |
CN206021244U (en) | A kind of data collecting system under distributed computer cluster | |
CN109901978A (en) | A kind of Hadoop log lossless compression method and system | |
CN115269438A (en) | Automatic testing method and device for image processing algorithm | |
CN106874306A (en) | People information portrait Compare System Key Performance Indicator evaluating method | |
CN108334549A (en) | A kind of device data storage method, extracting method, storage platform and extraction platform | |
CN106817262A (en) | A kind of log analysis device | |
CN106776704A (en) | Statistical information collection method and device | |
CN102055780A (en) | System and method for testing disk array | |
CN102902769A (en) | Database benchmark test system of cloud computing platform and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | TA01 | Transfer of patent application right | Effective date of registration: 20180625. Address after: 850700 Changsha Road, Gongga County, Shannan City, Tibet autonomous region 12. Applicant after: Shannan far macro Technology Co., Ltd. Address before: 116000 Yinhai Road 16 building, 701 Huangpu Road, hi tech park, Dalian, Liaoning. Applicant before: Zhuhai Special Economic Zone Technology Co., Ltd. Dalian branch |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170308 |