CN103853719B - Easily extensible mass data collection system - Google Patents


Info

Publication number
CN103853719B
Authority
CN
China
Prior art keywords
storage
task
information
terminal
storage information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210496189.9A
Other languages
Chinese (zh)
Other versions
CN103853719A (en)
Inventor
舒刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Diligence Digital Polytron Technologies Inc
Original Assignee
Diligence Digital Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Diligence Digital Polytron Technologies Inc
Priority to CN201210496189.9A
Publication of CN103853719A
Application granted
Publication of CN103853719B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/258: Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an easily extensible mass data collection system comprising a data storage controller, a storage information cache, a storage operational terminal manager, a storage task buffer, storage terminals, a storage information dump file, a configuration rule base, and an error information dump file. The data storage controller receives and distributes information awaiting storage. The storage operational terminal manager receives and distributes storage tasks. Each storage terminal receives the tasks distributed by the manager and performs the storage operation. The configuration rule base holds the configuration rules, which give every component a basis for reacting appropriately in specific situations. Compared with the prior art, the system adds scheduling management to the mass data storage process, which can significantly improve the scalability of the system, the accuracy and integrity of the stored data, and the reliability of the storage work.

Description

Easily extensible mass data collection system
Technical field
The present invention relates to the field of IT operation and maintenance (O&M), and in particular to a data collection system that is easy to extend.
Background
In the IT O&M field, data monitoring is foundational work, and its accuracy and processing efficiency play an important role in keeping systems running. As information systems develop, the volume of collected data keeps growing: many devices must be monitored at the same time, and each device involves many network element indicators, so the volume of network element data collected by a monitoring system per unit time is huge. Although the prior art offers many schemes for mass data storage, few of them are used in the IT O&M field. They lack scheduling mechanisms such as caching, dumping and reordering of data tasks, and there is no efficient, accurate, high-volume data storage scheme designed for IT O&M tools.
Summary of the invention
The object of the invention is to provide an easily extensible data collection system that is tailored to the broad applicability required of IT O&M tools: it is easy to extend and to customize, and it also improves the efficiency and accuracy of data storage.
To achieve this object, the invention adopts the following scheme. The easily extensible mass data collection system comprises an external storage information submission interface and a database, and further comprises a data storage controller, a storage information cache, a storage operational terminal manager, a storage task buffer, storage terminals, a storage information dump file, a configuration rule base, and an error information dump file.
Data storage controller: responsible for receiving and distributing information awaiting storage. During startup it scans the storage information dump file and loads any storage information that was dumped there. Once started, it counts the records in the storage information cache, packages them into storage tasks in the optimal way defined by the configuration rules, and submits the tasks to the storage operational terminal manager. During system shutdown it dumps the unallocated storage information in the storage information cache, together with any storage tasks whose submission was rejected, to the storage information dump file.
Storage operational terminal manager: responsible for receiving and distributing storage tasks. After system startup it receives the storage tasks submitted by the data storage controller and checks whether any of its storage terminals is idle. If one is idle, it distributes the storage task. If none is idle, it checks, according to the configuration rules, whether the number of tasks in the task buffer has reached the maximum capacity; when it has, the manager dumps part of the storage tasks to the storage information dump file according to the dump strategy in the configuration rules. When the storage terminals belonging to the manager are lightly loaded, the manager notifies the data storage controller to scan the storage information dump file for dumped storage information that has not yet been stored. During system shutdown the manager stops receiving new tasks, stops distributing tasks to the storage terminals, and saves the unallocated storage tasks in the task buffer to the storage information dump file.
Storage terminal: receives the tasks distributed by the manager and performs the storage operation. If it finds erroneous storage information during execution, it filters that information out, dumps it to the error information dump file, and continues with the storage information that has not yet been processed. If a storage exception occurs, such as an I/O fault or a database service fault, it rolls back all database operations and dumps the storage task it was executing to the storage information dump file.
Configuration rule base: stores the configuration rules, including the optimal number of storage records per batch, the maximum time to wait for the optimal number, the maximum number of tasks that can be carried, the number of retries for a failed storage task, the storage information dump strategy, the storage information dump file, and the error information dump file. These rules set the criteria for system operation and give every component a basis for reacting appropriately in specific situations.
The invention operates in the following main steps:
1) Configure the rules of the storage method. The settings are: the optimal number of storage records per batch, the maximum time to wait for the optimal number, the maximum number of tasks that can be carried, the number of retries for a failed storage task, the storage information dump strategy, the storage information dump file, and the error information dump file;
2) Start the data storage controller. During startup the controller first initializes the storage information cache and then scans the storage information dump file for information that has not been stored; if there is any, it reloads that information into the storage information cache for scheduling;
3) After startup the data storage controller receives information awaiting storage. The controller checks whether the storage information cache holds information to be stored and whether the number of records has reached the preset optimal number per batch. If it has, the controller splits off one batch of the optimal size, packages it as a storage task, and submits it to the storage operational terminal manager. If it has not, the controller waits according to the rules: if the number of records in the cache reaches the optimal number within the maximum waiting time, the controller issues a storage task of the optimal size; otherwise, when the wait ends, it packages all the data as one storage task and submits it to the storage operational terminal manager;
4) When the storage operational terminal manager receives a storage task, it judges, according to the configuration rules, whether the number of tasks in the storage task buffer queue has reached the preset maximum. If not, it puts the task into the storage task buffer. If the maximum is exceeded, then according to the dump strategy set in the configuration rules it dumps the task directly to the storage information dump file and/or randomly dumps a proportion of the tasks in the storage task buffer to the storage information dump file;
5) The storage operational terminal manager allocates storage tasks. It first determines whether the storage task buffer holds unallocated tasks. If so, it checks whether all of its storage terminals are busy. If a storage terminal is idle, the manager takes a task from the storage task buffer and hands it to that terminal for execution; if none is idle, it waits until a storage terminal is released and then allocates the task;
6) An idle storage terminal executes the storage operation as soon as it receives a storage task. If an exception during execution is caused by erroneous storage information, the terminal filters out the offending information, dumps it to the error information dump file, and continues with the remaining storage information. If the exception is caused by the network, the database management system, disk I/O or the like, the terminal retries the storage operation for the number of times or the duration set in the configuration rules; if storage still fails, it rolls back all executed operations and dumps the storage information in the task to the storage information dump file;
7) When the storage terminals managed by the storage operational terminal manager are lightly loaded, the manager notifies the data storage controller to scan the storage information dump file for dumped storage information; if there is any, it is reloaded into the storage information cache and rescheduled;
8) During system shutdown, the data storage controller stops submitting tasks to the storage operational terminal manager and dumps to the storage information dump file both the storage information in the storage information cache and any tasks that, because of concurrency, were rejected by the manager. The storage operational terminal manager stops receiving new tasks, stops distributing tasks to the storage terminals, and dumps all unallocated tasks in the task buffer to the storage information dump file. Each storage terminal stops accepting tasks but continues executing its unfinished tasks, and leaves the working state once they are complete.
Compared with the prior art, this scheme adds scheduling management to the mass data storage process. It can significantly improve the scalability of the system, so that the system adapts to different monitored systems and to changes in their scale, and it is highly extensible and customizable. The storage rules are flexible to configure, suit the specific business characteristics of different monitored systems, and make it easy to adjust storage management to the actual condition of the monitored system. The accuracy and integrity of the stored data and the reliability of the storage work are high, and the dump mechanism avoids the delays that overload would otherwise cause. The system also cooperates well with monitoring systems, workflow systems, diagnostic systems and the like.
Description of the drawings
Fig. 1 is a schematic diagram of the principle of the invention.
Specific embodiment
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any way, except for mutually exclusive features and/or steps.
As shown in Fig. 1, this embodiment stores the mass performance indicator data collected by different collection terminals in the IT O&M field. The rules are configured first: the optimal number of storage records per batch is 1000; the maximum time to wait for the optimal number is 3 seconds; the maximum number of tasks that can be carried is 500; when the number of tasks in the storage task buffer managed by the storage operational terminal manager exceeds the maximum, 30% of the tasks in the buffer are randomly dumped to the storage information dump file; a failed storage task is retried 3 times; the storage information dump file is message.dump; and the error information dump file is error_message.dump.
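The rule values of this embodiment can be gathered into a single configuration object. The following is a minimal sketch in Python, a language the patent does not prescribe; the class name, the field names and the helper `tasks_to_dump` are illustrative assumptions, and only the numeric values and the two file names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRules:
    """Configuration rules of the embodiment (names are hypothetical)."""
    optimal_batch_size: int = 1000        # optimal storage records per batch
    max_batch_wait_seconds: float = 3.0   # longest wait for a full batch
    max_buffered_tasks: int = 500         # capacity of the storage task buffer
    dump_percent: int = 30                # share of buffered tasks dumped on overflow
    max_retries: int = 3                  # retries for a failed storage task
    storage_dump_file: str = "message.dump"
    error_dump_file: str = "error_message.dump"

    def tasks_to_dump(self, buffered: int) -> int:
        """Number of buffered tasks to dump once the buffer is full."""
        if buffered < self.max_buffered_tasks:
            return 0
        # Integer arithmetic avoids floating-point rounding surprises.
        return buffered * self.dump_percent // 100


RULES = StorageRules()
```

Keeping the rules in one immutable object mirrors the role of the configuration rule base: every component reads the same values and none of them can change the rules while the system runs.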
During system startup, the data storage controller first scans the message.dump file to check whether it holds dumped storage information; if it does, that information is loaded into the controller's storage information cache.
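The startup scan might look like the sketch below. It assumes that the dump file holds one JSON object per line and that the cache is an in-memory `deque`; both are choices made for this illustration, since the patent does not define the file format and describes the cache as a database. The function name `load_dumped_records` is hypothetical.

```python
import json
import os
from collections import deque


def load_dumped_records(dump_path: str, cache: deque) -> int:
    """Move dumped records from the dump file into the cache.

    Returns the number of records loaded. The one-JSON-object-per-line
    format is an assumption of this sketch.
    """
    if not os.path.exists(dump_path):
        return 0
    with open(dump_path, "r", encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    cache.extend(records)
    # Empty the file so the same records are not loaded a second time.
    open(dump_path, "w", encoding="utf-8").close()
    return len(records)
```

Truncating the file after loading is one way to keep a record from being scheduled twice; a production system would need to make the load and the truncation atomic.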
After the system has started, the data storage controller checks whether the storage information cache holds any storage information. The storage information cache is implemented as a database associated with the data storage controller. If the cache is empty, the controller keeps waiting. If it is not, the controller judges, according to the configuration rules, whether the number of cached records has reached 1000. If it has, the controller extracts 1000 records, packages them as one task, and submits the task to the storage operational terminal manager. If it has not, the controller waits for up to 3 seconds. If the number of records in the cache reaches 1000 before the 3 seconds are up, the controller stops waiting, immediately extracts those 1000 records, packages them as a storage task, and submits it to the storage operational terminal manager. If the cache still holds fewer than 1000 records after the 3-second wait, the controller extracts all of them, packages them as one task, and submits it to the storage operational terminal manager.
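The batching decision described above depends on only two inputs: how many records are cached and how long the controller has already waited. A minimal sketch follows, assuming an in-memory `deque` in place of the cache database; the function name `next_batch` and its parameters are hypothetical.

```python
from collections import deque
from typing import Optional


def next_batch(cache: deque, waited_seconds: float,
               batch_size: int = 1000, max_wait: float = 3.0) -> Optional[list]:
    """Return the records for the next storage task, or None to keep waiting."""
    if not cache:
        return None                  # nothing to store yet
    if len(cache) >= batch_size:
        count = batch_size           # a full batch is ready: do not wait
    elif waited_seconds >= max_wait:
        count = len(cache)           # waited long enough: flush what is there
    else:
        return None                  # partial batch, still inside the wait window
    return [cache.popleft() for _ in range(count)]
```

With 2500 cached records the controller would issue two tasks of 1000 records immediately and, once 3 seconds have passed without further input, a third task carrying the remaining 500.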
After receiving a task, the storage operational terminal manager first judges whether its storage terminals are all running at full load. If they are not, it puts the task directly into the storage task buffer, which is a database associated with the storage operational terminal manager, to await allocation. If all storage terminals are busy, the manager judges, according to the configuration rules, whether the number of tasks in the task buffer has reached 500. If it has, the manager randomly dumps 30% of the 500 tasks to the storage information dump file and then puts the newly submitted task into the task buffer to await allocation.
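The overflow handling can be sketched as follows. The buffer is modelled as a `deque` and the dump file as a plain list, both assumptions of this example (the patent uses a database and a file); `accept_task` and its parameters are hypothetical names.

```python
import random
from collections import deque
from typing import Optional


def accept_task(task, buffer: deque, dumped: list, all_terminals_busy: bool,
                max_tasks: int = 500, dump_percent: int = 30,
                rng: Optional[random.Random] = None) -> None:
    """Buffer a new task, first dumping a random share if the buffer is full."""
    rng = rng or random.Random()
    if all_terminals_busy and len(buffer) >= max_tasks:
        n_dump = len(buffer) * dump_percent // 100
        # Pick the positions of the tasks to dump, uniformly at random.
        victims = set(rng.sample(range(len(buffer)), n_dump))
        kept = [t for i, t in enumerate(buffer) if i not in victims]
        dumped.extend(t for i, t in enumerate(buffer) if i in victims)
        buffer.clear()
        buffer.extend(kept)          # surviving tasks keep their order
    buffer.append(task)              # the new task always joins the queue
```

Dumping a random share rather than the newest or oldest tasks spreads the delay evenly across submitters, which is one plausible reading of why the embodiment chooses random selection.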
When one of the storage terminals belonging to the storage operational terminal manager is idle, the manager assigns the earliest-submitted task in the task buffer to that terminal for execution; if no storage terminal is idle, it waits for resources to be released.
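Assigning the earliest-submitted task first is first-in, first-out scheduling. A minimal sketch, assuming terminals are identified by name and that the buffer is a `deque`; `assign_next` is a hypothetical helper.

```python
from collections import deque
from typing import Optional, Tuple


def assign_next(buffer: deque, idle_terminals: list) -> Optional[Tuple[str, object]]:
    """Pair the oldest buffered task with an idle terminal, if both exist."""
    if not buffer or not idle_terminals:
        return None                  # nothing to do, or wait for a release
    terminal = idle_terminals.pop()  # any idle terminal will do
    task = buffer.popleft()          # earliest-submitted task goes first
    return terminal, task
```

A caller would put the terminal back into `idle_terminals` when it finishes, which corresponds to the resource release the manager waits for.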
When a storage terminal receives a storage task, it executes the task. If during execution it finds erroneous storage information, for example a wrong data type or an exception raised by an SQL error, the terminal filters out the erroneous information, dumps it to the file error_message.dump, and goes on to store the remaining storage information that has not yet been processed.
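The record-level filtering can be illustrated with Python's standard `sqlite3` module standing in for the target database. The table layout, the `CHECK` constraint used to provoke a type error, and the names `open_demo_db` and `store_task` are all assumptions of this sketch; the error dump is a list rather than the file error_message.dump.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory table that rejects non-numeric values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metrics ("
        " metric TEXT NOT NULL,"
        " value REAL NOT NULL CHECK (typeof(value) = 'real'))"
    )
    return conn


def store_task(conn: sqlite3.Connection, records: list, error_dump: list) -> int:
    """Insert each record; divert records the database rejects."""
    stored = 0
    for metric, value in records:
        try:
            conn.execute(
                "INSERT INTO metrics (metric, value) VALUES (?, ?)",
                (metric, value),
            )
            stored += 1
        except sqlite3.Error:
            # Bad data: set the record aside and continue with the rest.
            error_dump.append((metric, value))
    conn.commit()
    return stored
```

Because a failed `INSERT` only undoes that one statement, the good records before and after it are still committed together at the end of the task.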
If during execution a storage terminal encounters an I/O exception, a database service that has not started, a network exception or the like, it retries the task 3 times according to the configuration rules. If the same exception persists, it rolls back all operations and dumps all storage information in the task to the message.dump file.
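The retry-then-dump behaviour is sketched below. Two points are assumptions of this example rather than statements of the patent: "retries 3 times" is read as one initial attempt plus up to three retries, and partial work is rolled back after every failed attempt rather than only at the end. `OSError` stands in for I/O, network and database-service faults, and the names `execute_with_retry`, `store` and `rollback` are hypothetical.

```python
from typing import Callable, List


def execute_with_retry(store: Callable[[list], None],
                       rollback: Callable[[], None],
                       task: list, dump: List[list],
                       max_retries: int = 3) -> bool:
    """Run a storage task; after repeated faults, roll back and dump it."""
    for _attempt in range(1 + max_retries):   # first try plus the retries
        try:
            store(task)
            return True
        except OSError:                        # infrastructure fault, not bad data
            rollback()                         # undo partial work before retrying
    dump.append(task)                          # give up: keep the task for rescheduling
    return False
```

Dumping the whole task instead of discarding it is what lets the controller pick the records up again later, once the fault has cleared.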
When the storage terminals belonging to the storage operational terminal manager are running at low load, the manager notifies the controller to scan the message.dump file for dumped storage information; if there is any, it is reloaded into the controller's cache, repackaged into tasks, and the storage tasks are executed again.
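The patent does not say how low load is measured. The sketch below assumes, purely for illustration, that load is the fraction of busy terminals and that 25% or less counts as low; the threshold, the list-based stand-ins for the dump file and cache, and the names `is_low_load` and `maybe_reload` are all hypothetical.

```python
def is_low_load(busy: int, total: int, threshold: float = 0.25) -> bool:
    """Treat the terminals as lightly loaded when few of them are busy."""
    return total > 0 and busy / total <= threshold


def maybe_reload(busy: int, total: int, dumped: list, cache: list) -> int:
    """Move dumped records back into the cache during quiet periods."""
    if not is_low_load(busy, total) or not dumped:
        return 0
    count = len(dumped)
    cache.extend(dumped)   # records rejoin normal scheduling
    dumped.clear()
    return count
```

Reloading only when the terminals are quiet keeps the recovered backlog from competing with fresh data during a load peak.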
During system shutdown, the controller stops receiving new storage information and stops submitting tasks to the storage operational terminal manager, and it saves all storage information in the storage information cache, together with the storage tasks rejected by the manager, to the file message.dump. The storage operational terminal manager stops receiving new tasks, stops allocating buffered tasks to the storage terminals, and dumps all tasks in the task buffer to the message.dump file. The storage terminals finish the tasks they are executing and, if an exception occurs, dump to file according to the normal logic.
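The order in which pending work is saved at shutdown can be sketched as one function. Tasks are modelled as lists of records and the dump file as a list, which are assumptions of this example; the function name `shutdown` and its parameters are hypothetical.

```python
def shutdown(cache: list, rejected: list, task_buffer: list, dump: list) -> None:
    """Save everything that has not reached a storage terminal."""
    # Controller: records that were never packaged into a task.
    dump.extend(cache)
    cache.clear()
    # Controller: tasks whose submission the manager rejected.
    for task in rejected:
        dump.extend(task)
    rejected.clear()
    # Manager: tasks still waiting in the task buffer.
    for task in task_buffer:
        dump.extend(task)
    task_buffer.clear()
```

Tasks already running on a storage terminal are deliberately left out: according to the text they run to completion, and they are dumped only if the usual exception path is triggered.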

Claims (3)

1. An easily extensible mass data collection system, comprising an external storage information submission interface and a database, characterized in that the system further comprises a data storage controller, a storage information cache, a storage operational terminal manager, a storage task buffer, storage terminals, a storage information dump file, a configuration rule base, and an error information dump file;
Data storage controller: responsible for receiving and distributing information awaiting storage; during startup, it scans the storage information dump file and loads the storage information dumped there; once started, it counts the records in the storage information cache, packages them into storage tasks in the optimal way defined by the configuration rules, and submits the tasks to the storage operational terminal manager; during system shutdown, it dumps the unallocated storage information in the storage information cache, together with any storage tasks whose submission was rejected, to the storage information dump file;
Storage operational terminal manager: responsible for receiving and distributing storage tasks; after system startup, it receives the storage tasks submitted by the data storage controller and checks whether any of its storage terminals is idle; if one is idle, it distributes the storage task; if none is idle, it checks, according to the configuration rules, whether the number of tasks in the task buffer has reached the maximum capacity, and when it has, dumps part of the storage tasks to the storage information dump file according to the dump strategy in the configuration rules; when the storage terminals belonging to the storage operational terminal manager are lightly loaded, the manager notifies the data storage controller to scan the storage information dump file for dumped storage information that has not yet been stored; during system shutdown, the manager stops receiving new tasks, stops distributing tasks to the storage terminals, and saves the unallocated storage tasks in the task buffer to the storage information dump file;
Storage terminal: receives the tasks distributed by the storage operational terminal manager and performs the storage operation; if it finds erroneous storage information during execution, it filters that information out, dumps it to the error information dump file, and continues with the storage information that has not yet been processed; if a storage exception occurs, it rolls back all database operations and dumps the storage task it was executing to the storage information dump file;
Configuration rule base: stores the configuration rules, including the optimal number of storage records per batch, the maximum time to wait for the optimal number, the maximum number of tasks that can be carried, the number of retries for a failed storage task, the storage information dump strategy, the storage information dump file, and the error information dump file, and thereby sets the criteria for system operation.
2. The easily extensible mass data collection system according to claim 1, characterized in that the storage information cache is a database associated with the data storage controller.
3. The easily extensible mass data collection system according to claim 1, characterized in that the storage task buffer is a database associated with the storage operational terminal manager.
CN201210496189.9A 2012-11-28 2012-11-28 Easily extensible mass data collection system Expired - Fee Related CN103853719B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210496189.9A CN103853719B (en) 2012-11-28 2012-11-28 Easily extensible mass data collection system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210496189.9A CN103853719B (en) 2012-11-28 2012-11-28 Easily extensible mass data collection system

Publications (2)

Publication Number Publication Date
CN103853719A CN103853719A (en) 2014-06-11
CN103853719B true CN103853719B (en) 2018-05-22

Family

ID=50861387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210496189.9A Expired - Fee Related CN103853719B (en) 2012-11-28 2012-11-28 Easily extensible mass data collection system

Country Status (1)

Country Link
CN (1) CN103853719B (en)

Also Published As

Publication number Publication date
CN103853719A (en) 2014-06-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
    Address after: Two Lu Tian Hua high tech Zone of Chengdu City, Sichuan province 610000 No. 219 Tianfu Software Park C District 10 building 20 layer
    Applicant after: CHINAWISERV TECHNOLOGIES Inc.
    Address before: Two Lu Tian Hua high tech Zone of Chengdu City, Sichuan province 610000 No. 81 Tianfu Software Park C District 10 building 20 layer
    Applicant before: CHENGDU QINZHI DIGITAL TECHNOLOGY Co.,Ltd.
GR01 Patent grant
PP01 Preservation of patent right
    Effective date of registration: 20191211
    Granted publication date: 20180522
PD01 Discharge of preservation of patent
    Date of cancellation: 20221211
    Granted publication date: 20180522
CF01 Termination of patent right due to non-payment of annual fee
    Granted publication date: 20180522
CF01 Termination of patent right due to non-payment of annual fee