CN103235807A - Data extracting and processing method supporting high-concurrency large-volume data - Google Patents


Info

Publication number
CN103235807A
CN103235807A
Authority
CN
China
Prior art keywords: file, data, thread, client, acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101383251A
Other languages
Chinese (zh)
Inventor
付传伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Group Shandong General Software Co Ltd
Original Assignee
Inspur Group Shandong General Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Group Shandong General Software Co Ltd filed Critical Inspur Group Shandong General Software Co Ltd
Priority to CN2013101383251A priority Critical patent/CN103235807A/en
Publication of CN103235807A publication Critical patent/CN103235807A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data acquisition and statistics, and in particular to a data extraction and processing method supporting high-concurrency, large-volume data. An acquisition scheme is defined flexibly: the client software defines the acquisition SQL (Structured Query Language) statements and an acquisition frequency for collecting data on a schedule, while a server module provides secure login, data import, and triggering of business-logic processing, so that mass data can be stably collected into an enterprise database and serve as a data source for enterprise decision analysis.

Description

A data extraction and processing method supporting high-concurrency, large-volume data
Technical field
The present invention relates to the technical field of data acquisition and statistics, and in particular to a data extraction and processing method supporting high-concurrency, large-volume data.
Background technology
In business decision-making, enterprises have a strong need for statistics on the purchase, inventory, and sales data of their products held by downstream distributors. How to stably obtain mass data from a very large number of dealers becomes the key issue, and existing acquisition methods are all stand-alone: their processes are both cumbersome and insufficiently stable.
Summary of the invention
To solve the problems of the prior art, the invention provides a data extraction and processing method supporting high-concurrency, large-volume data. An acquisition scheme is defined flexibly: the client software defines the acquisition SQL statements and acquisition frequency for collecting data on a schedule, and the server component provides secure login, data import, and triggering of business-logic processing, so that mass data is stably collected into an enterprise database and serves as a data source for business decision analysis.
The technical solution adopted in the present invention is as follows:
A data extraction and processing method supporting high-concurrency, large-volume data consists of two parts: the data acquisition method of the client and the data processing method of the server, wherein:
A. the data acquisition method of the client specifically comprises:
B. downloading configuration definition information from the server;
C. combining the downloaded configuration definition information with the local data source to define the concrete acquisition SQL (Structured Query Language) statements;
D. starting an acquisition-check thread, a collection thread, a data-upload thread, and a log-buffer thread in parallel;
E. the acquisition-check thread periodically checking whether any acquisition definition should be triggered, and passing the acquisition details that need to run to the collection thread for subsequent execution;
F. the collection thread executing each acquisition detail one by one and compressing the extracted data into a file;
G. the data-upload thread periodically checking whether new files exist and, when new files are found, uploading them one by one; when an upload fails, continuing with the next file, the failed file being retried on the next pass;
H. the log buffer providing an efficient mechanism that acts as an access buffer between each thread and the log file, speeding up the execution of each thread.
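Steps D through H can be pictured as a small pipeline. The sketch below (Python with invented names; the patent's embodiment is in C#) shows only the hand-off between the threads: a check pass feeds a collection queue, a collection pass turns each detail into a compressed file for the upload queue, and a failed upload moves on to the next file and is retried on a later pass, as step G requires.

```python
# Minimal sketch of the cooperating client threads in steps D-H.
# Names and the lambda "sender" are hypothetical, for illustration only.
import queue

collect_queue = queue.Queue()   # details handed over by the check thread (step E)
upload_queue = queue.Queue()    # compressed files awaiting upload (step G)
failed = []                     # files to retry on the next upload pass

def check_pass(due_schedules):
    """Check thread: enqueue every acquisition detail that is due."""
    for detail in due_schedules:
        collect_queue.put(detail)

def collect_pass():
    """Collection thread: run each detail, compress, hand to the uploader."""
    while not collect_queue.empty():
        detail = collect_queue.get()
        upload_queue.put(detail + ".rar")  # stands in for extract + compress

def upload_pass(send):
    """Upload thread: try each file; a failure moves on to the next file."""
    failed.clear()
    while not upload_queue.empty():
        f = upload_queue.get()
        if not send(f):
            failed.append(f)   # retried on the next pass, per step G
    for f in failed:
        upload_queue.put(f)

check_pass(["orders_daily", "stock_daily"])
collect_pass()
upload_pass(lambda f: f != "stock_daily.rar")  # simulate one failed upload
```

The point of the queues is that a slow or failing upload never blocks collection, which matches the independence of the four threads in step D.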
The data processing method of the server specifically comprises:
a. exposing services externally through a web-service interface, the component running as a service;
b. the server-side component starting a client-management thread, a file-transfer-management thread, and a file-import-scheduling thread;
c. the client-management thread periodically interacting with the database, reading client configuration information and saving client registration information to the database; the thread caching the various client information to provide fast access for file transfer and import scheduling, and also providing the client login-authentication function;
d. the file-transfer-management thread controlling the allocation and recovery of transmission permits and the reception of data;
e. the file-import-scheduling thread enforcing that a given client can import only one file at a time while files from different clients may be imported concurrently, and providing a re-import fault-tolerance mechanism for files whose import fails.
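The scheduling rule in step e can be stated compactly: one import per client at a time, different clients in parallel, under an optional global cap. The sketch below (Python, with assumed file-name and parameter conventions; the embodiment's C# version appears later in FileExecManager) expresses just that decision.

```python
# Sketch of the per-client import-scheduling rule in step e.
# File names are assumed to be shaped '<clientID>_<rest>', as in the embodiment.
def schedule_imports(pending, running_clients, global_cap=0):
    """Return the files allowed to start importing now.

    pending         - list of pending file names
    running_clients - set of client IDs that already have an import in flight
    global_cap      - overall concurrency limit; 0 means unlimited
    """
    started = []
    active = set(running_clients)
    for fn in pending:
        client = fn.split("_", 1)[0]
        if client in active:
            continue                      # one file per client at a time
        if global_cap and len(active) >= global_cap:
            continue                      # global concurrency limit reached
        started.append(fn)
        active.add(client)
    return started
```

For example, two files from client A and one from client B yield one import for A and one for B; the second A file waits for the next scheduling pass.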
The configuration definition information comprises the following:
(1) target object definition: data column names, field types, field lengths, field descriptions, and whether a field is a primary key;
(2) acquisition target definition: the table structure information (the target object) and the table names corresponding to the daily, weekly, and monthly acquisition cycles;
(3) acquisition scheme: the set of acquisition targets it comprises and the post-processing scheme;
(4) client configuration: login name, password, and acquisition scheme.
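One way to picture these four parts is as a nested structure. The field names, table names, and values below are invented for illustration; the patent does not prescribe a concrete serialization.

```python
# Illustrative (hypothetical) shape of the configuration definition
# downloaded by the client: target fields -> acquisition target -> scheme
# -> client configuration.
from dataclasses import dataclass

@dataclass
class TargetField:
    name: str            # data column name
    field_type: str
    length: int
    description: str
    is_primary_key: bool

@dataclass
class CollectTarget:
    target_fields: list  # table structure information (the target object)
    table_by_cycle: dict # table name per acquisition cycle: day/week/month

@dataclass
class CollectScheme:
    targets: list        # the set of acquisition targets
    post_process: str    # post-processing scheme

@dataclass
class ClientConfig:
    login_name: str
    password: str
    scheme: CollectScheme

cfg = ClientConfig(
    login_name="dealer01",
    password="secret",
    scheme=CollectScheme(
        targets=[CollectTarget(
            target_fields=[TargetField("order_id", "varchar", 36, "order key", True)],
            table_by_cycle={"day": "SALES_D", "week": "SALES_W", "month": "SALES_M"},
        )],
        post_process="refresh_summary",
    ),
)
```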
The beneficial effects of the technical scheme provided by the invention are as follows:
By flexibly defining the acquisition scheme, having the client define the acquisition SQL (Structured Query Language) statements and the acquisition frequency for scheduled collection, and having the server provide secure login, data import, and business-logic triggering, the invention stably collects mass data into an enterprise database and provides a data source for business decision analysis.
Embodiment
To make the purpose, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below in conjunction with a specific embodiment.
Embodiment one
The specific procedures for the client data acquisition and the server-side data processing are given below:
1. Client:
1.1 The periodic log-buffer class provides log buffering and periodically saves the log to a file:
public class TableLogBuffer : ThreadTaskBase
{
    /// Periodically save the execution log
    protected override void TaskWorkFunc()
    {
        lock (_lock)
        {
            // Nothing changed in memory, so there is no need to save to file
            if (!_ismodified)
                return;
            if (!Directory.Exists(LocalPath.LogPath))
                Directory.CreateDirectory(LocalPath.LogPath);
            // Obtain the serialized form of the log information
            string strResult = Serializer.XmlSerialize(_logList);
            string strPath = LocalPath.LogPath + "Logs.xml";
            if (File.Exists(strPath))
                File.Delete(strPath);
            FileStream fs = new FileStream(strPath, FileMode.OpenOrCreate, FileAccess.Write);
            StreamWriter m_streamWriter = new StreamWriter(fs, Encoding.Default);
            m_streamWriter.Write(strResult);
            m_streamWriter.Flush();
            m_streamWriter.Close();
            fs.Close();
            fs.Dispose();
            _ismodified = false;
        }
    }

    /// Write a log record
    public void WriteLog(DLog dLog, string flag)
    {
        lock (_lock)
        {
            bool bolExistLog = false;
            foreach (DLog d in _logList.logs)
            {
                if ((PubFunc.IsNull(d.FileName) == PubFunc.IsNull(dLog.FileName))
                    && (PubFunc.IsNull(d.CompanyLocalGuid) == PubFunc.IsNull(dLog.CompanyLocalGuid)))
                {
                    bolExistLog = true;
                    break;
                }
            }
            if (!bolExistLog && flag == "0")
                _logList.logs.Add(dLog);
            _ismodified = true;
        }
    }
}
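The essential pattern in the class above is a dirty flag behind a lock: callers only mutate the in-memory list, and the periodic flush touches the disk only when something actually changed. A minimal sketch of that pattern (Python, with a counter standing in for the XML serialization; names are invented):

```python
# Minimal sketch of the dirty-flag log buffer: writes never block on file
# I/O, and the periodic flush is skipped when nothing has changed.
import threading

class LogBuffer:
    def __init__(self):
        self._lock = threading.Lock()
        self._logs = []
        self._modified = False
        self.flush_count = 0      # how many real writes happened

    def write(self, entry):
        with self._lock:
            if entry not in self._logs:   # de-duplicate, like WriteLog above
                self._logs.append(entry)
            self._modified = True

    def flush(self):
        with self._lock:
            if not self._modified:
                return False              # nothing changed since last flush
            self.flush_count += 1         # stands in for serializing to disk
            self._modified = False
            return True

buf = LogBuffer()
buf.write("file1.rar collected")
buf.write("file1.rar collected")          # duplicate entry is absorbed
flushed_once = buf.flush()
flushed_again = buf.flush()               # no new writes, so this is skipped
```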
1.2 The acquisition-check thread periodically checks whether a local acquisition scheme is triggered and should begin execution:
public class CollectDateTimeJudge : ThreadTaskBase
{
    protected override void TaskWorkFunc()
    {
        DateTime now = DateTime.Now;
        if (lasttime == now.ToString("HH:mm"))
            return;
        lasttime = now.ToString("HH:mm");
        // Fetch the latest client list on every check
        List<Company> companyList = Company.getCompanyList(true);
        // Generate all the detailed collection work units that need to run this time
        List<CollectInfo> collinfo = new List<CollectInfo>();
        lock (_lock)
        {
            // Loop over all local clients
            foreach (Company cp in companyList)
                // Loop over the acquisition schemes
                foreach (SDCDQSFA scheme in cp.CollectScheme)
                    if (CollectDateTimePubFunc.CheckTime(now, scheme.SDCDQSFN_CJSJ))
                    {
                        List<CollectDateTime> cdtlist = CollectDateTimePubFunc.GetCollectDateTimeList(now, scheme);
                        List<CollectInfo> tcil = CollectDateTimePubFunc.GetCollectInfos(cdtlist, scheme, cp, null);
                        collinfo.AddRange(tcil);
                    }
            if (collinfo.Count > 0)
            {
                try
                {
                    // Set the acquisition-task information
                    string taskguid = Guid.NewGuid().ToString();
                    foreach (CollectInfo ci in collinfo)
                        ci.TaskDesc = "Auto:" + taskguid;
                    Collection.CollectionThread.AddCollectInfos(collinfo);
                }
                catch (Exception ex)
                {
                    TxtLogBuffer.BufferObj.Write(this.TaskName + " error during data acquisition", ex.ToString());
                }
            }
        }
    }
}
1.3 The collection thread connects to the database, extracts the data, and generates a compressed file according to the execution detail records passed in by the check thread:
public class Collection : ThreadTaskBase
{
    /// The public data-acquisition execution thread
    public static Collection CollectionThread = new Collection();
    /// Lock object for list management
    private object _lock = new object();
    /// The list of pending acquisitions
    List<CollectInfo> _tcil = new List<CollectInfo>();

    public Collection()
    {
        // Perform a collection check every 5 seconds
        Interval = 1;
        TaskName = "automatic acquisition execution thread";
    }

    /// Add detailed acquisition tasks to be executed
    public void AddCollectInfos(List<CollectInfo> list)
    {
        lock (_lock)
        {
            _tcil.AddRange(list);
        }
    }

    protected override void TaskWorkFunc()
    {
        List<CollectInfo> list = new List<CollectInfo>();
        lock (_lock)
        {
            list.AddRange(_tcil);
            _tcil.Clear();
        }
        // Execute every acquisition task in turn
        foreach (CollectInfo ci in list)
        {
            try
            {
                ExceDataCollect(ci);
            }
            catch (Exception ex)
            {
                TxtLogBuffer.BufferObj.Write(this.TaskName,
                    String.Format("executing acquisition [{0}] raised an exception: " + ex.ToString(), ci.ToString()));
            }
        }
    }

    /// Perform the data acquisition
    public void ExceDataCollect(CollectInfo ci)
    {
        string companyDataPath = LocalPath.GetDataDirectory("Data", ci.CompanyLocalGuid);
        if (!Directory.Exists(companyDataPath))
        {
            if (!Directory.Exists(LocalPath.DataPath))
                Directory.CreateDirectory(LocalPath.DataPath);
            Directory.CreateDirectory(companyDataPath);
        }
        Database db = new Database(ci.DataSource.SDCDJXS_DBType, ci.DataSource.SDCDJXS_DBMC
            , ci.DataSource.SDCDJXS_YHM, ci.DataSource.SDCDJXS_DLMM, ci.DataSource.SDCDJXS_DBFWM
            , "", 3600, ci.DataSource.SDCDJXS_PORT);
        try
        {
            TxtLogBuffer.BufferObj.Write("beginning acquisition: ", ci.ToString());
            DLog dlog = ci.ToDLog();
            dlog.DCID = Guid.NewGuid().ToString().ToLower(); // acquisition batch number
            int k = 0;
            while (k < 3)
            {
                k++;
                try
                {
                    dlog.CollectBeginTime = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss");
                    // The acquisition result is stored under the Data directory (DataPath)
                    SaveData(db, dlog, ci);
                    // Exit once the acquisition succeeds
                    break;
                }
                catch (Exception ex)
                {
                    dlog.CollectErrMsg = "acquisition attempt " + Convert.ToString(k) + " raised an exception: " + ex.Message;
                    TxtLogBuffer.BufferObj.Write(this.TaskName + " error while obtaining the result set: ", ex.ToString());
                }
                finally
                {
                    dlog.CollectEndTime = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss");
                }
            }
            TableLogBuffer.BufferObj.WriteLog(dlog, "0");
        }
        finally
        {
            db.Dispose();
        }
    }
}
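The retry loop inside ExceDataCollect above is a bounded retry: up to three attempts, stopping at the first success, keeping the last error message for the log record. A compact sketch of the same rule (Python, invented names; the "flaky" function only simulates a database read):

```python
# Sketch of the bounded retry used by the collection thread: up to three
# attempts, stop on first success, keep the last failure message.
def collect_with_retry(collect, attempts=3):
    """Run collect() up to `attempts` times; return (succeeded, last_error)."""
    last_error = None
    for k in range(1, attempts + 1):
        try:
            collect()
            return True, None
        except Exception as ex:
            last_error = "attempt %d failed: %s" % (k, ex)
    return False, last_error

calls = []
def flaky():
    """Fails once, then succeeds -- stands in for a transient DB error."""
    calls.append(1)
    if len(calls) < 2:
        raise RuntimeError("db timeout")

ok, err = collect_with_retry(flaky)

def always_fail():
    raise RuntimeError("bad sql")

ok2, err2 = collect_with_retry(always_fail)
```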
1.4 The file-upload thread periodically checks whether new data files exist and, if so, triggers the file upload:
public class SendData : ThreadTaskBase
{
    /// The upload task
    protected override void TaskWorkFunc()
    {
        try
        {
            // Obtain all files ending in .rar (including subdirectories)
            if (!Directory.Exists(LocalPath.DataPath))
                return;
            DirectoryInfo di = new DirectoryInfo(LocalPath.DataPath);
            FileInfo[] infos = di.GetFiles("*.rar", SearchOption.AllDirectories);
            // Sort by file modification time
            SortedList<string, string> sinfos = new SortedList<string, string>();
            foreach (FileInfo fi in infos)
                sinfos.Add(fi.LastWriteTime.ToString("yyyy-MM-dd HH:mm:ss.fff") + fi.FullName, fi.FullName);
            List<string> FileList = new List<string>();
            foreach (string s in sinfos.Values)
                FileList.Add(s);
            while (FileList.Count != 0)
            {
                // Take the file at the head of the list
                string fileFullPath = FileList[0].ToString(); // full path of the compressed file
                if (!File.Exists(fileFullPath))
                {
                    // The file no longer exists; remove it from the list and continue
                    FileList.Remove(fileFullPath);
                    continue;
                }
                DLog dlog = new DLog();
                try
                {
                    string filename = Path.GetFileName(fileFullPath); // name of the compressed file
                    dlog.FileName = filename;
                    dlog.TransBeginTime = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss");
                    string bdnm = Path.GetDirectoryName(fileFullPath).Replace(LocalPath.DataPath, ""); // folder name (client code BDNM)
                    dlog.CompanyLocalGuid = bdnm;
                    string uploadFileTempName = Path.ChangeExtension(filename, "dat"); // temporary name during upload
                    // Obtain the verification number and password from the client code
                    Company company = new Company(bdnm);
                    if (string.IsNullOrEmpty(company.CompanyURL))
                    {
                        // Upload information cannot be found; remove the file from the queue and delete it
                        FileList.Remove(fileFullPath);
                        File.Delete(fileFullPath);
                        dlog.SendErrMsg = "this client's information is wrong; the file was deleted and not uploaded.";
                        // The client's information is wrong; enter the next iteration
                        continue;
                    }
                    company.DataDealProxy.Timeout = 15000;
                    string strToken = company.DataDealProxy.Login(company.PZID, company.DLMM);
                    // Begin the file-transfer application
                    bool isCan = false;
                    // Give up the application after repeated attempts
                    for (int i = 0; i < 3; i++)
                    {
                        isCan = company.DataDealProxy.BeginApplyForTransFile(strToken);
                        if (isCan) // application succeeded; exit directly
                            break;
                        Thread.Sleep(1000 * 15);
                    }
                    if (!isCan)
                    {
                        dlog.SendErrMsg = "transfer application failed; this file will be transferred again later.";
                        TxtLogBuffer.BufferObj.Write("[" + fileFullPath + "] transfer application failed; this file will be transferred again later.", "");
                        continue;
                    }
                    company.DataDealProxy.CreateNewFile(strToken, uploadFileTempName);
                    // Open a stream over the file to upload
                    FileStream StreamToZip = new FileStream(fileFullPath, System.IO.FileMode.Open, System.IO.FileAccess.Read);
                    try
                    {
                        byte[] buffer = new byte[24 * 1024]; // 24K
                        System.Int32 sizeRead = 0;
                        AES myAES = new AES(company.CompanyGuid);
                        while ((sizeRead = StreamToZip.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            byte[] mw = myAES.AESEncrypt(buffer, 0, sizeRead);
                            company.DataDealProxy.AppendFile(strToken, mw, mw.Length, uploadFileTempName);
                            // Pause 1 second after every 24K bytes transferred, capping each
                            // client's network bandwidth at 24K bytes per second
                            Thread.Sleep(1000);
                        }
                    }
                    finally
                    {
                        StreamToZip.Close();
                    }
                    company.DataDealProxy.DeleteFile(strToken, filename);
                    company.DataDealProxy.RenameFile(strToken, uploadFileTempName, filename);
                    company.DataDealProxy.EndApplyForTransFile(strToken);
                    dlog.SendErrMsg = "upload succeeded!";
                    TxtLogBuffer.BufferObj.Write(fileFullPath + " upload succeeded!", "");
                    Thread.Sleep(1000);
                    try
                    {
                        FileMove(fileFullPath); // on success, move the local file to the history folder
                    }
                    catch (Exception ex)
                    {
                        dlog.SendErrMsg += " move failed! Exception details: " + ex.Message;
                        TxtLogBuffer.BufferObj.Write(this.TaskName + "\r\n[" + fileFullPath + "] move failed! Exception details: ", ex.ToString());
                    }
                }
                catch (Exception ex)
                {
                    dlog.SendErrMsg = "the transfer application raised an exception! Transfer will be retried later: " + ex.Message;
                    TxtLogBuffer.BufferObj.Write(this.TaskName + "\r\n[" + fileFullPath + "] the transfer application raised an exception! Transfer will be retried later.", ex.Message.ToString());
                }
                finally
                {
                    FileList.Remove(fileFullPath); // remove the processed file from the queue
                    dlog.TransEndTime = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss");
                    TableLogBuffer.BufferObj.WriteLog(dlog, "1");
                }
            }
        }
        catch (Exception ex)
        {
            TxtLogBuffer.BufferObj.Write(this.TaskName + " the upload operation raised an exception: ", ex.ToString());
        }
    }
}
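The inner upload loop above throttles bandwidth by construction: each 24 KB chunk is encrypted, sent, and followed by a 1-second pause, capping each client at roughly 24 KB/s upstream. The arithmetic of that design choice can be checked with a small sketch (Python; the function is an illustration, not part of the embodiment):

```python
# Sketch of the throttling arithmetic in the upload loop: one 24 KB chunk
# per second means an upload's minimum duration grows linearly with size.
CHUNK = 24 * 1024

def plan_upload(file_size, chunk=CHUNK, pause=1.0):
    """Return (number of chunks sent, minimum seconds spent sleeping)."""
    chunks = (file_size + chunk - 1) // chunk  # ceiling division
    return chunks, chunks * pause

chunks, sleep_s = plan_upload(10 * 1024 * 1024)   # a 10 MB compressed file
```

A 10 MB file therefore needs 427 chunks and at least about seven minutes of pauses, which is the deliberate trade-off: per-client bandwidth stays bounded so many dealers can upload concurrently without saturating the server link.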
2. Server:
2.1 The file-transfer-management thread is responsible for allocating and recovering transmission permits and for file transfer:
public class FileTransferManager : ThreadTaskBase
{
    #region file transfer
    /// Begin a file-transfer application
    public bool BeginApplyForTransFile(string clientID)
    {
        bool ret = false;
        lock (_lock)
        {
            // If an application already exists and has not timed out, return true directly
            if (_transferingClient.ContainsKey(clientID))
                ret = true;
            else
            {
                // Obtain the client's priority level
                string yxjb = _clientManager.GetYXJB(clientID);
                // Check whether the configured concurrency limit has been reached
                if ((_transferingClient.Count < DCSConfig.AllTransferConcurrencyCapability) ||
                    (DCSConfig.AllTransferConcurrencyCapability == 0))
                {
                    // Allow the transfer when the number of transferring clients is below
                    // the total concurrency limit, or when the system imposes no limit
                    _transferingClient[clientID] = yxjb;
                    // Record the latest time this client transferred a file
                    _transferingClientActiveTime[clientID] = DateTime.Now;
                    ret = true;
                }
            }
        }
        return ret;
    }

    /// End a file-transfer application
    public bool EndApplyForTransFile(string clientID)
    {
        bool ret = false;
        lock (_lock)
        {
            // If an application exists
            if (_transferingClient.ContainsKey(clientID))
            {
                // Obtain the client's priority level (recorded at application time)
                string yxjb = _transferingClient[clientID];
                // Remove the client's information
                _transferingClient.Remove(clientID);
                // Remove the client's latest transfer time
                _transferingClientActiveTime.Remove(clientID);
                ret = true;
            }
        }
        if (DCSConfig.RecordFileCommonLog)
            LogDataWriter.LogWrite.AddSystemLog(clientID, "file-transfer application cancelled", (ret ? "succeeded" : "failed"));
        return ret;
    }

    /// Create a new file
    public void CreateNewFile(string clientID, string fileName)
    {
        if (!Directory.Exists(DCSConfig.TransPath))
            Directory.CreateDirectory(DCSConfig.TransPath);
        FileStream file = File.Create(DCSConfig.TransPath + fileName);
        file.Close();
        file.Dispose();
        // Log the time the file transfer began
        string logfilename = System.IO.Path.ChangeExtension(fileName, "rar");
        LogDataWriter.LogWrite.AddDataBeginTransLog(logfilename);
        if (DCSConfig.RecordFileCommonLog)
            LogDataWriter.LogWrite.AddLog(fileName, "file created for transfer", "client id: " + clientID);
    }

    /// Delete a file
    public void DeleteFile(string clientID, string fileName)
    {
        if (File.Exists(DCSConfig.TransPath + fileName))
            File.Delete(DCSConfig.TransPath + fileName);
    }

    /// Append transferred file content
    public void AppendFile(string clientID, byte[] buffer, int count, string fileName)
    {
        if (!Directory.Exists(DCSConfig.TransPath))
            Directory.CreateDirectory(DCSConfig.TransPath);
        AES aes = new AES(_clientManager.GetAESKey(clientID));
        byte[] mw = aes.AESDecrypt(buffer, 0, count);
        // Open the actual file object to save the uploaded content
        FileStream fileStream = new FileStream(DCSConfig.TransPath + fileName, FileMode.Append, FileAccess.Write);
        // Write the in-memory data to the physical file
        //fileStream.Seek(0, SeekOrigin.End);
        fileStream.Write(mw, 0, mw.Length);
        fileStream.Close();
        fileStream.Dispose();
    }

    /// Rename a file
    public void RenameFile(string clientID, string oldFileName, string newFileName)
    {
        if (File.Exists(DCSConfig.TransPath + oldFileName))
        {
            if (oldFileName.ToLower() != newFileName.ToLower())
            {
                if (File.Exists(DCSConfig.TransPath + newFileName))
                    File.Delete(DCSConfig.TransPath + newFileName);
                File.Move(DCSConfig.TransPath + oldFileName, DCSConfig.TransPath + newFileName);
            }
            string yxjb = _clientManager.GetYXJB(clientID);
            FileInfo fi = new FileInfo(DCSConfig.TransPath + newFileName);
            // Log the time the file transfer completed
            string logfilename = System.IO.Path.ChangeExtension(newFileName, "rar");
            LogDataWriter.LogWrite.AddDataEndTransLog(logfilename, fi.Length);
            if (DCSConfig.RecordFileCommonLog)
                LogDataWriter.LogWrite.AddLog(newFileName, "file transfer completed", "client id: " + clientID);
            // Add the file to the queue of files to be processed
            if (OnAddNeedDealFileName != null)
                OnAddNeedDealFileName(newFileName, yxjb);
        }
    }
    #endregion
}
2.2 The file-import-scheduling thread is responsible for scheduling and concurrency control of file imports:
public class FileExecManager : ThreadTaskBase
{
    protected override void TaskWorkFunc()
    {
        Logger.Info("thread: [" + this.TaskName + "] " + DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss") + " starting....");
        string outStr = "";
        lock (_lock)
        {
            Dictionary<string, string> runningClientList = new Dictionary<string, string>();
            #region If there are running file-import threads, process them first
            if (_dealingFileList.Count > 0)
            {
                // Handle import threads that have finished
                List<string> fileList = _dealingFileList.Keys.ToList();
                foreach (string fn in fileList)
                {
                    FileImportThread importThread = _dealingFileList[fn];
                    // If the thread has finished running
                    if (!importThread.IsAlive)
                    {
                        if (importThread.Error == null)
                        {
                            // Normal completion: move the file to the backup path
                            if (File.Exists(DCSConfig.BackupPath + fn))
                                File.Delete(DCSConfig.BackupPath + fn);
                            try
                            {
                                File.Move(DCSConfig.DataPath + fn, DCSConfig.BackupPath + fn);
                                // Remove the record of the running file
                                _dealingFileList.Remove(fn);
                                LogDataWriter.LogWrite.AddLog(fn, "backup after file import", "");
                            }
                            catch (Exception) { }
                        }
                        else
                        {
                            // Error handling
                            if (!_errorFileList.ContainsKey(fn))
                            {
                                _errorFileList.Add(fn, DateTime.Now);
                                _errorFileImportTimes.Add(fn, 1);
                            }
                            if (_errorFileImportTimes.ContainsKey(fn))
                            {
                                if (_errorFileImportTimes[fn] < 3)
                                {
                                    _errorFileImportTimes[fn]++;
                                    _dealingFileList.Remove(fn);
                                    _errorFileList[fn] = DateTime.Now;
                                    continue;
                                }
                            }
                            // Clear the history records of the failed file
                            _errorFileList.Remove(fn);
                            _errorFileImportTimes.Remove(fn);
                            if (File.Exists(DCSConfig.ErrorPath + fn))
                                File.Delete(DCSConfig.ErrorPath + fn);
                            // Otherwise move the file to the error path
                            try
                            {
                                File.Move(DCSConfig.DataPath + fn, DCSConfig.ErrorPath + fn);
                                // Remove the record of the running file
                                _dealingFileList.Remove(fn);
                                // The failure is already recorded inside the import
                                // thread, so it is not logged again here
                                //LogDataWriter.LogWrite.AddException(fn, "file import failed", "error message: " + importThread.Error.ToString());
                            }
                            catch (Exception) { }
                        }
                    }
                    else
                    {
                        // Record the client IDs that are still importing
                        string clientID = "";
                        if (fn.IndexOf('_', 0) > 0)
                        {
                            clientID = fn.Substring(0, fn.IndexOf('_', 0));
                        }
                        if (!runningClientList.ContainsKey(clientID))
                            runningClientList.Add(clientID, clientID);
                    }
                }
                // Recompute the number of threads handling files
                _dealingCount.Clear();
                foreach (FileImportThread importThread in _dealingFileList.Values)
                {
                    int curCount = 0;
                    if (_dealingCount.ContainsKey(importThread.YXJB))
                        curCount = _dealingCount[importThread.YXJB];
                    _dealingCount[importThread.YXJB] = curCount + 1;
                }
            }
            #endregion
            #region Schedule new file-import threads
            if (_beforeDealFileList.Count > 0)
            {
                List<string> fileList = _beforeDealFileList.Keys.ToList();
                foreach (string fn in fileList)
                {
                    string clientID = "";
                    if (fn.IndexOf('_', 0) > 0)
                    {
                        clientID = fn.Substring(0, fn.IndexOf('_', 0));
                    }
                    // If the current client already has a file being processed, continue
                    // with the next file. (Rule added later: concurrency is controlled
                    // per client; each client may import only one file at a time.)
                    if (runningClientList.ContainsKey(clientID))
                        continue;
                    string yxjb = _beforeDealFileList[fn];
                    bool isRun = false;
                    if ((_dealingFileList.Count < DCSConfig.AllDealConcurrencyCapability) ||
                        (DCSConfig.AllDealConcurrencyCapability == 0))
                        isRun = true;
                    if (isRun)
                    {
                        _beforeDealFileList.Remove(fn);
                        if (!_dealingFileList.ContainsKey(fn))
                        {
                            // A duplicate data file does not need to be processed again
                            FileImportThread importThread = new FileImportThread(fn, yxjb, _clientManager);
                            _dealingFileList.Add(fn, importThread);
                            importThread.Start();
                            LogDataWriter.LogWrite.AddLog(fn, "file scheduled for import", "");
                        }
                        if (!runningClientList.ContainsKey(clientID))
                            runningClientList.Add(clientID, clientID);
                    }
                }
            }
            #endregion
            #region Recompute the number of threads handling files
            _dealingCount.Clear();
            foreach (FileImportThread importThread in _dealingFileList.Values)
            {
                int curCount = 0;
                if (_dealingCount.ContainsKey(importThread.YXJB))
                    curCount = _dealingCount[importThread.YXJB];
                _dealingCount[importThread.YXJB] = curCount + 1;
            }
            #endregion
        }
        Logger.Info("thread: [" + this.TaskName + "] " + DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss") + " finished!");
    }
}
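The fault-tolerance rule buried in FileExecManager above is: a finished import with no error is backed up; a failed import is retried up to three times, after which the file is moved to the error path. A compact sketch of that decision (Python, invented names; file moves are represented by lists):

```python
# Sketch of the server's re-import fault tolerance: success -> backup path,
# failure -> retry, three failures -> error path.
def on_import_finished(fn, error, attempts, backup, errors):
    """Record one finished import; return 'backup', 'retry', or 'error'."""
    if error is None:
        backup.append(fn)          # stands in for moving to the backup path
        return "backup"
    attempts[fn] = attempts.get(fn, 0) + 1
    if attempts[fn] < 3:
        return "retry"             # re-queued for another import attempt
    errors.append(fn)              # stands in for moving to the error path
    del attempts[fn]               # clear the failure history
    return "error"

attempts, backup, errors = {}, [], []
results = [on_import_finished("A_1.rar", "boom", attempts, backup, errors)
           for _ in range(3)]
ok_result = on_import_finished("B_1.rar", None, attempts, backup, errors)
```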
This embodiment effectively improves the construction of enterprise-level data-extraction and processing application systems: configuration definition is flexible, the deployment environment is widely applicable, and processing performance under large data volumes is stable, which can greatly promote adoption of the technology; it has good application prospects.
The above is only a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (2)

1. A data extraction and processing method supporting high-concurrency, large-volume data, consisting of two parts: a data acquisition method of a client and a data processing method of a server, wherein:
A. the data acquisition method of the client specifically comprises:
B. downloading configuration definition information from the server;
C. combining the downloaded configuration definition information with the local data source to define the concrete acquisition SQL (Structured Query Language) statements;
D. starting an acquisition-check thread, a collection thread, a data-upload thread, and a log-buffer thread in parallel;
E. the acquisition-check thread periodically checking whether any acquisition definition should be triggered, and passing the acquisition details that need to run to the collection thread for subsequent execution;
F. the collection thread executing each acquisition detail one by one and compressing the extracted data into a file;
G. the data-upload thread periodically checking whether new files exist and, when new files are found, uploading them one by one; when an upload fails, continuing with the next file, the failed file being retried on the next pass;
H. the log buffer providing an efficient mechanism that acts as an access buffer between each thread and the log file, speeding up the execution of each thread;
the data processing method of the server specifically comprises:
a. exposing services externally through a web-service interface, the component running as a service;
b. the server-side component starting a client-management thread, a file-transfer-management thread, and a file-import-scheduling thread;
c. the client-management thread periodically interacting with the database, reading client configuration information and saving client registration information to the database; the thread caching the various client information to provide fast access for file transfer and import scheduling, and also providing the client login-authentication function;
d. the file-transfer-management thread controlling the allocation and recovery of transmission permits and the reception of data;
e. the file-import-scheduling thread enforcing that a given client can import only one file at a time while files from different clients may be imported concurrently, and providing a re-import fault-tolerance mechanism for files whose import fails.
2. The data extraction and processing method supporting high-concurrency, large-volume data according to claim 1, characterized in that the configuration definition information comprises the following:
(1) target object definition: data column names, field types, field lengths, field descriptions, and whether a field is a primary key;
(2) acquisition target definition: the table structure information (the target object) and the table names corresponding to the daily, weekly, and monthly acquisition cycles;
(3) acquisition scheme: the set of acquisition targets it comprises and the post-processing scheme;
(4) client configuration: login name, password, and acquisition scheme.
CN2013101383251A 2013-04-19 2013-04-19 Data extracting and processing method supporting high-concurrency large-volume data Pending CN103235807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101383251A CN103235807A (en) 2013-04-19 2013-04-19 Data extracting and processing method supporting high-concurrency large-volume data

Publications (1)

Publication Number Publication Date
CN103235807A true CN103235807A (en) 2013-08-07

Family

ID=48883848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013101383251A Pending CN103235807A (en) 2013-04-19 2013-04-19 Data extracting and processing method supporting high-concurrency large-volume data

Country Status (1)

Country Link
CN (1) CN103235807A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102123043A (en) * 2011-01-05 2011-07-13 China Southern Power Grid Co., Ltd. EHV Power Transmission Company, Baise Bureau System for implementing secondary security protection and supervision of a substation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Liu Gang: "Design and Implementation of OPC-Based Plant and Substation Data Acquisition", China Master's Theses Full-text Database, Information Science and Technology series *
Zhang Xiaoqian: "Research and Implementation of a TELNET-Based General-Purpose Data Acquisition System", China Master's Theses Full-text Database, Information Science and Technology series *
Chen Tao: "Research and Design of a Distributed Log Data Acquisition Agent Framework", Wanfang Data Knowledge Service Platform *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440290A (en) * 2013-08-16 2013-12-11 Dawning Information Industry Co., Ltd. Big data loading system and method
CN103488690A (en) * 2013-09-02 2014-01-01 Yonyou Software Co., Ltd. Data integration system and data integration method
CN103488690B (en) * 2013-09-02 2017-06-30 Yonyou Network Technology Co., Ltd. Data integration system and data integration method
CN103970880A (en) * 2014-05-17 2014-08-06 Bai Chongming Distributed multi-point data extraction method
CN103970880B (en) * 2014-05-17 2018-12-18 Bai Chongming Distributed multi-point data extraction method
WO2016065776A1 (en) * 2014-10-28 2016-05-06 Inspur Electronic Information Industry Co., Ltd. Method for tightly coupled scalable big-data interaction
CN105205687A (en) * 2015-08-24 2015-12-30 Inspur General Software Co., Ltd. Mass data acquisition method
CN107862425A (en) * 2017-08-29 2018-03-30 Ping An Puhui Enterprise Management Co., Ltd. Risk control data acquisition method, device, system, and readable storage medium
CN107862425B (en) * 2017-08-29 2021-12-07 Ping An Puhui Enterprise Management Co., Ltd. Risk control data acquisition method, device and system, and readable storage medium
WO2019104891A1 (en) * 2017-11-28 2019-06-06 Ping An Technology (Shenzhen) Co., Ltd. Method and device for importing and exporting report, storage medium, and terminal

Similar Documents

Publication Publication Date Title
CN103235807A (en) Data extracting and processing method supporting high-concurrency large-volume data
CN105740418B A real-time synchronization system based on file monitoring and message pushing
US20230177016A1 (en) Versioned file system with global lock
CN110543464B (en) Big data platform applied to intelligent park and operation method
EP3602341B1 (en) Data replication system
CN103152352B A comprehensive information security forensics monitoring method and system based on a cloud computing environment
CN102970158B (en) Log storage and processing method and log server
US9244987B2 (en) ETL data transit method and system
CN106991035A A host supervision system based on a microservice framework
CN111949633B (en) ICT system operation log analysis method based on parallel stream processing
US8135763B1 (en) Apparatus and method for maintaining a file system index
CN105005618A (en) Data synchronization method and system among heterogeneous databases
US10191915B2 (en) Information processing system and data synchronization control scheme thereof
CN104778188A (en) Distributed device log collection method
CN108092936A A host supervision system based on a plug-in architecture
US11657025B2 (en) Parallel processing of filtered transaction logs
US20080301471A1 (en) Systems and methods in electronic evidence management for creating and maintaining a chain of custody
CN105205687A (en) Mass data acquisition method
KR101966201B1 Big data archiving and searching system
CN107688611A A Redis key-value management system and method based on saltstack
US20080301284A1 (en) Systems and methods for capture of electronic evidence
CN102968479A Cross-security-zone database backup method
US20080301713A1 (en) Systems and methods for electronic evidence management with service control points and agents
US11079960B2 (en) Object storage system with priority meta object replication
CN115086304B (en) Multi-source distributed downloading system based on FTP protocol

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130807