CN110162346A - Processing method for concurrent management of project data and individual data items - Google Patents
Processing method for concurrent management of project data and individual data items
- Publication number
- CN110162346A CN110162346A CN201910396573.3A CN201910396573A CN110162346A CN 110162346 A CN110162346 A CN 110162346A CN 201910396573 A CN201910396573 A CN 201910396573A CN 110162346 A CN110162346 A CN 110162346A
- Authority
- CN
- China
- Prior art keywords
- concurrent
- manager
- data
- processing method
- threads
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
- G06F9/4451—User profiles; Roaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A processing method for concurrent management of project data and individual data items. Setting the number of threads: the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file. The specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box. Establish the SuperMap parallel computing environment for data analysis; organize the processes running independently on each node into a parallel program. First, apply to the resource manager for compute nodes; all nodes assigned to a single business task together constitute a service set. Then select one process as the master process, with the remaining processes serving as worker processes.
Description
Technical field
The present invention relates to big data processing and parallel computing methods.
Background art
With the development of web service technologies, the amount of data acquired keeps growing, and data processing and analysis in the services field consume more and more time. Traditional data processing and serial computing techniques therefore struggle to meet the demand for efficient processing of big data. SuperMap supports parallel computing, which effectively improves the efficiency of big data processing.
The principle of parallel computing: parallel computing is the process of decomposing a task into several smaller tasks that are solved cooperatively at the same time; it is an effective way to enhance problem-solving capability and improve performance. Parallel computing can be realized in several ways, including multi-process, multi-threading, and others; SuperMap implements parallel computing through multi-threading. It can make full and efficient use of multi-core computing resources to reduce the solution time of a single problem at lower cost, and can meet the demands of larger-scale or higher-precision problem solving. Consider the two computing modes, serial and parallel: when a task is divided into three subtasks A, B, and C, serial execution must run the three subtasks one after another, while multi-threaded parallel execution can run the three subtasks simultaneously on three threads.
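The serial-versus-parallel comparison above can be sketched in a few lines of Java. This is a minimal illustration, not SuperMap code; the subtask body is a stand-in for real work.

```java
// A minimal sketch (not SuperMap code) of the comparison above: three
// subtasks A, B, C executed one after another on a single thread,
// versus simultaneously on three threads.
public class SubtaskDemo {
    // Stand-in for real work such as a raster analysis step.
    static void subtask(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Serial: total time is roughly the sum of the three subtasks.
    public static void runSerially() {
        subtask("A");
        subtask("B");
        subtask("C");
    }

    // Parallel: total time is roughly the longest single subtask.
    public static void runInParallel() throws InterruptedException {
        Thread a = new Thread(() -> subtask("A"));
        Thread b = new Thread(() -> subtask("B"));
        Thread c = new Thread(() -> subtask("C"));
        a.start(); b.start(); c.start();
        a.join(); b.join(); c.join();
    }

    public static void main(String[] args) throws InterruptedException {
        long t0 = System.nanoTime();
        runSerially();
        long serialMs = (System.nanoTime() - t0) / 1_000_000;
        t0 = System.nanoTime();
        runInParallel();
        long parallelMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("serial: " + serialMs + " ms, parallel: " + parallelMs + " ms");
    }
}
```

With three equal subtasks, the parallel run takes roughly one-third of the serial time, matching the text's claim that the three threads execute the subtasks simultaneously.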
With parallel computing enabled, consider the CPU usage when an "extract isolines" analysis is executed on an ordinary four-core computer. With 1 analysis thread, CPU utilization is low and only one CPU core participates in the computation; when the number of parallel threads is set to 4, all four CPU cores participate and CPU utilization can reach as high as 100%.
An example comparing the running time of multi-threaded and single-threaded computation when generating a three-dimensional hillshade map. The data used in this example is a DEM of a certain area with 15000*20000 rows and columns and a data size of 884 MB. Running the three-dimensional hillshade operation on it takes 80 seconds with single-threaded SuperMap iDesktop 8C, while the same operation completes in only 15 seconds with parallel computing. This example shows that parallel computing can cut the processing time of the same data by a factor of 3-5, greatly reducing time cost and improving analysis performance and working efficiency. Comparison of parallel computing and single-threaded time consumption: analysis performance improves more than tenfold, overall performance improves fivefold, and the running time of parallel computing is about 15% of that of a single thread.
Functions supporting parallel computing. Currently, the SuperMap functions that support parallel computing include: grid analysis, hydrological analysis, network analysis, topology preprocessing, overlay analysis, spatial query, and so on.
Grid analysis: the functions in the grid analysis module that support parallel computing include interpolation analysis, extracting isolines, extracting isosurfaces, slope analysis, aspect analysis, grid cut-and-fill, region cut-and-fill, inverse cut-and-fill, surface area calculation, surface volume calculation, finding extreme values, generating three-dimensional hillshade maps, generating orthorectified 3D images, single-point visibility, multi-point viewshed analysis, grid resampling, grid reclassification, grid aggregation, and so on.
Overlay analysis: line and region overlay analysis, clipping, erasing, union, intersection (including line-region intersection), identity, symmetric difference, and update all support parallel computing.
Spatial query: contains and intersects queries on region objects support parallel computing.
The object of the present invention is to propose a processing method for concurrent management of project data and individual data items. Setting the number of threads: the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file. The specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box.
A node in the system configuration file SuperMap.xml specifies the number of threads; its initial value is 2. SuperMap.xml is located in the Bin folder under the component product installation directory. To set the number of threads to 4, change the value of that node in the configuration file to 4.
Establish the SuperMap parallel computing environment for data analysis; organize the processes running independently on each node into a parallel program.
First, apply to the resource manager for compute nodes; all nodes assigned to a single business task together constitute a service set. Select one process as the master process, with the remaining processes serving as worker processes.
To organize the processes running independently on each node into a parallel program, modify the program code so that the main function becomes a function that can be executed by every parallel process.
The master process maintains the document metadata cache; each worker process maintains a local document cache and the worker threads and data threads it has opened.
Before the process manager schedules the SuperMap process that executes the business task, it first schedules and executes the document initialization process, and then requests an execution process from the process manager.
The SuperMap parallel computing environment for text data analysis; organizing the processes running independently on each node into a parallel program: the scheduling and execution of the document initialization process, followed by requesting an execution process from the process manager, specifically includes:
After initialization completes, the process manager waits for one round of the heartbeat communication cycle, in order to learn that some process manager has a free Map/Reduce slot, and that process manager requests an execution process from the process manager.
Upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager. The corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
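The heartbeat-driven dispatch described above can be illustrated with a toy model. All names and data structures here are hypothetical assumptions for illustration only, not the patent's implementation: a manager holds a queue of pending tasks and hands the document initialization task to the first node whose heartbeat reports a free slot.

```java
// A simplified, hypothetical illustration (names and structure are
// assumptions, not the patent's implementation) of heartbeat-driven
// scheduling: the manager dispatches the document initialization task
// to the first node whose heartbeat reports a free slot.
import java.util.ArrayDeque;
import java.util.Queue;

public class HeartbeatScheduler {
    public static class Worker {
        public final String id;
        public int freeSlots;
        public String assignedTask;

        public Worker(String id, int freeSlots) {
            this.id = id;
            this.freeSlots = freeSlots;
        }
    }

    // Handle one heartbeat: if the reporting node has a free slot and a
    // task is pending, assign the task to that node and return it.
    public static String onHeartbeat(Worker w, Queue<String> pending) {
        if (w.freeSlots > 0 && !pending.isEmpty()) {
            w.assignedTask = pending.poll();
            w.freeSlots--;
            return w.assignedTask;
        }
        return null; // nothing dispatched on this heartbeat
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("document-initialization");
        Worker busy = new Worker("node-1", 0);
        Worker idle = new Worker("node-2", 1);
        // The busy node's heartbeat yields no assignment; the idle
        // node receives the task on its next heartbeat.
        System.out.println(onHeartbeat(busy, pending));
        System.out.println(onHeartbeat(idle, pending));
    }
}
```

This mirrors the waiting step in the text: the manager cannot dispatch until some heartbeat reports a free slot, so dispatch latency is bounded by the heartbeat period.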
When the application starts, it preferentially reads the thread count from the configuration file. If the thread count is modified at "parallel computing threads number", it takes effect immediately and the value in the configuration file is updated automatically. The thread count in the configuration file, however, is read only once at application startup; after manually modifying the configuration file, the application must be restarted for the change to take effect. The valid range of the thread count is 1-16. If the thread count in the configuration file is out of range, the setting is invalid and the default value 2 is used; if the value set at "parallel computing threads number" is greater than 16, it is automatically adjusted to 16. So how should a reasonable thread count be chosen? The following can serve as a reference:
1. The specified threads are distributed among all cores of the computer's processor. When the thread count equals the total number of processor cores, all cores participate in the computation and the computing resources of the machine are fully utilized.
2. When the thread count exceeds the number of cores, thread scheduling and load balancing may take extra time; even if the analysis computation time decreases further, the overall performance gain may be insignificant. Doing so is therefore not recommended.
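The thread-count rules above can be captured in a small helper. This is a sketch of the described behavior, not the SuperMap API; the class and method names are invented for illustration (the minimum-clamp in the dialog path is an assumption, since the text only specifies clamping values above 16).

```java
// Sketch (hypothetical helper, not SuperMap API) of the thread-count
// rules described above: config-file values outside 1-16 fall back to
// the default of 2, while dialog values above 16 are clamped to 16.
public class ThreadCountPolicy {
    static final int DEFAULT_THREADS = 2;
    static final int MIN_THREADS = 1;
    static final int MAX_THREADS = 16;

    // Value read from SuperMap.xml: out-of-range settings are invalid
    // and replaced by the default.
    public static int fromConfigFile(int value) {
        return (value < MIN_THREADS || value > MAX_THREADS) ? DEFAULT_THREADS : value;
    }

    // Value entered at "parallel computing threads number": clamped
    // into the valid range (lower clamp is an assumption).
    public static int fromDialog(int value) {
        return Math.max(MIN_THREADS, Math.min(MAX_THREADS, value));
    }

    public static void main(String[] args) {
        System.out.println(fromConfigFile(4));   // valid, kept as-is
        System.out.println(fromConfigFile(32));  // out of range, default 2
        System.out.println(fromDialog(32));      // clamped to 16
    }
}
```

Per guideline 1 above, a caller would typically pass `Runtime.getRuntime().availableProcessors()` to match the thread count to the core count.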
Geospatial analysis has characteristics such as complex algorithm logic and large data scale; it is a compute-intensive and data-intensive workload. Parallel computing can make full use of multi-core computing resources, substantially reducing analysis time and improving performance. Parallel computing provides strong support for the realization of big data processing.
The processing method for concurrent management of project data and individual data items must take into account concurrent access to the related project data and individual data items. Concurrency is a difficult problem, and the usual requirements of concurrency and synchronization must be met.
Beneficial effects: regarding asynchrony and synchronization in the concurrent management of project data and individual data items in the present invention: synchronization means that while one thread executes a method or function, other threads are blocked and must wait for it to finish before they can continue. Asynchrony means that multiple threads do not block one another and execute simultaneously. Put simply, synchronization means doing things one at a time, while asynchrony means that doing one thing does not prevent doing others; multiple threads run asynchronously.
Specific embodiment:
The SuperMap parallel computing environment for text data analysis; organizing the processes running independently on each node into a parallel program: the scheduling and execution of the document initialization process, followed by requesting an execution process from the process manager, specifically includes:
After initialization completes, the process manager waits for one round of the heartbeat communication cycle, in order to learn that some process manager has a free SuperMap slot, and that process manager requests an execution process from the process manager.
Upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager. The corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
Concurrency and synchronization problems are handled mainly through lock mechanisms. Pessimistic locking: as the name suggests, it takes a conservative attitude toward data being modified by the outside world (including other current transactions of this system, as well as transaction processing from external systems). Therefore, the data is kept locked during the whole data handling process. The implementation of pessimistic locking usually relies on the lock mechanism provided by the database (only a lock mechanism provided at the database layer can truly guarantee the exclusivity of data access; otherwise, even if a locking mechanism is implemented in the present system, there is no guarantee that an external system will not modify the data). A typical pessimistic lock call relying on the database:
SELECT * FROM account WHERE name = 'Erica' FOR UPDATE
This SQL statement locks all records in the account table that meet the search condition (name = 'Erica'). Before the transaction commits (the lock is released when the transaction commits), the outside world cannot modify these records. Hibernate's pessimistic locking is likewise implemented on top of the database lock mechanism.
The following code locks the queried records:
String hqlStr = "from TUser as user where user.name = 'Erica'";
Query query = session.createQuery(hqlStr);
query.setLockMode("user", LockMode.UPGRADE); // lock
List userList = query.list(); // execute the query and fetch the data
Observe the SQL statement generated by Hibernate at runtime:
select tuser0_.id as id, tuser0_.name as name, tuser0_.group_id as group_id, tuser0_.user_type as user_type, tuser0_.sex as sex from t_user tuser0_ where (tuser0_.name='Erica') for update
Here Hibernate implements the pessimistic lock mechanism through the database's for update clause.
Hibernate's lock modes include LockMode.NONE, LockMode.READ, LockMode.UPGRADE, LockMode.UPGRADE_NOWAIT, and LockMode.WRITE. Note that the lock must be set before the query starts (that is, before Hibernate generates the SQL); only then can locking actually be carried out through the database's lock mechanism. Otherwise the data would be loaded by a SELECT statement that contains no for update clause, and the so-called database lock would be out of the question.
Compared with pessimistic locking, the optimistic lock mechanism adopts a looser locking approach. Pessimistic locking is in most cases implemented through the database's lock mechanism to guarantee maximum exclusivity of operations, but this brings a large amount of database performance overhead, which is often unbearable for long transactions. Consider a financial system: when an operator reads a user's data and modifies it on the basis of what was read (for example, changing the user's account balance), if a pessimistic lock mechanism is used, the database record stays locked during the whole operation (from the moment the operator reads the data, through modification, until the modified result is committed, possibly even including the time the operator walks away to make coffee). One can imagine the consequences if hundreds or thousands of such operations happen concurrently. The optimistic lock mechanism solves this problem to a certain extent.
Optimistic locking is mostly implemented based on a data version (Version) recording mechanism. What is a data version? A version identifier is added to the data; in a database-table-based version solution, this is generally realized by adding a "version" field to the table. When the data is read, the version number is read along with it; when the data is updated later, the version number is incremented by one. The version of the submitted data is then compared with the current version of the corresponding record in the database table: if the submitted version number is greater than the current version number, the update is performed; otherwise the data is considered stale.
Suppose the account balance in the database is 100 with version 1. Operator A reads the balance and changes it to 50. While A is working, operator B also reads the balance of 100 and changes it to 80. A completes the operation and submits; the version goes from 1 to 2 (plus one) and the balance is changed to 50. Operator B then also submits the record; B's version also becomes 2, with a balance of 80. But now the database finds that the version submitted by B is 2 while the current version is also 2, which does not satisfy the optimistic locking rule that "the submitted version must be greater than the record's current version for the update to execute". Therefore B's submission is rejected. In this way, the possibility of operator B overwriting operator A's result with a modification based on the stale version=1 data is avoided.
The example above shows that the optimistic lock mechanism avoids the database locking overhead of long transactions (neither operator A nor operator B locks the database data during the operation), greatly improving overall system performance under large concurrency. Note that the optimistic lock mechanism is often based on the data storage logic within one system, so it has certain limitations. In the example above, since the optimistic lock mechanism is implemented in our system, balance update operations from external systems are not under our system's control and may therefore write dirty data into the database. At the system design stage, these possibilities should be fully considered and appropriate adjustments made (for example, implement the optimistic locking strategy in a database stored procedure, and expose to the outside only update paths based on that stored procedure, rather than exposing the database table directly). Hibernate has optimistic locking built into its data access engine. If updates to the database by external systems need not be considered, the transparent optimistic locking provided by Hibernate will greatly improve our productivity.
Locking with synchronized refers more to the application level: when multiple threads arrive, they can only access the resource one at a time; this is the synchronized keyword in Java. Locks also exist at two levels: one is the object lock in Java, used for thread synchronization; the other is the database lock. In a distributed system, obviously only the database-side lock can be used.
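The object-lock level mentioned above can be shown with a standard synchronized counter (an illustrative example, not from the patent): without the synchronized keyword the two threads' increments could interleave and lose updates; with it, the threads enter the method one at a time.

```java
// Sketch of Java object-level locking with the synchronized keyword:
// only one thread at a time may run a synchronized method on a given
// instance; other threads block until the lock is released.
public class SynchronizedCounter {
    private long count = 0;

    public synchronized void increment() {
        count++; // read-modify-write, made atomic by the object lock
    }

    public synchronized long get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) counter.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.get()); // 200000
    }
}
```

This is the first lock level (thread synchronization within one JVM); as the text notes, it does not help across processes or machines, where a database-side lock is needed.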
Claims (5)
1. A processing method for concurrent management of project data and individual data items, characterized by: setting the number of threads, wherein the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file; the specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box;
establishing the SuperMap parallel computing environment for data analysis; organizing the processes running independently on each node into a parallel program;
first applying to the resource manager for compute nodes, wherein all nodes assigned to a single business task together constitute a service set; then selecting one process as the master process, with the remaining processes serving as worker processes.
2. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, to organize the processes running independently on each node into a parallel program, the program code is modified so that the main function becomes a function that can be executed by every parallel process.
3. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that the master process maintains the document metadata cache, and each worker process maintains a local document cache and the worker threads and data threads it has opened.
4. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, before the process manager schedules the SuperMap process that executes the business task, the document initialization process is first scheduled and executed, and an execution process is then requested from the process manager.
5. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, in the SuperMap parallel computing environment for text data analysis, organizing the processes running independently on each node into a parallel program, the scheduling and execution of the document initialization process followed by requesting an execution process from the process manager specifically includes: after initialization completes, the process manager waits for one round of the heartbeat communication cycle to learn that some process manager has a free Map/Reduce slot, and that process manager requests an execution process from the process manager;
upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager; the corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910396573.3A CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910396573.3A CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110162346A true CN110162346A (en) | 2019-08-23 |
Family
ID=67634467
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910396573.3A Pending CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110162346A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110659288A (en) * | 2019-09-17 | 2020-01-07 | 中国南方电网有限责任公司 | Case statement calculation method, system, device, computer equipment and storage medium |
-
2019
- 2019-05-14 CN CN201910396573.3A patent/CN110162346A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Plattner et al. | Ganymed: Scalable replication for transactional web applications | |
US7984043B1 (en) | System and method for distributed query processing using configuration-independent query plans | |
Kim et al. | Ermia: Fast memory-optimized database system for heterogeneous workloads | |
EP2932370B1 (en) | System and method for performing a transaction in a massively parallel processing database | |
Palmieri et al. | Aggro: Boosting stm replication via aggressively optimistic transaction processing | |
EP1544753A1 (en) | Partitioned database system | |
Liu et al. | ETLMR: a highly scalable dimensional ETL framework based on MapReduce | |
Plattner et al. | Extending DBMSs with satellite databases | |
CN102054034A (en) | Implementation method for business basic data persistence of enterprise information system | |
CN107977446A (en) | A kind of memory grid data load method based on data partition | |
US7877355B2 (en) | Job scheduling for automatic movement of multidimensional data between live datacubes | |
Johnson et al. | Eliminating unscalable communication in transaction processing | |
CN113220755A (en) | Method for flexibly generating GraphQL interface based on multi-source data | |
Sun et al. | Learned index: A comprehensive experimental evaluation | |
Pandey | Performance benchmarking and comparison of cloud-based databases MongoDB (NoSQL) vs MySQL (Relational) using YCSB | |
CN115904638A (en) | Intelligent management method and system for database affairs | |
US11392576B2 (en) | Distributed pessimistic lock based on HBase storage and the implementation method thereof | |
CN110162346A (en) | A kind of processing method of project data and individual data items concurrent management | |
Azez et al. | JOUM: an indexing methodology for improving join in hive star schema | |
Ni | Comparative evaluation of spark and stratosphere | |
Chen et al. | Parallel XPath query based on cost optimization | |
Li | Introduction to Big Data | |
Ren et al. | GPU-based dynamic hyperspace hash with full concurrency | |
Fan et al. | Scalable transaction processing using functors | |
Fan et al. | 2PC+: A High Performance Protocol for Distributed Transactions of Micro-service Architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20190823 |