CN110162346A - Processing method for concurrent management of project data and individual data items - Google Patents
Processing method for concurrent management of project data and individual data items
- Publication number
- CN110162346A CN110162346A CN201910396573.3A CN201910396573A CN110162346A CN 110162346 A CN110162346 A CN 110162346A CN 201910396573 A CN201910396573 A CN 201910396573A CN 110162346 A CN110162346 A CN 110162346A
- Authority
- CN
- China
- Prior art keywords
- concurrent
- manager
- data
- processing method
- threads
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
- G06F9/4451—User profiles; Roaming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A processing method for concurrent management of project data and individual data items. Setting the number of threads: the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file. The specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box. Establish the SuperMap parallel computing environment for data analysis; organize the processes running independently on each node into a parallel program. First, apply to the resource manager for compute nodes; all nodes assigned to a single business task together constitute a service set. Then select one process as the master process, with the remaining processes serving as worker processes.
Description
Technical field
The present invention relates to big data processing and parallel computing methods.
Background art
With the development of web service technologies, the amount of data acquired keeps growing, and data processing and analysis in the services field consume more and more time. Traditional data processing and serial computing techniques therefore struggle to meet the demand for efficient processing of big data. SuperMap supports parallel computing, which effectively improves the efficiency of big data processing.
The principle of parallel computing: parallel computing is the process of decomposing a task into several smaller tasks that are solved cooperatively at the same time; it is an effective way to enhance problem-solving capability and improve performance. Parallel computing can be realized in several ways, including multi-process, multi-threading, and others; SuperMap implements parallel computing through multi-threading. It can make full and efficient use of multi-core computing resources to reduce the solution time of a single problem at lower cost, and can meet the demands of larger-scale or higher-precision problem solving. Consider the two computing modes, serial and parallel: when a task is divided into three subtasks A, B, and C, serial execution must run the three subtasks one after another, while multi-threaded parallel execution can run the three subtasks simultaneously on three threads.
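The serial-versus-parallel comparison above can be sketched in a few lines of Java. This is a minimal illustration, not SuperMap code; the subtask body is a stand-in for real work.

```java
// A minimal sketch (not SuperMap code) of the comparison above: three
// subtasks A, B, C executed one after another on a single thread,
// versus simultaneously on three threads.
public class SubtaskDemo {
    // Stand-in for real work such as a raster analysis step.
    static void subtask(String name) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Serial: total time is roughly the sum of the three subtasks.
    public static void runSerially() {
        subtask("A");
        subtask("B");
        subtask("C");
    }

    // Parallel: total time is roughly the longest single subtask.
    public static void runInParallel() throws InterruptedException {
        Thread a = new Thread(() -> subtask("A"));
        Thread b = new Thread(() -> subtask("B"));
        Thread c = new Thread(() -> subtask("C"));
        a.start(); b.start(); c.start();
        a.join(); b.join(); c.join();
    }

    public static void main(String[] args) throws InterruptedException {
        long t0 = System.nanoTime();
        runSerially();
        long serialMs = (System.nanoTime() - t0) / 1_000_000;
        t0 = System.nanoTime();
        runInParallel();
        long parallelMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("serial: " + serialMs + " ms, parallel: " + parallelMs + " ms");
    }
}
```

With three equal subtasks, the parallel run takes roughly one-third of the serial time, matching the text's claim that the three threads execute the subtasks simultaneously.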
With parallel computing enabled, consider the CPU usage when an "extract isolines" analysis is executed on an ordinary four-core computer. With 1 analysis thread, CPU utilization is low and only one CPU core participates in the computation; when the number of parallel threads is set to 4, all four CPU cores participate and CPU utilization can reach as high as 100%.
An example comparing the running time of multi-threaded and single-threaded computation when generating a three-dimensional hillshade map. The data used in this example is a DEM of a certain area with 15000*20000 rows and columns and a data size of 884 MB. Running the three-dimensional hillshade operation on it takes 80 seconds with single-threaded SuperMap iDesktop 8C, while the same operation completes in only 15 seconds with parallel computing. This example shows that parallel computing can cut the processing time of the same data by a factor of 3-5, greatly reducing time cost and improving analysis performance and working efficiency. Comparison of parallel computing and single-threaded time consumption: analysis performance improves more than tenfold, overall performance improves fivefold, and the running time of parallel computing is about 15% of that of a single thread.
Functions supporting parallel computing. Currently, the SuperMap functions that support parallel computing include: grid analysis, hydrological analysis, network analysis, topology preprocessing, overlay analysis, spatial query, and so on.
Grid analysis: the functions in the grid analysis module that support parallel computing include interpolation analysis, extracting isolines, extracting isosurfaces, slope analysis, aspect analysis, grid cut-and-fill, region cut-and-fill, inverse cut-and-fill, surface area calculation, surface volume calculation, finding extreme values, generating three-dimensional hillshade maps, generating orthorectified 3D images, single-point visibility, multi-point viewshed analysis, grid resampling, grid reclassification, grid aggregation, and so on.
Overlay analysis: line and region overlay analysis, clipping, erasing, union, intersection (including line-region intersection), identity, symmetric difference, and update all support parallel computing.
Spatial query: contains and intersects queries on region objects support parallel computing.
The object of the present invention is to propose a processing method for concurrent management of project data and individual data items. Setting the number of threads: the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file. The specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box.
A node in the system configuration file SuperMap.xml specifies the number of threads; its initial value is 2. SuperMap.xml is located in the Bin folder under the component product installation directory. To set the number of threads to 4, change the value of that node in the configuration file to 4.
Establish the SuperMap parallel computing environment for data analysis; organize the processes running independently on each node into a parallel program.
First, apply to the resource manager for compute nodes; all nodes assigned to a single business task together constitute a service set. Select one process as the master process, with the remaining processes serving as worker processes.
To organize the processes running independently on each node into a parallel program, modify the program code so that the main function becomes a function that can be executed by every parallel process.
The master process maintains the document metadata cache; each worker process maintains a local document cache and the worker threads and data threads it has opened.
Before the process manager schedules the SuperMap process that executes the business task, it first schedules and executes the document initialization process, and then requests an execution process from the process manager.
The SuperMap parallel computing environment for text data analysis; organizing the processes running independently on each node into a parallel program: the scheduling and execution of the document initialization process, followed by requesting an execution process from the process manager, specifically includes:
After initialization completes, the process manager waits for one round of the heartbeat communication cycle, in order to learn that some process manager has a free Map/Reduce slot, and that process manager requests an execution process from the process manager.
Upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager. The corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
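The heartbeat-driven dispatch described above can be illustrated with a toy model. All names and data structures here are hypothetical assumptions for illustration only, not the patent's implementation: a manager holds a queue of pending tasks and hands the document initialization task to the first node whose heartbeat reports a free slot.

```java
// A simplified, hypothetical illustration (names and structure are
// assumptions, not the patent's implementation) of heartbeat-driven
// scheduling: the manager dispatches the document initialization task
// to the first node whose heartbeat reports a free slot.
import java.util.ArrayDeque;
import java.util.Queue;

public class HeartbeatScheduler {
    public static class Worker {
        public final String id;
        public int freeSlots;
        public String assignedTask;

        public Worker(String id, int freeSlots) {
            this.id = id;
            this.freeSlots = freeSlots;
        }
    }

    // Handle one heartbeat: if the reporting node has a free slot and a
    // task is pending, assign the task to that node and return it.
    public static String onHeartbeat(Worker w, Queue<String> pending) {
        if (w.freeSlots > 0 && !pending.isEmpty()) {
            w.assignedTask = pending.poll();
            w.freeSlots--;
            return w.assignedTask;
        }
        return null; // nothing dispatched on this heartbeat
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("document-initialization");
        Worker busy = new Worker("node-1", 0);
        Worker idle = new Worker("node-2", 1);
        // The busy node's heartbeat yields no assignment; the idle
        // node receives the task on its next heartbeat.
        System.out.println(onHeartbeat(busy, pending));
        System.out.println(onHeartbeat(idle, pending));
    }
}
```

This mirrors the waiting step in the text: the manager cannot dispatch until some heartbeat reports a free slot, so dispatch latency is bounded by the heartbeat period.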
When the application starts, it preferentially reads the thread count from the configuration file. If the thread count is modified at "parallel computing threads number", it takes effect immediately and the value in the configuration file is updated automatically. The thread count in the configuration file, however, is read only once at application startup; after manually modifying the configuration file, the application must be restarted for the change to take effect. The valid range of the thread count is 1-16. If the thread count in the configuration file is out of range, the setting is invalid and the default value 2 is used; if the value set at "parallel computing threads number" is greater than 16, it is automatically adjusted to 16. So how should a reasonable thread count be chosen? The following can serve as a reference:
1. The specified threads are distributed among all cores of the computer's processor. When the thread count equals the total number of processor cores, all cores participate in the computation and the computing resources of the machine are fully utilized.
2. When the thread count exceeds the number of cores, thread scheduling and load balancing may take extra time; even if the analysis computation time decreases further, the overall performance gain may be insignificant. Doing so is therefore not recommended.
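The thread-count rules above can be captured in a small helper. This is a sketch of the described behavior, not the SuperMap API; the class and method names are invented for illustration (the minimum-clamp in the dialog path is an assumption, since the text only specifies clamping values above 16).

```java
// Sketch (hypothetical helper, not SuperMap API) of the thread-count
// rules described above: config-file values outside 1-16 fall back to
// the default of 2, while dialog values above 16 are clamped to 16.
public class ThreadCountPolicy {
    static final int DEFAULT_THREADS = 2;
    static final int MIN_THREADS = 1;
    static final int MAX_THREADS = 16;

    // Value read from SuperMap.xml: out-of-range settings are invalid
    // and replaced by the default.
    public static int fromConfigFile(int value) {
        return (value < MIN_THREADS || value > MAX_THREADS) ? DEFAULT_THREADS : value;
    }

    // Value entered at "parallel computing threads number": clamped
    // into the valid range (lower clamp is an assumption).
    public static int fromDialog(int value) {
        return Math.max(MIN_THREADS, Math.min(MAX_THREADS, value));
    }

    public static void main(String[] args) {
        System.out.println(fromConfigFile(4));   // valid, kept as-is
        System.out.println(fromConfigFile(32));  // out of range, default 2
        System.out.println(fromDialog(32));      // clamped to 16
    }
}
```

Per guideline 1 above, a caller would typically pass `Runtime.getRuntime().availableProcessors()` to match the thread count to the core count.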
Geospatial analysis has characteristics such as complex algorithm logic and large data scale; it is a compute-intensive and data-intensive workload. Parallel computing can make full use of multi-core computing resources, substantially reducing analysis time and improving performance. Parallel computing provides strong support for the realization of big data processing.
The processing method for concurrent management of project data and individual data items must take into account concurrent access to the related project data and individual data items. Concurrency is a difficult problem, and the usual requirements of concurrency and synchronization must be met.
Beneficial effects: regarding asynchrony and synchronization in the concurrent management of project data and individual data items in the present invention: synchronization means that while one thread executes a method or function, other threads are blocked and must wait for it to finish before they can continue. Asynchrony means that multiple threads do not block one another and execute simultaneously. Put simply, synchronization means doing things one at a time, while asynchrony means that doing one thing does not prevent doing others; multiple threads run asynchronously.
Specific embodiment:
The SuperMap parallel computing environment for text data analysis; organizing the processes running independently on each node into a parallel program: the scheduling and execution of the document initialization process, followed by requesting an execution process from the process manager, specifically includes:
After initialization completes, the process manager waits for one round of the heartbeat communication cycle, in order to learn that some process manager has a free SuperMap slot, and that process manager requests an execution process from the process manager.
Upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager. The corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
Concurrency and synchronization problems are handled mainly through lock mechanisms. Pessimistic locking: as the name suggests, it takes a conservative attitude toward data being modified by the outside world (including other current transactions of this system, as well as transaction processing from external systems). Therefore, the data is kept locked during the whole data handling process. The implementation of pessimistic locking usually relies on the lock mechanism provided by the database (only a lock mechanism provided at the database layer can truly guarantee the exclusivity of data access; otherwise, even if a locking mechanism is implemented in the present system, there is no guarantee that an external system will not modify the data). A typical pessimistic lock call relying on the database:
SELECT * FROM account WHERE name = 'Erica' FOR UPDATE
This SQL statement locks all records in the account table that meet the search condition (name = 'Erica'). Before the transaction commits (the lock is released when the transaction commits), the outside world cannot modify these records. Hibernate's pessimistic locking is likewise implemented on top of the database lock mechanism.
The following code locks the queried records:
String hqlStr = "from TUser as user where user.name = 'Erica'";
Query query = session.createQuery(hqlStr);
query.setLockMode("user", LockMode.UPGRADE); // lock
List userList = query.list(); // execute the query and fetch the data
Observe the SQL statement generated by Hibernate at runtime:
select tuser0_.id as id, tuser0_.name as name, tuser0_.group_id as group_id, tuser0_.user_type as user_type, tuser0_.sex as sex from t_user tuser0_ where (tuser0_.name='Erica') for update
Here Hibernate implements the pessimistic lock mechanism through the database's for update clause.
Hibernate's lock modes include LockMode.NONE, LockMode.READ, LockMode.UPGRADE, LockMode.UPGRADE_NOWAIT, and LockMode.WRITE. Note that the lock must be set before the query starts (that is, before Hibernate generates the SQL); only then can locking actually be carried out through the database's lock mechanism. Otherwise the data would be loaded by a SELECT statement that contains no for update clause, and the so-called database lock would be out of the question.
Compared with pessimistic locking, the optimistic lock mechanism adopts a looser locking approach. Pessimistic locking is in most cases implemented through the database's lock mechanism to guarantee maximum exclusivity of operations, but this brings a large amount of database performance overhead, which is often unbearable for long transactions. Consider a financial system: when an operator reads a user's data and modifies it on the basis of what was read (for example, changing the user's account balance), if a pessimistic lock mechanism is used, the database record stays locked during the whole operation (from the moment the operator reads the data, through modification, until the modified result is committed, possibly even including the time the operator walks away to make coffee). One can imagine the consequences if hundreds or thousands of such operations happen concurrently. The optimistic lock mechanism solves this problem to a certain extent.
Optimistic locking is mostly implemented based on a data version (Version) recording mechanism. What is a data version? A version identifier is added to the data; in a database-table-based version solution, this is generally realized by adding a "version" field to the table. When the data is read, the version number is read along with it; when the data is updated later, the version number is incremented by one. The version of the submitted data is then compared with the current version of the corresponding record in the database table: if the submitted version number is greater than the current version number, the update is performed; otherwise the data is considered stale.
Suppose the account balance in the database is 100 with version 1. Operator A reads the balance and changes it to 50. While A is working, operator B also reads the balance of 100 and changes it to 80. A completes the operation and submits; the version goes from 1 to 2 (plus one) and the balance is changed to 50. Operator B then also submits the record; B's version also becomes 2, with a balance of 80. But now the database finds that the version submitted by B is 2 while the current version is also 2, which does not satisfy the optimistic locking rule that "the submitted version must be greater than the record's current version for the update to execute". Therefore B's submission is rejected. In this way, the possibility of operator B overwriting operator A's result with a modification based on the stale version=1 data is avoided.
The example above shows that the optimistic lock mechanism avoids the database locking overhead of long transactions (neither operator A nor operator B locks the database data during the operation), greatly improving overall system performance under large concurrency. Note that the optimistic lock mechanism is often based on the data storage logic within one system, so it has certain limitations. In the example above, since the optimistic lock mechanism is implemented in our system, balance update operations from external systems are not under our system's control and may therefore write dirty data into the database. At the system design stage, these possibilities should be fully considered and appropriate adjustments made (for example, implement the optimistic locking strategy in a database stored procedure, and expose to the outside only update paths based on that stored procedure, rather than exposing the database table directly). Hibernate has optimistic locking built into its data access engine. If updates to the database by external systems need not be considered, the transparent optimistic locking provided by Hibernate will greatly improve our productivity.
Locking with synchronized refers more to the application level: when multiple threads arrive, they can only access the resource one at a time; this is the synchronized keyword in Java. Locks also exist at two levels: one is the object lock in Java, used for thread synchronization; the other is the database lock. In a distributed system, obviously only the database-side lock can be used.
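The object-lock level mentioned above can be shown with a standard synchronized counter (an illustrative example, not from the patent): without the synchronized keyword the two threads' increments could interleave and lose updates; with it, the threads enter the method one at a time.

```java
// Sketch of Java object-level locking with the synchronized keyword:
// only one thread at a time may run a synchronized method on a given
// instance; other threads block until the lock is released.
public class SynchronizedCounter {
    private long count = 0;

    public synchronized void increment() {
        count++; // read-modify-write, made atomic by the object lock
    }

    public synchronized long get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) counter.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter.get()); // 200000
    }
}
```

This is the first lock level (thread synchronization within one JVM); as the text notes, it does not help across processes or machines, where a database-side lock is needed.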
Claims (5)
1. A processing method for concurrent management of project data and individual data items, characterized by: setting the number of threads, wherein the number of threads can be set in two ways, either directly in the "environment" dialog box, or by modifying the configuration file; the specific setup is as follows: 1) click the "file" button, select "option" in the menu, and set the "parallel computing threads number" directly on the "environment" settings page of the pop-up "SuperMap iDesktop 8C option" dialog box;
establishing the SuperMap parallel computing environment for data analysis; organizing the processes running independently on each node into a parallel program;
first applying to the resource manager for compute nodes, wherein all nodes assigned to a single business task together constitute a service set; then selecting one process as the master process, with the remaining processes serving as worker processes.
2. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, to organize the processes running independently on each node into a parallel program, the program code is modified so that the main function becomes a function that can be executed by every parallel process.
3. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that the master process maintains the document metadata cache, and each worker process maintains a local document cache and the worker threads and data threads it has opened.
4. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, before the process manager schedules the SuperMap process that executes the business task, the document initialization process is first scheduled and executed, and an execution process is then requested from the process manager.
5. The processing method for concurrent management of project data and individual data items according to claim 1, characterized in that, in the SuperMap parallel computing environment for text data analysis, organizing the processes running independently on each node into a parallel program, the scheduling and execution of the document initialization process followed by requesting an execution process from the process manager specifically includes: after initialization completes, the process manager waits for one round of the heartbeat communication cycle to learn that some process manager has a free Map/Reduce slot, and that process manager requests an execution process from the process manager;
upon receiving the heartbeat message, the process manager schedules the document initialization process to run on that process manager; the corresponding process manager is responsible for executing the document initialization process and, during execution, reports the process state to the process manager through periodic heartbeat communication until the process completes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910396573.3A CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910396573.3A CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110162346A true CN110162346A (en) | 2019-08-23 |
Family
ID=67634467
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910396573.3A Pending CN110162346A (en) | 2019-05-14 | 2019-05-14 | A kind of processing method of project data and individual data items concurrent management |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110162346A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110659288A (en) * | 2019-09-17 | 2020-01-07 | 中国南方电网有限责任公司 | Case statement calculation method, system, device, computer equipment and storage medium |
-
2019
- 2019-05-14 CN CN201910396573.3A patent/CN110162346A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Plattner et al. | Ganymed: Scalable replication for transactional web applications | |
US7984043B1 (en) | System and method for distributed query processing using configuration-independent query plans | |
Kim et al. | Ermia: Fast memory-optimized database system for heterogeneous workloads | |
EP2932370B1 (en) | System and method for performing a transaction in a massively parallel processing database | |
Palmieri et al. | Aggro: Boosting stm replication via aggressively optimistic transaction processing | |
EP1544753A1 (en) | Partitioned database system | |
Liu et al. | ETLMR: a highly scalable dimensional ETL framework based on MapReduce | |
Plattner et al. | Extending DBMSs with satellite databases | |
CN102054034A (en) | Implementation method for business basic data persistence of enterprise information system | |
CN107977446A (en) | A kind of memory grid data load method based on data partition | |
US7877355B2 (en) | Job scheduling for automatic movement of multidimensional data between live datacubes | |
Johnson et al. | Eliminating unscalable communication in transaction processing | |
CN113220755A (en) | Method for flexibly generating GraphQL interface based on multi-source data | |
Sun et al. | Learned index: A comprehensive experimental evaluation | |
Pandey | Performance benchmarking and comparison of cloud-based databases MongoDB (NoSQL) vs MySQL (Relational) using YCSB | |
CN115904638A (en) | Intelligent management method and system for database affairs | |
US11392576B2 (en) | Distributed pessimistic lock based on HBase storage and the implementation method thereof | |
CN110162346A (en) | A kind of processing method of project data and individual data items concurrent management | |
Azez et al. | JOUM: an indexing methodology for improving join in hive star schema | |
Ni | Comparative evaluation of spark and stratosphere | |
Chen et al. | Parallel XPath query based on cost optimization | |
Li | Introduction to Big Data | |
Ren et al. | GPU-based dynamic hyperspace hash with full concurrency | |
Fan et al. | Scalable transaction processing using functors | |
Fan et al. | 2PC+: A High Performance Protocol for Distributed Transactions of Micro-service Architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20190823 |