CN112631801B - Distributed parallel method for intelligent remote sensing image model - Google Patents

Publication number
CN112631801B
Authority
CN
China
Prior art keywords
metadata
remote sensing
message queue
sensing image
memory message
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011530140.1A
Other languages
Chinese (zh)
Other versions
CN112631801A (en)
Inventor
张翼
曾明勇
吴东
王宇
李祥
朱剑文
陆一峰
张昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-22
Filing date
2020-12-22
Publication date
2022-10-04
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN202011530140.1A
Publication of CN112631801A
Application granted
Publication of CN112631801B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3818 Decoding for concurrent execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distributed parallel method for an intelligent remote sensing image model, comprising the following steps: receiving the file system address of a remote sensing image and a model selection field from a business application system; reading the remote sensing image through an image preprocessing library; serializing the metadata of the full image and the metadata of the slices as JSON and pushing them into an in-memory message queue using a PUSH mechanism; accessing the in-memory message queue using an asynchronous multithreaded competition mechanism with a blocking access rule; serializing the metadata of the detection result as JSON and pushing it into the in-memory message queue; serializing the metadata of the identification result as JSON and pushing it into the in-memory message queue; and finally packaging the metadata of the detection and identification results into a unified query interface. The method can effectively meet the requirements of high throughput, near-real-time computation, and agile parallel model deployment for massive volumes of remote sensing images.

Description

Distributed parallel method for remote sensing image intelligent model
Technical Field
The invention relates to a distributed parallel method for an intelligent remote sensing image model, and belongs to the technical field of remote sensing image processing.
Background
Rapidly and intelligently detecting and identifying valuable targets (for example, forest fire points or cooling towers associated with illegal discharge) in massive volumes of remote sensing image data is of great significance. A single remote sensing image usually contains more than hundreds of millions of pixels, so an image with hundreds of millions or billions of pixels must be split, distributed, and processed in parallel by distributed calls to an intelligent model. The message distribution delay and data transmission delay within the system architecture must be smaller than the model processing delay, and service models must be deployable quickly according to the data volume.
Existing distributed deployment methods for intelligent models are mostly based on gRPC/REST service interfaces. The data transmission path is long, the transmission delay for remote sensing image slices is high, and models cannot be hot-deployed quickly according to data demand, so the system cannot cope with data surges. As a whole, this lowers the distributed real-time processing performance on the model side and the agility of model deployment.
Disclosure of Invention
The invention aims to provide a distributed parallel method for intelligent remote sensing image models that greatly improves system agility, parallel model deployment performance, and system latency when intelligent models are deployed in parallel over large data volumes, and that effectively meets the requirements of high throughput, near-real-time computation, and agile parallel model deployment for massive volumes of remote sensing images.
To achieve this purpose, the invention adopts the following technical scheme: a distributed parallel method for an intelligent remote sensing image model, comprising the following steps:
Step 1: receiving the file system address of a remote sensing image and a model selection field from a business application system, parsing the address, and checking it against a UUID library for comparison and deduplication;
Step 2: reading the remote sensing image through an image preprocessing library, cutting the image into slices and distributing them, and storing the resulting slice files in a small-file distributed file system, a memory queue, or a high-speed SSD file directory based on a PCIe 4.0 interface;
Step 3: serializing the metadata of the remote sensing image (file name, UUID, and image size) and the metadata of each slice (slice name, UUID, slice coordinates, and picture size) as JSON, and pushing the metadata into an in-memory message queue using a PUSH mechanism;
Step 4: the target intelligent detection model Server accesses the in-memory message queue using an asynchronous multithreaded competition mechanism with a blocking access rule; the in-memory message queue orders the requests of the target intelligent detection models in sequence, receives those requests with multiple threads, and responds to them with a single thread;
Step 5: after popping the slice metadata from the in-memory message queue (POP), the target intelligent detection model Server parses the slice metadata, reads the slice file at the corresponding address, and performs the target intelligent detection service; the metadata of the detection result (coarse target type, confidence, and slice name) is serialized as JSON and pushed into the in-memory message queue;
Step 6: the target intelligent identification model Server accesses the in-memory message queue using the asynchronous multithreaded competition mechanism, parses the metadata of the detection result, and performs the target intelligent identification service; the metadata of the identification result (detailed target type, confidence, and slice name) is serialized as JSON and pushed into the in-memory message queue;
Step 7: the target post-processing model Server parses the metadata of the identification result and performs the target post-processing service to obtain the final detection and identification metadata; this service performs non-maximum suppression (NMS) deduplication of target detection boxes, target coordinate conversion, and JSON serialization of the final result, and packages the result into a unified query interface.
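Steps 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: `queue.Queue` stands in for the in-memory message queue, a Python `set` stands in for the UUID library, and the function names, JSON field names, the 1024-pixel tile size, and the choice to derive the image UUID from the file address are all assumptions made for illustration. Pixel data is not read; only the metadata flow is shown.

```python
import json
import queue
import uuid


def image_uuid_for(address):
    """Derive a stable UUID from the file system address (illustrative choice)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, address))


def slice_metadata(address, width, height, tile):
    """Yield metadata for each tile of a width x height image, row by row."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield {
                "type": "slice",
                "slice_name": f"{address}#{x}_{y}",
                "uuid": str(uuid.uuid4()),
                "origin": [x, y],  # slice start coordinate in the full image
                "size": [min(tile, width - x), min(tile, height - y)],
            }


def push_image(mq, seen_uuids, address, width, height, tile=1024):
    """Deduplicate by UUID, then PUSH image and slice metadata as JSON strings.

    Returns the number of slice messages pushed (0 for a duplicate image).
    """
    image_uuid = image_uuid_for(address)
    if image_uuid in seen_uuids:  # step 1: compare against the UUID library
        return 0
    seen_uuids.add(image_uuid)

    mq.put(json.dumps({"type": "image", "file_name": address,
                       "uuid": image_uuid, "size": [width, height]}))
    pushed = 0
    for meta in slice_metadata(address, width, height, tile):
        mq.put(json.dumps(meta))  # step 3: JSON serialization + PUSH
        pushed += 1
    return pushed


mq = queue.Queue()  # in-process stand-in for the in-memory message queue
seen = set()        # stand-in for the UUID library
print(push_image(mq, seen, "/data/scene_001.tif", 2500, 1024))  # 3 slices
print(push_image(mq, seen, "/data/scene_001.tif", 2500, 1024))  # duplicate: 0
```

Only metadata travels through the queue in this sketch; the slice files themselves would live in whichever store step 2 selects, which keeps each message small.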
Due to the application of the technical scheme, compared with the prior art, the invention has the following advantages:
By using a memory-based message queue, the distributed parallel method for intelligent remote sensing image models greatly improves system agility, parallel model deployment performance, and system latency when intelligent models are deployed in parallel over large data volumes such as remote sensing images. It effectively meets the requirements of high throughput, near-real-time computation, and agile parallel model deployment for massive volumes of remote sensing images, and solves the problem that existing intelligent processing models, which are called through gRPC/REST API interfaces, cannot meet the real-time processing requirements of massive remote sensing images.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
Embodiment: the invention provides a distributed parallel method for an intelligent remote sensing image model, comprising the following steps:
Step 1: receiving the file system address of a remote sensing image and a model selection field from a business application system, parsing the address, and checking it against a UUID library for comparison and deduplication;
Step 2: reading the remote sensing image through an image preprocessing library, cutting the image into slices and distributing them, and storing the resulting slice files in a small-file distributed file system, a memory queue, or a high-speed SSD file directory based on a PCIe 4.0 interface;
Step 3: serializing the metadata of the full image (file name, UUID, and picture size) and the metadata of each slice (slice name, UUID, slice start coordinate, and picture size) as JSON, and pushing both into an in-memory message queue using a PUSH mechanism;
Step 4: the target intelligent detection model Server accesses the in-memory message queue using an asynchronous multithreaded competition mechanism with a blocking access rule. Because access is designed around multithreaded competition, the intelligent model Server is decoupled from the data preprocessing side that feeds it, and each intelligent detection or identification Server can be deployed online quickly as required. The in-memory message queue orders the requests of the target intelligent detection models in sequence, receives those requests with multiple threads, and responds to them with a single thread, which ensures the atomicity of data access;
Step 5: after popping the slice metadata from the in-memory message queue (POP), the target intelligent detection model Server parses the slice metadata, reads the slice file at the corresponding address, and performs the target intelligent detection service; the metadata of the detection result (coarse target type, confidence, and slice name) is serialized as JSON and pushed into the in-memory message queue;
Step 6: the target intelligent identification model Server accesses the in-memory message queue using the asynchronous multithreaded competition mechanism, parses the metadata of the detection result, and performs the target intelligent identification service; the metadata of the identification result (detailed target type, confidence, and slice name) is serialized as JSON and pushed into the in-memory message queue. Building the pipeline from two networks, one for detection and one for identification, improves target identification performance;
Step 7: the target post-processing model Server parses the metadata of the identification result and performs the target post-processing service to obtain the final detection and identification metadata; this service performs non-maximum suppression (NMS) deduplication of target detection boxes, target coordinate conversion, and JSON serialization of the final result, and packages the result into a unified query interface.
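The post-processing in step 7 can be sketched as follows. This is a hedged illustration rather than the patent's code: it assumes each result message carries a `box` field in slice-local pixel coordinates (the patent lists only type, confidence, and slice name, while step 7 refers to detection boxes), that slices may overlap so the same target can be reported twice, and that a greedy, class-agnostic NMS with an IoU threshold of 0.5 is acceptable. All function and field names are hypothetical.

```python
import json


def to_global(box, origin):
    """Shift a slice-local box (x1, y1, x2, y2) by the slice start coordinate."""
    x1, y1, x2, y2 = box
    ox, oy = origin
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the most confident box, drop boxes that overlap it."""
    kept = []
    for det in sorted(detections, key=lambda d: d["confidence"], reverse=True):
        if all(iou(det["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


def postprocess(results, slice_origins, iou_threshold=0.5):
    """Convert boxes to full-image coordinates, deduplicate, serialize as JSON."""
    global_dets = [
        {"type": r["type"], "confidence": r["confidence"],
         "box": to_global(r["box"], slice_origins[r["slice_name"]])}
        for r in results
    ]
    return json.dumps(nms(global_dets, iou_threshold))


# Two overlapping slices report the same cooling tower; s1 starts at x = 900.
slice_origins = {"s0": (0, 0), "s1": (900, 0)}
results = [
    {"slice_name": "s0", "type": "cooling_tower", "confidence": 0.9, "box": (950, 100, 1000, 150)},
    {"slice_name": "s1", "type": "cooling_tower", "confidence": 0.8, "box": (50, 100, 100, 150)},
    {"slice_name": "s1", "type": "fire_point", "confidence": 0.7, "box": (300, 300, 340, 340)},
]
final = json.loads(postprocess(results, slice_origins))
print(len(final))  # 2: the duplicate cooling tower is suppressed
```

Coordinate conversion runs before NMS here because boxes from different slices can only be compared once they share the full-image coordinate frame.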
The above embodiment is further explained as follows:
The invention is based on an intelligent-model distributed parallel architecture decoupled by memory queues: the services of the modules in the system are decoupled through in-memory data queues. A memory-based data queue offers lower data transmission delay and higher concurrency, and because each intelligent model is decoupled through the memory-based data queue, service models can be hot-deployed quickly according to task requirements and data volume.
The dual network (detection + identification) integrated into the business process greatly improves target detection and identification accuracy. Each intelligent model interfaces only with its known corresponding memory queue interface; the single atomic competition mechanism avoids the performance cost of locked access to shared resources; service models can be hot-plugged or removed as required; and the utilization of computing resources and the capacity to load models in parallel are greatly improved.
Because the memory queue is built internally on a hash table, message access is extremely fast, with latency far below the model processing delay. Single-thread listening combined with multi-model request competition is adopted on the model request side, which ensures the atomicity of message delivery and the agility of adding models, and achieves efficient parallel processing across billion-pixel image data, computing resources, and network models.
Through the memory-based data queue, data preprocessing and model inference are decoupled asynchronously. Data resource monitoring is added in the data preprocessing stage, so that data flow can be predicted accurately and the number of models loaded in parallel can be allocated reasonably.
The in-memory data queue is designed around a PUSH/POP mechanism: data preprocessing acts as the producer and pushes data into the memory queue, while each model acts as a client and accesses the memory queue through the competition mechanism. The access rule is blocking access, and the queue orders the requests of the models in sequence.
The memory queue receives model requests with multiple threads but responds with a single thread, which ensures the atomicity of data access.
Because a pure memory queue is used, data is read and moved quickly. Requests are answered by single-threaded polling, which reduces thread switching and contention and ensures data atomicity, making the design suitable for distributing large volumes of data; the measured QPS under parallel access is greater than 8 thousand. I/O multiplexing lets a single thread process multiple model requests efficiently, and single-threaded atomic operation improves data processing throughput and the agility of parallel model deployment.
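The producer/consumer pattern described above, with blocking POP and several model workers competing for the same queue, can be sketched in Python. This is an in-process approximation under stated assumptions: `queue.Queue` replaces the networked memory queue, and it guarantees that each message reaches exactly one worker through an internal lock rather than through the single-threaded responder the patent describes. The `detect` and `identify` functions are hypothetical placeholders for the two models, and the worker counts are arbitrary.

```python
import json
import queue
import threading


def run_stage(in_q, out_q, handler, n_workers):
    """Start n_workers threads that compete for messages on in_q."""
    def worker():
        while True:
            msg = in_q.get()   # blocking POP: waits until a message is available
            if msg is None:    # sentinel: shut this worker down
                break
            out_q.put(json.dumps(handler(json.loads(msg))))  # PUSH the result

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    return threads


def stop_stage(in_q, threads):
    """Send one sentinel per worker, then wait for all workers to finish."""
    for _ in threads:
        in_q.put(None)
    for t in threads:
        t.join()


def detect(meta):
    """Hypothetical stand-in for the detection model (coarse type)."""
    return {"slice_name": meta["slice_name"], "coarse_type": "building", "confidence": 0.9}


def identify(det):
    """Hypothetical stand-in for the identification model (detailed type)."""
    return {"slice_name": det["slice_name"], "detail_type": "cooling_tower", "confidence": 0.8}


slice_q, det_q, id_q = queue.Queue(), queue.Queue(), queue.Queue()
detectors = run_stage(slice_q, det_q, detect, n_workers=4)
identifiers = run_stage(det_q, id_q, identify, n_workers=2)

for i in range(20):  # preprocessing acts as the producer
    slice_q.put(json.dumps({"slice_name": f"slice_{i}"}))

stop_stage(slice_q, detectors)    # every slice is detected before detectors exit
stop_stage(det_q, identifiers)    # every detection is identified before exit

names = sorted(json.loads(id_q.get())["slice_name"] for _ in range(20))
print(len(names))  # 20: each slice was processed exactly once
```

Because each stage only knows its input and output queues, a worker can be added or removed without touching the producer, which is the decoupling property the text attributes to the memory queue design.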
When the distributed parallel method for intelligent remote sensing image models is adopted, the memory-based message queue greatly improves system agility, parallel model deployment performance, and system latency when intelligent models are deployed in parallel over large data volumes such as remote sensing images. The method effectively meets the requirements of high throughput, near-real-time computation, and agile parallel model deployment for massive volumes of remote sensing images, and solves the problem that existing intelligent processing models, which are called through gRPC/REST API interfaces, cannot meet the real-time processing requirements of massive remote sensing images.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (1)

1. A distributed parallel method for an intelligent remote sensing image model, characterized by comprising the following steps:
Step 1: receiving the file system address of a remote sensing image and a model selection field from a business application system, parsing the address, and checking it against a UUID library for comparison and deduplication;
Step 2: reading the remote sensing image through an image preprocessing library, cutting the image into slices and distributing them, and storing the resulting slice files in a small-file distributed file system, a memory queue, or a high-speed SSD file directory based on a PCIe 4.0 interface;
Step 3: serializing the metadata of the remote sensing image and the metadata of the slices as JSON, and pushing the metadata into an in-memory message queue using a PUSH mechanism;
Step 4: the target intelligent detection model Server accesses the in-memory message queue using an asynchronous multithreaded competition mechanism with a blocking access rule; the in-memory message queue orders the requests of the target intelligent detection models in sequence, receives those requests with multiple threads, and responds to them with a single thread;
Step 5: after popping the slice metadata from the in-memory message queue (POP), the target intelligent detection model Server parses the slice metadata, reads the slice file at the corresponding address, and performs the target intelligent detection service; the metadata of the detection result obtained by the target intelligent detection service is serialized and pushed into the in-memory message queue;
Step 6: the target intelligent identification model Server accesses the in-memory message queue using the asynchronous multithreaded competition mechanism, parses the metadata of the detection result, and performs the target intelligent identification service; the metadata of the identification result obtained by the target intelligent identification service is serialized as JSON and pushed into the in-memory message queue;
Step 7: the target post-processing model Server parses the metadata of the identification result and performs the target post-processing service to obtain the final detection and identification metadata; this service performs NMS deduplication of target detection boxes, target coordinate conversion, and JSON serialization of the final result, and packages the result into a unified query interface.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011530140.1A CN112631801B (en) 2020-12-22 2020-12-22 Distributed parallel method for intelligent remote sensing image model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011530140.1A CN112631801B (en) 2020-12-22 2020-12-22 Distributed parallel method for intelligent remote sensing image model

Publications (2)

Publication Number Publication Date
CN112631801A (en) 2021-04-09
CN112631801B (en) 2022-10-04

Family

ID=75321026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011530140.1A Active CN112631801B (en) 2020-12-22 2020-12-22 Distributed parallel method for intelligent remote sensing image model

Country Status (1)

Country Link
CN (1) CN112631801B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327900A (en) * 2021-12-30 2022-04-12 四川启睿克科技有限公司 Method for preventing memory leakage by thread call in management double-buffer technology

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050160088A1 (en) * 2001-05-17 2005-07-21 Todd Scallan System and method for metadata-based distribution of content
CN109145155A (en) * 2018-07-09 2019-01-04 中科遥感科技集团有限公司 High-concurrency warehousing processing method for mass remote sensing image metadata

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
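"An Efficient Ring-Based Metadata Management Policy for Large-Scale Distributed File Systems"; Yuanning Gao et al.; IEEE Transactions on Parallel and Distributed Systems; 2019-09-01; full text *
"A Distributed Geographic Metadata Synchronization Mechanism" (一种分布式地理元数据同步机制); 罗显刚 et al.; Earth Science (Journal of China University of Geosciences) (地球科学(中国地质大学学报)); 2010-05-31; full text *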

Also Published As

Publication number Publication date
CN112631801A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
CN109800222B (en) HBase secondary index self-adaptive optimization method and system
CN108009236B (en) Big data query method, system, computer and storage medium
US20180285167A1 (en) Database management system providing local balancing within individual cluster node
CN110505495B (en) Multimedia resource frame extraction method, device, server and storage medium
CN110147470B (en) Cross-machine-room data comparison system and method
CN111447102A (en) SDN network device access method and device, computer device and storage medium
CN109271363B (en) File storage method and device
CN108228709B (en) Data storage method and system, electronic device, program, and medium
CN112130996A (en) Data monitoring control system, method and device, electronic equipment and storage medium
CN110708256A (en) CDN scheduling method, device, network equipment and storage medium
CN112631801B (en) Distributed parallel method for intelligent remote sensing image model
CN111562889B (en) Data processing method, device, system and storage medium
CN113900810A (en) Distributed graph processing method, system and storage medium
CN107346270B (en) Method and system for real-time computation based radix estimation
CN213876703U (en) Resource pool management system
CN106528051B (en) The method of big data queue stack manipulation based on MongoDB
US9703788B1 (en) Distributed metadata in a high performance computing environment
CN113312345A (en) Kubernetes and Ceph combined remote sensing data storage system, storage method and retrieval method
US9069821B2 (en) Method of processing files in storage system and data server using the method
CN113656370B (en) Data processing method and device for electric power measurement system and computer equipment
CN110011845A (en) Log collection method and system
CN115964418A (en) Multi-source heterogeneous data access system and method for Internet of things
CN113872814A (en) Information processing method, device and system for content distribution network
CN113010373B (en) Data monitoring method and device, electronic equipment and storage medium
CN111813542B (en) Load balancing method and device for parallel processing of large-scale graph analysis task

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant