CN112631801B

CN112631801B - Distributed parallel method for intelligent remote sensing image model

Info

Publication number: CN112631801B
Application number: CN202011530140.1A
Authority: CN
Inventors: 张翼; 曾明勇; 吴东; 王宇; 李祥; 朱剑文; 陆一峰; 张昆
Original assignee: Wuxi Jiangnan Computing Technology Institute
Current assignee: Wuxi Jiangnan Computing Technology Institute
Priority date: 2020-12-22
Filing date: 2020-12-22
Publication date: 2022-10-04
Anticipated expiration: 2040-12-22
Also published as: CN112631801A

Abstract

The invention discloses a distributed parallel method for an intelligent remote sensing image model, comprising the following steps: receiving a file system address and a model selection field of a remote sensing image from a business application system; reading the remote sensing image through an image preprocessing library; serializing the metadata of the full image and the metadata of the slices through JSON and pushing both into a memory message queue using a PUSH mechanism; accessing the memory message queue using an asynchronous multithreaded competition mechanism with blocking access rules; serializing the metadata of the detection result through JSON and PUSHing it into the memory message queue; serializing the metadata of the identification result through JSON and PUSHing it into the memory message queue; and finally packaging the detection and identification metadata into a unified query interface. The method can effectively meet the requirements of high throughput, quasi-real-time computation and agile parallel model deployment for massive remote sensing images.

Description

Distributed parallel method for remote sensing image intelligent model

Technical Field

The invention relates to a distributed parallel method for an intelligent remote sensing image model, and belongs to the technical field of remote sensing image processing.

Background

Rapidly and intelligently detecting and identifying valuable targets (such as forest fire points or cooling towers with illegal discharge) in massive remote sensing image data is of great significance. A single remote sensing image usually contains more than hundreds of millions of pixels, so an image with hundreds of millions or billions of pixels must be split, distributed and processed in parallel by distributed intelligent models. The message distribution delay and the data transmission delay in the system architecture must be smaller than the model processing delay, and service models must be deployable rapidly according to the data volume.

Existing distributed deployment methods for intelligent models are mostly based on gRPC/REST service interfaces. The data transmission link is long and the transmission delay of remote sensing image slices is high; models cannot be hot-deployed quickly according to data demand and cannot cope with data floods, which lowers both the distributed real-time processing performance of the intelligent model side and the agility of model deployment as a whole.

Disclosure of Invention

The invention aims to provide a distributed parallel method for intelligent remote sensing image models that greatly improves system agility and parallel model deployment performance and reduces system delay when processing large volumes of remote sensing image data, effectively meeting the requirements of high throughput, quasi-real-time computation and agile parallel model deployment for massive remote sensing images.

To achieve this purpose, the invention adopts the following technical scheme: a distributed parallel method for an intelligent remote sensing image model, comprising the following steps:

step 1: receiving a file system address and a model selection field of a remote sensing image from a business application system, parsing the address, and checking it against a UUID library for comparison and deduplication;

step 2: reading the remote sensing image through an image preprocessing library, slicing the image and distributing the slices, and storing the resulting slice files in a small-file distributed file system, a memory queue, or a high-speed SSD file directory attached through a PCIe 4.0 interface;

step 3: serializing the metadata of the remote sensing image (file name, UUID and image size) and the metadata of the slices (slice name, UUID, slice coordinates and picture size) through JSON, and pushing the metadata into a memory message queue using a PUSH mechanism;

step 4: the Server side of the target intelligent detection model accesses the memory message queue using an asynchronous multithreaded competition mechanism with blocking access rules; the memory message queue orders the requests of the target intelligent detection models sequentially, receiving the requests with multiple threads and answering them with a single thread;

step 5: after POPping the slice metadata from the memory message queue, the Server side of the target intelligent detection model parses the slice metadata, reads the slice file at the corresponding address and performs the target intelligent detection service; the metadata of the detection result (coarse target type, confidence and slice name) is serialized through JSON and PUSHed into the memory message queue;

step 6: the Server side of the target intelligent identification model accesses the memory message queue using an asynchronous multithreaded competition mechanism, parses the metadata of the detection result and performs the target intelligent identification service; the metadata of the identification result (detailed target type, confidence and slice name) is serialized through JSON and PUSHed into the memory message queue;

step 7: the target post-processing model server parses the metadata of the identification result and performs the target post-processing service to obtain the final detection and identification metadata; it performs NMS deduplication of the target detection boxes, target coordinate conversion and JSON serialization of the final result, and packages the metadata into a unified query interface.
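Steps 1 to 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tile size, the JSON field names, the example path, and the helper names `make_slices` and `publish` are hypothetical; a plain Python set stands in for the UUID library; an in-process `queue.Queue` stands in for the memory message queue; and each slice record simply carries the UUID of its parent image.

```python
import json
import queue
import uuid

TILE = 1024  # hypothetical slice edge length in pixels


def make_slices(width, height, tile=TILE):
    """Yield (x0, y0, w, h) for each slice covering a width x height image."""
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            yield x0, y0, min(tile, width - x0), min(tile, height - y0)


def publish(path, width, height, seen, mq, tile=TILE):
    """Deduplicate by an address-derived UUID, then PUSH JSON metadata to mq."""
    image_id = str(uuid.uuid5(uuid.NAMESPACE_URL, path))  # stable per address
    if image_id in seen:  # step 1: compare against the UUID library
        return 0
    seen.add(image_id)
    image_meta = {"type": "image", "file_name": path, "uuid": image_id,
                  "size": [width, height]}
    mq.put(json.dumps(image_meta))  # step 3: full-image metadata
    count = 0
    for x0, y0, w, h in make_slices(width, height, tile):  # step 2: slicing
        slice_meta = {
            "type": "slice",
            "slice_name": f"{image_id}_{x0}_{y0}.tif",  # hypothetical naming
            "uuid": image_id,
            "origin": [x0, y0],
            "size": [w, h],
        }
        mq.put(json.dumps(slice_meta))  # step 3: PUSH serialized slice metadata
        count += 1
    return count


seen_ids = set()
mq = queue.Queue()
n_first = publish("/data/scene_001.tif", 2500, 2048, seen_ids, mq)
n_again = publish("/data/scene_001.tif", 2500, 2048, seen_ids, mq)
```

Publishing the same address a second time pushes nothing, because its UUID is already recorded. Only metadata travels through the queue; the slice files themselves would sit in one of the storage back ends named in step 2.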

Due to the application of the technical scheme, compared with the prior art, the invention has the following advantages:

the distributed parallel method for remote sensing image intelligent models of the invention uses a memory-based message queue to greatly improve system agility and parallel model deployment performance and to reduce system delay when intelligent models that process large data volumes, such as remote sensing images, are deployed in parallel. It effectively meets the requirements of high throughput, quasi-real-time computation and agile parallel model deployment for massive remote sensing images, and solves the problem that existing intelligent processing models invoked through gRPC/REST API interfaces cannot meet the real-time processing requirements of massive remote sensing images.

Drawings

FIG. 1 is a schematic flow chart of the present invention.

Detailed Description

Example: the invention provides a distributed parallel method for an intelligent remote sensing image model, comprising the following steps:

step 1: receiving a file system address and a model selection field of a remote sensing image from a business application system, parsing the address, and checking it against a UUID library for comparison and deduplication;

step 3: serializing the metadata of the full image (file name, UUID and picture size) and the metadata of the slices (slice name, UUID, slice starting coordinates and picture size) through JSON, and pushing both into a memory message queue using a PUSH mechanism;

step 4: the Server side of the target intelligent detection model accesses the memory message queue using an asynchronous multithreaded competition mechanism with blocking access rules; because access is designed around multithreaded competition, the intelligent model Server side is decoupled from the data ingestion and preprocessing side, and each intelligent detection or identification Server can be quickly deployed online as required; the memory message queue orders the requests of the target intelligent detection models sequentially, receiving the requests with multiple threads and answering them with a single thread, thereby ensuring the atomicity of data access;

step 5: after POPping the slice metadata from the memory message queue, the Server side of the target intelligent detection model parses the slice metadata, reads the slice file at the corresponding address and performs the target intelligent detection service; the metadata of the detection result (coarse target type, confidence and slice name) is serialized through JSON and PUSHed into the memory message queue;

step 6: the Server side of the target intelligent identification model accesses the memory message queue using an asynchronous multithreaded competition mechanism, parses the metadata of the detection result and performs the target intelligent identification service; the metadata of the identification result (detailed target type, confidence and slice name) is serialized through JSON and PUSHed into the memory message queue; building the system as a dual network of detection and identification improves target identification performance;

step 7: the target post-processing model server parses the metadata of the identification result and performs the target post-processing service to obtain the final detection and identification metadata; it performs NMS deduplication of the target detection boxes, target coordinate conversion and JSON serialization of the final result, and packages the metadata into a unified query interface.
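The post-processing in step 7 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the box format `(x1, y1, x2, y2)`, the field names, the IoU threshold of 0.5, the overlapping slice origins and the helper names are all hypothetical, and the NMS shown is the common greedy variant without per-class handling. The sketch converts slice coordinates to full-image coordinates before NMS, so that the same target reported by two overlapping slices can be compared in one coordinate frame.

```python
import json


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def to_image_coords(box, origin):
    """Shift a slice-local box by the slice origin in the full image."""
    x0, y0 = origin
    return (box[0] + x0, box[1] + y0, box[2] + x0, box[3] + y0)


def nms(dets, thresh=0.5):
    """Greedy NMS: keep the highest-confidence boxes, drop overlapping ones."""
    kept = []
    for det in sorted(dets, key=lambda d: d["confidence"], reverse=True):
        if all(iou(det["box"], k["box"]) <= thresh for k in kept):
            kept.append(det)
    return kept


# Hypothetical identification results; the first two describe one target
# seen in two overlapping slices.
detections = [
    {"slice_origin": (0, 0), "box": (900, 100, 1000, 200), "confidence": 0.9},
    {"slice_origin": (896, 0), "box": (6, 102, 104, 198), "confidence": 0.8},
    {"slice_origin": (896, 0), "box": (500, 500, 560, 560), "confidence": 0.7},
]
in_image = [
    {"box": to_image_coords(d["box"], d["slice_origin"]),
     "confidence": d["confidence"]}
    for d in detections
]
final = nms(in_image)
payload = json.dumps(final)  # JSON serialization of the final result
```

Here the second detection is suppressed because it overlaps the first almost entirely once both are expressed in full-image coordinates, leaving two final targets in `payload`.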

The above embodiments are further explained as follows:

the invention is based on an intelligent-model distributed parallel architecture decoupled by memory queues, in which the services of the modules in the system are decoupled through in-memory data queues; a memory-based data queue has lower data transmission delay and higher concurrency, and because each intelligent model is decoupled as a service through the memory-based data queue, service models can be hot-deployed quickly according to task requirements and data volume;

the dual network (detection + identification) integrated into the business process greatly improves target detection and identification accuracy; each intelligent model interfaces only with its known corresponding memory queue; the single atomic competition mechanism avoids the performance loss of lock-based access to shared resources; business models can be hot-plugged in or out as required, which greatly improves the utilization of computing resources and the parallel loading capacity of the models;

because the memory queue is built internally on a hash table, message access is extremely fast, with a delay far lower than the model processing delay; single-threaded listening combined with a multi-model request competition mechanism is used on the model request side, which ensures the atomicity of message transmission and the agility of adding models, and achieves efficient parallel processing of billion-pixel image data, computing resources and network models;

through the memory-based data queue, data preprocessing and model inference are asynchronously decoupled; data resource monitoring is added in the data preprocessing stage, so that the data flow can be predicted accurately and the number of models loaded in parallel can be allocated reasonably;

the in-memory data queue uses a PUSH/POP mechanism: data preprocessing acts as the producer and pushes data into the memory queue, and each model acts as a client that accesses the memory queue through a competition mechanism; the access rule is blocking access, and the queue orders the requests of the models sequentially;

the memory queue receives model requests with multiple threads but answers them with a single thread, thereby ensuring the atomicity of data access;

because a pure memory queue is used, data is read and moved quickly; requests are answered by single-threaded polling, which reduces thread switching and contention and guarantees data atomicity, making the method suitable for data distribution requests with large data volumes; the measured QPS under parallel access exceeds 8 thousand; I/O multiplexing allows a single thread to process multiple model requests efficiently, and single-threaded atomic operation improves data processing throughput and the agility of parallel model deployment.
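The producer and competing-consumer pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not name a queue implementation, so Python's in-process `queue.Queue` is used as a stand-in, and it achieves exclusive delivery with an internal lock rather than with the single-threaded response loop described above. The worker names, the `STOP` sentinel and the placeholder detector output are hypothetical.

```python
import json
import queue
import threading

STOP = None  # hypothetical sentinel telling a worker to exit


def detection_worker(name, in_q, out_q):
    """Block on in_q, run a stand-in detector, PUSH the result to out_q."""
    while True:
        msg = in_q.get()  # blocking POP; each message reaches exactly one worker
        if msg is STOP:
            break
        meta = json.loads(msg)
        result = {
            "slice_name": meta["slice_name"],
            "coarse_type": "placeholder",  # a real model would infer this
            "confidence": 0.0,
            "worker": name,
        }
        out_q.put(json.dumps(result))  # PUSH detection metadata downstream


slice_q = queue.Queue()   # fed by data preprocessing (the producer)
result_q = queue.Queue()  # read by the identification stage

# Three model servers compete for slices; more could be started at any time.
workers = [
    threading.Thread(target=detection_worker, args=(f"det-{i}", slice_q, result_q))
    for i in range(3)
]
for w in workers:
    w.start()

for i in range(12):
    slice_q.put(json.dumps({"slice_name": f"slice_{i}.tif"}))
for _ in workers:
    slice_q.put(STOP)  # one sentinel per worker
for w in workers:
    w.join()

results = [json.loads(result_q.get()) for _ in range(result_q.qsize())]
names = sorted(r["slice_name"] for r in results)
```

Because every worker only knows the queue interface, adding or removing a model server does not require changes on the preprocessing side, which is the decoupling property the description emphasizes. Which worker handles which slice is not deterministic.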

With the distributed parallel method for remote sensing image intelligent models described above, the memory-based message queue greatly improves system agility and parallel model deployment performance and reduces system delay when intelligent models that process large data volumes, such as remote sensing images, are deployed in parallel. The method effectively meets the requirements of high throughput, quasi-real-time computation and agile parallel model deployment for massive remote sensing images, and solves the problem that existing intelligent processing models invoked through gRPC/REST API interfaces cannot meet the real-time processing requirements of massive remote sensing images.

The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims

1. A distributed parallel method for intelligent remote sensing image models is characterized by comprising the following steps:

step 2: reading the remote sensing image through an image preprocessing library, cutting and distributing the obtained remote sensing image, and storing a slice file obtained after cutting into a small file distributed file system, a memory queue or a high-speed SSD file directory based on a PCIE-4.0 interface;

step 3: serializing the metadata of the remote sensing image and the metadata of the slices through JSON, and pushing the metadata into a memory message queue using a PUSH mechanism;

step 5: after POPping the slice metadata from the memory message queue, the Server side of the target intelligent detection model parses the slice metadata, reads the slice file at the corresponding address and performs the target intelligent detection service; the metadata of the detection result obtained by the target intelligent detection service is serialized and PUSHed into the memory message queue;

step 6: the Server side of the target intelligent identification model accesses the memory message queue using an asynchronous multithreaded competition mechanism, parses the metadata of the detection result and performs the target intelligent identification service; the metadata of the identification result obtained by the target intelligent identification service is serialized through JSON and PUSHed into the memory message queue;