CN115861082B - Low-delay picture splicing system and method - Google Patents

Low-delay picture splicing system and method

Info

Publication number
CN115861082B
CN115861082B (application CN202310193024.2A)
Authority
CN
China
Prior art keywords
queue
picture
camera
pictures
storage
Prior art date
Legal status
Active
Application number
CN202310193024.2A
Other languages
Chinese (zh)
Other versions
CN115861082A
Inventor
朱敏
王宇峰
Current Assignee
Wuxi Muchuang Integrated Circuit Design Co ltd
Original Assignee
Wuxi Muchuang Integrated Circuit Design Co ltd
Priority date
Filing date
Publication date
Application filed by Wuxi Muchuang Integrated Circuit Design Co ltd filed Critical Wuxi Muchuang Integrated Circuit Design Co ltd
Priority to CN202310193024.2A
Publication of CN115861082A
Application granted
Publication of CN115861082B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Television Signal Processing For Recording (AREA)

Abstract

The application discloses a low-delay picture splicing system and method. The system binds the camera queues to RDMA QP queues, so that a picture can be sent in real time through the correspondingly bound QP queue while it stays temporarily in the camera queue, without being copied from the camera queue into memory. In addition, a contiguous designated memory whose size matches the sum of the sizes of the received pictures is applied for in advance, and the channel data of the received pictures are then stored directly into this memory in the data storage structure order of the picture storage format; this completes the splicing of the pictures, removes the considerable extra delay of having the CPU process the splicing, and greatly reduces latency.

Description

Low-delay picture splicing system and method
Technical Field
The application relates to the technical field of multi-path picture splicing, in particular to a low-delay picture splicing system and method.
Background
Visual inspection refers to using a machine instead of the human eye to make measurements and judgments. A photographed target is converted into a target image by a camera and transmitted to a dedicated image processing system; the image processing system converts the image into digital signals according to pixel distribution, brightness, color and so on, extracts characteristic information from these signals using various operations, then performs discrimination operations such as locating, identifying and grading surface defects according to the characteristic information, and finally controls the action of on-site equipment according to the discrimination result.
In harsh environments it is often necessary to operate equipment remotely. In scenes such as remote visual inspection and real-time video monitoring, pictures shot by several cameras are often spliced into one picture for processing, display and storage. As shown in fig. 1, a conventional network-based remote picture stitching system works as follows: multiple cameras collect pictures of the measured object from different directions and upload them to an industrial personal computer; the industrial personal computer copies the pictures, splices them into one large picture, and then transmits the large picture to the server for processing over the network. Alternatively, the individual pictures can first be transmitted to the server and spliced there. A conventional Ethernet-based picture splicing scheme therefore has the following drawbacks: before transmission, the pictures must be copied from the camera queue into memory and then spliced, format-converted and compressed; a further copy is made when the pictures are transmitted through the Ethernet protocol stack, which introduces considerable delay.
Remote Direct Memory Access (RDMA) allows a user-mode application program to read or write remote memory directly over the network without operating-system intervention or memory copying, which saves a large amount of CPU resources, improves system throughput and reduces network communication latency. RDMA has three implementations: InfiniBand, RoCE and iWARP. Among them, RoCEv2 (RDMA over Converged Ethernet v2) is the most common; it can run over TCP/IP, L2 (data link layer) and L3 (network layer) networks, and its transport layer uses UDP, giving low latency. It is commonly used in scenarios that require a high-bandwidth, low-latency network.
At present, compared with the traditional network-based remote picture splicing method, the emerging RDMA-based picture splicing method omits the encoding and decoding of pictures, but it still requires picture copying, and the CPU still participates in picture processing and splicing.
Disclosure of Invention
In view of these problems, the application provides a low-delay picture splicing system and method, which at least address the picture copying during transmission, the CPU participation in picture processing, and the latency introduced by splicing.
In a first aspect of the present application, a low-latency picture stitching system is provided, comprising:
the client is used for determining, in the process of synchronous picture collection by multiple cameras and whenever any camera queue corresponding to any camera receives a picture, the QP (Queue Pair) queue bound to that camera queue, based on a pre-established binding relationship between the picture storage address of the camera queue and the SQ (Send Queue) storage address of the RDMA QP queue; and for driving the QP queue bound to the camera queue that received the picture to send the picture;
the server side is used for receiving the channel data corresponding to each picture sent by the multiple QP queues, and then storing the received channel data of the multiple pictures directly into a designated memory in the data storage structure order of the picture storage format, thereby generating a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures.
In some embodiments, the client is further configured to map the picture storage address of the camera queue to user space, obtaining a user space address; then create the SQ storage address of the RDMA QP queue based on the user space address; and establish the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue.
In some embodiments, the client is configured to, after determining the QP queue to which the camera queue that received the picture is bound, fetch that camera queue to user space; drive the QP queue bound to the camera queue to send the picture; and then, when a returned data-reception-complete command is received, release the picture in the camera queue and enqueue the camera queue so that it waits for the next picture.
In some embodiments, the picture storage format includes any of YUV422P, YUV444P, YUV420P.
In a second aspect of the present application, there is provided a low-latency picture stitching method, the method being applicable to a system as described in any one of the above, the method comprising:
in the process of synchronously collecting pictures by multiple cameras, when any camera queue corresponding to any camera receives a picture, a client determines the QP queue bound to the camera queue that received the picture, based on a pre-established binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, and drives the QP queue bound to the camera queue that received the picture to send the picture;
when the server receives the channel data corresponding to each picture sent by the multiple QP queues, the received channel data of the multiple pictures are stored directly into a designated memory in the data storage structure order of the picture storage format, generating a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures.
In a third aspect of the present application, a computer readable storage medium storing a computer program executable by one or more processors for implementing a low latency picture stitching method as described above is provided.
In a fourth aspect of the present application, there is provided a computer program product that, when run on a processor, performs the low-latency picture stitching method as described above.
In a fifth aspect of the present application, there is provided an electronic device comprising a memory and a processor, said memory having stored thereon a computer program, said memory and said processor being communicatively coupled to each other, which computer program, when executed by said processor, performs a low latency picture stitching method as described above.
Compared with the prior art, the technical scheme of the application has the following advantages or beneficial effects:
1. Reduced latency during picture transmission: by binding the camera queues to RDMA QP queues, pictures in a camera queue can be transmitted directly through RDMA in real time, which saves the picture-copying time after camera dequeue required in the prior art and reduces latency.
2. Reduced latency during picture splicing: a contiguous designated memory whose size matches the sum of the sizes of the received pictures is applied for in advance, and the channel data of the received pictures are then stored directly into this memory in the data storage structure order of the picture storage format; this completes the splicing of the pictures, removes the considerable extra delay of having the CPU process the splicing, and greatly reduces latency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following description will briefly introduce the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only embodiments of the present application, and other drawings may be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a network-based remote picture stitching system;
FIG. 2 is a schematic diagram of a YUV444/YUV422/YUV420 picture storage format;
FIG. 3 is a camera queue binding QP queue block diagram for a single camera;
fig. 4 is a workflow diagram of a client in a low-latency picture transmission system according to an embodiment of the present disclosure;
fig. 5 is a storage schematic diagram of vertical stitching of three pictures with YUV422P storage format;
FIG. 6 is a block diagram of a multi-way camera queue binding QP queue;
fig. 7 is a workflow of a client and a server in the low-latency picture stitching system according to the present embodiment;
FIG. 8 illustrates QP queue memory space allocation provided in embodiments of the present application;
fig. 9 is a flowchart of a low-latency image stitching method provided in an embodiment of the present application;
Fig. 10 is a flowchart for establishing a binding relationship between a picture storage address of a camera queue and an SQ storage address of an RDMA QP queue according to an embodiment of the present application;
fig. 11 is a connection block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The implementation of the present application will be described in detail below with reference to the drawings and embodiments, so that how the technical means are applied to solve the technical problems, and the process by which the corresponding technical effects are achieved, can be fully understood and implemented. The embodiments and the features in the embodiments can be combined with each other provided there is no conflict, and the resulting technical schemes all fall within the protection scope of the application.
In order to make the purposes, technical solutions and beneficial effects of the embodiments of the present application more clear, the technical solutions in the embodiments of the present application will be described in detail below with reference to the specification and specific embodiments.
In the following, some terms used in the examples of the present application and in the prior art are explained to help those skilled in the art understand the technical solutions of the present application.
RDMA communication: is based on a set of a Send Queue (SQ), a Receive Queue (RQ) and a Completion Queue (CQ). The Send Queue and the Receive Queue are responsible for scheduling work; they are always created in pairs, called a Queue Pair (QP). The Completion Queue (CQ) is used to issue a notification when an instruction placed on a work queue has completed.
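For readers unfamiliar with these objects, the following is a minimal sketch (not part of the patented implementation) of how a CQ and a QP containing an SQ and RQ are typically created with libibverbs on Linux; the device choice, queue depths and QP type are illustrative assumptions.

```c
/* Minimal sketch of creating the RDMA queue objects described above
 * (a CQ, and a QP containing an SQ and an RQ) with libibverbs.
 * Device selection and queue depths are illustrative assumptions. */
#include <infiniband/verbs.h>
#include <stdio.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) { fprintf(stderr, "no RDMA device\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* One completion queue, shared here by send and receive work. */
    struct ibv_cq *cq = ibv_create_cq(ctx, 16, NULL, NULL, 0);

    /* A queue pair = send queue (SQ) + receive queue (RQ). */
    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .qp_type = IBV_QPT_RC,          /* reliable connection */
        .cap = { .max_send_wr = 16, .max_recv_wr = 16,
                 .max_send_sge = 1, .max_recv_sge = 1 },
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);
    printf("created QP %u\n", qp->qp_num);

    ibv_destroy_qp(qp);
    ibv_destroy_cq(cq);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```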
V4L2: short for Video for Linux 2, the kernel driver framework for video devices in Linux. In Linux, a video device is a device file that can be read and written like an ordinary file.
Mmap, a method of memory mapping files, maps a file or other object into memory.
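As an illustration (not taken from the patent), a tiny mmap example is given below; the file name frame.raw is an assumed placeholder.

```c
/* Tiny mmap example: map a file into memory and read it as an array.
 * The file name is an illustrative assumption. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("frame.raw", O_RDONLY);
    if (fd < 0) return 1;

    struct stat st;
    fstat(fd, &st);

    /* After mmap, the file contents are addressable like ordinary memory. */
    unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;

    printf("first byte: 0x%02x, size: %lld bytes\n", p[0], (long long)st.st_size);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```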
Linux, which is called GNU/Linux in full, is a free-to-use and freely-spread UNIX-like operating system.
YUV data format: comprises three components. Y (Luminance/Luma) represents brightness; U and V represent color differences and carry the color information of the picture. Compared with the RGB color space, luminance information and color information are separated. This encoding is also better suited to human vision: research shows that the human eye is more sensitive to luminance information than to color information. YUV downsampling therefore compresses and subsamples the color information, to which the eye is relatively insensitive, in order to obtain smaller files for playback and transmission. Fig. 2 is a schematic diagram of the YUV444/YUV422/YUV420 picture storage formats, which are as follows:
YUV444: each pixel has one Y component, one U component and one V component, so one pixel occupies 3 bytes.
YUV422: each pixel has one Y component, but every two pixels (or Y components) share one U component and one V component, so one pixel occupies 2 bytes.
YUV420: each pixel has one Y component, but every four pixels (or Y components) share one U component and one V component, so one pixel occupies 1.5 bytes. The plane sizes implied by these three formats are computed in the sketch below.
Next, two picture transmission and splicing methods currently used in the art are briefly described.
The picture splicing scheme based on the Ethernet needs to carry out the following processing on the picture:
copying the pictures from the camera queue to a memory, and returning the queue to a driver for the next use;
splicing the multiple camera pictures into one picture;
transcoding the spliced picture from YUV444/YUV422 format into YUV420P;
carrying out H264/H265 coding compression on the YUV420P format picture;
uploading the compressed picture to a server through Ethernet;
decoding the picture at the server;
and the server analyzes, processes and issues processing commands to the pictures.
The Ethernet-based picture splicing scheme has the following defects: before transmission, the pictures must be copied from the camera queue into memory and then spliced, format-converted and compressed; a further copy is made when the pictures are transmitted through the Ethernet protocol stack; the copying and processing occupy CPU time and introduce considerable delay.
An RDMA-based picture stitching method comprising:
copying the pictures from the inside of a drive queue of the camera to a memory, and returning the queue to the drive for the next use;
splicing pictures acquired by the multiple cameras into a picture;
uploading the picture to a server through RDMA;
and the server analyzes, processes and issues processing commands to the pictures.
Compared with the traditional network-based remote picture splicing method, the RDMA-based picture splicing scheme omits the encoding and decoding of pictures, but it still copies pictures from the camera queue into memory and still needs the CPU to process and splice the pictures, so considerable latency remains.
Based on this, the first embodiment of the present application provides a low-latency picture stitching system. By binding the camera queues to RDMA QP queues, a picture does not need to be copied from the camera queue into memory; it only stays temporarily in the camera queue and is transmitted in real time through the correspondingly bound QP queue. In addition, a contiguous designated memory whose size matches the sum of the sizes of the received pictures is applied for in advance, and the channel data of the received pictures are then stored directly into this memory in the data storage structure order of the picture storage format; this completes the splicing of the pictures, removes the considerable extra delay of having the CPU process the splicing, and greatly reduces latency.
Example 1
The embodiment provides a low-delay picture splicing system, which comprises a client and a server. Wherein:
in practical applications, the client comprises an industrial personal computer. The industrial personal computer is connected to the cameras that synchronously collect pictures over multiple channels. It is preset with a number of camera queues for receiving the pictures obtained by real-time monitoring, and it establishes in advance the binding relationship between the picture storage address of each camera queue and the SQ storage address of an RDMA QP queue. Based on this binding relationship, whenever any camera queue corresponding to any camera receives a picture during synchronous collection, the client determines the QP queue bound to that camera queue and then drives that QP queue to send the picture.
The server side is used for receiving the channel data corresponding to each picture sent by the multiple QP queues, and for storing the received channel data of the multiple pictures directly into a designated memory in the data storage structure order of the picture storage format, generating a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures. In some embodiments, the picture storage format includes any of YUV422P, YUV444P, YUV420P.
Taking a single camera as an illustration, as shown in the camera-queue-to-QP-queue binding block diagram of fig. 3, the 4 camera queues of a single camera are bound to 4 RDMA QP queues of the client. The client's RDMA QPs correspond one-to-one to QP queues of the server, and the data carried by a given client QP queue is stored in a specified memory of the server, so that the server application can process the data directly at that memory.
In this embodiment, based on the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, whenever any camera queue corresponding to any camera receives a picture, the picture is sent through the bound QP queue without being copied. RDMA is characterized by high bandwidth and low latency (for example, transmitting a 640x480 picture takes on the order of microseconds), so this brief stay in the queue does not affect the latency of the picture transmission that the camera driver normally performs.
In some embodiments, the client is further configured to map the picture storage address of the camera queue to user space, obtaining a user space address; then create the SQ storage address of the RDMA QP queue based on the user space address; and establish the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue.
Through the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, the QP queue can read pictures directly from the picture storage address of the bound camera queue.
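A minimal sketch of how such a binding could look on Linux, assuming the camera is a V4L2 device and the RDMA side uses libibverbs: the driver-owned V4L2 buffer is mapped into user space with mmap, and the same user-space address is then registered as an RDMA memory region, so a send posted on the bound QP reads the picture in place. The function name, buffer handling and error handling are illustrative assumptions, not the patent's implementation.

```c
/* Sketch: map one V4L2 camera buffer into user space, then register the
 * same address as an RDMA memory region, so a send posted on the bound
 * QP reads the picture in place. Names and sizes are assumptions. */
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>
#include <infiniband/verbs.h>

struct bound_buf {
    void          *addr;   /* user-space address of the camera buffer */
    size_t         len;
    struct ibv_mr *mr;     /* RDMA registration of the same memory   */
};

int bind_camera_buffer(int cam_fd, int index, struct ibv_pd *pd,
                       struct bound_buf *out)
{
    struct v4l2_buffer buf;
    memset(&buf, 0, sizeof(buf));
    buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    buf.index  = index;

    if (ioctl(cam_fd, VIDIOC_QUERYBUF, &buf) < 0)
        return -1;

    /* Map the driver-owned picture storage into user space (vbufN). */
    out->len  = buf.length;
    out->addr = mmap(NULL, buf.length, PROT_READ | PROT_WRITE,
                     MAP_SHARED, cam_fd, buf.m.offset);
    if (out->addr == MAP_FAILED)
        return -1;

    /* Register the mapped memory so the QP's SQ can send from it. */
    out->mr = ibv_reg_mr(pd, out->addr, out->len, IBV_ACCESS_LOCAL_WRITE);
    return out->mr ? 0 : -1;
}
```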
Further, the client is configured to, after determining the QP queue bound to the camera queue that received the picture, fetch that camera queue to user space; drive the QP queue bound to the camera queue to send the picture; and then, when a returned data-reception-complete command is received, release the picture in the camera queue and enqueue the camera queue so that it waits for the next picture.
For example, referring to fig. 4, which is a workflow diagram of the client in the low-latency picture stitching system provided by this embodiment of the application, the client workflow includes the following steps:
(1) Initialize the V4L2 camera and set the camera driver parameters, picture storage format, picture size, picture frame rate and so on; initialize the RDMA client and establish the CM (Connection Manager) event channel, including the CQ, SQ and RQ queues.
(2) Apply for the picture storage addresses of the camera queues, map them to Linux user space through mmap (memory mapping), obtaining the vbuf0-3 addresses, and start the camera to acquire pictures; create the SQ storage addresses of the RDMA client's QP queues using the vbuf0-3 addresses, and establish a link between the RDMA client and the RDMA server. In this way, what is stored in the address storage space of the SQ queue is the picture storage address of the camera queue.
(3) During synchronous picture collection by the multiple cameras, when polling detects a data-reception event on any camera queue corresponding to any camera, dequeue the camera queue that received the picture. Based on the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, determine the bound QP queue from the serial number of the camera queue that received the picture, start the write operation of that QP queue, and form the picture into a RoCEv2 message to be sent to the server.
(4) When a data-reception-complete command returned by the server is received, release the picture in the camera queue and enqueue the camera queue so that it waits for the next picture. Steps (3) and (4) are sketched in code below.
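A hedged sketch of steps (3) and (4), reusing struct bound_buf from the earlier sketch; the one-QP-per-buffer-index mapping and the busy-poll completion wait are simplifying assumptions, not the patent's code.

```c
/* Sketch of steps (3)-(4): dequeue a filled camera buffer, post a send on
 * its bound QP, wait for the completion, then re-enqueue the buffer.
 * Assumes the QPs are already connected and bufs[] was prepared with
 * bind_camera_buffer() from the previous sketch. */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>
#include <infiniband/verbs.h>

struct bound_buf { void *addr; size_t len; struct ibv_mr *mr; }; /* as above */

static int send_one_picture(int cam_fd, struct ibv_qp **qp,
                            struct bound_buf *bufs, struct ibv_cq *cq)
{
    struct v4l2_buffer vbuf = { .type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
                                .memory = V4L2_MEMORY_MMAP };

    if (ioctl(cam_fd, VIDIOC_DQBUF, &vbuf) < 0)   /* a picture arrived */
        return -1;

    struct bound_buf *b = &bufs[vbuf.index];
    struct ibv_sge sge = { .addr   = (uintptr_t)b->addr,
                           .length = (uint32_t)b->len,
                           .lkey   = b->mr->lkey };
    struct ibv_send_wr wr = { .sg_list = &sge, .num_sge = 1,
                              .opcode = IBV_WR_SEND,
                              .send_flags = IBV_SEND_SIGNALED };
    struct ibv_send_wr *bad = NULL;

    /* Drive the QP bound to this camera queue to send the picture. */
    if (ibv_post_send(qp[vbuf.index], &wr, &bad))
        return -1;

    /* Wait for the completion, i.e. the data-reception confirmation. */
    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;
    if (wc.status != IBV_WC_SUCCESS)
        return -1;

    /* Release the picture: give the buffer back to the camera driver. */
    return ioctl(cam_fd, VIDIOC_QBUF, &vbuf);
}
```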
In some embodiments, the picture storage format includes any of YUV422P, YUV444P, YUV420P.
It should be noted that fig. 5 is a storage diagram of vertically splicing three pictures whose storage format is YUV422P. As shown in fig. 5, pictures 1, 2 and 3 are all 640x480 pictures stored in YUV422P format, each occupying 640x480 + 320x480 + 320x480 bytes (the Y, U and V planes). Taking pictures 1, 2 and 3 as the example, the Y values, U values and V values of a YUV422P picture are stored contiguously in memory. Following this principle, and again taking fig. 5 as the example, the method provided in this embodiment stores the Y-channel data, U-channel data and V-channel data of pictures 1, 2 and 3 contiguously into a designated memory (which is contiguous and whose size matches the sum of the sizes of pictures 1, 2 and 3), thereby directly obtaining the vertically spliced new picture 4. The storage size of the designated memory "matching" the sum of the sizes of the received pictures means that the storage size is at least the sum of the picture sizes.
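The offsets involved can be written down explicitly. The following illustrative program (assuming the 640x480 sizes of fig. 5; it is not taken from the patent) prints where each picture's Y, U and V plane lands in the contiguous designated memory so that the result is the vertically spliced picture.

```c
/* Sketch of the layout in fig. 5: three 640x480 YUV422P pictures are
 * spliced vertically simply by choosing where each plane is written in
 * one contiguous buffer (sizes are the example values from the figure). */
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    const size_t W = 640, H = 480, N = 3;       /* three stacked pictures  */
    const size_t y_sz = W * H;                  /* Y plane of one picture  */
    const size_t c_sz = (W / 2) * H;            /* U or V plane (YUV422P)  */
    const size_t pic_sz = y_sz + 2 * c_sz;      /* one whole picture       */

    /* Plane base offsets of the spliced 640x1440 picture. */
    const size_t y_base = 0;
    const size_t u_base = N * y_sz;
    const size_t v_base = N * (y_sz + c_sz);

    for (size_t i = 0; i < N; i++) {
        printf("picture %zu: Y -> offset %zu, U -> offset %zu, V -> offset %zu\n",
               i + 1,
               y_base + i * y_sz,        /* Y1 Y2 Y3 stored back to back  */
               u_base + i * c_sz,        /* then U1 U2 U3                 */
               v_base + i * c_sz);       /* then V1 V2 V3                 */
    }
    printf("spliced picture size = %zu bytes (= 3 x %zu)\n", N * pic_sz, pic_sz);
    return 0;
}
```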
By way of example, take three cameras. Fig. 6 is a block diagram of the multi-way camera queues bound to QP queues. The external trigger pins of the three cameras are connected together and routed to a trigger switch (such as a Hall sensor), so the three cameras are triggered almost simultaneously, which ensures that they collect pictures synchronously.
The client has applied for 4 camera queues for each camera. The first camera queue Q1 of camera 1 is described here as an example; the other camera queues are handled in the same way. The picture storage space corresponding to each camera queue is divided into a Y-value storage space, a U-value storage space and a V-value storage space, corresponding respectively to the SQ storage addresses of the bound RDMA QP2, QP3 and QP4 queues. That is, the QP2 queue carries the Y-channel data of the pictures in camera queue Q1, the QP3 queue carries the U-channel data, and the QP4 queue carries the V-channel data. Similarly, the 2nd queue uses the QP11, QP12 and QP13 queues to carry the Y-channel, U-channel and V-channel data of its pictures, and the 3rd and 4th queues apply for QP queues according to the same rule. The camera queues of cameras 2 and 3 are processed in the same manner as camera 1.
The server applies for a contiguous designated memory whose space matches the total size of the three pictures, and allocates this designated memory to its own QP queues according to the picture storage format of the client's QP queues. In this way, each channel of picture data arriving from the client is stored directly in the memory corresponding to the server's QP queue, and once all channel data of the 3 pictures have been stored, the spliced new picture has been generated.
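A sketch of what this could look like on the server with libibverbs: one contiguous buffer is allocated and registered once, and each QP's receive is posted at the offset of the plane it carries, so the incoming channel data lands already spliced. The queue layout, sizes and plane order follow the fig. 5 example and are assumptions, not the patent's code.

```c
/* Sketch of the server side: one contiguous buffer big enough for all
 * received pictures is registered once, and each QP's receive is posted
 * at that plane's offset, so arriving channel data lands pre-spliced. */
#include <stdlib.h>
#include <stdint.h>
#include <infiniband/verbs.h>

#define NPIC 3
#define Y_SZ (640 * 480)
#define C_SZ (320 * 480)          /* U or V plane of a YUV422P picture */

int post_spliced_receives(struct ibv_pd *pd, struct ibv_qp *qps[NPIC][3])
{
    size_t total = NPIC * (Y_SZ + 2 * C_SZ);
    uint8_t *base = malloc(total);            /* the "specified memory" */
    if (!base) return -1;
    struct ibv_mr *mr = ibv_reg_mr(pd, base, total, IBV_ACCESS_LOCAL_WRITE);
    if (!mr) return -1;

    for (int i = 0; i < NPIC; i++) {
        /* Offsets that interleave the three pictures' planes (fig. 5). */
        size_t off[3] = { (size_t)i * Y_SZ,                        /* Y */
                          NPIC * Y_SZ + (size_t)i * C_SZ,          /* U */
                          NPIC * (Y_SZ + C_SZ) + (size_t)i * C_SZ  /* V */ };
        uint32_t len[3] = { Y_SZ, C_SZ, C_SZ };

        for (int p = 0; p < 3; p++) {
            struct ibv_sge sge = { .addr = (uintptr_t)(base + off[p]),
                                   .length = len[p], .lkey = mr->lkey };
            struct ibv_recv_wr wr = { .sg_list = &sge, .num_sge = 1 };
            struct ibv_recv_wr *bad = NULL;
            if (ibv_post_recv(qps[i][p], &wr, &bad))
                return -1;
        }
    }
    return 0;  /* once all receives complete, `base` holds the spliced picture */
}
```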
For example, taking three cameras as an example and referring to fig. 7, the workflow of the client and the server in the system disclosed in this embodiment includes the following steps:
The workflow of the client comprises:
(1) The RDMA client initializes and establishes the CM (Connection Manager) event channel, including the CQ queue, SQ queue and RQ queue.
(2) 3 QP queues (carrying the Y-channel, U-channel and V-channel data of the picture respectively) are allocated to each of the 4 drive queues of each camera, and a link is then established with the RDMA server.
(3) Circularly wait for a camera queue to receive a picture; once a camera queue receives a picture, send it to the server through the bound RDMA QP queues.
(4) After the 3 channels of the picture's data have been acknowledged, return the camera queue so that it waits for the next picture.
The workflow of the server comprises the following steps:
(1) The RDMA server initializes and establishes the CM (Connection Manager) event channel, including the CQ queue, SQ queue and RQ queue.
(2) Apply for a memory with a contiguous storage space the total size of the three pictures, and distribute this storage space to each QP queue according to the specific picture storage format; each QP queue corresponds one-to-one to a QP queue of the client, and links are established with the client.
(3) Receive the data of each picture channel sent by each QP queue of the client and store it at the designated position in the memory according to the picture storage format. Once all the picture channel data carried by the QP queues have been received, the spliced picture has been generated at that memory.
Fig. 8 illustrates the allocation of QP queue memory space provided in the embodiment of the present application. As shown in fig. 8, on the client the memory head address of picture 1 is buf1_addr, that of picture 2 is buf2_addr, and that of picture 3 is buf3_addr. The Y channel of picture 1 is assigned to QP2 with address buf1_addr and length 640x480; the U channel of picture 1 is assigned to QP3 with address buf1_addr+640x480 and length 320x480; the V channel of picture 1 is assigned to QP4 with address buf1_addr+640x480+320x480 and length 320x480. Pictures 2 and 3 are handled in the same way.
The client uses the QP2~QP10 queues to carry the data of each picture channel, and the server uses its QP2~QP10 queues to receive the channel data sent by the corresponding client QP queues. The server applies for a space with address buf4_addr and length 640x480x3 to store the spliced picture. The head address assigned to QP4 is buf4_addr, and the other QPs store into the address space pointed to by buf4_addr in the order shown in the figure; once the channel data of the three pictures have been received, the new picture has been spliced.
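For illustration, a sketch of the client-side per-plane assignment of fig. 8: the Y, U and V planes of one 640x480 YUV422P picture starting at its base address are handed to three QPs as (address, length) pairs and one send is posted on each. The helper name and the way the connected QPs and the memory-region key are passed in are assumptions.

```c
/* Post one SEND per plane of a 640x480 YUV422P picture starting at `base`,
 * mirroring the fig. 8 assignment (e.g. QP2/QP3/QP4 for picture 1).
 * `qps[0..2]` are the three QPs bound to this camera queue and are assumed
 * to be connected; `lkey` is the key of the MR that covers the buffer. */
#include <stddef.h>
#include <stdint.h>
#include <infiniband/verbs.h>

static int send_picture_planes(struct ibv_qp *qps[3], uint8_t *base,
                               uint32_t lkey)
{
    struct plane { size_t off; uint32_t len; } planes[3] = {
        { 0,                     640 * 480 },   /* Y plane, e.g. QP2 */
        { 640 * 480,             320 * 480 },   /* U plane, e.g. QP3 */
        { 640 * 480 + 320 * 480, 320 * 480 },   /* V plane, e.g. QP4 */
    };

    for (int p = 0; p < 3; p++) {
        struct ibv_sge sge = { .addr   = (uintptr_t)(base + planes[p].off),
                               .length = planes[p].len,
                               .lkey   = lkey };
        struct ibv_send_wr wr = { .sg_list = &sge, .num_sge = 1,
                                  .opcode = IBV_WR_SEND,
                                  .send_flags = IBV_SEND_SIGNALED };
        struct ibv_send_wr *bad = NULL;
        if (ibv_post_send(qps[p], &wr, &bad))
            return -1;
    }
    return 0;
}
```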
In the system provided by this embodiment, based on the pre-established binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, the client determines, whenever any camera queue corresponding to any camera receives a picture during synchronous collection, the QP queue bound to that camera queue, and then drives that QP queue to send the picture. The pictures in the camera queue are thus transmitted directly, the copy out of the camera queue required in the prior art is omitted, and the latency caused by copying is removed. Moreover, RDMA is characterized by high bandwidth and low latency (for example, transmitting a 640x480 picture takes on the order of microseconds), so the normal picture transmission of the camera driver is not affected. The server stores the channel data of the received pictures directly into the designated memory in the data storage structure order of the picture storage format to generate the spliced new picture, which removes the considerable extra delay of having the CPU process the splicing and greatly reduces latency.
Example two
This embodiment provides a low-latency picture stitching method. Fig. 9 is a flowchart of the low-latency picture stitching method provided in this embodiment of the application; as shown in fig. 9, the method includes:
Step S1, in the process of synchronously collecting pictures by multiple cameras, when any camera queue corresponding to any camera receives a picture, the client determines the QP queue bound to the camera queue that received the picture, based on the pre-established binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, and drives that QP queue to send the picture.
Step S2, when the server receives the channel data corresponding to each picture sent by the multiple QP queues, the received channel data of the multiple pictures are stored directly into a designated memory in the data storage structure order of the picture storage format, generating a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures.
Based on the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, whenever any camera queue corresponding to any camera receives a picture, the picture is sent through the bound QP queue without being copied. RDMA is characterized by high bandwidth and low latency (for example, transmitting a 640x480 picture takes on the order of microseconds), so the brief stay in the queue does not affect the latency of the picture transmission that the camera driver normally performs. In addition, by applying in advance for a contiguous designated memory whose size matches the sum of the sizes of the received pictures, and then storing the channel data of the pictures sent from the QP queues directly into this memory in the data storage structure order of the picture storage format, the spliced new picture is obtained directly; the considerable extra delay of having the CPU process the splicing is removed, and latency is greatly reduced.
In some embodiments, fig. 10 is a flowchart for establishing a binding relationship between a picture storage address of a camera queue and an SQ storage address of an RDMA QP queue according to an embodiment of the present application, where, as shown in fig. 10, the establishment of the binding relationship includes the following steps:
step S11, mapping the image storage address of the camera queue to a user space to obtain a user space address;
step S12, creating an SQ storage address of a QP queue of the RDMA based on the user space address;
step S13, establishing a binding relation between the picture storage address of the camera queue and the SQ storage address of the QP queue of the RDMA.
By the method, the storage space of the SQ queue in the QP queue stores the picture storage address of the bound camera queue, so that the QP queue can directly read the pictures in the picture storage address of the bound camera queue.
In some embodiments, the sending the picture through the QP queue to which the camera queue of the picture is bound includes the steps of:
step S21, taking a camera queue which receives the pictures to a user space;
step S22, driving a QP queue bound with the camera queue to send the picture;
Step S23, releasing the pictures in the camera queue and enqueuing the camera queue under the condition that a returned data receiving completion command is received, so that the camera queue waits for the next picture to be received.
In some embodiments, the picture storage format includes any of YUV422P, YUV444P, YUV420P.
It should be noted that fig. 5 is a storage diagram of vertically splicing three pictures whose storage format is YUV422P. As shown in fig. 5, pictures 1, 2 and 3 are all 640x480 pictures stored in YUV422P format, each occupying 640x480 + 320x480 + 320x480 bytes (the Y, U and V planes). Taking pictures 1, 2 and 3 as the example, the Y values, U values and V values of a YUV422P picture are stored contiguously in memory. Following this principle, and again taking fig. 5 as the example, the method provided in this embodiment stores the Y-channel data, U-channel data and V-channel data of pictures 1, 2 and 3 contiguously into a designated memory (which is contiguous and whose size matches the sum of the sizes of pictures 1, 2 and 3), thereby directly obtaining the vertically spliced new picture 4. The storage size of the designated memory "matching" the sum of the sizes of the received pictures means that the storage size is at least the sum of the picture sizes.
According to the method provided by this embodiment, based on the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, whenever any camera queue corresponding to any camera receives a picture, the picture does not need to be copied; it only stays briefly in the camera queue and is sent through the QP queue bound to the camera queue that received it. RDMA is characterized by high bandwidth and low latency (for example, transmitting a 640x480 picture takes on the order of microseconds), so this buffering does not affect the normal execution of the camera driver.
Example III
The present embodiment provides a computer readable storage medium storing a computer program executable by one or more processors for implementing a low latency picture stitching method as described above.
The computer-readable storage medium may also include computer programs, data files, data structures and the like, alone or in combination. The computer-readable storage medium or computer program may be specifically designed for the purpose, or may be of a kind well known and available to those skilled in the art of computer software. Examples of the computer-readable storage medium include: magnetic media such as hard disks, floppy disks and magnetic tape; optical media such as CD-ROM discs and DVDs; magneto-optical media; and hardware devices specifically configured to store and execute computer programs, such as read-only memory (ROM), random access memory (RAM) and flash memory; or a server, an app store, and the like. Examples of computer programs include machine code (e.g., code produced by a compiler) and files containing higher-level code that may be executed by the computer using an interpreter. The described hardware devices may be configured to act as one or more software modules to perform the operations and methods described above, and vice versa. In addition, the computer-readable storage medium may be distributed among networked computer systems, and the program code or computer program may be stored and executed in a decentralized manner.
Example IV
The present embodiment provides a computer program product which, when run on a processor, performs the low-latency picture stitching method as described above.
Example five
The present embodiment provides an electronic device 100, fig. 11 is a block diagram of connection of the electronic device provided in the embodiment of the present application, and as shown in fig. 11, the electronic device 100 includes a processor 101, a memory 102, a multimedia component 103, an input/output (I/O) interface 104, and a communication component 105.
The processor 101 is configured to perform all or part of the steps in the low-latency picture stitching method embodiment described above. The memory 102 is used to store various types of data, which may include, for example, instructions for any application or method in the electronic device, as well as application-related data.
The processor 101 may be an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), digital signal processor (Digital Signal Processor, DSP), digital signal processing device (Digital Signal Processing Device, DSPD), programmable logic device (Programmable Logic Device, PLD), field programmable gate array (Field Programmable Gate Array, FPGA), controller, microcontroller, microprocessor or other electronic component implementation for performing the methods as in the method embodiments described above.
The Memory 102 may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as static random access Memory (Static Random Access Memory, SRAM for short), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM for short), erasable programmable Read-Only Memory (Erasable Programmable Read-Only Memory, EPROM for short), programmable Read-Only Memory (Programmable Read-Only Memory, PROM for short), read-Only Memory (ROM for short), magnetic Memory, flash Memory, magnetic disk, or optical disk.
The multimedia component 103 may include a screen, which may be a touch screen, and an audio component for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may be further stored in a memory or transmitted through a communication component. The audio assembly further comprises at least one speaker for outputting audio signals.
The I/O interface 104 provides an interface between the one or more processors 101 and other interface modules, which may be a keyboard, mouse, buttons, etc. These buttons may be virtual buttons or physical buttons.
The communication component 105 is used for wired or wireless communication between the electronic device 100 and other devices. The wired communication comprises communication through a network port, a serial port and the like; the wireless communication includes: wi-Fi, bluetooth, near field communication (Near Field Communication, NFC for short), 2G, 3G, 4G, 5G, or a combination of one or more of them. The corresponding communication component 105 may thus comprise: wi-Fi module, bluetooth module, NFC module.
It should be further understood that the methods or systems disclosed in the embodiments provided herein may be implemented in other manners. The above-described method or system embodiments are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods and apparatuses according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, a computer program segment, or a portion of a computer program, which comprises one or more computer programs for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures, and in fact may be executed substantially concurrently, or in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer programs.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, apparatus or device comprising such elements; if any, the terms "first," "second," etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of features indicated or implicitly indicating the precedence of features indicated; in the description of the present application, unless otherwise indicated, the terms "plurality", "multiple" and "multiple" mean at least two; if the description is to a server, it should be noted that the server may be an independent physical server or terminal, or may be a server cluster formed by a plurality of physical servers, or may be a cloud server capable of providing basic cloud computing services such as a cloud server, a cloud database, a cloud storage, a CDN, and the like; in this application, if an intelligent terminal or a mobile device is described, it should be noted that the intelligent terminal or the mobile device may be a mobile phone, a tablet computer, a smart watch, a netbook, a wearable electronic device, a personal digital assistant (Personal Digital Assistant, PDA), an augmented Reality device (Augmented Reality, AR), a Virtual Reality device (VR), an intelligent television, an intelligent sound device, a personal computer (Personal Computer, PC), etc., but the present application is not limited thereto.
Finally it is pointed out that in the description of the present specification, the terms "one embodiment," "some embodiments," "example," "one example" or "some examples" and the like refer to particular features, structures, materials or characteristics described in connection with the embodiment or example as being included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been illustrated and described above, it should be understood that the above-described embodiments are illustrative only and are not intended to limit the present application to the details of the embodiments employed to facilitate the understanding of the present application. Any person skilled in the art to which this application pertains will be able to make any modifications and variations in form and detail of implementation without departing from the spirit and scope of the disclosure, but the scope of protection of this application shall be subject to the scope of the claims that follow.

Claims (8)

1. A low-latency picture stitching system, comprising:
the client is used for determining, in the process of synchronous picture collection by multiple cameras and when any camera queue corresponding to any camera receives a picture, the QP (Queue Pair) queue bound to that camera queue, based on a pre-established binding relationship between the picture storage address of the camera queue and the SQ (Send Queue) storage address of the RDMA QP queue; and for driving the QP queue bound to the camera queue that received the picture to send the picture;
the server side is used for receiving the channel data corresponding to each picture sent by the multiple QP queues, and then storing the received channel data of the multiple pictures directly into a designated memory in the data storage structure order of the picture storage format to generate a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures.
2. The system of claim 1, wherein the client is further configured to map the picture storage address of the camera queue to user space to obtain a user space address; then create the SQ storage address of the RDMA QP queue based on the user space address; and establish the binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue.
3. The system of claim 2, wherein the client is configured to, after determining the QP queue to which the camera queue that received the picture is bound, fetch that camera queue to user space; drive the QP queue bound to the camera queue to send the picture; and then, when a returned data-reception-complete command is received, release the picture in the camera queue and enqueue the camera queue so that it waits for the next picture.
4. The system of claim 1, wherein the picture storage format comprises any of YUV422P, YUV444P, YUV420P.
5. A method of low latency picture stitching, the method being applicable to a system as claimed in any of claims 1-4, the method comprising:
in the process of synchronously collecting pictures by multiple cameras, when any camera queue corresponding to any camera receives a picture, a client determines the QP queue bound to the camera queue that received the picture, based on a pre-established binding relationship between the picture storage address of the camera queue and the SQ storage address of the RDMA QP queue, and drives the QP queue bound to the camera queue that received the picture to send the picture;
when the server receives the channel data corresponding to each picture sent by the multiple QP queues, the received channel data of the multiple pictures are stored directly into a designated memory in the data storage structure order of the picture storage format, generating a spliced new picture; the designated memory is a contiguous storage space whose size matches the sum of the sizes of the received pictures.
6. A computer readable storage medium storing a computer program executable by one or more processors for implementing the low latency picture stitching method according to claim 5.
7. A computer program product, which when run on a processor performs the low-latency picture stitching method according to claim 5.
8. An electronic device comprising a memory and a processor, wherein the memory has stored thereon a computer program, the memory and the processor being communicatively coupled to each other, the computer program, when executed by the processor, performing the low-latency picture stitching method according to claim 5.
CN202310193024.2A 2023-03-03 2023-03-03 Low-delay picture splicing system and method Active CN115861082B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310193024.2A CN115861082B (en) 2023-03-03 2023-03-03 Low-delay picture splicing system and method

Publications (2)

Publication Number Publication Date
CN115861082A CN115861082A (en) 2023-03-28
CN115861082B 2023-04-28

Family

ID=85659784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310193024.2A Active CN115861082B (en) 2023-03-03 2023-03-03 Low-delay picture splicing system and method

Country Status (1)

Country Link
CN (1) CN115861082B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104767910A (en) * 2015-04-27 2015-07-08 京东方科技集团股份有限公司 Video image stitching system and method
CN109491809A (en) * 2018-11-12 2019-03-19 西安微电子技术研究所 A kind of communication means reducing high-speed bus delay
CN111459418A (en) * 2020-05-15 2020-07-28 南京大学 RDMA (remote direct memory Access) -based key value storage system transmission method
CN114007044A (en) * 2021-10-28 2022-02-01 安徽奇智科技有限公司 Opencv-based image splicing system and method
CN114691026A (en) * 2020-12-31 2022-07-01 华为技术有限公司 Data access method and related equipment
WO2022146466A1 (en) * 2020-12-30 2022-07-07 Oracle International Corporation Class-based queueing for scalable multi-tenant rdma traffic
CN115174941A (en) * 2022-07-06 2022-10-11 灵羲科技(北京)有限公司 Real-time motion performance analysis and real-time data sharing method based on multi-channel video streams

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102430187B1 (en) * 2015-07-08 2022-08-05 삼성전자주식회사 METHOD FOR IMPLEMENTING RDMA NVMe DEVICE
US10778767B2 (en) * 2017-04-28 2020-09-15 International Business Machines Corporation Persistent memory replication in RDMA-capable networks
US10972768B2 (en) * 2019-06-27 2021-04-06 Intel Corporation Dynamic rebalancing of edge resources for multi-camera video streaming


Also Published As

Publication number Publication date
CN115861082A (en) 2023-03-28

Similar Documents

Publication Publication Date Title
US11483370B2 (en) Preprocessing sensor data for machine learning
US9460125B2 (en) Systems, methods, and computer program products for digital photography
US20220076084A1 (en) Responding to machine learning requests from multiple clients
US20160249106A1 (en) Remote Control of a Mobile Device
EP3406310A1 (en) Method and apparatuses for handling visual virtual reality content
US20190166410A1 (en) Methods for streaming visible blocks of volumetric video
CN104023191A (en) Android-based camera projection system and implementation method
US20190251682A1 (en) Systems, methods, and computer program products for digital photography
US11416203B2 (en) Orchestrated control for displaying media
US11223662B2 (en) Method, system, and non-transitory computer readable record medium for enhancing video quality of video call
CN110300278A (en) Video transmission method and equipment
CN115861082B (en) Low-delay picture splicing system and method
US10764535B1 (en) Facial tracking during video calls using remote control input
CN110944140A (en) Remote display method, remote display system, electronic device and storage medium
CN108496351B (en) Unmanned aerial vehicle and control method thereof, control terminal and control method thereof
KR20160067798A (en) Method and device for post processing of a video stream
US20210110554A1 (en) Systems, methods, and computer program products for digital photography using a neural network
US10536501B2 (en) Automated compression of data
CN116204257B (en) Image preview method and device, storage medium and computer equipment
JP2016524247A (en) Automatic codec adaptation
CN113473180B (en) Wireless-based Cloud XR data transmission method and device, storage medium and electronic device
CN108319493A (en) Data processing method, device and machine readable storage medium
US20230156350A1 (en) Systems, methods, and computer program products for digital photography
CN113127222B (en) Data transmission method, device, equipment and medium
CN116761070A (en) Method, device and equipment for supporting multiple cameras based on open source hong Meng system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant