WO2023005286A1

WO2023005286A1 - Image processing

Info

Publication number: WO2023005286A1
Application number: PCT/CN2022/088545
Authority: WO
Inventors: 张卿麒; 张彬; 吴阳平; 许亮
Original assignee: 上海商汤智能科技有限公司
Priority date: 2021-07-30
Filing date: 2022-04-22
Publication date: 2023-02-02
Also published as: CN113613071B; CN113613071A

Abstract

Provided in the present disclosure are an image processing method and apparatus, and a computer device and a storage medium. The method comprises: acquiring an image to be processed, and a processing duration which corresponds to a previous frame of image of the image to be processed; performing first format conversion on the image to be processed, so as to obtain a first target image of a first target format; when the processing duration does not exceed a first preset duration, performing marking processing on the image to be processed, so as to obtain first object marking information, and marking the first object marking information on the first target image, so as to obtain a second target image; and converting the second target image into a third target image, and then sending the third target image to a target device.

Description

Image Processing

Cross References to Related Applications

This application claims priority to a Chinese patent application with application number CN2021108756619 filed with the China Patent Office on July 30, 2021, the entire contents of which are incorporated in this disclosure by reference.

Technical Field

The present disclosure relates to the technical field of image processing.

Background

To ensure real-time visualization of the image processing process, existing tools are mainly based on soft real-time operating systems with a broad open-source foundation, such as Ubuntu or Android, and either require the assistance of external devices such as graphics processing units (GPUs) or rely heavily on high-performance platforms based on the X86-64 instruction set, which greatly increases the cost of the tools.

Summary

Embodiments of the present disclosure at least provide an image processing method, device, computer equipment, and storage medium.

In the first aspect, the embodiment of the present disclosure provides an image processing method applied to an ARM development board, including:

acquiring an image to be processed and a processing duration corresponding to the previous frame image of the image to be processed, wherein, if marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion, and, if marking processing was not performed on the previous frame image, the processing duration is the duration of the target format conversion of the previous frame image; performing first format conversion on the image to be processed to obtain a first target image in a first target format; when the processing duration does not exceed a first preset duration, obtaining first object marking information by performing marking processing on the image to be processed, and marking the first object marking information on the first target image to obtain a second target image; and converting the second target image into a third target image and sending the third target image to a target device.

In a second aspect, an embodiment of the present disclosure further provides a detection method, including: acquiring an image to be processed captured in a vehicle cabin and a processing duration corresponding to the previous frame image of the image to be processed, wherein, if marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion, and, if marking processing was not performed on the previous frame image, the processing duration is the duration of the target format conversion of the previous frame image; performing first format conversion on the image to be processed to obtain a first target image in a first target format; when the processing duration does not exceed a first preset duration, obtaining first object marking information by performing marking processing on the image to be processed, and marking the first object marking information on the first target image to obtain a second target image; converting the second target image into a third target image and displaying it; and, based on the displayed third target image, giving a safety warning regarding the driving of the vehicle.

In a third aspect, an embodiment of the present disclosure further provides an image processing device, including: a first information acquisition module, configured to acquire an image to be processed and a processing duration corresponding to the previous frame image of the image to be processed, wherein, if marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion, and, if marking processing was not performed on the previous frame image, the processing duration is the duration of the target format conversion of the previous frame image; an image conversion module, configured to perform first format conversion on the image to be processed to obtain a first target image in a first target format; an image marking module, configured to, when the processing duration does not exceed a first preset duration, obtain first object marking information by performing marking processing on the image to be processed, and mark the first object marking information on the first target image to obtain a second target image; and a first image processing module, configured to convert the second target image into a third target image and send the third target image to a target device.

In a fourth aspect, an embodiment of the present disclosure further provides a detection device, including: a second information acquisition module, configured to acquire an image to be processed captured in a vehicle cabin and a processing duration corresponding to the previous frame image of the image to be processed, wherein, if marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion, and, if marking processing was not performed on the previous frame image, the processing duration is the duration of the target format conversion of the previous frame image; a third image processing module, configured to perform first format conversion on the image to be processed to obtain a first target image in a first target format, to obtain, when the processing duration does not exceed a first preset duration, first object marking information by performing marking processing on the image to be processed and mark the first object marking information on the first target image to obtain a second target image, and to convert the second target image into a third target image and display it; and an early warning module, configured to give a safety warning regarding the driving of the vehicle based on the displayed third target image.

In a fifth aspect, an embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the computer device is running, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the image processing method of the first aspect, or of any possible implementation of the first aspect, and the steps of the detection method of the second aspect are performed.

In a sixth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, the steps of the image processing method of the first aspect, or of any possible implementation of the first aspect, and the steps of the detection method of the second aspect are performed.

For the effect description of the above image processing apparatus, computer equipment and storage medium, please refer to the description of the above image processing method, which will not be repeated here.

In order to make the above-mentioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments will be described in detail below together with the accompanying drawings.

Brief Description of the Drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The accompanying drawings are incorporated into and constitute a part of the specification. They show embodiments consistent with the present disclosure and are used together with the description to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting its scope. Those skilled in the art can also obtain other related drawings from these drawings.

FIG. 1 shows a flowchart of an image processing method provided by an embodiment of the present disclosure;

FIG. 2 shows a schematic flow diagram of a specific implementation process of an image processing process provided by an embodiment of the present disclosure;

FIG. 3 shows a schematic diagram of target pixels determined from a second target image provided by an embodiment of the present disclosure;

FIG. 4 shows a schematic diagram of an image processing device provided by an embodiment of the present disclosure;

FIG. 5 shows a schematic diagram of a detection device provided by an embodiment of the present disclosure;

FIG. 6 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.

Detailed Description

In order to make the purpose, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below in conjunction with the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present disclosure.

In addition, the terms "first" and "second" in the description, the claims, and the above drawings of the embodiments of the present disclosure are used to distinguish similar objects and are not used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein.

"Plural or several" mentioned herein means two or more. "And/or" describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B may indicate: A exists alone, A and B exist simultaneously, and B exists independently. The character "/" generally indicates that the contextual objects are an "or" relationship.

Research has found that, in order to ensure real-time visualization of the image processing process, existing tools are mainly based on soft real-time operating systems with a broad open-source foundation, such as Ubuntu or Android, and either require the assistance of external devices such as graphics processing units (GPUs) or rely heavily on high-performance platforms based on the X86-64 instruction set, which greatly increases the cost of the tools. To reduce the cost of development tools, image processing can be implemented on the QNX (Quick Unix) platform of an ARM development board. However, because the computing power of the ARM development board is insufficient, image processing on it is inefficient and time-consuming and cannot meet the real-time visualization requirement of image processing.

Based on the above research, the present disclosure provides an image processing method, device, computer equipment, and storage medium. By using the ARM development board's capability to process data in parallel, for example the single-instruction-multiple-data parallel processing library Neon in the ARM development board and the CPU's built-in registers, the speed of image format conversion can be doubled, which satisfies the real-time requirement of image processing. In addition, marking an image is generally slow, which may make the processing duration too long and in turn cause the displayed video stream to freeze, that is, the display stalls between two consecutive frames, so the smoothness requirement of image display cannot be met. Given image format conversion that already satisfies the real-time requirement, in order to also meet the smoothness requirement of image display, it must be ensured before the image to be processed is marked that the processing duration does not exceed the first preset duration. The display of the video stream formed by the image to be processed and the previous frame image then does not freeze, and the real-time visualization requirement for the first object marking information can be met. In summary, the ARM development board's parallel data processing capability, the mechanism of deciding based on the processing duration whether to mark the image to be processed, and the first object marking information stored in the memory together meet the real-time visualization requirement of image processing.

The defects of the above solutions were identified by the inventors through practice and careful research. Therefore, the process of discovering the above problems, and the solutions proposed below in the present disclosure for those problems, should both be regarded as contributions made by the inventors in the course of the present disclosure.

It should be noted that like numerals and letters denote similar items in the following figures, therefore, once an item is defined in one figure, it does not require further definition and explanation in subsequent figures.

The specific terms involved in the embodiments of the present disclosure are introduced below:

ARM processor (Advanced RISC Machines, ARM) is a RISC microprocessor with low power consumption and cost.

ARM development board, that is, an embedded development board with ARM core chip as the CPU and additional peripheral functions, used to evaluate the functions of the core chip and develop products of various technology companies.

The central processing unit (CPU for short) is the computing and control core of the computer system, and is the final execution unit for information processing and program operation.

Graphics Processing Unit (GPU), also known as display core, visual processor, or display chip, is a microprocessor dedicated to image- and graphics-related operations on personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones).

RISC: Reduced Instruction Set Computing (RISC) is a microprocessor that executes fewer types of computer instructions.

OpenCV is a cross-platform computer vision and machine learning software library released under the BSD license (open source), which can run on Linux, Windows, Android and Mac OS operating systems. It is lightweight and efficient. It consists of a series of C functions and a small number of C++ classes. It also provides interfaces for languages such as Python, Ruby, and MATLAB, and implements many general-purpose algorithms in image processing and computer vision.

FFmpeg is a set of open source computer programs that can be used to record, convert digital audio and video, and convert them into streams.

Neon is a 128-bit SIMD (Single Instruction, Multiple Data) extension structure for ARM processors.

YUV is a color encoding method that is often used in various video processing components. YUV allows for reduced chroma bandwidth by taking human perception into account when encoding photos or videos. Here, Y represents brightness, and U and V represent chroma. YUV formats include the UYVY format and the NV12 format.

pack, a sorting method for managing added information, there is only a relationship between up, down, left, and right, and each added information is arranged in the order of addition.

Linear interpolation refers to an interpolation method in which the interpolation function is a first-degree polynomial; its interpolation error at the interpolation nodes is zero.

BGR is the default channel order of OpenCV, where B represents blue, G represents green, and R represents red.

QNX, a commercial Unix-like real-time operating system that complies with the POSIX specification. It is a hard real-time operating system based on priority preemption.

The code rate is the number of data bits transmitted per unit time during data transmission, such as kbps, which is thousands of bits per second.

H264 is a digital video compression format.

Real Time Streaming Protocol (RTSP) is an application layer protocol in the TCP/IP protocol system. It is a multimedia streaming protocol used to control audio or video, and it allows multiple streaming requests to be controlled at the same time.

Ubuntu is a Linux operating system mainly for desktop applications.

Android is a free and open source operating system based on the Linux kernel.

Linux is a UNIX-like operating system that is free to use and spread freely.

X86-64, short for 64-bit extended, is a 64-bit extension of the X86 architecture.

To facilitate the understanding of this embodiment, an image processing method disclosed in the embodiments of the present disclosure is first introduced in detail. The image processing method provided in the embodiments of the present disclosure is generally executed by an ARM processor in an ARM development board. The ARM development board here can store computer-readable instructions compiled by the QNX system. In some possible implementations, the image processing method may be implemented by the ARM processor calling computer-readable instructions stored in a memory.

The image processing method provided by the embodiment of the present disclosure will be described below by taking the execution subject as an ARM processor as an example.

Referring to FIG. 1 , which is a flowchart of an image processing method provided by an embodiment of the present disclosure, the method includes steps S101 to S104, wherein:

S101: Obtain the image to be processed and the processing time corresponding to the previous frame image of the image to be processed.

Wherein, the image to be processed and the previous frame of image may include an image captured by a shooting device, such as an image in a vehicle cabin captured by a vehicle camera, and the image may include an object, such as a driver and/or a passenger. The previous frame image may be a previous frame image of the current frame image (that is, the image to be processed).

The shooting device described here can be any camera driven by the QNX platform. Wherein, the QNX driver of the camera can be replaced, which is not specifically limited in this embodiment of the present disclosure.

The processing duration in this step is defined as follows:

In the case of performing tagging processing on the previous frame image, the processing duration is the duration corresponding to performing target format conversion on the previous frame image and performing tagging processing during the target format conversion process.

In the case where the marking process is not performed on the previous frame of image, the processing duration is the duration corresponding to the target format conversion of the previous frame of image.
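The two cases above can be sketched as a small timing wrapper. This is only an illustrative sketch: `timed_frame`, `convert`, and `mark` are hypothetical names, and the marking call is shown inline for simplicity, whereas the disclosure describes marking as running in its own process alongside the format conversion.

```python
import time


def timed_frame(frame, convert, mark=None):
    """Return (converted, marks, duration_ms) for one frame.

    If `mark` is given, the measured duration covers both the marking work
    and the format conversion; otherwise it covers the conversion only.
    """
    start = time.perf_counter()
    marks = mark(frame) if mark is not None else None
    converted = convert(frame)
    duration_ms = (time.perf_counter() - start) * 1000.0
    return converted, marks, duration_ms
```

The returned `duration_ms` is what the next frame would read as "the processing duration corresponding to the previous frame image".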

In some embodiments, the marking processing may be a process of marking an object in an image in the first target format, specifically including one or more of: detecting an object in the image in the first target format, identifying the type of the object in the image in the first target format, identifying a state of the object in the image in the first target format, and identifying an attribute of the object in the image in the first target format. Exemplarily, the image in the first target format can be an image in UYVY format, and a human face in the image in UYVY format can be marked, specifically by generating a detection frame for the human face. In some embodiments, the marking processing may be performed during the target format conversion.

In some embodiments, the target format conversion may include a first format conversion and a second format conversion. Performing the first format conversion on an image yields an image in the first target format, and performing the second format conversion on an image yields an image in the second target format. The image in the second target format may be transmitted to a display device for display. Exemplarily, the first target format may be the BGR format, and the second target format may be the NV12 format. The image in the first target format may then be a BGR image, and the image in the second target format may be an NV12 image.

In some embodiments, the environmental image captured by the shooting device may be acquired in real time, that is, the environmental image may be used as the image to be processed.

In some other embodiments, the image captured by the shooting device does not necessarily include an object, so images that do not include an object can be filtered out. Specifically, it can be detected whether the environment image contains a preset object, such as a person. If the preset object is included, the environment image containing the preset object is used as the image to be processed; if the preset object is not included, the environment image can be discarded and the next frame of image is processed.
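A minimal sketch of this filtering step follows. The detector is passed in as a callable because the disclosure does not specify how the presence of the preset object is determined; `select_frames` and `contains_preset_object` are hypothetical names.

```python
def select_frames(frames, contains_preset_object):
    """Yield only the frames in which the preset object is detected."""
    for frame in frames:
        if contains_preset_object(frame):
            yield frame  # this frame becomes an "image to be processed"
        # otherwise the frame is dropped and the next one is examined
```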

Exemplarily, the format of the image to be processed may include the UYVY format or the NV12 format among the YUV formats. A self-developed image acquisition process on the ARM processor acquires, in real time, the image to be processed in UYVY or NV12 format collected by the camera, then makes two copies of the image to be processed and stores them in memory A and memory B respectively.
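The buffer handling can be illustrated as follows. This is a sketch under stated assumptions: `frame_size` and `duplicate_frame` are hypothetical helpers, and ordinary byte arrays stand in for memory A and memory B. The byte counts follow from the formats themselves: UYVY stores 16 bits per pixel and NV12 stores 12 bits per pixel.

```python
def frame_size(width: int, height: int, fmt: str) -> int:
    """Number of bytes in one frame of the given YUV format."""
    if fmt == "UYVY":
        return width * height * 2        # 16 bits per pixel
    if fmt == "NV12":
        return width * height * 3 // 2   # 12 bits per pixel
    raise ValueError(f"unsupported format: {fmt}")


def duplicate_frame(raw: bytes):
    """Copy one captured frame into two independent buffers."""
    memory_a = bytearray(raw)  # later read by the format conversion process
    memory_b = bytearray(raw)  # later read by the object recognition process
    return memory_a, memory_b
```

Keeping two independent copies lets the conversion and recognition work proceed without either one waiting on the other's buffer.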

S102: Perform first format conversion on the image to be processed to obtain a first target image in the first target format.

In some embodiments, in order to ensure the smoothness of the video stream displayed by the target device in real time, during the image processing process, it is necessary to ensure that the image format conversion process is processed first, that is, configure a higher priority for this process.

During specific implementation, a preset high-priority image format conversion process may be used to perform first format conversion on the image to be processed to obtain a first target image in the first target format. Wherein, the first target format may be a BGR format, and the first target image in the BGR format is a BGR image.

Exemplarily, taking an image to be processed in UYVY format captured by a vehicle-mounted camera as an example, the high-priority image format conversion process obtains the image to be processed stored in memory A and performs the first format conversion on it, converting the UYVY format into the BGR format to obtain a BGR image.
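The per-pixel arithmetic of a UYVY to BGR conversion can be sketched as below. This is a scalar pure-Python illustration only; it does not reproduce the Neon-accelerated conversion described in this disclosure. The full-range BT.601 coefficients are an assumption, since the disclosure does not state which conversion matrix is used, and `uyvy_to_bgr` and `clamp` are hypothetical names. The image width is assumed to be even, because UYVY packs two pixels into four bytes (U, Y0, V, Y1).

```python
def clamp(value: float) -> int:
    """Round to the nearest integer and limit the result to 0..255."""
    return max(0, min(255, int(round(value))))


def uyvy_to_bgr(data: bytes, width: int, height: int) -> bytearray:
    """Convert a packed UYVY frame to packed BGR (3 bytes per pixel)."""
    out = bytearray(width * height * 3)
    o = 0
    for i in range(0, width * height * 2, 4):
        u, y0, v, y1 = data[i], data[i + 1], data[i + 2], data[i + 3]
        d, e = u - 128, v - 128  # chroma shared by the two pixels
        for y in (y0, y1):
            out[o] = clamp(y + 1.772 * d)                        # B
            out[o + 1] = clamp(y - 0.344136 * d - 0.714136 * e)  # G
            out[o + 2] = clamp(y + 1.402 * e)                    # R
            o += 3
    return out
```

Each group of four input bytes produces two output pixels independently of every other group, which is the property that makes this conversion suitable for SIMD processing.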

S103: In the case that the processing duration does not exceed the first preset duration, obtain the first object marking information by marking the image to be processed, and mark the first object marking information on the first target image to obtain the second target image.

Wherein, the first object tag information may include tag information of objects in the image to be processed, or tag information of objects in historical images whose shooting time difference from the image to be processed is less than a second preset duration.

In general, the process of tagging images is slow, which may lead to a long processing time, which will cause the displayed video stream to freeze, that is, the display of two consecutive frames of images freezes, which cannot meet the smoothness requirements of image display. In order to meet the fluency requirements of image display, it is necessary to ensure that the processing time does not exceed the first preset time length before marking the image to be processed, so that the display process of the video stream composed of the image to be processed and the previous frame image will not be stuck , in order to meet the real-time visualization requirements of the first object label information.
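The gating decision described above reduces to a single comparison. In this sketch, `should_mark` is a hypothetical name, and the default threshold uses the 50 ms example value that this description gives for the first preset duration.

```python
FIRST_PRESET_DURATION_MS = 50.0  # example value from this description


def should_mark(prev_processing_ms: float,
                limit_ms: float = FIRST_PRESET_DURATION_MS) -> bool:
    """Mark the current frame only if the previous frame was handled in time."""
    return prev_processing_ms <= limit_ms
```

"Does not exceed" is read here as less than or equal to, so a previous frame that took exactly 50 ms still allows marking.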

In some embodiments, since the process of identifying the object in the image to be processed and generating the first object label information is relatively slow, in order not to affect the real-time performance of the image processing, the preset object recognition process can be set to low priority. Low priority has lower priority than high priority, and relatively fewer resources can be invoked. Exemplarily, in the case that the object is a driver, the object recognition process may include a face recognition algorithm module, that is, the driver's face may be recognized.

In some embodiments, the object in the image to be processed is marked while the first format conversion is being performed on the image to be processed. After the first format conversion is completed and the first target image is obtained, if the marking information obtained by marking the object in the image to be processed is available, that is, if the marking information of the object in the image to be processed has been stored in the memory, that marking information can be acquired from the memory and used as the first object marking information for marking the object in the image to be processed.

Here, the marking information of the object in the image to be processed is stored in the memory as follows: the image to be processed is acquired from memory B and passed to the object recognition process; an object recognition algorithm in the object recognition process, such as a face recognition algorithm, recognizes the object in the image to be processed to obtain the first object marking information of the object; and the first object marking information is stored in the memory, waiting to be called.

In some embodiments, because the process of identifying the object in the image to be processed and generating the marking information of the object is relatively slow, the marking information obtained by marking the object in the image to be processed may not yet be available when the first format conversion of the image to be processed is completed and the first target image is obtained. In that case, the marking information of an object in a historical image whose shooting time differs from that of the image to be processed by less than a second preset duration may be obtained from the memory and used as the first object marking information. For example, the marking information of the object in the previous frame image is used as the first object marking information.

With this embodiment, when the marking information of the object in the image to be processed has been obtained, it is used to mark the object in the image to be processed; when it cannot be obtained, the marking information of the object in a historical image can be used instead. Because the shooting time difference between the image to be processed and the historical image is less than the second preset duration, the marking information of the object in the historical image does not differ greatly from that of the object in the image to be processed, so it can be used to mark the object in the image to be processed.

Exemplarily, because the process of recognizing objects is slow, when the first target image is obtained after the high-priority first format conversion, the marking information corresponding to the object in the image to be processed may not yet have been generated, that is, it has not yet been stored in the memory. In that case, the marking information that is already stored in the memory and corresponds to the object in a historical image whose shooting time differs from that of the image to be processed by less than the second preset duration can be called. The historical images whose shooting time differs from that of the image to be processed by less than the second preset duration may include the three frames of images before the current frame of the image to be processed. Here, in order to reduce the memory used for storing marking information, only the marking information corresponding to the objects in the last three frames of historical images may be saved, which reduces the memory occupied and increases the running speed of the algorithm. It should be noted that the second preset duration may also be set according to a specific application scenario, which is not limited in this embodiment of the present disclosure.
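The fallback lookup and the three-frame retention described above can be sketched with a bounded queue. `MarkCache`, its method names, and the use of capture timestamps in milliseconds as keys are assumptions made for illustration; the disclosure does not specify the data structure.

```python
from collections import deque


class MarkCache:
    """Keep marking information for the most recent frames only."""

    def __init__(self, max_frames: int = 3):
        # Each entry is (capture_time_ms, marks); old entries drop off.
        self._entries = deque(maxlen=max_frames)

    def store(self, capture_time_ms, marks):
        self._entries.append((capture_time_ms, marks))

    def lookup(self, capture_time_ms, second_preset_ms):
        """Return marks for this frame, else the newest recent-enough marks."""
        for t, marks in reversed(self._entries):
            if t == capture_time_ms:
                return marks  # recognition finished for the current frame
        for t, marks in reversed(self._entries):
            if 0 < capture_time_ms - t < second_preset_ms:
                return marks  # fall back to a recent historical frame
        return None
```

With `max_frames=3`, storing a fourth entry silently evicts the oldest one, which matches the idea of keeping only the last three frames of marking information.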

The first object label information may include but not limited to at least one of object detection frame information, object identifier, object state characteristic information, and object attribute characteristic information. Wherein, the detection frame information may include the coordinates of the center point of the detection frame, the size information of the detection frame, that is, the length and width, and the like. The object's identity identifier may be an identity indicating the identity information of the object, such as a driver's identity or a passenger's identity. The state characteristic information of the object may include the behavior of the object, such as playing with a mobile phone, holding the steering wheel, not wearing a seat belt, and so on. The attribute feature information of the object may include age stage attributes, such as old people, adults, children, and so on.
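One possible in-memory representation of this marking information is sketched below. The class and field names are hypothetical; every field is optional because the text requires only "at least one of" the listed items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectionFrame:
    center_x: int
    center_y: int
    width: int
    height: int

    def corners(self):
        """Return (left, top, right, bottom) pixel coordinates, inclusive."""
        left = self.center_x - self.width // 2
        top = self.center_y - self.height // 2
        return left, top, left + self.width - 1, top + self.height - 1


@dataclass
class ObjectMark:
    frame: Optional[DetectionFrame] = None
    identity: Optional[str] = None   # e.g. "driver" or "passenger"
    state: Optional[str] = None      # e.g. "not wearing a seat belt"
    attribute: Optional[str] = None  # e.g. "adult"
```

Converting the center-and-size form to corner coordinates is convenient when the detection frame later has to be drawn on the image.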

With regard to the first preset duration, for example, considering comprehensively the playback fluency of video streams composed of multiple frames of images, based on experience, the first preset duration may be set to 50 ms. It should be noted that, in different application scenarios, the first preset duration may also be set to other values according to empirical values, which are not limited in this embodiment of the present disclosure.

Here, the second target image is the first target image marked with the first object marking information.

S104: Convert the second target image into a third target image, and send the third target image to the target device.

Here, since the second target image cannot be displayed by the target device, in this step the second target image is converted into a third target image that can be displayed on the target device. Specifically, a second format conversion is performed on the second target image to obtain a third target image in a second target format. The second target format is an image format that can be displayed by the target device; specifically, the second target format may be the NV12 format, and the third target image may be an image in NV12 format. Here, the third target image contains the first object label information.

Since the target device cannot display an image in the first target format, the second target image in the first target format needs to undergo the second format conversion to obtain a third target image in the second target format that can be displayed by the target device. The second format conversion utilizes the parallel data processing function of the ARM development board, which can increase the speed of converting the second target image into a displayable third target image.

Afterwards, the converted third target image can be stored in memory C.

The target device includes a display screen for displaying the third target image. To send the third target image to the target device, the third target image in the memory C is obtained by a preset video encoding process and then encoded. For example, the video encoding process calls the video encoding interface supported by the QNX platform and a video processing unit to encode the third target image, where the video processing unit is configured with an image encoding strategy and a code rate. The third target image is encoded into an H264 video stream, and the encoded H264 data is then published to the target device through an RTSP server for display. The RTSP server is built on a specific network component library of QNX.

For the previous frame image in S101, the processing of the previous frame image may include performing target format conversion on the previous frame image; therefore, the processing duration may be the duration of the target format conversion of the previous frame image.

In some embodiments, when it is determined that the image processing duration of the frame preceding the previous frame image exceeds the first preset duration, the previous frame image is not marked. In other words, the processing duration of the previous frame image is the duration of the target format conversion of the previous frame image and does not include any duration of marking processing. Based on this, target format conversion is performed on the previous frame image. During specific implementation, the first format conversion can be performed on the previous frame image to obtain a first format image in the first target format, and the second format conversion is then performed on the first format image to obtain a second format image in the second target format.

Exemplarily, when the first target format is the BGR format, the first format image is a BGR image, and when the second target format is the NV12 format, the second format image is an NV12 image.

In some other embodiments, when it is determined that the image processing duration of the frame preceding the previous frame image does not exceed the first preset duration, the previous frame image may be marked. In other words, the processing duration of the previous frame image is the duration of the target format conversion of the previous frame image together with the marking processing performed during that conversion. Based on this, target format conversion is performed on the previous frame image, with marking processing performed during the conversion. In specific implementation, the first format conversion is first performed on the previous frame image to obtain a first format image in the first target format. Next, second object marking information is acquired from the memory and marked on the first format image to obtain a first format marked image, where the second object marking information may include the marking information of the object in the previous frame image, or in a historical image whose shooting time differs from that of the previous frame image by less than the second preset duration. Finally, the second format conversion is performed on the first format marked image to obtain a second format image in the second target format. This satisfies the real-time visualization requirement of marking the object in the previous frame image and obtaining and displaying the second object marking information.

Here, the second object marking information of the object includes at least one of the detection frame information of the object, the identity identifier of the object, the state characteristic information of the object, and the attribute characteristic information of the object.

In some embodiments, the processing duration may exceed the first preset duration because marking the object in the previous frame image took too long. To meet the requirements for real-time image processing and smooth image display, when the processing duration exceeds the first preset duration, the second format conversion can be performed directly on the first target image to obtain a fourth target image in the second target format, and the fourth target image is sent to the target device.

Here, when the processing duration exceeds the first preset duration, performing the second format conversion directly on the first target image avoids the situation in which subsequent marking of the first target image would again push the processing duration past the first preset duration, thereby guaranteeing real-time image processing and smooth image display.
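Exemplarily, the choice between the marked path (third target image) and the unmarked path (fourth target image) can be sketched as follows. The function `process_frame` and the three stand-in steps are hypothetical and only record which stages ran; the 50 ms threshold is the example value given above for the first preset duration.

```python
FIRST_PRESET_MS = 50  # example value from the description


def process_frame(frame, prev_duration_ms, first_convert, mark, second_convert):
    """Return the image to send; marking is skipped after a slow previous frame."""
    first_target = first_convert(frame)
    if prev_duration_ms <= FIRST_PRESET_MS:
        second_target = mark(first_target)
        return second_convert(second_target)   # third target image
    return second_convert(first_target)        # fourth target image


# Stand-in steps that only record which stages ran (hypothetical).
def to_first_format(img):
    return img + ["BGR"]


def add_marks(img):
    return img + ["marked"]


def to_second_format(img):
    return img + ["NV12"]


fast = process_frame(["frame"], 30, to_first_format, add_marks, to_second_format)
slow = process_frame(["frame"], 70, to_first_format, add_marks, to_second_format)
```

In this sketch both paths perform the first and second format conversions, so a frame is always delivered; only the marking step is conditional.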

Exemplarily, the fourth target image may be an image in NV12 format.

The fourth target image is sent to the target device, and the target device displays it. For a detailed description of displaying the fourth target image on the target device, refer to the above process of sending the third target image to the target device, which is not repeated here.

Here, if one of two consecutive frames lacks the first object label information, the display effect perceived by the user is not affected, and the real-time visualization requirement of image processing can still be met.

In one or more embodiments, when the processing duration does not exceed the first preset duration, since identifying the object in the image to be processed and generating the first object tag information is relatively slow, a low priority, that is, the second preset priority, is set for the process that performs object recognition on the image to be processed, and a high priority, that is, the first preset priority, is set for the image format conversion process. This ensures that the real-time requirement of converting the image format and sending the image to the target device for display is met.

For the image format conversion process, during specific implementation, the first preset priority is obtained first; then, according to the first preset priority, a second resource is allocated to the image processing process, and the image processing process uses the second resource to perform the first format conversion and/or the second format conversion on the image to be processed. In addition, optionally, the image processing process may also perform marking processing on the image to be processed, on the condition that the first format conversion and the second format conversion are given precedence.

Here, the image processing refers to performing at least one of the first format conversion, the second format conversion, and marking processing on the image to be processed by using the second resource. Wherein, for the marking process in the image processing, it may also be determined based on the processing duration whether the marking process is performed on the image to be processed.

For the object recognition process, during specific implementation, the second preset priority is obtained, a first resource is allocated to the object recognition process according to the second preset priority, and the first resource is used to perform at least one of object detection, object recognition, object state recognition, and object attribute recognition on the image to be processed, so as to determine the first object tag information of the object in the image to be processed. Here, the resources refer to system resources or computing resources, such as system memory and CPU.

Here, the first preset priority is higher than the second preset priority, the first preset priority is the above-mentioned high priority, and the second preset priority may be the above-mentioned low priority. The resource amount of the first resource is less than the resource amount of the second resource.

Here, for the process of performing object recognition on the image to be processed and determining the first object label information of the object in the image to be processed, refer to the above detailed description of determining the first object label information; the details are not repeated here.

As mentioned above, since the acquisition frequency of the image acquisition device is relatively high, the previous frame image is relatively similar to the image to be processed, so the first object tag information that is stored in the memory and corresponds to the previous frame image can also be used to mark the current image to be processed without affecting the display effect perceived by the user. Therefore, the processing in this embodiment can meet the real-time visualization requirement of image processing.

Regarding the above S101-S104, refer to FIG. 2, which is a schematic diagram of a specific implementation flow of the image processing process. Reference numeral 21 represents the image to be processed acquired by the shooting device, reference numeral 22 represents the image format conversion process, reference numeral 23 represents the object recognition process, and reference numeral 24 represents the first object tag information stored in the memory. Reference numeral 25 represents judging whether marking of the first target image is allowed, where the allowed case may include the case in which the processing duration does not exceed the first preset duration, and the disallowed case may include the case in which the processing duration exceeds the first preset duration. Reference numeral 26 indicates that, when marking is allowed, the first target image is marked to obtain a second target image, and reference numeral 27 indicates that the second format conversion is performed on the second target image including the mark and the converted image is stored in the memory C. Reference numeral 28 indicates that, when marking of the first target image is not allowed, the second format conversion is performed directly on the first target image and the converted image is stored in the memory C. Reference numeral 29 denotes the memory C.

For a specific implementation of the above first format conversion of the image to be processed, refer to the following S1021-S1024:

S1021: Acquire first color coding information of the image to be processed. The first color coding information includes a plurality of pieces of first brightness information and multiple sets of first color information; each pixel in the image to be processed corresponds to one piece of first brightness information, and at least one piece of first brightness information corresponds to one set of first color information.

In this step, the first color coding information may include coding information for encoding the image to be processed in the UYVY format, where the UYVY format is one of the YUV formats with 2:1 horizontal sampling and full vertical sampling. Alternatively, it may be coding information for encoding the image to be processed in the NV12 format, where the NV12 format is one of the YUV formats with 2:1 horizontal sampling and 2:1 vertical sampling. Alternatively, it may be coding information for encoding the image to be processed in the AYUV format, where the AYUV format is one of the fully sampled YUV formats.

Exemplarily, taking the image to be processed as an image in UYVY format as an example, the first color coding information includes multiple first brightness information, that is, multiple Ys; and multiple sets of first color information, that is, multiple sets of UV. Each pixel of the image to be processed corresponds to a Y, and every two pieces of first brightness information correspond to a set of first color information.

In some embodiments, the image size of the image to be processed is also acquired, including the width and height of the image to be processed.

S1022: Based on a first sort order of the first color coding information, extract the plurality of pieces of first brightness information in parallel to obtain a first information sequence, and, based on the first sort order of the first color coding information, extract the multiple sets of first color information in parallel to obtain a second information sequence.

In this step, the first sort order of the first color coding information may be a packed order, that is, the elements are arranged in the order in which they are added.

For example, the UYVY format is arranged in the order of addition, so the first sort order of the resulting first color coding information is UYVYUYVY..., where Y, U, and V are the elements of the pixels. In addition, the address of each element in the first sort order UYVYUYVY... may be determined based on the acquired image size of the image to be processed, and the elements at the corresponding positions may be extracted in parallel according to their addresses.

Here, the arrangement of the first brightness information in the first color coding information follows a certain sorting feature, for example, it is located at odd-numbered positions or at even-numbered positions. Continuing the above example, the sorting positions of UYVYUYVY... are the 0th, 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, .... In UYVYUYVY..., the first brightness information Y is located at odd-numbered positions, and the first color information U and V is located at even-numbered positions.

During specific implementation, after the first sort order of the first color coding information is determined, the address of the first element of the image to be processed may be determined. The parallel processing performance information of the ARM development board is determined based on the storage capacity of the registers in ARM. Then, based on the first sort order of the first color coding information and the parallel processing performance information, and starting from the first element address, Neon can be used to extract, in parallel and in sequence, the plurality of pieces of first brightness information in the first sort order to obtain the first information sequence. Likewise, based on the first sort order and the parallel processing performance information, and starting from the first element address, Neon can be used to extract, in parallel and in sequence, the multiple sets of first color information in the first sort order to obtain the second information sequence.

Here, each element occupies 8 bits, and a further 8 bits carrying the sign (positive "+" or negative "-") are attached to the element for the calculations performed during image format conversion. Therefore, each element needs to occupy 16 bits of memory during calculation. The storage capacity of a register in ARM may be 128 bits, so its parallel processing performance information may include extracting 16 unsigned elements, or 8 signed elements, in parallel.
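The lane counts stated above follow directly from dividing the register width by the element width. A small worked example, in which the helper name `lanes` is an assumption made for illustration:

```python
REGISTER_BITS = 128  # register width stated in the description


def lanes(element_bits):
    """Number of elements that fit in one register at the given element width."""
    return REGISTER_BITS // element_bits


unsigned_lanes = lanes(8)   # raw 8-bit elements
signed_lanes = lanes(16)    # elements widened to 16 bits for signed arithmetic
```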

Exemplarily, taking the image to be processed as an image in UYVY format as an example, the first sort order is UYVYUYVY..., and the corresponding addresses can be 0, 1, 2, 3, 4, 5, 6, 7, .... It can thus be determined that the first brightness information occupies the odd positions in the first color coding information, that is, 1, 3, 5, 7, ..., and that the first color information occupies the even positions, that is, 0, 2, 4, 6, .... After that, starting from the first element address, Neon can be used to extract 8 pieces of first brightness information at odd positions in parallel each time, that is, the Y values at addresses 1, 3, 5, 7, 9, 11, 13, and 15; the extraction is then executed in a loop to keep obtaining first brightness information, and the first information sequence is determined as YYYYYYYY... based on the obtained pieces of first brightness information. Likewise, Neon can be used to extract 4 sets of first color information at even positions in parallel each time, namely 4 pieces of first color sub-information U and 4 pieces of second color sub-information V, that is, the U values at addresses 0, 4, 8, and 12 and the V values at addresses 2, 6, 10, and 14; the extraction is then executed in a loop to keep obtaining first color information, and the second information sequence is determined as UVUVUVUV... based on the obtained sets of first color information.
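The odd/even split of S1022 can be illustrated with a scalar sketch that models the batching but not the Neon instructions themselves. The function name `split_uyvy` and the sample byte values are assumptions; each loop iteration handles 8 luminance values (16 bytes), mirroring one parallel extraction step.

```python
def split_uyvy(data, lanes=8):
    """Split packed UYVY bytes into a Y sequence and an interleaved UV sequence.

    Each iteration consumes 2 * lanes bytes, i.e. `lanes` luminance values,
    which stands in for one parallel extraction step.
    """
    ys, uvs = bytearray(), bytearray()
    step = 2 * lanes
    for start in range(0, len(data), step):
        chunk = data[start:start + step]
        ys += chunk[1::2]    # odd positions hold Y
        uvs += chunk[0::2]   # even positions hold U, V, U, V, ...
    return bytes(ys), bytes(uvs)


# Two pixels share one UV pair: U0 Y0 V0 Y1 | U1 Y2 V1 Y3
packed = bytes([100, 10, 200, 11, 101, 12, 201, 13])
y_seq, uv_seq = split_uyvy(packed)
```

On real hardware the slicing inside the loop would be replaced by a de-interleaving vector load; the sketch only shows which addresses end up in which sequence.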

By extracting the plurality of pieces of first brightness information and the multiple sets of first color information in parallel, the first information sequence and the second information sequence corresponding to a plurality of pixels in the image to be processed can be obtained quickly, thereby improving the efficiency of image format conversion for the plurality of pixels in the image to be processed.

S1023: Based on the first quantity of the first brightness information corresponding to a set of first color information, the second information sequence, and the first information sequence, determine color coding sub-information corresponding to each pixel.

In this step, the first quantity of first brightness information corresponding to a set of first color information indicates how many pieces of first brightness information share one set of first color information. For example, for the UYVY format, a set of first color information corresponds to two pieces of first brightness information, that is, two pieces of first brightness information share one set of first color information; for the NV12 format, a set of first color information corresponds to four pieces of first brightness information, that is, four pieces of first brightness information share one set of first color information; for the AYUV format, a set of first color information corresponds to one piece of first brightness information, that is, each piece of first brightness information has its own set of first color information.

Here, the first color information may include first color sub-information and second color sub-information. Specifically, for example, if the first color information is UV, the first color sub-information may be U, and the second color sub-information may be V. The color coding sub-information corresponding to each pixel includes the first brightness information Y, the first color sub-information U, and the second color sub-information V.

Exemplarily, for an image to be processed in UYVY format, it may be determined that one set of first color information UV corresponds to two pieces of first brightness information Y, that is, the first quantity is two. When the first information sequence is YYYY and the second information sequence is UVUV, the first and second pieces of first brightness information Y in the first information sequence correspond to the first set of first color information UV, and the third and fourth pieces of first brightness information Y correspond to the second set of first color information UV. It can therefore be determined that the color coding sub-information corresponding to the first pixel in the image to be processed is the first piece of first brightness information Y and the first set of first color information UV; that of the second pixel is the second piece of first brightness information Y and the first set of first color information UV; that of the third pixel is the third piece of first brightness information Y and the second set of first color information UV; and that of the fourth pixel is the fourth piece of first brightness information Y and the second set of first color information UV.
Similarly, the color coding sub-information of every pixel in the image to be processed can be determined by repeating the above process on the other pieces of first brightness information and the other sets of first color information extracted in parallel by Neon.
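The pairing in S1023 amounts to an integer division of the pixel index by the first quantity. A minimal sketch follows; the function name `pair_pixels` and the sample values are hypothetical. The linear index mapping shown applies when sharing is purely horizontal, as in UYVY; a format such as NV12, in which a set of color information is shared across two rows, would need row and column indices instead.

```python
def pair_pixels(y_seq, uv_seq, first_quantity=2):
    """Attach to every Y the UV set it shares with its horizontal neighbours."""
    pixels = []
    for i, y in enumerate(y_seq):
        group = i // first_quantity            # index of the shared UV set
        u, v = uv_seq[2 * group], uv_seq[2 * group + 1]
        pixels.append((y, u, v))
    return pixels


# First information sequence YYYY and second information sequence UVUV.
pixels = pair_pixels([10, 11, 12, 13], [100, 200, 101, 201])
```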

S1024: Obtain a first target image in a first target format based on the color-coded sub-information corresponding to each pixel.

In this step, the first target format may include, but is not limited to, the BGR format. When the image to be processed is an image in UYVY format, the first target image in the first target format may be an image in BGR format.

During specific implementation, first, based on the color coding sub-information corresponding to each pixel, the third color coding information in the first target format corresponding to each pixel is determined; then, encoding is performed based on the third color coding information corresponding to each pixel to obtain a first target image in the first target format.

Here, the third color coding information in the first target format corresponding to each pixel in the image to be processed may be calculated by using a linear interpolation function.

Here, in the case that the first target format is the BGR format, the third color coding information may include element B, element G and element R.

Exemplarily, taking the conversion of an image to be processed in UYVY format into a first target image in BGR format as an example, for the color coding sub-information corresponding to a pixel, that is, Y₁, U₁, V₁, the linear interpolation function f(Y, U, V) is used to determine the third color coding information in BGR format corresponding to the pixel, recorded as B = αf(Y₁, U₁, V₁), G = βf(Y₁, U₁, V₁), R = γf(Y₁, U₁, V₁), where α, β, and γ denote the fixed coefficients in the linear interpolation function used to calculate the B element, the G element, and the R element of the pixel, respectively. The foregoing α, β, and γ may be set according to actual application scenarios and empirical values, and are not specifically limited in the embodiments of the present disclosure. After the third color coding information B, G, R of the pixel is determined, the pixel has been converted from the UYVY format to the BGR format. Similarly, every pixel in the image to be processed is converted in the same way, which finally yields the first target image in the first target format, that is, the BGR image.
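The disclosure leaves the coefficients α, β, and γ open. Purely as an illustration, the following sketch substitutes the widely used full-range BT.601 coefficients; this choice, the function names, and the sample values are assumptions and not part of the disclosure.

```python
def clamp(value):
    """Round a result and limit it to the 8-bit range."""
    return max(0, min(255, round(value)))


def yuv_to_bgr(y, u, v):
    """Convert one pixel using full-range BT.601 coefficients (an assumption)."""
    d, e = u - 128, v - 128  # signed chroma offsets, hence the wider lanes
    b = clamp(y + 1.772 * d)
    g = clamp(y - 0.344136 * d - 0.714136 * e)
    r = clamp(y + 1.402 * e)
    return b, g, r


gray = yuv_to_bgr(128, 128, 128)
reddish = yuv_to_bgr(76, 85, 255)
```

The offsets `d` and `e` can be negative, which is one reason the calculation needs signed 16-bit lanes even though the stored elements are 8 bits wide.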

In addition, after the BGR element of each pixel is calculated, the element corresponding to each pixel can be stored in memory D according to the order of each pixel, starting from the address of the first element in the BGR image.

The above S1021-S1024 use the Neon extension in the ARM development board and the registers built into the CPU to extract, in parallel, the plurality of pieces of first brightness information and the multiple sets of first color information from the first color coding information stored in the registers. Since parallel extraction can multiply the speed of information acquisition, image format conversion on the CPU can be accelerated several-fold, which can meet the needs of real-time image format conversion. This embodiment does not depend on image processing devices such as a GPU and can therefore reduce the hardware cost of image format conversion. In addition, this embodiment provides a general image format conversion method for ARM development boards for real-time image format conversion; by comparison, the power consumption and hardware cost of an ARM development board are lower.

As described above, the second format conversion is performed on the second target image to obtain a third target image in the second target format. The second target format corresponds to second color coding information, which includes second brightness information and second color information; each pixel in the third target image corresponds to one piece of second brightness information, and at least one piece of second brightness information corresponds to one set of second color information.

Exemplarily, the second target format may include, but is not limited to, the NV12 format, and the second color coding information includes second brightness information Y and second color information UV. Each pixel of an image in the NV12 format corresponds to one piece of second brightness information, and four pieces of second brightness information correspond to one set of second color information.

For a specific implementation of converting the second target image into a third target image in the second target format, refer to the following S301-S304:

S301: Obtain third color coding information corresponding to each pixel in the second target image.

Here, since the first object label information contained in the second target image is not involved in the second format conversion, the third color coding information corresponding to each pixel in the second target image is the third color coding information corresponding to each pixel in the first target image.

S302: Based on the third color coding information, perform parallel calculation to obtain second brightness information corresponding to each pixel in the third target image.

During specific implementation, starting from the third color coding information in BGR format determined above for each pixel, that is, B = αf(Y, U, V), G = βf(Y, U, V), R = γf(Y, U, V), a linear interpolation function is used in parallel calculations to obtain the second brightness information corresponding to each pixel, that is, Y = δf(B, G, R), where δ denotes the fixed coefficient in the linear interpolation function used to calculate the second brightness information Y of the pixel. δ may be defined according to empirical values and is not specifically limited in the embodiments of the present disclosure.

Here, the parallel calculation may use Neon to extract in parallel, from three registers, the eight 8-bit B elements, the eight 8-bit G elements, and the eight 8-bit R elements of the third color coding information stored there, yielding 8 groups of third color coding information BGR, that is, 8 pixels; the linear interpolation function Y = δf(B, G, R) is then applied to the 8 groups of BGR in parallel to obtain the second brightness information corresponding to each group of BGR (each pixel). The Neon parallel calculation is called in a loop until the format-converted second brightness information of every pixel of the second target image has been obtained, which gives the second brightness information corresponding to each pixel in the third target image.
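The batched luminance calculation of S302 can be sketched in scalar form as follows. The full-range BT.601 luminance weights stand in for the unspecified coefficient δ and are an assumption, as are the function names; the three input lists play the role of the three registers holding B, G, and R.

```python
def luma_batch(b_lane, g_lane, r_lane):
    """Compute Y for one batch of pixels held as three parallel lanes."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b)
            for b, g, r in zip(b_lane, g_lane, r_lane)]


def luma_plane(b_plane, g_plane, r_plane, lanes=8):
    """Walk the planes `lanes` pixels at a time, like the looped parallel step."""
    ys = []
    for start in range(0, len(b_plane), lanes):
        end = start + lanes
        ys += luma_batch(b_plane[start:end], g_plane[start:end], r_plane[start:end])
    return ys


# Ten pixels: white, black, then eight mid-gray ones.
b = [255, 0] + [128] * 8
g = [255, 0] + [128] * 8
r = [255, 0] + [128] * 8
y_values = luma_plane(b, g, r)
```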

S303: Based on the second quantity of the second brightness information corresponding to a set of second color information and the third color coding information, perform parallel calculation to obtain the second color information corresponding to each pixel in the third target image.

In this step, the second quantity of the second brightness information corresponding to a group of second color information may represent the quantity of the second brightness information sharing a group of second color information. Exemplarily, when the second target format is NV12 format, the second number is 4.

During specific implementation, sorting feature information of target pixels is determined based on the second quantity of second brightness information corresponding to a set of second color information, where the target pixels are the pixels used to determine the second color information. Based on the sorting feature information and the third color coding information corresponding to each pixel in the second target image, the third color coding information corresponding to the target pixels is determined. Parallel calculation is then performed based on the third color coding information corresponding to the target pixels to obtain the second color information corresponding to each pixel in the third target image.

Here, the number of target pixels determined depends on the second quantity. For example, when the second quantity is 4, that is, four pieces of second brightness information share one set of second color information, every four pixels in the second target image determine one target pixel, so the number of target pixels is a quarter of the number of pixels in the second target image. For details, refer to FIG. 3, which is a schematic diagram of target pixels determined from the second target image. In FIG. 3, 31 represents a 4×4 second target image; 32 represents the pixels in the second target image, 16 in total; and 33 represents the target pixels, 4 in total, that is, a quarter of the 16 pixels in the second target image.

When the second quantity is 4, the sorting feature information of the target pixels is the position information of the pixels located in even rows and even columns of the second target image; as shown in FIG. 3, these are row 0, column 0; row 0, column 2; row 2, column 0; and row 2, column 2.
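For a second quantity of 4, the target pixel positions can be enumerated with a stride of 2 in both directions. A minimal sketch, in which the function name `target_positions` is an assumption:

```python
def target_positions(height, width):
    """Positions at even rows and even columns, one per 2x2 block of pixels."""
    return [(row, col)
            for row in range(0, height, 2)
            for col in range(0, width, 2)]


positions = target_positions(4, 4)
```

For the 4×4 image of FIG. 3 this yields four positions, a quarter of the 16 pixels.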

Here, the parallel calculation may use Neon to extract in parallel, from three registers, the stored B, G, and R elements at the row and column positions given by the sorting feature information, thereby determining at least one group of third color coding information BGR, that is, at least one pixel; the linear interpolation functions U = εf(B, G, R) and V = θf(B, G, R) are then applied to the extracted BGR in parallel to obtain the second color information U and V corresponding to each group of BGR (each pixel). Here, ε denotes the fixed coefficient in the linear interpolation function used to calculate the second color information U of the pixel, and θ denotes the fixed coefficient in the linear interpolation function used to calculate the second color information V of the pixel; both may be defined according to empirical values and are not specifically limited in the embodiments of the present disclosure. Afterwards, the Neon parallel calculation is called in a loop until the format-converted second color information of every pixel of the second target image has been obtained, which gives the second color information corresponding to each pixel in the third target image.
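The chroma calculation of S303 can be sketched as follows for a second quantity of 4. The full-range BT.601 chroma weights stand in for the unspecified coefficients ε and θ and are an assumption, as are the function names and the sample image; only pixels in even rows and even columns contribute.

```python
def clamp(value):
    """Round a result and limit it to the 8-bit range."""
    return max(0, min(255, round(value)))


def bgr_to_uv(b, g, r):
    """Full-range BT.601 chroma (coefficients are an assumption)."""
    u = clamp(-0.168736 * r - 0.331264 * g + 0.5 * b + 128)
    v = clamp(0.5 * r - 0.418688 * g - 0.081312 * b + 128)
    return u, v


def chroma_plane(bgr_rows):
    """Build the interleaved UV sequence from pixels at even rows and columns."""
    uv = []
    for row in bgr_rows[0::2]:
        for b, g, r in row[0::2]:
            uv.extend(bgr_to_uv(b, g, r))
    return uv


gray = (128, 128, 128)
blue = (255, 0, 0)  # (B, G, R)
image = [[blue, gray, gray, gray],
         [gray, gray, gray, gray],
         [gray, gray, gray, gray],
         [gray, gray, gray, gray]]
uv_seq = chroma_plane(image)
```

This sketch takes the chroma of one representative pixel per 2×2 block, as the description does, rather than averaging the four pixels of the block.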

S304: Obtain a third target image with a second target format based on the second brightness information and second color information corresponding to each pixel in the third target image.

Specifically, the second color coding information corresponding to each pixel is determined based on the second brightness information and the second color information corresponding to each pixel of the third target image; the third target image in the second target format is then obtained based on the second color coding information.

Here, one pixel corresponds to one piece of second brightness information, and, according to the second number of pieces of second brightness information corresponding to one set of second color information, it is determined that the second number of pixels share one set of second color information.

Exemplarily, when the second sub-format is the NV12 format, the second brightness information and second color information corresponding to each pixel of the 4×4 third target image are determined, that is, YYYYYYYYYYYYYYYY and UVUVUVUV. The second color coding information corresponding to the pixels can then be YYYYYYYYYYYYYYYY UVUVUVUV, that is, the third target image in NV12 format is YYYYYYYYYYYYYYYYUVUVUVUV.

After the second brightness information and second color information of each pixel are calculated, the first piece of second brightness information in the generated third target image is stored at the preset first address of the second brightness information, and the remaining second brightness information is stored sequentially after it; the first set of second color information in the generated third target image is stored at the preset first address of the second color information, and the remaining second color information is stored sequentially after it. The third target image is subsequently read from the memory based on the first address of the second brightness information and the first address of the second color information.
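The storage layout described above can be sketched as follows. This is a minimal illustration assuming a single contiguous buffer in which the brightness (Y) data starts at offset 0 and the interleaved color (UV) data starts immediately after it; the disclosure only requires two preset first addresses, so the contiguous placement and the names `pack_nv12`, `y_base`, and `uv_base` are assumptions.

```python
def pack_nv12(y_plane, uv_pairs, width, height):
    """Store Y values, then interleaved U/V pairs, in one NV12 buffer.

    Returns the buffer together with the first address (offset) of the
    brightness data and of the color data. Contiguous placement of the
    two regions is an assumption made for this sketch.
    """
    if len(y_plane) != width * height:
        raise ValueError("expected one Y value per pixel")
    if len(uv_pairs) != (width // 2) * (height // 2):
        raise ValueError("expected one (U, V) pair per 2x2 block")

    y_base = 0
    uv_base = width * height
    buffer = bytearray(width * height * 3 // 2)

    # Brightness values are stored sequentially from the Y first address.
    buffer[y_base:uv_base] = bytes(y_plane)

    # Color values are stored sequentially from the UV first address.
    for index, (u, v) in enumerate(uv_pairs):
        buffer[uv_base + 2 * index] = u
        buffer[uv_base + 2 * index + 1] = v

    return buffer, y_base, uv_base


# A 4x4 image: 16 Y values followed by 4 interleaved UV pairs (24 bytes).
nv12, y_base, uv_base = pack_nv12(list(range(16)), [(100, 200)] * 4, 4, 4)
```

A consumer that knows `y_base` and `uv_base` can read both regions without scanning the buffer, which matches the address-based retrieval described in the text.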

In the above S301-S304, the second brightness information of a plurality of pixels can be calculated in parallel based on the third color coding information of those pixels stored in the registers of the ARM development board, and the second color information of the plurality of pixels can likewise be calculated in parallel from the same stored third color coding information. Compared with calculating the second brightness information and the second color information of each pixel sequentially, this embodiment can double the calculation efficiency of the second brightness information and the second color information, thereby improving the efficiency of image format conversion.

In some embodiments, for S303, a third number of registers is determined based on the second number of pieces of second brightness information corresponding to one set of second color information; the third number of registers are used to store the third color coding information, and parallel calculation is performed based on the third color coding information stored in the registers to obtain the second color information corresponding to each pixel of the third target image.

Exemplarily, when a BGR format image is converted into an NV12 format image, four pieces of second brightness information share one set of second color information. When the second color information is calculated in parallel and Neon is used to extract the third color coding information BGR, only 4 sets of BGR can be extracted at a time, that is, the BGR of even-numbered rows and even-numbered columns, or the BGR of odd-numbered rows and odd-numbered columns. Neon can process up to 8 sets of BGR at the same time, so using only the 4 sets of BGR extracted in parallel would waste Neon computing power. Therefore, the third color coding information stored in two registers (that is, the third number is 2) can be called at the same time so that 8 sets of BGR are extracted at once, and the linear interpolation function can be used to simultaneously calculate the second color information corresponding to 16 pixels. This improves the calculation efficiency of the second color information and thereby the efficiency of converting the BGR format image into the NV12 format image.
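The batching idea in this example can be sketched as follows, under the assumption that one register load covers 8 consecutive pixels of a row, so that its even columns supply 4 sets of BGR. The constant `PIXELS_PER_REGISTER` and the function `gather_chroma_inputs` are hypothetical names used only for illustration.

```python
PIXELS_PER_REGISTER = 8  # assumed number of pixels covered by one register load


def gather_chroma_inputs(row_bgr, registers_per_call=2):
    """Group the even-column BGR sets of a row into parallel batches.

    With one register per call, each batch holds 4 BGR sets; with two
    registers per call (third number = 2), each batch holds 8 BGR sets
    and fills all 8 lanes assumed for the parallel calculation.
    """
    span = PIXELS_PER_REGISTER * registers_per_call
    batches = []
    for start in range(0, len(row_bgr), span):
        chunk = row_bgr[start:start + span]
        batches.append(chunk[0::2])  # even columns within the chunk
    return batches


row = [(i, i, i) for i in range(16)]       # 16 pixels of one even row
half_full = gather_chroma_inputs(row, 1)   # 2 batches of 4 BGR sets
full = gather_chroma_inputs(row, 2)        # 1 batch of 8 BGR sets
```

Halving the number of calls for the same row is the efficiency gain the paragraph describes; the actual gain on hardware depends on details not given in the disclosure.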

In addition, an embodiment of the present disclosure also provides a detection method, which is executed by a displayable device, such as the above-mentioned target device. Its application scenario can be a vehicle driving scenario to supervise drivers and passengers.

The displayable device acquires, through the RTSP protocol, the image to be processed captured in the cabin, processes the image to be processed by the above image processing method, and displays the processed third target image. Based on the displayed third target image, a safety warning is given for the driving of the vehicle. For example, the third target image includes the first object marking information; the state feature information of the driver and/or passengers is determined based on the displayed first object marking information, and it is determined whether a safety warning is required. For example, when the driver's state feature information indicates that the driver is using a mobile phone or not wearing a seat belt, safety warning prompt information is sent to the driver in time. As another example, when the first object marking information indicates that the attribute feature information of the object is a child, and the child is not sitting in a safety seat, safety warning prompt information is sent to the passenger in time. Specific examples are not listed here one by one.

Based on the above embodiments, the ARM development board's ability to process data in parallel, for example through the single-instruction multiple-data (SIMD) parallel processing library Neon on the ARM development board and the registers built into the CPU, can double the speed of image format conversion and meet the real-time requirement of image processing. In addition, marking an image is normally slow and may lead to a long processing time, which causes the displayed video stream to freeze, that is, the display stalls between two consecutive frames, so the fluency requirement of image display cannot be met. Given image format conversion that satisfies the real-time requirement of image processing, to meet the fluency requirement of image display it is necessary to ensure, before marking the image to be processed, that the processing duration does not exceed the first preset duration; the display of the video stream composed of the image to be processed and the previous frame image then does not freeze, and the real-time visualization requirement of the first object marking information can be met. In summary, the ARM development board's ability to process data in parallel, the mechanism for deciding based on the processing duration whether to mark the image to be processed, and the first object marking information stored in the memory together meet the real-time visualization requirements of image processing.

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of each step should be determined by its function and possible internal logic.

Based on the same inventive concept, the embodiments of the present disclosure also provide an image processing device corresponding to the image processing method. Since the principle by which the image processing device in the embodiments of the present disclosure solves the problem is similar to that of the above image processing method, the implementation of the image processing device may refer to the implementation of the image processing method, and repeated descriptions are omitted.

Referring to FIG. 4, which is a schematic diagram of an image processing device provided by an embodiment of the present disclosure, the device includes: a first information acquisition module 401, an image conversion module 402, an image marking module 403, and a first image processing module 404; wherein,

The first information acquisition module 401 is configured to acquire the image to be processed and the processing duration corresponding to the previous frame image of the image to be processed; wherein, when the previous frame image is marked, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion; when the previous frame image is not marked, the processing duration is the duration of performing target format conversion on the previous frame image;

An image conversion module 402, configured to convert the image to be processed into a first format to obtain a first target image in a first target format;

An image marking module 403, configured to, when the processing duration does not exceed a first preset duration, obtain first object marking information by performing marking processing on the image to be processed, and mark the first object marking information on the first target image to obtain a second target image;

The first image processing module 404 is configured to convert the second target image into a third target image, and send the third target image to the target device.

In an optional implementation manner, the first image processing module 404 is configured to perform second format conversion on the second target image to obtain a third target image in the second target format; wherein the second target format is an image format that the target device can display.

In an optional implementation manner, the image marking module 403 is configured to: if marking information of the object in the image to be processed is obtained, use that marking information as the first object marking information; if marking information of the object in the image to be processed is not obtained, acquire the marking information of the object in a historical image whose shooting time differs from that of the image to be processed by less than a second preset duration, and use the marking information of the object in the historical image as the first object marking information.
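The fallback to historical marking information can be sketched as follows. This is a minimal illustration in which `history` is a hypothetical list of `(shooting_time, marking_info)` pairs ordered from oldest to newest; returning `None` when no historical image qualifies is an assumption, since the disclosure does not state what happens in that case.

```python
def select_marking_info(current_info, history, shooting_time,
                        second_preset_duration):
    """Choose the first object marking information for the current frame.

    current_info: marking result for the current frame, or None if the
        marking did not finish in time.
    history: (shooting_time, marking_info) pairs, oldest first
        (hypothetical structure).
    """
    if current_info is not None:
        return current_info

    # Prefer the most recent historical image within the allowed time gap.
    for past_time, past_info in reversed(history):
        if abs(shooting_time - past_time) < second_preset_duration:
            return past_info

    return None  # assumed behavior when nothing recent enough exists


history = [(0.2, "old boxes"), (0.8, "recent boxes")]
reused = select_marking_info(None, history, 1.0, 0.5)
```

Reusing marking information from a sufficiently recent frame keeps the overlay visible even when the current frame's marking result is not yet available.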

In an optional implementation manner, the first image processing module 404 is further configured to: after the processing duration is determined, if the processing duration exceeds the first preset duration, perform second format conversion on the first target image to obtain a fourth target image in the second target format; and send the fourth target image to the target device.

In an optional implementation manner, the device further includes an object recognition module 405 and a second image processing module 406;

The first information acquiring module 401 is further configured to acquire a first preset priority and a second preset priority after acquiring the image to be processed;

The object recognition module 405 is configured to allocate a first resource to an object recognition processing process according to the second preset priority, and use the first resource to perform object recognition on the image to be processed through the object recognition processing process, so as to determine the first object marking information of the object in the image to be processed;

The second image processing module 406 is configured to allocate a second resource to an image processing process according to the first preset priority, and use the second resource to perform, through the image processing process, at least one of the first format conversion and the second format conversion on the image to be processed.

In an optional implementation manner, the image conversion module 402 is configured to: obtain first color coding information of the image to be processed, where the first color coding information includes a plurality of pieces of first brightness information and a plurality of sets of first color information, each pixel in the image to be processed corresponds to one piece of first brightness information, and at least one piece of first brightness information corresponds to one set of first color information; based on the first sorting order of the first color coding information, extract the plurality of pieces of first brightness information in parallel to obtain a first information sequence, and, based on the first sorting order of the first color coding information, extract the plurality of sets of first color information in parallel to obtain a second information sequence; determine the color coding sub-information corresponding to each pixel based on the first quantity of first brightness information corresponding to one set of first color information, the second information sequence, and the first information sequence; and obtain a first target image in the first target format based on the color coding sub-information corresponding to each pixel.

In an optional implementation manner, the second target format corresponds to second color coding information; the second color coding information includes second brightness information and second color information; each pixel in the third target image corresponds to one piece of second brightness information, and at least one piece of second brightness information corresponds to one set of second color information;

The first image processing module 404 is configured to: obtain the third color coding information corresponding to each pixel in the second target image; based on the third color coding information, perform parallel calculation to obtain the second brightness information corresponding to each pixel in the third target image; based on the second quantity of second brightness information corresponding to one set of second color information and on the third color coding information, perform parallel calculation to obtain the second color information corresponding to each pixel in the third target image; and, based on the second brightness information and second color information corresponding to each pixel in the third target image, obtain a third target image in the second target format.

In an optional implementation manner, the first object marking information includes at least one of detection frame information of the object, an identifier of the object, state characteristic information of the object, and attribute characteristic information of the object.

In an optional implementation manner, the images to be processed include images captured in a vehicle cabin, and the objects include drivers and/or passengers.

For the description of the processing flow of each module in the image processing device and the interaction flow between the modules, reference may be made to the relevant description in the above embodiment of the image processing method, which will not be described in detail here.

Based on the same inventive concept, the embodiments of the present disclosure also provide a detection device corresponding to the detection method. Since the principle by which the detection device in the embodiments of the present disclosure solves the problem is similar to that of the above detection method, the implementation of the detection device may refer to the implementation of the detection method, and repeated descriptions are omitted.

Referring to FIG. 5 , it is a schematic diagram of a detection device provided by an embodiment of the present disclosure. The detection device includes: a second information acquisition module 501, a third image processing module 502, and an early warning module 503; wherein,

The second information acquisition module 501 is configured to acquire the image to be processed captured in the cabin and the processing duration corresponding to the previous frame image of the image to be processed; wherein, when the previous frame image is marked, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion; when the previous frame image is not marked, the processing duration is the duration of performing target format conversion on the previous frame image;

The third image processing module 502 is configured to: perform first format conversion on the image to be processed to obtain a first target image in the first target format; when the processing duration does not exceed the first preset duration, obtain first object marking information by performing marking processing on the image to be processed, and mark the first object marking information on the first target image to obtain a second target image; and convert the second target image into a third target image and display it;

The warning module 503 is configured to give a safety warning to the driving of the vehicle based on the displayed third target image.

For the description of the processing flow of each module in the detection device and the interaction flow between the modules, reference may be made to the relevant description in the above embodiment of the detection method, which will not be described in detail here.

Based on the same technical idea, the embodiment of the present application also provides a computer device. Referring to Figure 6, it is a schematic structural diagram of a computer device provided in the embodiment of the present application, including:

a processor 61, a memory 62, and a bus 63. The memory 62 stores machine-readable instructions executable by the processor 61, and the processor 61 is configured to execute the machine-readable instructions stored in the memory 62. When the machine-readable instructions are executed by the processor 61, the processor 61 performs the following steps: S101: acquire the image to be processed and the processing duration corresponding to the previous frame image of the image to be processed; S102: perform first format conversion on the image to be processed to obtain a first target image in the first target format; S103: if the processing duration does not exceed the first preset duration, obtain first object marking information by marking the image to be processed, and mark the first object marking information on the first target image to obtain a second target image; S104: convert the second target image into a third target image and send the third target image to the target device.
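The control flow of steps S101 to S104, together with the optional branch in which marking is skipped and a fourth target image is sent, can be sketched as follows. This is a simplified, sequential illustration: in the disclosure the marking runs during the first format conversion, whereas here the stages are plain callables invoked one after another. All function and parameter names are hypothetical.

```python
import time


def process_frame(image, previous_duration, first_preset_duration,
                  first_convert, mark, draw, second_convert, send):
    """Run one frame through the pipeline and return its processing duration.

    The conversion, marking, drawing, and sending stages are passed in as
    callables so that this sketch stays independent of any image library.
    """
    start = time.perf_counter()

    first_target = first_convert(image)                # S102
    if previous_duration <= first_preset_duration:     # S103
        marking_info = mark(image)
        second_target = draw(first_target, marking_info)
        output = second_convert(second_target)         # third target image
    else:
        output = second_convert(first_target)          # fourth target image
    send(output)                                       # S104

    return time.perf_counter() - start


sent = []
stages = dict(
    first_convert=lambda img: img + "->bgr",
    mark=lambda img: "box",
    draw=lambda target, info: target + "+" + info,
    second_convert=lambda target: target + "->nv12",
    send=sent.append,
)
fast = process_frame("frame1", 10.0, 40.0, **stages)
slow = process_frame("frame2", 50.0, 40.0, **stages)
```

The duration returned for one frame would be fed in as `previous_duration` for the next frame, which is how a slow frame causes marking to be skipped on the following one.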

The memory 62 includes an internal memory 621 and an external memory 622. The internal memory 621 is used to temporarily store operation data of the processor 61 and data exchanged with the external memory 622, such as a hard disk; the processor 61 exchanges data with the external memory 622 through the internal memory 621. When the computer device is running, the processor 61 communicates with the memory 62 through the bus 63, so that the processor 61 executes the instructions mentioned in the above method embodiments.

Embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is run by a processor, the steps of the image processing method described in the foregoing method embodiments are executed. Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present disclosure further provides a computer program product, including computer instructions; when the computer instructions are executed by a processor, the steps of the above image processing method are implemented. The computer program product may be any product capable of implementing the above image processing method, and part or all of the solution contributed by the computer program product may be embodied in the form of a software product, such as a software development kit (SDK). The software product may be stored in a storage medium, and the computer instructions it contains cause a relevant device or processor to execute some or all of the steps of the above image processing method.

Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the device described above may refer to the corresponding process in the foregoing method embodiments, and details are not repeated here. In the several embodiments provided in the present disclosure, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the modules is only a logical function division, and other divisions are possible in actual implementation: multiple modules or components may be combined, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some communication interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional module in each embodiment of the present disclosure may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.

If the functions are implemented in the form of software function modules and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, a network device, or the like) execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage media include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Finally, it should be noted that the above embodiments are only specific implementations of the present disclosure, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent replacements of some of their technical features. Such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and should be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be defined by the protection scope of the claims.

Claims

An image processing method, applied to an ARM development board, comprising:

Acquiring the image to be processed, and the processing duration corresponding to the previous frame image of the image to be processed; wherein, when the previous frame image is marked, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion; when the previous frame image is not marked, the processing duration is the duration of performing target format conversion on the previous frame image;

performing a first format conversion on the image to be processed to obtain a first target image having a first target format;

When the processing duration does not exceed a first preset duration, obtaining first object marking information by performing marking processing on the image to be processed, and marking the first object marking information on the first target image to obtain a second target image;

The second target image is converted to a third target image, and the third target image is sent to a target device.
The image processing method according to claim 1, wherein said converting the second target image into a third target image comprises:

performing second format conversion on the second target image to obtain a third target image in the second target format; wherein the second target format is an image format displayable by the target device.
The image processing method according to claim 1 or 2, wherein the first object marking information obtained by marking the image to be processed comprises:

Marking objects in the image to be processed during the first format conversion process of the image to be processed;

When the first format conversion of the image to be processed is completed to obtain the first target image, in response to acquiring marking information obtained by marking objects in the image to be processed, using the marking information of the objects in the image to be processed as the first object marking information;

When the first format conversion of the image to be processed is completed to obtain the first target image, in response to not acquiring marking information obtained by marking objects in the image to be processed, acquiring the marking information of objects in a historical image whose shooting time differs from that of the image to be processed by less than a second preset duration, as the first object marking information.
The image processing method according to any one of claims 1 to 3, wherein the method further comprises:

When the processing duration exceeds the first preset duration, performing second format conversion on the first target image to obtain a fourth target image having the second target format;

The fourth target image is sent to the target device.
The image processing method according to any one of claims 1 to 4, wherein the method further comprises:

In the case that the processing duration does not exceed the first preset duration,

Allocating a first resource to an object recognition processing process according to a second preset priority, and using the first resource to perform, through the object recognition processing process, at least one marking process of object detection, object recognition, object state recognition, and object attribute recognition on the image to be processed, to determine the first object marking information of an object in the image to be processed;

Allocating a second resource to an image processing process according to a first preset priority, and using the second resource to perform, through the image processing process, at least one of the first format conversion and the second format conversion on the image to be processed; wherein the first preset priority is higher than the second preset priority.
The image processing method according to claim 1, wherein said converting the image to be processed into a first format to obtain a first target image with a first target format comprises:

Acquiring first color coding information of the image to be processed; the first color coding information includes a plurality of pieces of first brightness information and a plurality of sets of first color information, each pixel in the image to be processed corresponds to one piece of first brightness information, and at least one piece of first brightness information corresponds to one set of first color information;

Based on the first sorting order of the first color coding information, extracting the plurality of pieces of first brightness information in parallel to obtain a first information sequence, and, based on the first sorting order of the first color coding information, extracting the plurality of sets of first color information in parallel to obtain a second information sequence;

Determine the color coding sub-information corresponding to each pixel based on the first quantity of the first brightness information corresponding to a set of first color information, the second information sequence, and the first information sequence;

Based on the color-coded sub-information corresponding to each pixel in the image to be processed, a first target image in the first target format is obtained.
The image processing method according to claim 2, wherein the second target format corresponds to second color coding information; the second color coding information includes second brightness information and second color information; each pixel in the third target image corresponds to one piece of second brightness information, and at least one piece of second brightness information corresponds to one set of second color information;

The converting the second target image to the second format to obtain a third target image with the second target format includes:

Acquiring third color coding information corresponding to each pixel in the second target image;

Obtaining second brightness information corresponding to each pixel in the third target image through parallel calculation based on the third color coding information;

Based on the second quantity of second brightness information corresponding to one set of second color information and on the third color coding information, performing parallel calculation to obtain the second color information corresponding to each pixel in the third target image;

A third target image with the second target format is obtained based on the second brightness information and second color information corresponding to each pixel in the third target image.
The image processing method according to any one of claims 1 to 7, wherein the first object marking information includes at least one of detection frame information of the object, an identity identifier of the object, state feature information of the object, and attribute feature information of the object.
The image processing method according to any one of claims 1 to 8, characterized in that the image to be processed includes an image taken in a vehicle cabin, and the object includes a driver and/or a passenger.
A detection method, characterized in that, comprising:

Acquiring an image to be processed captured in a cabin, and the processing duration corresponding to the previous frame image of the image to be processed; wherein, when the previous frame image is marked, the processing duration is the duration of performing target format conversion on the previous frame image and performing marking processing during the target format conversion; when the previous frame image is not marked, the processing duration is the duration of performing target format conversion on the previous frame image;

performing a first format conversion on the image to be processed to obtain a first target image having a first target format;

If the processing duration does not exceed a first preset duration, obtaining first object marking information by performing marking processing on the image to be processed, and marking the first object marking information on the first target image to obtain a second target image;

converting the second target image into a third target image and displaying it;

issuing a safety warning regarding the driving of the vehicle based on the displayed third target image.
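The gating on the previous frame's processing duration in the method above can be sketched as follows. This is a minimal illustration, not the claimed implementation: `process_frame` and the four stage callables are hypothetical names, the clock is injectable so the example is deterministic, and the behavior when the preset duration is exceeded (here, marking is simply skipped for that frame) is an assumption, since this claim only specifies the case in which it is not exceeded.

```python
import time

def process_frame(image, prev_duration, preset_duration,
                  convert_first, mark, overlay, convert_second,
                  clock=time.perf_counter):
    """Process one frame; returns (third_target, duration, marked)."""
    start = clock()
    first_target = convert_first(image)            # first format conversion
    marked = prev_duration <= preset_duration      # gate on the previous frame
    if marked:
        marking_info = mark(image)                 # e.g. detection frames
        second_target = overlay(first_target, marking_info)
    else:
        second_target = first_target               # assumption: skip marking
    third_target = convert_second(second_target)   # second format conversion
    # The measured duration covers conversion, plus marking if it was performed.
    return third_target, clock() - start, marked

# Usage with stub stages and a fake clock (all values are illustrative).
ticks = iter([0.0, 0.03, 1.0, 1.01])
stages = dict(
    convert_first=lambda img: f"fmt1({img})",
    mark=lambda img: f"boxes({img})",
    overlay=lambda tgt, info: f"{tgt}+{info}",
    convert_second=lambda tgt: f"fmt2({tgt})",
    clock=lambda: next(ticks),
)
out1, d1, m1 = process_frame("frame1", 0.0, 0.02, **stages)
out2, d2, m2 = process_frame("frame2", d1, 0.02, **stages)
```

In this run the first frame is marked and takes 0.03 s on the fake clock, which exceeds the 0.02 s preset, so the second frame is converted without marking.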
An image processing device, characterized in that it comprises:

A first information acquisition module, configured to acquire an image to be processed and a processing duration corresponding to a previous frame image of the image to be processed; wherein, in the case that marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing the marking processing during the target format conversion; and in the case that marking processing was not performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image;

An image conversion module, configured to perform a first format conversion on the image to be processed to obtain a first target image having a first target format;

An image marking module, configured to, when the processing duration does not exceed a first preset duration, obtain first object marking information by performing marking processing on the image to be processed, and mark the first object marking information on the first target image to obtain a second target image;

A first image processing module, configured to convert the second target image into a third target image, and send the third target image to a target device.
A detection device, characterized in that it comprises:

A second information acquisition module, configured to acquire an image to be processed captured in a vehicle cabin and a processing duration corresponding to a previous frame image of the image to be processed; wherein, in the case that marking processing was performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image and performing the marking processing during the target format conversion; and in the case that marking processing was not performed on the previous frame image, the processing duration is the duration of performing target format conversion on the previous frame image;

A third image processing module, configured to: perform a first format conversion on the image to be processed to obtain a first target image having a first target format; when the processing duration does not exceed a first preset duration, obtain first object marking information by performing marking processing on the image to be processed, and mark the first object marking information on the first target image to obtain a second target image; and convert the second target image into a third target image and display it;

An early warning module, configured to issue a safety warning regarding the driving of the vehicle based on the displayed third target image.
A computer device, characterized in that it includes: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device is running, the processor and the memory communicate through the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the image processing method according to any one of claims 1 to 9, or perform the steps of the detection method according to claim 10.
A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the image processing method according to any one of claims 1 to 9 are executed, or the steps of the detection method according to claim 10 are executed.