WO2024001764A1 - Image processing method and apparatus, storage medium and electronic apparatus - Google Patents

Image processing method and apparatus, storage medium and electronic apparatus

Info

Publication number
WO2024001764A1
WO2024001764A1 · PCT/CN2023/100014
Authority
WO
WIPO (PCT)
Prior art keywords
abscissa
image
ordinate
moving target
maximum moving
Prior art date
Application number
PCT/CN2023/100014
Other languages
English (en)
French (fr)
Inventor
闫心刚
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2024001764A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • Embodiments of the present disclosure relate to the field of communications, and specifically, to an image processing method, device, storage medium, and electronic device.
  • Figure 1 is a schematic diagram of the motion detection algorithm of a camera main control chip in the related art. As shown in Figure 1, it mainly includes three steps:
  • Step 1: obtain the background frame. Let the current video frame be the nth frame, and take the moving average of the previous n-1 frames as the background frame. Because the background frame depends on the previous n-1 frames, after the motor stops during tracking it is necessary to wait another n-1 frames before a correct background frame can be updated for the next moving-target extraction, so the real-time performance of tracking is poor (this has nothing to do with the amount of computation; it is simply the wait of n-1 frames).
  • Step 2: binarize the difference image. Subtract the background frame from the current frame and take the absolute value to obtain the difference image. Set a single threshold Th; values in the difference image greater than Th are set to 1 and values less than or equal to Th are set to 0, yielding a binary image. With such single-threshold binarization it is difficult to choose a threshold that both extracts the changed image regions and suppresses image noise, and when the color of the moving target is close to that of the image background, the moving-target region cannot be extracted.
  • Step 3: find the moving-target region. Divide the binary image into m×k sub-regions and compute the sum of pixel values in each sub-region; when the sum exceeds a certain threshold, the sub-region is considered a motion region, otherwise a static region. This can distinguish whether the image contains a motion region, but it cannot accurately extract the coordinates of the moving target, and when there are multiple moving targets, the coordinates of a specific one cannot be extracted.
  • Embodiments of the present disclosure provide an image processing method, apparatus, storage medium and electronic apparatus, to at least solve the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames.
  • According to one embodiment, an image processing method includes: determining a difference image between the current frame image and the previous frame image in the video captured by a video capture device; performing binarization based on a dynamic binarization threshold on the difference image to obtain a binary image; performing abscissa statistics and ordinate statistics on the binary image and determining the region coordinates of the largest moving target based on the abscissa and ordinate statistical results; and controlling a motor according to the region coordinates to track the largest moving target.
  • According to another embodiment, an image processing apparatus includes:
  • a first determination module, configured to determine the difference image between the current frame image and the previous frame image in the video captured by a video capture device;
  • a binarization module, configured to perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
  • a second determination module, configured to perform abscissa statistics and ordinate statistics on the binary image and determine the region coordinates of the largest moving target based on the abscissa and ordinate statistical results;
  • a tracking module, configured to control a motor according to the region coordinates to track the largest moving target.
  • According to yet another embodiment, a computer-readable storage medium is provided, in which a computer program is stored, the computer program being configured to perform the steps in any of the above method embodiments when run.
  • According to yet another embodiment, an electronic apparatus is provided, including a memory and a processor; a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
  • Figure 1 is a schematic diagram of the motion detection algorithm of a camera main control chip in the related art;
  • Figure 2 is a hardware structural block diagram of a mobile terminal for the image processing method according to an embodiment of the present disclosure;
  • Figure 3 is a flow chart of the image processing method according to an embodiment of the present disclosure;
  • Figure 4 is a flow chart of an image processing method according to an optional embodiment of the present disclosure;
  • Figure 5 is a schematic diagram of motion tracking in a video surveillance scene according to this embodiment;
  • Figure 6 is a schematic diagram of extracting moving-target coordinates by bar-chart statistics according to this embodiment;
  • Figure 7 is a diagram of the relationship, in the horizontal direction, between the center coordinates of the moving target and the center of the field of view according to this embodiment;
  • FIG. 8 is a block diagram of an image processing device according to an embodiment of the present disclosure.
  • FIG. 2 is a hardware structural block diagram of a mobile terminal for the image processing method according to an embodiment of the present disclosure. The method embodiments provided in the present disclosure may be executed in a mobile terminal, a computer terminal, or a similar computing device.
  • Taking a mobile terminal as an example, the mobile terminal may include one or more processors 102 (only one is shown in Figure 2; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input/output device 108.
  • FIG. 2 is only illustrative and does not limit the structure of the mobile terminal; for example, the mobile terminal may include more or fewer components than shown in FIG. 2, or have a different configuration than shown in FIG. 2.
  • The memory 104 can be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the image processing method in the embodiments of the present disclosure. By running the computer programs stored in the memory 104, the processor 102 executes various functional applications and service-chain address-pool slicing processing, i.e., implements the above method.
  • The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102; such remote memory may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The transmission device 106 is used to receive or send data via a network. Specific examples of such a network include a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (Radio Frequency, RF for short) module, which is used to communicate with the Internet wirelessly.
  • FIG. 3 is a flow chart of the image processing method according to an embodiment of the present disclosure. As shown in Figure 3, the process includes the following steps:
  • Step S302: determine the difference image between the current frame image and the previous frame image in the video captured by the video capture device;
  • Step S304: perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
  • In this embodiment, the binarization threshold changes dynamically; when the color of the moving target is close to the background, dynamically adjusting the binarization threshold improves the detection accuracy of the largest moving target.
  • Step S306: perform abscissa statistics and ordinate statistics on the binary image, and determine the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
  • Step S308: control the motor according to the region coordinates to track the largest moving target.
  • Through steps S302 to S308, the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames can be solved: the largest moving target can be tracked using only the current frame image and the previous frame image, with no need to derive a background frame from multiple past frames, and because the binarization threshold changes dynamically, the region coordinates of the largest moving target can be extracted accurately even when its color is close to the background, improving both the detection accuracy and the real-time performance of tracking.
  • In this embodiment, the video capture device may be a camera and the motor may be a stepper motor: a stepper motor on the chassis of the hardware device controls the horizontal movement of the upper part of the device, while another stepper motor inside the device controls the vertical movement of the camera, whose orientation changes as the motors rotate.
  • The above step S302 may specifically include: scaling the current frame image and the previous frame image to a preset size respectively, to obtain the scaled current frame image and previous frame image; and determining, pixel by pixel, the absolute value of the pixel difference between the scaled current frame image and the scaled previous frame image, to obtain the difference image.
  • The above step S304 may specifically include: dividing the scaled current frame image or previous frame image into m*n regions; computing the variance or image entropy of each region of the current frame image or the previous frame image; using the variance or image entropy of each region as the background distribution value of the corresponding image region; determining the sum of the background distribution value and a preset offset as the binarization threshold, where the preset offset is a preset value chosen according to the actual situation; and binarizing the difference image based on the binarization threshold to obtain the binary image.
  • Introducing the image background distribution value to compute a dynamic binarization threshold makes parameters easier to tune during product development, and makes it easier to extract moving targets whose color is close to the background.
  • In an optional embodiment, before the above step S308 the method further includes: obtaining the previous region coordinates of the moving target determined last time based on the previous frame image and its adjacent frame image; determining the intersection-over-union of the previous region coordinates of the moving target and the current region coordinates of the moving target; and determining that the intersection-over-union is greater than a preset threshold.
  • By differencing adjacent frames and checking the intersection-over-union of two consecutive largest-moving-target extractions, the coordinates of the moving target can be extracted accurately from just three consecutive frames, with no need to derive a background frame from multiple past frames, which improves the real-time performance of tracking.
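  • As a rough illustration of this gating step, the intersection-over-union of two region coordinates in [x1, y1, x2, y2] form can be computed as in the following Python sketch; this is a generic IoU computation, not code from the patent, and the threshold value shown is an arbitrary placeholder.

```python
def iou(a, b):
    """Intersection-over-union of two ROIs given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

# Gate the motor-control step on two consecutive extractions agreeing:
prev_roi, cur_roi = [40, 30, 90, 110], [45, 28, 95, 105]
TH_IOU = 0.5  # placeholder threshold; the patent leaves the value unspecified
if iou(prev_roi, cur_roi) > TH_IOU:
    pass  # proceed to step S308 (motor control)
```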
  • FIG. 4 is a flow chart of an image processing method according to an optional embodiment of the present disclosure. As shown in Figure 4, the above step S306 may specifically include:
  • Step S402: sum the binary image along the ordinate direction to obtain a first abscissa statistical result, and extract the first abscissa of the largest moving target according to the first abscissa statistical result;
  • Step S404: with the range of the first abscissa as a constraint, sum the binary image along the abscissa direction to obtain a first ordinate statistical result, and extract the first ordinate of the largest moving target according to the first ordinate statistical result;
  • Step S406: with the range of the first abscissa and the range of the first ordinate as constraints, sum the binary image again along the ordinate direction to obtain a second abscissa statistical result, and extract the second abscissa of the largest moving target according to the second abscissa statistical result;
  • Step S408: determine the region coordinates of the largest moving target according to the first ordinate and the second abscissa.
  • In other words, an initial coordinate range of the moving target is first extracted along one axis; with that range as a constraint, the coordinate range along the other axis is extracted; and then, with both ranges as constraints, the final coordinate range along the first axis is extracted.
  • The above step S402 may specifically include: appending N-1 zeros to the first abscissa array W_hist1[w_begin1:w_end1], formed from the first non-zero value to the last non-zero value in the first abscissa statistical result, to obtain the second abscissa array W_hist1[w_begin1:w_end1+N-1], where W_hist1 is the first abscissa statistical result, w_begin1 is the index of the first non-zero value in the first abscissa statistical result, and w_end1 is the index of the last non-zero value; correlating W_hist1[w_begin1:w_end1+N-1] with a one-dimensional array of length N whose values are all 1 to obtain a first correlation result; and extracting the longest continuous first region coordinates [w_b1, w_e1] from the first correlation result, the first region coordinates [w_b1, w_e1] being the first abscissa of the largest moving target.
  • The above step S404 may specifically include: appending N-1 zeros to the first ordinate array H_hist[h_begin:h_end], formed from the first non-zero value to the last non-zero value in the first ordinate statistical result, to obtain the second ordinate array H_hist[h_begin:h_end+N-1], where H_hist is the first ordinate statistical result, h_begin is the index of the first non-zero value in the first ordinate statistical result, and h_end is the index of the last non-zero value; correlating the second ordinate array H_hist[h_begin:h_end+N-1] with a one-dimensional array of length N whose values are all 1 to obtain a second correlation result; and extracting the longest continuous second region coordinates [h_b1, h_e1] from the second correlation result, the second region coordinates [h_b1, h_e1] being the first ordinate of the largest moving target.
  • The above step S406 may specifically include: appending N-1 zeros to the third abscissa array W_hist2[w_begin2:w_end2], formed from the first non-zero value to the last non-zero value in the second abscissa statistical result, to obtain the fourth abscissa array W_hist2[w_begin2:w_end2+N-1], where W_hist2 is the second abscissa statistical result, w_begin2 is the index of the first non-zero value in the second abscissa statistical result, and w_end2 is the index of the last non-zero value; correlating the fourth abscissa array W_hist2[w_begin2:w_end2+N-1] with a one-dimensional array of length N whose values are all 1 to obtain a third correlation result; and extracting the longest continuous third region coordinates [w_b2, w_e2] from the third correlation result, the third region coordinates [w_b2, w_e2] being the second abscissa of the moving target.
  • This approach can accurately extract the coordinate region of the largest moving target, and it applies equally if the abscissa-ordinate-abscissa order above is replaced by an ordinate-abscissa-ordinate order, which specifically includes: summing the binary image along the abscissa direction to obtain a second ordinate statistical result, and extracting the second ordinate of the largest moving target according to it; summing the binary image along the ordinate direction to obtain a third abscissa statistical result and, with the range of the second ordinate as a constraint, extracting the third abscissa of the largest moving target according to it; summing the binary image again along the abscissa direction to obtain a third ordinate statistical result and, with the range of the second ordinate and the range of the third abscissa as constraints, extracting the third ordinate of the largest moving target according to it; and determining the region coordinates of the moving target according to the third abscissa and the third ordinate.
  • The longest continuous non-zero region is extracted to obtain the region coordinates of the largest moving target, and a one-dimensional array correlation operation is introduced to handle discontinuities inside the moving target in the binary image, further improving the accuracy of the extracted moving-target coordinates.
  • The above step S408 may specifically include: determining the region coordinates of the largest moving target as [w_b2, h_b1, w_e2, h_e1], where [h_b1, h_e1] is the first ordinate, [w_b2, w_e2] is the second abscissa, (w_b2, h_b1) denotes the upper-left corner of the largest moving target, and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
  • The above step S308 may specifically include:
  • determining the first angle α by which the motor rotates in the horizontal direction according to the scaled width of the current frame image and the field of view of the video capture device;
  • determining the second angle β by which the motor rotates in the vertical direction according to the scaled height of the current frame image and the field of view of the video capture device;
  • where (x, y) are the center coordinates of the region coordinates, Sized_W is the scaled width of the current frame image, Sized_H is the scaled height of the current frame image, and θ is the field of view of the video capture device;
  • controlling the video capture device to rotate by the first angle α in the horizontal direction and by the second angle β in the vertical direction to track the largest moving target.
  • Figure 5 is a schematic diagram of motion tracking in a video surveillance scene according to this embodiment. As shown in Figure 5:
  • The resolution of the real-time image is W×H; in this embodiment W is 1280 and H is 720. img1 and img2 are scaled to Sized_H×Sized_W; in this embodiment Sized_H is 240 and Sized_W is 144, and the scaled images are denoted sized_img1 and sized_img2. Nearest-neighbor interpolation is used for the scaling.
  • The difference image is diff_img[h,w] = |sized_img1[h,w] − sized_img2[h,w]|, where h is the ordinate index of the image, 0 ≤ h < Sized_H, and w is the abscissa index of the image, 0 ≤ w < Sized_W.
  • The variance of each region is std_area[i,j] = Std(Area[i,j]), where Std() is the function computing the variance of the data.
  • The variance of each region is used as the background distribution value of the corresponding image region; the image background is denoted bg_img, each pixel of which takes the background distribution value of the region it falls in, the region index being obtained with the floor operator.
  • The binarization threshold here is a dynamic value, namely the sum of the background distribution value and the fixed offset TH; different images, or different regions within an image, have different binarization thresholds.
  • The fixed offset TH generally takes a value between 0 and 255 and is chosen based on manual tuning experience; in this embodiment, TH is set to 25.
  • Figure 6 is a schematic diagram of extracting moving-target coordinates by bar-chart statistics in this embodiment. As shown in Figure 6, the procedure includes:
  • The index of the first non-zero value of W_hist is denoted w_begin, and the index of the last non-zero value is denoted w_end. N-1 zeros are appended after W_hist[w_begin:w_end] to obtain W_hist[w_begin:w_end+N-1]; in this embodiment, N is 6.
  • W_hist[w_begin:w_end+N-1] is correlated with the one-dimensional array Weight of length N whose values are all 1, which handles the discontinuous regions inside the moving target in the binary image; the result is denoted W_hist_corr[w_begin:w_end].
  • From W_hist_corr, the start and end indices of the longest run of consecutive non-zero values are extracted, denoted w_b1 and w_e1, and it is determined whether w_b1 needs to be updated.
  • In the same way, the start and end indices of the longest run of consecutive non-zero values of H_hist are extracted, denoted h_b1 and h_e1, and it is determined whether h_b1 needs to be updated; likewise, w_b2 and w_e2 are extracted, and it is determined whether w_b2 needs to be updated.
  • The ROI coordinates of the largest moving target are [w_b2, h_b1, w_e2, h_e1], where (w_b2, h_b1) denotes the upper-left corner and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
  • If the computed intersection-over-union Iou is greater than the threshold TH_iou, the procedure proceeds to step (5); otherwise it returns to step (1).
  • Controlling the motor to track a moving target means keeping the moving target at the center of the camera's field of view.
  • Figure 7 shows the relationship, in the horizontal direction, between the center coordinates of the moving target and the center of the field of view according to this embodiment.
  • The center coordinates of the largest moving-target region Roi2 are denoted (x, y), the field of view of the camera is θ, and the angle between the moving target and the center of the field of view is α, which is the angle the motor should rotate.
  • α and x satisfy a fixed relationship in which Sized_W is the scaled image width from step (1); from it, the angle α that the motor needs to rotate in the horizontal direction can be computed, and the angle β needed in the vertical direction can be solved in the same way.
  • The horizontal motor then rotates by α and the vertical motor rotates by β.
  • After the motor stops moving, step (1) is performed again.
  • FIG. 8 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in Figure 8, the apparatus includes:
  • a first determination module 82, configured to determine the difference image between the current frame image and the previous frame image in the video captured by the video capture device;
  • a binarization module 84, configured to perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
  • a second determination module 86, configured to perform abscissa statistics and ordinate statistics on the binary image, and determine the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
  • a tracking module 88, configured to control the motor according to the region coordinates to track the largest moving target.
  • The first determination module 82 is further configured to scale the current frame image and the previous frame image to a preset size respectively, obtaining the scaled current frame image and previous frame image, and to determine, pixel by pixel, the absolute value of the pixel difference between the scaled current frame image and the scaled previous frame image, obtaining the difference image.
  • The binarization module 84 is further configured to divide the scaled current frame image or previous frame image into m*n regions; compute the variance or image entropy of each region of the current frame image or the previous frame image; use the variance or image entropy of each region as the background distribution value of the corresponding image region; determine the sum of the background distribution value and the preset offset as the binarization threshold; and binarize the difference image based on the binarization threshold to obtain the binary image.
  • The second determination module 86 includes:
  • a first extraction sub-module, configured to sum the binary image along the ordinate direction to obtain the first abscissa statistical result, and extract the first abscissa of the largest moving target according to it;
  • a second extraction sub-module, configured to sum the binary image along the abscissa direction, with the range of the first abscissa as a constraint, to obtain the first ordinate statistical result, and extract the first ordinate of the largest moving target according to it;
  • a third extraction sub-module, configured to sum the binary image again along the ordinate direction, with the range of the first abscissa and the range of the first ordinate as constraints, to obtain the second abscissa statistical result, and extract the second abscissa of the largest moving target according to it;
  • a determination sub-module, configured to determine the region coordinates of the largest moving target according to the first ordinate and the second abscissa.
  • The second determination module 86 is also configured to: sum the binary image along the abscissa direction to obtain a second ordinate statistical result and extract the second ordinate of the largest moving target according to it; sum the binary image along the ordinate direction to obtain a third abscissa statistical result and, with the range of the second ordinate as a constraint, extract the third abscissa of the largest moving target according to it; sum the binary image again along the abscissa direction to obtain a third ordinate statistical result and, with the range of the second ordinate and the range of the third abscissa as constraints, extract the third ordinate of the largest moving target according to it; and determine the region coordinates of the moving target according to the third abscissa and the third ordinate.
  • The first extraction sub-module is further configured to append N-1 zeros to the first abscissa array, formed from the first non-zero value to the last non-zero value in the first abscissa statistical result, to obtain the second abscissa array; and to correlate the second abscissa array with a preset array to obtain a first correlation result, extract the longest continuous first region coordinates from it, and determine the first region coordinates as the first abscissa of the largest moving target, where the preset array is a one-dimensional array of length N whose values are all 1.
  • The second extraction sub-module is further configured to append N-1 zeros to the first ordinate array, formed from the first non-zero value to the last non-zero value in the first ordinate statistical result, to obtain the second ordinate array; and to correlate the second ordinate array with the preset array to obtain a second correlation result, extract the longest continuous second region coordinates from it, and determine the second region coordinates as the first ordinate of the largest moving target.
  • The third extraction sub-module is further configured to append N-1 zeros to the third abscissa array, formed from the first non-zero value to the last non-zero value in the second abscissa statistical result, to obtain the fourth abscissa array; and to correlate the fourth abscissa array with the preset array to obtain a third correlation result, extract the longest continuous third region coordinates from it, and determine the third region coordinates as the second abscissa of the largest moving target.
  • The determination sub-module is further configured to determine the region coordinates of the largest moving target as [w_b2, h_b1, w_e2, h_e1], where the first ordinate is [h_b1, h_e1], the second abscissa is [w_b2, w_e2], (w_b2, h_b1) denotes the upper-left corner of the largest moving target, and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
  • The tracking module 88 is further configured to determine the first angle by which the motor rotates in the horizontal direction according to the scaled width of the current frame image and the field of view of the video capture device; determine the second angle by which the motor rotates in the vertical direction according to the scaled height of the current frame image and the field of view of the video capture device; and control the video capture device to rotate by the first angle in the horizontal direction and by the second angle in the vertical direction to track the largest moving target.
  • The apparatus further includes:
  • an acquisition module, configured to obtain the previous region coordinates of the moving target determined last time based on the previous frame image and its adjacent frame image;
  • a third determination module, configured to determine the intersection-over-union of the previous region coordinates of the moving target and the current region coordinates of the moving target, and determine that the intersection-over-union is greater than a preset threshold.
  • Embodiments of the present disclosure also provide a computer-readable storage medium that stores a computer program, wherein the computer program is configured to perform the steps in any of the above method embodiments when run.
  • The computer-readable storage medium may include, but is not limited to, media that can store a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
  • Embodiments of the present disclosure also provide an electronic device, including a memory and a processor; a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.
  • The above electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
  • The modules or steps of the present disclosure can be implemented using general-purpose computing devices; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices. They may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases the steps shown or described may be executed in a different order; alternatively, they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Embodiments of the present disclosure provide an image processing method and apparatus, a storage medium, and an electronic apparatus. The method includes: determining a difference image between the current frame image and the previous frame image in video captured by a video capture device; performing binarization based on a dynamic binarization threshold on the difference image to obtain a binary image; performing abscissa statistics and ordinate statistics on the binary image, and determining the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results; and controlling a motor according to the region coordinates to track the largest moving target. The method solves the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames; it can accurately extract the region coordinates of the largest moving target and improve the real-time performance of tracking the largest moving target.

Description

Image processing method and apparatus, storage medium and electronic apparatus
CROSS-REFERENCE TO RELATED APPLICATIONS
This disclosure is based on, and claims priority to, the Chinese patent application CN202210761434.8 filed on June 30, 2022 and entitled "Image processing method and apparatus, storage medium and electronic apparatus", the disclosure of which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
Embodiments of the present disclosure relate to the field of communications, and in particular to an image processing method and apparatus, a storage medium, and an electronic apparatus.
BACKGROUND
Currently, most products implement motion tracking by means of the motion detection algorithm provided by the vendor of the camera's main control chip. Figure 1 is a schematic diagram of such a motion detection algorithm in the related art. As shown in Figure 1, it mainly includes three steps:
Step 1: obtain the background frame. Let the current video frame be the nth frame, and take the moving average of the previous n-1 frames as the background frame. Because the background frame depends on the previous n-1 frames, after the motor stops during tracking it is necessary to wait another n-1 frames before a correct background frame can be updated for the next moving-target extraction, so the real-time performance of tracking is poor (this has nothing to do with the amount of computation; it is simply the wait of n-1 frames).
Step 2: binarize the difference image. Subtract the background frame from the current frame and take the absolute value to obtain the difference image. Set a single threshold Th; values in the difference image greater than Th are set to 1 and values less than or equal to Th are set to 0, yielding the binary image. With such single-threshold binarization it is difficult to choose a threshold that both extracts the changed image regions and suppresses image noise, and when the color of the moving target is close to that of the image background region, the moving-target region cannot be extracted.
Step 3: find the moving-target region. Divide the binary image into m×k sub-regions and compute the sum of pixel values in each sub-region; when the sum exceeds a certain threshold, the sub-region is considered a motion region, otherwise a static region. This can distinguish whether the image contains a motion region, but it cannot accurately extract the coordinates of the moving target, and when there are multiple moving targets, the coordinates of a specific one cannot be extracted.
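For concreteness, the sub-region check of step 3 amounts to the following Python/NumPy sketch; the values of m, k and the block threshold are illustrative assumptions rather than values from the patent, and the snippet shows why this related-art approach flags motion regions without yielding target coordinates.

```python
import numpy as np

def motion_blocks(binary_img, m=6, k=6, block_th=20):
    """Related-art step 3: split the binary image into m×k sub-regions and flag
    each sub-region whose pixel sum exceeds block_th as a motion region.
    m, k and block_th are illustrative values, not taken from the patent."""
    h, w = binary_img.shape
    bh, bw = h // m, w // k
    flags = np.zeros((m, k), dtype=bool)
    for i in range(m):
        for j in range(k):
            block = binary_img[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            flags[i, j] = block.sum() > block_th
    return flags  # tells which sub-regions moved, but not where the target is
```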
No effective solution has yet been proposed for the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames.
SUMMARY
Embodiments of the present disclosure provide an image processing method and apparatus, a storage medium, and an electronic apparatus, so as to at least solve the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames.
According to one embodiment of the present disclosure, an image processing method is provided, the method comprising:
determining a difference image between the current frame image and the previous frame image in video captured by a video capture device;
performing binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
performing abscissa statistics and ordinate statistics on the binary image, and determining region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
controlling a motor according to the region coordinates to track the largest moving target.
According to another embodiment of the present disclosure, an image processing apparatus is provided, the apparatus comprising:
a first determination module, configured to determine a difference image between the current frame image and the previous frame image in video captured by a video capture device;
a binarization module, configured to perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
a second determination module, configured to perform abscissa statistics and ordinate statistics on the binary image, and determine region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
a tracking module, configured to control a motor according to the region coordinates to track the largest moving target.
According to yet another embodiment of the present disclosure, a computer-readable storage medium is provided, in which a computer program is stored, wherein the computer program is configured to perform the steps in any one of the above method embodiments when run.
According to yet another embodiment of the present disclosure, an electronic apparatus is provided, comprising a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program so as to perform the steps in any one of the above method embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic diagram of the motion detection algorithm of a camera main control chip in the related art;
Figure 2 is a hardware structural block diagram of a mobile terminal for the image processing method according to an embodiment of the present disclosure;
Figure 3 is a flow chart of the image processing method according to an embodiment of the present disclosure;
Figure 4 is a flow chart of an image processing method according to an optional embodiment of the present disclosure;
Figure 5 is a schematic diagram of motion tracking in a video surveillance scene according to this embodiment;
Figure 6 is a schematic diagram of extracting moving-target coordinates by bar-chart statistics according to this embodiment;
Figure 7 is a diagram of the relationship, in the horizontal direction, between the center coordinates of the moving target and the center of the field of view according to this embodiment;
Figure 8 is a block diagram of the image processing apparatus according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings and in combination with the embodiments.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present disclosure are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence.
The method embodiments provided in the embodiments of the present disclosure may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Figure 2 is a hardware structural block diagram of a mobile terminal for the image processing method of an embodiment of the present disclosure. As shown in Figure 2, the mobile terminal may include one or more processors 102 (only one is shown in Figure 2; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and may further include a transmission device 106 for communication functions and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in Figure 2 is only illustrative and does not limit the structure of the mobile terminal; for example, the mobile terminal may include more or fewer components than shown in Figure 2, or have a different configuration from that shown in Figure 2.
The memory 104 may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the image processing method in the embodiments of the present disclosure; by running the computer programs stored in the memory 104, the processor 102 executes various functional applications and service-chain address-pool slicing processing, i.e., implements the above method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal through a network; examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (Radio Frequency, RF for short) module, which is used to communicate with the Internet wirelessly.
This embodiment provides an image processing method running on the above mobile terminal or network architecture. Figure 3 is a flow chart of the image processing method according to an embodiment of the present disclosure. As shown in Figure 3, the process includes the following steps:
Step S302: determine the difference image between the current frame image and the previous frame image in the video captured by the video capture device;
Step S304: perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
In this embodiment, the binarization threshold changes dynamically; when the color of the moving target is close to the background, dynamically adjusting the binarization threshold improves the detection accuracy of the largest moving target.
Step S306: perform abscissa statistics and ordinate statistics on the binary image, and determine the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
Step S308: control the motor according to the region coordinates to track the largest moving target.
Through the above steps S302 to S308, the problems in the related art that the largest moving target cannot be accurately extracted, that the false detection rate is high when the moving target's color is close to the background, and that the real-time performance of tracking is poor because the background frame is derived from multiple past frames can be solved: the largest moving target can be tracked using only the current frame image and the previous frame image, with no need to derive a background frame from multiple past frames, and since the binarization threshold changes dynamically, the region coordinates of the largest moving target can be extracted accurately even when its color is close to the background, improving both the detection accuracy and the real-time performance of tracking the largest moving target.
In this embodiment, the video capture device may specifically be a camera, and the motor may specifically be a stepper motor: a stepper motor at the chassis of the hardware device controls the horizontal movement of the upper part of the device, and another stepper motor inside the device controls the vertical movement of the camera, whose orientation changes as the motors rotate.
In this embodiment, the above step S302 may specifically include: scaling the current frame image and the previous frame image to a preset size respectively, to obtain the scaled current frame image and previous frame image; and determining, pixel by pixel, the absolute value of the pixel difference between the scaled current frame image and the scaled previous frame image, to obtain the difference image.
In this embodiment, the above step S304 may specifically include: dividing the scaled current frame image or previous frame image into m*n regions; computing the variance or image entropy of each region of the current frame image or the previous frame image; using the variance or image entropy of each region as the background distribution value of the corresponding image region; determining the sum of the background distribution value and a preset offset as the binarization threshold, where the preset offset is a preset value chosen according to the actual situation; and performing binarization on the difference image based on the binarization threshold to obtain the binary image. The image background distribution value is introduced to compute a dynamic binarization threshold; for product development this makes parameters easier to tune, and for the tracking effect it makes it easier to extract moving targets whose color is close to the background.
In an optional embodiment, before the above step S308 the method further includes: obtaining the previous region coordinates of the moving target determined last time based on the previous frame image and its adjacent frame image; and determining the intersection-over-union of the previous region coordinates of the moving target and the current region coordinates of the moving target, and determining that the intersection-over-union is greater than a preset threshold.
By differencing two adjacent frames and computing the intersection-over-union of two consecutive largest-moving-target extractions, the coordinates of the moving target can be extracted accurately from just three consecutive frames, with no need to derive a background frame from multiple past frames, which improves the real-time performance of motion tracking.
Figure 4 is a flow chart of an image processing method according to an optional embodiment of the present disclosure. As shown in Figure 4, the above step S306 may specifically include:
Step S402: sum the binary image along the ordinate direction to obtain a first abscissa statistical result, and extract the first abscissa of the largest moving target according to the first abscissa statistical result;
Step S404: with the range of the first abscissa as a constraint, sum the binary image along the abscissa direction to obtain a first ordinate statistical result, and extract the first ordinate of the largest moving target according to the first ordinate statistical result;
Step S406: with the range of the first abscissa and the range of the first ordinate as constraints, sum the binary image again along the ordinate direction to obtain a second abscissa statistical result, and extract the second abscissa of the largest moving target according to the second abscissa statistical result;
Step S408: determine the region coordinates of the largest moving target according to the first ordinate and the second abscissa.
Bar-chart statistics (i.e., statistics along the abscissa direction and the ordinate direction) are performed on the binary image to extract the moving target: an initial coordinate range of the moving target is first extracted along one axis; with that initial range as a constraint, the coordinate range along the other axis is extracted; and then, with both ranges as constraints, the final coordinate range along the first axis is extracted.
In this embodiment, the above step S402 may specifically include: appending N-1 zeros to the first abscissa array W_hist1[w_begin1:w_end1], formed from the first non-zero value to the last non-zero value in the first abscissa statistical result, to obtain the second abscissa array W_hist1[w_begin1:w_end1+N-1], where W_hist1 is the first abscissa statistical result, w_begin1 is the index of the first non-zero value in the first abscissa statistical result, and w_end1 is the index of the last non-zero value; performing a correlation operation between W_hist1[w_begin1:w_end1+N-1] and a one-dimensional array of length N whose values are all 1 to obtain a first correlation result; extracting the longest continuous first region coordinates [w_b1, w_e1] from the first correlation result; and determining the first region coordinates [w_b1, w_e1] as the first abscissa of the largest moving target.
In this embodiment, the above step S404 may specifically include: appending N-1 zeros to the first ordinate array H_hist[h_begin:h_end], formed from the first non-zero value to the last non-zero value in the first ordinate statistical result, to obtain the second ordinate array H_hist[h_begin:h_end+N-1], where H_hist is the first ordinate statistical result, h_begin is the index of the first non-zero value in the first ordinate statistical result, and h_end is the index of the last non-zero value; performing a correlation operation between the second ordinate array H_hist[h_begin:h_end+N-1] and a one-dimensional array of length N whose values are all 1 to obtain a second correlation result; extracting the longest continuous second region coordinates [h_b1, h_e1] from the second correlation result; and determining the second region coordinates [h_b1, h_e1] as the first ordinate of the largest moving target.
In this embodiment, the above step S406 may specifically include: appending N-1 zeros to the third abscissa array W_hist2[w_begin2:w_end2], formed from the first non-zero value to the last non-zero value in the second abscissa statistical result, to obtain the fourth abscissa array W_hist2[w_begin2:w_end2+N-1], where W_hist2 is the second abscissa statistical result, w_begin2 is the index of the first non-zero value in the second abscissa statistical result, and w_end2 is the index of the last non-zero value; performing a correlation operation between the fourth abscissa array W_hist2[w_begin2:w_end2+N-1] and a one-dimensional array of length N whose values are all 1 to obtain a third correlation result; extracting the longest continuous third region coordinates [w_b2, w_e2] from the third correlation result; and determining the third region coordinates [w_b2, w_e2] as the second abscissa of the moving target.
The same applies if the above abscissa-ordinate-abscissa order is replaced by an ordinate-abscissa-ordinate order; either way, the coordinate region of the largest moving target can be extracted accurately. The ordinate-abscissa-ordinate variant specifically includes: summing the binary image along the abscissa direction to obtain a second ordinate statistical result, and extracting the second ordinate of the largest moving target according to it; summing the binary image along the ordinate direction to obtain a third abscissa statistical result and, with the range of the second ordinate as a constraint, extracting the third abscissa of the largest moving target according to it; summing the binary image again along the abscissa direction to obtain a third ordinate statistical result and, with the range of the second ordinate and the range of the third abscissa as constraints, extracting the third ordinate of the largest moving target according to it; and determining the region coordinates of the moving target according to the third abscissa and the third ordinate. The specific implementation of each step is similar to the abscissa-ordinate-abscissa implementation above and is not repeated here.
In the embodiments of the present disclosure, the longest continuous non-zero region is extracted to obtain the region coordinates of the largest moving target; when extracting the largest moving target, a one-dimensional array correlation operation is introduced to handle discontinuous regions inside the moving target in the binary image, which further improves the accuracy of the extracted moving-target coordinates.
In this embodiment, the above step S408 may specifically include: determining the region coordinates of the largest moving target as [w_b2, h_b1, w_e2, h_e1], where [h_b1, h_e1] is the first ordinate, [w_b2, w_e2] is the second abscissa, (w_b2, h_b1) denotes the upper-left corner of the largest moving target, and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
In this embodiment, the above step S308 may specifically include:
determining the first angle α by which the motor rotates in the horizontal direction according to the scaled width of the current frame image and the field of view of the video capture device;
determining the second angle β by which the motor rotates in the vertical direction according to the scaled height of the current frame image and the field of view of the video capture device;
where (x, y) are the center coordinates of the region coordinates, Sized_W is the scaled width of the current frame image, Sized_H is the scaled height of the current frame image, and θ is the field of view of the video capture device;
controlling the video capture device to rotate by the first angle α in the horizontal direction and by the second angle β in the vertical direction to track the largest moving target.
This embodiment is described in further detail below.
Figure 5 is a schematic diagram of motion tracking in a video surveillance scene according to this embodiment. As shown in Figure 5:
(1) Compute the difference image of two adjacent frames.
Obtain a real-time camera image, denoted img1, and the next frame, denoted img2. Each pixel value in the images lies in the range [0, 255]. The resolution of the real-time image is W×H; in this embodiment W is 1280 and H is 720. Scale img1 and img2 to Sized_H×Sized_W; in this embodiment Sized_H is 240 and Sized_W is 144, and the scaled images are denoted sized_img1 and sized_img2. Nearest-neighbor interpolation is used for the scaling.
Take the absolute value of the difference between sized_img1 and sized_img2 pixel by pixel, denoted diff_img:
diff_img[h,w] = |sized_img1[h,w] − sized_img2[h,w]|;
where h is the ordinate index of the image, 0 ≤ h < Sized_H, and w is the abscissa index of the image, 0 ≤ w < Sized_W.
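A minimal NumPy sketch of step (1) follows, assuming single-channel uint8 frames; the nearest-neighbor resize is written out by hand so that the snippet depends only on NumPy, and the function names are illustrative rather than taken from the patent.

```python
import numpy as np

SIZED_H, SIZED_W = 240, 144  # scaled size used in this embodiment

def nearest_resize(img, out_h=SIZED_H, out_w=SIZED_W):
    """Nearest-neighbor scaling by index sampling."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]

def diff_image(img1, img2):
    """diff_img[h, w] = |sized_img1[h, w] - sized_img2[h, w]|."""
    a = nearest_resize(img1).astype(np.int16)  # widen to avoid uint8 wrap-around
    b = nearest_resize(img2).astype(np.int16)
    return np.abs(a - b).astype(np.uint8)
```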
(2) Binarize the difference image with a dynamic threshold.
Divide the image sized_img1 or sized_img2 into m×n regions; in this embodiment m is 6 and n is 6, so each region has size 40×24 and is denoted Area[i,j], 0 ≤ i, j < 6, where i, j are region indices. Compute the variance of each region:
std_area[i,j] = Std(Area[i,j]);
where Std() is the function computing the variance of the data.
Take the variance of each region as the background distribution value of the corresponding image region; the image background is denoted bg_img, each pixel of which takes the background distribution value of the region it falls in, the region index being obtained with the floor operator ⌊·⌋.
Binarize diff_img; the binarized image is denoted Binary_img.
The binarization threshold here is a dynamic value, namely the sum of the background distribution value and a fixed offset TH; different images, or different regions within an image, have different binarization thresholds. The fixed offset TH generally takes a value between 0 and 255 and is chosen based on manual tuning experience; in this embodiment, TH is set to 25.
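Step (2) can be sketched as follows; using the region variance of a reference frame as the background distribution value follows the text above, while the mapping of each pixel to its region by floor division is an assumption consistent with the floor operator mentioned there.

```python
import numpy as np

M, N_REGIONS, TH = 6, 6, 25  # 6x6 regions and fixed offset TH = 25, as in this embodiment

def dynamic_binarize(diff_img, ref_img):
    """Binarize diff_img against a per-region threshold: the variance of each
    region of the reference frame (the background distribution value) plus TH."""
    h, w = ref_img.shape
    rh, rw = h // M, w // N_REGIONS
    std_area = np.empty((M, N_REGIONS))
    for i in range(M):
        for j in range(N_REGIONS):
            std_area[i, j] = ref_img[i * rh:(i + 1) * rh, j * rw:(j + 1) * rw].var()
    # Expand the per-region values back to a full-size background map bg_img,
    # mapping each pixel to its region by floor division (an assumption).
    bg_img = std_area[np.minimum(np.arange(h) // rh, M - 1)[:, None],
                      np.minimum(np.arange(w) // rw, N_REGIONS - 1)]
    return (diff_img > bg_img + TH).astype(np.uint8)
```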
(3) Extract the moving-target coordinates by bar-chart statistics.
Figure 6 is a schematic diagram of extracting moving-target coordinates by bar-chart statistics according to this embodiment. As shown in Figure 6, the procedure includes:
① Sum the binary image Binary_img along the ordinate direction to obtain W_hist.
Denote the index of the first non-zero value of W_hist as w_begin and the index of the last non-zero value as w_end. Append N-1 zeros after W_hist[w_begin:w_end] to obtain W_hist[w_begin:w_end+N-1]; in this embodiment, N is 6. Correlate W_hist[w_begin:w_end+N-1] with the one-dimensional array Weight of length N whose values are all 1, which is used to handle discontinuous regions inside the moving target in the binary image; the result is denoted W_hist_corr[w_begin:w_end]. The correlation operation is:
W_hist_corr[k] = Σ_{i=0}^{N-1} W_hist[k+i] · Weight[i], where w_begin ≤ k < w_end.
Compute the start and end indices of the longest run of consecutive non-zero values in W_hist_corr[w_begin:w_end], denoted w_b1 and w_e1; if w_b1 ≠ w_begin, update the value of w_b1 to w_b1 − N + 1.
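Step ① can be sketched as follows: the non-zero span of the histogram is padded with N-1 zeros, correlated with a length-N all-ones window to bridge gaps inside the target, and the longest non-zero run is then taken. The forward-window form of the correlation is an assumption consistent with the zero padding at the end of the span; the patent's published formula is an image and is not reproduced verbatim here.

```python
import numpy as np

N = 6  # correlation window length used in this embodiment

def longest_nonzero_run(arr):
    """Start and end indices of the longest run of consecutive non-zero values."""
    best, start = (0, -1), None
    for k, v in enumerate(arr):
        if v and start is None:
            start = k
        elif not v and start is not None:
            if k - start > best[1] - best[0] + 1:
                best = (start, k - 1)
            start = None
    if start is not None and len(arr) - start > best[1] - best[0] + 1:
        best = (start, len(arr) - 1)
    return best

def correlate_hist(hist):
    """Pad the non-zero span of hist with N-1 trailing zeros and slide a
    length-N all-ones window over it (W_hist_corr), bridging gaps shorter
    than N. Assumes hist contains at least one non-zero value."""
    nz = np.flatnonzero(hist)
    begin, end = nz[0], nz[-1]
    padded = np.concatenate([hist[begin:end + 1], np.zeros(N - 1, hist.dtype)])
    corr = np.array([padded[k:k + N].sum() for k in range(end - begin + 1)])
    return corr, begin  # begin maps run indices back to histogram indices
```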
② Sum the binary image Binary_img along the abscissa direction to obtain H_hist, where the horizontal summation index range is [w_b1, w_e1].
In the same manner as step (3)①, extract the start and end indices of the longest run of consecutive non-zero values, denoted h_b1 and h_e1, and determine whether h_b1 needs to be updated.
③ Sum the binary image Binary_img along the ordinate direction again to obtain W_hist, where the vertical summation index range is [h_b1, h_e1] and the horizontal index range is [w_b1, w_e1].
In the same manner as step (3)①, extract the start and end indices of the longest continuous run, denoted w_b2 and w_e2, and determine whether w_b2 needs to be updated.
The ROI coordinates of the largest moving target are obtained as [w_b2, h_b1, w_e2, h_e1], where (w_b2, h_b1) denotes the upper-left corner and (w_e2, h_e1) denotes the lower-right corner of the largest moving target. Check whether the ROI area S_roi of the largest moving target exceeds the minimum area threshold S_th; if S_roi ≥ S_th, proceed to step (4), otherwise return to step (1).
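Putting steps ① to ③ together, a condensed sketch of the whole extraction might look as follows, reusing the correlate_hist and longest_nonzero_run helpers from the previous snippet; the patent's conditional run-start update and the area gate are deliberately omitted for brevity.

```python
def largest_target_roi(binary_img):
    """Abscissa-ordinate-abscissa bar-chart extraction (steps (3) ① to ③).
    Returns [w_b2, h_b1, w_e2, h_e1], or None for an empty binary image.
    Simplified: the patent's conditional run-start update is omitted."""
    if not binary_img.any():
        return None
    # ① column sums over the whole image -> first abscissa range
    corr, off = correlate_hist(binary_img.sum(axis=0))
    b, e = longest_nonzero_run(corr)
    w_b1, w_e1 = off + b, off + e
    # ② row sums restricted to [w_b1, w_e1] -> ordinate range
    corr, off = correlate_hist(binary_img[:, w_b1:w_e1 + 1].sum(axis=1))
    b, e = longest_nonzero_run(corr)
    h_b1, h_e1 = off + b, off + e
    # ③ column sums restricted to both ranges -> final abscissa range
    corr, off = correlate_hist(binary_img[h_b1:h_e1 + 1, w_b1:w_e1 + 1].sum(axis=0))
    b, e = longest_nonzero_run(corr)
    w_b2, w_e2 = w_b1 + off + b, w_b1 + off + e
    return [w_b2, h_b1, w_e2, h_e1]
```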
(4) Compute the intersection-over-union of the largest moving-target regions extracted in two consecutive passes.
Update the current frame image and extract the largest moving-target region Roi2 again following the methods of steps (1)-(3). Compute the intersection-over-union of the ROI and Roi2. If the computed intersection-over-union Iou is greater than the threshold TH_iou, proceed to step (5); otherwise return to step (1).
(5) Control the motor to track the moving target.
Controlling the motor to track the moving target means keeping the moving target at the center of the camera's field of view. Figure 7 shows the relationship, in the horizontal direction, between the center coordinates of the moving target and the center of the field of view according to this embodiment. As shown in Figure 7, the center coordinates of the largest moving-target region Roi2 are denoted (x, y), the field of view of the camera is θ, and the angle between the moving target and the center of the field of view is α, which is the angle the motor should rotate. α and x satisfy a fixed relationship in which Sized_W is the scaled image width from step (1); from this relationship, the angle α by which the motor needs to rotate in the horizontal direction can be computed, and the angle β needed in the vertical direction can be solved in the same way. The horizontal motor then rotates by α and the vertical motor rotates by β.
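The exact formulas for α and β are published as images in the original document and are not reproduced in this text; the sketch below therefore uses the standard pinhole relation tan(α) = ((2x − Sized_W)/Sized_W)·tan(θ/2), which is an assumption consistent with Figure 7 rather than the patent's verbatim formula, and it assumes the same field of view for both axes.

```python
import math

SIZED_W, SIZED_H = 144, 240
THETA = math.radians(90)  # illustrative field of view; not a value from the patent

def rotation_angle(center, size, fov=THETA):
    """Angle from the optical axis to an image coordinate under a pinhole
    model: tan(alpha) = (2*center - size) / size * tan(fov / 2). This relation
    is an assumption based on Figure 7, not the patent's published formula."""
    return math.atan((2.0 * center - size) / size * math.tan(fov / 2.0))

x, y = 100, 120                            # center (x, y) of the tracked ROI
alpha = rotation_angle(x, SIZED_W)         # horizontal motor angle
beta = rotation_angle(y, SIZED_H)          # vertical motor angle (same FOV assumed)
print(math.degrees(alpha), math.degrees(beta))
```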
After the motor stops moving, perform step (1) again.
According to another embodiment of the present disclosure, an image tracking processing apparatus is provided. Figure 8 is a block diagram of the image processing apparatus according to an embodiment of the present disclosure. As shown in Figure 8, the apparatus includes:
a first determination module 82, configured to determine the difference image between the current frame image and the previous frame image in the video captured by the video capture device;
a binarization module 84, configured to perform binarization based on a dynamic binarization threshold on the difference image to obtain a binary image;
a second determination module 86, configured to perform abscissa statistics and ordinate statistics on the binary image, and determine the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results;
a tracking module 88, configured to control the motor according to the region coordinates to track the largest moving target.
In an embodiment, the first determination module 82 is further configured to scale the current frame image and the previous frame image to a preset size respectively, obtaining the scaled current frame image and previous frame image, and to determine, pixel by pixel, the absolute value of the pixel difference between the scaled current frame image and the scaled previous frame image, obtaining the difference image.
In an embodiment, the binarization module 84 is further configured to divide the scaled current frame image or previous frame image into m*n regions; compute the variance or image entropy of each region of the current frame image or the previous frame image; use the variance or image entropy of each region of the current frame image or the previous frame image as the background distribution value of the corresponding image region; determine the sum of the background distribution value and the preset offset as the binarization threshold; and perform binarization on the difference image based on the binarization threshold to obtain the binary image.
In an embodiment, the second determination module 86 includes:
a first extraction sub-module, configured to sum the binary image along the ordinate direction to obtain the first abscissa statistical result, and extract the first abscissa of the largest moving target according to the first abscissa statistical result;
a second extraction sub-module, configured to sum the binary image along the abscissa direction, with the range of the first abscissa as a constraint, to obtain the first ordinate statistical result, and extract the first ordinate of the largest moving target according to the first ordinate statistical result;
a third extraction sub-module, configured to sum the binary image again along the ordinate direction, with the range of the first abscissa and the range of the first ordinate as constraints, to obtain the second abscissa statistical result, and extract the second abscissa of the largest moving target according to the second abscissa statistical result;
a determination sub-module, configured to determine the region coordinates of the largest moving target according to the first ordinate and the second abscissa.
In an embodiment, the second determination module 86 is further configured to sum the binary image along the abscissa direction to obtain a second ordinate statistical result, and extract the second ordinate of the largest moving target according to the second ordinate statistical result;
sum the binary image along the ordinate direction to obtain a third abscissa statistical result and, with the range of the second ordinate as a constraint, extract the third abscissa of the largest moving target according to the third abscissa statistical result;
sum the binary image again along the abscissa direction to obtain a third ordinate statistical result and, with the range of the second ordinate and the range of the third abscissa as constraints, extract the third ordinate of the largest moving target according to the third ordinate statistical result;
and determine the region coordinates of the moving target according to the third abscissa and the third ordinate.
In an embodiment, the first extraction sub-module is further configured to append N-1 zeros to the first abscissa array, formed from the first non-zero value to the last non-zero value in the first abscissa statistical result, to obtain the second abscissa array;
and to perform a correlation operation between the second abscissa array and a preset array to obtain a first correlation result, extract the longest continuous first region coordinates from the first correlation result, and determine the first region coordinates as the first abscissa of the largest moving target, where the preset array is a one-dimensional array of length N whose values are all 1.
In an embodiment, the second extraction sub-module is further configured to append N-1 zeros to the first ordinate array, formed from the first non-zero value to the last non-zero value in the first ordinate statistical result, to obtain the second ordinate array;
and to perform a correlation operation between the second ordinate array and the preset array to obtain a second correlation result, extract the longest continuous second region coordinates from the second correlation result, and determine the second region coordinates as the first ordinate of the largest moving target.
In an embodiment, the third extraction sub-module is further configured to append N-1 zeros to the third abscissa array, formed from the first non-zero value to the last non-zero value in the second abscissa statistical result, to obtain the fourth abscissa array;
and to perform a correlation operation between the fourth abscissa array and the preset array to obtain a third correlation result, extract the longest continuous third region coordinates from the third correlation result, and determine the third region coordinates as the second abscissa of the largest moving target.
In an embodiment, the determination sub-module is further configured to determine the region coordinates of the largest moving target as [w_b2, h_b1, w_e2, h_e1], where the first ordinate is [h_b1, h_e1], the second abscissa is [w_b2, w_e2], (w_b2, h_b1) denotes the upper-left corner of the largest moving target, and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
In an embodiment, the tracking module 88 is further configured to determine the first angle by which the motor rotates in the horizontal direction according to the scaled width of the current frame image and the field of view of the video capture device;
determine the second angle by which the motor rotates in the vertical direction according to the scaled height of the current frame image and the field of view of the video capture device;
where (x, y) are the center coordinates of the region coordinates, Sized_W is the scaled width of the current frame image, Sized_H is the scaled height of the current frame image, and θ is the field of view of the video capture device; and control the video capture device to rotate by the first angle in the horizontal direction and by the second angle in the vertical direction to track the largest moving target.
In an embodiment, the apparatus further includes:
an acquisition module, configured to obtain the previous region coordinates of the moving target determined last time based on the previous frame image and its adjacent frame image;
a third determination module, configured to determine the intersection-over-union of the previous region coordinates of the moving target and the current region coordinates of the moving target, and determine that the intersection-over-union is greater than a preset threshold.
Embodiments of the present disclosure also provide a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to perform the steps in any one of the above method embodiments when run.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, various media that can store a computer program, such as a USB flash drive, a read-only memory (ROM for short), a random access memory (RAM for short), a removable hard disk, a magnetic disk, or an optical disc.
Embodiments of the present disclosure also provide an electronic apparatus, comprising a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program so as to perform the steps in any one of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the above modules or steps of the present disclosure can be implemented with general-purpose computing devices; they can be concentrated on a single computing device or distributed across a network composed of multiple computing devices; they can be implemented in program code executable by computing devices, so that they can be stored in a storage device and executed by a computing device, and in some cases the steps shown or described can be executed in an order different from that given here; alternatively, they can be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. As such, the present disclosure is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present disclosure and are not intended to limit the present disclosure; for those skilled in the art, the present disclosure may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the principles of the present disclosure shall be included in the protection scope of the present disclosure.

Claims (14)

  1. An image processing method, the method comprising:
    determining a difference image between a current frame image and a previous frame image in a video captured by a video capture device;
    performing binarization processing based on a dynamic binarization threshold on the difference image to obtain a binary image;
    performing abscissa statistics and ordinate statistics on the binary image, and determining region coordinates of a largest moving target based on abscissa statistical results and ordinate statistical results;
    controlling a motor according to the region coordinates to track the largest moving target.
  2. The method according to claim 1, wherein determining the difference image between the current frame image and the previous frame image in the video captured by the video capture device comprises:
    scaling the current frame image and the previous frame image to a preset size respectively, to obtain a scaled current frame image and a scaled previous frame image;
    determining, pixel by pixel, an absolute value of a pixel difference between the scaled current frame image and the scaled previous frame image, to obtain the difference image.
  3. The method according to claim 2, wherein performing dynamic-binarization-threshold binarization processing on the difference image to obtain the binary image comprises:
    dividing the scaled current frame image or the scaled previous frame image into m*n regions;
    computing a variance or image entropy of each region of the current frame image or the previous frame image;
    using the variance or image entropy of each region of the current frame image or the previous frame image as a background distribution value of the corresponding image region;
    determining a sum of the background distribution value and a preset offset as a binarization threshold;
    performing binarization processing on the difference image based on the binarization threshold to obtain the binary image.
  4. The method according to claim 1, wherein performing abscissa statistics and ordinate statistics on the binary image and determining the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results comprises:
    summing the binary image along an ordinate direction to obtain a first abscissa statistical result, and extracting a first abscissa of the largest moving target according to the first abscissa statistical result;
    with a range of the first abscissa as a constraint, summing the binary image along an abscissa direction to obtain a first ordinate statistical result, and extracting a first ordinate of the largest moving target according to the first ordinate statistical result;
    with the range of the first abscissa and a range of the first ordinate as constraints, summing the binary image again along the ordinate direction to obtain a second abscissa statistical result, and extracting a second abscissa of the largest moving target according to the second abscissa statistical result;
    determining the region coordinates of the largest moving target according to the first ordinate and the second abscissa.
  5. The method according to claim 1, wherein performing abscissa statistics and ordinate statistics on the binary image and determining the region coordinates of the largest moving target based on the abscissa statistical results and the ordinate statistical results comprises:
    summing the binary image along an abscissa direction to obtain a second ordinate statistical result, and extracting a second ordinate of the largest moving target according to the second ordinate statistical result;
    summing the binary image along an ordinate direction to obtain a third abscissa statistical result, and, with a range of the second ordinate as a constraint, extracting a third abscissa of the largest moving target according to the third abscissa statistical result;
    summing the binary image again along the abscissa direction to obtain a third ordinate statistical result, and, with the range of the second ordinate and a range of the third abscissa as constraints, extracting a third ordinate of the largest moving target according to the third ordinate statistical result;
    determining the region coordinates of the moving target according to the third abscissa and the third ordinate.
  6. The method according to claim 4, wherein extracting the first abscissa of the largest moving target according to the first abscissa statistical result comprises:
    appending N-1 zeros after a first abscissa array formed from the first non-zero value to the last non-zero value in the first abscissa statistical result, to obtain a second abscissa array;
    performing a correlation operation between the second abscissa array and a preset array to obtain a first correlation result, extracting the longest continuous first region coordinates from the first correlation result, and determining the first region coordinates as the first abscissa of the largest moving target, wherein the preset array is a one-dimensional array of length N whose values are all 1.
  7. The method according to claim 6, wherein extracting the first ordinate of the largest moving target according to the first ordinate statistical result comprises:
    appending N-1 zeros after a first ordinate array formed from the first non-zero value to the last non-zero value in the first ordinate statistical result, to obtain a second ordinate array;
    performing a correlation operation between the second ordinate array and the preset array to obtain a second correlation result, extracting the longest continuous second region coordinates from the second correlation result, and determining the second region coordinates as the first ordinate of the largest moving target.
  8. The method according to claim 7, wherein extracting the second abscissa of the largest moving target according to the second abscissa statistical result comprises:
    appending N-1 zeros after a third abscissa array formed from the first non-zero value to the last non-zero value in the second abscissa statistical result, to obtain a fourth abscissa array;
    performing a correlation operation between the fourth abscissa array and the preset array to obtain a third correlation result, extracting the longest continuous third region coordinates from the third correlation result, and determining the third region coordinates as the second abscissa of the largest moving target.
  9. The method according to claim 8, wherein determining the region coordinates of the largest moving target according to the first ordinate and the second abscissa comprises:
    determining the region coordinates of the largest moving target as [w_b2, h_b1, w_e2, h_e1], wherein the first ordinate is [h_b1, h_e1], the second abscissa is [w_b2, w_e2], (w_b2, h_b1) denotes the upper-left corner of the largest moving target, and (w_e2, h_e1) denotes the lower-right corner of the largest moving target.
  10. The method according to claim 1, wherein controlling the motor according to the region coordinates to track the largest moving target comprises:
    determining a first angle by which the motor rotates in a horizontal direction according to a scaled width of the current frame image and a field of view of the video capture device;
    determining a second angle by which the motor rotates in a vertical direction according to a scaled height of the current frame image and the field of view of the video capture device, wherein (x, y) are the center coordinates of the region coordinates, Sized_W is the scaled width of the current frame image, and Sized_H is the scaled height of the current frame image;
    controlling the video capture device to rotate by the first angle in the horizontal direction and by the second angle in the vertical direction to track the largest moving target.
  11. The method according to any one of claims 1 to 10, wherein before controlling the motor according to the region coordinates to track the largest moving target, the method further comprises:
    obtaining previous region coordinates of the largest moving target determined last time based on the previous frame image and an adjacent frame image;
    determining an intersection-over-union of the previous region coordinates of the largest moving target and current region coordinates of the largest moving target, and determining that the intersection-over-union is greater than a preset threshold.
  12. An image processing apparatus, the apparatus comprising:
    a first determination module, configured to determine a difference image between a current frame image and a previous frame image in a video captured by a video capture device;
    a binarization module, configured to perform binarization processing based on a dynamic binarization threshold on the difference image to obtain a binary image;
    a second determination module, configured to perform abscissa statistics and ordinate statistics on the binary image, and determine region coordinates of a largest moving target based on abscissa statistical results and ordinate statistical results;
    a tracking module, configured to control a motor according to the region coordinates to track the largest moving target.
  13. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to perform the method of any one of claims 1 to 11 when run.
  14. An electronic apparatus, comprising a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program so as to perform the method of any one of claims 1 to 11.
PCT/CN2023/100014 2022-06-30 2023-06-13 Image processing method and apparatus, storage medium and electronic apparatus WO2024001764A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210761434.8 2022-06-30
CN202210761434.8A CN117372446A (zh) 2022-06-30 Image processing method and apparatus, storage medium and electronic apparatus

Publications (1)

Publication Number Publication Date
WO2024001764A1 (zh)

Family

ID=89382829

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/100014 WO2024001764A1 (zh) 2022-06-30 2023-06-13 Image processing method and apparatus, storage medium and electronic apparatus

Country Status (2)

Country Link
CN (1) CN117372446A (zh)
WO (1) WO2024001764A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650593A * 2016-09-30 2017-05-10 王玲 Passenger flow statistics method and device
CN107516296A * 2017-07-10 2017-12-26 昆明理工大学 FPGA-based moving-target detection and tracking system and method
US20210117683A1 * 2019-10-16 2021-04-22 Realtek Singapore Private Limited Object Localization and Classification System and Method Thereof
CN113409355A * 2021-05-13 2021-09-17 杭州电子科技大学 FPGA-based moving-target recognition system and method


Also Published As

Publication number Publication date
CN117372446A (zh) 2024-01-09

Similar Documents

Publication Publication Date Title
CN106846359B (zh) Fast moving-target detection method based on video sequences
CN102542289B (zh) People flow statistics method based on a multi-Gaussian counting model
CN109086724B (zh) Accelerated face detection method and storage medium
US20130063556A1 (en) Extracting depth information from video from a single camera
WO2020029518A1 (zh) Surveillance video processing method and device, and computer-readable medium
EP3296953A1 (en) Method and device for processing depth images
CN106296725A (zh) Real-time moving-target detection and tracking method and target detection device
US10621730B2 (en) Missing feet recovery of a human object from an image sequence based on ground plane detection
US11200681B2 (en) Motion detection method and motion detection system with low computational complexity and high detection accuracy
CN110276769B (zh) Method for locating live content in a video picture-in-picture architecture
CN111415374A (zh) KVM system and method for monitoring and managing people flow in scenic areas
CN112752158B (zh) Video display method and device, electronic device, and storage medium
CN109254271B (zh) Stationary-target suppression method for ground surveillance radar systems
CN108765463B (zh) Moving-target detection method combining region extraction with improved texture features
CN111160107B (zh) Dynamic region detection method based on feature matching
CN116740126A (zh) Target tracking method, high-speed camera, and storage medium
CN107358621B (zh) Object tracking method and device
WO2024001764A1 (zh) Image processing method and apparatus, storage medium and electronic apparatus
CN110443142A (zh) Deep-learning vehicle counting method based on road surface extraction and segmentation
CN111368883B (zh) Obstacle avoidance method based on a monocular camera, computing device, and storage device
CN112819889A (zh) Method and device for determining position information, storage medium, and electronic device
WO2022205841A1 (zh) Robot navigation method and device, terminal equipment, and computer-readable storage medium
Morerio et al. Optimizing superpixel clustering for real-time egocentric-vision applications
CN110674778B (zh) High-resolution video image target detection method and device
CN114550060A (zh) Perimeter intrusion recognition method and system, and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23829947

Country of ref document: EP

Kind code of ref document: A1