WO2020027511A1 - Method for generating syntax-based heat-map for compressed image - Google Patents

Method for generating syntax-based heat-map for compressed image Download PDF

Info

Publication number
WO2020027511A1
WO2020027511A1 (PCT/KR2019/009372, KR2019009372W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
moving object
compressed image
object region
motion vector
Prior art date
Application number
PCT/KR2019/009372
Other languages
French (fr)
Korean (ko)
Inventor
이현우
정승훈
이성진
Original Assignee
이노뎁 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 이노뎁 주식회사 filed Critical 이노뎁 주식회사
Publication of WO2020027511A1 publication Critical patent/WO2020027511A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present invention generally relates to techniques for efficiently generating heat maps from compressed images such as H.264 AVC and H.265 HEVC.
  • More specifically, the present invention relates to a technology that can generate a heat map for a compressed image produced by, for example, a CCTV camera with a small amount of computation, not through complex image processing as in the prior art, but by using the syntax (e.g., motion vectors and coding types) obtained by parsing the compressed image data to extract the regions of the image in which meaningful motion exists, that is, the moving object regions, and by accumulating the trajectories of those moving object regions.
  • Data on customer interests can be collected directly, through interviews, membership cards, or sales staff, or indirectly, through CCTV cameras, sensors, and smartphone apps.
  • A heat map combines 'heat' and 'map' and presents information graphically as a thermal distribution.
  • Such a heat map can express people's movements or degree of interest in a camera image as color grades: movements are accumulated over a unit of time, and colors are assigned according to the accumulated amount. In general, to match the intuition of temperature, areas where many movements have accumulated are shown in red tones and areas where relatively few movements have accumulated are shown in blue tones.
  • Heat maps obtained from video captured by in-store CCTV cameras allow store managers to intuitively identify which products customers are interested in and, conversely, which products attract no interest at all. Based on this information, the manager can decide whether to rearrange products in the store, change the pricing policy, or set up a sale event, additionally taking the actual purchase rate into account.
  • Heat maps obtained from video captured by alleyway CCTV cameras can be used to identify along which paths people move through the alleys.
  • A video decoding apparatus includes a parser 11, an entropy decoder 12, an inverse transformer 13, a motion vector operator 14, a predictor 15, and a deblocking filter 16.
  • These hardware modules sequentially process the data of the compressed image to decompress the compressed image and restore the original image data.
  • the parser 11 parses the motion vector and the coding type for the coding unit of the compressed image.
  • Such a coding unit is generally an image block such as a macroblock or a subblock.
  • FIG. 2 is a flowchart illustrating a process of generating a heat map from a compressed image in a conventional image analysis solution.
  • In the prior art, the compressed image is decoded according to a video standard such as H.264 AVC or H.265 HEVC to obtain the reproduced video (S10), and the frame images constituting the reproduced video are downscale-resized to small images, for example about 320x240 (S20).
  • The reason for this downscale resizing is to reduce, even slightly, the processing burden of the subsequent steps.
  • Then, difference images are computed for the resized frame images, the moving objects present in the video are extracted through image analysis, and their coordinates are extracted (S30).
  • Finally, the heat map is generated by accumulating the trajectories of the moving objects through image analysis of the series of frame images over time (S40).
  • In the prior art, moving objects are extracted in order to generate a heat map.
  • Extracting moving objects from a high-resolution compressed image requires compressed-image decoding, downscale resizing, and image analysis. These are processes of very high complexity, and as a result, in a conventional video surveillance system the capacity that a single video analysis server can handle simultaneously is quite limited.
  • The number of CCTV channels that a high-performance video analysis server can cover is typically about 20 at most. Therefore, generating heat maps for the compressed images produced by CCTV cameras installed at many locations required multiple video analysis servers, which increased cost and made it difficult to secure physical space.
  • An object of the present invention is to provide a technique for effectively generating heat maps from compressed images such as H.264 AVC and H.265 HEVC.
  • In particular, an object of the present invention is to provide a technology that can generate a heat map for a compressed image produced by, for example, a CCTV camera with a small amount of computation, not through complex image processing as in the prior art, but by using the syntax (e.g., motion vectors and coding types) obtained by parsing the compressed image data to extract the regions of the image in which meaningful motion exists, that is, the moving object regions, and by accumulating the trajectories of those moving object regions.
  • To achieve the above objects, the syntax-based heat map generation method for a compressed image according to the present invention comprises: a first step of parsing the bitstream of the compressed image to obtain a motion vector and a coding type for each coding unit; a second step of obtaining, for each of the plurality of image blocks constituting the compressed image, a motion vector cumulative value over a preset time; a third step of comparing the motion vector cumulative values of the plurality of image blocks with a preset first threshold; a fourth step of marking image blocks whose motion vector cumulative value exceeds the first threshold as moving object regions; and a fifth step of generating a heat map for the compressed image by accumulating the moving object regions over a series of image frames of the compressed image.
  • In this case, the fifth step may comprise: a step 5a of identifying the plurality of marked moving object regions in the series of image frames constituting the compressed image; a step 5b of calculating a representative position for each of the plurality of moving object regions; a step 5c of calculating hit data by accumulating the calculated representative positions; and a step 5d of generating the heat map for the compressed image based on the hit data.
  • The fifth step may further comprise: a step 5e of newly issuing and assigning a Unique ID to each moving object region identified in the series of image frames while in an ID-unassigned state, and revoking the Unique ID of a moving object region that disappears from the image frames while in a Unique-ID-assigned state; a step 5f of deriving a movement trajectory for each Unique ID by sorting the calculated representative positions by Unique ID; and a step 5g of reinforcing the hit data by accumulating the movement trajectories of the Unique IDs.
  • The heat map generation method may further comprise: a step a of identifying a plurality of image blocks adjacent to a moving object region (hereinafter referred to as 'neighbor blocks'); a step b of comparing the motion vector values of the plurality of neighbor blocks with a preset second threshold; a step c of additionally marking neighbor blocks whose motion vector value exceeds the second threshold as moving object regions; a step d of additionally marking, among the plurality of neighbor blocks, those whose coding type is an intra picture as moving object regions; and a step e of performing interpolation on the plurality of moving object regions to additionally mark, as moving object regions, no more than a preset number of unmarked image blocks surrounded by moving object regions.
  • the computer program according to the present invention is stored in the medium in combination with hardware to execute the syntax-based heat map generation method for the compressed image as described above.
  • a heat map can be efficiently generated from a CCTV compressed image without complex processing such as decoding, downscale resizing, difference image acquisition, image analysis, and the like.
  • FIG. 1 is a block diagram showing a general configuration of a video decoding apparatus.
  • FIG. 2 is a flowchart illustrating a process of generating a heat map from a compressed image in the prior art.
  • FIG. 3 is a flowchart illustrating an entire process of a syntax based heatmap generation process for a compressed image according to the present invention.
  • FIG. 4 is a flowchart illustrating an embodiment of a process of detecting effective motion from a compressed image in the present invention.
  • FIG. 5 is a diagram illustrating an example of a result of applying an effective motion region detection process according to the present invention to a CCTV compressed image.
  • FIG. 6 is a flowchart illustrating an example of a process of detecting a boundary region for a moving object region in the present invention.
  • FIG. 7 is a diagram illustrating an example of the result of applying the boundary region detection process according to the present invention to the CCTV image of FIG. 5.
  • FIG. 8 is a diagram illustrating an example of a result of arranging a moving object region through interpolation with respect to the CCTV image of FIG. 7.
  • FIG. 9 is a flowchart illustrating an embodiment of a process of generating a heat map from a moving object region detected in a compressed image according to the present invention.
  • FIG. 10 is a diagram illustrating an example in which Unique IDs are assigned to moving object regions in the present invention.
  • FIG. 11 is a diagram illustrating an example in which center coordinates are set for moving object regions in the present invention.
  • The heat map generation process according to the present invention may preferably be performed by a video analysis server in a system that handles compressed images, for example a CCTV video surveillance system or a CCTV video analysis system.
  • The present invention parses the bitstream of the compressed image, without decoding it, and quickly extracts moving object regions from the syntax information of each image block (i.e., macroblock or subblock), preferably the motion vector and coding type information.
  • The moving object regions obtained in this way do not precisely reflect the boundaries of the moving objects, as can be seen in the images attached to this specification, but they are obtained at high processing speed while still exhibiting a useful level of reliability.
  • the present invention generates a heat map for the space by accumulating the information of the moving object region thus obtained.
  • the moving object region can be extracted and the heat map can be generated without decoding the compressed image.
  • However, the scope of the present invention is not limited in the sense that a device or software to which the present invention is applied must never perform decoding of the compressed image.
  • Step (S100): First, effective motion that is substantially meaningful is detected based on the motion vectors of the compressed image, and the image regions in which effective motion is detected are set as moving object regions.
  • the motion vector and coding type of a coding unit of a compressed image are parsed according to a video compression standard such as H.264 AVC and H.265 HEVC.
  • the size of the coding unit is generally about 64x64 to 4x4 pixels and may be set to be flexible.
  • the motion vectors are accumulated for a predetermined time period (for example, 500 msec) for each image block, and it is checked whether the motion vector accumulation value exceeds the first predetermined threshold (for example, 20). If such an image block is found, it is considered that effective motion has been found in the image block and marked as a moving object area. Accordingly, even if the motion vector is generated, if the cumulative value for a predetermined time does not exceed the first threshold, the image change is assumed to be insignificant and ignored.
  • Step (S200): For the moving object regions detected in (S100), it is detected approximately how far their boundary regions extend, based on the motion vectors and coding types. A plurality of image blocks adjacent to an image block marked as a moving object region are inspected, and if a neighbor block has a motion vector exceeding a second threshold (for example, 0) or its coding type is an intra picture, that block is also marked as a moving object region. Through this process, the block effectively forms a single lump with the moving object region detected in (S100).
  • If an image block in the vicinity of a moving object region in which effective motion was found shows some motion of its own, it is likely to belong to the same lump as that moving object region, so it is also marked as a moving object region.
  • In the case of an intra picture, no motion vector exists, so a determination based on the motion vector is impossible. Accordingly, an intra picture located adjacent to an image block already detected as a moving object region is provisionally assumed to form a lump with the previously extracted moving object region.
  • Step S300 The interpolation is applied to the moving object areas detected at S100 and S200 to clean up the fragmentation of the moving object area.
  • Because the preceding steps decided moving-object membership block by block, a single actual moving object (for example, a person) may contain image blocks in its middle that are not marked as a moving object region, so that the object appears fragmented into several moving object regions.
  • Accordingly, if one or a few unmarked image blocks are surrounded by image blocks marked as moving object regions, they are additionally marked as moving object regions. This merges a moving object region that had been split into several pieces back into one; the effect of this interpolation is clearly visible when FIG. 7 and FIG. 8 are compared.
  • Step S400 The moving object region is quickly extracted from each frame image constituting the compressed image based on the syntax (motion vector, coding type) of the coding unit through the above process.
  • a heat map of the corresponding image is generated by using the extracted result of the moving object region.
  • the present invention accumulates the extraction result of the moving object region over a series of image frames to estimate how frequently the moving object is found for each region in the image and thereby generates a heat map.
  • the detection itself of the moving object region may be accumulated or the movement trajectories of the moving object region may be accumulated.
  • a detailed process of generating a heat map from the extraction result of the moving object region will be described later in detail with reference to FIG. 9.
  • FIG. 4 is a flowchart illustrating an example of a process of detecting effective motion from a compressed image in the present invention
  • FIG. 5 is a diagram illustrating an example of a result of applying the effective motion region detection process according to the present invention to a CCTV compressed image.
  • the process of FIG. 4 corresponds to step S100 in FIG. 3.
  • Step S110 First, a coding unit of a compressed image is parsed to obtain a motion vector and a coding type.
  • a video decoding apparatus performs parsing (header parsing) and motion vector operations on a stream of compressed video according to a video compression standard such as H.264 AVC and H.265 HEVC. Through this process, the motion vector and coding type are parsed for the coding unit of the compressed image.
  • Step S120 Acquire a motion vector cumulative value for a preset time (for example, 500 ms) for each of the plurality of image blocks constituting the compressed image.
  • This step is intended to detect effective motion that is substantially meaningful in the compressed image, such as a moving car, a running person, or a fighting crowd. Swaying leaves, briefly appearing ghosts, and shadows that change slightly with light reflections do move, but they are practically meaningless objects and should not be detected.
  • a motion vector cumulative value is obtained by accumulating a motion vector in units of one or more image blocks for a predetermined time period (for example, 500 msec).
  • the image block is used as a concept including a macroblock and a subblock.
  • Steps (S130, S140): The motion vector cumulative values of the plurality of image blocks are compared with a preset first threshold (e.g., 20), and image blocks whose cumulative value exceeds the first threshold are marked as moving object regions.
  • If an image block with such a motion vector cumulative value is found, some significant movement, that is, effective motion, is considered to have been found in that block, and it is marked as a moving object region.
  • The intent is to selectively detect movements that are worth the attention of monitoring personnel, such as a running person.
  • Conversely, if the cumulative value over the preset time is small and does not exceed the first threshold, the change in the image is assumed to be insignificant and is ignored at the detection stage.
  • FIG. 5 is an example illustrating a result of detecting an effective motion region from a CCTV compressed image through the process of FIG. 4.
  • an image block having a motion vector accumulation value equal to or greater than a first threshold value is marked as a moving object area and displayed as a bold line area.
  • the sidewalk block, the road, and the shadowed part are not displayed as the moving object area, while the walking people or the driving car are displayed as the moving object area.
  • FIG. 6 is a flowchart illustrating an example of a process of detecting a boundary region of a moving object region in the present invention
  • FIG. 7 is a diagram illustrating an example of the result of additionally applying the boundary region detection process of FIG. 6 to the CCTV image of FIG. 5.
  • the process of FIG. 6 corresponds to step S200 in FIG. 3.
  • Looking at FIG. 5, it can be seen that moving objects are not completely marked and only parts of them are marked. That is, for a walking person or a moving car, not the entire object but only some of its blocks are marked. It is also found that several moving object regions are marked for a single moving object. This means that the moving object region criterion adopted in (S100) is very useful for filtering out static regions but is rather strict. Therefore, it is necessary to look around each moving object region and detect the boundary of the moving object.
  • Step S210 First, a plurality of adjacent image blocks are identified based on the image blocks marked as moving object areas by the previous S100. In the present specification, these are referred to as 'neighborhood blocks'. These neighboring blocks are portions that are not marked as the moving object region by S100, and the process of FIG. 6 examines them further to determine whether any of these neighboring blocks may be included in the boundary of the moving object region.
  • Steps (S220, S230): The motion vector values of the plurality of neighbor blocks are compared with a preset second threshold (e.g., 0), and neighbor blocks whose motion vector value exceeds the second threshold are marked as moving object regions. If a block is located adjacent to a moving object region in which practically meaningful effective motion was found and shows some motion of its own, then, given the characteristics of captured video, it is likely to form a single lump with the adjacent moving object region. Therefore, such neighbor blocks are also marked as moving object regions.
  • Step (S240): In addition, neighbor blocks whose coding type is an intra picture are marked as moving object regions.
  • Since an intra picture has no motion vector, it is fundamentally impossible to determine from the motion vector whether motion exists in such a neighbor block. In this case, it is safer to treat an intra picture located adjacent to an image block already detected as a moving object region as belonging to the previously extracted moving object region.
  • FIG. 7 is a diagram illustrating a result of applying a boundary region detection process to a CCTV compressed image.
  • a plurality of image blocks marked as a moving object region are displayed as an area of a bold line.
  • Comparing FIG. 7 with FIG. 5, it can be seen that the moving object regions indicated by bold lines have been extended around the regions marked in FIG. 5, and that, when compared with the image actually captured by the CCTV camera, they now cover the moving objects almost entirely.
  • FIG. 8 is a diagram illustrating an example of the result of tidying up the moving object regions through interpolation according to the present invention, applied to the CCTV image of FIG. 7 to which the boundary region detection process has been applied.
  • Step S300 is a process of arranging the division of the moving object area by applying interpolation to the moving object areas detected in the previous steps S100 and S200.
  • In FIG. 7, unmarked image blocks are found between the moving object regions indicated by bold lines. If unmarked image blocks remain in the middle, the surrounding regions may be treated as several individual moving objects. If the moving object region is fragmented in this way, the result of step (S400) may become inaccurate, and the number of moving object regions increases, complicating the processing of step (S400).
  • Accordingly, unmarked image blocks surrounded by marked moving object regions are additionally marked as moving object regions through interpolation.
  • In FIG. 8, in contrast to FIG. 7, all of the unmarked image blocks lying between the moving object regions have been marked as moving object regions. In this way, the separately marked areas are merged together and treated as one moving object.
  • FIG. 9 is a flowchart illustrating an example of a process of generating a heat map from a moving object region detected in a compressed image according to the present invention, and corresponds to step S400 of FIG. 3.
  • the present invention extracts a moving object region based on syntax information directly obtained from a compressed image.
  • the process of acquiring and analyzing the difference image of the original image by decoding the compressed image of the prior art is unnecessary, and according to the inventor's test, the processing speed is improved by up to 20 times.
  • this approach has the disadvantage of poor precision.
  • the process of generating the heat map also reflects these structural features.
  • Steps (S410, S420): A plurality of moving object regions are identified in the series of image frames constituting the compressed image. For example, if a heat map is generated from 10 minutes of compressed video captured at 24 frames per second, a total of 14,400 frame images are processed through steps (S100) to (S300) of FIG. 3, and in each frame a number of moving object regions, such as those indicated by bold lines in FIG. 8, are identified.
  • Representative coordinates, each indicating a position within the frame image, are then calculated for the plurality of moving object regions identified in this way.
  • For example, the center coordinates (cx1, cy1), (cx2, cy2), (cx3, cy3) of a virtual bounding rectangle surrounding each moving object region may be used.
  • Step (S430): Then, by accumulating, in units of image blocks, the representative coordinates derived from the series of image frames, hit data reflecting how frequently moving objects appear at each location in the space is calculated.
  • Depending on the embodiment, the hit data calculation of step (S430) may be omitted, and the hit data may instead be calculated from the start through steps (S440) to (S470).
  • Steps S440 and S450 A unique ID is allocated to a plurality of moving object areas in a series of image frames constituting the compressed image to treat the moving object area as an 'object' rather than a region.
  • Conversely, when a moving object region to which a Unique ID has been assigned disappears in the course of the series of image frames, the Unique ID allocated in step (S440) is revoked for that region (S450). In other words, a moving object that was previously found and tracked has disappeared from the image.
  • the moving object area is regarded as an object, and the moving trajectory of the object is tracked while moving over a series of frame images in the compressed image.
  • To implement steps (S440) and (S450), it must be possible to determine whether a chunk of interconnected image blocks marked as a moving object region in one image frame is the same as a chunk in the preceding or following frames; only then can it be determined whether the moving object region currently being handled has already been assigned a Unique ID.
  • Since the present invention only determines whether an image block belongs to a moving object region, without examining the original image, it cannot verify whether the chunks of moving object regions in the preceding and following frames are actually the same object. That is, because the content of the image is unknown, a change such as a cat being replaced by a dog at the same spot between frames cannot be identified. However, given that the time interval between frames is very short and that the objects observed by a CCTV camera move at ordinary speeds, such a situation is unlikely to occur.
  • Therefore, the present invention regards the chunks of moving object regions in the preceding and following frames as the same object if the ratio or number of image blocks overlapping between them is equal to or greater than a preset threshold. With this approach, it is possible to determine whether a particular moving object region is moving, is newly appearing, or has disappeared, even without knowing the content of the original image. This judgment is less accurate than the prior art, but it greatly increases the data processing speed, which is advantageous in practical applications.
  • Step (S460): By sorting the representative coordinates calculated in step (S420) for the plurality of moving object regions by Unique ID, a sequence of representative coordinates at which each Unique ID appears across the series of image frames is obtained. This corresponds to a movement trajectory indicating how each moving object, represented by its Unique ID, has moved through the series of image frames.
  • Step (S470): Then, the hit data is reinforced by accumulating the movement trajectories of the Unique IDs, for example in units of image blocks. The hit data calculated in step (S430) reflects only the frequency of appearance of the moving object regions, so it is fast to compute but does not reflect the characteristics of the objects' movement trajectories. The hit data obtained through step (S470) is somewhat slower to compute, but has the advantage of reflecting the movement paths of objects in the space.
  • Step S480 A heat map image is generated for the compressed image based on the hit data calculated in the above process.
  • The present invention may also be embodied in the form of computer-readable code on a computer-readable non-volatile recording medium.
  • Such non-volatile recording media include various types of storage devices such as hard disks, SSDs, CD-ROMs, NAS, magnetic tape, web disks, and cloud disks, and the present invention may also be implemented in other forms in which such code is stored and executed.
  • the present invention may be implemented in the form of a computer program stored in a medium in combination with hardware to execute a specific procedure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention generally relates to a technology for effectively generating a heat-map from a compressed image such as H.264 AVC or H.265 HEVC. More specifically, the present invention relates to a technology that can generate a heat-map with a small number of operations, by extracting a region having some significant motion in an image, that is, a moving object region, by using a syntax (e.g., a motion vector, a coding type) obtained by parsing compressed image data and accumulating a trajectory of the moving object region, instead of generating a heat-map through complex image processing, as in the prior art, for a compressed image generated by, for example, a CCTV camera. According to the present invention, there is an advantage in that a heat-map can be effectively generated from a compressed CCTV image without complex processing such as decoding, downscale resizing, difference image acquisition, image analysis, or the like. In particular, it is possible to generate a heat-map by using about 1/10 of the operation quantity of the prior art, and thus there is an advantage in that the number of available analysis channels of an image analysis server can be increased by about 10 times or more.

Description

Syntax-Based Heat Map Generation Method for Compressed Images
The present invention relates generally to technology for effectively generating heat maps from compressed video such as H.264 AVC and H.265 HEVC.
More specifically, the present invention relates to a technology that can generate a heat map for a compressed image produced by, for example, a CCTV camera with a small amount of computation, not through complex image processing as in the prior art, but by using the syntax (e.g., motion vectors and coding types) obtained by parsing the compressed image data to extract the regions of the image in which meaningful motion exists, that is, the moving object regions, and by accumulating the trajectories of those moving object regions.
There is a field that analyzes what behavior patterns a large number of people show in a specific space and derives meaningful information from them. For example, by analyzing customers' movement paths and interests within a store, important data for sales and marketing decisions can be obtained. Such data can be collected directly, through interviews, membership cards, or sales staff, or indirectly, through CCTV cameras, sensors, and smartphone apps.
A heat map is a way of visually representing people's movement paths and degree of interest in a specific space. The term combines 'heat' and 'map', and a heat map presents information graphically as a thermal distribution. Such a heat map can express people's movements or interest in a camera image as color grades: movements are accumulated over a unit of time, and colors are assigned according to the accumulated amount. In general, to match the intuition of temperature, areas where many movements have accumulated are shown in red tones and areas where relatively few movements have accumulated are shown in blue tones.
For example, using a heat map obtained from video captured by an in-store CCTV camera, a store manager can intuitively see which products customers are interested in and, conversely, which products attract no interest at all. Based on this information, the manager can decide whether to rearrange products in the store, change the pricing policy, or set up a sale event, additionally taking the actual purchase rate into account. Likewise, a heat map obtained from video captured by an alleyway CCTV camera makes it possible to see along which paths people move through the alley.
Hereinafter, the process of generating a heat map from a CCTV compressed image in the prior art is described with reference to FIGS. 1 and 2.
Recently installed CCTV cameras adopt high resolution (e.g., Full HD) and high frame rates (e.g., 24 frames per second), so high-compression, complex video compression technologies such as H.264 AVC and H.265 HEVC are used in consideration of the burden on network bandwidth and storage space. The CCTV camera device generates and provides a compressed video stream, and a playback device decodes the compressed video according to the corresponding technical standard. To determine the presence and movement of objects in a CCTV video to which such compression technology has been applied, the prior art required decoding the compressed video to obtain the reproduced (decompressed) video and then applying image processing to it.
FIG. 1 is a block diagram showing the general configuration of a video decoding apparatus according to the H.264 AVC standard. Referring to FIG. 1, a video decoding apparatus according to H.264 AVC comprises a parser 11, an entropy decoder 12, an inverse transformer 13, a motion vector operator 14, a predictor 15, and a deblocking filter 16. These hardware modules sequentially process the data of the compressed video, thereby decompressing it and restoring the original image data. In this process, the parser 11 parses the motion vector and the coding type for each coding unit of the compressed video. Such a coding unit is generally an image block such as a macroblock or a subblock.
FIG. 2 is a flowchart showing the process of generating a heat map from a compressed video in a conventional video analysis solution.
First, the compressed video is decoded according to a video standard such as H.264 AVC or H.265 HEVC to obtain the reproduced video (S10), and the frame images constituting the reproduced video are downscale-resized to small images, for example about 320x240 (S20). The reason for this downscale resizing is to reduce, even slightly, the processing burden of the subsequent steps. Then, difference images are computed for the resized frame images, the moving objects present in the video are extracted through image analysis, and the coordinates of those moving objects are extracted (S30). Finally, the heat map is generated by accumulating the trajectories of the moving objects through image analysis of the series of frame images over time (S40).
In the prior art, moving objects are extracted in order to generate a heat map, and extracting moving objects from a high-resolution compressed video requires compressed-video decoding, downscale resizing, and image analysis. These are processes of very high complexity, and as a result, in a conventional video surveillance system the capacity that a single video analysis server can handle simultaneously is quite limited. The number of CCTV channels that a current high-performance video analysis server can cover is typically about 20 at most. Therefore, generating heat maps for the compressed videos produced by CCTV cameras installed at many locations required multiple video analysis servers, which increased cost and made it difficult to secure physical space.
An object of the present invention is generally to provide technology for effectively generating heat maps from compressed video such as H.264 AVC and H.265 HEVC.
In particular, an object of the present invention is to provide a technology that can generate a heat map for a compressed video produced by, for example, a CCTV camera with a small amount of computation, not through complex image processing as in the prior art, but by using the syntax (e.g., motion vectors and coding types) obtained by parsing the compressed video data to extract the regions of the image in which meaningful motion exists, that is, the moving object regions, and by accumulating the trajectories of those moving object regions.
To achieve the above objects, the syntax-based heat map generation method for a compressed image according to the present invention comprises: a first step of parsing the bitstream of the compressed image to obtain a motion vector and a coding type for each coding unit; a second step of obtaining, for each of the plurality of image blocks constituting the compressed image, a motion vector cumulative value over a preset time; a third step of comparing the motion vector cumulative values of the plurality of image blocks with a preset first threshold; a fourth step of marking image blocks whose motion vector cumulative value exceeds the first threshold as moving object regions; and a fifth step of generating a heat map for the compressed image by accumulating the moving object regions over a series of image frames of the compressed image.
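To illustrate the flow of these five steps, the following is a minimal Python sketch. It assumes that a bitstream parser (not shown here) already yields per-block syntax records; the `BlockSyntax` type, the frame-window handling, and the |dx| + |dy| motion measure are illustrative assumptions rather than requirements of the method.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class BlockSyntax:
    block: tuple       # (row, col) index of the image block
    mv: tuple          # motion vector (dx, dy) parsed from the bitstream
    is_intra: bool     # True if the coding type is an intra picture

def accumulate_motion(recent_frames):
    """Steps 2-3: sum |dx| + |dy| per block over the frames in the time window."""
    acc = defaultdict(float)
    for frame in recent_frames:            # each frame: list of BlockSyntax
        for b in frame:
            acc[b.block] += abs(b.mv[0]) + abs(b.mv[1])
    return acc

def mark_moving_blocks(acc, threshold1=20):
    """Step 4: mark blocks whose accumulated motion exceeds the first threshold."""
    return {blk for blk, total in acc.items() if total > threshold1}

def build_heat_map(marks_per_frame, grid_shape):
    """Step 5: accumulate the per-frame markings into a block-level hit grid."""
    heat = [[0] * grid_shape[1] for _ in range(grid_shape[0])]
    for marks in marks_per_frame:
        for r, c in marks:
            heat[r][c] += 1
    return heat
```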
In this case, the fifth step may comprise: a step 5a of identifying the plurality of marked moving object regions in the series of image frames constituting the compressed image; a step 5b of calculating a representative position for each of the plurality of moving object regions; a step 5c of calculating hit data by accumulating the calculated representative positions; and a step 5d of generating the heat map for the compressed image based on the hit data.
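As an illustration of steps 5a to 5d, the following sketch computes a representative position as the center of the bounding rectangle of a region's blocks (consistent with the later description of center coordinates) and accumulates those positions into block-level hit data. The function names and the rounding of positions to block indices are assumptions made for this example.

```python
def representative_position(region_blocks):
    """Step 5b: one possible representative position, the center of the
    bounding rectangle of the (row, col) block indices forming the region."""
    rows = [r for r, _ in region_blocks]
    cols = [c for _, c in region_blocks]
    return (min(rows) + max(rows)) / 2.0, (min(cols) + max(cols)) / 2.0

def accumulate_hits(regions_per_frame, grid_shape):
    """Step 5c: accumulate the representative positions of all regions in all
    frames into block-level hit data (step 5d would color-map this grid)."""
    hits = [[0] * grid_shape[1] for _ in range(grid_shape[0])]
    for regions in regions_per_frame:      # one list of block-sets per frame
        for region in regions:
            cr, cc = representative_position(region)
            hits[int(round(cr))][int(round(cc))] += 1
    return hits
```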
The fifth step may further comprise: a step 5e of newly issuing and assigning a Unique ID to each moving object region identified in the series of image frames while in an ID-unassigned state, and revoking the Unique ID of a moving object region that disappears from the image frames while in a Unique-ID-assigned state; a step 5f of deriving a movement trajectory for each Unique ID by sorting the calculated representative positions by Unique ID; and a step 5g of reinforcing the hit data by accumulating the movement trajectories of the Unique IDs.
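The following sketch illustrates one possible reading of steps 5e to 5g: a Unique ID is carried forward when a region overlaps a region of the previous frame by at least a chosen ratio (the 0.5 value here is an arbitrary assumption), and each ID's sequence of representative positions is accumulated back into the hit data. The `representative_position` callable is passed in (for example, the helper from the previous sketch), and a real implementation would also prevent two regions in the same frame from claiming the same ID.

```python
import itertools

def track_regions(regions_per_frame, representative_position, min_overlap=0.5):
    """Steps 5e-5f sketch: reuse a Unique ID when a region shares enough blocks
    with a region of the previous frame, otherwise issue a new ID; IDs that no
    longer match anything are implicitly revoked. Returns each ID's sequence of
    representative positions, i.e. its movement trajectory."""
    new_id = itertools.count(1)
    previous = {}                   # unique_id -> set of blocks in the last frame
    trajectories = {}               # unique_id -> [representative positions]
    for regions in regions_per_frame:
        current = {}
        for region in regions:
            matched = next((uid for uid, prev in previous.items()
                            if len(region & prev) / max(len(region), 1) >= min_overlap),
                           None)
            uid = matched if matched is not None else next(new_id)
            current[uid] = region
            trajectories.setdefault(uid, []).append(representative_position(region))
        previous = current
    return trajectories

def reinforce_hits(hits, trajectories):
    """Step 5g: add every trajectory point back into the block-level hit data."""
    for points in trajectories.values():
        for cr, cc in points:
            hits[int(round(cr))][int(round(cc))] += 1
    return hits
```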
In addition, the heat map generation method according to the present invention may further comprise: a step a of identifying a plurality of image blocks adjacent to a moving object region (hereinafter referred to as 'neighbor blocks'); a step b of comparing the motion vector values of the plurality of neighbor blocks with a preset second threshold; a step c of additionally marking neighbor blocks whose motion vector value exceeds the second threshold as moving object regions; a step d of additionally marking, among the plurality of neighbor blocks, those whose coding type is an intra picture as moving object regions; and a step e of performing interpolation on the plurality of moving object regions to additionally mark, as moving object regions, no more than a preset number of unmarked image blocks surrounded by moving object regions.
Meanwhile, the computer program according to the present invention is stored in a medium, in combination with hardware, in order to execute the syntax-based heat map generation method for a compressed image as described above.
According to the present invention, a heat map can be effectively generated from a CCTV compressed video without complex processing such as decoding, downscale resizing, difference image acquisition, and image analysis. In particular, a heat map can be generated with roughly one tenth of the computation of the prior art, so the number of analysis channels available to a video analysis server can be increased by about ten times or more.
FIG. 1 is a block diagram showing the general configuration of a video decoding apparatus.
FIG. 2 is a flowchart showing a process of generating a heat map from a compressed image in the prior art.
FIG. 3 is a flowchart showing the overall process of syntax-based heat map generation for a compressed image according to the present invention.
FIG. 4 is a flowchart showing an implementation example of the process of detecting effective motion from a compressed image in the present invention.
FIG. 5 is a diagram showing an example of the result of applying the effective motion region detection process according to the present invention to a CCTV compressed image.
FIG. 6 is a flowchart showing an implementation example of the process of detecting the boundary region of a moving object region in the present invention.
FIG. 7 is a diagram showing an example of the result of applying the boundary region detection process according to the present invention to the CCTV image of FIG. 5.
FIG. 8 is a diagram showing an example of the result of tidying up the moving object regions through interpolation for the CCTV image of FIG. 7.
FIG. 9 is a flowchart showing an implementation example of the process of generating a heat map from the moving object regions detected in a compressed image according to the present invention.
FIG. 10 is a diagram showing an example in which Unique IDs are assigned to moving object regions in the present invention.
FIG. 11 is a diagram showing an example in which center coordinates are set for moving object regions in the present invention.
Hereinafter, the present invention is described in detail with reference to the drawings.
FIG. 3 is a flowchart showing the overall process of syntax-based heat map generation for a compressed image according to the present invention. The heat map generation process according to the present invention may preferably be performed by a video analysis server in a system that handles compressed video, for example a CCTV video surveillance system or a CCTV video analysis system.
In the present invention, the bitstream of the compressed video is parsed without decoding it, and moving object regions are quickly extracted from the syntax information of each image block (i.e., macroblock or subblock), preferably the motion vector and coding type information. The moving object regions obtained in this way do not precisely reflect the boundaries of the moving objects, as can be seen in the images attached to this specification, but they are obtained at high processing speed while still exhibiting a useful level of reliability. The present invention then generates a heat map of the space by accumulating the information of the moving object regions obtained in this way.
Meanwhile, according to the present invention, the moving object regions can be extracted and the heat map can be generated without decoding the compressed video. However, the scope of the present invention is not limited in the sense that a device or software to which the present invention is applied must never perform decoding of the compressed video.
Hereinafter, the overall process of generating a heat map from a compressed video according to the present invention is described with reference to FIG. 3.
Step (S100): First, effective motion that is substantially meaningful is detected from the compressed video based on its motion vectors, and the image regions in which effective motion is detected are set as moving object regions.
To this end, the motion vector and coding type of each coding unit of the compressed video are parsed according to the video compression standard, such as H.264 AVC or H.265 HEVC. The size of a coding unit is generally about 64x64 down to 4x4 pixels and may be set flexibly.
Motion vectors are accumulated for each image block over a preset time period (e.g., 500 msec), and it is checked whether the resulting motion vector cumulative value exceeds a preset first threshold (e.g., 20). If such an image block is found, effective motion is considered to have been found in that block, and it is marked as a moving object region. Accordingly, even if motion vectors occur, when the cumulative value over the preset time does not exceed the first threshold, the image change is assumed to be insignificant and ignored.
Step (S200): For the moving object regions detected in (S100), it is detected approximately how far their boundary regions extend, based on the motion vectors and coding types. A plurality of image blocks adjacent to an image block marked as a moving object region are inspected, and if a neighbor block has a motion vector exceeding a second threshold (e.g., 0) or its coding type is an intra picture, that block is also marked as a moving object region. Through this process, the block effectively forms a single lump with the moving object region detected in (S100).
If an image block in the vicinity of a moving object region in which effective motion was found shows some motion of its own, it is likely to belong to the same lump as that moving object region, so it is also marked as a moving object region. In the case of an intra picture, no motion vector exists, so a determination based on the motion vector is impossible. Accordingly, an intra picture located adjacent to an image block already detected as a moving object region is provisionally assumed to form a lump with the previously extracted moving object region.
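A minimal sketch of this boundary expansion is given below. It assumes that per-block motion magnitudes and intra flags are available as dictionaries from the bitstream parser, and it uses 8-neighborhood adjacency, which the specification does not mandate.

```python
def expand_boundary(marked, mv_magnitude, is_intra, threshold2=0):
    """Step S200 sketch: grow each marked moving object region into adjacent
    blocks that either show any motion of their own (> threshold2) or are
    intra-coded. `marked` is a set of (row, col) blocks; `mv_magnitude` and
    `is_intra` are per-block lookups assumed to come from the parser."""
    neighbors = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    added = set()
    for r, c in marked:
        for dr, dc in neighbors:
            nb = (r + dr, c + dc)
            if nb in marked or nb in added:
                continue
            if mv_magnitude.get(nb, 0) > threshold2 or is_intra.get(nb, False):
                added.add(nb)
    return marked | added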
Step (S300): Interpolation is applied to the moving object regions detected in (S100) and (S200) to clean up the fragmentation of the moving object regions. Because the preceding steps decided moving-object membership block by block, a single actual moving object (e.g., a person) may contain image blocks in its middle that are not marked as a moving object region, so that the object appears fragmented into several moving object regions. Accordingly, if one or a few unmarked image blocks are surrounded by image blocks marked as moving object regions, they are additionally marked as moving object regions. This merges a moving object region that had been split into several pieces back into one; the effect of this interpolation is clearly visible when FIG. 7 and FIG. 8 are compared.
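The following sketch illustrates one possible interpolation rule: an unmarked block is filled when all of its in-bounds 4-neighbors are already marked, repeated until nothing changes. The exact notion of 'surrounded' and the limit on how many blocks may be filled are left open by the text, so this is only an assumed reading.

```python
def interpolate_holes(marked, grid_shape):
    """Step S300 sketch: additionally mark unmarked blocks enclosed by marked
    blocks so that a fragmented object becomes one region. 'Enclosed' is taken
    here as all in-bounds 4-neighbors being marked, which is one possible
    interpretation of the preset-number-of-surrounded-blocks condition."""
    rows, cols = grid_shape
    filled = set(marked)
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if (r, c) in filled:
                    continue
                around = [(r + dr, c + dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))]
                around = [nb for nb in around if 0 <= nb[0] < rows and 0 <= nb[1] < cols]
                if around and all(nb in filled for nb in around):
                    filled.add((r, c))
                    changed = True
    return filled
```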
Step (S400): Through the above process, moving object regions have been quickly extracted from each frame image constituting the compressed video based on the syntax (motion vectors, coding types) of the coding units. In step (S400), a heat map of the video is generated using these moving object region extraction results.
To this end, the present invention accumulates the moving object region extraction results over a series of image frames, estimates how frequently a moving object is found in each area of the image, and generates the heat map accordingly. In this process, either the detections of the moving object regions themselves or the movement trajectories of the moving object regions may be accumulated. The detailed process of generating a heat map from the moving object region extraction results is described later with reference to FIG. 9.
FIG. 4 is a flowchart showing an implementation example of the process of detecting effective motion from a compressed video in the present invention, and FIG. 5 shows an example of the result of applying the effective motion region detection process of the present invention to a CCTV compressed video. The process of FIG. 4 corresponds to step (S100) of FIG. 3.
Step (S110): First, the coding units of the compressed video are parsed to obtain motion vectors and coding types. Referring to FIG. 1, the video decoding apparatus performs syntax analysis (header parsing) and motion vector computation on the compressed video stream according to a video compression standard such as H.264 AVC or H.265 HEVC. Through this process, the motion vector and coding type are parsed for each coding unit of the compressed video.
Step (S120): For each of the plurality of image blocks constituting the compressed video, a motion vector accumulation value over a preset time (e.g., 500 ms) is obtained.
This step is intended to detect motion that is genuinely meaningful in the compressed video, such as a moving car, a running person, or a brawling crowd. Swaying leaves, briefly appearing ghosts, and shadows that shift slightly with reflected light do move, but are essentially meaningless objects and should not be detected.
To this end, motion vectors are accumulated in units of one or more image blocks over a preset period (e.g., 500 msec) to obtain the motion vector accumulation value. Here, the term 'image block' is used as a concept encompassing both macroblocks and subblocks.
Steps (S130, S140): For the plurality of image blocks, the motion vector accumulation value is compared with a preset first threshold (e.g., 20), and image blocks whose accumulation value exceeds the first threshold are marked as moving object regions.
If an image block with an accumulation value above this level is found, something meaningful, i.e., effective motion, is deemed to have occurred in that block, and it is marked as a moving object region. The intent is to single out motion significant enough to interest a monitoring agent in a video surveillance system, such as a person running. Conversely, even when motion vectors are present, if the accumulated value over the period is too small to exceed the first threshold, the change in the image is presumed to be minor and is ignored at the detection stage.
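As a rough illustration of steps (S110) to (S140), the following Python sketch accumulates per-block motion vector magnitudes over one 500 msec window and marks the blocks that exceed the first threshold. It is only a sketch: the per-block dictionary layout, the upstream parser that would produce it, and the use of absolute motion vector components as the accumulated quantity are assumptions, not the patent's reference implementation.

    from collections import defaultdict

    WINDOW_MS = 500        # example accumulation window from the description
    FIRST_THRESHOLD = 20   # example first threshold from the description

    def mark_effective_motion(window_frames):
        # window_frames: frames falling inside one 500 msec window, each given as a dict
        # {(bx, by): {'mv': (dx, dy), 'intra': bool}} keyed by block position (assumed layout)
        accum = defaultdict(float)
        for blocks in window_frames:
            for pos, blk in blocks.items():
                dx, dy = blk['mv']
                accum[pos] += abs(dx) + abs(dy)   # accumulate motion vector magnitude per block
        # blocks whose accumulated motion exceeds the first threshold become moving object regions
        return {pos for pos, total in accum.items() if total > FIRST_THRESHOLD}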
FIG. 5 visualizes the result of detecting effective motion regions from a CCTV compressed video through the process of FIG. 4. In FIG. 5, image blocks whose motion vector accumulation value is at or above the first threshold are marked as moving object regions and displayed as bold-line areas. The sidewalk, the road, and shadowed areas are not marked as moving object regions, whereas walking pedestrians and moving cars are.
FIG. 6 is a flowchart showing an implementation example of the process of detecting the boundary region of a moving object region in the present invention, and FIG. 7 shows an example of the result of additionally applying the boundary region detection process of FIG. 6 to the CCTV image of FIG. 5, to which the effective motion region detection process has already been applied. The process of FIG. 6 corresponds to step (S200) of FIG. 3.
Looking at FIG. 5, it can be seen that the moving objects are not fully marked; only parts of them are. For a walking person or a moving car, only some of the blocks are marked rather than the whole object, and a single moving object is often split into several marked moving object regions. This means that the criterion for moving object regions adopted in step (S100), while very useful for filtering out ordinary areas, is rather strict. A process of detecting the boundary of the moving object by examining the surroundings of the moving object region is therefore needed.
Step (S210): First, a plurality of image blocks adjacent to an image block marked as a moving object region in step (S100) are identified. For convenience, these are referred to as 'neighbor blocks' in this specification. These neighbor blocks were not marked as moving object regions in step (S100); the process of FIG. 6 examines them further to determine whether any of them should be included within the boundary of the moving object region.
Steps (S220, S230): For the plurality of neighbor blocks, the motion vector value is compared with a preset second threshold (e.g., 0), and neighbor blocks whose motion vector value exceeds the second threshold are marked as moving object regions. A block that is adjacent to a moving object region in which genuinely meaningful effective motion has been recognized, and that shows some motion of its own, is, given the nature of captured video, highly likely to belong to the same cluster as that adjacent moving object region. Such neighbor blocks are therefore also marked as moving object regions.
Step (S240): In addition, among the plurality of neighbor blocks, those whose coding type is an intra picture are marked as moving object regions. Because an intra picture has no motion vector, it is fundamentally impossible to determine from motion vectors whether motion exists in such a neighbor block. In this case, for an intra-coded block adjacent to an image block already detected as a moving object region, it is safer to keep the previously extracted moving object region designation as it is.
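A minimal sketch of the boundary expansion of steps (S210) to (S240), under the same assumed block layout as the previous sketch; the use of the 8-connected neighborhood is also an assumption.

    NEIGHBOURS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                    (0, 1), (1, -1), (1, 0), (1, 1)]

    def expand_boundary(marked, blocks, second_threshold=0):
        # marked: set of (bx, by) positions flagged in step (S100)
        # blocks: one frame's {(bx, by): {'mv': (dx, dy), 'intra': bool}} dict (assumed layout)
        added = set()
        for bx, by in marked:
            for dx, dy in NEIGHBOURS_8:
                pos = (bx + dx, by + dy)
                blk = blocks.get(pos)
                if blk is None or pos in marked:
                    continue
                mv_mag = abs(blk['mv'][0]) + abs(blk['mv'][1])
                # a neighbor with motion above the second threshold, or an intra-coded
                # neighbor, is treated as part of the same moving object cluster
                if mv_mag > second_threshold or blk['intra']:
                    added.add(pos)
        return marked | added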
FIG. 7 visualizes the result of additionally applying the boundary region detection process to the CCTV compressed video; the many image blocks marked as moving object regions through the above process are displayed as bold-line areas. In FIG. 7, the moving object regions shown with bold lines have expanded around those that were already marked in FIG. 5, to the point where, compared with the CCTV footage, they now cover the moving objects almost entirely.
FIG. 8 shows an example of the result of cleaning up the moving object regions through interpolation according to the present invention, applied to the CCTV image of FIG. 7 after the boundary region detection process.
Step (S300) is the process of applying interpolation to the moving object regions detected in steps (S100) and (S200) to clean up their fragmentation. In FIG. 7, unmarked image blocks can be found scattered among the moving object regions shown with bold lines. If such unmarked blocks remain in between, the regions may be treated as if they were many separate moving objects. When the moving object regions are fragmented in this way, the result of step (S400) can become inaccurate, and the larger number of moving object regions also complicates the processing of step (S400).
Accordingly, in the present invention, if one or a few unmarked image blocks are surrounded by image blocks marked as a moving object region, they are marked as a moving object region as well; this is referred to as interpolation. Comparing FIG. 8 with FIG. 7, all of the unmarked image blocks that lay between moving object regions have been marked as moving object regions. In this way, areas that move together as one mass are grouped and handled as a single moving object.
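The interpolation of step (S300) could be sketched as follows; the 'at least 6 of 8 marked neighbors' fill criterion is an illustrative assumption, since the description only speaks of one or a few enclosed unmarked blocks.

    def interpolate_holes(marked, grid_w, grid_h, min_marked_neighbours=6):
        # fill isolated unmarked blocks that are enclosed by marked ones
        neighbours_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                        (0, 1), (1, -1), (1, 0), (1, 1)]
        filled = set(marked)
        for bx in range(grid_w):
            for by in range(grid_h):
                if (bx, by) in filled:
                    continue
                surrounded = sum(((bx + dx, by + dy) in marked) for dx, dy in neighbours_8)
                if surrounded >= min_marked_neighbours:
                    filled.add((bx, by))
        return filled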
Comparing FIG. 5 and FIG. 8 shows that, through the boundary region detection and interpolation processes, the moving object regions come to reflect the actual scene more faithfully. Judging by the bold-line clusters in FIG. 5, the scene would be treated as if many very small objects were moving around, which does not match reality. Judging by the bold-line clusters in FIG. 8, the scene is treated as containing a few moving objects of reasonable size, which reflects the actual scene much more closely.
FIG. 9 is a flowchart showing an implementation example of the process of generating a heat map from the moving object regions detected in the compressed video according to the present invention, and corresponds to step (S400) of FIG. 3.
As described above, the present invention extracts moving object regions based on syntax information obtained directly from the compressed video. The prior-art process of decoding the compressed video, computing difference images on the reconstructed frames, and analyzing them becomes unnecessary; according to the inventors' tests, this yields a processing speed improvement of up to 20 times. This approach does, however, have the weakness of lower precision: conceptually, it does not extract the moving object itself but rather a cluster of image blocks presumed to contain the moving object. The heat map generation process of the present invention also reflects this characteristic.
An embodiment of the heat map generation process adopted in the present invention is described in detail below.
Steps (S410, S420): A plurality of moving object regions are identified from the series of image frames constituting the compressed video. For example, to generate a heat map from 10 minutes of compressed video captured at 24 frames per second, the process of steps (S100) to (S300) of FIG. 3 is applied to a total of 14,400 frame images, identifying many moving object regions such as those shown with bold lines in FIG. 8.
Then, for each of the identified moving object regions, representative coordinates indicating its position within the frame image are calculated. As shown in FIG. 11, the center coordinates (cx1, cy1; cx2, cy2; cx3, cy3) of a virtual best-fit rectangle enclosing the moving object region may be used as the representative coordinates.
Step (S430): Next, the representative coordinates derived from the series of image frames are accumulated, for example in units of image blocks, to compute heat data reflecting how frequently moving objects appear at each location in the scene. A heat map can be generated directly from the heat data computed in step (S430), but it is preferable to generate the heat map after reinforcing the heat data through steps (S440) to (S470) below. Depending on the implementation, the heat data computation of step (S430) may also be omitted, with the heat data computed from the outset through steps (S440) to (S470).
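Steps (S410) to (S430) might look roughly like the following sketch, which uses the center of the bounding rectangle as the representative coordinate, as in the description; the region representation (a set of block positions) and the block-resolution heat grid are assumptions.

    def representative_coordinate(region):
        # center of the tight bounding rectangle of one connected marked region
        xs = [bx for bx, _ in region]
        ys = [by for _, by in region]
        return (min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0

    def accumulate_heat(frames_regions, grid_w, grid_h):
        # frames_regions: per frame, a list of connected regions, each a set of block positions
        # returns a block-resolution grid counting how often a representative coordinate
        # falls into each cell (step S430)
        heat = [[0] * grid_w for _ in range(grid_h)]
        for regions in frames_regions:
            for region in regions:
                cx, cy = representative_coordinate(region)
                heat[int(round(cy))][int(round(cx))] += 1
        return heat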
Steps (S440, S450): By assigning and managing a Unique ID for each moving object region across the series of image frames constituting the compressed video, a moving object region is handled not merely as a region but as an 'object'.
First, to treat a moving object region as a single object, if a moving object region that had no identifier (ID) assigned in the previous frame is found in the current frame, a new Unique ID is issued and assigned to it (S440). In other words, a new moving object has appeared in the video. FIG. 10 shows an example in which Unique IDs are assigned to three moving object regions.
Conversely, when a moving object region that held a Unique ID disappears over the series of image frames, the Unique ID assigned to it in step (S440) is revoked (S450). That is, a moving object that had previously been detected and tracked has disappeared from the video.
Through this issuing and revoking of Unique IDs, a moving object region is treated as if it were an object, and its movement trajectory is tracked across the series of frame images of the compressed video.
The processing performed in steps (S440, S450) deserves a closer look. In these steps, it must be possible to determine whether a cluster of interconnected image blocks marked as a moving object region is the same cluster across consecutive image frames; only then can it be determined whether a Unique ID was previously assigned to the moving object region currently being handled.
Because the present invention only checks whether each image block belongs to a moving object region, without interpreting the original image, it cannot verify whether the clusters of moving object regions in consecutive frames are in fact the same. Since the image content is not examined, a change such as a cat being replaced by a dog at the same spot between consecutive frames would not be recognized. However, given that the interval between frames is very short and that the objects observed by a CCTV camera move at ordinary speeds, the likelihood of such an event is very low.
Therefore, in the present invention, moving object region clusters in consecutive frames whose overlapping image blocks exceed a certain threshold in ratio or count are presumed to be the same moving object region. With this approach, even without knowing the content of the original video, it is possible to determine whether a particular moving object region is moving, whether a new moving object region has appeared, or whether an existing moving object region has disappeared. This determination is less accurate than the prior art, but it dramatically increases data processing speed, which is advantageous in practical applications.
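The overlap-based identity test of steps (S440) and (S450) could be sketched as follows; the 0.5 overlap ratio and the greedy matching order are illustrative assumptions, since the description only requires the overlapping ratio or count to exceed some threshold.

    import itertools

    _next_id = itertools.count(1)

    def match_regions(prev_tracked, current_regions, overlap_ratio=0.5):
        # prev_tracked: {unique_id: set of block positions} from the previous frame
        # current_regions: list of block-position sets found in the current frame
        # a region sharing at least overlap_ratio of its blocks with a previous region
        # inherits that region's Unique ID; otherwise it receives a newly issued ID;
        # IDs absent from the returned dict are implicitly revoked
        tracked = {}
        for region in current_regions:
            best_id, best_overlap = None, 0.0
            for uid, prev_region in prev_tracked.items():
                overlap = len(region & prev_region) / max(len(region), 1)
                if overlap > best_overlap:
                    best_id, best_overlap = uid, overlap
            if best_id is not None and best_overlap >= overlap_ratio and best_id not in tracked:
                tracked[best_id] = region              # same object as in the previous frame
            else:
                tracked[next(_next_id)] = region       # newly appeared object
        return tracked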
Step (S460): The representative coordinates computed for the moving object regions in step (S420) are sorted by Unique ID, yielding, for each Unique ID, the sequence of representative coordinates at which it appears across the series of image frames. This corresponds to a movement trajectory indicating how each moving object represented by a Unique ID has moved through the series of image frames.
Step (S470): Then, the per-Unique-ID movement trajectories are accumulated, for example in units of image blocks, to reinforce the heat data. The heat data computed in step (S430) reflects only the frequency of appearance of moving object regions, so it is fast to compute but does not capture the characteristics of the objects' movement trajectories. The heat data obtained in step (S470) is relatively slower to compute but has the advantage of reflecting the objects' paths of movement through the space.
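A sketch of the trajectory reinforcement of steps (S460) and (S470); sampling eight points along each trajectory segment is an assumption made so that the accumulated heat follows the path between consecutive positions rather than only the per-frame positions themselves.

    def reinforce_with_trajectories(per_frame_tracked, heat, samples_per_segment=8):
        # per_frame_tracked: list over frames of {unique_id: set of block positions}
        # groups representative coordinates by Unique ID into per-object trajectories and
        # stamps points sampled along each segment into the existing heat grid (S460-S470)
        trajectories = {}
        for tracked in per_frame_tracked:
            for uid, region in tracked.items():
                xs = [bx for bx, _ in region]
                ys = [by for _, by in region]
                center = ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)
                trajectories.setdefault(uid, []).append(center)
        for points in trajectories.values():
            for (x0, y0), (x1, y1) in zip(points, points[1:]):
                for i in range(samples_per_segment + 1):
                    t = i / samples_per_segment
                    heat[int(round(y0 + t * (y1 - y0)))][int(round(x0 + t * (x1 - x0)))] += 1
        return heat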
Step (S480): A heat map image for the compressed video is generated based on the heat data computed through the above process.
Meanwhile, the present invention can be embodied in the form of computer-readable code on a computer-readable non-volatile recording medium. Such non-volatile recording media include various types of storage devices, for example hard disks, SSDs, CD-ROMs, NAS, magnetic tape, web disks, and cloud disks, and the code may also be stored and executed in distributed fashion across multiple storage devices connected over a network. The present invention may also be implemented as a computer program stored on a medium in combination with hardware in order to execute a specific procedure.

Claims (7)

  1. A syntax-based method of generating a heat map for a compressed image, the method comprising:
    a first step of parsing a bitstream of the compressed image to obtain a motion vector and a coding type for each coding unit;
    a second step of obtaining, for each of a plurality of image blocks constituting the compressed image, a motion vector accumulation value over a preset time;
    a third step of comparing the motion vector accumulation value of the plurality of image blocks with a preset first threshold;
    a fourth step of marking an image block whose motion vector accumulation value exceeds the first threshold as a moving object region; and
    a fifth step of generating a heat map for the compressed image by accumulating the moving object region over a series of image frames of the compressed image.
  2. The method according to claim 1, wherein the fifth step comprises:
    a step 5a of identifying the plurality of marked moving object regions in the series of image frames constituting the compressed image;
    a step 5b of calculating a representative position for each of the plurality of moving object regions;
    a step 5c of calculating heat data by accumulating the calculated plurality of representative positions; and
    a step 5d of generating the heat map for the compressed image based on the heat data.
  3. The method according to claim 2, wherein the fifth step further comprises, performed between the step 5c and the step 5d:
    a step 5e of, for the plurality of moving object regions identified in the series of image frames constituting the compressed image, newly issuing and assigning a Unique ID to any moving object region identified without an assigned ID, and revoking the Unique ID of any moving object region that disappears from the image frames while holding an assigned Unique ID;
    a step 5f of sorting the calculated plurality of representative positions by Unique ID to derive a movement trajectory for each Unique ID; and
    a step 5g of reinforcing the heat data by accumulating the movement trajectories for each Unique ID.
  4. The method according to claim 1, further comprising, performed between the fourth step and the fifth step:
    a step (a) of identifying a plurality of image blocks adjacent to the moving object region (hereinafter referred to as 'neighbor blocks');
    a step (b) of comparing, for the plurality of neighbor blocks, the motion vector value obtained in the first step with a preset second threshold; and
    a step (c) of additionally marking, as a moving object region, any neighbor block whose motion vector value exceeds the second threshold as a result of the comparison in the step (b).
  5. The method according to claim 4, further comprising, performed after the step (c):
    a step (d) of additionally marking, as a moving object region, any neighbor block whose coding type is an intra picture among the plurality of neighbor blocks.
  6. The method according to claim 5, further comprising, performed after the step (d):
    a step (e) of performing interpolation on the plurality of moving object regions to additionally mark, as a moving object region, up to a preset number of unmarked image blocks surrounded by the moving object region.
  7. A computer program stored on a medium, combined with hardware, for executing the syntax-based method of generating a heat map for a compressed image according to any one of claims 1 to 6.
PCT/KR2019/009372 2018-07-30 2019-07-29 Method for generating syntax-based heat-map for compressed image WO2020027511A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180088394A KR102042397B1 (en) 2018-07-30 2018-07-30 syntax-based method of producing heat-map for compressed video
KR10-2018-0088394 2018-07-30

Publications (1)

Publication Number Publication Date
WO2020027511A1 true WO2020027511A1 (en) 2020-02-06

Family

ID=68542075

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/009372 WO2020027511A1 (en) 2018-07-30 2019-07-29 Method for generating syntax-based heat-map for compressed image

Country Status (2)

Country Link
KR (1) KR102042397B1 (en)
WO (1) WO2020027511A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102238124B1 (en) 2019-11-29 2021-04-08 주식회사 다누시스 Detection System For Reverse Direction Moving Object
KR102190486B1 (en) 2020-04-29 2020-12-11 주식회사 다누시스 Pedestrian Abnormal Behavior Detection System For Screening Control Using Optical Flow

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100343780B1 (en) * 2000-07-31 2002-07-20 한국전자통신연구원 Method of Camera Motion Detection in Compressed Domain for Content-Based Indexing of Compressed Video
US20140133703A1 (en) * 2012-11-11 2014-05-15 Samsung Electronics Co. Ltd. Video object tracking using multi-path trajectory analysis
KR20170072131A (en) * 2015-12-16 2017-06-26 파나소닉 아이피 매니지먼트 가부시키가이샤 Human detecting system
KR101798768B1 (en) * 2016-06-07 2017-12-12 주식회사 에스원 Events detection based Image recording device and Method thereof
US20180114067A1 (en) * 2016-10-26 2018-04-26 Samsung Sds Co., Ltd. Apparatus and method for extracting objects in view point of moving vehicle

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150080863A (en) * 2014-01-02 2015-07-10 삼성테크윈 주식회사 Apparatus and method for providing heatmap
KR102085035B1 (en) * 2014-09-29 2020-03-05 에스케이 텔레콤주식회사 Method and Apparatus for Setting Candidate Area of Object for Recognizing Object
KR102126370B1 (en) * 2014-11-24 2020-07-09 한국전자통신연구원 Apparatus and method for analyzing motion
KR20160093809A (en) * 2015-01-29 2016-08-09 한국전자통신연구원 Method and apparatus for detecting object based on frame image and motion vector
KR101640572B1 (en) * 2015-11-26 2016-07-18 이노뎁 주식회사 Image Processing Device and Image Processing Method performing effective coding unit setting
KR101874639B1 (en) * 2016-09-09 2018-07-04 이노뎁 주식회사 CCTV camera device for elevators using motion sensor
KR101949676B1 (en) * 2017-12-20 2019-02-19 이노뎁 주식회사 syntax-based method of providing intrusion detection in compressed video

Also Published As

Publication number Publication date
KR102042397B1 (en) 2019-11-08

Similar Documents

Publication Publication Date Title
WO2020027513A1 (en) Syntax-based image analysis system for compressed image, and interworking processing method
WO2019124635A1 (en) Syntax-based method for sensing object intrusion in compressed video
KR102187376B1 (en) syntax-based method of providing selective video surveillance by use of deep-learning image analysis
US20120275524A1 (en) Systems and methods for processing shadows in compressed video images
WO2018135922A1 (en) Method and system for tracking object of interest in real-time in multi-camera environment
WO2020027511A1 (en) Method for generating syntax-based heat-map for compressed image
WO2016201683A1 (en) Cloud platform with multi camera synchronization
US10410059B2 (en) Cloud platform with multi camera synchronization
WO2018030658A1 (en) Method for detecting, through reconstruction image processing, moving object from stored cctv image
WO2016064107A1 (en) Pan/tilt/zoom camera based video playing method and apparatus
WO2019039661A1 (en) Method for syntax-based extraction of moving object region of compressed video
KR102061915B1 (en) syntax-based method of providing object classification for compressed video
WO2020027512A1 (en) Method for syntax-based object tracking control for compressed image by ptz camera
KR102127276B1 (en) The System and Method for Panoramic Video Surveillance with Multiple High-Resolution Video Cameras
KR102179077B1 (en) syntax-based method of providing object classification in compressed video by use of neural network which is learned by cooperation with an external commercial classifier
KR102015082B1 (en) syntax-based method of providing object tracking in compressed video
KR102015084B1 (en) syntax-based method of detecting fence-climbing objects in compressed video
KR101064946B1 (en) Object abstraction apparatus based multi image analysis and its method
WO2019124633A1 (en) Syntax-based method for sensing wall-climbing object in compressed video
WO2019124632A1 (en) Syntax-based method for sensing loitering object in compressed video
KR102178952B1 (en) method of providing object classification for compressed video by use of syntax-based MRPN-CNN
KR102343029B1 (en) method of processing compressed video by use of branching by motion vector
JPH1115981A (en) Wide area monitoring device and system therefor
KR102153093B1 (en) syntax-based method of extracting region of moving object out of compressed video with context consideration
KR102585167B1 (en) syntax-based method of analyzing RE-ID in compressed video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19845472

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19845472

Country of ref document: EP

Kind code of ref document: A1