WO2020175818A1 - Method and system for object tracking using online learning - Google Patents

Method and system for object tracking using online learning Download PDF

Info

Publication number
WO2020175818A1
Authority
WO
WIPO (PCT)
Prior art keywords
learning
target
computer system
classifier model
object tracking
Prior art date
Application number
PCT/KR2020/001866
Other languages
English (en)
French (fr)
Inventor
강명구
위동윤
배순민
Original Assignee
네이버 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 네이버 주식회사 filed Critical 네이버 주식회사
Priority to CN202080014716.0A (CN113454640A)
Priority to JP2021549487A (JP7192143B2)
Publication of WO2020175818A1
Priority to US17/458,896 (US11972578B2)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Definitions

  • Object pose estimation is important in computer vision, human-machine interaction, and other related areas. For example, if the user's head is regarded as the object to be estimated, estimating the user's continuous head pose reveals rich, individualized information that the user wishes to express. In addition, the estimated pose of the object (e.g., the head) can be used to drive human-machine interaction; for example, head pose estimation can be used to obtain the focus of the user's gaze, enabling more effective human-machine interaction.
  • As an example of object pose estimation technology, Korean Patent Publication No. 10-2008-0073933 (published on August 12, 2008) discloses a technique for automatically tracking the motion of an object in an input video image in real time and determining the pose of the object.
  • Object pose estimation methods currently in use are generally divided into tracking-based methods and learning-based methods.
  • A tracking-based method estimates the pose of an object through paired matching between the current frame and the previous frame in a video sequence.
  • A learning-based method generally formulates object pose estimation as classification or regression, performs training on labeled samples, and estimates the pose of the object using the resulting trained model.
  • The global pattern of each target can be learned through an online learning model to which a classifier that classifies the ID (identification number) of each target has been added.
  • A motion factor based on local patterns and an appearance factor based on global patterns can be used together for tracking.
  • The computer system is configured to execute computer-readable instructions contained in memory.
  • The object tracking method includes: training, by the at least one processor, a classifier model using global pattern matching; and classifying and tracking, by the at least one processor, each target through online learning that includes the classifier model.
  • The training step may include learning the global pattern of each target through a learning model to which a classifier for classifying each target has been added.
  • The training step may include: identifying valid periods, in which the targets exist, across the entire continuous span of the input video; labeling one of the valid periods, creating training data from it, and training the classifier model; and labeling the next valid period, creating training data from it, merging that data with the previously created training data into accumulated training data, and iteratively retraining the classifier model.
  • The labeling may use a similarity matrix of the classifier model computed from the appearance factor, which reflects the global pattern of each target.
  • The training step may further include labeling the invalid periods, other than the valid periods, using the classifier model trained on the valid periods.
  • The tracking step may include: locating the target in every frame of the input video and obtaining the coordinates of each target's keypoints; obtaining a matching score between targets in adjacent frames using the keypoint coordinates of each target; and performing pose matching between frames based on the matching scores.
  • The pose matching may be performed using a similarity matrix computed from a motion factor over the boxes indicating the target positions.
  • The matching score may indicate how close a target in the previous frame is to a target in the next frame.
  • The tracking step may further include performing at least one post-processing process among: removing pose-matching errors through error measurement based on the bounding box indicating the target position, correcting pose-matching errors using interpolation, and smoothing the pose matching based on a moving average.
  • A computer-readable recording medium is provided on which a program for executing the object tracking method on a computer is recorded.
  • A computer system is provided comprising: a memory; and at least one processor connected to the memory and configured to execute computer-readable instructions included in the memory, wherein the at least one processor handles: training a classifier model using global pattern matching; and classifying and tracking each target through online learning that includes the classifier model.
  • The global pattern of each target can be learned through an online learning model to which a classifier for classifying the ID of each target has been added.
  • A motion factor based on local patterns and an appearance factor based on global patterns can be used together for tracking.
  • FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing an example of components that may be included in a processor of a computer system according to an embodiment of the present invention.
  • FIG. 4 illustrates an example of the process of obtaining the keypoint coordinates of a target in an embodiment of the present invention.
  • FIG. 5 shows an example of measuring IoU, which indicates the degree of overlap between regions, in an embodiment of the present invention.
  • FIGS. 6 and 7 show an example of the process of learning the global pattern of a target in an embodiment of the present invention.
  • Embodiments of the present invention relate to a technique for tracking object locations through an online learning model.
  • FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system in an embodiment of the present invention.
  • An object tracking system according to an embodiment of the present invention can be implemented through the computer system 100 of FIG. 1.
  • As components for executing the object tracking method, the computer system 100 may include a processor 110, a memory 120, a persistent storage device 130, a bus 140, an input/output interface 150, and a network interface 160.
  • The processor 110, as a component for object tracking, may include or be part of any device capable of processing a sequence of instructions.
  • The processor 110 may include, for example, a computer processor, or a processor and/or digital processor within a mobile device or other electronic device.
  • The processor 110 may be included in, for example, a server computing device, a server computer, a series of server computers, a server farm, a cloud computer, a content platform, and the like.
  • The processor 110 may be connected to the memory 120 via the bus 140.
  • The memory 120 may include volatile memory, permanent, virtual, or other memory for storing information used by or output by the computer system 100.
  • The memory 120 may include, for example, random access memory (RAM) and/or dynamic RAM (DRAM).
  • The memory 120 may be used to store arbitrary information, such as state information of the computer system 100.
  • The memory 120 may also be used to store the instructions of the computer system 100, including, for example, instructions for object tracking.
  • The computer system 100 may include one or more processors 110 as needed or where appropriate.
  • The bus 140 may include a communication infrastructure that enables interaction among the various components of the computer system 100.
  • The bus 140 may, for example, carry data between the components of the computer system 100, for example between the processor 110 and the memory 120.
  • The bus 140 may include wireless and/or wired communication media between the components of the computer system 100 and may include parallel, serial, or other topological arrangements.
  • The persistent storage device 130 may include components such as memory or another persistent storage device, as used by the computer system 100 to store data for an extended period (e.g., relative to the memory 120).
  • The persistent storage device 130 may include non-volatile main memory as used by the processor 110 in the computer system 100.
  • The persistent storage device 130 may include, for example, flash memory, a hard disk, an optical disk, or another computer-readable medium.
  • The input/output interface 150 may include interfaces to a keyboard, mouse, voice command input, display, or other input or output devices. Configuration commands and/or input for object tracking may be received through the input/output interface 150.
  • The network interface 160 may include one or more interfaces to networks such as a local area network or the Internet. The network interface 160 may include interfaces for wired or wireless connections. Configuration commands and/or input for object tracking may be received through the network interface 160.
  • In other embodiments, the computer system 100 may include more components than those shown in FIG. 1; however, most such conventional components need not be explicitly shown.
  • For example, the computer system 100 may be implemented to include at least some of the input/output devices connected to the input/output interface 150 described above, or may further include other components such as a transceiver, a GPS (Global Positioning System) module, a camera, various sensors, and a database.
  • When performing object tracking on real video, comparison may fail because the object is occluded by other objects or appears blurred due to fast movement, and the same object may be mistakenly recognized as a different object.
  • The pose estimation used in existing object tracking is not 100% accurate and is limited to estimating similar positions from local patterns. As a result, the ID of a target may shift, and the accumulation of these small errors can cause the tracker to drift away from the target object.
  • In the present invention, a target object can be tracked more accurately through an online learning model that uses global pattern matching.
  • FIG. 2 is a diagram showing an example of components that may be included in a processor of a computer system according to an embodiment of the present invention.
  • FIG. 3 is a flowchart showing an example of an object tracking method that can be performed by a computer system according to an embodiment of the present invention.
  • The processor 110 may include an estimation unit 210, a similarity calculation unit 220, a matching unit 230, a post-processing unit 240, and a location providing unit 250. These components of the processor 110 may be expressions of the different functions performed by the processor 110 according to control instructions provided by at least one program code. For example, the estimation unit 210 may be used as a functional expression that operates to control the computer system 100 so that the processor 110 performs pose estimation.
  • The processor 110 and its components can perform steps S310 to S350 of the object tracking method of FIG. 3.
  • The processor 110 and its components may be implemented to execute instructions according to the code of the operating system included in the memory 120 and the at least one program code described above.
  • Here, the at least one program code may correspond to the code of a program implemented to process the object tracking method.
  • The steps of the object tracking method may not occur in the order shown; some steps may be omitted, or additional steps may be included.
  • The processor 110 can load the program code stored in the program file for the object tracking method into the memory 120.
  • For example, the program file for the object tracking method may be stored in the persistent storage device 130 described with reference to FIG. 1.
  • The processor 110 can control the computer system 100 so that the program code is loaded into the memory 120 from the program file stored in the persistent storage device 130 via the bus.
  • The processor 110 and the estimation unit 210, similarity calculation unit 220, matching unit 230, post-processing unit 240, and location providing unit 250 included in the processor 110 may each be different functional expressions of the processor 110 for executing the subsequent steps S310 to S350 by executing the instructions of the corresponding portion of the program code loaded into the memory 120.
  • To execute steps S310 to S350, the processor 110 and its components may directly process operations according to control commands or may control the computer system 100.
  • In step S310, when a video file is input, the estimation unit 210 may perform pose estimation on the input video.
  • At this time, the estimation unit 210 finds the position of the person corresponding to the target object in every frame of the input video, and the coordinates of 17 body locations, such as the head, left and right shoulders, elbows, hands, knees, and feet, can be used as keypoints.
  • For example, the estimation unit 210 can find people in a frame through a YOLO (you only look once) based human detection algorithm and obtain the coordinates of each person's keypoints in a top-down manner.
  • In step S320, the similarity calculation unit 220 may calculate the pose similarity between adjacent frames based on the keypoint coordinates of each person in each frame.
  • In other words, the similarity calculation unit 220 can obtain a matching score indicating the pose similarity between people in two adjacent frames; here, the matching score is an indicator of how close each of the K people in the n-th frame is to each of the K' people in the (n+1)-th frame.
  • In particular, the matching score representing pose similarity may include a motion factor based on local patterns and an appearance factor based on global patterns.
  • The model for calculating the matching score can be implemented as an online learning model to which a classifier that classifies the ID of each target has been added, and the global pattern of each target can be learned through this online learning model.
  • The classifier model according to the present invention can accumulate training data for each target along the time axis; as an example, the training data can include all keypoints of the target.
  • The global pattern of each target can be learned through the classifier model; any network model capable of classification can be applied as the classifier for learning the global pattern.
  • The motion factor can be obtained from the bounding-box IoU (Intersection over Union) and the pose IoU of the boxes indicating the targets' location areas; as shown in FIG. 5, the IoU indicates the degree of overlap between two areas.
  • In step S330, the matching unit 230 may perform pose matching between frames using the result of step S320.
  • Based on the matching score indicating the degree of pose similarity, the matching unit 230 can actually match the i-th box (i.e., the target position) of the n-th frame with the j-th box of the (n+1)-th frame.
  • The matching unit 230 can perform pose matching using a matching algorithm such as the Hungarian method.
  • The matching unit 230 can match the boxes by first calculating the similarity matrix between adjacent frames and then optimizing the assignment with the Hungarian method; here, the similarity matrix for pose matching can be calculated using the motion factor, which reflects IoU.
  • In step S340, the post-processing unit 240 may perform post-processing, including the elimination of false detections, on the pose matching result of step S330.
  • For example, the post-processing unit 240 can eliminate matching errors through bounding-box IoU-based error measurement.
  • The post-processing unit 240 can also correct matching errors using interpolation and, furthermore, can smooth the pose matching based on a moving average or the like.
  • In step S350, the location providing unit 250 may provide the position of each target according to the pose matching as the tracking result.
  • The location providing unit 250 may provide the coordinate values of each target as output.
  • The area marking the position of a target is called the bounding box, and the position of the target can be given as the position coordinates of the bounding box within the frame.
  • The position coordinates of a target can be expressed in forms such as [X coordinate of the left edge, Y coordinate of the top edge, X coordinate of the right edge, Y coordinate of the bottom edge] or [X coordinate of the left edge, Y coordinate of the top edge, rectangle width, rectangle height].
  • FIGS. 6 and 7 show an example of the process of learning the global pattern of a target in an embodiment of the present invention; in particular, they illustrate the sample mining process.
  • Referring to FIG. 6, the model result is obtained by applying the existing tracking technique using the motion factor; the appearance factor is then calculated in a second pass and used for object tracking.
  • Valid periods and invalid periods can be defined and distinguished within the entire video.
  • Here, a valid period refers to a section in which all targets exist, and the hatched portions in FIG. 6 indicate the valid periods.
  • The training data uses an entire continuous section consisting of a plurality of frames. The input unit of the learning model can be a mini-batch sampled from the entire continuous section, and the mini-batch size can be set to a predefined default value or specified by the user.
  • The training data consists of box images containing the target locations and the corresponding target IDs.
  • Here, a box image refers to an image cropped from the full image so as to contain only the area representing each person's position.
  • Given a box image containing an arbitrary person, the output of the learning model is a probability value for each target ID for that box image.
  • In the first stage of training, the training data of the first section is created using the longest valid period (710), and the model is trained using the training data of the first section.
  • The training data at this stage may be the results obtained by the existing object tracking technique, labeled as-is.
  • The box images and target IDs can be used as the training data.
  • In the second stage, the model trained on the first section is used to label the next target section, that is, the second-longest valid period (720), and the training data for the second section is created.
  • The training data of the first and second sections are then merged into accumulated training data, and the model is trained again using it.
  • After training on the valid periods is completed in this iterative manner, prediction for the invalid periods is performed with the model trained on the valid periods.
  • In the labeling process, a similarity matrix for the classifier model is calculated and used to match the boxes; the similarity of the classifier model is calculated using the appearance factor, not the motion factor.
  • As described above, according to embodiments of the present invention, the global pattern of each target can be learned through an online learning model to which a classifier for classifying the ID of each target has been added.
  • The devices described above may be implemented as hardware components, software components, and/or a combination of hardware and software components.
  • For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an ALU (arithmetic logic unit), a digital signal processor, a microcomputer, an FPGA (field programmable gate array), a PLU (programmable logic unit), a microprocessor, or any other device capable of executing and responding to instructions.
  • A processing device can run an operating system (OS) and one or more software applications running on the operating system. The processing device can also access, store, manipulate, process, and create data in response to the execution of software.
  • Although a single processing device is sometimes described for convenience, one processing device can include multiple processing elements and/or multiple types of processing elements; for example, a processing device can include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
  • Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively.
  • Software and/or data may be embodied in any type of machine, component, physical device, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device.
  • The software may be distributed over networked computer systems and stored or executed in a distributed manner.
  • The software and data may be stored on one or more computer-readable recording media.
  • The method according to the embodiments may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium.
  • The medium may continuously store a computer-executable program or temporarily store it for execution or download.
  • The medium may be any of various recording or storage means in the form of single or combined hardware; it is not limited to media directly connected to a particular computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and devices configured to store program instructions, including ROM, RAM, and flash memory.
  • Other examples of media include recording media and storage media managed by app stores that distribute applications and by sites and servers that supply or distribute various other software.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

A method and system for object tracking using online learning are disclosed. The object tracking method includes: training a classifier model using global pattern matching; and classifying and tracking each target through online learning that includes the classifier model.

Description

Specification
Title of Invention: Method and System for Object Tracking Using Online Learning
Technical Field
[1] The following description relates to object tracking technology.
Background Art
[2] Object pose estimation is important in computer vision, human-machine interaction, and other related areas. For example, if the user's head is regarded as the object to be estimated, estimating the user's continuous head pose reveals rich, individualized information that the user wishes to express. In addition, the estimated pose of the object (e.g., the head) can be used to drive human-machine interaction; for example, head pose estimation can be used to obtain the focus of the user's gaze, enabling more effective human-machine interaction.
[3] As an example of object pose estimation technology, Korean Patent Publication No. 10-2008-0073933 (published on August 12, 2008) discloses a technique for automatically tracking the motion of an object in an input video image in real time and determining the pose of the object.
[4] Object pose estimation methods currently in use are generally divided into tracking-based methods and learning-based methods.
[5] A tracking-based method estimates the pose of an object through paired matching between the current frame and the previous frame in a video sequence.
[6] A learning-based method generally formulates object pose estimation as classification or regression, performs training on labeled samples, and estimates the pose of the object using the resulting trained model.
Detailed Description of the Invention
Technical Problem
[7] The global pattern of each target can be learned through an online learning model to which a classifier that classifies the ID (identification number) of each target has been added.
[8] Training data for each target, accumulated along the time axis, can be created and used to train the classifier model.
[9] A motion factor based on local patterns and an appearance factor based on global patterns can be used together for tracking.
Solution to Problem
[10] Provided is an object tracking method performed in a computer system, wherein the computer system includes at least one processor configured to execute computer-readable instructions included in a memory, and the object tracking method includes: training, by the at least one processor, a classifier model using global pattern matching; and classifying and tracking, by the at least one processor, each target through online learning that includes the classifier model.
[11] According to one aspect, the training step may include learning the global pattern of each target through a learning model to which a classifier for classifying each target has been added.
[12] According to another aspect, the training step may include creating, through sample mining, training data for each target that is accumulated along the time axis, and iteratively training the classifier model using the accumulated training data.
[13] According to another aspect, the training step may include: identifying valid periods, in which the targets exist, across the entire continuous span of the input video; labeling one of the valid periods, creating training data from it, and training the classifier model; and labeling the next valid period, creating training data from it, merging that data with the previously created training data into accumulated training data, and iteratively retraining the classifier model.
[14] According to another aspect, the labeling may use a similarity matrix of the classifier model computed from the appearance factor, which reflects the global pattern of each target.
[15] According to another aspect, the training step may further include labeling the invalid periods, other than the valid periods, using the classifier model trained on the valid periods.
[16] According to another aspect, the tracking step may include: locating the target in every frame of the input video and obtaining the coordinates of each target's keypoints; obtaining a matching score between targets in adjacent frames using the keypoint coordinates of each target; and performing pose matching between frames based on the matching scores between targets.
[17] According to another aspect, the pose matching step may perform the pose matching using a similarity matrix computed from a motion factor over the boxes indicating the target positions.
[18] According to another aspect, the matching score may indicate how close a target in the previous frame is to a target in the next frame.
[19] According to another aspect, the tracking step may further include performing at least one post-processing process among: removing pose-matching errors through error measurement based on the bounding box indicating the target position, correcting pose-matching errors using interpolation, and smoothing the pose matching based on a moving average.
[20] Provided is a computer-readable recording medium on which a program for executing the object tracking method on a computer is recorded.
[21] Provided is a computer system including: a memory; and at least one processor connected to the memory and configured to execute computer-readable instructions included in the memory, wherein the at least one processor handles: training a classifier model using global pattern matching; and classifying and tracking each target through online learning that includes the classifier model.
Effects of the Invention
[22] According to embodiments of the present invention, the global pattern of each target can be learned through an online learning model to which a classifier for classifying the ID of each target has been added.
[23] According to embodiments of the present invention, training data for each target, accumulated along the time axis, can be created and used to train the classifier model.
[24] According to embodiments of the present invention, a motion factor based on local patterns and an appearance factor based on global patterns can be used together for tracking.
Brief Description of the Drawings
[25] FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention.
[26] FIG. 2 is a diagram showing an example of components that may be included in a processor of a computer system according to an embodiment of the present invention.
[27] FIG. 3 is a flowchart showing an example of an object tracking method that can be performed by a computer system according to an embodiment of the present invention.
[28] FIG. 4 illustrates an example of the process of obtaining the keypoint coordinates of a target in an embodiment of the present invention.
[29] FIG. 5 illustrates an example of measuring IoU, which indicates the degree of overlap between regions, in an embodiment of the present invention.
[30] FIGS. 6 and 7 illustrate an example of the process of learning the global pattern of a target in an embodiment of the present invention.
Best Mode for Carrying Out the Invention
[32] Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
[33] Embodiments of the present invention relate to a technique for tracking object locations through an online learning model.
[34] Embodiments, including those specifically disclosed in this specification, can learn the global pattern of each target through an online learning model to which a classifier for classifying the ID of each target has been added, thereby achieving significant advantages in terms of accuracy, efficiency, and cost reduction.
[35] FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system in an embodiment of the present invention. For example, an object tracking system according to embodiments of the present invention may be implemented through the computer system 100 of FIG. 1.
[36] As shown in FIG. 1, the computer system 100 may include, as components for executing the object tracking method, a processor 110, a memory 120, a persistent storage device 130, a bus 140, an input/output interface 150, and a network interface 160.
[37] The processor 110, as a component for object tracking, may include or be part of any device capable of processing a sequence of instructions. The processor 110 may include, for example, a computer processor, or a processor and/or digital processor within a mobile device or other electronic device. The processor 110 may be included in, for example, a server computing device, a server computer, a series of server computers, a server farm, a cloud computer, a content platform, and the like. The processor 110 may be connected to the memory 120 via the bus 140.
[38] The memory 120 may include volatile memory, permanent, virtual, or other memory for storing information used by or output by the computer system 100. The memory 120 may include, for example, random access memory (RAM) and/or dynamic RAM (DRAM). The memory 120 may be used to store arbitrary information, such as state information of the computer system 100. The memory 120 may also be used to store the instructions of the computer system 100, including, for example, instructions for object tracking. The computer system 100 may include one or more processors 110 as needed or where appropriate.
[39] The bus 140 may include a communication infrastructure that enables interaction among the various components of the computer system 100. The bus 140 may, for example, carry data between the components of the computer system 100, for example between the processor 110 and the memory 120. The bus 140 may include wireless and/or wired communication media between the components of the computer system 100 and may include parallel, serial, or other topological arrangements.
[40] The persistent storage device 130 may include components such as memory or another persistent storage device, as used by the computer system 100 to store data for an extended period (e.g., relative to the memory 120). The persistent storage device 130 may include non-volatile main memory as used by the processor 110 in the computer system 100. The persistent storage device 130 may include, for example, flash memory, a hard disk, an optical disk, or another computer-readable medium.
[41] The input/output interface 150 may include interfaces to a keyboard, mouse, voice command input, display, or other input or output devices. Configuration commands and/or input for object tracking may be received through the input/output interface 150.
[42] The network interface 160 may include one or more interfaces to networks such as a local area network or the Internet. The network interface 160 may include interfaces for wired or wireless connections. Configuration commands and/or input for object tracking may be received through the network interface 160.
[43] In other embodiments, the computer system 100 may include more components than those shown in FIG. 1; however, most such conventional components need not be explicitly shown. For example, the computer system 100 may be implemented to include at least some of the input/output devices connected to the input/output interface 150 described above, or may further include other components such as a transceiver, a GPS (Global Positioning System) module, a camera, various sensors, and a database.
[44] When performing object tracking on real video, comparison may fail because the object is occluded by other objects or appears blurred due to fast movement, and the same object may be mistakenly recognized as a different object.
[45] For these reasons, the pose estimation used in existing object tracking is not 100% accurate and is limited to estimating similar positions from local patterns. As a result, the ID of a target may shift, and the accumulation of these small errors can cause the tracker to drift away from the target object.
[46] In the present invention, a target object can be tracked more accurately through an online learning model that uses global pattern matching.
[47] Although person tracking is described herein as a representative example, the invention is not limited thereto and can be applied to various objects other than people and to other kinds of targets.
[48] FIG. 2 is a diagram showing an example of components that may be included in a processor of a computer system according to an embodiment of the present invention, and FIG. 3 is a flowchart showing an example of an object tracking method that can be performed by a computer system according to an embodiment of the present invention.
[49] As shown in FIG. 2, the processor 110 may include an estimation unit 210, a similarity calculation unit 220, a matching unit 230, a post-processing unit 240, and a location providing unit 250. These components of the processor 110 may be expressions of the different functions performed by the processor 110 according to control instructions provided by at least one program code. For example, the estimation unit 210 may be used as a functional expression that operates to control the computer system 100 so that the processor 110 performs pose estimation.
[50] The processor 110 and its components can perform steps S310 to S350 of the object tracking method of FIG. 3. For example, the processor 110 and its components may be implemented to execute instructions according to the code of the operating system included in the memory 120 and the at least one program code described above. Here, the at least one program code may correspond to the code of a program implemented to process the object tracking method.
[51] The steps of the object tracking method may not occur in the order shown; some of the steps may be omitted, or additional steps may be included.
[52] The processor 110 can load the program code stored in the program file for the object tracking method into the memory 120. For example, the program file for the object tracking method may be stored in the persistent storage device 130 described with reference to FIG. 1, and the processor 110 can control the computer system 100 so that the program code is loaded from that program file into the memory 120 via the bus. At this time, the processor 110 and the estimation unit 210, similarity calculation unit 220, matching unit 230, post-processing unit 240, and location providing unit 250 included in the processor 110 may each be different functional expressions of the processor 110 for executing the subsequent steps S310 to S350 by executing the instructions of the corresponding portion of the program code loaded into the memory 120. To execute steps S310 to S350, the processor 110 and its components may directly process operations according to control commands or may control the computer system 100.
[53] In step S310, when a video file is input, the estimation unit 210 may perform pose estimation on the input video. At this time, the estimation unit 210 may find the position of the person corresponding to the target object in every frame of the input video and obtain the coordinates of each person's keypoints.
[54] For example, referring to FIG. 4, after locating the target person in every frame of the input video, the coordinates of 17 locations on the found person, including the head, left and right shoulders, left and right elbows, left and right hands, left and right knees, and left and right feet, can be used as keypoints. As an example, the estimation unit 210 can find people in a frame through a YOLO (you only look once) based human detection algorithm and obtain the coordinates of each person's keypoints in a top-down manner.
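By way of illustration only (code is not part of the original disclosure), the per-frame detection-then-keypoints step might be sketched as follows; `detect_people` and `estimate_keypoints` are hypothetical stand-ins for a YOLO-based human detector and a top-down pose estimator:

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]            # (x1, y1, x2, y2) in pixels
Keypoints = List[Tuple[float, float]]      # 17 (x, y) joint coordinates

def estimate_poses(
    frames: Sequence,                                    # e.g., HxWx3 numpy arrays
    detect_people: Callable[[object], List[Box]],        # YOLO-style detector
    estimate_keypoints: Callable[[object], Keypoints],   # top-down pose model
) -> List[List[Dict]]:
    """Step S310 sketch: per frame, find each person and their keypoints."""
    results = []
    for frame in frames:
        people = []
        for (x1, y1, x2, y2) in detect_people(frame):
            crop = frame[y1:y2, x1:x2]     # top-down: pose is run on the person crop
            people.append({
                "box": (x1, y1, x2, y2),
                "keypoints": estimate_keypoints(crop),
            })
        results.append(people)
    return results
```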
[55] Returning to FIG. 3, in step S320, the similarity calculation unit 220 may calculate the pose similarity between adjacent frames based on the keypoint coordinates of each person in each frame. In other words, the similarity calculation unit 220 can obtain a matching score indicating the pose similarity between people in two adjacent frames; here, the matching score is an indicator of how close each of the K people in the n-th frame is to each of the K' people in the (n+1)-th frame.
[56] In particular, in the present invention, the matching score representing pose similarity may include a motion factor based on local patterns and an appearance factor based on global patterns. The model for calculating the matching score can be implemented as an online learning model to which a classifier that classifies the ID of each target has been added, and the global pattern of each target can be learned through this online learning model.
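A minimal sketch of how such a combined score matrix might be assembled; the `motion_score` and `appearance_score` callables and the mixing weight `alpha` are illustrative assumptions, not values specified in the disclosure:

```python
import numpy as np

def matching_scores(people_n, people_n1, motion_score, appearance_score,
                    alpha: float = 0.5) -> np.ndarray:
    """Build the K x K' matching-score matrix between two adjacent frames.

    `motion_score` and `appearance_score` are hypothetical callables that
    return similarities in [0, 1]; `alpha` is an assumed mixing weight.
    """
    scores = np.zeros((len(people_n), len(people_n1)))
    for i, p in enumerate(people_n):
        for j, q in enumerate(people_n1):
            scores[i, j] = (alpha * motion_score(p, q)                 # local pattern
                            + (1.0 - alpha) * appearance_score(p, q))  # global pattern
    return scores
```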
[57] The classifier model according to the present invention can accumulate training data for each target along the time axis; as an example, the training data can include all keypoints of the target. In other words, the global pattern of each target can be learned through the classifier model. Here, any network model capable of classification can be applied as the classifier for learning the global pattern.
[58] The motion factor can be obtained from the bounding-box IoU (Intersection over Union) and the pose IoU of the boxes indicating the targets' location areas. As shown in FIG. 5, the IoU indicates the degree of overlap between two areas, which makes it possible to measure how accurate a prediction is against the ground truth (the actual object boundary) in object detection. The appearance factor can be obtained using sample mining, for judging objective probability, and global pattern matching based on online learning.
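The box-IoU part of the motion factor can be computed as in the following sketch, using the [x1, y1, x2, y2] box convention described in paragraph [62]:

```python
def box_iou(a, b) -> float:
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # top-left of intersection
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # bottom-right of intersection
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```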
[59] Returning to FIG. 3, in step S330, the matching unit 230 may perform pose matching between frames using the result of step S320. In other words, based on the matching score indicating pose similarity, the matching unit 230 can actually match the i-th box (i.e., the target position) of the n-th frame with the j-th box of the (n+1)-th frame.
[60] The matching unit 230 can perform pose matching using a matching algorithm such as the Hungarian method. The matching unit 230 can match the boxes by first calculating the similarity matrix between adjacent frames and then optimizing the assignment with the Hungarian method; here, the similarity matrix for pose matching can be calculated using the motion factor, which reflects IoU.
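A minimal sketch of this assignment step, assuming SciPy's `linear_sum_assignment` as the Hungarian solver; the `min_score` rejection threshold is an illustrative assumption:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_boxes(similarity: np.ndarray, min_score: float = 0.3):
    """Match boxes of frame n to frame n+1 by maximizing total similarity.

    `similarity` is the K x K' matrix from step S320; `min_score` is an
    assumed threshold for discarding weak matches.
    """
    # linear_sum_assignment minimizes cost, so negate the similarities.
    rows, cols = linear_sum_assignment(-similarity)
    return [(i, j) for i, j in zip(rows, cols) if similarity[i, j] >= min_score]
```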
[61] In step S340, the post-processing unit 240 may perform post-processing, including the elimination of false detections, on the pose matching result of step S330. For example, the post-processing unit 240 can remove matching errors through bounding-box IoU-based error measurement. The post-processing unit 240 can also correct matching errors using interpolation and, furthermore, can smooth the pose matching based on a moving average or the like.
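The interpolation and moving-average smoothing might be sketched as follows over one target's track of box coordinates (linear interpolation and the window size are assumptions):

```python
import numpy as np

def postprocess_track(track: np.ndarray, window: int = 5) -> np.ndarray:
    """Fill gaps by linear interpolation, then smooth with a moving average.

    `track` is a T x 4 array of box coordinates with NaN rows for frames where
    matching failed or was rejected; each column must have at least one value.
    """
    track = track.copy()
    t = np.arange(len(track))
    for c in range(track.shape[1]):
        ok = ~np.isnan(track[:, c])
        track[:, c] = np.interp(t, t[ok], track[ok, c])   # interpolation
    kernel = np.ones(window) / window
    for c in range(track.shape[1]):
        track[:, c] = np.convolve(track[:, c], kernel, mode="same")  # smoothing
    return track
```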
[62] In step S350, the location providing unit 250 may provide the position of each target according to the pose matching as the tracking result. The location providing unit 250 may provide the coordinate values of each target as output. The area marking the position of a target is called the bounding box, and the position of the target can be given as the position coordinates of the bounding box within the frame. The position coordinates of a target can be expressed in forms such as [X coordinate of the left edge, Y coordinate of the top edge, X coordinate of the right edge, Y coordinate of the bottom edge] or [X coordinate of the left edge, Y coordinate of the top edge, rectangle width, rectangle height].
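The two coordinate formats are interconvertible, as the following trivial sketch shows:

```python
def xyxy_to_xywh(box):
    """[x1, y1, x2, y2] -> [x1, y1, width, height]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

def xywh_to_xyxy(box):
    """[x1, y1, width, height] -> [x1, y1, x2, y2]."""
    x1, y1, w, h = box
    return [x1, y1, x1 + w, y1 + h]
```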
[63] FIGS. 6 and 7 illustrate an example of the process of learning the global pattern of a target in an embodiment of the present invention. [64] In particular, FIGS. 6 and 7 show the sample mining process.
[65] Referring to FIG. 6, (1) the model result is obtained by applying the existing tracking technique using the motion factor; in the present invention, the existing tracking is applied in a first pass, and the appearance factor is then calculated in a second pass and used for object tracking.
[66] (2) Valid periods and invalid periods can be defined and distinguished within the whole video. Here, a valid period means a section in which all targets exist, and the hatched portions in FIG. 6 indicate the valid periods.
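Assuming the first pass yields a set of detected target IDs per frame, the valid periods could be identified as in this sketch (the all-IDs-present criterion follows paragraph [66]; the helper itself is illustrative):

```python
def valid_periods(frame_ids, all_targets):
    """Return [start, end) intervals of consecutive frames containing all targets.

    `frame_ids` is a list of sets of target IDs detected in each frame;
    `all_targets` is the set of every target ID appearing in the video.
    """
    periods, start = [], None
    for t, ids in enumerate(frame_ids):
        if ids >= all_targets:                 # every target present in this frame
            start = t if start is None else start
        elif start is not None:
            periods.append((start, t))
            start = None
    if start is not None:
        periods.append((start, len(frame_ids)))
    return periods
```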
[67] Referring to FIG. 7, (3) model training is repeated, and the trained model is used to assign labels to the next valid period, adding training examples.
[68] The training data uses an entire continuous section consisting of a plurality of frames. Here, the input unit of the learning model can be a mini-batch sampled from the entire continuous section, and the mini-batch size can be set to a predefined default value or specified by the user.
[69] The training data consists of box images containing the target locations and the corresponding target IDs. Here, a box image refers to an image cropped from the full image so as to contain only the area representing each person's position.
[70] Given a box image containing an arbitrary person, the output of the learning model (network) is a probability value for each target ID for that box image.
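A sketch of such a classifier; the disclosure only requires some network capable of classification, so the backbone and layer sizes below are arbitrary illustrative choices (PyTorch):

```python
import torch
import torch.nn as nn

class TargetIDClassifier(nn.Module):
    """Maps a box image to a probability over target IDs (assumed architecture)."""

    def __init__(self, num_targets: int):
        super().__init__()
        self.features = nn.Sequential(            # tiny stand-in backbone
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_targets)

    def forward(self, box_images: torch.Tensor) -> torch.Tensor:
        # Returns one probability per target ID for each box image.
        return torch.softmax(self.head(self.features(box_images)), dim=-1)
```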
[71] As shown in FIG. 7, in the first stage of training (1st), the longest valid period (710) is used to create the training data of the first section, and the model is trained using the training data of the first section. The training data at this stage may be the results obtained by the existing object tracking technique, labeled as-is, and the box images and target IDs can be used as the training data.
[72] In the second stage (2nd), the model trained on the first section is used to label the next target section, i.e., the second-longest valid period (720), and the training data for the second section is created. The training data of the first and second sections are then merged into accumulated training data, which is used to train the model again.
[73] This process is repeated; after training on the valid periods is complete, prediction (labeling) for the invalid periods is performed with the model trained on the valid periods.
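Paragraphs [71] to [73] suggest a loop of the following shape; `label_period` and `train` are hypothetical helpers, and the valid periods are assumed sorted longest-first:

```python
def online_learning_loop(model, periods, label_period, train):
    """Iteratively label valid periods and retrain on accumulated data.

    `periods`: valid periods sorted longest-first; `label_period(model, p)`
    returns (box_image, target_id) pairs for period p; `train(model, data)`
    fits the classifier on the accumulated data and returns it.
    """
    accumulated = []
    for period in periods:
        labeled = label_period(model, period)  # first pass reuses existing tracker labels
        accumulated.extend(labeled)            # merge with previously created data
        model = train(model, accumulated)      # retrain on the accumulated set
    return model
```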
[74] In the labeling process described above, a similarity matrix for the classifier model is calculated and used to match the boxes; here, the similarity of the classifier model is calculated using the appearance factor rather than the motion factor.
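One plausible reading of this appearance-based similarity (an illustration, not the disclosure's prescribed formula) scores each box image by the classifier's predicted probability for each candidate target ID:

```python
import numpy as np

def appearance_similarity(predict_probs, box_images, target_ids):
    """Entry (i, j): predicted probability that box image i is target_ids[j].

    `predict_probs` is a hypothetical callable returning an N x num_targets
    array of ID probabilities (e.g., the classifier sketched above); the
    resulting rows can then be assigned with the Hungarian method.
    """
    probs = np.asarray(predict_probs(box_images))
    return probs[:, list(target_ids)]
```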
[75] Thus, according to embodiments of the present invention, the global pattern of each target can be learned through an online learning model to which a classifier for classifying the ID of each target has been added; training data for each target, accumulated along the time axis, can be created and used to train the classifier model; and thereby the motion factor based on local patterns and the appearance factor based on global patterns can be used together for object tracking.
[76] The devices described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an ALU (arithmetic logic unit), a digital signal processor, a microcomputer, an FPGA (field programmable gate array), a PLU (programmable logic unit), a microprocessor, or any other device capable of executing and responding to instructions. A processing device can run an operating system (OS) and one or more software applications running on the operating system. The processing device can also access, store, manipulate, process, and create data in response to the execution of software. Although a single processing device is sometimes described for convenience of understanding, those of ordinary skill in the art will appreciate that a processing device can include multiple processing elements and/or multiple types of processing elements. For example, a processing device can include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
[77] Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.
[78] The method according to the embodiments may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The medium may continuously store a computer-executable program or temporarily store it for execution or download. The medium may be any of various recording or storage means in the form of single or combined hardware; it is not limited to media directly connected to a particular computer system and may be distributed over a network. Examples of media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and devices configured to store program instructions, including ROM, RAM, and flash memory. Other examples include recording media and storage media managed by app stores that distribute applications, and by sites and servers that supply or distribute various other software.
Mode for Carrying Out the Invention
[79] Although the embodiments have been described above with reference to limited embodiments and drawings, those of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results can be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are combined or assembled in a form different from the described method, or are replaced or substituted by other components or equivalents.
[80] Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set forth below.

Claims

[Claim 1] An object tracking method performed in a computer system,
wherein the computer system comprises at least one processor configured to execute computer-readable instructions included in a memory, the object tracking method comprising:
training, by the at least one processor, a classifier model using global pattern matching; and
classifying and tracking, by the at least one processor, each target through online learning that includes the classifier model.
[Claim 2] The object tracking method of claim 1, wherein the training comprises:
learning the global pattern of each target through a learning model to which a classifier for classifying each target has been added.
[Claim 3] The object tracking method of claim 1, wherein the training comprises:
creating, through sample mining, training data for each target that is accumulated along the time axis, and iteratively training the classifier model using the accumulated training data.
[Claim 4] The object tracking method of claim 1, wherein the training comprises:
identifying valid periods, in which all targets exist, across the entire continuous span of the input video;
labeling one of the valid periods, creating training data from it, and training the classifier model; and
labeling the next valid period, creating training data from it, merging that data with the previously created training data into accumulated training data, and iteratively retraining the classifier model.
[Claim 5] The object tracking method of claim 4, wherein the labeling uses a similarity matrix of the classifier model computed from an appearance factor that reflects the global pattern of each target.
[Claim 6] The object tracking method of claim 4, wherein the training further comprises:
labeling the periods other than the valid periods using the classifier model trained on the valid periods.
[Claim 7] The object tracking method of claim 1, wherein the tracking comprises:
locating the target in every frame of the input video and obtaining the coordinates of each target's keypoints;
obtaining a matching score between targets in adjacent frames using the keypoint coordinates of each target; and
performing pose matching between frames based on the matching scores between targets.
[Claim 8] A computer-readable recording medium on which a program for causing a computer to execute the object tracking method of any one of claims 1 to 7 is recorded.
[Claim 9] A computer system comprising:
a memory; and
at least one processor connected to the memory and configured to execute computer-readable instructions included in the memory,
wherein the at least one processor handles:
training a classifier model using global pattern matching; and
classifying and tracking each target through online learning that includes the classifier model.
[Claim 10] The computer system of claim 9, wherein the training comprises:
learning the global pattern of each target through a learning model to which a classifier for classifying each target has been added.
[Claim 11] The computer system of claim 9, wherein the training comprises:
creating, through sample mining, training data for each target that is accumulated along the time axis, and iteratively training the classifier model using the accumulated training data.
[Claim 12] The computer system of claim 9, wherein the training comprises:
identifying valid periods, in which all targets exist, across the entire continuous span of the input video;
labeling one of the valid periods, creating training data from it, and training the classifier model; and
labeling the next valid period, creating training data from it, merging that data with the previously created training data into accumulated training data, and iteratively retraining the classifier model.
[Claim 13] The computer system of claim 12, wherein the labeling uses a similarity matrix of the classifier model computed from an appearance factor that reflects the global pattern of each target.
[Claim 14] The computer system of claim 12, wherein the training further comprises:
labeling the periods other than the valid periods using the classifier model trained on the valid periods.
[Claim 15] The computer system of claim 9, wherein the tracking comprises:
locating the target in every frame of the input video and obtaining the coordinates of each target's keypoints;
obtaining a matching score between targets in adjacent frames using the keypoint coordinates of each target; and
performing pose matching between frames based on the matching scores between targets.
PCT/KR2020/001866 2019-02-28 2020-02-11 Method and system for object tracking using online learning WO2020175818A1 (ko)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202080014716.0A CN113454640A (zh) Method and system for object tracking using online learning
JP2021549487A JP7192143B2 (ja) Method and system for object tracking using online learning
US17/458,896 US11972578B2 (en) 2019-02-28 2021-08-27 Method and system for object tracking using online training

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020190023916A 2019-02-28 Method and system for object tracking using online learning
KR10-2019-0023916 2019-02-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/458,896 Continuation US11972578B2 (en) 2019-02-28 2021-08-27 Method and system for object tracking using online training

Publications (1)

Publication Number Publication Date
WO2020175818A1 true WO2020175818A1 (ko) 2020-09-03

Family

ID=72240109

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/001866 WO2020175818A1 (ko) Method and system for object tracking using online learning

Country Status (5)

Country Link
US (1) US11972578B2 (ko)
JP (1) JP7192143B2 (ko)
KR (1) KR102198920B1 (ko)
CN (1) CN113454640A (ko)
WO (1) WO2020175818A1 (ko)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022092511A1 * 2020-10-30 2022-05-05 에스케이텔레콤 주식회사 Method for generating a three-dimensional map and method for determining the pose of a user terminal by using the generated three-dimensional map
KR102614895B1 * 2021-02-09 2023-12-19 주식회사 라온버드 System and method for real-time tracking of objects in dynamic camera video
CN116416413A * 2021-12-31 2023-07-11 杭州堃博生物科技有限公司 Motion navigation method, apparatus, and device for a bronchoscope


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4591215B2 (ja) * 2005-06-07 2010-12-01 株式会社日立製作所 Face image database creation method and apparatus
JP4148281B2 (ja) * 2006-06-19 2008-09-10 ソニー株式会社 Motion capture device, motion capture method, and motion capture program
KR20080073933A (ko) 2007-02-07 2008-08-12 삼성전자주식회사 Object tracking method and apparatus, and object pose information calculation method and apparatus
JP6420605B2 (ja) * 2014-09-24 2018-11-07 Kddi株式会社 Image processing apparatus
JP6442746B2 (ja) * 2015-12-24 2018-12-26 キヤノンマーケティングジャパン株式会社 Information processing apparatus, control method, and program
US9600717B1 (en) * 2016-02-25 2017-03-21 Zepp Labs, Inc. Real-time single-view action recognition based on key pose analysis for sports videos

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060009874A (ko) * 2003-04-30 2006-02-01 이 아이 듀폰 디 네모아 앤드 캄파니 Method for tracking and tracing marked articles
KR20130073812A (ko) * 2011-12-23 2013-07-03 삼성전자주식회사 Apparatus and method for object pose estimation
JP2017010224A (ja) * 2015-06-19 2017-01-12 キヤノン株式会社 Object tracking device, object tracking method, and program
KR20170137350A (ko) * 2016-06-03 2017-12-13 (주)싸이언테크 Apparatus and method for learning object movement patterns using a neural network generative model
KR20180009180A (ko) * 2016-07-18 2018-01-26 단국대학교 천안캠퍼스 산학협력단 Fusion object tracking system and method through object reliability evaluation and learning in a mobile environment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022114506A1 * 2020-11-30 2022-06-02 삼성전자주식회사 Electronic device and method for controlling the electronic device

Also Published As

Publication number Publication date
JP2022521540A (ja) 2022-04-08
KR20200105157A (ko) 2020-09-07
CN113454640A (zh) 2021-09-28
KR102198920B1 (ko) 2021-01-06
JP7192143B2 (ja) 2022-12-19
US20210390347A1 (en) 2021-12-16
US11972578B2 (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN108038474B Face detection method, convolutional neural network parameter training method, apparatus, and medium
WO2020175818A1 Method and system for object tracking using online learning
CN108764048B Face keypoint detection method and apparatus
Hannuna et al. DS-KCF: a real-time tracker for RGB-D data
CN108256479B Face tracking method and apparatus
WO2016034008A1 Target tracking method and apparatus
Zhang et al. Semi-automatic road tracking by template matching and distance transformation in urban areas
US11501110B2 (en) Descriptor learning method for the detection and location of objects in a video
CN109255382B Neural network system, method, and apparatus for image matching and localization
CN111798487A Target tracking method, apparatus, and computer-readable storage medium
CN110956131A Single-target tracking method, apparatus, and system
Li et al. Predictive RANSAC: Effective model fitting and tracking approach under heavy noise and outliers
CN113505763B Keypoint detection method, apparatus, electronic device, and storage medium
Ghanem et al. An improved and low-complexity neural network model for curved lane detection of autonomous driving system
CN117388870A Ground-truth generation method, apparatus, and medium for lidar perception models
CN116433722A Target tracking method, electronic device, storage medium, and program product
CN112446231A Crosswalk detection method, apparatus, computer device, and storage medium
CN114241411B Counting model processing method, apparatus, and computer device based on object detection
CN114485694A System and method for automatically detecting building footprints
CN112904365A Map updating method and apparatus
Huang et al. Non-rigid visual object tracking using user-defined marker and Gaussian kernel
KR101373397B1 CSP-based RANSAC sampling method for improving homography accuracy in augmented reality
CN112749293A Image classification method, apparatus, and storage medium
JP2001060265A Image processing apparatus and method, and medium
CN110705479A Model training method and target recognition method, apparatus, device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20762462

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021549487

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20762462

Country of ref document: EP

Kind code of ref document: A1