WO2008047774A1 - Moving image processing device, moving image processing method, and moving image processing program - Google Patents

Moving image processing device, moving image processing method, and moving image processing program

Info

Publication number
WO2008047774A1
WO2008047774A1 PCT/JP2007/070132 JP2007070132W WO2008047774A1 WO 2008047774 A1 WO2008047774 A1 WO 2008047774A1 JP 2007070132 W JP2007070132 W JP 2007070132W WO 2008047774 A1 WO2008047774 A1 WO 2008047774A1
Authority
WO
WIPO (PCT)
Prior art keywords
moving object
object region
winner
vector data
pixel
Prior art date
Application number
PCT/JP2007/070132
Other languages
English (en)
Japanese (ja)
Inventor
Nobuyuki Matsui
Naotake Kamiura
Teijiro Isokawa
Yuzo Ogawa
Akitsugu Ohtsuka
Kenji Iwatani
Original Assignee
Toa Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toa Corporation filed Critical Toa Corporation
Publication of WO2008047774A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/537 Motion estimation other than block-based

Definitions

  • Moving image processing apparatus, moving image processing method, and moving image processing program
  • The present invention relates to a moving image processing apparatus, a moving image processing method, and a moving image processing program, and in particular to detecting a moving object in a moving image using a self-organizing map (SOM).
  • SOM maps multidimensional data to a two-dimensional map, and is used, for example, to classify unknown data.
  • As a technology developed from this SOM, there is, for example, the one disclosed in Patent Document 1.
  • According to this conventional technique, the plurality of cells constituting a map are handled in units of blocks, which are aggregates of cells; that is, learning is performed in units of blocks.
  • Unknown data is then classified based on the block-unit vector data.
  • It is said that this realizes more accurate learning and classification of unknown data than a general SOM, in which learning is performed cell by cell and unknown data is classified based on the vector data of a single cell.
  • In Patent Document 2, a technique is disclosed in which a pseudo map is provided in addition to the unlearned map that is the main map, learning based on learning data consisting of vector data is performed one item at a time on this pseudo map, and after the learning based on all the learning data has been performed, the learning results of the pseudo map are collectively reflected in the unlearned map.
  • According to this conventional technique, because the vector data of each cell constituting the unlearned map does not change during the learning by the pseudo map, the classification of unknown data based on the vector data of each cell is always performed accurately.
  • The conventional technique disclosed in Patent Document 2 is also applicable to the case where the cells constituting the map are handled in units of blocks, as in the conventional technique disclosed in Patent Document 1.
  • Patent Document 1 Japanese Unexamined Patent Publication No. 2006-53842
  • Patent Document 2 Japanese Unexamined Patent Publication No. 2006-79326
  • The purpose of the present invention is to provide a novel moving image processing apparatus, moving image processing method, and moving image processing program capable of accurately detecting a moving object in a moving image using an SOM.
  • To achieve this purpose, a moving image processing apparatus according to the present invention includes extraction means to which image data for one frame constituting a moving image, including pixels that form a moving object region and pixels that form a non-moving object region, is input,
  • and which extracts n (n: plural) features of this image data for each pixel and generates n-dimensional first vector data; and a map in which a plurality of neurons, each having n-dimensional second vector data and belonging to either the moving object region class or the non-moving object region class, are arranged two-dimensionally.
  • The apparatus further includes search means that, for each pixel, searches among blocks, each a collection of mutually adjacent neurons, for the winner block whose third vector data, a statistic of the second vector data of the neurons constituting the block, corresponds to the first vector data;
  • identification means for identifying whether each pixel forms the moving object region or the non-moving object region based on the classes to which the neurons constituting the winner block belong; and, based on the identification result by this identification means and the first vector data of each pixel,
  • updating means for updating the second vector data and the class of the neurons constituting the winner block corresponding to that pixel. Then, after updating based on all the pixels has been performed by the updating means, image data for one new frame constituting the moving image is input to the extraction means.
  • In this apparatus, n features are extracted from the data of each pixel by the extraction means.
  • The extraction means then generates, for each pixel, n-dimensional first vector data representing the extracted n features.
  • each neuron constituting the map has n-dimensional second vector data and belongs to either the moving object region or the non-moving object region class.
  • The search means assembles a plurality of blocks, each being a collection of some mutually adjacent neurons, and searches these blocks, for each pixel, for the winner block corresponding to that pixel.
  • The winner block is the block whose third vector data corresponds most closely to the first vector data,
  • for example the block whose third vector data is at the shortest distance from the first vector data.
  • The third vector data is a statistic, for example an average value, of the second vector data of the neurons constituting each block. Then, based on the classes to which the neurons constituting the winner block belong, the identification means identifies whether each pixel forms the moving object region or the non-moving object region.
  • Based on the identification result by the identification means and the first vector data of each pixel, the second vector data and the class of each neuron constituting the winner block corresponding to that pixel are, so to speak, learned by the updating means. Then, after learning based on all the pixels has been performed by the updating means, image data for one new frame is input to the extraction means. That is, every time the frame changes, the identification of each pixel and the learning based on each pixel after the identification are repeated.
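  • As a rough illustration of this per-frame cycle of identification followed by learning, the following Python sketch processes one frame; the `som` object and its method names are hypothetical stand-ins for the means described above, not the claimed implementation.

```python
import numpy as np

def process_frame(frame_features, som):
    """One frame of the identify-then-learn cycle (simplified sketch).

    frame_features : (num_pixels, n) array of first vector data, one row per pixel
    som            : hypothetical object holding the map (second vector data and classes)
    """
    labels = []
    for x in frame_features:                # n-dimensional first vector data of one pixel
        block = som.search_winner_block(x)  # block whose third vector data is closest to x
        label = som.identify(block)         # moving-object vs. non-moving-object, from neuron classes
        labels.append(label)
        som.accumulate(block, x, label)     # gather statistics for the later update
    som.update()                            # update second vector data and classes
    return np.array(labels)
```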
  • the present invention may further include display means for displaying only the pixels identified as forming the moving object region by the identifying means. In this way, it is possible to extract only moving objects from the moving image and display them.
  • The search means may include: winner candidate search means that, for each pixel, searches among a plurality of blocks of the same size for a winner candidate block whose third vector data corresponds to the first vector data; repeating execution means that repeatedly executes the search by the winner candidate search means for each pixel so as to sequentially search, within the winner candidate block just found, for another winner candidate block of smaller size; and determination means that determines, for each pixel, as the winner block the block whose third vector data corresponds to the first vector data among the plurality of winner candidate blocks found by the repeated searches. According to this configuration, a plurality of winner candidate blocks of different sizes are searched sequentially by a so-called decision tree method.
  • a true winner block is determined from the plurality of winner candidate blocks.
  • the adoption of the decision tree method for searching for the winner block reduces the amount of calculation required for searching for the winner block and reduces the burden on the search means. This is extremely effective for improving the processing speed of the entire moving image processing apparatus including the search means.
  • In addition, the updating means may perform the updating based on all the pixels collectively, that is, in a batch, after all the pixels have been identified by the identification means. In this way, the amount of computation required for updating by the updating means is reduced, and the burden on the updating means is reduced. This is also extremely effective in improving the processing speed of the entire moving image processing apparatus including the updating means.
  • the image data in the present invention may include color information.
  • the extraction unit extracts the color information as a feature of the image data.
  • The color information mentioned here may be color space information according to the generally known RGB format, color space information according to the YUV format, or color space information according to the CMYK format used for printing.
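  • For reference, a minimal sketch of deriving YUV color information from RGB values, assuming the common BT.601 (analog-form) conversion; the actual conversion used is not specified here.

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an (..., 3) array of RGB values in [0, 1] to analog-form YUV (BT.601)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chrominance
    v = 0.877 * (r - y)                     # red-difference chrominance
    return np.stack([y, u, v], axis=-1)
```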
  • The features of each pixel may include features of neighboring pixels, for example the pixels surrounding it.
  • the extracting means may handle a plurality of adjacent pixels as one pixel. In this way, the processing load of the entire moving image processing apparatus including the extracting means is reduced, and it is extremely effective in improving the processing speed of the entire moving image processing apparatus.
  • The moving image processing apparatus may further include frame setting means that sets a frame including the moving object region and a part of the non-moving object region, and the extraction means may handle only the pixels within the frame. This also reduces the processing load on the entire moving image processing apparatus including the extraction means, and is extremely effective in improving the processing speed of the entire moving image processing apparatus. In addition, the possibility that pixels forming the non-moving object region (especially the non-moving object region outside the frame) will be mistakenly identified as the moving object region is reduced, and the influence of such pixels acting as noise is suppressed.
  • A moving image processing method according to the present invention likewise extracts n (n: plural) features, for each pixel, from image data for one frame constituting a moving image including pixels that form a moving object region and pixels that form a non-moving object region.
  • A moving image processing program according to the present invention causes a computer to process image data for one frame constituting a moving image including pixels that form a moving object region and pixels that form a non-moving object region, by executing:
  • an extraction procedure for extracting n (n: plural) features for each pixel and generating n-dimensional first vector data; a map formation procedure for forming a map in which a plurality of neurons, each having n-dimensional second vector data and belonging to either the moving object region class or the non-moving object region class, are arranged two-dimensionally;
  • a search procedure for searching, for each pixel, for the winner block whose third vector data, a statistic of the second vector data of the neurons constituting each block, corresponds to the first vector data; and identification and update procedures corresponding to those described above.
  • The extraction procedure, map formation procedure, search procedure, identification procedure, and update procedure are executed by the computer, and after updating based on all the pixels by the update procedure, image data for one new frame becomes the target of processing by the extraction procedure.
  • FIG. 1 is a diagram showing a schematic configuration of an embodiment of the present invention.
  • FIG. 2 is an illustrative view showing a relationship between an input image and an output image in the same embodiment.
  • FIG. 3 is a block diagram showing a detailed configuration of the moving image processing apparatus in FIG. 1.
  • FIG. 4 is an illustrative view for explaining the contents of processing by the image dividing unit in FIG. 3.
  • FIG. 5 is an illustrative view for explaining the contents of processing by the frame setting unit in FIG. 3;
  • FIG. 6 is an illustrative view conceptually showing the structure of the map in FIG. 3.
  • FIG. 7 is an illustrative view for illustrating the contents of processing by the control unit in FIG. 3.
  • FIG. 8 is an illustrative view conceptually showing a state where the map in FIG. 3 is classified.
  • FIG. 9 is an illustrative view showing an output image corresponding to FIG. 5.
  • FIG. 10 is an illustrative view showing an actual input image and output image in the same embodiment.
  • FIG. 11 is a flowchart showing an outline of an object detection task executed by the control unit in FIG. 3.
  • FIG. 12 is a flowchart following FIG. 11.
  • FIG. 13 is a flowchart showing details of a winner block search process in FIG. 11.
  • FIG. 14 is a flowchart showing details of the update preparation process in FIG.
  • a moving image processing system 10 includes a color video camera (hereinafter simply referred to as a camera) 20, a moving image processing device 30, and a monitor 40.
  • the camera 20 is a so-called fixed type, and is fixed at an appropriate place by a fixing tool (not shown).
  • the camera 20 converts the incident optical image into a composite video signal that is an analog electric signal and outputs the composite video signal.
  • the composite video signal output from the camera 20 is input to the moving image processing device 30.
  • the moving image processing device 30 performs the following processing on the input composite video signal.
  • an input image according to a composite video signal includes a moving object region 100 and a non-moving object region, that is, a background region 102, as shown in FIG.
  • the moving image processing device 30 takes out only the moving object region 100 out of these, and generates a processed video signal processed so as to display an image obtained by taking out only the moving object region 100.
  • This processed video signal is input to the monitor 40, whereby an image of only the moving object region 100 as shown in FIG. 2B is displayed on the display screen of the monitor 40.
  • That is, the moving image processing device 30 has a function of automatically detecting the moving object region 100 in the moving image given from the camera 20 and displaying the moving object region 100 on the monitor 40.
  • the moving image processing apparatus 30 is configured as shown in FIG. 3, for example.
  • In the moving image processing device 30, the composite video signal from the camera 20 is input to an input conversion circuit 50.
  • the input conversion circuit 50 converts the input composite video signal into a digital video signal conforming to the YUV format, that is, color image data.
  • the color image data converted by the input conversion circuit 50 is sequentially input to the image dividing unit 52 for each frame.
  • The image dividing unit 52 divides the input image constituted by the color image data into small sections of a × a pixels (a: an integer of 2 or more) in the horizontal and vertical directions.
  • In this embodiment, H × V, the number of pixels of the input image, is 640 × 480, and a × a is 4 × 4.
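  • A minimal sketch of this division, assuming the frame is held as a NumPy array of shape (V, H, 3) with V = 480 and H = 640; the function name and layout are illustrative only.

```python
import numpy as np

def divide_into_sections(frame, a=4):
    """Split a (V, H, C) frame into non-overlapping a x a small sections.

    Returns an array of shape (V // a, H // a, a, a, C); for a 480 x 640 frame and
    a = 4 this gives 120 x 160 = 19,200 small sections.
    """
    v, h, c = frame.shape
    assert v % a == 0 and h % a == 0, "frame dimensions must be multiples of a"
    return frame.reshape(v // a, a, h // a, a, c).swapaxes(1, 2)
```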
  • the color image data after the division processing by the image dividing unit 52 is sequentially input to the initial detection unit 54 and the frame setting unit 56 one frame at a time.
  • The initial detection unit 54 detects the moving object region 100 when it first appears.
  • Specifically, the moving object region 100 is detected by an image processing method such as the generally known frame difference method.
  • The position (coordinate) data on the image of the pixels representing the moving object region 100, strictly speaking of the small sections 110, 110, ..., is input to the frame setting unit 56.
  • Based on this, the frame setting unit 56 sets a rectangular frame 120 in the frame image at that time so as to enclose the moving object region 100, as shown in FIG. 5.
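  • A rough sketch of this initial detection and frame setting, assuming a simple two-frame difference with a fixed threshold and margin (the text only refers to "a generally known frame difference method", so these details are illustrative assumptions):

```python
import numpy as np

def detect_and_frame(prev_gray, curr_gray, threshold=15, margin=8):
    """Return a rectangular frame (top, bottom, left, right) enclosing moving pixels,
    or None if no motion is found. Inputs are grayscale frames as 2-D arrays."""
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    ys, xs = np.nonzero(diff > threshold)          # pixels that changed between frames
    if ys.size == 0:
        return None
    top = max(int(ys.min()) - margin, 0)
    bottom = min(int(ys.max()) + margin, curr_gray.shape[0] - 1)
    left = max(int(xs.min()) - margin, 0)
    right = min(int(xs.max()) + margin, curr_gray.shape[1] - 1)
    return top, bottom, left, right
```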
  • The feature extraction unit 58 generates feature data X[t, g] for each small section 110 within the rectangular frame 120, and this feature data X[t, g] is input to the control unit 60.
  • The control unit 60 realizes a block-unit learning type SOM. Specifically, the control unit 60 applies the feature data X[t, g] of each small block 110 input from the feature extraction unit 58 to the map 62, and thereby identifies whether that small block 110 forms the moving object region 100 or the background region 102. At the same time, the control unit 60 trains the map 62 using the feature data after identification as learning data, updating the reference vectors w_j described in detail later, and performs classification. Note that the map 62 is still in an unlearned state for the initial first frame in which the moving object region 100 is detected by the initial detection unit 54 described above, so for that frame identification is based on the position data obtained from the initial detection unit 54.
  • On the map 62, the control unit 60 forms various square blocks 66, each composed of 2 × 2 or more neurons 64, 64, ....
  • For each of these blocks 66, 66, ..., a block reference vector B = (b_1, b_2, ..., b_n) is obtained.
  • The block reference vector B mentioned here is a statistic, for example an average value, of the reference vectors w_j of the neurons 64, 64, ... constituting each block 66.
  • An arbitrary (i-th) element b_i of the block reference vector B is expressed by the following Equation 1.
  • The symbol appearing in Equation 1 denotes the total number of neurons 64 constituting the block 66, in other words the maximum value of the number j of a neuron 64 within the block 66.
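  • Equation 1 itself is not reproduced above; the text only states that the block reference vector is a statistic, for example the average, of the reference vectors of the neurons in the block. A sketch under that average-value assumption:

```python
import numpy as np

def block_reference_vector(reference_vectors, block_indices):
    """Block reference vector B as the average of the reference vectors w_j of the
    neurons in one block (assumed form of Equation 1).

    reference_vectors : (m * m, n) array, one n-dimensional reference vector per neuron
    block_indices     : indices of the neurons that make up the block
    """
    return reference_vectors[np.asarray(block_indices)].mean(axis=0)
```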
  • Incidentally, the total number T of blocks 66 that can be considered on the map 62 is enormous, as represented by the following Equation 2, and increases exponentially as the size m × m of the map 62 increases. Therefore, obtaining the Euclidean distance D, and hence the winner block, for all of this huge number of blocks 66, 66, ... places a considerable burden on the control unit 60.
  • For this reason, the control unit 60 in the present embodiment searches for the winner block based on a decision tree method, as shown in FIG. 7.
  • FIG. 7B shows a state in which the block 66 indicated by the second diagonal pattern 68 from the right is a winner candidate block.
  • Next, as also shown in FIG. 7, the control unit 60 selects, within this winner candidate block 68, all the blocks 66, 66, ... of the next smaller size [m-2] × [m-2], and searches among them, in the same manner as described above, for a winner candidate block 68 of [m-2] × [m-2] size. Similarly, the control unit 60 then searches, within the [m-2] × [m-2] winner candidate block 68, for a winner candidate block 68 of the next smaller size [m-3] × [m-3]. This search for winner candidate blocks 68 is continued until a winner candidate block 68 of 2 × 2 size (in this embodiment, [m-4] × [m-4]) has been searched.
  • By this decision tree method, the total number T of blocks 66 for which the Euclidean distance D must be obtained is drastically reduced from the value represented by Equation 2 above to the value represented by Equation 3.
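  • A sketch of this decision-tree search, assuming square blocks, a map stored as an (m, m, n) array with m greater than 2, Euclidean distance to the block average as above, and first-found tie-breaking; these details are illustrative, not taken from the flowchart.

```python
import numpy as np

def search_winner_block(x, ref_vectors):
    """Decision-tree winner block search (sketch of the procedure of FIG. 7 / FIG. 13).

    x           : n-dimensional feature vector of one small section (first vector data)
    ref_vectors : (m, m, n) array of neuron reference vectors (second vector data)

    Returns ((row, col), p): top-left corner and side length of the winner block.
    """
    m = ref_vectors.shape[0]
    candidates = []
    top, left, size = 0, 0, m                   # current search window, initially the whole map
    for p in range(m - 1, 1, -1):               # candidate block sizes m-1, m-2, ..., 2
        best = None
        for r in range(top, top + size - p + 1):
            for c in range(left, left + size - p + 1):
                block = ref_vectors[r:r + p, c:c + p].reshape(-1, ref_vectors.shape[-1])
                d = np.linalg.norm(block.mean(axis=0) - x)   # distance D to the block reference vector
                if best is None or d < best[0]:
                    best = (d, r, c)
        candidates.append((best[0], best[1], best[2], p))
        top, left, size = best[1], best[2], p   # descend into the p x p winner candidate block
    d, r, c, p = min(candidates)                # true winner: shortest distance among all candidates
    return (r, c), p
```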
  • The control unit 60 searches for a winner block for each of the small sections 110. Each time the winner block for a small section 110 is determined, the deviation accumulation amount wd_j[t, g] between the reference vector w_j of each neuron 64 constituting that winner block and the feature data X[t, g] is calculated.
  • Further, the control unit 60 calculates the deviation accumulation rate wr_j[t, g] based on Equation 5 below.
  • In this way, the winner block is determined for all the small sections 110, 110, ..., and the deviation accumulation amount wd_j[t, g] and the deviation accumulation rate wr_j[t, g] are calculated.
  • This series of processing by the control unit 60 is called an epoch.
  • The reference vector w_j of each neuron 64 is updated collectively, that is, in a batch, every time one epoch is completed.
  • The control unit 60 repeats this epoch several times for one frame, for example 30 times. After the 30 epochs have been executed, the epoch is repeated 30 times in the same manner for the next frame.
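  • Equations 4 to 6 are not reproduced in this text. The sketch below therefore uses a simplified stand-in for the batch update: each neuron of a winner block accumulates its deviation from the input and is moved by the average accumulated deviation at the end of the epoch; the deviation accumulation rate wr_j is approximated here only by a win counter.

```python
import numpy as np

class BatchBlockSOM:
    """Simplified batch-update sketch; not the patent's exact Equations 4-6."""

    def __init__(self, m, n, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.w = rng.random((m, m, n))       # reference vectors (second vector data)
        self.wd = np.zeros((m, m, n))        # accumulated deviation per neuron (cf. wd_j)
        self.wins = np.zeros((m, m))         # how often each neuron belonged to a winner block

    def accumulate(self, x, block):
        """Called once per small section, after its winner block has been determined."""
        (r, c), p = block
        self.wd[r:r + p, c:c + p] += x - self.w[r:r + p, c:c + p]
        self.wins[r:r + p, c:c + p] += 1

    def end_of_epoch_update(self):
        """Apply the accumulated deviations in one batch, then reset (cf. Equation 6)."""
        hit = self.wins > 0
        self.w[hit] += self.wd[hit] / self.wins[hit][:, None]
        self.wd[:] = 0.0
        self.wins[:] = 0.0
```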
  • For the first frame, the control unit 60 identifies whether each small section 110 forms the moving object region 100 or the background region 102 based on the position data provided from the initial detection unit 54; for the second and subsequent frames, the map 62 is used for this identification. For this reason, after the identification for the first frame, the control unit 60 classifies the neurons 64, 64, ... on the map 62 based on the identification result.
  • Specifically, the control unit 60 assigns to each neuron 64, 64, ... of the winner block corresponding to each small section 110 an index value representing the identification result of that small section 110 in the first frame, that is, whether the small section 110 forms the moving object region 100 or the background region 102.
  • Then, a statistic of the index values assigned to each neuron 64, for example an average value, is obtained.
  • Based on this statistic, it is decided whether each neuron 64 belongs to the moving object region 100 class or the background region 102 class. As a result, the neurons 64, 64, ... on the map 62 are divided into those belonging to the moving object region 100 (lattice pattern) and those belonging to the background region 102 (hatched pattern), as shown in FIG. 8.
  • Using the classified map 62, the control unit 60 then identifies whether each of the small sections 110, 110, ... within the rectangular frame 120 described above forms the moving object region 100 or the background region 102. Specifically, a small section 110 whose winner block contains more neurons 64, 64, ... belonging to the moving object region 100 is identified as forming the moving object region 100. Conversely, a small section 110 whose winner block contains more neurons 64, 64, ... belonging to the background region 102 is identified as forming the background region 102. A small section 110 whose winner block contains equal numbers of neurons belonging to the moving object region 100 and to the background region 102 is identified as forming a predetermined one of the two regions, for example the moving object region 100.
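  • A compact sketch of this class assignment and majority-vote identification, assuming the index value is encoded as 1 for the moving object region and 0 for the background region, and that the per-neuron statistic is a simple average:

```python
import numpy as np

def classify_neurons(index_sum, index_count):
    """Assign each neuron to the moving-object class (True) or background class (False)
    from the average of the 0/1 index values it received."""
    avg = np.divide(index_sum, index_count,
                    out=np.zeros_like(index_sum, dtype=float), where=index_count > 0)
    return avg >= 0.5

def identify_section(neuron_classes, block):
    """Majority vote over the neurons of the winner block; a tie counts as the moving
    object region, matching the predetermined-region rule in the text."""
    (r, c), p = block
    votes = neuron_classes[r:r + p, c:c + p]
    return votes.sum() >= votes.size / 2.0       # True = forms the moving object region
```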
  • the identification result by the control unit 60 is given to the output conversion unit 70.
  • Color image data is sequentially input to the output conversion unit 70, frame by frame, from the input conversion circuit 50 described above.
  • From the input image constituted by this color image data, the output conversion unit 70 generates the above-described processed video signal, processed so as to display only the pixels constituting the small sections 110 that the control unit 60 has identified as forming the moving object region 100. When this processed video signal is input to the monitor 40, an image of only the moving object region 100, as shown in FIG. 9, is displayed on the display screen of the monitor 40.
  • Furthermore, the control unit 60 classifies the neurons 64, 64, ... on the map 62 again based on the identification result of the second frame. That is, not only the reference vectors w_j of the neurons 64, 64, ... but also their classes are learned. Then, based on the learned map 62, the next, third frame is identified. Thereafter, identification and learning are repeated each time the frame changes. For each frame after the third, the above-described rectangular frame 120 is set based on the identification result of the previous frame; for example, a rectangular frame 120 of the same size as in the previous frame is set so as to surround all the small sections 110, 110, ... identified as forming the moving object region 100 in the previous frame.
  • When the presence of the moving object region 100 is no longer confirmed, the control unit 60 stops identification and learning and resets the initial detection unit 54. Thereby, the moving image processing apparatus 30 returns to the initial state before the moving object region 100 appeared.
  • FIG. 10 shows an example of an actual input image and output image of the moving image processing apparatus 30 of the present embodiment.
  • the image shown on the left is the input image
  • the image shown on the right is the output image.
  • Figures 10(a), (b), and (c) are images of the first frame, the 20th frame, and the 40th frame, respectively. From FIG. 10, it can be seen that only a person crossing the field of view of the camera 20 is detected as a moving object. In other words, it is clear that the moving image processing device 30 of the present embodiment can properly detect the moving object.
  • The control unit 60, which realizes moving object detection using the map 62 in this way, executes the object detection task shown in the flowcharts of FIG. 11 and FIG. 12.
  • After setting "1" to the flag F indicating that the moving object region 100 has been detected, the process proceeds to step S5.
  • In step S5, the control unit 60 initializes the map 62; in detail, it sets a random number in each reference vector w_j of each neuron 64, 64, ... on the map 62. Then the process proceeds to step S7, and feature data X[t, g] is acquired from the feature extraction unit 58. The feature data X[t, g] is also stored by the control unit 60. Furthermore, after setting the initial value "1" in step S9 to the index e indicating the number of executions of the epoch described above, the control unit 60 sets, in step S11, the initial value "1" to the index g representing the number of the small section 110 in the rectangular frame 120 described above, and executes the winner block search process of step S13.
  • In step S13, the control unit 60 searches for a winner block based on the above-described decision tree method.
  • the control unit 60 proceeds to step S15 and determines whether or not the flag F described above is “0”.
  • If the flag F is not "0" in step S15, that is, immediately after the moving object region 100 is detected, the control unit 60 proceeds to step S17 and, based on the initial identification data stored in step S1 described above, controls the output conversion unit 70 so as to display only the moving object region 100. Then, after setting "0" to the flag F in step S19, the control unit 60 proceeds to the update preparation process of step S21.
  • If the flag F is "0" in step S15, that is, if step S17 has already been executed after the moving object region 100 was detected, the control unit 60 proceeds to step S23. In step S23, it is determined whether or not the current epoch execution count e is "1"; if it is "1", the process proceeds to the identification processing of step S25. In step S25, the control unit 60 applies the feature data X[t, g] of the small section 110 that is the current processing target to the map 62, and identifies whether that small section 110 forms the moving object region 100 or the background region 102.
  • In step S27, the output conversion unit 70 is controlled based on the identification result of step S25. That is, the output conversion unit 70 is controlled so that the small section 110 that is the current processing target is displayed when it forms the moving object region 100 and is not displayed when it does not.
  • After step S27, the process proceeds to the update preparation process of step S21.
  • In step S21, the control unit 60 calculates the deviation accumulation amount wd_j[t, g] for each neuron 64 constituting the winner block, based on Equation 4 described above.
  • It also calculates the deviation accumulation rate wr_j[t, g] based on Equation 5. After these calculations, the control unit 60 proceeds to step S29.
  • In step S29, the control unit 60 determines whether or not the number g of the small section 110 that is the current processing target has reached its maximum value G, that is, whether steps S13 to S27 have been executed for all the small sections 110, 110, .... If there is a small section 110 for which steps S13 to S27 have not yet been executed, the process proceeds to step S31, the value of the number g of the small section 110 is incremented by "1", and the process returns to step S13. On the other hand, when steps S13 to S27 have been executed once for all the small sections 110, 110, ..., the process proceeds to step S33.
  • In step S33, the control unit 60 updates the reference vector w_j of each neuron 64 based on Equation 6 described above. Then the process proceeds to step S35 in FIG. 12, where it is determined whether or not the epoch execution count e has reached its maximum value E. As described above, the maximum number E of epoch executions in the present embodiment is 30.
  • When the epoch execution count e has reached E, the control unit 60 performs the classification of the map 62 in the manner described above.
  • Then, in step S43, the feature data X[t+1, g] of a new frame is acquired. After the frame number t is incremented by "1", the process returns to step S7.
  • If the presence of the moving object region 100 is not confirmed in step S39, the control unit 60 proceeds to step S45. In step S45, the initial detection unit 54 is reset, and the series of object detection tasks ends.
  • Next, the winner block search process of step S13 in this object detection task will be described in more detail with reference to FIG. 13.
  • In the winner block search process, the control unit 60 first proceeds to step S101 and performs initialization for the search on the map 62.
  • The control unit 60 then proceeds to step S105 and, among all the blocks 66, 66, ... of size p × p within the current winner candidate block 68, searches for the one for which the Euclidean distance D between its block reference vector B and the feature data X[t, g] is shortest.
  • the control unit 60 stores the block 66 searched in step S105 as the winner candidate block 68 in the next step S107, and also records the Euclidean distance D of the winner candidate block 68.
  • Next, the control unit 60 proceeds to step S109 and determines whether or not the current block size p has reached the minimum value "2". If not, the process proceeds to step S111, the block size p is reduced by "1", and the process returns to step S105. On the other hand, if the block size p has reached the minimum value "2", the process proceeds to step S113.
  • In step S113, the control unit 60 searches, among the plurality of winner candidate blocks 68, 68, ... found by repeating steps S105 to S107 described above, for the one with the shortest Euclidean distance D. The winner candidate block 68 thus found is determined to be the true winner block, and the winner block search process shown in the flowchart of FIG. 13 ends.
  • Next, the update preparation process of step S21 in the object detection task described above will be described in detail with reference to FIG. 14.
  • In the update preparation process, the control unit 60 first proceeds to step S201, where the index j representing the number of a neuron 64 in the current winner block is set to the initial value "1". Then, in step S203, the index i representing the feature (dimension) number is set to the initial value "1", after which the process proceeds to step S205.
  • In step S205, the control unit 60 calculates the deviation accumulation amount wd_j[t, g] based on Equation 4 described above. The calculation result wd_j[t, g] is then stored in the next step S207.
  • The control unit 60 then proceeds to step S209 and calculates the deviation accumulation rate wr_j[t, g] based on Equation 5 above. The calculation result wr_j[t, g] is stored in the next step S211, and the process then proceeds to step S213.
  • In step S217, the control unit 60 determines whether or not the value of the index j representing the number of the neuron 64 has reached its maximum value, that is, whether steps S203 to S215 have been executed for all the neurons 64 in the current winner block. If there is a neuron 64 for which steps S203 to S215 have not yet been executed, the process proceeds to step S219, the value of the index j is incremented by "1", and the process returns to step S203. On the other hand, when steps S203 to S215 have been executed for all the neurons 64, the update preparation process shown in the flowchart of FIG. 14 ends.
  • As described above, in the present embodiment a block-unit learning type SOM, in which the neurons 64, 64, ... constituting the map 62 are handled in units of blocks, is used.
  • In this way, the moving image processing apparatus 30 for detecting the moving object region 100 can be realized.
  • Moreover, because the characteristics of the moving object region 100 in each of its aspects can be accurately captured by learning, it is possible to respond flexibly and appropriately to various aspects (situations) of the moving object region 100.
  • In the present embodiment, the image dividing unit 52 shown in FIG. 3 divides the input image into a × a pixel sections as shown in FIG. 4, but the invention is not limited to this.
  • The image may be divided into a × b (b: an integer different from a) pixel sections, or it may not be divided at all; that is, the image dividing unit 52 may be excluded from the configuration of FIG. 3.
  • However, by dividing the image, the burden on the subsequent stages, particularly the control unit 60, is reduced. This is extremely effective for improving the processing speed of the entire moving image processing apparatus 30 including the control unit 60.
  • In the present embodiment, the frame setting unit 56 shown in FIG. 3 sets the rectangular frame 120 as shown in FIG. 5, and the position data and YUV data of only the small sections 110, 110, ... surrounded by the rectangular frame 120 are input to the feature extraction unit 58; however, this is not a limitation. That is, the frame setting unit 56 may be excluded from the configuration of FIG. 3, and the position data and YUV data of all the small sections 110, 110, ... (or pixels) may be input to the feature extraction unit 58.
  • However, by setting the rectangular frame 120, the burden on the subsequent stages, in particular the feature extraction unit 58 and the control unit 60, is reduced. This is also extremely effective in improving the processing speed of the entire moving image processing apparatus 30.
  • In the present embodiment, the feature extraction unit 58 shown in FIG. 3 calculates the average value and the variance of each of the Y data, U data, and V data of the surrounding small sections 110, 110, ...,
  • so that a total of 12 types (dimensions) of features are extracted; however, this is not a limitation.
  • Only one of the average value and the variance may be extracted.
  • Color space data in RGB format may be extracted, or only luminance data may be extracted.
  • position (coordinate) data on the image of each pixel may be extracted together. In other words, an appropriate feature should be extracted according to the situation.
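  • The exact composition of the 12 feature dimensions is not fully spelled out in the text above; one plausible reading, used in the sketch below, is the mean and variance of Y, U, and V over the small section itself plus the mean and variance of Y, U, and V over its 3 x 3 neighbourhood of small sections.

```python
import numpy as np

def extract_features(yuv_sections, v, h):
    """12-dimensional feature sketch for the small section at grid position (v, h).

    yuv_sections : (V, H, a, a, 3) array of YUV small sections (see the division sketch above)
    """
    own = yuv_sections[v, h].reshape(-1, 3)
    neigh = yuv_sections[max(v - 1, 0):v + 2, max(h - 1, 0):h + 2].reshape(-1, 3)
    return np.concatenate([own.mean(axis=0), own.var(axis=0),
                           neigh.mean(axis=0), neigh.var(axis=0)])
```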
  • In the present embodiment, the control unit 60 shown in FIG. 3 searches for the winner block based on the decision tree method as shown in FIG. 7, but the invention is not limited to this. That is, the Euclidean distance D may be obtained for all the blocks 66, 66, ... considered on the map 62, and the winner block may be searched based on that result. However, in this case a considerable burden is imposed on the control unit 60 as described above, so it is preferable to search for the winner block based on the decision tree method as in the present embodiment.
  • In the present embodiment, the control unit 60 updates the reference vectors w_j of the neurons 64 in a batch every time one epoch is finished, but the invention is not limited to this.
  • The reference vector w_j may instead be updated each time a winner block is determined for a small section 110.
  • In this case, the update formula for the reference vector w_j is expressed by the following Equation 7.
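  • Equation 7 is likewise not reproduced here; the sketch below uses the standard Kohonen-style online rule, applied to every neuron of the winner block, as a stand-in.

```python
def online_update(w, block, x, alpha=0.05):
    """Per-winner online update: move each neuron of the winner block toward the input.
    The learning rate alpha is an illustrative value, not taken from the patent."""
    (r, c), p = block
    w[r:r + p, c:c + p] += alpha * (x - w[r:r + p, c:c + p])
    return w
```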
  • In the present embodiment the control unit 60 repeats the epoch 30 times per frame, but it may repeat the epoch a different number of times. Also, instead of simply repeating the epoch a fixed number of times, the accumulated deviation wd_j[t, g] of the previous epoch and that of the current epoch may be compared, for example,
  • and the epoch for the next frame may be started when the difference between the two, that is, the quantization error, falls to or below a predetermined threshold value.
  • In the present embodiment, the size of each block 66 is p × p, in other words the shape of the block 66 is a square, but it may be a rectangle.
  • Even so, making the block 66 a square is desirable.
  • Similarly, the map 62 need not be an m × m square and may be a rectangle, but a square is more convenient.
  • a fixed camera is used as the camera 20 shown in FIG. 1.
  • a movable camera having a pan head may be used.
  • an automatic tracking function can be realized in which the moving object is always captured at the center of the camera.
  • the identification and learning procedure of the map 62 in units of blocks is the same as that of the fixed camera 20 described in the present embodiment.
  • the position (coordinate) data of each pixel on the image is indispensable for this identification and learning. Accordingly, since there is no particular problem in obtaining the displacement amount referred to here, the present invention is extremely useful for realizing the automatic tracking function.
  • the moving image processing apparatus 30 in the present embodiment can be realized by a general-purpose computer such as a personal computer.
  • a program for causing a general-purpose computer to function as the moving image processing apparatus 30 can be provided.

Abstract

The invention concerns a novel moving image processing device for detecting a moving object in a moving image using a self-organizing map. A composite video signal from a camera (20) is converted into color image data by an input conversion section (50). The color image data is input to a feature extraction section (58) via an image division section (52) and a frame setting section (56). Each time the color image data of a frame is input, the feature extraction section (58) extracts an n-dimensional feature from each pixel constituting the color image data of that frame. The extracted feature data is input to a control section (60). The control section (60), together with a map (62), constitutes a block-unit learning type self-organizing map and identifies which of the moving object region and the background region each pixel forms. According to the identification results, an output conversion section (70) generates a processed video signal such that only the moving object region is displayed.
PCT/JP2007/070132 2006-10-17 2007-10-16 Moving image processing device, moving image processing method, and moving image processing program WO2008047774A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2006-282496 2006-10-17
JP2006282496A JP2008102589A (ja) 2006-10-17 2006-10-17 動画像処理装置および動画像処理方法ならびに動画像処理プログラム

Publications (1)

Publication Number Publication Date
WO2008047774A1 true WO2008047774A1 (fr) 2008-04-24

Family

ID=39314002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/070132 WO2008047774A1 (fr) 2006-10-17 2007-10-16 Moving image processing device, moving image processing method, and moving image processing program

Country Status (2)

Country Link
JP (1) JP2008102589A (fr)
WO (1) WO2008047774A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101799876B (zh) * 2010-04-20 2011-12-14 王巍 一种视音频智能分析管控系统
CN101859436B (zh) * 2010-06-09 2011-12-14 王巍 一种大幅规律运动背景智能分析管控系统

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05210739A (ja) * 1991-09-12 1993-08-20 Fuji Photo Film Co Ltd 被写体認識方法
JPH05174149A (ja) * 1991-12-26 1993-07-13 Nippon Telegr & Teleph Corp <Ntt> 画像認識装置
JP2000259838A (ja) * 1999-03-12 2000-09-22 Fujitsu Ltd 画像追跡装置及び記録媒体
JP2004164563A (ja) * 2002-09-26 2004-06-10 Toshiba Corp 画像解析方法、画像解析装置、画像解析プログラム
JP2005063308A (ja) * 2003-08-19 2005-03-10 Fuji Photo Film Co Ltd 画像識別方法および装置、オブジェクト識別方法および装置ならびにプログラム
JP2006079326A (ja) * 2004-09-09 2006-03-23 Sysmex Corp 分類支援マップ作成方法およびそれを実行するためのプログラムならびに分類支援マップ作成装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TAKIZAWA H. ET AL.: "Jiko Soshikika Map o Mochiita Senko Sharyo Ninshiki Shuho", IEICE TECHNICAL REPORT PRMU, vol. 101, no. 302, 6 September 2001 (2001-09-06), pages 23 - 28, XP003022341 *

Also Published As

Publication number Publication date
JP2008102589A (ja) 2008-05-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07829866

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07829866

Country of ref document: EP

Kind code of ref document: A1