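WO2022269891A1 - Image processing device, learning device, image processing system, image processing method, generation method, image processing program, and generation program - Google Patents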
- Publication number
- WO2022269891A1 (PCT/JP2021/024093)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- target
- information
- image processing
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/27—Server based end-user applications
- H04N21/274—Storing end-user multimedia data in response to end-user request, e.g. network recorder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Definitions
- The present disclosure relates to an image processing device, a learning device, an image processing system, an image processing method, a generation method, an image processing program, and a generation program.
- An object may be captured by multiple cameras.
- A technique has been proposed for displaying an object from various directions based on a plurality of images obtained by imaging the object with a plurality of cameras (see Patent Document 1).
- An optimum image is an image that the user wants to see.
- A method of providing the optimum image allows the user to view it via a terminal device.
- However, the object that the user wants to see may move. If the object moves, the image produced by any one camera is not always the optimum image. The problem is therefore how to identify, from among a plurality of cameras, the camera that generated the optimum image.
- The purpose of this disclosure is to identify the camera that generated the optimum image.
- The image processing device includes: an acquisition unit that acquires a plurality of images generated by a plurality of imaging devices existing at different locations, and a target object image that is an image including the object; a detection unit that detects, for each image, target information, which is information about the object, using the plurality of images and the target object image; and an identification unit that identifies, using the target information detected for each image, the first imaging device that generated the optimum image containing the object.
- FIG. 1 is a diagram showing an image processing system according to Embodiment 1.
- FIG. 2 is a diagram showing an example (part 1) of the arrangement of cameras according to Embodiment 1.
- FIG. 3 is a diagram showing an example (part 2) of the arrangement of cameras according to Embodiment 1.
- FIG. 4 is a diagram showing hardware included in the image processing device according to Embodiment 1.
- FIG. 5 is a block diagram showing functions of the image processing device according to Embodiment 1.
- FIG. 6 is a diagram (part 1) for explaining target information according to Embodiment 1.
- FIGS. 7A to 7C are diagrams (part 2) for explaining target information according to Embodiment 1.
- FIG. 8 is a diagram (part 3) for explaining target information according to Embodiment 1.
- FIG. 9 is a diagram showing an example of target information according to Embodiment 1.
- FIG. 10 is a diagram showing an example of specific information according to Embodiment 1.
- FIG. 11 is a flowchart showing an example of processing executed by the image processing device according to Embodiment 1.
- FIG. 12 is a diagram showing a specific example of processing executed by the image processing system according to Embodiment 1.
- FIG. 13 is a block diagram showing functions of an image processing device according to a modification of Embodiment 1.
- FIG. 14 is a block diagram showing functions of an image processing device according to Embodiment 2.
- FIG. 15 is a diagram showing an example of using a plurality of trained models according to Embodiment 2.
- FIG. 16 is a diagram showing an example of using one trained model according to Embodiment 2.
- FIG. 17 is a diagram showing an example of a neural network according to Embodiment 2.
- FIG. 18 is a diagram showing an example of a random forest according to Embodiment 2.
- FIG. 19 is a block diagram showing functions of a learning device according to Embodiment 2.
- FIG. 20 is a diagram showing an image processing system according to Embodiment 3.
- FIG. 21 is a block diagram showing functions of an information processing device according to Embodiment 3.
- FIG. 22 is a block diagram showing functions of an image processing device according to Embodiment 4.
- FIG. 23 is a diagram showing an example of a trained model for events according to Embodiment 4.
- FIG. 24 is a block diagram showing functions of an information processing device according to Embodiment 5.
- FIG. 1 is a diagram showing an image processing system according to Embodiment 1.
- The image processing system includes an image processing device 100 and cameras 200_1 to 200_6.
- The image processing system may also include a terminal device 300.
- The image processing device 100, the cameras 200_1 to 200_6, and the terminal device 300 are connected via a network.
- The network may be wired or wireless.
- The image processing device 100 is a device that executes an image processing method.
- The camera 200_1 is called camera A.
- The camera 200_2 is called camera B.
- The camera 200_3 is called camera C.
- The camera 200_4 is called camera D.
- The camera 200_5 is called camera E.
- The camera 200_6 is called camera F.
- FIG. 1 illustrates six cameras, but the number of cameras is not limited to six. Note that a camera is also called an imaging device.
- The cameras 200_1 to 200_6 are installed to photograph the same event, are each capable of photographing the state of the event, and are located at different points.
- For example, the camera 200_1 is located at point A.
- The camera 200_2 is located at point B.
- In this way, each of the cameras 200_1 to 200_6 is located at a different point.
- Events are, for example, live performances, boxing matches, and futsal matches. In the discussion below, the event is assumed to be a live performance.
- The cameras 200_1 to 200_6 capture images of a plurality of women appearing at the event.
- The women are W, X, Y, and Z.
- The women dance while singing, so they do not stay in fixed positions.
- The object is a person, an animal, a moving machine, or the like.
- The object may appear in the event and move during the event.
- The object may be a person specified by the user from a list of people appearing in the event that is displayed on the screen of the terminal device 300.
- In the following description, X is the object.
- The terminal device 300 is a device used by a user.
- The terminal device 300 acquires an image (more specifically, video) including the object X via the image processing device 100.
- The user can thus view the object X using the terminal device 300.
- The image including the object X is the optimum image. In other words, it is the image that the user wants to see.
- The object X moves, so the image produced by any one camera is not always the optimum image. For example, in FIG. 1, the object X is in front of camera C, so the image generated by camera C can be said to be the optimum image. Once the object X moves, however, the image generated by camera C may no longer be the optimum image. A method by which the image processing device 100 identifies the camera that generated the optimum image from among the plurality of cameras is therefore described below.
- FIG. 1 shows a case where the cameras 200_1 to 200_6 are arranged substantially in a line. Multiple cameras may be arranged as follows.
- FIG. 2 is a diagram showing an example (part 1) of camera arrangement according to the first embodiment.
- FIG. 2 shows that the cameras are arranged in a circle.
- FIG. 2 shows cameras 200_7 and 200_8.
- The camera 200_7 is also called camera G.
- The camera 200_8 is also called camera H.
- FIG. 3 is a diagram illustrating an example (part 2) of camera arrangement according to the first embodiment.
- FIG. 3 shows that camera 200_7 and camera 200_8 are placed far away.
- The following description uses the case of FIG. 1, that is, the case where cameras A to F are arranged substantially in a line.
- FIG. 4 illustrates hardware included in the image processing apparatus according to the first embodiment.
- The image processing device 100 has a processor 101, a volatile storage device 102, and a nonvolatile storage device 103.
- The processor 101 controls the image processing device 100 as a whole.
- For example, the processor 101 is a CPU (Central Processing Unit), an FPGA (Field Programmable Gate Array), or the like.
- The processor 101 may be a multiprocessor.
- The image processing device 100 may also have a processing circuit.
- The processing circuit may be a single circuit or multiple circuits.
- The volatile storage device 102 is the main storage device of the image processing device 100.
- For example, the volatile storage device 102 is RAM (Random Access Memory).
- The nonvolatile storage device 103 is an auxiliary storage device of the image processing device 100.
- For example, the nonvolatile storage device 103 is an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
- FIG. 5 is a block diagram showing functions of the image processing apparatus according to the first embodiment.
- The image processing device 100 has a storage unit 110, an acquisition unit 120, a detection unit 130, an identification unit 140, a selection unit 150, and an output control unit 160.
- The storage unit 110 may be implemented as a storage area secured in the volatile storage device 102 or the nonvolatile storage device 103.
- Part or all of the acquisition unit 120, the detection unit 130, the identification unit 140, the selection unit 150, and the output control unit 160 may be implemented by a processing circuit.
- Part or all of the acquisition unit 120, the detection unit 130, the identification unit 140, the selection unit 150, and the output control unit 160 may also be implemented as modules of a program executed by the processor 101.
- The program executed by the processor 101 is also called an image processing program.
- For example, the image processing program is recorded on a recording medium.
- The acquisition unit 120 acquires a plurality of images generated by cameras A to F. For example, the acquisition unit 120 acquires the images from cameras A to F. Note that the object X is included in at least one of the plurality of images.
- The acquisition unit 120 acquires a target object image.
- For example, the acquisition unit 120 acquires the target object image from the storage unit 110.
- Alternatively, the acquisition unit 120 acquires the target object image from the terminal device 300.
- The target object image is an image including the object X.
- The target object image may also be referred to as a sample image.
- The acquisition unit 120 may acquire name information of the object X from the terminal device 300 and acquire the target object image from the storage unit 110 based on the name information.
- The detection unit 130 detects target information for each image using the plurality of images and the target object image. For example, the detection unit 130 detects target information based on the image generated by camera A, and likewise based on the image generated by camera B. In this way, the detection unit 130 detects six pieces of target information based on the six images generated by cameras A to F. The target information also includes the identifier of the camera that generated the image. For example, the target information detected based on the image generated by camera A includes the identifier of camera A.
- Target information is information about the object in the image.
- The target information includes one or more of the following: information indicating whether or not the object X is included in the image, the size of the object X in the image, the position of the object X in the image, the orientation of the object X, information indicating whether or not the object X in the image is blurred, information indicating the brightness of the object X in the image, and skeleton information of the object X.
- The target information is not limited to the above.
- The target information may be other information as long as it describes how the object X appears in the image.
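- As a minimal sketch, the items of target information listed above could be held in one record per image. The class name `TargetInfo`, the field names, and the units are assumptions made for illustration; the source does not define a data layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TargetInfo:
    """Target information for one image (field names are hypothetical)."""
    camera_id: str                            # identifier of the camera that generated the image
    contains_target: bool                     # whether the object X is included in the image
    size: Optional[float] = None              # area of the frame around X, in pixels
    position: Optional[float] = None          # offset of X from the image center line, in pixels
    orientation_deg: Optional[float] = None   # 0 = front, 90 = side, 180 = back
    blurred: Optional[bool] = None            # whether X is blurred (out of focus)
    brightness: Optional[float] = None        # brightness of the region of X
    skeleton: Optional[Dict[str, Tuple[float, float]]] = None  # joint name -> (x, y)


# One record per camera; fields stay None when X is not in the image.
info_c = TargetInfo(camera_id="C", contains_target=True, size=5400.0,
                    position=12.0, orientation_deg=0.0, blurred=False,
                    brightness=140.0)
info_a = TargetInfo(camera_id="A", contains_target=False)
```

- Keeping the camera identifier inside each record mirrors the statement that the target information includes the identifier of the camera that generated the image.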
- FIG. 6 is a diagram (part 1) for explaining target information according to the first embodiment.
- Image 400 in FIG. 6 is an image generated by camera C.
- The detection unit 130 creates information indicating whether or not the object X is included in the image 400, using the image 400 and the target object image. Specifically, it uses pattern matching, object recognition technology, or the like. The object recognition technology is, for example, specific object recognition.
- When the object X is included in the image 400, the detection unit 130 detects the size of the object X in the image 400.
- FIG. 6 shows the size of the object X with a frame 401.
- The size of the object X is indicated by the area of the frame 401.
- The size of the object X may instead be expressed as whether or not it is larger than a reference size.
- The detection unit 130 detects the position of the object X within the image. For example, the detection unit 130 detects the distance between the center line of the image 400 and the center line of the frame 401 as the position of the object X. The detection unit 130 may also detect where in the image 400 the object X is located; the detection result is then, for example, center, left, or right.
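- The position measurement described above can be sketched as follows, assuming the frame 401 is given by its left and right pixel coordinates. The function names and the rule of splitting the image into thirds for the coarse result are assumptions, not part of the source.

```python
def horizontal_offset(image_width, frame_left, frame_right):
    """Distance in pixels between the image center line and the frame center line."""
    image_center = image_width / 2.0
    frame_center = (frame_left + frame_right) / 2.0
    return abs(frame_center - image_center)


def coarse_position(image_width, frame_left, frame_right):
    """Classify the frame as 'left', 'center', or 'right' by image thirds (assumed rule)."""
    frame_center = (frame_left + frame_right) / 2.0
    if frame_center < image_width / 3.0:
        return "left"
    if frame_center > 2.0 * image_width / 3.0:
        return "right"
    return "center"


offset = horizontal_offset(1920, 900, 1100)
where = coarse_position(1920, 900, 1100)
```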
- When the object X is included in the image 400, the detection unit 130 detects the orientation of the object X.
- The detection unit 130 detects the orientation of the object X using head pose estimation. Examples of detection results are shown in FIGS. 7A to 7C.
- FIGS. 7A to 7C are diagrams (part 2) for explaining target information according to the first embodiment.
- FIGS. 7A to 7C show detection results.
- FIG. 7A shows that the object X is facing the front.
- The detection result in FIG. 7A may be expressed as 0 degrees.
- FIG. 7B shows that the object X is facing directly sideways.
- The detection result in FIG. 7B may be expressed as 90 degrees.
- FIG. 7C shows that the object X is facing directly backward.
- The detection result in FIG. 7C may be expressed as 180 degrees.
- The detection unit 130 creates information indicating whether or not the object X in the image 400 is blurred. For example, it creates this information based on the steepness of luminance changes at edges in the image 400, the amount of high-frequency components in the image 400, and the like. This information may also be expressed as information indicating whether or not the object X is in focus.
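- One common way to estimate the amount of high-frequency content is the variance of a Laplacian filter response. The sketch below uses that measure as a stand-in; the source does not name a specific measure, and the function names and the threshold of 100.0 are assumptions. The input is a 2-D grid of luminance values of at least 3 by 3 pixels.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels of a luminance grid."""
    rows, cols = len(gray), len(gray[0])
    responses = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1]
                   + gray[y][x + 1] - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurred(gray, threshold=100.0):
    """Treat a region as blurred when it has little high-frequency energy (assumed threshold)."""
    return laplacian_variance(gray) < threshold


sharp = [[0, 0, 255, 255] for _ in range(4)]     # steep luminance change at an edge
soft = [[100, 110, 120, 130] for _ in range(4)]  # gentle luminance ramp
```

- A steep edge yields large Laplacian responses of opposite sign and therefore a high variance, while a gentle ramp yields responses near zero.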
- When the object X is included in the image 400, the detection unit 130 creates information indicating the brightness of the object X in the image 400.
- For example, the detection unit 130 uses the luminance or lightness of the region of the object X in the image 400 as the information indicating the brightness of the object X.
- This information may also be expressed as information indicating whether or not the object X was imaged against backlight.
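- The brightness of the region of the object X can be sketched as the mean luminance inside the frame. The backlight test below, which compares the region with the whole image, is an assumed heuristic; the function names and the ratio of 0.5 are hypothetical.

```python
def region_brightness(gray, frame):
    """Mean luminance inside frame = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = frame
    pixels = [gray[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(pixels) / len(pixels)


def is_backlit(gray, frame, ratio=0.5):
    """Flag backlight when the region is much darker than the whole image (assumed rule)."""
    whole = region_brightness(gray, (0, 0, len(gray), len(gray[0])))
    return region_brightness(gray, frame) < ratio * whole


# A dark subject (20) in the middle of a bright background (200).
gray = [
    [200, 200, 200, 200],
    [200, 20, 20, 200],
    [200, 20, 20, 200],
    [200, 200, 200, 200],
]
subject_frame = (1, 1, 3, 3)
```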
- When the object X is included in the image 400, the detection unit 130 detects skeleton information of the object X.
- For example, the detection unit 130 detects the skeleton information of the object X using OpenPose. An example of skeleton information is shown in FIG. 8.
- FIG. 8 is a diagram (part 3) for explaining target information according to the first embodiment.
- The detection unit 130 detects skeleton information 402 of the object X based on the image 400. Based on the skeleton information 402, the detection unit 130 may further detect information such as whether the whole body is included in the image or part of the body is missing from it. In this way, the detection unit 130 detects target information for each image, that is, six pieces of target information. An example of the six pieces of target information is shown in FIG. 9.
- FIG. 9 is a diagram showing an example of target information according to the first embodiment. FIG. 9 illustrates the six pieces of target information that the detection unit 130 detects.
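- The whole-body check based on skeleton information mentioned above can be sketched as follows. The joint names in `REQUIRED_JOINTS` and the dictionary layout are assumptions; they are not the output format of any particular pose estimator.

```python
# Hypothetical joint names; a real pose estimator defines its own keypoint set.
REQUIRED_JOINTS = ("head", "left_hand", "right_hand", "left_foot", "right_foot")


def whole_body_visible(skeleton, image_width, image_height):
    """Return True when every required joint was detected and lies inside the image."""
    for name in REQUIRED_JOINTS:
        point = skeleton.get(name)
        if point is None:
            return False  # joint not detected
        x, y = point
        if not (0 <= x < image_width and 0 <= y < image_height):
            return False  # joint falls outside the image
    return True


full = {"head": (320, 40), "left_hand": (250, 200), "right_hand": (390, 200),
        "left_foot": (300, 460), "right_foot": (340, 460)}
cropped = {"head": (320, 40), "left_hand": (250, 200), "right_hand": (390, 200)}
```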
- The identification unit 140 identifies the camera that generated the optimum image from among cameras A to F, using the target information detected for each image.
- The camera that generated the optimum image is also referred to as the first imaging device.
- The optimum image is an image including the object X.
- Specifically, the identification unit 140 identifies the camera that generated the optimum image using the target information detected for each image and specific information. The specific information is described next.
- The specific information 111 is information for identifying the camera that generated the optimum image.
- The specific information 111 is acquired by the acquisition unit 120.
- For example, the acquisition unit 120 acquires the specific information 111 from the storage unit 110.
- Alternatively, the acquisition unit 120 acquires the specific information 111 from an external device (for example, a cloud server).
- The identification unit 140 identifies the camera that generated the optimum image using a point addition method.
- The point addition method is explained concretely below.
- For example, the target information corresponding to camera C indicates that the object X is included in the image.
- The identification unit 140 determines that the target information corresponding to camera C satisfies the condition "the object is included in the image" indicated by the specific information 111. The identification unit 140 therefore gives 1 point to the target information corresponding to camera C.
- The target information corresponding to camera C also indicates that the size of the object X is large.
- The identification unit 140 determines that the target information corresponding to camera C satisfies the condition "the object is large" indicated by the specific information 111.
- The identification unit 140 therefore gives a further 1 point to the target information corresponding to camera C.
- In this way, the identification unit 140 gives 1 point to the target information corresponding to camera C each time a condition is satisfied.
- The identification unit 140 performs the same processing on the target information corresponding to each of cameras A to F.
- The identification unit 140 identifies the camera corresponding to the target information with the highest score, among the target information corresponding to cameras A to F, as the camera that generated the optimum image.
- For example, the identification unit 140 identifies camera C.
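- The point addition method described above can be sketched as follows. The first two conditions come from the text; the last two, the dictionary keys, and the function names are assumptions added for illustration. When scores tie, this sketch keeps the first camera in list order, which the source does not specify.

```python
# Conditions standing in for the specific information 111.
CONDITIONS = [
    lambda info: info["contains_target"],            # "the object is included in the image"
    lambda info: info.get("size") == "large",        # "the object is large"
    lambda info: info.get("orientation_deg") == 0,   # assumed: facing the front
    lambda info: info.get("blurred") is False,       # assumed: in focus
]


def score(info, conditions=CONDITIONS):
    """Add 1 point for every condition that the target information satisfies."""
    return sum(1 for condition in conditions if condition(info))


def identify_camera(target_infos, conditions=CONDITIONS):
    """Return the identifier of the camera whose target information scores highest."""
    best = max(target_infos, key=lambda info: score(info, conditions))
    return best["camera_id"]


target_infos = [
    {"camera_id": "A", "contains_target": False},
    {"camera_id": "B", "contains_target": True, "size": "small",
     "orientation_deg": 90, "blurred": False},
    {"camera_id": "C", "contains_target": True, "size": "large",
     "orientation_deg": 0, "blurred": False},
    {"camera_id": "D", "contains_target": True, "size": "small",
     "orientation_deg": 90, "blurred": True},
    {"camera_id": "E", "contains_target": False},
    {"camera_id": "F", "contains_target": False},
]
best_camera = identify_camera(target_infos)
```

- Expressing each condition as a separate predicate keeps the scoring rule in data, so the conditions can be replaced without changing the scoring code.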
- The selection unit 150 selects the image generated by the camera identified by the identification unit 140 from among the plurality of images.
- Here, the image contains the identifier of the camera, or an identifier of the camera is attached to the image. The selection unit 150 can therefore select the image generated by the camera identified by the identification unit 140 from among the plurality of images. Note that the selected image is the optimum image.
- The output control unit 160 outputs the selected image.
- For example, the output control unit 160 outputs the selected image to the terminal device 300.
- The output control unit 160 may also output the selected image to the storage unit 110.
- FIG. 11 is a flowchart illustrating an example of processing executed by the image processing apparatus according to Embodiment 1.
- (Step S11) The acquisition unit 120 acquires the plurality of images generated by cameras A to F.
- (Step S12) The acquisition unit 120 acquires the target object image and the specific information.
- (Step S13) The detection unit 130 detects target information for each image using the plurality of images and the target object image.
- (Step S14) The identification unit 140 identifies the camera that generated the optimum image using the target information detected for each image and the specific information.
- (Step S15) The selection unit 150 selects the image generated by the identified camera from among the plurality of images.
- (Step S16) The output control unit 160 outputs the selected image.
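- The flow of the steps above can be sketched as one function that takes the detection, identification, and output operations as callables. All names here are hypothetical, and the stub `detect` and `identify` functions only stand in for the processing of the detection unit 130 and the identification unit 140; images are represented as strings listing who is visible.

```python
def process_once(images, target_image, specific_info, detect, identify, output):
    """Run detection, identification, selection, and output on already acquired inputs."""
    # Detect target information for each image.
    target_infos = [detect(camera_id, image, target_image)
                    for camera_id, image in images.items()]
    # Identify the camera that generated the optimum image.
    best_camera = identify(target_infos, specific_info)
    # Select the image generated by the identified camera.
    selected = images[best_camera]
    # Output the selected image.
    output(selected)
    return best_camera


def detect(camera_id, image, target_image):
    """Stub detection: the object is 'in the image' when its letter appears in the string."""
    return {"camera_id": camera_id, "contains_target": target_image in image}


def identify(target_infos, specific_info):
    """Stub identification: count satisfied conditions and keep the highest scorer."""
    def points(info):
        return sum(1 for condition in specific_info if condition(info))
    return max(target_infos, key=points)["camera_id"]


images = {"A": "W", "B": "WY", "C": "WXY", "D": "Z", "E": "", "F": ""}
sent = []
best = process_once(images, "X", [lambda info: info["contains_target"]],
                    detect, identify, sent.append)
```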
- FIG. 12 is a diagram showing a specific example of processing executed by the image processing system according to the first embodiment.
- FIG. 12 shows images generated by cameras A to F.
- The image "A001" is an image generated by camera A at time "1p".
- The image processing device 100 acquires a target object image including the object X from the terminal device 300.
- The image processing device 100 identifies camera B as the camera that generated the optimum image.
- The image processing device 100 selects the image "B002" generated by camera B from among the images "A002" to "F002".
- The image processing device 100 outputs the image "B002" to the terminal device 300, and the terminal device 300 displays the image "B002".
- After the image "B002" is output, the image processing device 100 identifies camera B as the camera that generated the optimum image.
- The image processing device 100 selects the image "B003" generated by camera B from among the images "A003" to "F003".
- The image processing device 100 outputs the image "B003" to the terminal device 300, and the terminal device 300 displays the image "B003".
- After the image "B003" is output, the image processing device 100 identifies camera C as the camera that generated the optimum image.
- The image processing device 100 selects the image "C004" generated by camera C from among the images "A004" to "F004".
- The image processing device 100 outputs the image "C004" to the terminal device 300, and the terminal device 300 displays the image "C004".
- The image processing device 100 repeats the above processing. This allows the user to view the optimum image continuously.
- According to Embodiment 1, as described above, the image processing device 100 can identify the camera that generated the optimum image.
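- The repeated selection in the example of FIG. 12 can be sketched as a loop over time indices. The per-camera scores below are invented for illustration; only the image naming pattern ("B002", "C004") and the resulting sequence follow the text.

```python
def image_name(camera_id, time_index):
    """Build names such as 'B002' from a camera identifier and a time index."""
    return f"{camera_id}{time_index:03d}"


def select_stream(scores_by_time):
    """For each time index, pick the highest-scoring camera and name its image."""
    selected = []
    for time_index in sorted(scores_by_time):
        scores = scores_by_time[time_index]
        best_camera = max(scores, key=scores.get)
        selected.append(image_name(best_camera, time_index))
    return selected


# Hypothetical scores for cameras A to F at times 2, 3, and 4.
scores_by_time = {
    2: {"A": 1, "B": 4, "C": 2, "D": 0, "E": 0, "F": 0},
    3: {"A": 0, "B": 3, "C": 2, "D": 0, "E": 0, "F": 0},
    4: {"A": 0, "B": 2, "C": 4, "D": 1, "E": 0, "F": 0},
}
stream = select_stream(scores_by_time)
```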
- FIG. 13 is a block diagram showing the functions of the image processing device of the modification of the first embodiment.
- The acquisition unit 120 stores a plurality of images in the storage unit 110.
- The acquisition unit 120 acquires the plurality of images from the storage unit 110 at a predetermined timing. For example, the acquisition unit 120 acquires the plurality of images from the storage unit 110 when it receives a process execution instruction from the user. The detection unit 130, the identification unit 140, and the selection unit 150 then execute their processing.
- The selected image (that is, the optimum image) is stored in the storage unit 110.
- The output control unit 160 acquires the optimum image from the storage unit 110 at a predetermined timing. For example, when the acquisition unit 120 acquires from the terminal device 300 an instruction to transmit the optimum image (that is, an image including the object X), the output control unit 160 acquires the optimum image from the storage unit 110. The output control unit 160 outputs the acquired optimum image, for example to the terminal device 300.
- According to this modification, the image processing device 100 can output the optimum image at a predetermined timing.
- Embodiment 2
- Next, Embodiment 2 will be described. Embodiment 2 mainly describes matters that differ from Embodiment 1, and descriptions of items common to Embodiment 1 are omitted. Embodiment 2 describes a case where the detection and identification processing is performed using trained models.
- FIG. 14 is a block diagram showing functions of the image processing apparatus according to the second embodiment.
- The image processing device 100a has an acquisition unit 120a, a detection unit 130a, and an identification unit 140a.
- The function of the acquisition unit 120a is explained later.
- The detection unit 130a uses at least one trained model in the process of detecting target information for each image. The use of trained models is described with a specific example.
- FIG. 15 is a diagram showing an example of using a plurality of trained models according to the second embodiment.
- FIG. 15 shows an image 410 and a target object image 411.
- The image 410 is an image generated by camera C.
- The detection unit 130a detects persons in the image 410 using the image 410 and a person detection model 131, which is a trained model. W, X, and Y are thereby detected.
- The detection unit 130a identifies the object X in the image 410 using a person identification model 132, which is a trained model, the image 410, and the target object image 411. The detection unit 130a also detects the size and the position of the object X in the image 410 using the person identification model 132, the image 410, and the target object image 411.
- The detection unit 130a detects the orientation of the object X using the image 410 and an orientation detection model 133, which is a trained model.
- The detection unit 130a detects whether or not the object X in the image 410 is blurred using the image 410 and a focus detection model 134, which is a trained model.
- The detection unit 130a detects the brightness of the object X in the image 410 using the image 410 and a brightness detection model 135, which is a trained model.
- The detection unit 130a detects the skeleton information of the object X using the image 410 and a skeleton detection model 136, which is a trained model.
- The person detection model 131, the person identification model 132, the orientation detection model 133, the focus detection model 134, the brightness detection model 135, and the skeleton detection model 136 are acquired by the acquisition unit 120a.
- For example, the acquisition unit 120a acquires these trained models from the storage unit 110.
- Alternatively, the acquisition unit 120a acquires these trained models from an external device.
- The detection unit 130a may instead detect the target information corresponding to camera C using a single trained model, the image 410, and the target object image 411.
- A case where one trained model is used is illustrated next.
- FIG. 16 is a diagram showing an example in which one trained model of Embodiment 2 is used.
- FIG. 16 shows that the trained model constitutes a neural network.
- The detection unit 130a detects the target information corresponding to camera C using the trained model, the image 410, and the target object image 411.
- The trained model is acquired by the acquisition unit 120a.
- For example, the acquisition unit 120a acquires the trained model from the storage unit 110.
- Alternatively, the acquisition unit 120a acquires the trained model from an external device.
- The identification unit 140a identifies the camera that generated the optimum image using the target information detected for each image and a trained model.
- For example, the trained model is a neural network. An example of a neural network is shown in FIG. 17.
- FIG. 17 is a diagram showing an example of a neural network according to the second embodiment.
- Specifically, the target information detected for each image is input to the trained model, and the trained model outputs the camera that generated the optimum image.
- The identification unit 140a thereby identifies the camera that generated the optimum image from the output of the trained model.
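- A minimal sketch of this kind of inference follows, assuming a small feedforward network that scores each camera from four numeric features and a softmax that turns the scores into probabilities over cameras. The architecture, the feature encoding, and the hand-set weights are assumptions for illustration; the text does not describe the topology of the network in FIG. 17, and a real model would obtain its weights through training.

```python
import math

# Hand-set weights (hypothetical). Features per camera:
# [contains_target, is_large, faces_front, in_focus], each 0.0 or 1.0.
W1 = [[1.0, 0.5, 0.0, 0.0],
      [0.0, 0.0, 0.8, 0.6]]
B1 = [0.0, 0.0]
W2 = [1.0, 1.0]


def camera_logit(features):
    """One hidden layer with ReLU, then a linear output: a score for one camera."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden))


def softmax(values):
    """Convert scores to probabilities, subtracting the maximum for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def identify_camera(features_by_camera):
    """Return the camera with the highest output probability, and that probability."""
    ids = list(features_by_camera)
    probs = softmax([camera_logit(features_by_camera[c]) for c in ids])
    best = max(range(len(ids)), key=lambda k: probs[k])
    return ids[best], probs[best]


features_by_camera = {
    "A": [0.0, 0.0, 0.0, 0.0],
    "B": [1.0, 0.0, 1.0, 1.0],
    "C": [1.0, 1.0, 1.0, 1.0],
}
best_camera, confidence = identify_camera(features_by_camera)
```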
- The trained model may instead be a random forest. An example of a random forest is shown in FIG. 18.
- FIG. 18 is a diagram showing an example of random forest according to the second embodiment.
- The identification unit 140a may identify the camera that generated the optimum image using a trained model that is a random forest.
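- Inference with a random forest can be sketched as a majority vote over decision trees. The trees below are written by hand purely for illustration, whereas a real random forest learns its trees from training data; the tree encoding, the feature vector, and the function names are all assumptions.

```python
from collections import Counter


def predict_tree(node, x):
    """Walk one decision tree: internal nodes are dicts, leaves are camera identifiers."""
    while isinstance(node, dict):
        branch = "right" if x[node["feature"]] > node["threshold"] else "left"
        node = node[branch]
    return node


def predict_forest(trees, x):
    """Return the camera identifier that receives the most votes across the trees."""
    votes = Counter(predict_tree(tree, x) for tree in trees)
    return votes.most_common(1)[0][0]


# x = relative size of the object X in the images from cameras A, B, and C.
TREES = [
    {"feature": 2, "threshold": 0.5, "left": "B", "right": "C"},
    {"feature": 1, "threshold": 0.5, "left": "C", "right": "B"},
    {"feature": 2, "threshold": 0.3,
     "left": {"feature": 0, "threshold": 0.5, "left": "B", "right": "A"},
     "right": "C"},
]
choice_1 = predict_forest(TREES, [0.1, 0.4, 0.8])
choice_2 = predict_forest(TREES, [0.2, 0.9, 0.4])
```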
- The trained model that constitutes the neural network or the random forest is acquired by the acquisition unit 120a.
- For example, the acquisition unit 120a acquires the trained model from the storage unit 110.
- Alternatively, the acquisition unit 120a acquires the trained model from an external device.
- According to Embodiment 2, the image processing device 100a can detect target information for each image using a trained model. The image processing device 100a can also identify the camera that generated the optimum image using a trained model.
- FIG. 19 is a block diagram showing functions of the learning device according to the second embodiment.
- The learning device 500 has a processor, a volatile storage device, and a nonvolatile storage device.
- The learning device 500 may also have a processing circuit.
- The learning device 500 is a device that executes the generation method.
- The learning device 500 has an acquisition unit 510 and a generation unit 520.
- Part or all of the acquisition unit 510 and the generation unit 520 may be implemented by a processing circuit of the learning device 500.
- Part or all of the acquisition unit 510 and the generation unit 520 may also be implemented as modules of a program executed by the processor of the learning device 500.
- This program is also called a generation program.
- For example, the generation program is recorded on a recording medium.
- The acquisition unit 510 acquires target information created for each image based on a target object image and a plurality of images generated by a plurality of cameras located at different points. Among the acquired pieces of target information, one piece carries a label indicating that it was created based on the optimum image. Because the target information is labeled in this way, the learning device 500 can perform supervised learning.
- For example, the acquisition unit 510 acquires the target information from an external device.
- The target information may be information created by the user.
- The target information also includes the identifier of the camera.
- The generation unit 520 generates, using the target information created for each image, a trained model that identifies the camera that generated the optimum image from among the plurality of cameras. Note that the optimum image includes the object.
- This camera is also called the first imaging device.
- With the trained model, a device such as the image processing device 100a can identify the camera that generated the optimum image.
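- The labeled target information described above can be turned into a supervised training example as sketched below. The `optimal` key, the chosen features, and the function name are assumptions; the sketch covers only the preparation of one example, not the training of a model.

```python
def to_training_example(target_infos):
    """Convert one labeled set of target information into (features, camera identifier)."""
    labeled = [info for info in target_infos if info.get("optimal")]
    if len(labeled) != 1:
        raise ValueError("exactly one piece of target information must carry the label")
    # One feature row per camera: [contains_target, size] (assumed features).
    features = [[float(info["contains_target"]), float(info["size"])]
                for info in target_infos]
    return features, labeled[0]["camera_id"]


target_infos = [
    {"camera_id": "A", "contains_target": False, "size": 0.0},
    {"camera_id": "B", "contains_target": True, "size": 0.2},
    {"camera_id": "C", "contains_target": True, "size": 0.7, "optimal": True},
]
features, label = to_training_example(target_infos)
```

- Rejecting sets with zero or several labels reflects the statement that the label is added to one piece of target information among those acquired.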
- Embodiment 3
- Next, Embodiment 3 will be described. Embodiment 3 mainly describes matters that differ from Embodiments 1 and 2, and descriptions of items common to Embodiments 1 and 2 are omitted. Embodiment 3 describes a case where the process of identifying the camera that generated the optimum image is performed by a device other than the image processing device.
- FIG. 20 is a diagram showing an image processing system according to the third embodiment.
- The image processing system includes cameras 200_1 to 200_6, an image processing device 600, and an information processing device 700.
- The image processing system may also include the terminal device 300.
- The image processing device 600 and the information processing device 700 communicate via a network.
- The network may be a wired network or a wireless network.
- The image processing device 600 is a device that detects target information.
- The method for detecting target information is the same as in the first embodiment. That is, the image processing device 600 detects target information for each image using a plurality of images generated by the cameras A to F and a target object image.
- The information processing device 700 is a device that executes an information processing method.
- The information processing device 700 has a processor, a volatile memory device, and a non-volatile memory device.
- The information processing device 700 may have a processing circuit. Next, the functions of the information processing device 700 will be described.
- FIG. 21 is a block diagram showing functions of the information processing apparatus according to the third embodiment.
- The information processing device 700 has a storage unit 710, an acquisition unit 720, an identification unit 730, and an output unit 740.
- The storage unit 710 may be implemented as a storage area secured in a volatile storage device or a non-volatile storage device included in the information processing device 700.
- Part or all of the acquisition unit 720, the identification unit 730, and the output unit 740 may be realized by a processing circuit included in the information processing device 700. Alternatively, part or all of the acquisition unit 720, the identification unit 730, and the output unit 740 may be implemented as modules of a program executed by a processor included in the information processing device 700.
- This program is also called an information processing program. For example, the information processing program is recorded on a recording medium.
- The acquisition unit 720 acquires the target information detected for each image. For example, the acquisition unit 720 acquires the target information detected for each image from the image processing device 600.
- The identification unit 730 uses the target information detected for each image to identify, from among the cameras A to F, the camera that generated the optimum image.
- The optimum image contains the object X.
- This camera is also called a first imaging device.
- For example, the identification unit 730 identifies the camera that generated the optimum image using the target information detected for each image and the specific information 111. That is, the identification unit 730 performs the same processing as the identification unit 140.
- Note that the specific information 111 is acquired by the acquisition unit 720.
- For example, the acquisition unit 720 acquires the specific information 111 from the storage unit 710.
- Alternatively, the acquisition unit 720 acquires the specific information 111 from an external device.
- The identification unit 730 may instead identify the camera that generated the optimum image by the following method.
- The identification unit 730 identifies the camera that generated the optimum image using the target information detected for each image and the learned model. That is, the identification unit 730 performs the same processing as the identification unit 140a.
- The learned model is acquired by the acquisition unit 720.
- For example, the acquisition unit 720 acquires the learned model from the storage unit 710.
- Alternatively, the acquisition unit 720 acquires the learned model from an external device.
- The output unit 740 outputs information indicating the identified camera to the image processing device 600.
- For example, let camera C be the identified camera.
- In this case, the image processing device 600 selects the image generated by camera C from among the plurality of images. That is, the image processing device 600 performs the same processing as the selection unit 150.
- The image processing device 600 outputs the selected image to the terminal device 300. That is, the image processing device 600 performs the same processing as the output control section 160.
- In this way, the information processing device 700 can identify the camera that generated the optimum image.
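The division of work between the image processing device 600 and the information processing device 700 can be sketched as follows. This is only an illustration under stated assumptions: the network is replaced by a direct method call, the detector and the scoring rule are hypothetical callables, and the class names and the `face_frontality` feature do not come from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TargetInfo:
    camera_id: str               # which camera generated the image
    features: Dict[str, float]   # hypothetical detected attributes


class InformationProcessingDevice:
    """Stands in for device 700: receives target information, returns a camera."""

    def __init__(self, rule: Callable[[TargetInfo], float]):
        # The rule plays the role of the specific information 111 or a learned model
        # held in the storage unit 710; a plain scoring function is an assumption.
        self._rule = rule

    def identify(self, infos: List[TargetInfo]) -> str:
        # Role of the identification unit 730: pick the highest-scoring camera.
        return max(infos, key=self._rule).camera_id


class ImageProcessingDevice:
    """Stands in for device 600: detects target information and selects the image."""

    def __init__(self, detector: Callable[[str], Dict[str, float]],
                 remote: InformationProcessingDevice):
        self._detector = detector   # hypothetical detector: image -> features
        self._remote = remote       # a direct call replaces the network here

    def select_image(self, images: Dict[str, str]) -> str:
        infos = [TargetInfo(cam, self._detector(img)) for cam, img in images.items()]
        camera_id = self._remote.identify(infos)   # device 700 answers with a camera
        return images[camera_id]                   # same role as the selection unit 150


# Made-up detection results keyed by a placeholder image name.
fake_features = {
    "frame_A": {"face_frontality": 0.2},
    "frame_B": {"face_frontality": 0.5},
    "frame_C": {"face_frontality": 0.9},
}
device_700 = InformationProcessingDevice(lambda info: info.features["face_frontality"])
device_600 = ImageProcessingDevice(fake_features.__getitem__, device_700)
selected = device_600.select_image({"A": "frame_A", "B": "frame_B", "C": "frame_C"})
print(selected)  # frame_C
```

Only the target information and the resulting camera identifier cross the device boundary in this sketch; the images themselves stay on the image processing side, which mirrors the flow described above.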
- Embodiment 4 Next, Embodiment 4 will be described, focusing on the matters that differ from Embodiment 1. Descriptions of matters common to Embodiment 1 are omitted.
- FIG. 22 is a block diagram showing functions of the image processing apparatus according to the fourth embodiment.
- The image processing device 100b has an acquisition unit 120b and an identification unit 140b.
- The acquisition unit 120b acquires event information.
- For example, the acquisition unit 120b acquires event information from the terminal device 300.
- Further, for example, the acquisition unit 120b acquires event information through an input operation by the user.
- Further, for example, the acquisition unit 120b acquires event information from the storage unit 110.
- The storage unit 110 may store event information about an event that is currently being held (that is, an event for which images are being distributed).
- The event information is information indicating the type of event. For example, the event is a live performance, boxing, futsal, or the like.
- The event information may include information indicating a person appearing in the event.
- The acquisition unit 120b acquires a trained model for an event based on the event information. The trained model for an event is explained next.
- FIG. 23 is a diagram illustrating an example of a learned model for events according to the fourth embodiment. FIG. 23 illustrates three learned models as learned models for events.
- For example, the three trained models are a trained model for live performance 113a, a trained model for futsal 113b, and a trained model for boxing 113c.
- The trained model for live performance 113a, the trained model for futsal 113b, and the trained model for boxing 113c may be stored in the storage unit 110 or in an external device.
- The trained model for live performance 113a is a trained model generated by learning to identify the camera that generated an image satisfying the points regarded as important in a live performance.
- Specifically, the trained model for live performance 113a uses target information created for each image, based on a plurality of images generated by a plurality of cameras located at different points and a target object image, to identify, from among the plurality of cameras, the camera that generated the optimum image.
- The plurality of pieces of target information used in learning to generate the trained model for live performance 113a are created based on a plurality of images generated by a plurality of cameras located at different points at past live performances.
- Among the plurality of pieces of created target information, a label is attached to the target information created based on the image that a person selected as preferable.
- The trained model for futsal 113b is a trained model generated by learning to identify the camera that generated an image satisfying the points regarded as important in futsal. Specifically, the trained model for futsal 113b uses target information created for each image, based on a plurality of images generated by a plurality of cameras located at different points and a target object image, to identify, from among the plurality of cameras, the camera that generated the optimum image. The plurality of pieces of target information used in learning to generate the trained model for futsal 113b are created based on a plurality of images generated by a plurality of cameras located at different points at past futsal matches. Among the plurality of pieces of created target information, a label is attached to the target information created based on the image that a person selected as preferable.
- The trained model for boxing 113c is a trained model generated by learning to identify the camera that generated an image satisfying the points regarded as important in boxing.
- Specifically, the trained model for boxing 113c uses target information created for each image, based on a plurality of images generated by a plurality of cameras located at different points and a target object image, to identify, from among the plurality of cameras, the camera that generated the optimum image.
- The plurality of pieces of target information used in learning to generate the trained model for boxing 113c are created based on a plurality of images generated by a plurality of cameras located at different points at past boxing matches.
- Among the plurality of pieces of created target information, a label is attached to the target information created based on the image that a person selected as preferable.
- For example, two cameras located at different points may generate, respectively, an image that includes the whole body but in which the face is turned slightly to the side, and an image that faces the front but in which the legs are partially hidden. If the event is a live performance, the user selects the latter as the preferred image. If the event is futsal, the user selects the former as the preferred image. In this way, what the user judges to be preferable differs depending on the event. It is therefore desirable to identify the camera that generated the optimum image using a trained model generated for each event, and to provide that optimum image to the user. For this reason, the acquisition unit 120b acquires the trained model for the event based on the event information. For example, when the event information indicates futsal, the acquisition unit 120b acquires the trained model for futsal 113b from the storage unit 110.
- The identification unit 140b identifies the camera that generated the optimum image using the target information detected for each image and the trained model for the event. For example, the identification unit 140b identifies the camera that generated the optimum image using the target information detected for each image and the trained model for futsal 113b.
- The selection unit 150 selects the image generated by the identified camera from among the plurality of images. Thereby, for example, when the event information indicates futsal, the selection unit 150 selects an image that includes the player's feet.
- The output control section 160 outputs the selected image to the terminal device 300. This allows the user to see the player's feet through the terminal device 300.
- In this way, the image processing device 100b can switch the criterion for the optimum image according to the event, and can therefore select the optimum image for that event.
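The event-dependent selection can be sketched in Python as a lookup from event type to a scoring function. The two functions below are hand-written stand-ins for the trained models 113a and 113b, not trained models; their weights, the feature names `face_frontality` and `legs_visible`, and the event keys are all assumptions chosen to reproduce the two-camera example in the text. Boxing is omitted because the text gives no preference example for it.

```python
from typing import Callable, Dict

TargetInfo = Dict[str, float]            # hypothetical feature name -> value
Model = Callable[[TargetInfo], float]    # stand-in for a trained model


def live_model(info: TargetInfo) -> float:
    # Assumed preference for a live performance: a front-facing face matters most.
    return 0.8 * info["face_frontality"] + 0.2 * info["legs_visible"]


def futsal_model(info: TargetInfo) -> float:
    # Assumed preference for futsal: visible legs and feet matter most.
    return 0.2 * info["face_frontality"] + 0.8 * info["legs_visible"]


# Plays the role of the storage unit 110 holding one model per event type.
EVENT_MODELS: Dict[str, Model] = {"live": live_model, "futsal": futsal_model}


def identify_camera(event_type: str, infos_by_camera: Dict[str, TargetInfo]) -> str:
    """Pick the model for the event, then return the best-scoring camera."""
    model = EVENT_MODELS[event_type]     # raises KeyError for an unknown event
    return max(infos_by_camera, key=lambda cam: model(infos_by_camera[cam]))


# Camera 1: whole body visible, face turned slightly to the side.
# Camera 2: face to the front, legs partially hidden.
infos = {
    "camera_1": {"face_frontality": 0.6, "legs_visible": 1.0},
    "camera_2": {"face_frontality": 1.0, "legs_visible": 0.5},
}
print(identify_camera("live", infos))    # camera_2
print(identify_camera("futsal", infos))  # camera_1
```

The same target information yields a different camera depending on the event, which is the behavior the embodiment attributes to switching trained models.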
- Embodiment 5 Next, Embodiment 5 will be described, focusing on the matters that differ from Embodiments 3 and 4. Descriptions of matters common to Embodiments 3 and 4 are omitted. Embodiment 5 covers the case in which an information processing device has the functions of Embodiment 4.
- FIG. 24 is a block diagram showing functions of the information processing apparatus according to the fifth embodiment.
- The information processing device 700a has an acquisition unit 720a and an identification unit 730a.
- The acquisition unit 720a acquires event information. For example, the acquisition unit 720a acquires event information from the terminal device 300 or the image processing device 600. Further, for example, the acquisition unit 720a acquires event information through an input operation by the user. The acquisition unit 720a acquires a trained model for an event based on the event information. For example, the acquisition unit 720a acquires the trained model for the event from the storage unit 710. Alternatively, the acquisition unit 720a acquires the trained model for the event from an external device.
- The identification unit 730a identifies the camera that generated the optimum image using the target information detected for each image and the trained model for the event.
- The identification unit 730a has the same function as the identification unit 140b.
- In this way, the information processing device 700a can identify the camera that generated the image suited to the event (that is, the optimum image).
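The text names two possible sources for the trained model for an event: the storage unit 710 or an external device. The sketch below shows one way these could be combined. The storage-first order, the caching of a fetched model, and the names `acquire_event_model` and `fetch_external` are assumptions for illustration; the source presents the two sources only as alternatives.

```python
from typing import Callable, Dict, Optional

Model = str  # placeholder type; a real trained model would be a richer object


def acquire_event_model(
    event_type: str,
    storage: Dict[str, Model],
    fetch_external: Optional[Callable[[str], Optional[Model]]] = None,
) -> Model:
    """Return the model for the event, trying local storage before an external device."""
    model = storage.get(event_type)          # role of the storage unit 710
    if model is None and fetch_external is not None:
        model = fetch_external(event_type)   # hypothetical external device lookup
        if model is not None:
            storage[event_type] = model      # caching is an assumption, not in the source
    if model is None:
        raise LookupError(f"no trained model available for event {event_type!r}")
    return model


# Made-up contents for the local storage and the external device.
local_storage = {"futsal": "model_113b"}
external_models = {"boxing": "model_113c"}

print(acquire_event_model("futsal", local_storage, external_models.get))  # model_113b
print(acquire_event_model("boxing", local_storage, external_models.get))  # model_113c
```

Raising an error when neither source has a model makes the missing-model case explicit; how the actual device handles that case is not stated in the text.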
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Image Analysis (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023529395A JPWO2022269891A1 | 2021-06-25 | 2021-06-25 | |
| PCT/JP2021/024093 WO2022269891A1 (ja) | 2021-06-25 | 2021-06-25 | 画像処理装置、学習装置、画像処理システム、画像処理方法、生成方法、画像処理プログラム、及び生成プログラム |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/024093 WO2022269891A1 (ja) | 2021-06-25 | 2021-06-25 | 画像処理装置、学習装置、画像処理システム、画像処理方法、生成方法、画像処理プログラム、及び生成プログラム |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022269891A1 (ja) | 2022-12-29 |
Family
ID=84543960
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/024093 Ceased WO2022269891A1 (ja) | 2021-06-25 | 2021-06-25 | 画像処理装置、学習装置、画像処理システム、画像処理方法、生成方法、画像処理プログラム、及び生成プログラム |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2022269891A1 |
| WO (1) | WO2022269891A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2024185485A1 (ja) * | 2023-03-09 | 2024-09-12 | ソニーグループ株式会社 | 映像処理装置、映像処理方法、およびプログラム |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2018025734A (ja) * | 2016-08-03 | 2018-02-15 | 由希子 岡 | 画像表示システムおよび画像表示プログラム |
| JP2018081515A (ja) * | 2016-11-17 | 2018-05-24 | 日本電信電話株式会社 | リソース検索装置およびリソース検索方法 |
| WO2019077697A1 (ja) * | 2017-10-18 | 2019-04-25 | 三菱電機株式会社 | 画像共有支援装置、画像共有システム、及び、画像共有支援方法 |
| JP2019129328A (ja) * | 2018-01-22 | 2019-08-01 | 西日本電信電話株式会社 | 高精細動画生成装置、高精細動画生成方法、およびプログラム |
| JP2019212938A (ja) * | 2018-05-31 | 2019-12-12 | シャープ株式会社 | 撮像装置、撮像方法およびプログラム |
| JP2020088647A (ja) * | 2018-11-27 | 2020-06-04 | キヤノン株式会社 | 情報処理装置、情報処理方法及びプログラム |
| JP2021026744A (ja) * | 2019-08-09 | 2021-02-22 | 日本テレビ放送網株式会社 | 情報処理装置、画像認識方法及び学習モデル生成方法 |
| US20210146218A1 (en) * | 2019-11-15 | 2021-05-20 | Toca Football, Inc. | System and method for a user adaptive training and gaming platform |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2007028555A (ja) * | 2005-07-21 | 2007-02-01 | Sony Corp | カメラシステム,情報処理装置,情報処理方法,およびコンピュータプログラム |
| JP5247356B2 (ja) * | 2008-10-29 | 2013-07-24 | キヤノン株式会社 | 情報処理装置およびその制御方法 |
| JP7366594B2 (ja) * | 2018-07-31 | 2023-10-23 | キヤノン株式会社 | 情報処理装置とその制御方法 |
| CN109658572B (zh) * | 2018-12-21 | 2020-09-15 | 上海商汤智能科技有限公司 | 图像处理方法及装置、电子设备和存储介质 |
2021
- 2021-06-25: WO PCT/JP2021/024093, patent WO2022269891A1 (ja), not active (Ceased)
- 2021-06-25: JP JP2023529395A, patent JPWO2022269891A1 (ja), active (Pending)
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2022269891A1 | 2022-12-29 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21947175; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023529395; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21947175; Country of ref document: EP; Kind code of ref document: A1 |