WO2015159487A1 - Image delivery method, image reception method, server, terminal apparatus, and image delivery system

Info

Publication number
WO2015159487A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
terminal device
camera
videos
server
Prior art date
Application number
PCT/JP2015/001655
Other languages
French (fr)
Japanese (ja)
Inventor
敏康 杉尾
陽司 柴原
悠樹 丸山
徹 松延
陽一 杉野
幹博 大内
寿郎 笹井
邦昭 磯貝
竜二 牟田
貴子 堀
伊藤 智祥
Original Assignee
パナソニックIpマネジメント株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP2014082774A (JP2015204512A)
Priority claimed from JP2015045352A (JP6607433B2)
Application filed by パナソニックIpマネジメント株式会社
Priority to EP15779927.1A (EP3133819A1)
Publication of WO2015159487A1
Priority to US15/285,736 (US10271082B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2665 Gathering content from different sources, e.g. Internet and satellite
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources

Definitions

  • the present invention relates to a video distribution method for distributing video shot from a plurality of viewpoints.
  • As a video distribution method, for example, a technique described in Patent Document 1 is known.
  • a video distribution method for distributing video shot from a plurality of viewpoints is known (see, for example, Patent Document 2).
  • a user can designate and view an arbitrary video from a plurality of videos obtained by shooting a specific scene from different viewpoints.
  • an object of the present invention is to provide a video distribution method or a video reception method capable of smoothly switching video.
  • a video distribution method is a video distribution method by a server that distributes to a terminal device any one of a plurality of videos taken from different viewpoints by a plurality of users.
  • a video reception method is a video reception method performed by a terminal device that receives any of a plurality of videos taken from a plurality of viewpoints from a server and displays it, and includes: a selection step of selecting a first video from the plurality of videos; a request step of requesting the server to transmit the first video; a first reception step of receiving the first video from the server; a display step of displaying the first video; and a second reception step of starting, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
  • the present invention can provide a video distribution method or video reception method capable of smoothly switching video.
  • FIG. 2 is a block diagram of a server according to Embodiment 1.
  • FIG. 2 is a block diagram of a terminal device according to Embodiment 1.
  • FIG. 4 is a diagram showing processing of the video distribution system according to Embodiment 1.
  • FIG. 6 is a diagram illustrating an example of an initial screen according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of related video selection processing according to Embodiment 1.
  • FIG. 6 is a diagram showing an example of a display screen according to Embodiment 1.
  • 1 is an overall configuration diagram of a content supply system that realizes a content distribution service.
  • 1 is an overall configuration diagram of a digital broadcasting system. Further figures show an example of a smartphone and a block diagram of a configuration example of a smartphone.
  • Patent Document 1 discloses a method for sending a large image including the periphery of a viewing image.
  • Patent Document 2 discloses a method of distributing a viewpoint video around a display viewpoint video among a plurality of videos with different viewpoints as a group video.
  • a video distribution method is a video distribution method performed by a server that distributes, to a terminal device, any one of a plurality of videos taken from different viewpoints by a plurality of users, and includes: a distribution step of distributing a first video, which is one of the plurality of videos and is requested by the terminal device, to the terminal device; a selection step of selecting a second video that is one of the plurality of videos and is likely to be requested next by the terminal device; and a transmission step of starting transmission of the second video to the terminal device while the first video is being delivered to the terminal device.
  • the second video is sent in advance to the terminal device during the display of the first video.
  • the terminal device can smoothly switch from the first video to the second video.
  • a video having a high degree of association with the first video is selected as the second video from the plurality of videos.
  • the terminal device can smoothly switch the video.
  • in the selection step, it is determined that the degree of association is higher as the position of the shooting scene of a video is closer to the position of the shooting scene of the first video.
  • in the selection step, it is further determined that the degree of association is higher as the width of the shooting scene of a video is closer to the width of the shooting scene of the first video.
  • the degree of association of a video in which the same subject as a subject included in the first video is shot is set high.
  • the second video is selected based on a frame rate, resolution, or bit rate of the plurality of videos.
  • a video that is frequently selected by other users among the plurality of videos is selected as the second video.
  • the second video is selected based on the user's viewing history or pre-registered preference information.
  • a video reception method is a video reception method performed by a terminal device that receives any of a plurality of videos taken from a plurality of viewpoints from a server and displays it, and includes: a selection step of selecting a first video from the plurality of videos; a request step of requesting the server to transmit the first video; a first reception step of receiving the first video from the server; a display step of displaying the first video; and a second reception step of starting, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
  • the terminal device receives the second video in advance during the display of the first video. Thereby, the terminal device can smoothly switch from the first video to the second video.
  • the video reception method further includes a step of storing the received second video, and a step of displaying the stored second video when the second video is selected during display of the first video.
  • the video reception method further includes a step of receiving a third video from the server when the third video, which is different from the first video and the second video, is selected during display of the first video, and a step of displaying the stored second video until the third video is received.
  • the terminal device can display the second video during the waiting time for switching from the first video to another video.
  • an image overlooking a place where the plurality of videos are taken is displayed, and an image including a plurality of icons indicating the positions of the plurality of viewpoints is displayed.
  • an icon indicating the position of the viewpoint of the second video among the plurality of icons is highlighted.
  • the server is a server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, and includes: a distribution unit that distributes a first video, which is one of the plurality of videos and is designated by the terminal device, to the terminal device; a selection unit that selects a second video that is one of the plurality of videos and is likely to be requested next by the terminal device; and a transmission unit that starts transmission of the second video to the terminal device while the first video is being delivered to the terminal device.
  • the second video is sent in advance to the terminal device during the display of the first video.
  • the terminal device can smoothly switch from the first video to the second video.
  • a terminal device is a terminal device that receives any of a plurality of videos taken from a plurality of viewpoints from a server and displays it, and includes: a selection unit that selects a first video from the plurality of videos; a request unit that requests the server to transmit the first video; a first reception unit that receives the first video from the server; a display unit that displays the first video; and a second reception unit that starts, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
  • the terminal device receives the second video in advance during the display of the first video. Thereby, the terminal device can smoothly switch from the first video to the second video.
  • the video distribution system includes a server and a terminal device.
  • the second video is sent in advance to the terminal device during the display of the first video.
  • the terminal device can smoothly switch from the first video to the second video.
  • FIG. 1 is a block diagram showing a configuration of a video distribution system 100 according to the present embodiment.
  • the video distribution system 100 includes a plurality of cameras 101, a terminal device 102, and a server 103 that can communicate with each other via a network 104.
  • the plurality of cameras 101 generate a plurality of video signals by photographing the same scene from different viewpoints in the same time zone.
  • Each camera 101 is carried by each of a plurality of users.
  • the plurality of cameras 101 are owned by a plurality of spectators in a place such as a sports stadium.
  • a plurality of video signals photographed by the plurality of cameras 101 are transmitted to the server 103 via the network 104.
  • the video signal includes information indicating a photographing viewpoint (camera position), a camera direction, a magnification, and the like.
  • the camera 101 only needs to be a device having at least a photographing function, such as a digital still camera, a digital video camera, a smartphone, or a mobile terminal.
  • the terminal device 102 is a terminal used by the user, and has at least a function of displaying an image.
  • the terminal device 102 is a smartphone, a portable terminal, a personal computer, or the like.
  • the terminal device 102 may have the same functions as the camera 101; its user may be one of the spectators, or may view the video from a place other than the stadium.
  • the server 103 holds a plurality of video signals transmitted from the plurality of cameras 101. Further, the server 103 transmits a part of the plurality of video signals to be held to the terminal device 102 in accordance with a request from the terminal device 102. In addition, the server 103 analyzes the contents of the plurality of video signals and calculates the relevance of the plurality of video signals based on the obtained video characteristics. Further, the server 103 transmits a related video signal having a high degree of relevance to the selected video signal to the terminal device 102 in addition to the selected video signal designated by the terminal device 102.
  • FIG. 2 is a block diagram showing the configuration of the server 103.
  • the server 103 includes a reception unit 111, a video storage unit 112, a control unit 113, and a transmission unit 114.
  • the receiving unit 111 receives a plurality of video signals 151 in which the same scene is captured from different viewpoints by the plurality of cameras 101.
  • the reception unit 111 receives the viewpoint designation signal 152 transmitted from the terminal device 102.
  • This viewpoint designation signal 152 designates one of the plurality of video signals 151.
  • the video storage unit 112 stores a plurality of video signals 151 received by the reception unit 111.
  • the control unit 113 selects, as the selected video signal 153, the video signal 151 designated by the viewpoint designation signal 152 from the plurality of video signals 151 stored in the video storage unit 112, and transmits the selected video signal 153 to the terminal device 102 via the transmission unit 114.
  • in addition, the control unit 113 selects a related video signal 154 related to the selected video signal 153 from the plurality of video signals 151 stored in the video storage unit 112, and transmits the related video signal 154 to the terminal device 102 via the transmission unit 114.
  • FIG. 3 is a block diagram of the terminal device 102.
  • the terminal device 102 includes a receiving unit 121, a storage unit 122, a decoding unit 123, an output unit 124, a transmission unit 125, a control unit 126, and an input unit 127.
  • the receiving unit 121 receives the selected video signal 153 and the related video signal 154 transmitted from the server 103.
  • the storage unit 122 temporarily holds the selected video signal 153 and the related video signal 154 received by the receiving unit 121.
  • the decoding unit 123 generates a decoded video by decoding the selected video signal 153.
  • the output unit 124 generates an output video 155 including the decoded video, and displays the output video 155 on a display device such as a display provided in the terminal device 102, for example.
  • the input unit 127 receives a user operation. For example, the input unit 127 receives a user operation on the touch panel provided in the terminal device 102. When the input unit 127 receives an operation of changing the viewpoint by the user, the control unit 126 transmits a viewpoint designation signal 152 indicating the viewpoint to be changed to the server 103 via the transmission unit 125.
  • FIG. 4 is a sequence diagram of video distribution processing in the video distribution system 100.
  • a plurality of video signals 151 are already held in the server 103.
  • the plurality of video signals 151 may be videos updated in real time from the plurality of cameras 101, as in the stadium example in which the users are spectators, or may be previously recorded videos.
  • the terminal device 102 starts an application program (application) in accordance with a user operation (S101).
  • the terminal device 102 displays an initial screen (S102).
  • the terminal device 102 receives, as initial information from the server 103, information indicating the positions (viewpoint positions) of the plurality of cameras 101 at the time the plurality of video signals 151 were captured, and displays an initial screen showing these positions.
  • FIG. 5 is a diagram showing an example of this initial screen.
  • as the background image 201, an image overlooking the place where the plurality of videos are taken is used.
  • camera icons 202, each indicating a viewable video and the position of the camera 101 that shot it, are displayed on the background image 201.
  • thumbnails may be displayed instead of the camera icon 202 or in addition to the camera icon 202. Furthermore, a thumbnail may be displayed instead of the camera icon 202 when the initial screen is enlarged.
  • when the number of videos is large, only the camera icons 202 or thumbnails of videos having a high recommendation level for the user may be displayed, based on the degree of relevance described later. When thumbnails are displayed, they may be displayed larger than the camera icons 202.
  • the camera icon 202 may be displayed for each group of videos or for a representative video of each group.
  • the representative video is determined based on, for example, video characteristics (resolution, frame rate, bit rate, or the like). For example, a video having the highest resolution, a video having the highest frame rate, or a video having the highest bit rate is determined as the representative video.
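  • as an illustrative sketch (not part of the patent text), picking a representative video by such characteristics could look like the following; the Video fields and the criterion keys are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Video:
    video_id: str
    resolution: int    # e.g. vertical pixel count
    frame_rate: float  # frames per second
    bit_rate: int      # bits per second

def representative_video(group: list[Video], criterion: str = "resolution") -> Video:
    """Pick one representative video per group by a single characteristic,
    e.g. the highest resolution, frame rate, or bit rate."""
    key = {
        "resolution": lambda v: v.resolution,
        "frame_rate": lambda v: v.frame_rate,
        "bit_rate": lambda v: v.bit_rate,
    }[criterion]
    return max(group, key=key)
```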
  • information indicating the related contents of each group may be displayed together with the camera icon 202. Further, instead of the camera icon 202, thumbnails of representative videos of each group or reduced videos may be displayed.
  • the terminal apparatus 102 may receive the representative video from the server 103 in advance. That is, the terminal device 102 may receive all the representative videos of each group when the initial screen is displayed. Alternatively, when a certain group or representative video is selected, the terminal device 102 may receive a part or all of the video included in the group from the server 103.
  • the terminal device 102 may make selectable only the camera icons 202 corresponding to videos that have been buffered in the storage unit 122 for some time after reception started.
  • the terminal device 102 may select the viewpoint to be displayed so that the number of camera icons 202 to be displayed is constant even when the screen is enlarged or reduced according to a user operation.
  • the background image 201 of the initial screen may be switched depending on the user's current position. For example, when the user is in the infield-side stands of the stadium, a landscape image seen from the infield-side stands may be set as the background image 201, and when the user is in the outfield-side stands, a landscape image seen from the outfield stands may be set as the background image 201.
  • the camera icon 202 displayed on the initial screen may be switched according to the background image 201.
  • the camera icon 202 may be switched according to the position of the user. For example, when the user is on the infield side stand, an image of a landscape seen from the infield side stand may be set as the background image 201, and a camera icon 202 indicating a shooting viewpoint existing in the landscape may be displayed.
  • the video to be received in advance may be switched according to the position of the user.
  • the terminal device 102 may receive in advance an image shot from the outfield side stand.
  • the initial screen or the videos received in advance may be switched depending on the viewing status of all or some users. For example, a video currently being viewed by many users, or one viewed by many users in the past, may be received preferentially.
  • the terminal apparatus 102 transmits a viewpoint designation signal 152 indicating the selected viewpoint to the server 103 (S104).
  • the server 103 that has received the viewpoint designation signal 152 starts transmission of the selected video signal 153 designated by the viewpoint designation signal 152 to the terminal device 102 (S105).
  • the terminal device 102 that has received the selected video signal 153 decodes the selected video signal 153 and starts displaying the obtained video (S106).
  • the server 103 that has received the viewpoint designation signal 152 selects the related video signal 154 related to the selected video signal 153 (S107), and starts transmitting the related video signal 154 to the terminal device 102 (S108).
  • the selection of the related video signal 154 (S107) is shown here as being performed after the transmission of the selected video signal 153 is started (S105), but the order of these processes may be arbitrary, and some of them may be performed in parallel.
  • next, the related video selection process (S107) will be described.
  • the server 103 uses at least one of the following methods as the related video selection process. In each of the following methods, a relevance level is set for each video, and the video with the highest final relevance level is selected as the related video. A plurality of videos may be selected as related videos in descending order of relevance.
  • the server 103 calculates the position of the shooting scene (the area shown in the video) of the selected video (S151), and increases the relevance of videos whose shooting scene position is close to the position of the shooting scene of the selected video (S152). Specifically, the server 103 calculates the position of the shooting scene of each video using information included in the video signal 151 transmitted from the camera 101. More specifically, the video signal 151 includes information such as the viewpoint position from which the video is shot, the direction of the camera 101, and the zoom magnification, and the server 103 uses these pieces of information to calculate the position of the shooting scene that the camera 101 is shooting.
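  • as an illustrative sketch (not part of the patent text), the following Python fragment shows one way such a scene position could be estimated from the viewpoint position, camera direction, and zoom magnification, and how a distance-based relevance score could then be derived; the function names and the base_distance scaling constant are assumptions.

```python
import math

def scene_position(cam_x: float, cam_y: float, direction_deg: float, zoom: float,
                   base_distance: float = 50.0) -> tuple[float, float]:
    """Estimate the center of the shot scene from the camera position, direction,
    and zoom magnification (higher zoom is assumed to mean a farther scene)."""
    d = base_distance * zoom
    rad = math.radians(direction_deg)
    return cam_x + d * math.cos(rad), cam_y + d * math.sin(rad)

def scene_distance_relevance(selected_scene: tuple[float, float],
                             candidate_scene: tuple[float, float]) -> float:
    """Relevance grows as a candidate's scene position approaches the scene
    position of the selected video (corresponding to S151/S152)."""
    dx = selected_scene[0] - candidate_scene[0]
    dy = selected_scene[1] - candidate_scene[1]
    return 1.0 / (1.0 + math.hypot(dx, dy))
```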
  • the server 103 may calculate the position of the shooting scene of each video in advance, or may do so at any timing after the video signal 151 has been received.
  • the server 103 may also increase the relevance of a video whose shooting scene width is close to the width of the shooting scene of the selected video.
  • note that the server 103 does not need to increase the degree of relevance for a video whose shooting scene position is very close to (almost the same as) the position of the shooting scene of the selected video.
  • the server 103 identifies a subject (for example, a player) in the selected video (S161), and increases the relevance of a video in which the same subject as the subject in the selected video is captured (S162).
  • the camera 101 identifies a subject in the video by image analysis (face authentication or the like), and transmits a video signal 151 including information indicating the subject to the server 103.
  • the server 103 determines a subject in each video using the information.
  • the image analysis may be performed by the server 103.
  • the subject is not limited to a specific person, but may be a specific team.
  • in this way, the server 103 calculates the degree of association using information that the camera 101 or the server 103 generates from at least one of the video captured by the camera 101 and the information acquired by a sensor attached to the camera.
  • the server 103 acquires the popularity of a plurality of videos (S171), and increases the relevance of videos with high popularity (S172).
  • the degree of popularity indicates, for example, the number of times that a video has been viewed within a certain time in the present or the past, or the number of users who have viewed the video. Note that the degree of popularity is sequentially calculated in the server 103 based on the viewing status of a plurality of users, for example.
  • the server 103 acquires user preference information (S181), and increases the degree of relevance of the video that matches the user preference (S182).
  • the preference information is the user's viewing history, or registered information indicating the user's preferences or hobbies registered in advance. For example, when the user has watched many videos of a specific player or team in the past, the server 103 increases the degree of relevance of videos of that player or team. Moreover, when the registration information indicates a player or team that the user supports, the server 103 increases the degree of relevance of videos showing that player or team.
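  • as an illustrative sketch (not part of the patent text), a preference-based relevance score could be computed roughly as follows; the data layout (videos carrying a list of subject names, a viewing history of such lists, and a set of registered favorites) and the weights are assumptions.

```python
from collections import Counter

def preference_relevance(video_subjects: list[str],
                         viewing_history: list[list[str]],
                         registered_favorites: set[str]) -> float:
    """Raise relevance for videos showing players or teams that the user has
    watched often in the past or has registered as favorites (S181/S182)."""
    watched = Counter(subject for subjects in viewing_history for subject in subjects)
    score = 0.0
    for subject in video_subjects:
        if subject in registered_favorites:
            score += 1.0                 # registered preference
        score += 0.1 * watched[subject]  # viewing-history preference
    return score
```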
  • the server 103 acquires communication band information indicating a communication band that can be used by the terminal device 102 (S191), and changes the degree of association according to the communication band (S192). Specifically, the server 103 increases the degree of relevance of the video having the bit rate, the frame rate, or the resolution suitable for the communication band that can be used by the terminal device 102. For example, when the communication band that can be used by the terminal device 102 is sufficiently wide, the server 103 increases the relevance of the video having a high bit rate, frame rate, or resolution.
  • the server 103 may generate a plurality of bit rate video signals by converting the resolution or the frame rate of the video signal 151 transmitted from the camera 101, and store the plurality of video signals.
  • the selected video or the related video may be switched according to the available bandwidth.
  • the server 103 acquires communication band information indicating a communication band that can be used by the terminal apparatus 102 (S191), and determines the number of related videos according to the communication band (S193). Specifically, the server 103 increases the number of related videos as the communication band is wider.
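  • as an illustrative sketch (not part of the patent text), the two bandwidth-related adjustments above could look like the following; the assumption of a uniform per-stream bit rate and the cap of four related videos are hypothetical.

```python
def num_related_videos(available_bps: int, per_stream_bps: int, max_related: int = 4) -> int:
    """Send more related videos when the terminal's usable band is wider
    (S191/S193); one stream slot is reserved for the selected video."""
    slots = available_bps // per_stream_bps - 1
    return max(0, min(max_related, slots))

def bitrate_fit_relevance(video_bps: int, available_bps: int) -> float:
    """Favor videos whose bit rate fits the usable band (S191/S192):
    exclude streams that exceed it, prefer higher quality otherwise."""
    if video_bps > available_bps:
        return 0.0
    return video_bps / available_bps
```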
  • the server 103 selects, as a related video (second video), a video having a high degree of association with the selected video (first video) from among a plurality of videos. Specifically, the server 103 determines that the degree of association is higher as the position of the shooting scene is closer to the position of the shooting scene of the selected video. Further, the server 103 determines that the degree of association is higher as the width of the shooting scene is closer to the width of the shooting scene of the selected video. Further, the server 103 sets a high degree of association between the subject included in the selected video and the video in which the same subject is captured.
  • the server 103 selects related videos based on the frame rate, resolution, or bit rate of a plurality of videos. In addition, the server 103 selects, as a related video, a video that has been frequently selected by another user from a plurality of videos. Further, the server 103 selects a related video based on the viewing history of the user or the preference information registered in advance.
  • FIG. 12 is a diagram showing an example of the display screen after the video is selected. As shown in FIG. 12, a selection video 211 that is a selected video, an overhead image 212, a top image 213, and operation buttons 214 to 216 are displayed on the display screen.
  • the bird's-eye view image 212 is an image overlooking the shooting scene, and includes camera icons 202.
  • This overhead image 212 is the same as the image displayed on the initial screen.
  • the top image 213 is a view of the entire shooting scene viewed from above, and includes a camera icon 202.
  • Operation buttons 214 to 216 are buttons for the user to operate.
  • when the operation button 214 is selected, the display returns to the initial screen.
  • when the operation button 215 or 216 is operated, the display video is switched to a video from another viewpoint. At this time, a video having a high degree of association with the selected video is preferentially selected.
  • for example, the display is switched to the video whose shooting scene position is closest to the position of the shooting scene of the selected video.
  • alternatively, the display is switched to the video with the highest recommendation level.
  • thereby, the user can easily switch the display to a video from which the game can be enjoyed at that moment and view it.
  • when a camera icon 202 is selected, the display is switched to the video corresponding to the selected camera icon 202.
  • positioning of each image and operation button shown in FIG. 12 is an example, and is not limited to this example. Further, it is not necessary to display all of the plurality of images and the plurality of operation buttons, and only a part of them may be displayed.
  • the display of the camera icon 202 is changed according to the degree of association with the selected video. For example, a camera icon 202 corresponding to a video having a high degree of association with the selected video is highlighted. Note that only the camera icon 202 corresponding to a video having a high degree of association with the selected video among the plurality of videos may be displayed. Further, the display method of the camera icon 202 may be changed continuously or stepwise according to the degree of association. Information indicating the degree of association may be displayed near the camera icon 202.
  • a sensor may be incorporated in the ball, and it may be determined how the ball flew based on information detected by the sensor. Then, the trajectory of the ball may be superimposed on the overhead image 212 or the top image 213.
  • the terminal device 102 may receive in advance, from the server 103, the video signal of a viewpoint close to the position of the ball.
  • the system may obtain the flow of the game by some means (such as a ball sensor), estimate in advance the camera icon 202 that the user is likely to want to see based on that information, and have the terminal device 102 receive the estimated video in advance.
  • the server 103 may set priorities for a plurality of videos based on the current situation such as the flow of the game or the position of the user.
  • next, the user performs a viewpoint switching operation (S109).
  • since the terminal device 102 has received the related video signal 154 in advance, it decodes the related video signal 154 and displays the related video (S110). In this way, the terminal device 102 can seamlessly switch the video by receiving in advance the related video that is likely to be selected next.
  • the terminal device 102 transmits a viewpoint designation signal 152 indicating the selected viewpoint to the server 103 (S111).
  • the server 103 that has received the viewpoint designation signal 152 transmits a selection video signal 153 designated by the viewpoint designation signal 152 to the terminal device 102. That is, the server 103 continues transmission of the related video signal 154 as transmission of the selected video signal 153 (S112). Further, the server 103 selects the related video signal 154 related to the new selected video signal 153 (S113), and starts transmitting the related video signal 154 to the terminal device 102 (S114).
  • the order of the video display (S110) and the transmission of the viewpoint designation signal 152 (S111) may be arbitrary, and some of them may be performed in parallel.
  • FIG. 13 is a flowchart illustrating the operation flow of the terminal apparatus 102.
  • FIG. 13 shows processing of the terminal device 102 in a state where a certain viewpoint video is displayed.
  • the terminal apparatus 102 determines whether or not the viewpoint switching is instructed by the user's operation (S121). When the viewpoint switching is instructed (Yes in S121), the terminal device 102 transmits the viewpoint designation signal 152 to the server 103 (S122).
  • next, the terminal device 102 determines whether the selected video at the viewpoint switching destination is a related video (S123). If the selected video is not a related video (No in S123), the terminal device 102 waits until it receives the selected video transmitted by the server 103 in response to the viewpoint designation signal 152 (S124), and when the selected video is received (Yes in S124), displays the selected video (S125).
  • on the other hand, if the selected video is a related video (Yes in S123), the terminal device 102 displays the already stored related video as the selected video (S125).
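  • as an illustrative sketch (not part of the patent text), the terminal-side branch of this flow (S121 to S125) could be organized as follows; the server interface methods and the buffering dictionary are hypothetical.

```python
class Terminal:
    """Minimal sketch of the terminal-side viewpoint switching logic."""

    def __init__(self, server):
        self.server = server
        self.buffered = {}  # viewpoint id -> pre-received related video data

    def switch_viewpoint(self, view_id: str) -> None:
        # S122: always tell the server which viewpoint was chosen
        self.server.send_viewpoint_designation(view_id)
        if view_id in self.buffered:
            # S123 Yes / S125: the video was pre-fetched, show it at once
            self.display(self.buffered[view_id])
        else:
            # S123 No / S124: wait for the server to deliver the selected video
            video = self.server.wait_for_selected_video(view_id)
            self.display(video)  # S125

    def display(self, video) -> None:
        ...  # hand the stream to the decoding unit and the output unit
```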
  • switching of decoded video may be performed at the time when decoding of the random access frame is completed.
  • a waiting time occurs from when the user's viewpoint switching instruction is issued until the switching is completed, and during this waiting time the terminal device 102 may continue to play the video that was displayed before switching.
  • alternatively, the terminal device 102 may search for a random access point at the time closest to the playback time of the video before switching, and decode and display the video from there.
  • the terminal device 102 when receiving the related video related to the new selected video (Yes in S126), the terminal device 102 sequentially stores the received related video in the storage unit 122 (S127). Note that the data of the selected video after being displayed and the data of the related video that has not been used for a certain period after reception are sequentially deleted from the storage unit 122.
  • the terminal apparatus 102 displays the newly received related video information (S128). Specifically, the terminal device 102 highlights the camera icon 202 of the related video. For example, the related video camera icon 202 is displayed larger than the other camera icons 202. In addition, the contour line of the camera icon 202 of the related video is displayed thicker than the contour lines of the other camera icons 202. Alternatively, the color of the camera icon 202 of the related video is changed to a conspicuous color such as red. The highlighting method is not limited to this.
  • the terminal device 102 may perform the processing shown in FIG. 14 or FIG. 15. FIGS. 14 and 15 are flowcharts illustrating modified examples of the operation of the terminal device 102.
  • in the process shown in FIG. 14, step S129 is added to the process shown in FIG. 13. That is, when the selected video is not a related video (No in S123), the terminal device 102 displays a related video during the period until the selected video is received (S129). In addition, when the terminal device 102 stores a plurality of related videos, it may display the related video having the highest degree of association with the newly selected video among the plurality of stored related videos.
  • in the process shown in FIG. 15, step S130 is added to the process shown in FIG. 13. That is, when the selected video is not a related video (No in S123), the terminal device 102 displays the three-dimensional configuration data during the period until the selected video is received (S130).
  • the three-dimensional configuration data is the three-dimensional configuration data of a place where a plurality of videos are taken, and in the example shown in FIG. 5, is the three-dimensional configuration data of a baseball field.
  • the three-dimensional configuration data is generated by the server 103 using a plurality of video signals 151 and transmitted to the terminal device 102 in advance.
  • the terminal device 102 may generate an image to be displayed during this period using the three-dimensional configuration data.
  • for example, the terminal device 102 generates, using the three-dimensional configuration data, a video in which the viewpoint position changes continuously from the viewpoint position of the immediately preceding displayed video to the viewpoint position of the selected video, and displays that video during the above period.
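  • as an illustrative sketch (not part of the patent text), such a camera-fly transition could interpolate the viewpoint position as follows; the renderer of the three-dimensional configuration data (render_from_3d_model) is a hypothetical placeholder.

```python
def transition_viewpoints(prev_pos: tuple[float, float, float],
                          new_pos: tuple[float, float, float],
                          steps: int = 30):
    """Yield intermediate viewpoint positions, moving linearly from the previous
    viewpoint to the newly selected one while the selected video is awaited."""
    for i in range(1, steps + 1):
        t = i / steps
        yield tuple(p + t * (n - p) for p, n in zip(prev_pos, new_pos))

# Each yielded position would be handed to a renderer of the 3D configuration data:
# for pos in transition_viewpoints((0.0, 1.5, -20.0), (5.0, 1.5, 10.0)):
#     frame = render_from_3d_model(model, viewpoint=pos)  # hypothetical renderer
```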
  • Such visual effects may also be used when video data is stored in the storage unit 122.
  • whether or not this visual effect is used may be switched according to the distance between the viewpoint position of the immediately preceding display image and the viewpoint position of the selected image. For example, when the distance is short, the visual effect is not used, and when the distance is long, the visual effect is used.
  • the terminal device 102 may receive the video signal directly from the camera 101 using another communication method such as near field communication.
  • the terminal apparatus 102 receives any of a plurality of videos taken from a plurality of viewpoints from the server 103 and displays the videos.
  • the terminal device 102 selects a selection video (first video) from a plurality of videos (S121).
  • the terminal device 102 requests the server 103 to transmit the selected video (S122).
  • the terminal device 102 receives the selected video from the server 103 (S124), and displays the selected video (S125).
  • during reception of the selected video, the terminal device 102 starts receiving a related video (second video) that is one of the plurality of videos, is different from the selected video, and is likely to be selected next (S126).
  • the terminal device 102 accumulates the received related video (S127).
  • when the related video is selected during display of the selected video, the terminal device 102 displays the stored related video (S125).
  • when a third video different from the selected video and the related video is selected during display of the selected video, the terminal device 102 receives the third video from the server 103 (S124).
  • the terminal device 102 displays the stored related video until the third video is received (S129).
  • FIG. 16 is a flowchart showing the operation flow of the server 103.
  • the server 103 determines whether the viewpoint designation signal 152 is received from the terminal device 102 (S141). When the server 103 receives the viewpoint designation signal 152 (Yes in S141), the server 103 selects the video signal indicated by the viewpoint designation signal 152 as the selected video signal 153 from the accumulated video signals, and the selected video signal 153 is transmitted to the terminal device 102 (S142).
  • next, the server 103 selects, based on the degree of relevance, the related video signal 154 having a high degree of relevance to the selected video from the plurality of stored video signals 151 (S143), and transmits the related video signal 154 to the terminal device 102 (S144).
  • the server 103 delivers any of a plurality of videos taken from different viewpoints by a plurality of users to the terminal device 102.
  • the server 103 distributes the selected video (first video) requested by the terminal device 102 to the terminal device 102, which is one of a plurality of videos (S142).
  • the server 103 selects, from a plurality of videos, a related video (second video) that is different from the selected video and is highly likely to be requested next from the terminal device 102 (S143).
  • the related video is a video that is not requested from the terminal device 102.
  • the server 103 starts transmitting the related video to the terminal device 102 while delivering the selected video to the terminal device 102 (S144).
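  • as an illustrative sketch (not part of the patent text), the server-side flow of FIG. 16 (S141 to S144) could be organized as follows; the server object, its streaming helpers, and the relevance function are hypothetical.

```python
def handle_viewpoint_designation(server, terminal, view_id: str) -> None:
    """On receiving a viewpoint designation signal (S141), deliver the designated
    video (S142), pick the most relevant other video (S143), and push it too (S144)."""
    selected = server.stored_videos[view_id]       # S142: the designated video
    server.start_streaming(terminal, selected)     # S142: deliver the selected video
    related = max(                                  # S143: highest-relevance other video
        (video for vid, video in server.stored_videos.items() if vid != view_id),
        key=lambda video: server.relevance(selected, video),
        default=None,
    )
    if related is not None:
        server.start_streaming(terminal, related)   # S144: push the related video in advance
```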
  • each processing unit included in each device included in the video distribution system according to the above embodiment is typically realized as an LSI which is an integrated circuit. These may be individually made into one chip, or may be made into one chip so as to include a part or all of them.
  • the circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor.
  • an FPGA (Field Programmable Gate Array) that can be programmed after LSI manufacturing, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
  • each component may be configured by dedicated hardware or may be realized by executing a software program suitable for each component.
  • Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.
  • each device included in the video distribution system includes a processing circuit and a storage device (storage) electrically connected to the processing circuit (accessible from the processing circuit).
  • the processing circuit includes at least one of dedicated hardware and a program execution unit.
  • the storage device stores a software program executed by the program execution unit. The processing circuit uses the storage device to execute the video distribution method or the video reception method according to the above embodiment.
  • the present invention may be the software program or a non-transitory computer-readable recording medium on which the program is recorded.
  • the program can be distributed via a transmission medium such as the Internet.
  • the order in which the steps included in the above video distribution method or video reception method are executed is for illustrating the present invention specifically, and may be in an order other than the above. Also, some of the above steps may be executed simultaneously (in parallel) with other steps.
  • the video distribution method, the video reception method, the video distribution system, the server, and the terminal device have been described above based on the embodiment, but the present invention is not limited to this embodiment. Unless they depart from the gist of the present invention, forms obtained by applying various modifications conceived by those skilled in the art to the present embodiment, and forms constructed by combining components of different embodiments, may also be included within the scope of one or more aspects of the present invention.
  • the system can be applied to video systems in which increasing intelligence and widening of the target space are progressing.
  • for example, it can be applied to (1) a surveillance system using security cameras in a store or a factory, or in-vehicle cameras of the police, (2) a traffic information system using personally owned cameras, in-vehicle cameras, or cameras installed on roads, (3) an environmental survey or delivery system using a remotely controlled or automatically controlled apparatus such as a drone, and (4) an entertainment content transmission/reception system using, for example, video from cameras installed in a facility or a stadium, mobile cameras such as drones, or personally owned cameras.
  • FIG. 17 is a diagram showing a configuration of the video information processing system ex100 in the present embodiment. In this embodiment, an example of preventing the generation of blind spots and an example of prohibiting photographing in a specific area will be described.
  • the video information processing system ex100 illustrated in FIG. 17 includes a video information processing apparatus ex101, a plurality of cameras ex102, and a video reception apparatus ex103. Note that the video receiving device ex103 is not necessarily included in the video information processing system ex100.
  • the video information processing apparatus ex101 includes a storage unit ex111 and an analysis unit ex112.
  • Each of the N cameras ex102 has a function of capturing video and a function of transmitting captured video data to the video information processing apparatus ex101.
  • the camera ex102 may have a function of displaying an image being shot.
  • the camera ex102 may encode the captured video signal using an encoding method such as HEVC or H.264 and transmit the encoded video data to the video information processing apparatus ex101, or may transmit unencoded video data to the video information processing apparatus ex101.
  • each camera ex102 is a fixed camera such as a surveillance camera, a moving camera mounted on an unmanned aerial vehicle (radio-controlled drone) or a car, or a user camera carried by a user.
  • the moving camera receives the instruction signal transmitted from the video information processing apparatus ex101, and changes the position or shooting direction of the moving camera itself according to the received instruction signal.
  • the clocks of the plurality of cameras ex102 are calibrated using time information from the server or a reference camera, for example, before photographing starts. Further, the spatial positions of the plurality of cameras ex102 are calibrated based on how objects in the space to be photographed appear in the images, or based on their relative positions with respect to a reference camera.
  • the storage unit ex111 included in the information processing apparatus ex101 stores video data transmitted from the N cameras ex102.
  • the analysis unit ex112 detects a blind spot from the video data stored in the storage unit ex111, and transmits an instruction signal indicating an instruction to the mobile camera for preventing the generation of the blind spot to the mobile camera.
  • the moving camera moves in accordance with the instruction signal and continues shooting.
  • the analysis unit ex112 performs blind spot detection using, for example, SfM (Structure from Motion).
  • SfM is a technique for restoring the three-dimensional shape of a subject from a plurality of videos taken from different positions, and is widely known as a shape restoration technique for simultaneously estimating the subject shape and the camera position.
  • the analysis unit ex112 restores the three-dimensional shape in the facility or the stadium from the video data saved in the saving unit ex111 using SfM, and detects an area that cannot be restored as a blind spot.
  • when known information such as the position or shooting direction of a camera is available, the analysis unit ex112 may perform SfM using this known information. Further, when the position and shooting direction of the moving camera can be acquired by a GPS and an angle sensor provided in the moving camera, the moving camera may transmit information on its position and shooting direction to the analysis unit ex112, and the analysis unit ex112 may perform SfM using the transmitted position and shooting direction information.
  • the method of detecting the blind spot is not limited to the method using SfM described above.
  • the analysis unit ex112 may grasp the spatial distance of the object to be imaged by using information of a depth sensor such as a laser range finder.
  • the analysis unit ex112 may detect information such as the camera position, shooting direction, and zoom magnification from whether an image includes a preset marker or a specific object placed in the space, or from the size of such a marker in the image.
  • the analysis unit ex112 performs blind spot detection using an arbitrary method capable of detecting the imaging region of each camera.
  • the analysis unit ex112 may acquire information such as the mutual positional relationship of a plurality of shooting targets from video data or a proximity distance sensor, and identify an area where a blind spot is likely to occur based on the acquired positional relationship.
  • the blind spot includes not only a portion where an image does not exist in a region to be photographed, but also a portion having a poor image quality compared to other portions and a portion where a predetermined image quality is not obtained.
  • This detection target portion may be set as appropriate according to the configuration or purpose of the system. For example, the required image quality may be set high for a specific subject in the space where the image is taken. Conversely, for a specific area in the shooting space, the required image quality may be set low, or it may be set not to be determined as a blind spot even if no video is shot.
  • the above-mentioned image quality includes various information related to the video such as the area occupied by the subject to be photographed in the video (for example, the number of pixels) or whether the subject to be photographed is in focus. Whether or not it is a blind spot may be determined based on the information or the combination thereof.
  • a region that needs to be detected in order to prevent the generation of a blind spot is not limited to a region that is actually a blind spot.
  • the analysis unit ex112 may detect the movements of a plurality of shooting targets from, for example, the shot video data, and estimate a region that may newly become a blind spot based on the detected movements of the plurality of shooting targets and the position information of the cameras ex102.
  • the video information processing apparatus ex101 may transmit an instruction signal to the moving camera so as to capture an area that may become a blind spot, and prevent the generation of a blind spot.
  • the video information processing apparatus ex101 needs to select the moving camera to which an instruction signal is transmitted in order to capture a blind spot or an area that may become a blind spot.
  • in addition, when there are a plurality of moving cameras and a plurality of blind spots or areas that may become blind spots, the video information processing apparatus ex101 needs to decide which of them each moving camera should shoot. For example, the video information processing apparatus ex101 selects the moving camera closest to a blind spot or an area that may become a blind spot, based on the position of that blind spot or area and the position of the area that each moving camera is currently shooting. Further, the video information processing apparatus ex101 may determine, for each moving camera, whether a new blind spot would be generated if the video data currently being shot by that moving camera could no longer be obtained, and may select a moving camera for which it is determined that no new blind spot would be generated.
  • the video information processing apparatus ex101 can prevent the generation of a blind spot by detecting a blind spot and transmitting an instruction signal to the moving camera so as to prevent the blind spot.
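  • as an illustrative sketch (not part of the patent text), the blind-spot handling described above could be approximated as follows, with the shooting space discretized into cells given as (x, y) coordinates; the camera attributes (position, current_cells) and the coverage map are hypothetical data structures.

```python
import math

def distance(p: tuple[float, float], q: tuple[float, float]) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def find_blind_spots(space_cells, coverage):
    """Cells of the shooting space covered by no camera, e.g. regions that
    SfM could not restore from the stored video data."""
    return [cell for cell in space_cells if not coverage.get(cell)]

def pick_camera_for_blind_spot(blind_spot, moving_cameras, coverage):
    """Choose the moving camera closest to the blind spot, preferring cameras
    whose current view is also covered by others (so no new blind spot appears)."""
    def redundant(cam):
        return all(len(coverage.get(cell, ())) > 1 for cell in cam.current_cells)
    candidates = [cam for cam in moving_cameras if redundant(cam)] or list(moving_cameras)
    return min(candidates, key=lambda cam: distance(cam.position, blind_spot))
```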
  • the instruction signal may be a signal for instructing the user of the user camera to move.
  • the user camera displays an instruction image that instructs the user to change the direction of the camera based on the instruction signal.
  • the user camera may display an instruction image indicating a movement route on a map as an instruction to move the user.
  • the user camera may display detailed shooting instructions such as shooting direction, angle, angle of view, image quality, and movement of the shooting area in order to improve the quality of the acquired image. Further, if such control is possible on the video information processing apparatus ex101 side, the video information processing apparatus ex101 may automatically control such shooting settings of the camera ex102.
  • the user camera is, for example, a smartphone, a tablet terminal, a wearable terminal, or an HMD (Head Mounted Display) held by a spectator in the stadium or a guard in the facility.
  • the display terminal that displays the instruction image need not be the same as the user camera that captures the video data.
  • the user camera may transmit an instruction signal or an instruction image to a display terminal associated with the user camera in advance, and the display terminal may display the instruction image.
  • information on the display terminal corresponding to the user camera may be registered in advance in the video information processing apparatus ex101.
  • the video information processing apparatus ex101 may display the instruction image on the display terminal by directly transmitting the instruction signal to the display terminal corresponding to the user camera.
  • the analysis unit ex112 may generate a free viewpoint video (three-dimensional reconstruction data) by restoring the three-dimensional shape in the facility or the stadium from the video data stored in the storage unit ex111 using, for example, SfM.
  • This free viewpoint video is stored in the storage unit ex111.
  • the video information processing apparatus ex101 reads video data corresponding to the visual field information (and / or viewpoint information) transmitted from the video reception apparatus ex103 from the storage unit ex111 and transmits the video data to the video reception apparatus ex103.
  • the video reception device ex103 may be one of the plurality of cameras ex102.
  • the video information processing apparatus ex101 may detect a shooting prohibited area.
  • the analysis unit ex112 analyzes the photographed image, and transmits a photographing prohibition signal to the moving camera when the mobile camera is photographing the photographing prohibition region.
  • the mobile camera stops shooting while receiving the shooting prohibition signal.
  • for example, the analysis unit ex112 matches the three-dimensional virtual space restored using SfM against the captured video, thereby determining whether the moving camera is capturing a shooting prohibited area set in advance in the space.
  • the analysis unit ex112 determines whether the moving camera is shooting the shooting prohibited area using a marker or a characteristic object arranged in the space as a trigger.
  • the photographing prohibited area is, for example, a toilet in a facility or a stadium.
  • when the user camera is shooting a shooting prohibited area, the user camera displays a message on a display or the like connected wirelessly or by wire, or outputs a sound or voice from a speaker or an earphone.
  • the user may be informed that the current location is a shooting prohibited location.
  • the shooting prohibited area and the current shooting area are shown on the displayed map.
  • the resumption of photographing is automatically performed when, for example, the photographing prohibition signal is not output.
  • photographing may be resumed when the photographing prohibition signal is not output and the user performs an operation to resume photographing.
  • calibration may be performed again.
  • notification for confirming the current position or prompting the user to move may be performed.
  • a passcode or fingerprint authentication that turns off such a function for recording may be used.
  • image processing such as mosaicing may be automatically performed when a video in the photographing prohibited area is displayed or stored outside.
  • the video information processing apparatus ex101 can determine that shooting is prohibited and notify the user to stop shooting, so that a certain region can be set as a shooting-prohibited area.
  • the video information processing system ex100 sets an incentive for the user who transferred the shot video.
  • the video information processing apparatus ex101 may give a user who has transferred video, for example, free or discounted video delivery, monetary value that can be used in an online or offline store or game, or points having non-monetary value such as social status in a game or virtual space.
  • the video information processing apparatus ex101 gives particularly high points to a user who has transferred a captured video of a valuable field of view (and / or viewpoint), such as one with many requests.
  • the video information processing apparatus ex101 may transmit additional information to the user camera based on the analysis result of the analysis unit ex112. In this case, the user camera superimposes additional information on the captured video and displays it on the screen.
  • the additional information is, for example, information on players such as a player name or height when a game in a stadium is being shot, and the name or face photo of a player is displayed in association with each player in the video.
  • the video information processing apparatus ex101 may extract additional information by searching via the Internet based on part or all of the video data area.
  • the camera ex102 may receive such additional information by short-range wireless communication including Bluetooth (registered trademark) or by visible light communication from lighting in the stadium or the like, and map the received additional information to the video data.
  • the camera ex102 may perform this mapping based on a certain rule, such as a table that is stored in a storage unit connected to the camera ex102 by wire or wirelessly and that shows the correspondence between information obtained by visible light communication technology and additional information, or may perform it using the most probable combination result obtained by Internet search.
  • in the monitoring system, for example, information on a person requiring attention is superimposed on a user camera held by a guard in the facility, so that the accuracy of the monitoring system can be increased.
  • the analysis unit ex112 may determine which area in the facility or stadium the user camera is capturing by matching the free viewpoint image and the captured image of the user camera. Note that the imaging region determination method is not limited to this, and various imaging region determination methods or other imaging region determination methods described in the above-described embodiments may be used.
  • the video information processing apparatus ex101 transmits the past video to the user camera based on the analysis result of the analysis unit ex112.
  • the user camera displays the past video on the screen by superimposing the past video on the shot video or replacing the shot video with the past video.
  • the highlight scene of the first half is displayed as a past video. Accordingly, the user can enjoy the highlight scene of the first half as a video in the direction in which he / she is viewing during the halftime.
  • the past video is not limited to the highlight scene in the first half, but may be a highlight scene of a past game held at the stadium.
  • the timing at which the video information processing apparatus ex101 delivers the past video is not limited to half time, and may be, for example, after the match or during the match. Particularly during a game, based on the analysis result of the analysis unit ex112, the video information processing apparatus ex101 may deliver a scene that is considered important and missed by the user.
  • the video information processing apparatus ex101 may distribute the past video only when requested by the user, or may distribute a distribution permission message before the past video is distributed.
  • the video information processing apparatus ex101 may transmit advertisement information to the user camera based on the analysis result of the analysis unit ex112.
  • the user camera superimposes advertisement information on the captured video and displays it on the screen.
  • the advertisement information may be distributed immediately before the past video distribution during the half time or after the match, as shown in, for example, Modification 5. Accordingly, the distributor can obtain an advertisement fee from the advertiser, and can provide a video distribution service to the user at a low cost or free of charge.
  • the video information processing apparatus ex101 may distribute an advertisement distribution permission message immediately before distribution of the advertisement information, may provide the service free of charge only when the user views the advertisement, or may provide the service at a lower cost when the user views the advertisement than when the user does not.
  • a staff member who knows the location of the user from some location information, or an automatic delivery system of the venue, will deliver the ordered drink to the user's seat.
  • the payment may be made by handing money to the staff member, or may be made based on credit card information set in advance in the mobile terminal application or the like.
  • the advertisement may include a link to an e-commerce site, and online shopping such as normal home delivery may be possible.
  • the video receiving device ex103 may be one of the cameras ex102 (user camera).
  • the analysis unit ex112 determines which area in the facility or stadium the user camera is shooting by matching the free viewpoint video and the video shot by the user camera. Note that the method for determining the imaging region is not limited to this.
  • when the user performs a swipe operation in the direction of the arrow displayed on the screen, the user camera generates viewpoint information indicating that the viewpoint is to be moved in that direction.
  • the video information processing apparatus ex101 reads, from the storage unit ex111, the video data of the area shifted by the viewpoint information from the shooting area of the user camera determined by the analysis unit ex112, and starts transmitting the video data to the user camera.
  • the user camera displays the video distributed from the video information processing apparatus ex101 instead of the captured video.
  • the users in the facility or the stadium can view the video from a favorite viewpoint with a simple operation like a screen swipe.
  • a spectator watching on the third base side of a baseball field can view a video from the first base side viewpoint.
  • a security guard in the facility can watch, with a simple operation such as a screen swipe, video from the viewpoint they want to confirm while changing the viewpoint appropriately, or can watch, as an interrupt from the center, video that should be watched, so that the accuracy of the monitoring system can be increased.
  • the user camera may switch and display the video of a part of the shooting area of the user camera including the obstacle from the shot video to the distribution video from the video information processing apparatus ex101.
  • the entire screen may be switched from the captured video to the distributed video and displayed.
  • the user camera may display a video in which the viewing target is seen through the obstacle by combining the captured video and the distribution video. According to this configuration, the video distributed from the video information processing apparatus ex101 can be viewed even when the shooting target cannot be seen from the position of the user due to the obstacle, so that the influence of the obstacle can be reduced.
  • when the distribution video is displayed as the video of an area that cannot be seen due to an obstacle, display switching control different from the display switching control according to user input processing such as the screen swipe described above may be performed; for example, the display switching from the captured video to the distribution video may be performed automatically.
  • the display switching from the captured video to the distribution video and the display switching from the distribution video to the captured video may also be performed in accordance with user input processing.
  • (Modification 8) The speed at which the video data is transferred to the video information processing apparatus ex101 may be instructed based on the importance of the video data captured by each camera ex102.
  • the analysis unit ex112 determines the importance of the video data stored in the storage unit ex111 or the camera ex102 that captured the video data.
  • the determination of the importance is performed based on, for example, information such as the number of people or moving objects included in the video, the image quality of the video data, or a combination thereof.
  • the determination of the importance of the video data may be based on the position of the camera ex102 that shot the video data or on the area shot in the video data. For example, when there are a plurality of other cameras ex102 shooting near the target camera ex102, the importance of the video data shot by the target camera ex102 is lowered. In addition, even when the position of the target camera ex102 is far from the other cameras ex102, the importance of the video data shot by the target camera ex102 is set low when there are a plurality of other cameras ex102 shooting the same area.
  • the determination of the importance of the video data may be performed based on the number of requests in the video distribution service.
  • the importance determination method is not limited to the method described above or a combination thereof, and may be any method according to the configuration or purpose of the monitoring system or the video distribution system.
  • the determination of the importance may not be based on the captured video data.
  • for example, the importance of a camera ex102 that transmits video data to a terminal other than the video information processing apparatus ex101 may be set high, or conversely may be set low.
  • the analysis unit ex112 may determine the importance of the video data using the free viewpoint video and the video shot by the camera ex102.
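Putting the importance factors listed above together, a toy scoring function might look like the following sketch. The particular weights and the way redundant coverage reduces the score are assumptions for illustration; a real system would tune them to the purpose of the monitoring or distribution service.

```python
def video_importance(num_people, quality, nearby_cameras_same_area, request_count):
    """Toy importance score combining the factors mentioned above.
    All weights are arbitrary assumptions, not values from the embodiment."""
    score = 1.0 * num_people + 2.0 * quality + 0.5 * request_count
    # Redundant coverage lowers importance: other cameras already film this area.
    score /= (1 + nearby_cameras_same_area)
    return score

# A crowded, frequently requested view filmed by only one camera scores high.
print(video_importance(num_people=12, quality=0.8, nearby_cameras_same_area=0, request_count=30))
# The same view loses importance once three other cameras cover the same area.
print(video_importance(num_people=12, quality=0.8, nearby_cameras_same_area=3, request_count=30))
```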
  • the video information processing apparatus ex101 transmits a communication speed instruction signal to the camera ex102 based on the importance determination result of the analysis unit ex112. For example, the video information processing apparatus ex101 instructs a high communication speed to a camera ex102 that is shooting video of high importance. In addition to speed control, the video information processing apparatus ex101 may transmit a signal instructing a scheme in which important information is transmitted a plurality of times in order to reduce the disadvantage caused by data loss. Thereby, communication within the facility or the entire stadium can be performed efficiently. Communication between the camera ex102 and the video information processing apparatus ex101 may be wired communication or wireless communication. The video information processing apparatus ex101 may control only one of wired communication and wireless communication.
  • the camera ex102 transmits the captured video data to the video information processing apparatus ex101 at a communication speed according to the communication speed instruction signal. Note that if the retransmission of the camera ex102 fails a predetermined number of times, the camera ex102 may stop the retransmission of the captured video data and start the transfer of the next captured video data. As a result, communication within the facility or the entire stadium can be efficiently performed, and high-speed processing in the analysis unit ex112 can be realized.
  • the camera ex102 may convert the captured video data into video data of a bit rate that can be transmitted at the allocated communication speed, or may stop the video data transfer.
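The camera-side reaction to a communication speed instruction could, for example, resemble the following sketch: pick the highest bit rate that fits the allocated speed, or give up the transfer when even the lowest rate does not fit. The available rates and the safety margin are assumptions.

```python
AVAILABLE_BITRATES_KBPS = [8000, 4000, 2000, 1000, 500]

def choose_bitrate(allocated_speed_kbps):
    """Return the highest bit rate that fits the allocated speed, or None to stop."""
    for rate in AVAILABLE_BITRATES_KBPS:
        if rate <= allocated_speed_kbps * 0.8:   # leave headroom for retransmission
            return rate
    return None   # None means "stop transferring this video data"

print(choose_bitrate(6000))  # 4000
print(choose_bitrate(300))   # None -> transfer stopped
```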
  • when the video data is used to prevent the occurrence of blind spots, there is a possibility that only a part of the shooting area included in the captured video data is necessary to fill the blind spots.
  • the camera ex102 may generate extracted video data by extracting, from the video data, at least the area necessary for preventing the occurrence of blind spots, and transmit the generated extracted video data to the video information processing apparatus ex101. According to this configuration, the occurrence of blind spots can be suppressed with a smaller communication band.
  • the camera ex102 needs to transmit the position information of the camera ex102 and the shooting direction information to the video information processing apparatus ex101.
  • a camera ex102 to which only a bandwidth insufficient for transferring the video data is allocated may transmit only the position information and the shooting direction information detected by the camera ex102.
  • when the video information processing apparatus ex101 estimates the position information and shooting direction information of the camera ex102, the camera ex102 may convert the captured video data to a resolution necessary for estimating the position information and shooting direction information and transmit the converted video data to the video information processing apparatus ex101.
  • according to this configuration, the video information processing apparatus ex101 can acquire shooting area information from a larger number of cameras ex102, which is effective, for example, when the shooting area information is used for the purpose of detecting an area of interest.
  • the switching of the video data transfer process according to the allocated communication band described above may be performed by the camera ex102 based on the notified communication band, or the video information processing apparatus ex101 may determine the operation of each camera ex102 and notify each camera ex102 of a control signal indicating the determined operation.
  • the processing can be appropriately shared according to the calculation amount necessary for determining the switching of the operation, the processing capability of the camera ex102, the necessary communication band, and the like.
  • the analysis unit ex112 may determine the importance of the video data based on the visual field information (and / or viewpoint information) transmitted from the video reception device ex103. For example, the analysis unit ex112 sets the importance of captured video data including many areas indicated by the visual field information (and / or viewpoint information) to be high. The analysis unit ex112 may determine the importance of the video data in consideration of the number of people included in the video or the number of moving objects. Note that the importance determination method is not limited to this.
  • the communication control method described in the present embodiment does not necessarily have to be used in a system that reconstructs a three-dimensional shape from a plurality of pieces of video data; it is also effective in other cases where video data is transmitted from a plurality of cameras ex102.
  • the video information processing apparatus ex101 may transmit an overview video showing the entire shooting scene to the video receiving apparatus ex103.
  • when the video information processing apparatus ex101 receives a distribution request transmitted from the video receiving apparatus ex103, the video information processing apparatus ex101 reads an overview video of the entire facility or stadium from the storage unit ex111 and transmits the overview video to the video receiving apparatus ex103.
  • the overview video may have a long update interval (may be a low frame rate) or may have a low image quality.
  • the viewer touches a portion to be seen in the overview video displayed on the screen of the video receiving device ex103. Accordingly, the video reception device ex103 transmits visual field information (and / or viewpoint information) corresponding to the touched portion to the video information processing device ex101.
  • the video information processing apparatus ex101 reads video data corresponding to the visual field information (and / or viewpoint information) from the storage unit ex111, and transmits the video data to the video receiving apparatus ex103.
  • the analysis unit ex112 generates a free viewpoint video by preferentially restoring the three-dimensional shape (three-dimensional reconstruction) on the region indicated by the visual field information (and / or viewpoint information).
  • the analysis unit ex112 restores the three-dimensional shape of the entire facility or the stadium with an accuracy that shows an overview.
  • the video information processing apparatus ex101 can efficiently restore the three-dimensional shape. As a result, it is possible to realize a high frame rate and high image quality of a free viewpoint video in an area desired by the viewer.
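One way to realize such prioritization, sketched below under the assumption that the venue is divided into named regions, is to schedule the region indicated by the visual field information first and at a finer resolution than the rest; the region names and voxel sizes are illustrative, not part of the embodiment.

```python
def reconstruction_plan(all_regions, requested_region):
    """Process the viewed region first and finely, the rest only coarsely."""
    plan = [(requested_region, 0.05)]  # fine resolution (metres) for the viewed area
    plan += [(r, 0.5) for r in all_regions if r != requested_region]  # coarse overview
    return plan

regions = ["north_stand", "pitch_center", "south_stand"]
for region, voxel in reconstruction_plan(regions, "pitch_center"):
    print(f"reconstruct {region} at {voxel} m resolution")
```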
  • the video information processing apparatus ex101 may store in advance, as a preliminary video, for example, three-dimensional shape restoration data of the facility or stadium generated in advance from a design drawing or the like.
  • the preliminary video is not limited to this, and may be virtual space data in which the unevenness of the space obtained from a depth sensor and a picture derived from past images or video data, or from images at the time of calibration, are mapped for each object.
  • for example, when a soccer game is being played in a stadium, the analysis unit ex112 may reconstruct a three-dimensional shape limited to only the players and the ball, and generate a free viewpoint video by combining the obtained restoration data with the preliminary video.
  • the analysis unit ex112 may preferentially restore the three-dimensional shape with respect to the player and the ball.
  • the video information processing apparatus ex101 can efficiently restore the three-dimensional shape.
  • the analysis unit ex112 may perform the reconstruction of the three-dimensional shape by limiting to only the person and the moving object or giving priority to them.
  • the time of each device may be calibrated at the start of shooting based on the reference time of the server.
  • the analysis unit ex112 restores the three-dimensional shape using, among the plurality of pieces of video data captured by the plurality of cameras ex102, a plurality of pieces of video data captured at times that fall within a preset time range according to the accuracy of the time settings. For the detection of this time, for example, the time at which the captured video data was stored in the storage unit ex111 is used. The time detection method is not limited to this. As a result, the video information processing apparatus ex101 can efficiently restore the three-dimensional shape, thereby realizing a high frame rate and high image quality of the free viewpoint video.
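The time-based filtering described in this item can be illustrated with a short sketch: only video data whose capture (or storage) time falls within a preset window around a reference time is passed to the three-dimensional reconstruction. The window width, which would depend on the accuracy of each camera's clock, is an assumed value here.

```python
def frames_for_reconstruction(captured, reference_time, max_offset_s=0.05):
    """captured: list of (camera_id, capture_time_in_seconds).
    Keep only cameras whose frames fall within the allowed time window."""
    return [cam for cam, t in captured if abs(t - reference_time) <= max_offset_s]

captured = [("cam1", 100.00), ("cam2", 100.03), ("cam3", 100.20)]
print(frames_for_reconstruction(captured, reference_time=100.0))  # ['cam1', 'cam2']
```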
  • the analysis unit ex112 may restore the three-dimensional shape using only high-quality data, or using high-quality data preferentially, among the plurality of pieces of video data stored in the storage unit ex111.
  • the analysis unit ex112 may restore the three-dimensional shape using the camera attribute information.
  • the camera ex102 transmits the captured video data and camera attribute information to the video information processing apparatus ex101.
  • the camera attribute information is, for example, a shooting position, a shooting angle, a shooting time, or a zoom magnification.
  • the video information processing apparatus ex101 can efficiently restore the three-dimensional shape, it is possible to realize a high frame rate and high image quality of the free viewpoint video.
  • for example, three-dimensional coordinates are defined in the facility or the stadium, and the camera ex102 transmits, together with the video, information on the coordinates from which it shot, at what angle, with how much zoom, and at what time, to the video information processing apparatus ex101 as camera attribute information. When the camera ex102 is activated, the clock in the camera is synchronized with a clock on the communication network in the facility or stadium, and time information is generated.
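As an illustration of what such camera attribute information might look like when sent alongside the video, the sketch below serializes an assumed attribute record (position, angle, zoom, synchronized time stamp) to JSON; the field names are not taken from the embodiment and are only an example.

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class CameraAttributes:
    camera_id: str
    x: float          # position in the venue's three-dimensional coordinate system
    y: float
    z: float
    pan_deg: float    # shooting angle
    tilt_deg: float
    zoom: float       # zoom magnification
    timestamp: float  # time synchronized with the facility clock

attrs = CameraAttributes("ex102-17", x=12.4, y=-3.1, z=8.0,
                         pan_deg=215.0, tilt_deg=-12.0, zoom=2.5,
                         timestamp=time.time())
payload = json.dumps(asdict(attrs))   # sent alongside the captured video data
print(payload)
```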
  • FIG. 18 is a diagram illustrating an example of a notification displayed on the screen of the camera ex102 when the camera ex102 is activated.
  • by moving the camera ex102 in accordance with the notification, vector information from the camera ex102 to the advertisement is acquired and the reference for the camera position and angle is specified; the camera coordinates and angle at each subsequent time are then specified from the motion information of the camera ex102.
  • the display is not limited to this, and a display that uses an arrow or the like to indicate coordinates, an angle, a moving speed of the imaging region, or the like during the imaging period may be used.
  • the coordinates of the camera ex102 may be specified using radio waves of GPS, WiFi (registered trademark), 3G, LTE (Long Term Evolution), and 5G (wireless LAN), or by short-range wireless communication such as a beacon (Bluetooth (registered trademark), ultrasonic waves). Information on which base station in the facility or stadium the captured video data reached may also be used.
  • the system may be provided as an application that operates on a mobile terminal such as a smartphone.
  • An account such as various SNSs may be used to log in to the system.
  • An application-dedicated account or a guest account with limited functions may be used.
  • By using the account in this way it is possible to evaluate a favorite video or a favorite account.
  • the resolution of video data having a viewpoint similar to the viewpoint of the video data being shot or viewed can be increased. Thereby, the three-dimensional shape from these viewpoints can be restored with higher accuracy.
  • the user can select a preferred video in the application and follow the other party, so that the selected video can be viewed with higher priority than other users, or the user can have a connection with the other party, such as through text chat, subject to the other party's approval. In this way, a new community can be generated.
  • the user can edit an image or video taken by another person or create a new image or video by collaging the image of another person with his own image.
  • This makes it possible to share a new video work, such as sharing a new image or video only with people in the community.
  • a video work can be used for augmented reality games by inserting a CG character in this editing.
  • 3D model data can be sequentially output, so that a 3D printer or the like at the facility can output a three-dimensional object based on the 3D model data of a characteristic scene such as a goal scene.
  • an object based on the scene during the game can be sold as a souvenir such as a key holder, or distributed to participating users.
  • the center identifies areas with a high probability of crime based on a crime map based on the results of analysis using past crime data or the like, or holds area data related to the crime occurrence probability identified in this way.
  • the frequency of image transmission / reception may be increased, or the image may be changed to a moving image.
  • a moving image or three-dimensional reconstruction data using SfM or the like may be used.
  • the center or each terminal simultaneously corrects an image or virtual space using information from other sensors such as a depth sensor or a thermo sensor, so that the police officer can grasp the situation more accurately.
  • the center can feed back the object information to a plurality of terminals by using the 3D reconstruction data. This allows individuals with each terminal to track the object.
  • an in-vehicle camera that takes pictures outside the vehicle is obligatory in some countries. Even in such an in-vehicle camera, by using three-dimensional data modeled from a plurality of images, it is possible to more accurately grasp the weather in the direction of the destination, the state of the road surface, the degree of traffic congestion, and the like.
  • the storage medium may be any medium that can record a program, such as a magnetic disk, an optical disk, a magneto-optical disk, an IC card, and a semiconductor memory.
  • the system includes an apparatus using an image processing method.
  • Other configurations in the system can be appropriately changed according to circumstances.
  • FIG. 19 is a diagram showing an overall configuration of a content supply system ex200 that realizes a content distribution service.
  • the communication service providing area is divided into desired sizes, and base stations ex206, ex207, ex208, ex209, and ex210, which are fixed wireless stations, are installed in each cell.
  • This content supply system ex200 includes a computer ex211, a PDA (Personal Digital Assistant) ex212, a camera ex213, a smartphone ex214, a game machine ex215, etc. via the Internet ex201, the Internet service provider ex202, the communication network ex204, and the base stations ex206 to ex210. Are connected.
  • each device may be directly connected to a communication network ex204 such as a telephone line, cable television, or optical communication without going through the base stations ex206 to ex210 which are fixed wireless stations.
  • the devices may be directly connected to each other via short-range wireless or the like.
  • the camera ex213 is a device that can shoot a moving image such as a digital video camera
  • the camera ex216 is a device that can shoot a still image and a moving image such as a digital camera.
  • the smartphone ex214 is a smartphone compatible with the GSM (registered trademark) (Global System for Mobile Communications) system, the CDMA (Code Division Multiple Access) system, the W-CDMA (Wideband-Code Division Multiple Access) system, the HSPA (High Speed Packet Access) system, a communication method using a high frequency band, or the like, or a PHS (Personal Handyphone System).
  • the camera ex213 and the like are connected to the streaming server ex203 through the base station ex209 and the communication network ex204, thereby enabling live distribution and the like.
  • live distribution content (for example, music live video) that the user captures using the camera ex213 is encoded and transmitted to the streaming server ex203.
  • the streaming server ex203 performs stream distribution of the transmitted content data to a requesting client.
  • the client include a computer ex211, a PDA ex212, a camera ex213, a smartphone ex214, and a game machine ex215 that can decode the encoded data.
  • Each device that receives the distributed data decodes the received data and reproduces it.
  • the encoding processing of the shot data may be performed by the camera ex213 or by the streaming server ex203 that performs the data transmission processing, or may be shared between them.
  • similarly, the decoding processing of the distributed data may be performed by the client or by the streaming server ex203, or may be shared between them.
  • still images and / or moving image data captured by the camera ex216 may be transmitted to the streaming server ex203 via the computer ex211.
  • the encoding process in this case may be performed by any of the camera ex216, the computer ex211, and the streaming server ex203, or may be performed in a shared manner.
  • a plurality of devices connected to the system may be linked to display the same image, or the entire image may be displayed on a device having a large display unit while a part of the area is enlarged and displayed on the smartphone ex214 or the like.
  • these encoding / decoding processes are generally performed in the computer ex211 and the LSI ex500 included in each device.
  • the LSI ex500 may be configured as a single chip or a plurality of chips.
  • moving image encoding / decoding software may be incorporated into some kind of recording medium (such as a CD-ROM, flexible disk, or hard disk) that can be read by the computer ex211 or the like, and the encoding / decoding processing may be performed using that software.
  • moving image data acquired by the camera may be transmitted. The moving image data at this time is data encoded by the LSI ex500 included in the smartphone ex214.
  • the streaming server ex203 may be a plurality of servers or a plurality of computers, and may process, record, and distribute data in a distributed manner.
  • the client can receive and reproduce the encoded data.
  • the information transmitted by the user can be received, decoded, and reproduced by the client in real time, and even a user who does not have special rights or facilities can realize personal broadcasting.
  • multiplexed data obtained by multiplexing music data and the like with video data is transmitted to a communication or broadcasting satellite ex302 via radio waves.
  • This video data is data encoded by the moving image encoding method described in the above embodiments.
  • the broadcasting satellite ex302 transmits a radio wave for broadcasting, and this radio wave is received by a home antenna ex304 capable of receiving satellite broadcasting.
  • the received multiplexed data is decoded and reproduced by a device such as the television (receiver) ex400 or the set top box (STB) ex317.
  • the moving picture decoding apparatus or moving picture encoding apparatus described in each of the above embodiments can be mounted in a reader / recorder ex318 that reads and decodes multiplexed data recorded on a recording medium ex315 such as a DVD or BD or a memory ex316 such as an SD card, or that encodes a video signal on the recording medium ex315 or the memory ex316 and, in some cases, writes it multiplexed with a music signal.
  • the reproduced video signal is displayed on the monitor ex319, and the video signal can be reproduced in another device or system by the recording medium ex315 in which the multiplexed data is recorded or the memory ex316.
  • a moving picture decoding apparatus may be mounted in a set-top box ex317 connected to a cable ex303 for cable television or an antenna ex304 for satellite / terrestrial broadcasting, and this may be displayed on a monitor ex319 of the television.
  • the moving picture decoding apparatus may be incorporated in the television instead of the set top box.
  • FIG. 21 is a diagram showing the smartphone ex214.
  • FIG. 22 is a diagram illustrating a configuration example of the smartphone ex214.
  • the smartphone ex214 includes an antenna ex450 for transmitting and receiving radio waves to and from the base station ex210, a camera unit ex465 capable of shooting video and still images, and a display unit ex458 such as a liquid crystal display that displays data obtained by decoding the video shot by the camera unit ex465, the video received by the antenna ex450, and the like.
  • the smartphone ex214 further includes an operation unit ex466 such as a touch panel, an audio output unit ex457 such as a speaker for outputting audio, an audio input unit ex456 such as a microphone for inputting audio, a memory unit ex467 capable of storing shot video, still images, and recorded audio, as well as encoded data or decoded data of received video, still images, mail, and the like (or the memory ex316 described above), and a slot unit ex464 serving as an interface with a SIM ex468 for authenticating access to a network and various data.
  • in the smartphone ex214, the power supply circuit unit ex461, the operation input control unit ex462, the video signal processing unit ex455, the camera interface unit ex463, the LCD (Liquid Crystal Display) control unit ex459, the modulation / demodulation unit ex452, the multiplexing / demultiplexing unit ex453, the audio signal processing unit ex454, the slot unit ex464, and the memory unit ex467 are connected via a bus ex470 to the main control unit ex460 that comprehensively controls the display unit ex458, the operation unit ex466, and the like.
  • the power supply circuit unit ex461 starts up the smartphone ex214 in an operable state by supplying power from the battery pack to each unit.
  • the smartphone ex214 converts the audio signal collected by the audio input unit ex456 in the audio call mode into a digital audio signal by the audio signal processing unit ex454 based on the control of the main control unit ex460 having a CPU, a ROM, a RAM, and the like. This is subjected to spectrum spread processing by the modulation / demodulation unit ex452, and is subjected to digital analog conversion processing and frequency conversion processing by the transmission / reception unit ex451, and then transmitted via the antenna ex450.
  • the smartphone ex214 amplifies reception data received via the antenna ex450 in the voice call mode, performs frequency conversion processing and analog-digital conversion processing, performs spectrum despreading processing in the modulation / demodulation unit ex452, and performs voice signal processing unit ex454. After being converted into an analog audio signal, the audio output unit ex457 outputs it.
  • the text data of the e-mail input by the operation of the operation unit ex466 of the main unit is sent to the main control unit ex460 via the operation input control unit ex462.
  • the main control unit ex460 performs spread spectrum processing on the text data in the modulation / demodulation unit ex452, performs digital analog conversion processing and frequency conversion processing in the transmission / reception unit ex451, and then transmits the text data to the base station ex210 via the antenna ex450.
  • almost the reverse process is performed on the received data and output to the display unit ex458.
  • the video signal processing unit ex455 compresses the video signal supplied from the camera unit ex465 by the moving image encoding method described in each of the above embodiments.
  • the encoded video data is sent to the multiplexing / demultiplexing unit ex453.
  • the audio signal processing unit ex454 encodes the audio signal picked up by the audio input unit ex456 while the camera unit ex465 is shooting video, still images, and the like, and sends the encoded audio data to the multiplexing / demultiplexing unit ex453.
  • the multiplexing / demultiplexing unit ex453 multiplexes the encoded video data supplied from the video signal processing unit ex455 and the encoded audio data supplied from the audio signal processing unit ex454 by a predetermined method, and the resulting multiplexed data is subjected to spread spectrum processing by the modulation / demodulation unit (modulation / demodulation circuit unit) ex452, subjected to digital-to-analog conversion processing and frequency conversion processing by the transmission / reception unit ex451, and then transmitted via the antenna ex450.
  • the multiplexing / demultiplexing unit ex453 demultiplexes the multiplexed data into a video data bit stream and an audio data bit stream, and supplies the encoded video data to the video signal processing unit ex455 and the encoded audio data to the audio signal processing unit ex454 via the synchronization bus ex470.
  • the video signal processing unit ex455 decodes the video signal using a video decoding method corresponding to the video encoding method shown in each of the above embodiments, and, for example, video and still images included in a moving image file linked to a home page are displayed on the display unit ex458 via the LCD control unit ex459.
  • the audio signal processing unit ex454 decodes the audio signal, and the audio is output from the audio output unit ex457.
  • the terminal such as the smartphone ex214 can be implemented not only as a transmission / reception terminal having both an encoder and a decoder, but also as a transmission terminal having only an encoder or a reception terminal having only a decoder, similarly to the television ex400.
  • although it has been described that multiplexed data in which music data or the like is multiplexed with video data is received and transmitted, data in which character data related to the video is multiplexed in addition to the audio data may be used, or the video data itself may be used instead of multiplexed data.
  • the present invention can be applied to a video distribution system that distributes video shot by a plurality of cameras.

Abstract

An image delivery method, which is executed by a server (103) that delivers to a terminal apparatus (102) a plurality of images captured by a plurality of users at different viewpoints, comprises: a delivery step (S142) of delivering to the terminal apparatus (102) a first image that is one of the plurality of images and that has been requested from the terminal apparatus (102); a selection step (S143) of selecting, from the plurality of images, a second image that will be most probably requested from the terminal apparatus (102) next; and a transmission step (S144) of starting transmission of the second image to the terminal apparatus (102) while delivering the first image to the terminal apparatus (102).

Description

Video distribution method, video reception method, server, terminal device, and video distribution system
 The present invention relates to a video distribution method and the like for distributing video shot from a plurality of viewpoints.
 As a video distribution method, for example, the technique described in Patent Document 1 is known. In addition, a video distribution method for distributing videos shot from a plurality of viewpoints is known (see, for example, Patent Document 2). In such a video distribution method, a user can designate and view an arbitrary video from among a plurality of videos obtained by shooting a specific scene from different viewpoints.
Patent Document 1: JP 2009-206625 A
Patent Document 2: JP 2012-094990 A
 In such a video distribution method, it is desired that videos can be switched smoothly.
 Therefore, an object of the present invention is to provide a video distribution method or a video reception method capable of smoothly switching videos.
 In order to achieve the above object, a video distribution method according to an aspect of the present invention is a video distribution method performed by a server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, and includes: a distribution step of distributing, to the terminal device, a first video that is one of the plurality of videos and has been requested by the terminal device; a selection step of selecting, from the plurality of videos, a second video that is likely to be requested next by the terminal device; and a transmission step of starting transmission of the second video to the terminal device while the first video is being distributed to the terminal device.
 A video reception method according to an aspect of the present invention is a video reception method performed by a terminal device that receives, from a server, any one of a plurality of videos shot from a plurality of viewpoints and displays the video, and includes: a selection step of selecting a first video from the plurality of videos; a request step of requesting the server to transmit the first video; a first reception step of receiving the first video from the server; a display step of displaying the first video; and a second reception step of starting, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
 Note that these general or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or as any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.
 The present invention can provide a video distribution method or a video reception method capable of smoothly switching videos.
FIG. 1 is a diagram showing the configuration of a video distribution system according to Embodiment 1.
FIG. 2 is a block diagram of a server according to Embodiment 1.
FIG. 3 is a block diagram of a terminal device according to Embodiment 1.
FIG. 4 is a diagram showing processing of the video distribution system according to Embodiment 1.
FIG. 5 is a diagram showing an example of an initial screen according to Embodiment 1.
FIGS. 6 to 11 are diagrams each showing an example of related video selection processing according to Embodiment 1.
FIG. 12 is a diagram showing an example of a display screen according to Embodiment 1.
FIG. 13 is a flowchart of processing performed by the terminal device according to Embodiment 1.
FIGS. 14 and 15 are flowcharts of modifications of the processing performed by the terminal device according to Embodiment 1.
FIG. 16 is a flowchart of processing performed by the server according to Embodiment 1.
FIG. 17 is a diagram showing the configuration of a video information processing system.
FIG. 18 is a diagram showing an example of a notification screen displayed when a camera is activated.
FIG. 19 is an overall configuration diagram of a content supply system that realizes a content distribution service.
FIG. 20 is an overall configuration diagram of a digital broadcasting system.
FIG. 21 is a diagram showing an example of a smartphone.
FIG. 22 is a block diagram showing a configuration example of a smartphone.
 (Knowledge that became the basis of the present invention)
 When a plurality of videos from different viewpoints are distributed, the user selects a video to be viewed, and the selected video is distributed from the server to the terminal device. As a result, a waiting time may occur from when the user selects a video until the video is displayed. Since the communication band is limited, it is difficult to transmit all the videos to the terminal device in advance.
 On the other hand, Patent Document 1 discloses a method of sending a large image that includes the area around the viewed image. Patent Document 2 discloses a method of distributing, as a group video, the viewpoint videos around the displayed viewpoint video among a plurality of videos from different viewpoints.
 However, when videos shot from arbitrary viewpoints by a plurality of users are distributed, it is difficult to achieve seamless display with the above techniques. Specifically, in the above techniques, the shooting viewpoints are determined in advance, so a group video or the like can be determined in advance. On the other hand, in videos shot arbitrarily by a plurality of users, shooting conditions such as the viewpoint, image quality, and zoom level are set arbitrarily. In such a case, it is difficult to seamlessly display video from a viewpoint preferred by the user.
 A video distribution method according to an aspect of the present invention is a video distribution method performed by a server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, and includes: a distribution step of distributing, to the terminal device, a first video that is one of the plurality of videos and has been requested by the terminal device; a selection step of selecting, from the plurality of videos, a second video that is likely to be requested next by the terminal device; and a transmission step of starting transmission of the second video to the terminal device while the first video is being distributed to the terminal device.
 According to this, the second video is sent to the terminal device in advance while the first video is being displayed. This allows the terminal device to switch smoothly from the first video to the second video.
 For example, in the selection step, a video having a high degree of relatedness to the first video is selected as the second video from the plurality of videos.
 According to this, when a second video having a high degree of relatedness to the first video currently being displayed is selected, the terminal device can switch videos smoothly.
 For example, in the selection step, the degree of relatedness is determined to be higher as the position of the shooting scene is closer to the position of the shooting scene of the first video.
 For example, in the selection step, the degree of relatedness is further determined to be higher as the size of the shooting scene is closer to the size of the shooting scene of the first video.
 For example, in the selection step, a high degree of relatedness is set for a video in which the same subject as a subject included in the first video is shot.
 For example, in the selection step, the second video is selected based on the frame rates, resolutions, or bit rates of the plurality of videos.
 For example, in the selection step, a video that has been selected many times by other users is selected as the second video from among the plurality of videos.
 For example, in the selection step, the second video is selected based on the user's viewing history or pre-registered preference information.
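The selection step can be illustrated with a small sketch that ranks candidate videos by the factors mentioned above (proximity of the shooting scene, shared subjects, quality, popularity among other users, and match with the viewer's preferences). The scoring formula, field names, and weights are assumptions made for this example and are not part of the claimed method.

```python
import math

def relatedness(current, candidate, popularity, preference_match):
    """Toy relevance score for choosing the second video while the first
    (`current`) is being delivered. All weights are illustrative assumptions."""
    scene_dist = math.hypot(candidate["scene_x"] - current["scene_x"],
                            candidate["scene_y"] - current["scene_y"])
    shared_subjects = len(set(current["subjects"]) & set(candidate["subjects"]))
    return (2.0 * shared_subjects
            + 0.5 * candidate["quality"]   # frame rate / resolution / bit rate folded into one number
            + 0.3 * popularity             # how often other users switched to this video
            + 0.5 * preference_match       # match with viewing history / registered preferences
            - 1.0 * scene_dist)            # closer shooting scene -> higher relatedness

def select_second_video(current, candidates, stats):
    return max(candidates,
               key=lambda c: relatedness(current, c,
                                         stats[c["id"]]["popularity"],
                                         stats[c["id"]]["preference_match"]))

current = {"id": "v1", "scene_x": 0.0, "scene_y": 0.0, "subjects": ["player7"], "quality": 0.8}
candidates = [
    {"id": "v2", "scene_x": 1.0, "scene_y": 0.5, "subjects": ["player7"], "quality": 0.9},
    {"id": "v3", "scene_x": 20.0, "scene_y": 5.0, "subjects": ["player10"], "quality": 0.9},
]
stats = {"v2": {"popularity": 0.6, "preference_match": 0.5},
         "v3": {"popularity": 0.9, "preference_match": 0.1}}
print(select_second_video(current, candidates, stats)["id"])  # -> "v2"
```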
 A video reception method according to an aspect of the present invention is a video reception method performed by a terminal device that receives, from a server, any one of a plurality of videos shot from a plurality of viewpoints and displays the video, and includes: a selection step of selecting a first video from the plurality of videos; a request step of requesting the server to transmit the first video; a first reception step of receiving the first video from the server; a display step of displaying the first video; and a second reception step of starting, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
 According to this, the terminal device receives the second video in advance while displaying the first video. This allows the terminal device to switch smoothly from the first video to the second video.
 For example, the video reception method further includes a step of storing the received second video, and a step of displaying the stored second video when the second video is selected while the first video is being displayed.
 For example, the video reception method further includes a step of receiving a third video from the server when a third video different from the first video and the second video is selected while the first video is being displayed, and a step of displaying the stored second video until the third video is received.
 According to this, the terminal device can display the second video during the waiting time for switching from the first video to another video.
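A minimal sketch of this terminal-side behaviour is shown below: the second video is prefetched and buffered while the first is displayed, switching to it is immediate, and it also serves as a stopgap while an un-prefetched third video is being fetched. The classes and the network layer are placeholders, not the actual terminal implementation.

```python
class FakeServer:
    """Stand-in for the server; request() returns a dummy stream handle."""
    def request(self, video_id):
        return f"stream:{video_id}"

class VideoReceiver:
    """Terminal-side receiver that prefetches the predicted second video."""
    def __init__(self, server):
        self.server = server
        self.buffered = {}                      # video_id -> prefetched stream

    def watch(self, first_id, predicted_second_id):
        stream = self.server.request(first_id)  # first video: shown immediately
        # Prefetch the video most likely to be requested next (second video).
        self.buffered[predicted_second_id] = self.server.request(predicted_second_id)
        return stream

    def switch_to(self, video_id, display):
        if video_id in self.buffered:           # predicted switch: no waiting time
            display(self.buffered.pop(video_id))
            return
        # Unpredicted (third) video: bridge the wait with the buffered second video.
        stopgap = next(iter(self.buffered.values()), None)
        if stopgap is not None:
            display(stopgap)
        display(self.server.request(video_id))  # then show the newly received video

receiver = VideoReceiver(FakeServer())
print(receiver.watch("v1", predicted_second_id="v2"))  # stream:v1 is displayed
receiver.switch_to("v3", display=print)                # shows stream:v2, then stream:v3
```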
 For example, in the display step, an image overlooking the place where the plurality of videos are being shot and including a plurality of icons indicating the positions of the plurality of viewpoints is further displayed.
 For example, in the display step, among the plurality of icons, the icon indicating the position of the viewpoint of the second video is highlighted.
 This makes it easier for the user to select the second video.
 A server according to an aspect of the present invention is a server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, and includes: a distribution unit that distributes, to the terminal device, a first video that is one of the plurality of videos and has been designated by the terminal device; a selection unit that selects, from the plurality of videos, a second video that is likely to be requested next by the terminal device; and a transmission unit that starts transmission of the second video to the terminal device while the first video is being distributed to the terminal device.
 According to this, the second video is sent to the terminal device in advance while the first video is being displayed. This allows the terminal device to switch smoothly from the first video to the second video.
 A terminal device according to an aspect of the present invention is a terminal device that receives, from a server, any one of a plurality of videos shot from a plurality of viewpoints and displays the video, and includes: a selection unit that selects a first video from the plurality of videos; a request unit that requests the server to transmit the first video; a first reception unit that receives the first video from the server; a display unit that displays the first video; and a second reception unit that starts, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
 According to this, the terminal device receives the second video in advance while displaying the first video. This allows the terminal device to switch smoothly from the first video to the second video.
 A video distribution system according to an aspect of the present invention includes the server and the terminal device.
 According to this, the second video is sent to the terminal device in advance while the first video is being displayed. This allows the terminal device to switch smoothly from the first video to the second video.
 Note that these comprehensive or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or as any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.
 Hereinafter, embodiments will be described in detail with reference to the drawings. Note that each of the embodiments described below shows a specific example of the present invention. The numerical values, shapes, materials, constituent elements, arrangement positions and connection forms of the constituent elements, steps, order of steps, and the like shown in the following embodiments are merely examples and are not intended to limit the present invention. Among the constituent elements in the following embodiments, constituent elements not recited in the independent claims indicating the highest concepts are described as optional constituent elements.
(Embodiment 1)
In the video distribution system according to the present embodiment, some of a plurality of videos are transmitted to the terminal device in advance. This allows the video to be switched seamlessly when one of those videos is selected next.
First, the configuration of the video distribution system according to the present embodiment will be described. FIG. 1 is a block diagram showing the configuration of a video distribution system 100 according to the present embodiment. The video distribution system 100 includes a plurality of cameras 101, a terminal device 102, and a server 103, each of which can communicate via a network 104.
The plurality of cameras 101 generate a plurality of video signals by shooting the same scene from different viewpoints during the same time period. Each camera 101 is carried by one of a plurality of users. For example, the plurality of cameras 101 are owned by a plurality of spectators in a place such as a sports stadium. The plurality of video signals shot by the plurality of cameras 101 are transmitted to the server 103 via the network 104. Each video signal also includes information indicating the shooting viewpoint (camera position), the camera orientation, the magnification, and the like.
The camera 101 only needs to be a device having at least a shooting function, and is, for example, a digital still camera, a digital video camera, a smartphone, or a mobile terminal.
The terminal device 102 is a terminal used by a user and has at least a function of displaying video. For example, the terminal device 102 is a smartphone, a mobile terminal, or a personal computer. The terminal device 102 may have the same functions as the camera 101, the user may be one of the spectators, or the user may view the video from a place other than the stadium.
The server 103 holds the plurality of video signals transmitted from the plurality of cameras 101. In accordance with a request from the terminal device 102, the server 103 transmits part of the held video signals to the terminal device 102. The server 103 also analyzes the contents of the plurality of video signals and calculates the degree of relevance between the video signals based on the obtained video characteristics. Furthermore, in addition to the selected video signal designated by the terminal device 102, the server 103 transmits to the terminal device 102 a related video signal having a high degree of relevance to the selected video signal.
In the following, an example is described in which a plurality of video signals are transmitted from the plurality of cameras 101 in real time and the user views them in real time using the terminal device 102; however, at least one of the transmission and the viewing of the video need not be performed in real time. The transmission and reception of a video signal (video) described below mainly mean stream transmission and reception in which the video signal is transmitted or received continuously.
The configuration of each device will be described below. FIG. 2 is a block diagram showing the configuration of the server 103. The server 103 includes a reception unit 111, a video storage unit 112, a control unit 113, and a transmission unit 114.
The reception unit 111 receives a plurality of video signals 151 in which the same scene is shot from different viewpoints by the plurality of cameras 101. The reception unit 111 also receives a viewpoint designation signal 152 transmitted from the terminal device 102. The viewpoint designation signal 152 designates one of the plurality of video signals 151.
The video storage unit 112 stores the plurality of video signals 151 received by the reception unit 111.
The control unit 113 selects, as a selected video signal 153, the video signal 151 designated by the viewpoint designation signal 152 from among the plurality of video signals 151 stored in the video storage unit 112, and transmits the selected video signal 153 to the terminal device 102 via the transmission unit 114. The control unit 113 also selects, from the plurality of video signals 151 stored in the video storage unit 112, a related video signal 154 related to the selected video signal 153, and transmits the related video signal 154 to the terminal device 102 via the transmission unit 114.
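As a rough illustration of this control flow, the following Python sketch shows how a server of this kind might pick the designated video and one related video and return both stream identifiers. The class and method names (VideoServer, handle_viewpoint_designation) are hypothetical and not taken from this disclosure; the relevance scoring itself is treated as a black-box function here, and one possible combination of criteria is sketched later in this embodiment.

    # Minimal sketch of the server-side control flow (hypothetical names):
    # pick the designated video, pick the candidate with the highest relevance
    # score as the related video, and stream both to the terminal device.

    from typing import Callable, Dict, List


    class VideoServer:
        def __init__(self, videos: Dict[str, dict], relevance_fn: Callable[[dict, dict], float]):
            # videos maps a video id to its metadata (viewpoint, direction, zoom, ...)
            self.videos = videos
            self.relevance_fn = relevance_fn

        def handle_viewpoint_designation(self, selected_id: str) -> List[str]:
            """Return the ids of the streams to send: the selected video plus one related video."""
            selected = self.videos[selected_id]
            # Score every other video against the selected one and keep the best match.
            related_id = max(
                (vid for vid in self.videos if vid != selected_id),
                key=lambda vid: self.relevance_fn(selected, self.videos[vid]),
                default=None,
            )
            streams = [selected_id]
            if related_id is not None:
                streams.append(related_id)
            return streams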
FIG. 3 is a block diagram of the terminal device 102. The terminal device 102 includes a reception unit 121, a storage unit 122, a decoding unit 123, an output unit 124, a transmission unit 125, a control unit 126, and an input unit 127.
The reception unit 121 receives the selected video signal 153 and the related video signal 154 transmitted from the server 103. The storage unit 122 temporarily holds the selected video signal 153 and the related video signal 154 received by the reception unit 121.
The decoding unit 123 generates decoded video by decoding the selected video signal 153. The output unit 124 generates an output video 155 including the decoded video and displays the output video 155 on a display device, for example a display included in the terminal device 102.
The input unit 127 accepts user operations. For example, the input unit 127 accepts a user operation on a touch panel included in the terminal device 102. When the input unit 127 accepts a viewpoint change operation by the user, the control unit 126 transmits a viewpoint designation signal 152 indicating the viewpoint to change to, to the server 103 via the transmission unit 125.
Next, the operation of the video distribution system 100 will be described. FIG. 4 is a sequence diagram of video distribution processing in the video distribution system 100. In FIG. 4, the plurality of video signals 151 are already held in the server 103. Note that the plurality of video signals 151 may be videos being updated in real time from the plurality of cameras 101, as in the stadium example in which the user is a spectator in the stadium, or may be past videos held in the server 103 in advance.
First, the terminal device 102 starts an application program (app), for example in accordance with a user operation (S101). Next, the terminal device 102 displays an initial screen (S102). Specifically, the terminal device 102 receives from the server 103, as initial information, information indicating the positions (viewpoint positions) of the plurality of cameras 101 when the plurality of video signals 151 were shot, and displays the information indicating the camera positions as the initial screen.
FIG. 5 is a diagram showing an example of this initial screen. As the background image 201, an image overlooking the place where the plurality of videos are shot is used. Camera icons 202, each indicating the viewpoint position of a viewable video, that is, the position of the camera 101 that shot the video, are displayed on the background image 201.
Note that thumbnails may be displayed instead of or in addition to the camera icons 202. Furthermore, thumbnails may be displayed instead of the camera icons 202 when the initial screen is enlarged.
When the number of videos is large, only the camera icons 202 or thumbnails of videos with a high recommendation level for the user may be displayed, based on the degree of relevance described later or the like. When thumbnails are displayed, the thumbnails may be displayed larger than the camera icons 202.
When the number of videos is large, highly relevant videos may be grouped, and a camera icon 202 may be displayed for each group or for a representative video of each group. The representative video is determined based on, for example, video characteristics (resolution, frame rate, bit rate, or the like). For example, the video with the highest resolution, the video with the highest frame rate, or the video with the highest bit rate is determined as the representative video.
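For example, choosing the representative video of each group by the highest resolution, frame rate, or bit rate could look like the short sketch below; the metadata keys used here (resolution, frame_rate, bit_rate) are assumptions made only for illustration.

    # Choose a representative video per group of related videos, preferring higher
    # resolution, then frame rate, then bit rate (assumed metadata keys).

    def pick_representative(group):
        # group is a list of dicts such as
        # {"id": "cam3", "resolution": 1920 * 1080, "frame_rate": 60, "bit_rate": 8_000_000}
        return max(group, key=lambda v: (v["resolution"], v["frame_rate"], v["bit_rate"]))


    group = [
        {"id": "cam1", "resolution": 1280 * 720, "frame_rate": 30, "bit_rate": 4_000_000},
        {"id": "cam2", "resolution": 1920 * 1080, "frame_rate": 30, "bit_rate": 6_000_000},
    ]
    print(pick_representative(group)["id"])  # -> cam2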
Information indicating the related contents of each group may be displayed together with the camera icon 202. Instead of the camera icon 202, a thumbnail of the representative video of each group or a reduced version of the video may be displayed.
Here, the representative video of each group is highly likely to be clicked. The terminal device 102 may therefore receive the representative videos from the server 103 in advance. That is, the terminal device 102 may receive all the representative videos of the groups when the initial screen is displayed. Alternatively, when a certain group or representative video is selected, the terminal device 102 may receive part or all of the videos included in that group from the server 103.
The terminal device 102 may also set to a selectable state only the camera icons 202 corresponding to videos for which some time has passed since reception started and sufficient data has been accumulated in the storage unit 122.
The terminal device 102 may also select the viewpoints to be displayed so that the number of displayed camera icons 202 remains constant even when the screen is enlarged or reduced in accordance with a user operation.
The background image 201 of the initial screen may be switched depending on the user's current position. For example, when the user is in the infield stands of the stadium, an image of the scenery seen from the infield stands is set as the background image 201, and when the user is in the outfield stands, an image of the scenery seen from the outfield stands is set as the background image 201.
The camera icons 202 displayed on the initial screen may also be switched in accordance with the background image 201, and the camera icons 202 may be switched depending on the position of the user. For example, when the user is in the infield stands, an image of the scenery seen from the infield stands may be set as the background image 201, and camera icons 202 indicating the shooting viewpoints existing within that scenery may be displayed.
At this time, the videos received in advance may be switched depending on the position of the user. For example, when the user is in the infield stands, the terminal device 102 may receive in advance videos shot from the outfield stands.
The initial screen, or the videos received in advance, may also be switched depending on the viewing status of all users or some of the users. For example, videos currently being watched by many users, or videos watched many times in the past, may be received preferentially.
The description continues with reference to FIG. 4 again. When the user selects one of the camera icons 202 on the initial screen (S103), the terminal device 102 transmits a viewpoint designation signal 152 indicating the selected viewpoint to the server 103 (S104).
Having received the viewpoint designation signal 152, the server 103 starts transmitting to the terminal device 102 the selected video signal 153 designated by the viewpoint designation signal 152 (S105). Having received the selected video signal 153, the terminal device 102 decodes the selected video signal 153 and starts displaying the obtained video (S106).
Having received the viewpoint designation signal 152, the server 103 also selects a related video signal 154 related to the selected video signal 153 (S107) and starts transmitting the related video signal 154 to the terminal device 102 (S108). Here, the selection of the related video signal 154 (S107) is performed after the transmission of the selected video signal 153 is started (S105); however, these processes may be performed in any order, and some of them may be performed in parallel.
The related video selection process (S107) will now be described. The server 103 uses at least one of the following methods as the related video selection process. In each of the following methods, a degree of relevance is set for each video, and the video with the highest final degree of relevance is selected as the related video. A plurality of videos may also be selected as related videos in descending order of priority.
FIGS. 6 to 11 are flowcharts of this selection process.
In the example shown in FIG. 6, the server 103 calculates the position of the shooting scene of the selected video (the area shown in the video) (S151), and increases the degree of relevance of videos whose shooting scenes are close to the position of the shooting scene of the selected video (S152). Specifically, the server 103 calculates the position of the shooting scene of each video using information included in the video signal 151 transmitted from the camera 101. More specifically, the video signal 151 includes information such as the viewpoint position at which the video was shot, the direction of the camera 101, and the zoom magnification. Using this information, the server 103 calculates the position of the shooting scene that the camera 101 is shooting.
Note that the server 103 may calculate the position of the shooting scene of each video in advance, and may do so at any timing after the video signal 151 is received.
In addition to the position of the shooting scene, the server 103 may also increase the degree of relevance of videos whose shooting scene size is close to the shooting scene size of the selected video.
The server 103 need not increase the degree of relevance for videos whose shooting scene position is extremely close to (substantially the same as) the position of the shooting scene of the selected video.
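As one possible reading of the FIG. 6 method and the notes above, the sketch below estimates each camera's shooting-scene center from its viewpoint position, direction, and zoom magnification, and raises the relevance of videos whose scene centers lie close to, but not essentially on top of, that of the selected video. The geometry (a fixed base distance scaled by the zoom) and the thresholds are illustrative assumptions only, not values taken from this disclosure.

    import math

    # Estimate the scene center a camera is looking at from its viewpoint position,
    # horizontal direction (radians), and zoom magnification, then raise relevance
    # for cameras whose scene centers are close to the selected camera's.

    BASE_DISTANCE = 50.0   # assumed subject distance (metres) at zoom 1x
    NEAR_THRESHOLD = 15.0  # scenes closer than this are considered "close"
    SAME_THRESHOLD = 2.0   # scenes closer than this are treated as essentially identical


    def scene_center(viewpoint, direction_rad, zoom):
        """Project a point along the viewing direction as a crude scene-center estimate."""
        x, y = viewpoint
        d = BASE_DISTANCE * zoom  # crude stand-in: higher zoom assumed to target farther scenes
        return (x + d * math.cos(direction_rad), y + d * math.sin(direction_rad))


    def scene_relevance_boost(selected_cam, candidate_cam):
        """Return a relevance increment based on how close the two shooting scenes are."""
        sel = scene_center(**selected_cam)
        cand = scene_center(**candidate_cam)
        dist = math.dist(sel, cand)
        if dist < SAME_THRESHOLD:
            return 0.0  # essentially the same scene: no boost, per the note above
        if dist < NEAR_THRESHOLD:
            return 1.0 - dist / NEAR_THRESHOLD  # closer scenes get a larger boost
        return 0.0


    selected = {"viewpoint": (0.0, 0.0), "direction_rad": 0.0, "zoom": 1.0}
    candidate = {"viewpoint": (10.0, 5.0), "direction_rad": math.pi, "zoom": 1.0}
    print(scene_relevance_boost(selected, candidate))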
In the example shown in FIG. 7, the server 103 identifies a subject (for example, a player) in the selected video (S161) and increases the degree of relevance of videos in which the same subject as that in the selected video appears (S162). For example, the camera 101 identifies the subject in the video by image analysis (face recognition or the like) and transmits to the server 103 a video signal 151 including information indicating the subject. The server 103 uses this information to determine the subject in each video. The image analysis may instead be performed by the server 103. The subject is not limited to a specific person and may be, for example, a specific team.
In this way, the server 103 calculates the degree of relevance using information generated by the camera 101 or the server 103 from at least one of the video shot by the camera 101 and information acquired by a sensor attached to the camera.
In the example shown in FIG. 8, the server 103 acquires the popularity of the plurality of videos (S171) and increases the degree of relevance of videos with high popularity (S172). Here, the popularity indicates, for example, the number of times a video has been viewed within a certain period, now or in the past, or the number of users who have viewed the video. This popularity is, for example, calculated successively by the server 103 based on the viewing status of a plurality of users.
In the example shown in FIG. 9, the server 103 acquires the user's preference information (S181) and increases the degree of relevance of videos that match the user's preference (S182). Here, the preference information is the user's viewing history, or registration information registered in advance that indicates the user's preferences or interests. For example, when the user has viewed many videos showing a specific player or team in the past, the server 103 increases the degree of relevance of videos showing that player or team. When the registration information indicates a player or team that the user supports, the server 103 increases the degree of relevance of videos showing that player or team.
In the example shown in FIG. 10, the server 103 acquires communication band information indicating the communication band available to the terminal device 102 (S191), and changes the degree of relevance in accordance with that communication band (S192). Specifically, the server 103 increases the degree of relevance of videos whose bit rate, frame rate, or resolution is suited to the communication band available to the terminal device 102. For example, when the communication band available to the terminal device 102 is sufficiently wide, the server 103 increases the degree of relevance of videos with a high bit rate, frame rate, or resolution.
The server 103 may generate video signals of a plurality of bit rates by converting the resolution or frame rate of the video signals 151 transmitted from the cameras 101, and may store these video signals.
When the bandwidth available to the terminal device 102 changes during viewing, the selected video or the related video may be switched in accordance with the available bandwidth.
In the example shown in FIG. 11, the server 103 acquires communication band information indicating the communication band available to the terminal device 102 (S191), and determines the number of related videos in accordance with that communication band (S193). Specifically, the server 103 increases the number of related videos as the communication band becomes wider.
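A minimal sketch of the ideas of FIG. 10 and FIG. 11 might look as follows: the available bandwidth caps both how many related videos are pre-sent and which bit-rate version is considered suitable. The numeric limits and the halving of the budget are assumptions made only for illustration.

    # Decide how many related videos to pre-send and which bit rate suits the
    # terminal's available bandwidth (all numeric limits are illustrative assumptions).

    def related_video_count(bandwidth_bps):
        if bandwidth_bps >= 20_000_000:
            return 3
        if bandwidth_bps >= 10_000_000:
            return 2
        if bandwidth_bps >= 5_000_000:
            return 1
        return 0


    def suitable_bitrate(bandwidth_bps, available_bitrates):
        # Pick the highest bit rate that still leaves headroom for the selected video.
        budget = bandwidth_bps // 2
        fitting = [b for b in available_bitrates if b <= budget]
        return max(fitting) if fitting else min(available_bitrates)


    print(related_video_count(12_000_000))                                   # -> 2
    print(suitable_bitrate(12_000_000, [2_000_000, 4_000_000, 8_000_000]))   # -> 4000000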
In this way, the server 103 selects, as the related video (second video), a video having a high degree of relevance to the selected video (first video) from among the plurality of videos. Specifically, the server 103 determines that the degree of relevance is higher as the position of the shooting scene is closer to the position of the shooting scene of the selected video. The server 103 also determines that the degree of relevance is higher as the size of the shooting scene is closer to the size of the shooting scene of the selected video. The server 103 also sets a high degree of relevance for videos in which the same subject as that included in the selected video is shot.
The server 103 also selects the related video based on the frame rate, resolution, or bit rate of the plurality of videos. The server 103 also selects, as the related video, a video that has been selected many times by other users among the plurality of videos. The server 103 also selects the related video based on the user's viewing history or preference information registered in advance.
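Putting the individual criteria together, a server of this kind might combine the factors described above (scene proximity, shared subject, popularity, user preference, bandwidth suitability) into a single score and pick the top-scoring videos as related videos, as in the sketch below. The weights, metadata fields, and helper predicates are hypothetical; this disclosure leaves the exact combination open.

    # Combine several relevance criteria into one score and pick the top-k related
    # videos. The weights and the metadata fields are illustrative assumptions.

    WEIGHTS = {"scene": 3.0, "subject": 2.0, "popularity": 1.0, "preference": 2.0, "bitrate": 1.0}


    def relevance(selected, candidate, user_favorites, budget_bps):
        score = 0.0
        score += WEIGHTS["scene"] * candidate.get("scene_proximity", 0.0)        # e.g. from scene_relevance_boost()
        if selected.get("subjects", set()) & candidate.get("subjects", set()):   # same player or team appears
            score += WEIGHTS["subject"]
        score += WEIGHTS["popularity"] * candidate.get("viewers", 0) / 100.0     # current or past viewer count
        if user_favorites & candidate.get("subjects", set()):                    # matches history or registration info
            score += WEIGHTS["preference"]
        if candidate.get("bit_rate", 0) <= budget_bps:                           # fits the terminal's bandwidth
            score += WEIGHTS["bitrate"]
        return score


    def select_related(selected, candidates, user_favorites, budget_bps, k):
        ranked = sorted(candidates,
                        key=lambda c: relevance(selected, c, user_favorites, budget_bps),
                        reverse=True)
        return ranked[:k]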
FIG. 12 is a diagram showing an example of the display screen after a video has been selected. As shown in FIG. 12, a selected video 211, an overhead image 212, a top-view image 213, and operation buttons 214 to 216 are displayed on the display screen.
The overhead image 212 is an image overlooking the shooting scene and includes camera icons 202. This overhead image 212 is the same as the image displayed on the initial screen. The top-view image 213 is a view of the entire shooting scene from above and includes camera icons 202.
The operation buttons 214 to 216 are buttons for user operations. When the operation button 214 is selected, the display returns to the initial screen. When the operation button 215 or 216 is operated, the displayed video is switched to another viewpoint video. At this time, a video having a high degree of relevance to the selected video is preferentially chosen.
For example, when the operation button 215 is operated, the displayed video is switched to the video whose shooting scene position is closest to the shooting scene position of the selected video.
When the operation button 216 is operated, the displayed video is switched to the video with the highest recommendation level. As a result, even if a video the user once selected and watched does not suit the user's preference, the user can, with a simple operation, switch the displayed video to the video that best lets the user enjoy the game at that moment and watch it.
When the user selects a camera icon 202 included in the overhead image 212 or the top-view image 213, the displayed video is switched to the video corresponding to the selected camera icon 202.
The arrangement of the images and operation buttons shown in FIG. 12 is an example and is not limiting. It is not necessary to display all of the plurality of images and operation buttons; only some of them may be displayed.
In the present embodiment, the display of the camera icons 202 is changed in accordance with the degree of relevance to the selected video. For example, a camera icon 202 corresponding to a video having a high degree of relevance to the selected video is highlighted. Only the camera icons 202 corresponding to videos having a high degree of relevance to the selected video may be displayed among the plurality of videos. The display method of the camera icons 202 may be changed continuously or in steps in accordance with the degree of relevance. Information indicating the degree of relevance may be displayed near the camera icons 202.
As another embodiment, a sensor may be incorporated in the ball, and how the ball flew may be determined based on information detected by the sensor. The trajectory of the ball may then be superimposed on the overhead image 212 or the top-view image 213.
Furthermore, when a camera icon 202 exists at the destination to which the ball flew, the terminal device 102 may receive in advance from the server 103 the video signal of a viewpoint position close to the position of the ball.
That is, the system may obtain the flow of the game or the like from some means (a sensor in the ball or the like), estimate in advance, based on that information, the camera icon 202 that the user is likely to want to see, and the terminal device 102 may receive the estimated video in advance.
The server 103 may also set priorities for the plurality of videos based on the situation at the scene, such as the flow of the game, or on the position of the user or the like.
The description continues with reference to FIG. 4 again. On the display screen shown in FIG. 12, a viewpoint switching operation is performed (S109). Here, it is assumed that the related video is selected. In this case, since the terminal device 102 has already received the related video signal 154, the terminal device 102 decodes the related video signal 154 and displays the related video (S110). In this way, by receiving in advance the related video that is likely to be selected next, the terminal device 102 can switch videos seamlessly.
The terminal device 102 also transmits to the server 103 a viewpoint designation signal 152 indicating the selected viewpoint (S111). Having received the viewpoint designation signal 152, the server 103 transmits to the terminal device 102 the selected video signal 153 designated by the viewpoint designation signal 152. That is, the server 103 continues the previous transmission of the related video signal 154 as the transmission of the selected video signal 153 (S112). The server 103 also selects a related video signal 154 related to the new selected video signal 153 (S113) and starts transmitting that related video signal 154 to the terminal device 102 (S114).
The video display (S110) and the transmission of the viewpoint designation signal 152 (S111) may be performed in any order, and parts of them may be performed in parallel.
Next, the flow of the operation of the terminal device 102 will be described. FIG. 13 is a flowchart showing the flow of the operation of the terminal device 102. FIG. 13 shows the processing of the terminal device 102 in a state in which a certain viewpoint video is being displayed.
The terminal device 102 determines whether viewpoint switching has been instructed by a user operation (S121). When viewpoint switching has been instructed (Yes in S121), the terminal device 102 transmits a viewpoint designation signal 152 to the server 103 (S122).
The terminal device 102 also determines whether the selected video to switch to is the related video (S123). When the selected video is not the related video (No in S123), the terminal device 102 waits until it receives the selected video transmitted by the server 103 in response to the viewpoint designation signal 152 (S124), and when it receives the selected video (Yes in S124), it displays the selected video (S125).
On the other hand, when the selected video is the related video (Yes in S123), the terminal device 102 displays the already stored related video as the selected video (S125).
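The branch from S123 to S125 could be sketched as follows: if the newly chosen viewpoint has already been received as a related video, it is played straight from the local buffer, and otherwise it is requested from the server and awaited. The buffer structure, the stub server object, and the blocking receive call are assumptions made for illustration only.

    # Sketch of the terminal's viewpoint-switch handling (S121-S125): play a
    # pre-received related video from the local buffer, otherwise request it from
    # the server and wait for it. All names here are illustrative assumptions.

    def on_viewpoint_switch(new_id, buffer, server):
        server.send_viewpoint_designation(new_id)        # S122: tell the server the new viewpoint
        if new_id in buffer:                              # S123: already pre-received?
            return buffer[new_id]                         # S125: seamless switch from the buffer
        video = server.receive_selected_video(new_id)     # S124: wait for the stream
        return video                                      # S125: display once it arrives


    class StubServer:
        """Stand-in for the real server connection, for illustration only."""
        def send_viewpoint_designation(self, video_id):
            pass

        def receive_selected_video(self, video_id):
            return f"stream:{video_id}"


    buffer = {"cam7": "stream:cam7"}   # related video received in advance
    print(on_viewpoint_switch("cam7", buffer, StubServer()))  # served from the buffer
    print(on_viewpoint_switch("cam9", buffer, StubServer()))  # fetched from the server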
Here, when the system displays live video, the switching of the decoded video may be performed at the time when decoding of a random access frame is completed. In that case, a waiting time occurs from the time when the user's viewpoint switching instruction is issued until the switching time; during this waiting time, the terminal device 102 may continue to play the video from before the switch, or may display a waiting screen.
When the system displays a highlight video instead of live video, the terminal device 102 may search for the random access point closest to the playback time of the video before switching, and decode and display the video from there.
Next, when the terminal device 102 receives a related video related to the new selected video (Yes in S126), it sequentially stores the received related video in the storage unit 122 (S127). Note that the data of the selected video after it has been displayed, and the data of related videos that have not been used for a certain period after reception, are sequentially deleted from the storage unit 122.
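The storage behaviour described here, keeping newly received related videos while dropping already-displayed video and related videos left unused for a certain period, might be sketched as a small time-based eviction routine. The retention period and the entry layout are assumptions made for illustration.

    import time

    # Simple eviction sketch for the terminal's buffer (storage unit 122):
    # drop entries already displayed and entries unused for RETENTION seconds.

    RETENTION = 30.0  # assumed number of seconds a related video is kept without being used


    def evict(buffer, now=None):
        now = time.time() if now is None else now
        stale = [vid for vid, entry in buffer.items()
                 if entry["displayed"] or now - entry["last_used"] > RETENTION]
        for vid in stale:
            del buffer[vid]


    buffer = {
        "cam3": {"displayed": True,  "last_used": 0.0},    # already shown -> evicted
        "cam5": {"displayed": False, "last_used": 100.0},  # recently received -> kept
    }
    evict(buffer, now=110.0)
    print(sorted(buffer))  # -> ['cam5']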
Next, the terminal device 102 displays information on the newly received related video (S128). Specifically, the terminal device 102 highlights the camera icon 202 of the related video. For example, the camera icon 202 of the related video is displayed larger than the other camera icons 202, the outline of the camera icon 202 of the related video is displayed thicker than the outlines of the other camera icons 202, or the color of the camera icon 202 of the related video is changed to a conspicuous color such as red. The highlighting method is not limited to these.
The terminal device 102 may also perform the processing shown in FIG. 14 or FIG. 15. FIGS. 14 and 15 are flowcharts showing the flow of modified examples of the operation of the terminal device 102.
In the processing shown in FIG. 14, step S129 is added to the processing shown in FIG. 13. That is, when the selected video is not the related video (No in S123), the terminal device 102 displays the related video during the period until the selected video is received (S129). When the terminal device 102 has stored a plurality of related videos, the terminal device 102 may display, among the stored related videos, the related video having the highest degree of relevance to the new selected video.
In the processing shown in FIG. 15, step S130 is added to the processing shown in FIG. 13. That is, when the selected video is not the related video (No in S123), the terminal device 102 displays three-dimensional configuration data during the period until the selected video is received (S130). Here, the three-dimensional configuration data is three-dimensional configuration data of the place where the plurality of videos are shot; in the example shown in FIG. 5, it is three-dimensional configuration data of a baseball stadium. This three-dimensional configuration data is generated by the server 103 using the plurality of video signals 151 and is transmitted to the terminal device 102 in advance.
The terminal device 102 may use the three-dimensional configuration data to generate the video to be displayed during this period. For example, the terminal device 102 may generate, on the three-dimensional configuration data, a video in which the viewpoint position changes continuously from the viewpoint position of the immediately preceding displayed video to the viewpoint position of the selected video, and display that video during the above period. Such a visual effect may also be used when video data is stored in the storage unit 122. Furthermore, whether this visual effect is used may be switched depending on the distance between the viewpoint position of the immediately preceding displayed video and the viewpoint position of the selected video. For example, the visual effect is not used when the distance is short, and is used when the distance is long.
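The distance-dependent transition described here could be sketched as follows: intermediate viewpoint positions are produced only when the two viewpoints are far enough apart. The threshold, step count, and linear interpolation are assumptions; the actual rendering of each intermediate view from the three-dimensional configuration data is outside the scope of this sketch.

    import math

    # Decide whether to play a viewpoint-interpolation effect and, if so, produce
    # the intermediate viewpoint positions (illustrative assumptions throughout).

    EFFECT_DISTANCE = 20.0  # only use the effect when viewpoints are at least this far apart


    def transition_viewpoints(prev_vp, next_vp, steps=10):
        if math.dist(prev_vp, next_vp) < EFFECT_DISTANCE:
            return []  # short hop: switch directly, no effect
        return [(prev_vp[0] + (next_vp[0] - prev_vp[0]) * t / steps,
                 prev_vp[1] + (next_vp[1] - prev_vp[1]) * t / steps)
                for t in range(1, steps + 1)]


    print(transition_viewpoints((0.0, 0.0), (5.0, 0.0)))        # close viewpoints -> []
    print(len(transition_viewpoints((0.0, 0.0), (60.0, 0.0))))  # far viewpoints -> 10 intermediate points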
Here, an example has been described in which the related video or the three-dimensional configuration data is displayed during the waiting time until the selected video is received; however, the terminal device 102 may also display the related video or the three-dimensional configuration data when it cannot receive the selected video because of some error, for example a communication error.
When the terminal device 102 cannot receive the selected video and the camera 101 shooting that video is near the user, the terminal device 102 may receive the video signal directly from the camera 101 using another communication method such as near field communication.
As described above, the terminal device 102 receives any of a plurality of videos shot from a plurality of viewpoints from the server 103 and displays the video. First, the terminal device 102 selects a selected video (first video) from the plurality of videos (S121). Next, the terminal device 102 requests the server 103 to transmit the selected video (S122). Next, the terminal device 102 receives the selected video from the server 103 (S124) and displays it (S125). Then, while receiving and displaying the selected video, the terminal device 102 starts receiving a related video that is one of the plurality of videos, is different from the selected video, and is likely to be selected next (S126).
The terminal device 102 also stores the received related video (S127). When the related video is selected while the selected video is being displayed (Yes in S123), the terminal device 102 displays the stored related video (S125).
When a third video different from the selected video and the related video is selected while the selected video is being displayed (No in S123), the terminal device 102 receives the third video from the server 103 (S124). The terminal device 102 displays the stored related video until the third video is received (S129).
Next, the flow of the operation of the server 103 will be described. FIG. 16 is a flowchart showing the flow of the operation of the server 103.
First, the server 103 determines whether a viewpoint designation signal 152 has been received from the terminal device 102 (S141). When the server 103 has received the viewpoint designation signal 152 (Yes in S141), it selects, from among the stored video signals, the video signal indicated by the viewpoint designation signal 152 as the selected video signal 153 and transmits the selected video signal 153 to the terminal device 102 (S142).
As described above, the server 103 also selects, from the plurality of stored video signals 151, a related video signal 154 having a high degree of relevance to the selected video based on the priority (S143), and transmits the related video signal 154 to the terminal device 102 (S144).
As described above, the server 103 distributes to the terminal device 102 any of a plurality of videos shot from different viewpoints by a plurality of users. First, the server 103 distributes to the terminal device 102 the selected video (first video), which is one of the plurality of videos and is requested by the terminal device 102 (S142). Next, the server 103 selects from the plurality of videos a related video (second video) that is different from the selected video and is likely to be requested next by the terminal device 102 (S143). In other words, the related video is a video that has not been requested by the terminal device 102. Next, the server 103 starts transmitting the related video to the terminal device 102 while distributing the selected video to the terminal device 102 (S144).
The video distribution method, video reception method, and video distribution system according to the embodiment have been described above, but the present invention is not limited to this embodiment.
Each processing unit included in each device of the video distribution system according to the above embodiment is typically realized as an LSI, which is an integrated circuit. These may be individually formed into one chip, or may be formed into one chip so as to include some or all of them.
Circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
In each of the above embodiments, each constituent element may be configured by dedicated hardware or may be realized by executing a software program suitable for that constituent element. Each constituent element may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.
In other words, each device included in the video distribution system includes processing circuitry and a storage device (storage) electrically connected to the processing circuitry (accessible from the processing circuitry). The processing circuitry includes at least one of dedicated hardware and a program execution unit. When the processing circuitry includes a program execution unit, the storage device stores a software program executed by the program execution unit. The processing circuitry uses the storage device to execute the video distribution method or the video reception method according to the above embodiment.
Furthermore, the present invention may be the above software program, or a non-transitory computer-readable recording medium on which the program is recorded. Needless to say, the program can also be distributed via a transmission medium such as the Internet.
All the numbers used above are examples for specifically explaining the present invention, and the present invention is not limited to the illustrated numbers.
The order in which the steps included in the above video distribution method or video reception method are executed is an example for specifically explaining the present invention, and orders other than the above may be used. Some of the above steps may be executed simultaneously (in parallel) with other steps.
The video distribution method, video reception method, video distribution system, server, and terminal device according to one or more aspects of the present invention have been described above based on the embodiment, but the present invention is not limited to this embodiment. Forms obtained by applying various modifications conceivable by those skilled in the art to the present embodiment, and forms constructed by combining constituent elements of different embodiments, may also be included within the scope of one or more aspects of the present invention as long as they do not depart from the gist of the present invention.
(Embodiment 2)
Another application example of the image processing method and apparatus described in the above embodiments, and a system using them, will be described. The system is applicable to video systems in which intelligence is increasing and the target space is widening, for example: (1) a surveillance system implemented with security cameras in stores or factories, police in-vehicle cameras, or the like; (2) a traffic information system using personally owned cameras, in-vehicle cameras, cameras installed on roads, or the like; (3) an environmental survey or delivery system using remotely operated or automatically controlled devices such as drones; and (4) a content transmission and reception system for video or the like using installed cameras in entertainment facilities or stadiums, mobile cameras such as drones, personally owned cameras, or the like.
FIG. 17 is a diagram showing the configuration of a video information processing system ex100 according to the present embodiment. In the present embodiment, an example of preventing the occurrence of blind spots and an example of prohibiting shooting in a specific area will be described.
The video information processing system ex100 shown in FIG. 17 includes a video information processing apparatus ex101, a plurality of cameras ex102, and a video reception apparatus ex103. Note that the video reception apparatus ex103 does not necessarily have to be included in the video information processing system ex100.
The video information processing apparatus ex101 includes a storage unit ex111 and an analysis unit ex112. Each of the N cameras ex102 has a function of shooting video and a function of transmitting the shot video data to the video information processing apparatus ex101. The camera ex102 may also have a function of displaying the video being shot. The camera ex102 may encode the shot video signal using an encoding scheme such as HEVC or H.264 and then transmit it to the video information processing apparatus ex101, or may transmit unencoded video data to the video information processing apparatus ex101.
Here, each camera ex102 is a fixed camera such as a surveillance camera, a mobile camera mounted on an unmanned aerial radio-controlled vehicle, a car, or the like, or a user camera carried by a user.
The mobile camera receives an instruction signal transmitted from the video information processing apparatus ex101 and changes its own position or shooting direction in accordance with the received instruction signal.
Before shooting starts, the times of the plurality of cameras ex102 are calibrated using time information from the server, a reference camera, or the like. The spatial positions of the plurality of cameras ex102 are also calibrated based on how objects in the space to be shot appear in the images or based on relative positions from a reference camera.
The storage unit ex111 included in the video information processing apparatus ex101 stores the video data transmitted from the N cameras ex102.
The analysis unit ex112 detects blind spots from the video data stored in the storage unit ex111 and transmits to the mobile camera an instruction signal indicating an instruction for preventing the occurrence of the blind spots. The mobile camera moves in accordance with the instruction signal and continues shooting.
The analysis unit ex112 performs blind-spot detection using, for example, SfM (Structure from Motion). SfM is a technique for restoring the three-dimensional shape of a subject from a plurality of videos shot from different positions, and is widely known as a shape restoration technique that simultaneously estimates the subject shape and the camera positions. For example, the analysis unit ex112 uses SfM to restore the three-dimensional shape of the interior of the facility or the stadium from the video data stored in the storage unit ex111, and detects areas that cannot be restored as blind spots.
Note that when the position and shooting direction of a camera ex102 are fixed and information on the position and shooting direction is known, the analysis unit ex112 may perform SfM using this known information. When the position and shooting direction of a mobile camera can be acquired by a GPS, an angle sensor, or the like included in the mobile camera, the mobile camera may transmit information on its position and shooting direction to the analysis unit ex112, and the analysis unit ex112 may perform SfM using the transmitted position and shooting direction information.
The method of detecting blind spots is not limited to the method using SfM described above. For example, the analysis unit ex112 may grasp the spatial distance to the object being shot by using information from a depth sensor such as a laser range finder. The analysis unit ex112 may also detect information such as the camera position, shooting direction, and zoom magnification from whether a preset marker or a specific object in the space is included in an image and, if so, from its size or the like. In this way, the analysis unit ex112 detects blind spots using any method capable of detecting the shooting area of each camera. The analysis unit ex112 may also acquire information such as the mutual positional relationship of a plurality of shooting targets from the video data, a proximity sensor, or the like, and identify areas where blind spots are likely to occur based on the acquired positional relationship.
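Full SfM-based reconstruction is beyond a short example, but the idea of detecting the areas that no camera covers can be illustrated with a deliberately simplified 2D coverage grid: each camera is modelled as a position, viewing direction, field of view, and range, and grid cells seen by no camera are reported as blind spots. This stand-in is not the SfM procedure described above, and all parameter values are assumptions.

    import math

    # Simplified 2D stand-in for blind-spot detection: mark grid cells that fall
    # inside no camera's field of view as blind spots. Cameras are modelled as
    # (x, y, direction, fov, range); all values are illustrative assumptions.

    def covers(cam, px, py):
        cx, cy, direction, fov, rng = cam
        dx, dy = px - cx, py - cy
        if math.hypot(dx, dy) > rng:
            return False
        angle = math.atan2(dy, dx)
        diff = (angle - direction + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi]
        return abs(diff) <= fov / 2


    def blind_spots(cameras, width, height, cell=1.0):
        spots = []
        y = cell / 2
        while y < height:
            x = cell / 2
            while x < width:
                if not any(covers(cam, x, y) for cam in cameras):
                    spots.append((x, y))
                x += cell
            y += cell
        return spots


    cams = [(0.0, 0.0, math.radians(45), math.radians(90), 10.0)]
    print(len(blind_spots(cams, 10.0, 10.0)))  # cells beyond the single camera's 10 m range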
 ここで死角とは、撮影したい領域中で映像が存在しない部分だけでなく、他の部分と比較して画質の悪い部分、及び予め定められた画質を得られていない部分などを含む。この検出対象の部分は、当該システムの構成又は目的に応じて適宜設定されればよい。例えば、撮影される空間中の特定の被写体について、要求される画質が高く設定されてもよい。また、逆に撮影空間中の特定の領域について、要求される画質が低く設定されてもよいし、映像が撮影されていなくても死角と判定しないように設定されてもよい。 Here, the blind spot includes not only a portion where an image does not exist in a region to be photographed, but also a portion having a poor image quality compared to other portions and a portion where a predetermined image quality is not obtained. This detection target portion may be set as appropriate according to the configuration or purpose of the system. For example, the required image quality may be set high for a specific subject in the space where the image is taken. Conversely, for a specific area in the shooting space, the required image quality may be set low, or it may be set not to be determined as a blind spot even if no video is shot.
The image quality mentioned above includes various kinds of information about the video, such as the area occupied by the subject to be shot (for example, the number of pixels) or whether the subject is in focus; whether a portion is a blind spot may be determined based on such information or a combination thereof.
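For illustration, a simple quality test along these lines could combine the subject's pixel count with a focus measure; the thresholds, the precomputed subject mask, and the use of the Laplacian variance as a sharpness measure are assumptions of this sketch.

```python
import cv2
import numpy as np

def is_blind_spot(frame_bgr, subject_mask,
                  min_pixels=5000, min_sharpness=100.0):
    """Illustrative quality test: treat the view as a blind spot if the
    subject occupies too few pixels or is badly out of focus."""
    subject_pixels = int(np.count_nonzero(subject_mask))
    if subject_pixels == 0:
        return True  # no subject visible at all

    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian over the subject is a common focus measure.
    sharpness = cv2.Laplacian(gray, cv2.CV_64F)[subject_mask > 0].var()

    return subject_pixels < min_pixels or sharpness < min_sharpness
```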
The above description concerns detection of regions that are actually blind spots, but the regions that need to be detected in order to prevent blind spots from occurring are not limited to regions that are already blind spots. For example, when there are a plurality of shooting targets and at least some of them are moving, a new blind spot may arise when one shooting target moves between another shooting target and a camera. To address this, the analysis unit ex112 may detect the movement of the plurality of shooting targets from, for example, the captured video data, and estimate regions that are likely to become new blind spots based on the detected movement and the position information of the cameras ex102. In this case, the video information processing apparatus ex101 may transmit an instruction signal to a moving camera so that it shoots the region that may become a blind spot, thereby preventing the blind spot from occurring.

When there are a plurality of moving cameras, the video information processing apparatus ex101 needs to select the moving camera to which it transmits the instruction signal for shooting a blind spot or a region that may become one. When there are a plurality of moving cameras and a plurality of blind spots or potential blind spots, the video information processing apparatus ex101 needs to decide which blind spot or potential blind spot each of the moving cameras should shoot. For example, the video information processing apparatus ex101 selects the moving camera closest to a blind spot or potential blind spot, based on the position of that region and the positions of the regions each moving camera is currently shooting. The video information processing apparatus ex101 may also determine, for each moving camera, whether a new blind spot would arise if the video data that camera is currently shooting were no longer obtained, and select a moving camera for which no blind spot would arise even without its current video data.
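One possible selection rule, sketched below under the assumption that each camera carries a position and a flag indicating whether its current view is dispensable, is to prefer dispensable cameras and pick the one closest to the blind spot.

```python
import math

def choose_camera(blind_spot, cameras):
    """Illustrative selection rule: among cameras whose current view is
    judged dispensable (losing it creates no new blind spot), pick the one
    closest to the blind spot. 'cameras' is an assumed list of dicts with
    'id', 'position' (x, y) and 'dispensable' entries."""
    def dist(cam):
        return math.dist(cam["position"], blind_spot)

    candidates = [c for c in cameras if c.get("dispensable", False)]
    pool = candidates or cameras          # fall back to all cameras
    return min(pool, key=dist)["id"]

# Example: instruct the chosen camera to cover the gap at (12.0, 30.0).
cams = [{"id": "A", "position": (10.0, 28.0), "dispensable": True},
        {"id": "B", "position": (11.0, 31.0), "dispensable": False}]
print(choose_camera((12.0, 30.0), cams))   # -> "A"
```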
With the above configuration, the video information processing apparatus ex101 can prevent blind spots from occurring by detecting a blind spot and transmitting an instruction signal to a moving camera so that the blind spot is eliminated.
(Modification 1)
In the above description, an example was given in which an instruction signal instructing movement is transmitted to a moving camera, but the instruction signal may instead be a signal for instructing the user of a user camera to move. For example, based on the instruction signal, the user camera displays an instruction image that tells the user to change the direction of the camera. The user camera may display, as the movement instruction, an instruction image showing a movement route on a map. The user camera may also display detailed shooting instructions such as the shooting direction, angle, angle of view, image quality, and movement of the shooting area in order to improve the quality of the acquired images; furthermore, if control on the video information processing apparatus ex101 side is possible, the video information processing apparatus ex101 may automatically control such shooting-related characteristics of the camera ex102.
Here, the user camera is, for example, a smartphone, a tablet terminal, a wearable terminal, or an HMD (Head Mounted Display) carried by a spectator in the stadium or a security guard in the facility.

The display terminal that displays the instruction image need not be the same as the user camera that captures the video data. For example, the user camera may transmit the instruction signal or the instruction image to a display terminal associated with the user camera in advance, and that display terminal may display the instruction image. Information on the display terminal corresponding to the user camera may also be registered in the video information processing apparatus ex101 in advance; in that case, the video information processing apparatus ex101 may cause the display terminal to display the instruction image by transmitting the instruction signal directly to the display terminal corresponding to the user camera.
(Modification 2)
The analysis unit ex112 may generate a free viewpoint video (three-dimensional reconstruction data) by restoring the three-dimensional shape of the interior of the facility or stadium from the video data stored in the storage unit ex111 using, for example, SfM. The free viewpoint video is stored in the storage unit ex111. The video information processing apparatus ex101 reads from the storage unit ex111 the video data corresponding to the visual field information (and/or viewpoint information) transmitted from the video reception apparatus ex103, and transmits it to the video reception apparatus ex103. The video reception apparatus ex103 may be one of the plurality of cameras 111.
(Modification 3)
The video information processing apparatus ex101 may detect a shooting-prohibited area. In this case, the analysis unit ex112 analyzes the captured images and, when a moving camera is shooting a shooting-prohibited area, transmits a shooting prohibition signal to that moving camera. The moving camera stops shooting while it is receiving the shooting prohibition signal.
The analysis unit ex112 determines whether a moving camera is shooting a shooting-prohibited area set in advance in the space, for example by matching the three-dimensional virtual space restored using SfM against the captured video. Alternatively, the analysis unit ex112 may determine whether the moving camera is shooting the shooting-prohibited area using a marker or a characteristic object placed in the space as a trigger. The shooting-prohibited area is, for example, a toilet in the facility or stadium.
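As an illustrative simplification of such a judgment, the sketch below treats the camera as shooting the prohibited area when the area's center falls inside the camera's view cone; the field of view, range, and region representation are assumptions, and the embodiment itself relies on matching against the restored space or on markers.

```python
import numpy as np

def is_shooting_prohibited_area(cam_pos, cam_dir, prohibited_center,
                                fov_deg=60.0, max_range=50.0):
    """Illustrative geometric test: the camera is treated as shooting the
    prohibited area when the area's center lies inside its view cone."""
    cam_pos = np.asarray(cam_pos, dtype=float)
    cam_dir = np.asarray(cam_dir, dtype=float)
    cam_dir = cam_dir / np.linalg.norm(cam_dir)

    to_area = np.asarray(prohibited_center, dtype=float) - cam_pos
    dist = np.linalg.norm(to_area)
    if dist == 0 or dist > max_range:
        return dist == 0
    cos_angle = float(np.dot(cam_dir, to_area / dist))
    return cos_angle >= np.cos(np.radians(fov_deg / 2))
```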
When a user camera is shooting a shooting-prohibited area, the user camera may inform the user that the current location is a shooting-prohibited place, for example by displaying a message on a display connected by wire or wirelessly, or by outputting a sound or voice from a speaker or earphone.

For example, the message indicates that shooting in the direction the camera is currently facing is prohibited. Alternatively, the shooting-prohibited area and the current shooting area are shown on a displayed map. Shooting is resumed automatically, for example, once the shooting prohibition signal is no longer output. Alternatively, shooting may be resumed when the shooting prohibition signal is no longer output and the user performs an operation to resume shooting. If shooting is stopped and resumed multiple times within a short period, calibration may be performed again, or the user may be notified so as to confirm the current position or be prompted to move.

For special duties such as police work, a passcode, fingerprint authentication, or the like may be used to turn off such a function for recording purposes. Even in that case, when video of the shooting-prohibited area is displayed or stored externally, image processing such as mosaicing may be applied automatically.

With the above configuration, the video information processing apparatus ex101 can set a certain area as shooting-prohibited by determining that shooting is prohibited and notifying the user to stop shooting.
(Modification 4)
Since videos from a plurality of viewpoints need to be collected in order to construct a three-dimensional virtual space from video, the video information processing system ex100 sets an incentive for users who transfer captured video. For example, the video information processing apparatus ex101 delivers video to a user who has transferred video free of charge or at a discounted rate, or grants points having monetary value that can be used in online or offline stores or in games, or points having non-monetary value such as social status in a virtual space such as a game. The video information processing apparatus ex101 grants especially high points to a user who has transferred captured video of a valuable field of view (and/or viewpoint), such as one for which there are many requests.
(Modification 5)
The video information processing apparatus ex101 may transmit additional information to a user camera based on the analysis result of the analysis unit ex112. In this case, the user camera superimposes the additional information on the captured video and displays it on its screen. The additional information is, for example, player information such as a player name or height when a game in a stadium is being shot; the player's name, a photograph of the player's face, or the like is displayed in association with each player in the video. The video information processing apparatus ex101 may extract the additional information by searching via the Internet based on part or all of the video data. The camera ex102 may also receive such additional information via short-range wireless communication such as Bluetooth (registered trademark) or via visible light communication from lighting in the stadium or the like, and map the received additional information onto the video data. The camera ex102 may perform this mapping based on a fixed rule, such as a table held in a storage unit connected to the camera ex102 by wire or wirelessly that indicates the correspondence between information obtained by visible light communication and additional information, or it may perform the mapping using the most probable combination obtained by an Internet search.
In a monitoring system, the accuracy of the system can be improved by, for example, superimposing information about a person requiring attention on the user camera carried by a security guard in the facility.
(Modification 5)
The analysis unit ex112 may determine which area in the facility or stadium the user camera is shooting by matching the free viewpoint video against the video captured by the user camera. The method of determining the shooting area is not limited to this; any of the various shooting area determination methods described in the above embodiments, or other shooting area determination methods, may be used.
The video information processing apparatus ex101 transmits a past video to the user camera based on the analysis result of the analysis unit ex112. The user camera displays the past video on its screen, either superimposed on the captured video or in place of it.

For example, during halftime, highlight scenes of the first half are displayed as past video. This allows the user to enjoy the first-half highlight scenes during halftime as video seen from the direction in which the user is looking. The past video is not limited to first-half highlight scenes; it may be, for example, highlight scenes of past games held at that stadium. The timing at which the video information processing apparatus ex101 delivers the past video is not limited to halftime; it may be, for example, after the game or during the game. In particular, during the game, the video information processing apparatus ex101 may deliver scenes judged to be important that the user has missed, based on the analysis result of the analysis unit ex112. The video information processing apparatus ex101 may also deliver the past video only when requested by the user, or may deliver a delivery permission message before delivering the past video.
(Modification 6)
The video information processing apparatus ex101 may transmit advertisement information to the user camera based on the analysis result of the analysis unit ex112. The user camera superimposes the advertisement information on the captured video and displays it on its screen.
The advertisement information may be delivered, for example, immediately before the delivery of past video during halftime or after the game, as described in Modification 5. This allows the distributor to earn advertising fees from advertisers and to offer the video delivery service to users at low cost or free of charge. The video information processing apparatus ex101 may deliver an advertisement delivery permission message immediately before delivering the advertisement information, may provide the service free of charge only when the user views the advertisement, or may provide the service more cheaply than when the user does not view the advertisement.

When the user clicks "Order now" or the like in response to an advertisement, staff who know the user's location from the system or from some kind of position information, or an automated delivery system of the venue, deliver the ordered drink to the user's seat. Payment may be made by hand to the staff, or may be made based on credit card information set in advance in a mobile terminal application or the like. The advertisement may also include a link to an e-commerce site, enabling ordinary online shopping such as regular home delivery.
(Modification 7)
The video reception apparatus ex103 may be one of the cameras ex102 (user cameras).
In this case, the analysis unit ex112 determines which area in the facility or stadium the user camera is shooting by matching the free viewpoint video against the video captured by the user camera. The method of determining the shooting area is not limited to this.

For example, when the user performs a swipe operation in the direction of an arrow displayed on the screen, the user camera generates viewpoint information indicating that the viewpoint is to be moved in that direction. The video information processing apparatus ex101 reads from the storage unit ex111 video data of the area shifted, by the amount indicated by the viewpoint information, from the shooting area of the user camera determined by the analysis unit ex112, and starts transmitting that video data to the user camera. The user camera then displays the video delivered from the video information processing apparatus ex101 instead of the captured video.
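An illustrative sketch of the viewpoint shift is given below, assuming the viewed area is represented as an axis-aligned rectangle in stadium coordinates and that a swipe moves it by a fixed step; the server would then return stored or free-viewpoint video covering the shifted region.

```python
from dataclasses import dataclass

@dataclass
class Region:
    x: float      # center of the viewed area (stadium coordinates)
    y: float
    width: float
    height: float

def shifted_region(current: Region, swipe_dx: float, swipe_dy: float,
                   step: float = 5.0) -> Region:
    """Illustrative viewpoint shift: translate the currently viewed region
    by a fixed step in the swipe direction. The coordinate convention and
    step size are assumptions."""
    return Region(current.x + step * swipe_dx,
                  current.y + step * swipe_dy,
                  current.width, current.height)

# Example: a rightward swipe moves the requested view 5 units to the right.
now = Region(x=40.0, y=20.0, width=16.0, height=9.0)
print(shifted_region(now, swipe_dx=1.0, swipe_dy=0.0))
```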
In this way, users in the facility or stadium can view video from any viewpoint they like with a simple operation such as a screen swipe. For example, a spectator watching from the third-base side of a baseball stadium can view video from the first-base side viewpoint. In a monitoring system, security guards in the facility can use a simple operation such as a screen swipe to view the viewpoint they want to check, or video that should be watched in response to an interruption from the center, while switching viewpoints adaptively; this allows the monitoring system to be made more accurate.

Delivering video to users in the facility or stadium is also effective, for example, when an obstacle exists between the user camera and the shooting target so that there is an area the user cannot see. In this case, the user camera may switch the video of the part of its shooting area that contains the obstacle from the captured video to the video delivered from the video information processing apparatus ex101, or may switch the whole screen from the captured video to the delivered video. The user camera may also combine the captured video and the delivered video to display an image in which the viewing target appears to be visible through the obstacle. With this configuration, the video delivered from the video information processing apparatus ex101 can be viewed even when the shooting target cannot be seen from the user's position because of the obstacle, so the influence of the obstacle can be reduced.

When the delivered video is displayed as video of an area that cannot be seen because of an obstacle, display switching control different from the display switching control responding to user input such as the screen swipe described above may be performed. For example, when it is determined that an obstacle is included in the shooting area based on information on the movement and shooting direction of the user camera and on position information of the obstacle obtained in advance, the display may be switched from the captured video to the delivered video automatically. The switch may also be performed automatically when analysis of the captured video data determines that an obstacle that is not the shooting target appears in the video, when the area of the obstacle included in the captured video (for example, the number of pixels) exceeds a predetermined threshold, or when the ratio of the obstacle area to the area of the shooting target exceeds a predetermined ratio.
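A minimal sketch of such an automatic switching rule, with assumed threshold values, follows.

```python
def should_switch_to_delivered_video(obstacle_pixels: int,
                                     target_pixels: int,
                                     abs_threshold: int = 20000,
                                     ratio_threshold: float = 0.3) -> bool:
    """Illustrative switching rule: switch from the captured video to the
    delivered video when the obstacle covers too many pixels in absolute
    terms, or too large a fraction of the shooting target."""
    if obstacle_pixels > abs_threshold:
        return True
    if target_pixels > 0 and obstacle_pixels / target_pixels > ratio_threshold:
        return True
    return False

# Example: 15 000 obstacle pixels over a 40 000-pixel subject -> switch.
print(should_switch_to_delivered_video(15000, 40000))   # -> True
```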
The display may also be switched from the captured video to the delivered video, and from the delivered video back to the captured video, in response to user input.
(Modification 8)
The speed at which each camera ex102 transfers its video data to the video information processing apparatus ex101 may be specified based on the importance of that video data.
In this case, the analysis unit ex112 determines the importance of the video data stored in the storage unit ex111, or of the camera ex102 that captured it. The importance here is determined based on, for example, information such as the number of people or moving objects included in the video, the image quality of the video data, or a combination thereof.

The importance of video data may also be determined based on the position of the camera ex102 that captured it or on the area being captured. For example, when there are a plurality of other cameras ex102 shooting near the target camera ex102, the importance of the video data captured by the target camera ex102 is lowered. Likewise, when there are a plurality of other cameras ex102 shooting the same area even though the target camera ex102 is located far from them, the importance of the video data captured by the target camera ex102 is lowered. The importance of video data may also be determined based on the number of requests in the video delivery service. The method of determining importance is not limited to those described above or combinations thereof; any method suited to the configuration or purpose of the monitoring system or video delivery system may be used.
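For illustration, one possible scoring rule combining these factors is sketched below; the weights are assumptions and are not specified by the embodiment.

```python
def camera_importance(num_people: int, num_nearby_cameras: int,
                      num_same_area_cameras: int, request_count: int) -> float:
    """Illustrative importance score: more people in view and more viewer
    requests raise the score; redundant coverage by nearby cameras or by
    cameras pointed at the same area lowers it."""
    score = 1.0 + 0.5 * num_people + 0.1 * request_count
    redundancy = num_nearby_cameras + num_same_area_cameras
    return score / (1.0 + redundancy)

# A crowded, frequently requested view with no redundant coverage scores
# higher than the same view covered by several other cameras.
print(camera_importance(num_people=8, num_nearby_cameras=0,
                        num_same_area_cameras=0, request_count=20))  # high
print(camera_importance(num_people=8, num_nearby_cameras=3,
                        num_same_area_cameras=2, request_count=20))  # lower
```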
The determination of importance need not be based on the captured video data at all. For example, the importance of a camera ex102 that transmits video data to a terminal other than the video information processing apparatus ex101 may be set high. Conversely, the importance of a camera ex102 that transmits video data to a terminal other than the video information processing apparatus ex101 may be set low. This increases the degree of freedom in controlling the communication band according to the purpose or characteristics of each service, for example when a plurality of services that require transmission of video data share a communication band, and thus prevents the quality of each service from degrading because the necessary video data cannot be obtained.

The analysis unit ex112 may also determine the importance of the video data using the free viewpoint video and the video captured by each camera ex102.

The video information processing apparatus ex101 transmits a communication speed instruction signal to each camera ex102 based on the importance determination made by the analysis unit ex112. For example, the video information processing apparatus ex101 instructs a high communication speed to a camera ex102 that is capturing video of high importance. Besides controlling the speed, the video information processing apparatus ex101 may also transmit a signal instructing that important information be sent multiple times in order to reduce the disadvantage of losing it. In this way, communication within the facility or the entire stadium can be performed efficiently. Communication between the cameras ex102 and the video information processing apparatus ex101 may be wired or wireless, and the video information processing apparatus ex101 may control only one of wired and wireless communication.

Each camera ex102 transmits its captured video data to the video information processing apparatus ex101 at the communication speed specified by the communication speed instruction signal. If retransmission fails a predetermined number of times, the camera ex102 may stop retransmitting that captured video data and start transferring the next captured video data. This allows communication within the facility or the entire stadium to be performed efficiently and speeds up the processing in the analysis unit ex112.

If the communication speed allocated to a camera ex102 does not provide sufficient bandwidth to transfer the captured video data, the camera ex102 may convert the captured video data into video data of a bit rate that can be transmitted at the allocated communication speed and transmit the converted video data, or it may stop transferring the video data.
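A minimal sketch of this bandwidth adaptation, with an assumed minimum useful bit rate, follows.

```python
def plan_transfer(video_bitrate_kbps: float, allocated_kbps: float,
                  min_useful_kbps: float = 300.0) -> str:
    """Illustrative bandwidth adaptation: send as-is when the allocation
    suffices, transcode down to the allocation when a useful bit rate is
    still possible, otherwise skip the transfer."""
    if allocated_kbps >= video_bitrate_kbps:
        return "send original"
    if allocated_kbps >= min_useful_kbps:
        return f"transcode to {allocated_kbps:.0f} kbps and send"
    return "stop transfer"

print(plan_transfer(4000, 6000))   # -> send original
print(plan_transfer(4000, 1500))   # -> transcode to 1500 kbps and send
print(plan_transfer(4000, 100))    # -> stop transfer
```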
As described above, when video data is used to prevent blind spots from occurring, only part of the shooting area included in the captured video data may be needed to fill a blind spot. In this case, the camera ex102 may generate extracted video data by extracting from the video data at least the area needed to prevent the blind spot, and transmit the generated extracted video data to the video information processing apparatus ex101. With this configuration, the occurrence of blind spots can be suppressed using a smaller communication band.

For example, when additional information is to be superimposed or video is to be delivered, the camera ex102 needs to transmit its position information and shooting direction information to the video information processing apparatus ex101. In this case, a camera ex102 that has been allocated only a bandwidth insufficient for transferring video data may transmit only the position information and shooting direction information detected by the camera ex102. When the video information processing apparatus ex101 estimates the position information and shooting direction information of the camera ex102, the camera ex102 may convert the captured video data to the resolution needed for that estimation and transmit the converted video data to the video information processing apparatus ex101. With this configuration, the superimposed display of additional information or the video delivery service can be provided even to a camera ex102 that has been allocated only a small communication band. Moreover, since the video information processing apparatus ex101 can acquire shooting area information from more cameras ex102, this is also useful when the shooting area information is used for purposes such as detecting an area that is attracting attention.

The switching of the video data transfer process according to the allocated communication band described above may be performed by the camera ex102 based on the notified communication band, or the video information processing apparatus ex101 may determine the operation of each camera ex102 and notify each camera ex102 of a control signal indicating the determined operation. In this way, the processing can be shared appropriately according to the amount of computation needed to determine the switching of operation, the processing capability of the cameras ex102, the required communication band, and the like.
(Modification 9)
The analysis unit ex112 may determine the importance of the video data based on the visual field information (and/or viewpoint information) transmitted from the video reception apparatus ex103. For example, the analysis unit ex112 sets a high importance for captured video data that contains a large part of the area indicated by the visual field information (and/or viewpoint information). The analysis unit ex112 may also determine the importance of the video data taking into account the number of people or the number of moving objects included in the video. The method of determining importance is not limited to these.
The communication control method described in this embodiment does not necessarily have to be used in a system that reconstructs a three-dimensional shape from a plurality of pieces of video data. It is effective, for example, in any environment with a plurality of cameras ex102 in which video data is transmitted by wired and/or wireless communication selectively or with different transmission speeds.
(Modification 10)
In the video delivery system, the video information processing apparatus ex101 may transmit an overview video showing the whole shooting scene to the video reception apparatus ex103.
Specifically, when the video information processing apparatus ex101 receives a delivery request transmitted from the video reception apparatus ex103, it reads an overview video of the whole facility or stadium from the storage unit ex111 and transmits that overview video to the video reception apparatus ex103. The overview video may have a long update interval (it may have a low frame rate) and may have low image quality. The viewer touches the part they want to see in the overview video displayed on the screen of the video reception apparatus ex103. The video reception apparatus ex103 then transmits the visual field information (and/or viewpoint information) corresponding to the touched part to the video information processing apparatus ex101.

The video information processing apparatus ex101 reads the video data corresponding to the visual field information (and/or viewpoint information) from the storage unit ex111 and transmits that video data to the video reception apparatus ex103.

The analysis unit ex112 generates the free viewpoint video by preferentially performing three-dimensional shape restoration (three-dimensional reconstruction) on the area indicated by the visual field information (and/or viewpoint information), and restores the three-dimensional shape of the whole facility or stadium only with the accuracy needed to show an overview. This allows the video information processing apparatus ex101 to restore the three-dimensional shape efficiently; as a result, a high frame rate and high image quality can be achieved for the free viewpoint video of the area the viewer wants to see.
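As an illustration of this prioritization, the sketch below assigns a dense reconstruction setting only to the requested region and a coarse setting elsewhere; the region identifiers and quality parameters are assumptions.

```python
def reconstruction_plan(regions, requested_region_id):
    """Illustrative prioritisation: the region the viewer asked for is
    reconstructed densely, everything else only coarsely, so the overall
    reconstruction stays cheap."""
    plan = {}
    for region_id in regions:
        if region_id == requested_region_id:
            plan[region_id] = {"voxel_size": 0.05, "max_views": 32}  # dense
        else:
            plan[region_id] = {"voxel_size": 0.5, "max_views": 4}    # coarse
    return plan

print(reconstruction_plan(["goal_area", "midfield", "stands"], "goal_area"))
```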
(Modification 11)
The video information processing apparatus ex101 may store in advance, as a prior video, three-dimensional shape restoration data of the facility or stadium generated beforehand from design drawings or the like. The prior video is not limited to this; it may be virtual space data in which the unevenness of the space obtained from a depth sensor and pictures derived from past images or video data, or from images or video data at the time of calibration, are mapped for each object.
For example, when soccer is being played in a stadium, the analysis unit ex112 may restore the three-dimensional shape of only the players and the ball, and generate the free viewpoint video by combining the obtained restoration data with the prior video. Alternatively, the analysis unit ex112 may restore three-dimensional shapes giving priority to the players and the ball. This allows the video information processing apparatus ex101 to restore the three-dimensional shape efficiently; as a result, a high frame rate and high image quality can be achieved for the free viewpoint video of the players and ball that the viewer is paying attention to. In a monitoring system, the analysis unit ex112 may restrict three-dimensional shape restoration to people and moving objects only, or give them priority.
(Modification 12)
The time of each device may be calibrated at the start of shooting based on a reference time of the server or the like. The analysis unit ex112 performs three-dimensional shape restoration using, among the plurality of pieces of video data captured by the plurality of cameras ex102, those captured at times that fall within a preset time range determined according to the accuracy of the time setting. For this time detection, for example, the time at which the captured video data was stored in the storage unit ex111 is used; the time detection method is not limited to this. In this way, the video information processing apparatus ex101 can restore the three-dimensional shape efficiently, so a high frame rate and high image quality of the free viewpoint video can be achieved.
Alternatively, the analysis unit ex112 may perform three-dimensional shape restoration using only the high-quality data, or preferentially using the high-quality data, among the plurality of pieces of video data stored in the storage unit ex111.
(Modification 13)
The analysis unit ex112 may restore the three-dimensional shape using camera attribute information. In this case, each camera ex102 transmits the captured video data and its camera attribute information to the video information processing apparatus ex101. The camera attribute information is, for example, the shooting position, shooting angle, shooting time, or zoom magnification.
This allows the video information processing apparatus ex101 to restore the three-dimensional shape efficiently, so a high frame rate and high image quality of the free viewpoint video can be achieved.

Specifically, the camera ex102 defines three-dimensional coordinates within the facility or stadium, and transmits to the video information processing apparatus ex101, together with the video, camera attribute information describing which coordinates the camera ex102 shot, from which angle, at what zoom magnification, and at what time. When the camera ex102 is activated, the clock in the camera is synchronized with a clock on the communication network within the facility or stadium, and time information is generated.

The position and angle information of the camera ex102 is acquired by pointing the camera ex102 at a specific point in the facility or stadium when the camera ex102 is activated or at an arbitrary timing. FIG. 18 shows an example of a notification displayed on the screen of the camera ex102 when the camera ex102 is activated. When, following this notification, the user aligns the "+" displayed at the center of the screen with the "+" at the center of the soccer ball in the advertisement on the north side of the stadium and touches the display of the camera ex102, the camera ex102 acquires the vector information from the camera ex102 to the advertisement and specifies the reference for the camera position and angle. Thereafter, the camera coordinates and angle at each moment are determined from the motion information of the camera ex102. The display is of course not limited to this; a display that indicates coordinates, angle, or the movement speed of the shooting area, for example by using arrows, may also be used during the shooting period.
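For illustration, a two-dimensional version of this calibration step is sketched below: the bearing from the (already known) camera position to the known reference point gives the reference yaw, and the current yaw is then tracked by accumulating motion-sensor increments. The coordinate convention and function names are assumptions.

```python
import math

def reference_yaw(camera_pos, reference_point):
    """Illustrative calibration: when the user aims the on-screen '+' at a
    known reference point, the bearing from the camera to that point gives
    the camera's reference yaw (the camera position is assumed known, e.g.
    from GPS or a beacon)."""
    dx = reference_point[0] - camera_pos[0]
    dy = reference_point[1] - camera_pos[1]
    return math.degrees(math.atan2(dy, dx))

def current_yaw(ref_yaw, gyro_yaw_deltas):
    """Afterwards, the current yaw is obtained by accumulating the motion
    (gyro) increments on top of the calibrated reference."""
    return ref_yaw + sum(gyro_yaw_deltas)

yaw0 = reference_yaw(camera_pos=(0.0, 0.0), reference_point=(50.0, 50.0))
print(yaw0)                              # -> 45.0 degrees
print(current_yaw(yaw0, [2.0, -0.5]))    # -> 46.5 degrees
```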
The coordinates of the camera ex102 may be determined using radio waves such as GPS, WiFi (registered trademark), 3G, LTE (Long Term Evolution), or 5G (wireless LAN), or using short-range wireless such as beacons (Bluetooth (registered trademark), ultrasound). Information on which base station in the facility or stadium the captured video data reached may also be used.
(Modification 14)
The system may be provided as an application that runs on a mobile terminal such as a smartphone.
Accounts of various SNSs or the like may be used to log in to the system. An account dedicated to the application, or a guest account with limited functions, may also be used. By using accounts in this way, favorite videos, favorite accounts, and the like can be rated. Also, by preferentially allocating bandwidth to video data similar to the video data currently being shot or viewed, or to video data from a viewpoint similar to that of the video data being shot or viewed, the resolution of such video data can be increased. This allows the three-dimensional shape to be restored from those viewpoints with higher accuracy.

In the application, a user can select a preferred image or video and follow the other party; the selected content is then shown to the user with priority over other users' content, and the user can connect with the other party through text chat or the like, subject to the other party's approval. In this way, new communities can be formed.

By connecting users within a community in this way, shooting itself and the sharing of captured images become more active, which encourages the restoration of more accurate three-dimensional shapes.

Depending on the connection settings within the community, a user can also edit images or videos shot by others, or create new images or videos by collaging other people's images with their own. This enables new video works to be shared, for example by sharing a new image or video only with people in the community. By inserting CG characters during this editing, the video works can also be used in augmented reality games and the like.

Since the system can output three-dimensional model data successively, a 3D printer or the like owned by the facility can output a three-dimensional object based on the three-dimensional model data of a characteristic scene such as a goal scene. After the game, objects based on scenes from that game can thus be sold as souvenirs such as key rings, or distributed to the participating users. Of course, an image from the best viewpoint can also be printed as an ordinary photograph.
(Modification 15)
Using the above system, for example, the rough state of an entire region can be managed at a center connected to the system, based on video from police in-vehicle cameras, police officers' wearable cameras, and the like.
During ordinary patrols, still images are transmitted and received, for example, every few minutes. The center identifies areas with a high probability of crime based on a crime map derived from analysis using past crime data and the like, or holds regional data related to crime occurrence probabilities identified in this way. In an identified area with a high crime occurrence probability, the frequency of image transmission and reception may be increased, or the images may be changed to video. When an incident occurs, video or three-dimensional reconstruction data obtained using SfM or the like may be used. The police officer can also grasp the situation more accurately if the center or each terminal simultaneously corrects the images or the virtual space using information from other sensors such as a depth sensor or a thermal sensor.

By using the three-dimensional reconstruction data, the center can also feed back information about a target object to a plurality of terminals, allowing the individuals carrying those terminals to track the object.

Recently, shooting from the air with flight-capable devices such as quadcopters and drones has been carried out for purposes such as surveying buildings or the environment, or capturing immersive footage of sports. Shooting with such autonomous mobile devices tends to suffer from image blur, but SfM can build the three-dimensional model while correcting that blur using the position and tilt. This improves image quality and the accuracy of space restoration.

The installation of in-vehicle cameras that shoot outside the vehicle is mandatory in some countries. With such in-vehicle cameras as well, by using three-dimensional data modeled from a plurality of images, the weather and road surface conditions in the direction of travel, the degree of traffic congestion, and the like can be grasped more accurately.
(Embodiment 3)
By recording a program for implementing the configuration of the image processing method described in each of the above embodiments on a storage medium, the processing described in each of the above embodiments can easily be carried out on an independent computer system. The storage medium may be anything on which a program can be recorded, such as a magnetic disk, an optical disc, a magneto-optical disc, an IC card, or a semiconductor memory.
Application examples of the image processing method described in each of the above embodiments, and systems using it, are described here. The systems are characterized by having an apparatus that uses the image processing method; the other configurations in the systems can be changed appropriately depending on the circumstances.

FIG. 19 shows the overall configuration of a content supply system ex200 that realizes a content distribution service. The area in which the communication service is provided is divided into cells of a desired size, and base stations ex206, ex207, ex208, ex209, and ex210, which are fixed wireless stations, are installed in the respective cells.

In this content supply system ex200, devices such as a computer ex211, a PDA (Personal Digital Assistant) ex212, a camera ex213, a smartphone ex214, and a game machine ex215 are connected to the Internet ex201 via an Internet service provider ex202, a communication network ex204, and the base stations ex206 to ex210.

However, the content supply system ex200 is not limited to the configuration shown in FIG. 19, and any combination of the elements may be connected. Each device may also be connected directly to the communication network ex204, such as a telephone line, cable television, or optical communication network, without going through the base stations ex206 to ex210, which are fixed wireless stations. The devices may also be connected directly to one another via short-range wireless or the like.

The camera ex213 is a device capable of shooting video, such as a digital video camera, and the camera ex216 is a device capable of shooting still images and video, such as a digital camera. The smartphone ex214 may be any smartphone compliant with, for example, the GSM (registered trademark) (Global System for Mobile Communications) system, the CDMA (Code Division Multiple Access) system, the W-CDMA (Wideband-Code Division Multiple Access) system, the LTE (Long Term Evolution) system, HSPA (High Speed Packet Access), or a communication system using a high frequency band, or it may be a PHS (Personal Handyphone System) or the like.
In the content supply system ex200, the camera ex213 or the like is connected to the streaming server ex203 through the base station ex209 and the communication network ex204, which enables live distribution and the like. In live distribution, content shot by a user with the camera ex213 (for example, video of a live music performance) is encoded and transmitted to the streaming server ex203. The streaming server ex203 then streams the transmitted content data to clients that have made requests. The clients include the computer ex211, the PDA ex212, the camera ex213, the smartphone ex214, the game machine ex215, and so on, which are capable of decoding the encoded data. Each device that receives the distributed data decodes and reproduces the received data.

The encoding of the shot data may be performed by the camera ex213, by the streaming server ex203 that performs the data transmission processing, or shared between them. Likewise, the decoding of the distributed data may be performed by the client, by the streaming server ex203, or shared between them. Not only video from the camera ex213 but also still image and/or video data shot by the camera ex216 may be transmitted to the streaming server ex203 via the computer ex211. The encoding in this case may be performed by the camera ex216, the computer ex211, or the streaming server ex203, or shared among them. As for the display of the decoded images, a plurality of devices connected to the system may display the same image in coordination, or the whole image may be displayed on a device with a large display unit while a partial area of the image is enlarged and displayed on the smartphone ex214 or the like.

These encoding and decoding processes are generally performed in the computer ex211 or in an LSI ex500 included in each device. The LSI ex500 may consist of a single chip or of multiple chips. Software for video encoding and decoding may be incorporated into some kind of recording medium (such as a CD-ROM, a flexible disk, or a hard disk) readable by the computer ex211 or the like, and the encoding and decoding may be performed using that software. Furthermore, if the smartphone ex214 is equipped with a camera, video data acquired by that camera may be transmitted; the video data in that case is data encoded by the LSI ex500 included in the smartphone ex214.

The streaming server ex203 may be a plurality of servers or a plurality of computers that process, record, and distribute data in a distributed manner.

As described above, in the content supply system ex200, clients can receive and reproduce the encoded data. In this way, in the content supply system ex200, information transmitted by a user can be received, decoded, and reproduced by clients in real time, so that even a user without special rights or equipment can realize personal broadcasting.
 The above embodiments can be applied not only to the example of the content supply system ex200 but also to the digital broadcasting system ex300 shown in FIG. 20. Specifically, at the broadcast station ex301, multiplexed data obtained by multiplexing music data and the like onto video data is transmitted via radio waves to a communication or broadcasting satellite ex302. This video data is data encoded by the video encoding method described in the above embodiments. Upon receiving the data, the broadcasting satellite ex302 transmits radio waves for broadcasting, and a home antenna ex304 capable of receiving satellite broadcasts receives these radio waves. The received multiplexed data is decoded and reproduced by a device such as a television (receiver) ex400 or a set-top box (STB) ex317.
 The video decoding apparatus or video encoding apparatus described in the above embodiments can also be implemented in a reader/recorder ex318 that reads and decodes multiplexed data recorded on a recording medium ex315 such as a DVD or BD or in a memory ex316 such as an SD card, or that encodes a video signal onto the recording medium ex315 or the memory ex316, in some cases multiplexing it with a music signal before writing. In this case, the reproduced video signal is displayed on a monitor ex319, and the video signal can be reproduced on another device or system using the recording medium ex315 or the memory ex316 on which the multiplexed data is recorded. A video decoding apparatus may also be implemented in a set-top box ex317 connected to a cable ex303 for cable television or to the antenna ex304 for satellite/terrestrial broadcasting, and its output may be displayed on the television monitor ex319. Alternatively, the video decoding apparatus may be incorporated into the television itself instead of the set-top box.
 FIG. 21 shows the smartphone ex214, and FIG. 22 shows an example of its configuration. The smartphone ex214 includes an antenna ex450 for transmitting and receiving radio waves to and from the base station ex210, a camera unit ex465 capable of capturing video and still images, and a display unit ex458 such as a liquid crystal display that displays decoded data such as video captured by the camera unit ex465 and video received via the antenna ex450. The smartphone ex214 further includes an operation unit ex466 such as a touch panel, an audio output unit ex457 such as a speaker for outputting audio, an audio input unit ex456 such as a microphone for inputting audio, a memory unit ex467 capable of storing encoded or decoded data such as captured video, still images, recorded audio, and received video, still images, and e-mail (or the memory ex316 illustrated in FIG. 20), and a slot unit ex464 serving as an interface with a SIM ex468 that identifies the user and authenticates access to various data, including the network.
 In the smartphone ex214, a power supply circuit unit ex461, an operation input control unit ex462, a video signal processing unit ex455, a camera interface unit ex463, an LCD (Liquid Crystal Display) control unit ex459, a modulation/demodulation unit ex452, a multiplexing/demultiplexing unit ex453, an audio signal processing unit ex454, the slot unit ex464, and the memory unit ex467 are connected via a bus ex470 to a main control unit ex460 that comprehensively controls the display unit ex458, the operation unit ex466, and the like.
 When the call-end/power key is turned on by a user operation, the power supply circuit unit ex461 supplies power to each unit from a battery pack, thereby starting up the smartphone ex214 into an operable state.
 Under the control of the main control unit ex460, which includes a CPU, a ROM, a RAM, and the like, the smartphone ex214 in voice call mode converts the audio signal picked up by the audio input unit ex456 into a digital audio signal with the audio signal processing unit ex454, applies spread spectrum processing to it with the modulation/demodulation unit ex452, applies digital-to-analog conversion and frequency conversion with the transmission/reception unit ex451, and then transmits the result via the antenna ex450. Also in voice call mode, the smartphone ex214 amplifies the reception data received via the antenna ex450, applies frequency conversion and analog-to-digital conversion, applies spectrum despreading with the modulation/demodulation unit ex452, converts the result into an analog audio signal with the audio signal processing unit ex454, and then outputs it from the audio output unit ex457.
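 As a purely schematic aid (not part of the original disclosure), the order of operations in the voice-call transmit and receive chains described above can be written as a pipeline of placeholder functions; every stage below is a no-op stand-in for the named hardware unit, and only the ordering reflects the text.

```python
# Illustrative only: the ordering of the voice-call transmit/receive chains.
# Every stage is a no-op placeholder for the hardware unit named in the comment.
def analog_to_digital(x): return x   # placeholder
def digital_to_analog(x): return x   # placeholder
def spread_spectrum(x): return x     # placeholder (modulation/demodulation unit ex452)
def despread_spectrum(x): return x   # placeholder (modulation/demodulation unit ex452)
def frequency_convert(x): return x   # placeholder (transmission/reception unit ex451)
def amplify(x): return x             # placeholder


def transmit_voice(analog_audio):
    digital = analog_to_digital(analog_audio)            # audio signal processing unit ex454
    spread = spread_spectrum(digital)                     # modulation/demodulation unit ex452
    return frequency_convert(digital_to_analog(spread))  # transmission/reception unit ex451 -> antenna ex450


def receive_voice(rf_signal):
    baseband = analog_to_digital(frequency_convert(amplify(rf_signal)))
    despread = despread_spectrum(baseband)                # modulation/demodulation unit ex452
    return digital_to_analog(despread)                    # audio signal processing unit ex454 -> audio output ex457
```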
 When an e-mail is transmitted in data communication mode, text data of the e-mail entered by operating the operation unit ex466 or the like of the main body is sent to the main control unit ex460 via the operation input control unit ex462. The main control unit ex460 applies spread spectrum processing to the text data with the modulation/demodulation unit ex452, applies digital-to-analog conversion and frequency conversion with the transmission/reception unit ex451, and then transmits the result to the base station ex210 via the antenna ex450. When an e-mail is received, roughly the reverse processing is applied to the received data, and the result is output to the display unit ex458.
 When video, still images, or video and audio are transmitted in data communication mode, the video signal processing unit ex455 compresses and encodes the video signal supplied from the camera unit ex465 by the video encoding method described in the above embodiments, and sends the encoded video data to the multiplexing/demultiplexing unit ex453. The audio signal processing unit ex454 encodes the audio signal picked up by the audio input unit ex456 while the camera unit ex465 is capturing video, still images, and the like, and sends the encoded audio data to the multiplexing/demultiplexing unit ex453.
 The multiplexing/demultiplexing unit ex453 multiplexes the encoded video data supplied from the video signal processing unit ex455 and the encoded audio data supplied from the audio signal processing unit ex454 by a predetermined scheme, applies spread spectrum processing to the resulting multiplexed data with the modulation/demodulation unit (modulation/demodulation circuit unit) ex452, applies digital-to-analog conversion and frequency conversion with the transmission/reception unit ex451, and then transmits the result via the antenna ex450.
 When data of a video file linked to a web page or the like is received in data communication mode, or when an e-mail with video and/or audio attached is received, the multiplexing/demultiplexing unit ex453, in order to decode the multiplexed data received via the antenna ex450, separates the multiplexed data into a bitstream of video data and a bitstream of audio data, supplies the encoded video data to the video signal processing unit ex455 via the synchronization bus ex470, and supplies the encoded audio data to the audio signal processing unit ex454. The video signal processing unit ex455 decodes the video signal by a video decoding method corresponding to the video encoding method described in the above embodiments, and video and still images included in, for example, the video file linked to the web page are displayed on the display unit ex458 via the LCD control unit ex459. The audio signal processing unit ex454 decodes the audio signal, and the audio is output from the audio output unit ex457.
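 For illustration only (not part of the original disclosure), the multiplexing of encoded video and audio into one stream and its separation back into two bitstreams, as performed by the multiplexing/demultiplexing unit ex453, can be sketched with a toy tagged-packet format; the one-byte tag and four-byte length header used here are hypothetical and are not the "predetermined scheme" referred to above.

```python
# Illustration only: a toy tagged-packet multiplex/demultiplex standing in
# for the multiplexing/demultiplexing unit ex453. The 1-byte tag and 4-byte
# big-endian length header are hypothetical, not the scheme of the embodiments.
import struct

VIDEO, AUDIO = b"V", b"A"


def multiplex(packets):
    """packets: iterable of (tag, payload) pairs in presentation order."""
    out = bytearray()
    for tag, payload in packets:
        out += tag + struct.pack(">I", len(payload)) + payload
    return bytes(out)


def demultiplex(stream):
    """Split a multiplexed stream back into a video and an audio bitstream."""
    video, audio, pos = bytearray(), bytearray(), 0
    while pos < len(stream):
        tag = stream[pos:pos + 1]
        (length,) = struct.unpack(">I", stream[pos + 1:pos + 5])
        payload = stream[pos + 5:pos + 5 + length]
        (video if tag == VIDEO else audio).extend(payload)
        pos += 5 + length
    return bytes(video), bytes(audio)


# Usage sketch: encoded video/audio in, two bitstreams out for the decoders.
muxed = multiplex([(VIDEO, b"vid-frame-0"), (AUDIO, b"aud-frame-0")])
video_bits, audio_bits = demultiplex(muxed)
assert video_bits == b"vid-frame-0" and audio_bits == b"aud-frame-0"
```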
 Like the television ex400, a terminal such as the smartphone ex214 can be implemented in three forms: a transmitting and receiving terminal having both an encoder and a decoder, a transmitting terminal having only an encoder, and a receiving terminal having only a decoder. Furthermore, although the digital broadcasting system ex300 has been described as receiving and transmitting multiplexed data in which music data and the like are multiplexed onto video data, the data may also be data in which character data related to the video is multiplexed in addition to the audio data, or may be the video data itself rather than multiplexed data.
 The present invention is not limited to the above embodiments, and various changes and modifications can be made without departing from the scope of the present invention.
 The present invention is applicable to a video distribution system that distributes video shot by a plurality of cameras.
DESCRIPTION OF SYMBOLS
100 Video distribution system
101 Camera
102 Terminal device
103 Server
104 Network
111, 121 Reception unit
112 Video storage unit
113, 126 Control unit
114, 125 Transmission unit
122 Storage unit
123 Decoding unit
124 Output unit
127 Input unit
151 Video signal
152 Viewpoint designation signal
153 Selected video signal
154 Related video signal
155 Output video
201 Background image
202 Camera icon
211 Selected video
212 Overhead image
213 Top-view image
214, 215, 216 Operation buttons

Claims (17)

  1.  A video distribution method performed by a server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, the video distribution method comprising:
     a distribution step of distributing, to the terminal device, a first video that is one of the plurality of videos and has been requested by the terminal device;
     a selection step of selecting a second video that is one of the plurality of videos and is likely to be requested next by the terminal device; and
     a transmission step of starting transmission of the second video to the terminal device while the first video is being distributed to the terminal device.
  2.  The video distribution method according to claim 1, wherein, in the selection step, a video having a high degree of relevance to the first video is selected as the second video from among the plurality of videos.
  3.  The video distribution method according to claim 2, wherein, in the selection step, the degree of relevance of a video is determined to be higher as the position of its shooting scene is closer to the position of the shooting scene of the first video.
  4.  The video distribution method according to claim 3, wherein, in the selection step, the degree of relevance is further determined to be higher as the extent of the shooting scene is closer to the extent of the shooting scene of the first video.
  5.  The video distribution method according to claim 2, wherein, in the selection step, the degree of relevance of a video in which the same subject as a subject included in the first video is captured is set high.
  6.  The video distribution method according to claim 1, wherein, in the selection step, the second video is selected based on frame rates, resolutions, or bit rates of the plurality of videos.
  7.  The video distribution method according to claim 1, wherein, in the selection step, a video that has been selected a large number of times by other users is selected as the second video from among the plurality of videos.
  8.  The video distribution method according to claim 1, wherein, in the selection step, the second video is selected based on a viewing history of a user or on preference information registered in advance.
  9.  A video reception method performed by a terminal device that receives, from a server, any one of a plurality of videos shot from a plurality of viewpoints and displays the received video, the video reception method comprising:
     a selection step of selecting a first video from the plurality of videos;
     a request step of requesting the server to transmit the first video;
     a first reception step of receiving the first video from the server;
     a display step of displaying the first video; and
     a second reception step of starting, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
  10.  The video reception method according to claim 9, further comprising:
     storing the received second video; and
     displaying the stored second video when the second video is selected while the first video is being displayed.
  11.  The video reception method according to claim 10, further comprising:
     receiving a third video from the server when a third video different from the first video and the second video is selected while the first video is being displayed; and
     displaying the stored second video until the third video is received.
  12.  The video reception method according to any one of claims 9 to 11, wherein, in the display step, an image that is an overhead view of the place where the plurality of videos are being shot and that includes a plurality of icons indicating the positions of the plurality of viewpoints is further displayed.
  13.  The video reception method according to claim 12, wherein, in the display step, among the plurality of icons, an icon indicating the position of the viewpoint of the second video is highlighted.
  14.  A server that distributes, to a terminal device, any one of a plurality of videos shot from different viewpoints by a plurality of users, the server comprising:
     a distribution unit that distributes, to the terminal device, a first video that is one of the plurality of videos and has been designated by the terminal device;
     a selection unit that selects a second video that is one of the plurality of videos and is likely to be requested next by the terminal device; and
     a transmission unit that starts transmission of the second video to the terminal device while the first video is being distributed to the terminal device.
  15.  A terminal device that receives, from a server, any one of a plurality of videos shot from a plurality of viewpoints and displays the received video, the terminal device comprising:
     a selection unit that selects a first video from the plurality of videos;
     a request unit that requests the server to transmit the first video;
     a first reception unit that receives the first video from the server;
     a display unit that displays the first video; and
     a second reception unit that starts, during reception of the first video, reception of a second video that is one of the plurality of videos and is likely to be selected next.
  16.  A video distribution system comprising:
     the server according to claim 14; and
     the terminal device according to claim 15.
  17.  A program for causing a computer to execute the video reception method according to claim 9.
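
For illustration only (this sketch is not part of the claims or the disclosure), the server-side selection recited in claims 1 to 8—score each candidate video by its relevance to the video currently being viewed and start sending the best candidate while the first video is still being distributed—and the terminal-side buffering recited in claims 9 to 11 might be outlined as follows. All field names, weights, and helper classes are hypothetical assumptions made for the sketch.

```python
# Illustrative sketch of the claimed behaviour, not the patented implementation.
# Server side (claims 1-8): score candidate videos by relevance to the video
# currently being viewed and prefetch the highest-scoring one as the "second video".
# Terminal side (claims 9-11): buffer the second video while the first plays.
import math
from dataclasses import dataclass, field


@dataclass
class VideoMeta:
    video_id: str
    scene_pos: tuple      # (x, y) position of the shooting scene
    scene_extent: float   # extent (size) of the shooting scene
    subjects: frozenset   # subjects detected in the video
    bitrate: float        # quality indicator (claim 6 also allows frame rate or resolution)
    times_selected: int   # how often other users have chosen this video (claim 7)


def relevance(candidate: VideoMeta, current: VideoMeta, user_prefs: frozenset) -> float:
    """Hypothetical score: higher means more likely to be requested next."""
    dx = candidate.scene_pos[0] - current.scene_pos[0]
    dy = candidate.scene_pos[1] - current.scene_pos[1]
    score = 1.0 / (1.0 + math.hypot(dx, dy))                                   # claim 3: scene position
    score += 1.0 / (1.0 + abs(candidate.scene_extent - current.scene_extent))  # claim 4: scene extent
    score += 1.0 if candidate.subjects & current.subjects else 0.0             # claim 5: same subject
    score += 0.001 * candidate.bitrate                                         # claim 6: quality
    score += 0.01 * candidate.times_selected                                   # claim 7: popularity
    score += 0.5 if candidate.subjects & user_prefs else 0.0                   # claim 8: preferences
    return score


def select_second_video(current, candidates, user_prefs=frozenset()):
    """Server-side choice of the video most likely to be requested next."""
    others = [c for c in candidates if c.video_id != current.video_id]
    return max(others, key=lambda c: relevance(c, current, user_prefs), default=None)


@dataclass
class TerminalBuffer:
    """Terminal-side store for the prefetched second video (claims 10 and 11)."""
    chunks: list = field(default_factory=list)

    def store(self, chunk: bytes) -> None:
        # Accumulate the second video while the first video is being displayed.
        self.chunks.append(chunk)

    def play_second(self) -> list:
        # Claim 10: the stored second video can be displayed immediately.
        return self.chunks

    def bridge_until_third(self, third_video_ready: bool):
        # Claim 11: while a different (third) video is still being fetched,
        # keep showing the stored second video as a stand-in.
        return None if third_video_ready else self.chunks
```

The weights in relevance() are arbitrary; the claims only require that scene position, scene extent, shared subjects, quality, popularity, or user preferences drive the choice of the second video, and that its transmission begins while the first video is still being distributed.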
PCT/JP2015/001655 2014-04-14 2015-03-24 Image delivery method, image reception method, server, terminal apparatus, and image delivery system WO2015159487A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP15779927.1A EP3133819A1 (en) 2014-04-14 2015-03-24 Image delivery method, image reception method, server, terminal apparatus, and image delivery system
US15/285,736 US10271082B2 (en) 2014-04-14 2016-10-05 Video distribution method, video reception method, server, terminal apparatus, and video distribution system

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
JP2014-082774 2014-04-14
JP2014082774A JP2015204512A (en) 2014-04-14 2014-04-14 Information processing apparatus, information processing method, camera, reception device, and reception method
US201462015601P 2014-06-23 2014-06-23
US62/015,601 2014-06-23
JP2015045352A JP6607433B2 (en) 2014-06-23 2015-03-06 Video distribution method and server
JP2015-045352 2015-03-06

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/285,736 Continuation US10271082B2 (en) 2014-04-14 2016-10-05 Video distribution method, video reception method, server, terminal apparatus, and video distribution system

Publications (1)

Publication Number Publication Date
WO2015159487A1 true WO2015159487A1 (en) 2015-10-22

Family

ID=54323715

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/001655 WO2015159487A1 (en) 2014-04-14 2015-03-24 Image delivery method, image reception method, server, terminal apparatus, and image delivery system

Country Status (1)

Country Link
WO (1) WO2015159487A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10322680A (en) * 1997-05-16 1998-12-04 Nippon Telegr & Teleph Corp <Ntt> Chained multiple viewpoint video reproducing method
JP2011135138A (en) * 2009-12-22 2011-07-07 Canon Inc Video reproducing device and method of controlling the same
JP2011254181A (en) * 2010-05-31 2011-12-15 Nippon Telegr & Teleph Corp <Ntt> Distributed moving picture quality selection device and moving picture distribution device and method and program
JP2012034083A (en) * 2010-07-29 2012-02-16 Canon Inc Video processing apparatus and control method of the same
JP2013183209A (en) * 2012-02-29 2013-09-12 Nagoya Univ System and method for viewing multi-viewpoint video stream

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3133819A4 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105915839A (en) * 2015-12-07 2016-08-31 乐视云计算有限公司 Multi-channel video display method of broadcast instructing platform and multi-channel video display device thereof
JP2021185695A (en) * 2015-12-28 2021-12-09 日本電気株式会社 Monitoring support system, monitoring support method, and program
JP7444228B2 (en) 2015-12-28 2024-03-06 日本電気株式会社 program
JP7363942B2 (en) 2015-12-28 2023-10-18 日本電気株式会社 Programs, information transmission methods, computer-readable storage media, and information transmission systems
JP7188513B2 (en) 2015-12-28 2022-12-13 日本電気株式会社 MONITORING SYSTEM, MONITORING METHOD, AND PROGRAM
EP3413570A4 (en) * 2016-02-03 2019-01-23 Panasonic Intellectual Property Management Co., Ltd. Video display method and video display device
JP2019514236A (en) * 2016-02-03 2019-05-30 ソニー株式会社 System and method for capturing still and / or moving scenes using multiple camera networks
US11223821B2 (en) 2016-02-03 2022-01-11 Panasonic Intellectual Property Management Co., Ltd. Video display method and video display device including a selection of a viewpoint from a plurality of viewpoints
JP2017139725A (en) * 2016-02-03 2017-08-10 パナソニックIpマネジメント株式会社 Image display method and image display device
CN105959596A (en) * 2016-05-24 2016-09-21 深圳市华泰敏信息技术有限公司 Backup method and device of video source
CN109644265A (en) * 2016-05-25 2019-04-16 佳能株式会社 Control device, control method and storage medium
JP2021182443A (en) * 2019-08-07 2021-11-25 キヤノン株式会社 Transmission device and transmission method, and program
JP7204843B2 (en) 2019-08-07 2023-01-16 キヤノン株式会社 Transmission device, transmission method, and program
JP7474521B2 (en) 2022-04-08 2024-04-25 三栄通信工業株式会社 Information processing device and program

Similar Documents

Publication Publication Date Title
JP6607433B2 (en) Video distribution method and server
JP6948624B2 (en) Video distribution method and server
WO2018030206A1 (en) Camerawork generating method and video processing device
JP7113294B2 (en) Multi-view imaging system
US10271082B2 (en) Video distribution method, video reception method, server, terminal apparatus, and video distribution system
JP7203356B2 (en) Imaging system
JP7223978B2 (en) Calibration device and calibration method
JP6820527B2 (en) Video synchronization device and video synchronization method
WO2015159487A1 (en) Image delivery method, image reception method, server, terminal apparatus, and image delivery system
WO2017134706A1 (en) Video display method and video display device
US10862977B2 (en) Method for sharing photographed images between users
JP7122694B2 (en) Imaging system and calibration method
JP6460105B2 (en) Imaging method, imaging system, and terminal device
JP2017139725A (en) Image display method and image display device
WO2015194082A1 (en) Image processing method and image processing system
WO2015182034A1 (en) Image shooting method, image shooting system, server, image shooting apparatus, and image shooting program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15779927

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2015779927

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015779927

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE