CN112513782A - Display control system, display control method, and display control program
- Publication number
- CN112513782A (application CN201980049828.7A)
- Authority
- CN
- China
- Prior art keywords
- article
- target
- target portion
- display control
- publisher
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
Abstract
The invention provides a display control system, a display control method, and a display control program capable of virtually adding an article to a real space image. The display control system includes: an acquisition unit that acquires three-dimensional position information of a real space; a setting unit that specifies a position of a target portion of an article in the real space using the three-dimensional position information, and sets the target portion at the position; and a display control unit that adds a target object corresponding to the target portion to a position of the target portion in a video, and adds an article object corresponding to the article to an article position of the article in the video, the article position being determined based on user motion information indicating a user motion for moving the article toward the target object.
Description
Technical Field
The invention relates to a display control system, a display control method, and a display control program.
Background
Patent document 1 and the like describe a server that distributes content.
Documents of the prior art
Patent document
Patent document 1: JP 5530557A
Disclosure of Invention
According to one aspect of the present invention, there is provided a display control system including: an acquisition unit that acquires three-dimensional position information of a real space; a setting unit that specifies, using the three-dimensional position information, a position of a target portion that accepts an article in the real space, and sets the target portion at the position; and a display control unit that adds a target object corresponding to the target portion at the position of the target portion specified using the three-dimensional position information in a real space image corresponding to the real space, and adds an article object corresponding to the article at an article position specified based on user motion information indicating a user motion of moving the article toward the target object in the real space image.
According to another aspect of the present invention, there is provided a display control method in which an acquisition unit acquires three-dimensional position information of a real space, a setting unit specifies, using the three-dimensional position information, a position of a target portion that accepts an article in the real space and sets the target portion at the position, and a display control unit adds a target object corresponding to the target portion at the position of the target portion specified using the three-dimensional position information in a real space image corresponding to the real space, and adds an article object corresponding to the article at an article position specified based on user motion information indicating a user motion of moving the article toward the target object in the real space image.
According to another aspect of the present invention, there is provided a display control program that causes a computer to execute the following display control method: a setting unit specifies, using three-dimensional position information of a real space, a position of a target portion that accepts an article in the real space and sets the target portion at the position, and a display control unit adds a target object corresponding to the target portion at the position of the target portion specified using the three-dimensional position information in a real space image corresponding to the real space, and adds an article object corresponding to the article at an article position specified based on user motion information indicating a user motion of moving the article toward the target object in the real space image.
Drawings
Fig. 1 is a diagram showing an overall configuration of a video live broadcast system according to embodiment 1.
Fig. 2 is a block diagram of a control device, a distribution terminal, and a server in a studio.
Fig. 3 is a diagram showing an attribute database.
Fig. 4 is a block diagram of a server and a user terminal.
Fig. 5 (a) is a flowchart of the processing when the cup is set as the target portion, and (b) is a diagram showing a display screen of the publisher terminal when the cup is set as the target portion.
Fig. 6 (a) is a flowchart of a process when the target portion is set by the gesture, and (b) is a diagram showing a display screen of the publisher terminal when the target portion is set by the gesture.
Fig. 7 is a flowchart of the processing performed when the target portion is a virtual object.
Fig. 8 (a) is a diagram showing a display screen of a publisher terminal when selecting a target object, (b) is a diagram showing a display screen of a publisher terminal in a state in which a bucket is selected as a target object, and (c) is a diagram showing a display screen of a publisher terminal in a state in which a publisher holds a bucket as a target object.
Fig. 9 (a) is a flowchart of the video live broadcast process, and (b) is a diagram showing a display screen of the user terminal when the special period starts.
Fig. 10 (a) is a flowchart of the gift-giving process, (b) is a diagram showing a display screen of a user terminal on which an item selection object listing the selectable items is displayed, and (c) is a diagram showing a display screen of a user terminal on which the selected item is displayed.
Fig. 11 (a) is a diagram showing a state in which the user performs a slide operation on the display surface of the user terminal, (b) is a diagram showing display screens of the user terminal and the publisher terminal that display a state in which the item object flies to the target object, and (c) is a diagram showing display screens of the user terminal and the publisher terminal that display a state in which the item object reaches the target object.
Fig. 12 (a) is a flowchart showing the item object display processing, (b) is a diagram showing the display screens of the user terminal and the publisher terminal that display the state in which the publisher leaves the seat, and (c) is a diagram showing the display screens of the user terminal and the publisher terminal that display the states in which the item object and the target object are displayed on the screen.
Fig. 13 (a) is a flowchart showing the processing when the article overflows from the target portion, following fig. 12 (a), (b) is a diagram showing the display screens of the user terminal and the publisher terminal that display a state in which a plurality of articles are held in the container constituted by the palm, and (c) is a diagram showing the display screens of the user terminal and the publisher terminal that display a state in which the articles overflow from the container constituted by the palm.
Fig. 14 (a) is a diagram showing the display screens of the user terminal and the publisher terminal showing a state in which the article has rolled on the table, following fig. 13 (c), (b) is a diagram showing the display screens of the user terminal and the publisher terminal showing a state in which the article has fallen from the table, (c) is a diagram showing the display screens of the user terminal and the publisher terminal showing a state in which the article has rolled on the floor, and (d) is a diagram showing the display screens of the user terminal and the publisher terminal showing a state in which the article has stopped on the floor.
Fig. 15 (a) is a flowchart showing a process in a case where the article does not enter the target portion, and (b) is a diagram showing display screens of the user terminal and the publisher terminal that display a state where the article is out of alignment with the target portion.
Fig. 16 is a diagram showing a display control system in embodiment 2.
Fig. 17 is a block diagram of the head mounted display in embodiment 2.
Detailed Description
Hereinafter, a video live broadcast system to which the present invention is applied will be described with reference to fig. 1 to 17.
[ embodiment 1 ]
[ overview of video live broadcast System ]
As shown in fig. 1, a video live broadcast system 1 includes: a studio 10 in which a publisher X shoots video for distribution; a server 40 that distributes the video acquired in the studio 10 by live broadcasting; and user terminals 50, 70, and 80 of users A, B, and C who view the distributed video delivered by the server 40. The server 40 and the user terminals 50, 70, and 80 are connected via the network 2. The number of user terminals is not limited to the 3 shown here, and may be 1, or may be several tens or several hundreds.
In the real space of the studio 10, the publisher X, who is the subject, is photographed, for example. The shooting may be performed by a third person such as a photographer, or may be self-shooting. The number of publishers X photographed in the studio 10 is 1 here, but may be more than one. The publisher X is an ordinary person, a performer (an idol, a celebrity, an artist, a model, a voice actor, a musician, an athlete, or the like), an amateur aiming to become such a performer, a group (band) of professionals or amateurs, or a member of such a group. For example, the publisher X is a performer giving a live performance, and is a person actually present in the studio 10. The publisher X may also be displayed as an avatar, that is, a character representing the publisher X in the real space image. The publisher X is a recipient who accepts items provided by the users A, B, and C. The subject may be a public facility instead of the publisher X. Examples of such public facilities include religious facilities such as shrines and temples, and monuments such as bronze statues (statues of people, animals, Buddha statues, and the like). When the subject is the publisher X, the studio 10 may be, for example, a shooting studio, a karaoke room, or a room in the home of the publisher X. The present invention may also be used outdoors, such as in stadiums for track and field, soccer, baseball, and the like, or indoors, such as in gymnasiums or concert halls. The studio 10 may be a concert hall or an event hall, in which case the publisher X is, for example, a performer playing live on a stage. The studio 10 where the publisher X is located is a place separated from the place where the users A, B, and C actually are. In addition, for example, when the subject is a religious facility, a monument, or the like, and a target portion such as an offering box or a donation box is set in front of the facility, the users A, B, and C can virtually toss offering money toward the religious facility or monument that is far from where they are.
The studio 10 includes: a microphone 11; an RGB camera 12; a depth camera 13; a studio monitor 14; a speaker 15; a haptic device 16; a control device 17; smart devices, etc. publisher terminals 18. In addition, the publisher X is illuminated by the lighting 9 on the table 6.
The microphone 11 collects dialogue, environmental sounds, and the like of the publisher X. The microphone 11 collects singing voice and the like matching the karaoke data of the publisher X. The RGB camera 12 is a 1 st camera arranged in the studio 10 and configured to photograph a real space in the studio 10. The RGB camera 12 is a digital camera having a moving image (video) capturing function, for example, and is a video camera, for example. The RGB camera 12 has an image sensor such as a CCD or CMOS, detects light such as visible light, and outputs display data composed of color signals of 3 colors (red, green, and blue). As an example, the RGB camera 12 photographs the studio 10 or the publisher X. The video, which is a real space image captured by the RGB camera 12, is displayed as viewing data on the display surface of the publisher terminal 18, the studio monitor 14, the user terminals 50, 70, 80, or the like, by the server 40. In addition, as an example, the video captured by the RGB camera 12 is displayed on the publisher terminal 18 and the studio monitor 14 as display data for confirming the publisher X. The number of the RGB cameras 12 may be 1 or more in the studio 10. When a plurality of RGB cameras 12 are used, the RGB cameras are provided at a plurality of positions so that the imaging directions are different from one another. This enables the studio 10 to be imaged in full coverage, and for example, the lower part of a table or the lower part of a chair can also be imaged. In this case, the plurality of RGB cameras 12 are provided such that the outer periphery of the imaging range of one RGB camera 12 overlaps the outer periphery of the imaging range adjacent to the imaging range, that is, such that no gap is formed between the imaging ranges.
As an example, the depth camera 13 is a 2nd camera for acquiring three-dimensional position information in the video live broadcast system 1. For example, the depth camera 13 is an infrared camera. The infrared camera includes: a light projecting unit that projects infrared rays; and an infrared detection unit that detects infrared rays. The depth camera 13 acquires three-dimensional position information, such as depth information of the real space, from the time until an infrared pulse projected from the light projecting unit is reflected and returned. The depth camera 13 acquires depth information as the distance from the camera itself to the subject, for example, the distance to the publisher X. For example, the position of the publisher X or the position of the target portion into which the user puts an article can be specified by a three-dimensional coordinate system having the detection unit of the depth camera 13 as the origin. The RGB camera 12 and the depth camera 13 may be integrated or may be separate devices. One depth camera 13 or a plurality of depth cameras 13 may be provided in the studio 10. When a plurality of depth cameras 13 are used, they are provided at a plurality of positions so that the imaging directions differ from one another, so that the three-dimensional coordinates of the studio 10 can be detected with full coverage. For example, three-dimensional position information of the space under the table 6, under a chair, and the like can also be acquired. In this case, the plurality of depth cameras 13 are provided such that the outer periphery of the detection range of one depth camera 13 overlaps the outer periphery of the adjacent detection range, that is, such that no gap is formed between the detection ranges.
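As a rough illustration of the coordinate system mentioned above (origin at the depth camera's detection unit), the sketch below back-projects a single depth measurement into a 3D point. The pinhole intrinsics and all numeric values are assumptions made for illustration and do not come from the patent.

```python
# Illustrative sketch only: converting a depth-camera measurement (pixel row/column plus
# depth) into a 3D point in a coordinate system whose origin is the camera's detection
# unit. The intrinsics (fx, fy, cx, cy) are assumed values.
from dataclasses import dataclass

@dataclass
class DepthIntrinsics:
    fx: float  # focal length in pixels (x)
    fy: float  # focal length in pixels (y)
    cx: float  # principal point (x)
    cy: float  # principal point (y)

def pixel_to_camera_point(u: int, v: int, depth_m: float, k: DepthIntrinsics):
    """Back-project a depth pixel to (x, y, z) metres in the depth camera's frame."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)

# Example: a point seen at pixel (320, 240) that is 1.5 m away.
point = pixel_to_camera_point(320, 240, 1.5, DepthIntrinsics(525.0, 525.0, 319.5, 239.5))
```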
The studio monitor 14 is disposed in the studio 10 as a real space, and is a display device for displaying images, and is a display device mainly installed on a wall, a desk, or the like of the studio 10 so as to be visually confirmed by the publisher X, as an example. The studio monitor 14 is, for example, a large device as opposed to a smart device. The studio monitor 14 is, for example, a flat panel display, and is a CRT, LCD, or organic EL display device. As an example, the studio monitor 14 displays a live image (excluding the article object 3 and the target object 5 described later) captured by the RGB camera 12 at the time of preparation before distribution. The studio monitor 14 displays, as an example, a live video (including an article object 3 and a target object 5, which will be described later) distributed by the server 40. The speaker 15 plays the voice data of the publisher X collected by the microphone 11, the music data reproduced by the control device 17, and the like. In addition, the speaker 15 plays sound data including the sound of the user A, B, C transmitted from the user terminal 50, 70, 80. This allows the publisher X to hear the reaction of the user A, B, C.
The haptic device 16 is a device worn by the publisher X that gives skin-sensation feedback to the publisher X by mechanical stimulation. For example, the haptic device 16 drives an eccentric motor actuator, a linear resonance actuator, a piezoelectric actuator, or the like to generate physical vibration, and feeds back at least one of a sense of force, a sense of touch, and a sense of pressure to the publisher X as the wearer through the vibration. As another example, the haptic device 16 feeds back at least one of the sense of force, the sense of touch, and the sense of pressure to the publisher X through a pattern of electrical stimulation applied to the skin.
An olfactory feedback device that feeds back a sense of smell may also be provided in addition to the haptic device 16. For example, the olfactory feedback device may be configured such that one or more cartridges containing perfume, a sheet serving as a fragrance source, or the like are attached to a head-mounted device such as a head mounted display or smart glasses, and a minute amount of fragrance is discharged from a spray opening toward the area below the nose.
The control device 17 is a communication device that is connected to various devices such as the microphone 11, the RGB camera 12, the depth camera 13, the studio monitor 14, the speaker 15, and the haptic device 16, inputs and outputs data, and communicates with the server 40. The control device 17 is a reproduction device for a recording medium such as a CD, DVD, BD, or semiconductor memory, and reproduces music data, video data, or the like recorded on the recording medium. The control device 17 also reproduces music data, video data, and the like distributed via the network. As an example, the control device 17 reproduces karaoke data (accompaniment data). The control device 17 mixes the sound data from the microphone 11, the video data from the RGB camera 12, and the like to generate a live video, and transmits the three-dimensional position information from the depth camera 13 to the server 40. For example, the control device 17 encodes a live video for distribution in a predetermined moving image compression method, and transmits the encoded live video to the server 40. The control device 17 receives and decodes the live video 4 transmitted from the server 40.
The publisher terminal 18 is, for example, a smart device sized to be held in one hand, and includes a display unit having a touch panel, a microphone, a speaker, an imaging unit, a communication unit, a storage unit such as a nonvolatile memory, and the like. As an example, the publisher terminal 18 displays the same video as the studio monitor 14. As an example, the publisher terminal 18 displays the live video 4 distributed from the server 40.
The publisher terminal 18 having such a configuration can function as at least one of the control device 17, the microphone 11, the RGB camera 12, the studio monitor 14, and the speaker 15. Further, when the publisher terminal 18 also has a depth camera, it can function in place of the depth camera 13. As described above, by using the publisher terminal 18 in place of the devices provided in the studio 10, live video for distribution can be conveniently shot in a karaoke room, at home, outdoors, or the like, even if the publisher X does not shoot in the dedicated studio 10. When the publisher terminal 18 is used as the RGB camera 12, fixing the publisher terminal 18 to a tripod or the like makes it possible to capture a live video while preventing camera shake. In this case, the publisher terminal 18 can exchange data with the server 40 without passing through the control device 17.
The server 40 is a device that manages the video live broadcast system 1, and manages the live video, the publisher X who performs the live broadcast, and the users who view the live video 4. The server 40 also manages the items that the users provide to the publisher X. For example, the server 40 distributes the live video 4 to the user terminals 50, 70, and 80. For example, the server 40 adds an item object 3, corresponding to an item that a user has provided to the publisher X, to the live video captured by the RGB camera 12 to generate the live video 4, and distributes the video to the user terminals 50, 70, and 80. In addition, as an example, a target portion used when a user provides an item to the publisher X is set for the live video. In fig. 1, the target portion is a container formed by the palms of the publisher X, that is, by a part of the publisher X, and this container becomes the target object 5; however, the publisher X himself may be the target object 5. The head of the publisher X may also be set as the target object 5. The target object 5 may also be a bucket, a cup, a trash can, a decorated box, or the like held by the publisher X, or an offering box or the like placed on the table 6 in front of the publisher X. When the target portion is a real object located in the studio 10, the image of that real object is the target object 5 of the target portion. When the target portion is a virtual object that does not actually exist in the studio 10, an image representing the virtual object is set as the target object 5. The target position of the target portion, that is, the position at which the target object 5 is displayed, is determined from the three-dimensional position information based on the depth information detected by the depth camera 13. For example, the target position is specified in a three-dimensional coordinate system with the detection unit of the depth camera 13 as the origin. As an example, since the target portion is a container into which an article is put, the target position is defined as a three-dimensional space having a volume (capacity) for accommodating the article. For example, the server 40 adds the target object 5 to the live video captured by the RGB camera 12 to generate the live video 4, and distributes the live video to the user terminals 50, 70, and 80. In addition, as an example, the server 40 generates a live video 4 in which the article object 3 has entered the target object 5.
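The following is a hedged sketch of how a target portion with a position and an accommodating volume could be represented and queried server-side. The axis-aligned box shape, the metric units, and the names such as TargetRegion are assumptions for illustration, not definitions from the patent.

```python
# Minimal sketch of a target portion as a labelled 3D region in the depth camera's frame,
# with a simple containment test for an article position.
from dataclasses import dataclass

@dataclass
class TargetRegion:
    label: str                            # e.g. "palm container", "bucket on table"
    center: tuple[float, float, float]    # metres, depth-camera origin
    size: tuple[float, float, float]      # width, height, depth of the accommodating space

    def contains(self, p: tuple[float, float, float]) -> bool:
        return all(abs(p[i] - self.center[i]) <= self.size[i] / 2 for i in range(3))

palm_container = TargetRegion("palm container", center=(0.1, -0.2, 1.2), size=(0.15, 0.08, 0.15))
print(palm_container.contains((0.12, -0.22, 1.18)))  # True: the article lies inside the region
```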
For example, the users A, B, and C who participate in the video live broadcast system 1 are persons who are interested in the publisher X or supporters such as fans of the publisher X. They are located in places separated from the studio 10, and can view the live video 4 using the user terminals 50, 70, and 80. The users A, B, and C are the persons who toss items to the publisher X (item providers, gift givers). As an example, the user terminal 50 of the user A includes: a desktop or notebook personal computer 50a; and a smart watch 60, which is a wearable smart device terminal connected to the personal computer 50a. When the personal computer 50a is a desktop computer, the user terminal 50 includes the personal computer 50a with a desktop monitor and the smart watch 60 connected to the personal computer 50a. When the personal computer 50a is a notebook computer, the user terminal 50 includes the notebook personal computer 50a with a display unit and the smart watch 60 connected to the notebook personal computer 50a. For example, the user A of the user terminal 50 wears the smart watch 60 on the dominant hand or the like, and the smart watch 60 is connected to the personal computer 50a by wire or wirelessly. The smart watch 60 includes a detection unit such as an acceleration sensor or a gyro sensor, and detects, as user motion information, the acceleration, angle (posture), and angular velocity of the user A when, for example, the user A performs a throwing motion.
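As a hedged sketch of the user motion information just described, the snippet below bundles acceleration, angle (posture), and angular velocity samples into a message for the server. The field names and the JSON transport are assumptions; the patent only specifies which quantities are detected.

```python
# Sketch of packaging one detected throwing motion from the smart watch 60 for the server.
import json, time

def build_motion_packet(user_id: str, accel, angles, gyro) -> str:
    """Bundle one detected throw into a JSON message (assumed format) for the server."""
    return json.dumps({
        "user_id": user_id,
        "timestamp": time.time(),
        "acceleration": accel,        # (ax, ay, az) in m/s^2
        "angle": angles,              # (roll, pitch, yaw) in degrees
        "angular_velocity": gyro,     # (gx, gy, gz) in deg/s
    })

packet = build_motion_packet("userA", (0.4, 7.8, 2.1), (5.0, 32.0, 1.5), (10.0, 240.0, 3.0))
```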
Further, the personal computer 50a may be connected to a head mounted display (HMD) by wire or wirelessly. The HMD may have the same configuration as the personal computer 50a. HMDs include optical see-through head mounted displays, video see-through head mounted displays, non-transmissive head mounted displays, and the like. With an optical see-through head mounted display, display based on AR (augmented reality) is possible. With a video see-through or non-transmissive head mounted display, display based on VR (virtual reality) is possible. The personal computer 50a may also be an eyeglass-type information processing terminal, such as smart glasses connected to a computer.
The user terminal 70 of the user B is, for example, a smart device terminal such as a smart phone or a tablet pc, and is a portable small-sized information processing terminal. A smart phone, for example, has a touch panel on a display surface. The user terminal 70 includes a detection unit such as an acceleration sensor or a gyro sensor. As an example, when the user B performs an operation of throwing an object, the user terminal 70 detects acceleration, angle, and angular velocity thereof as user operation information, and transmits the user operation information to the server 40. Further, instead of the motion of the user B throwing the object, when a slide operation of drawing in the direction of the displayed target object 5 is performed with a finger or a stylus on the display surface on which the live video image 4 is displayed, the user terminal 70 transmits operation data such as the coordinate data to the server 40. Since the user terminal 70 is a portable small-sized information processing terminal, the user B of the user terminal 70 can view the live video 4 of the publisher X wherever he is.
The user terminal 80 of the user C includes a smart device terminal 70a and the smart watch 60. In this case, the smart device terminal 70a plays the role of the notebook personal computer 50a of the user A. Thus, even when the user C performs the throwing motion with the dominant hand wearing the smart watch 60, the image displayed on the display unit of the smart device terminal 70a held in the other hand can be visually confirmed. The throwing motion can also be performed with the smart device terminal 70a placed on a desk or the like or fixed to a tripod. This makes it possible to perform the throwing motion with the dominant hand wearing the smart watch 60 while viewing the image displayed on the display unit of the smart device terminal 70a. The user terminal 80 can also transmit user operation information to the server 40 by a slide operation on the display surface toward the target object 5, without using the smart watch 60.
The user terminals 50, 70, and 80 can be used to watch the live video 4 and to give items to the publisher X who is actually appearing at that time. For example, a list of items that can be given to the publisher X is displayed on the display surfaces of the user terminals 50, 70, and 80 together with the live video 4, and the users A, B, and C can each select one item from the list. The selected item object 3 is then displayed on the display surface of the user terminal 50, 70, or 80. In fig. 1, the article is a gem, but it may also be an ornament such as a bouquet or a headband, a coin, or the like, or may be a background image or the like that adds a special effect to the motion of the publisher X or to the place where the publisher X is located when viewed on the user terminals 50, 70, and 80.
Thereafter, the user A of the user terminal 50 swings the wrist wearing the smart watch 60 and performs a throwing motion aimed at the target object 5. The user B of the user terminal 70 swings the wrist holding the user terminal 70 and performs a throwing motion aimed at the target object 5. The user C of the user terminal 80 swings the wrist wearing the smart watch 60 and performs a throwing motion aimed at the target object 5. The user terminals 50, 70, and 80 then transmit the detection results, that is, motion data such as acceleration data, angle (posture) data, and angular velocity data, to the server 40 as user motion information. Alternatively, on the user terminals 70 and 80, the user performs a slide operation with a finger or a stylus toward the target object 5 displayed on the display surface on which the live video 4 is displayed, and the terminals transmit operation data such as the coordinate data to the server 40 as user operation information.
The server 40 adds the item object 3 of the item transmitted from the user terminals 50, 70, and 80 to the live video so that its state can be visually confirmed on the user terminals 50, 70, and 80, the studio monitor 14, and the publisher terminal 18. The display position (article position) of the article object 3 is calculated by the server 40 based on motion data such as acceleration data, angle data, and angular velocity data, which are the user motion information transmitted from the user terminals 50, 70, and 80. The data transmitted as the user motion information from the user terminals 50, 70, and 80 need only include at least acceleration data, because the flying distance of the thrown article can be calculated from the acceleration data (see the sketch below). When a slide operation toward the target object 5 is performed on the display surface on which the live video 4 is displayed, the display position (article position) of the article object 3 is calculated by the server 40 based on the coordinate data of the operation and the like.
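The following is a hedged sketch of one way a flying distance could be derived from acceleration data, not the patented algorithm: integrate the acceleration over the swing to estimate a release speed, then apply an ideal projectile-range formula. The sample rate, launch angle, and gravity handling are all assumptions.

```python
# Sketch: acceleration samples from a throwing motion -> estimated flying distance.
import math

def estimate_release_speed(accel_samples_ms2: list[float], dt: float) -> float:
    """Integrate forward acceleration over the swing to approximate the release speed."""
    return sum(a * dt for a in accel_samples_ms2)

def flying_distance(release_speed: float, launch_angle_deg: float = 30.0, g: float = 9.81) -> float:
    """Ideal projectile range for the estimated release speed (assumed launch angle)."""
    theta = math.radians(launch_angle_deg)
    return release_speed ** 2 * math.sin(2 * theta) / g

speed = estimate_release_speed([2.0, 6.0, 9.0, 5.0], dt=0.02)  # roughly 0.44 m/s for a gentle flick
print(flying_distance(speed))                                  # a short toss of a few centimetres
```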
For example, when the user A, B, or C performs a motion or operation of tossing an article lightly, the article position at which the article object 3 is displayed is a position short of the target portion and the publisher X. When the user A, B, or C performs a motion or operation of throwing the article more strongly, the article position is a position relatively close to the front of the target portion and the publisher X. When the user A, B, or C throws the article too strongly, the article position is a position in front of or behind the target portion and the publisher X after the article bounces off a wall located behind them. When the user A, B, or C performs a motion or operation of throwing the article to the right or left of the publisher X or the target portion, the article position is a position to the right or left of the target portion or the publisher X, in accordance with the direction in which the user threw the article. When detecting the throwing motion of the user A, B, or C, it is preferable to set the front direction in advance and to detect right and left with respect to the set front direction. In particular, when the user B uses a smartphone as the user terminal 70, the terminal is not fixed, so it is preferable to perform a setting process for determining the front direction in advance. By displaying these states on the publisher terminal 18 and the studio monitor 14, the publisher X can visually confirm that articles are being thrown in the direction of the publisher X by the users A, B, and C, who are not actually present in the studio 10. In addition, even when an article misses the publisher X or the target portion, it is displayed in the live video 4 on the user terminals 50, 70, and 80. Thus, the users A, B, and C can confirm whether or not the article was thrown in the direction of the publisher X or the target portion.
When a plurality of users A, B, and C watching the same live video 4 throw articles to the publisher X, the articles thrown by the other users are also displayed. This enables the users to compete over the number of articles they have thrown to the publisher X.
When the position of the article object 3 overlaps the target position of the target object 5, the server 40 displays the article object 3 in the live video 4 in a state of having entered the target object 5. In addition, for example, when the target portion is a bucket, a cup, a trash can, a decorated box, or the like actually held by the publisher X, the server 40 changes the target position in accordance with the movement of the publisher X when the publisher X moves the hand or wrist. When the hand or wrist moves away after a real bucket, cup, trash can, decorated box, or the like has been placed on the table 6, the target portion becomes the bucket, cup, trash can, decorated box, or the like on the table 6. In addition, for example, when the target portion is a virtual object, the server 40 sets the position at which the virtual object is to be displayed as the target position, and displays the target object 5 at the target position in the live video 4. When the publisher X is set to hold the virtual object, the virtual object is moved in accordance with the movement of the hand or wrist of the publisher X. When the publisher X holds a container or the like serving as the target portion, the server 40 drives the haptic device 16 so that the publisher X wearing the haptic device 16 senses the weight of the articles. That is, the server 40 drives the haptic device 16 so that the publisher X feels a heavier weight when there are two articles in the container than when there is one. In addition, the server 40 drives the haptic device 16 so that, when an article rolls in the container, the vibration of the rolling is transmitted to the publisher X.
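A hedged sketch of the haptic feedback idea above follows: drive the actuator of the haptic device 16 with an intensity proportional to the total weight of the article objects currently in the held container, plus a short vibration burst while an article is rolling. The driver call (drive_actuator) and the 500 g saturation point are hypothetical placeholders, not interfaces defined in the patent.

```python
# Sketch of weight- and rolling-dependent haptic feedback for the publisher's haptic device.
def drive_actuator(intensity: float, pattern: str = "constant") -> None:
    print(f"actuator intensity={intensity:.2f} pattern={pattern}")  # stand-in for real hardware I/O

def update_haptics(item_weights_g: list[float], rolling: bool) -> None:
    weight_intensity = min(1.0, sum(item_weights_g) / 500.0)  # saturate at an assumed 500 g
    drive_actuator(weight_intensity, pattern="constant")
    if rolling:
        drive_actuator(0.3, pattern="pulse")  # extra vibration while an article rolls in the container

update_haptics([120.0, 120.0], rolling=True)   # two items feel heavier than one
```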
For example, the depth camera 13 of the studio 10 continuously calculates three-dimensional position information, such as depth information, for various places in the studio 10. The depth camera 13 extracts the person region of the publisher X and distinguishes it from the non-person region. The depth camera 13 acquires 25 skeleton positions of the publisher X as skeleton data, and calculates depth information for each skeleton position. The skeleton positions include, for example, the positions of the left and right hands, the head, the neck, the left and right shoulders, the left and right elbows, the left and right knees, and the left and right feet. The number of skeleton positions acquired is not limited to 25. In addition, the depth camera 13 calculates the distances to the walls, the table, the floor, and the like in the studio 10. The depth camera 13 also calculates the distance to the target portion (a cup, a bucket, a container formed by the palms of the publisher X, or the like). Here, the depth information is, for example, the distance from the objective lens or sensor surface at the front of the depth camera 13 to a measurement target position (each point on a wall or on the floor of the studio 10). The depth information is also, for example, the distance from the objective lens or sensor surface at the front of the depth camera 13 to a skeleton position of the publisher X, who is the subject.
In addition, the depth information is, for example, the distance from the objective lens or sensor surface at the front of the depth camera 13 to an object such as a cup serving as the target portion. The depth information is, for example, the distance from the objective lens or sensor surface at the front of the depth camera 13 to the palm of the publisher X serving as the target portion. The depth information is the distance to each of a plurality of points existing on the target portion. In this manner, by subdividing the distances to the individual points of the target portion, the depth camera 13 can determine the shape and orientation of the real object serving as the target portion, such as a cup or a palm, and the opening through which an article enters. The greater the number of points on the target portion, the more accurately the shape, orientation, article opening, and the like of the target portion can be determined.
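As an illustration only: with several depth points sampled on the rim of the target container (a cup, the palms, etc.), the orientation of the opening can be approximated by the normal of the plane through three of those rim points. The specific points are assumptions; the patent itself only states that more points give a more accurate shape and orientation.

```python
# Sketch: estimating the container opening's orientation from three rim points.
def subtract(a, b):
    return tuple(a[i] - b[i] for i in range(3))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def opening_normal(p0, p1, p2):
    """Normal vector of the plane spanned by three rim points of the container opening."""
    return cross(subtract(p1, p0), subtract(p2, p0))

# Three points on a cup rim roughly 1.2 m from the camera; the opening faces the camera's -y axis.
print(opening_normal((0.00, -0.10, 1.20), (0.05, -0.10, 1.23), (-0.05, -0.10, 1.23)))
```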
When the article position at which the article is displayed overlaps the target position at which the target object 5 is displayed by a predetermined ratio, the server 40 determines that the article has entered the target portion. The larger the predetermined ratio, the more easily the article enters the target portion; the smaller the ratio, the harder it is for the article to enter. The server 40 displays the manner in which the article enters the container in the live video 4 in accordance with the attributes (size, shape, surface roughness, and the like) of the container serving as the target portion and the attributes (size, shape, surface roughness, and the like) of the article. When the target portion is a real object existing in the real space, the attributes of the target portion follow the attributes of the real container serving as the target portion; in this case the attributes are, for example, measurement values measured in advance by a sensor. When the article misses the target portion, the article bounces off the table 6 on which the target portion is placed or off a wall, and rolls on the table 6 or the floor. The server 40 displays the article object 3 in the live video 4 in a rolling or bouncing state in accordance with the attributes of the place (the table 6, a wall, the floor) with which the article comes into contact. When the article overflows from or falls out of the container, the article object 3 is likewise displayed in the live video 4 rolling or tumbling in accordance with the attributes of the place (the table 6, a wall, the floor) with which it comes into contact. As an example, the live video 4 shows the article rolling far on a hard, flat wooden floor, and hardly rolling at all on a floor with a rough surface such as a carpet.
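The patent only states that the article is judged to have entered when the article position and the target position overlap at a predetermined ratio, and that a larger ratio makes entry easier. One way to model that, purely as an assumption, is to treat the ratio as a scale factor on the target's acceptance region, as sketched below.

```python
# Sketch (assumed model, not the patent's algorithm) of the "predetermined ratio" test,
# with the predetermined ratio scaling the target acceptance box: larger ratio -> easier entry.
def scaled_box(center, size, ratio):
    """Target acceptance box enlarged by the predetermined ratio."""
    half = [s * ratio / 2 for s in size]
    return ([c - h for c, h in zip(center, half)], [c + h for c, h in zip(center, half)])

def article_entered(article_pos, target_center, target_size, ratio=1.0) -> bool:
    """True when the article position lies inside the (scaled) target region."""
    lo, hi = scaled_box(target_center, target_size, ratio)
    return all(lo[i] <= article_pos[i] <= hi[i] for i in range(3))

# An article 6 cm to the side of a 10 cm target: misses at ratio 1.0, enters at ratio 1.5.
print(article_entered((0.06, 0.0, 0.0), (0.0, 0.0, 0.0), (0.10, 0.10, 0.10), ratio=1.0))  # False
print(article_entered((0.06, 0.0, 0.0), (0.0, 0.0, 0.0), (0.10, 0.10, 0.10), ratio=1.5))  # True
```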
When the article does not enter the container, or when the article overflows from or falls out of the container, the article often leaves the angle of view (imaging range) of the RGB camera 12 that is imaging the publisher X sitting in front of the table 6 in the studio 10. In addition, when the publisher X goes to pick up such an article, the publisher X and the container may leave the angle of view (imaging range) of the RGB camera 12. When three-dimensional position information of the entire studio 10 is obtained by using a plurality of depth cameras 13 or the like, the server 40 holds the position of the article using three-dimensional position information determined from the detection results of the user terminals 50, 70, and 80, that is, from motion data such as acceleration data, angle data, and angular velocity data, or from operation data such as the coordinate data of a slide operation. Therefore, for example, when the publisher X moves the RGB camera 12 so that it captures the place to which the article has rolled, the rolled article object 3 is added to the live video 4. When the publisher X or the container once leaves the angle of view of the RGB camera 12 and then comes back into the angle of view without the RGB camera 12 being moved, the target object 5 and the article object 3 are displayed in the live video 4 from that point in time. In addition, when articles overflow from or fall out of the container, the container serving as the target portion becomes correspondingly lighter. The server 40 drives the haptic device 16 so that the container feels lighter by an amount corresponding to the decrease of articles in the container.
Further, the article or the container may move out of the detection range of the depth camera 13. In this case, the server 40 holds the target position of the target portion and the article position as of the time they left the detection range, and updates them to the positions at which they return when they come back into the detection range of the depth camera 13. In addition, the server 40 drives the haptic device 16 in accordance with the state of the container or the article at the time of returning to the detection range.
[ depth camera 13]
For example, the depth camera 13 includes a light projecting unit, such as a projector, that projects pulsed infrared light, and an infrared detection unit, such as an infrared camera, and calculates depth information from the time until the projected infrared pulse is reflected and returned (time-of-flight (TOF) method). For example, the depth camera 13 continuously calculates three-dimensional position information, such as depth information, for each position in the studio 10.
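The time-of-flight relation used above can be written out as a tiny helper: the distance to the reflecting surface is half of the speed of light multiplied by the measured round-trip time of the infrared pulse. This is a generic illustration, not an implementation detail from the patent.

```python
# Sketch of the TOF depth relation: depth = (speed of light x round-trip time) / 2.
SPEED_OF_LIGHT_M_S = 299_792_458.0

def tof_depth_m(round_trip_time_s: float) -> float:
    """Depth implied by the measured round-trip time of an infrared pulse."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0

print(tof_depth_m(10e-9))  # a 10 ns round trip corresponds to roughly 1.5 m
```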
The depth camera 13 extracts the person region of the publisher X and distinguishes it from the non-person region. For example, the person region is calculated from the difference between images of the same place (for example, the studio 10) captured with and without the person present. As another example, a region in which the detected infrared amount exceeds a threshold is determined to be the person region.
The depth camera 13 also detects the skeleton positions. The depth camera 13 acquires depth information at each location in the person region, identifies the body parts of the person captured in the person region (the left and right hands, the head, the neck, the left and right shoulders, the left and right elbows, the left and right knees, the left and right feet, and the like) in real space based on feature values of depth and shape, and calculates the center position of each part as its skeleton position. Using a feature-amount dictionary stored in the storage unit, the depth camera 13 identifies each part in the person region by comparing the feature amount registered for each part in the dictionary with the feature amount determined from the person region. The depth camera 13 also detects the unevenness of the person (such as the shape of the palm, wrinkles of clothing, and the unevenness of the hairstyle).
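A hedged sketch of two of the steps above follows: (1) mark pixels whose detected infrared amount exceeds a threshold as the person region, and (2) take the centre of the pixels assigned to a body part as that part's skeleton position. The threshold value is an assumption, and the part assignment (here passed in precomputed) stands in for the feature-amount-dictionary matching the patent describes.

```python
# Sketch: person-region extraction by infrared threshold and skeleton position as a part centre.
def person_region(ir_image: list[list[float]], threshold: float = 0.6) -> list[tuple[int, int]]:
    """Pixels whose infrared amount exceeds the threshold are treated as the person region."""
    return [(r, c) for r, row in enumerate(ir_image) for c, v in enumerate(row) if v > threshold]

def skeleton_position(part_pixels: list[tuple[int, int]]) -> tuple[float, float]:
    """Centre of the pixels classified as one body part (e.g. the left hand)."""
    rs, cs = zip(*part_pixels)
    return (sum(rs) / len(rs), sum(cs) / len(cs))

region = person_region([[0.1, 0.7, 0.8], [0.2, 0.9, 0.1]])
print(skeleton_position(region))  # centre of the detected pixels
```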
The depth camera 13 may output the detection results of the infrared detection unit to another device (the control device 17 provided in the studio 10, the server 40, the user terminals 50, 70, and 80, or the like), and processing such as calculating the depth information, extracting the person region, classifying into person and non-person regions, detecting the skeleton positions, and identifying each part in the person region may be performed by that other device.
The above motion capture processing may be performed without attaching markers to the publisher X, or may be performed with markers attached to the publisher X.
In addition, when calculating the depth information, a method (optical coding method) may be employed in which the projected infrared pattern is read and the depth information is acquired based on a deformation of the pattern.
Further, the depth information may be calculated from parallax information obtained by a stereo camera or a plurality of cameras. The depth information can also be calculated by performing image recognition on the video acquired by the RGB camera 12 and analyzing the images using a photogrammetric technique or the like. In this case, the RGB camera 12 functions as the detection unit, so the depth camera 13 is not necessary.
The depth information may also be calculated by combining the above methods.
[ control device 17]
As shown in fig. 2, the control device 17 that controls each device of the studio 10 has an interface (hereinafter, simply referred to as "IF") with each part of the studio 10, and includes an RGB camera IF 21; a depth camera IF 22; haptic device IF 23; an audio IF 24; display IF 25; a network IF 26; a communication IF 27; and a control unit 28.
The RGB camera IF21 controls the RGB camera 12, and inputs the video captured by the RGB camera 12. The depth camera IF22 controls the depth camera 13, and inputs three-dimensional position information from the depth camera 13. The haptic device IF23 controls the haptic device 16. That is, the haptic device 16 has an actuator 16a, and the haptic device IF23 drives the actuator 16 a. The audio IF24 can be connected to a sound output device such as an earphone or a headphone, in addition to the speaker 15. The audio IF24 is connected to an audio input device such as a microphone 11. The display IF25 is connected to the studio monitor 14 as an example. For example, the network IF26 communicates with the server 40 via the network 2 by wire or wirelessly. As an example, the communication IF27 communicates with the publisher terminal 18 by wire or wirelessly.
The control unit 28 includes, for example, a CPU, a ROM, and a RAM, and transmits, to the server 40 via the network IF26, the video captured by the RGB camera 12, the live video (not including the article object 3 and the target object 5) mixed with the audio data collected by the microphone 11, the three-dimensional position information, and the like. The control unit 28 displays the live video on the studio monitor 14 via the display IF25 and on the display unit of the publisher terminal 18 via the communication IF27. The control unit 28 receives the live video 4 (including the article object 3 and the target object 5) distributed from the server 40 via the network IF26, and displays the received video on the studio monitor 14 via the display IF25. The control unit 28 also displays it on the display unit of the publisher terminal 18 via the communication IF27.
[ publisher terminals 18]
The publisher terminal 18 is a device managed by the publisher X, and is a smart device terminal such as a smartphone or a tablet computer. The publisher terminal 18 includes: a display unit 31; a speaker 32; an operation unit 33; a network IF34; a data storage unit 35; a main memory 36; and a control unit 37. The display unit 31 is a liquid crystal display panel or an organic EL panel, and the operation unit 33 is a touch panel provided on the display unit 31. The network IF34 exchanges data with the control device 17. The data storage unit 35 is a nonvolatile memory and stores a program for viewing the live video 4. The main memory 36 is, for example, a RAM, and temporarily stores the live video 4 being distributed, the control program, and the like. The control unit 37 is, for example, a CPU, and controls the overall operation of the publisher terminal 18. For example, when preparing a live video, the control unit 37 transmits, to the server 40, selection data that specifies the target portion in the live video and the target object for that target portion, selected from one or more items in a list. When playing the live video 4, the control unit 37 displays the live video 4 distributed from the server 40 on the display unit 31, and outputs audio and music from the speaker 32.
[ Server 40]
The server 40 is a device for managing the live video system 1, and includes: a live database 41; an attribute database 42; a data storage unit 43; a network IF 44; a main memory 45; and a control unit 46.
The live database 41, as a management unit, associates the user IDs of users registered in the system with each live broadcast, that is, with each publisher, and manages the articles of each user. Specifically, the live database 41 associates each user ID with each live broadcast, and manages the article IDs and whether each article was received. An article ID is an ID that uniquely identifies an article purchased by a user, and uniquely identifies an article given by the user to the publisher in each live broadcast. Whether an article was received is used to manage whether the article given by the user entered the container serving as the target portion.
In addition, the live database 41 manages all users who can participate in the live video in association with the user IDs. Users participating in each live broadcast may be selected from all registered users. The live database 41 manages, for example, prices of respective articles in association with article IDs.
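A minimal sketch of the associations the live database 41 is described as managing follows; the schema and names are assumptions, not the patent's data model.

```python
# Sketch: per live broadcast, the user IDs, the given item IDs, and whether each item was received.
from dataclasses import dataclass, field

@dataclass
class GivenItem:
    item_id: str
    received: bool = False   # True once the item has entered the container (target portion)

@dataclass
class LiveBroadcastRecord:
    broadcast_id: str
    publisher_id: str
    items_by_user: dict[str, list[GivenItem]] = field(default_factory=dict)

record = LiveBroadcastRecord("live-001", "publisherX")
record.items_by_user.setdefault("userA", []).append(GivenItem("gem-small"))
record.items_by_user["userA"][0].received = True   # the gem landed in the container
```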
The attribute database 42 manages attribute parameters of the target portion when the target portion is a real object, attribute parameters of the article as a virtual object, attribute parameters of objects (the table 6, walls, and the like) located outside the target portion in the real space of the studio 10, and attribute parameters of the target portion when the target portion is a virtual object. Fig. 3 shows an example. In fig. 3, a measured value is a value measured by a sensor, a set value is a value set by the publisher X, the user A, B, C, or the service provider, and a calculated value is a value calculated using measured values or set values. Specifically, the attribute database 42 manages parameters such as shape, position, inclination, velocity, acceleration, mass, hardness, surface roughness, warmth, and smell.
For example, when the target portion is a real object, a plurality of real objects that can become the target portion, such as buckets, cups, trash boxes, decorative boxes, and moral boxes, are prepared in the studio 10. In this case, the attribute database 42 stores the attribute parameters of the real objects prepared in advance in the studio 10 as candidates for the target portion. When the target portion is a virtual object, the attribute database 42 stores attribute parameters of a plurality of virtual objects selectable by the publisher X; for example, a bucket, a cup, a trash box, a decorative box, and a moral box are prepared as candidates for the target portion, and the attribute database 42 stores the attribute parameters of these virtual objects. Since a target portion that is a virtual object is not a real object, parameters different from those of a real object can also be set in order to improve the entertainment.
When the target portion is a real object, it actually exists, and therefore parameters such as shape, inclination, and position are values measured in advance for the object by the sensors, and the velocity and acceleration of the target portion and the article when they move are actual measured values. For example, when the real object is moved, the target object 5 is displayed so as to move in accordance with the measured values. Likewise, when the real object is physically deformed, the target object 5 is deformed in accordance with the measured values. For example, when the palm is set as the container serving as the target portion, the container formed by the palm is the target object 5, and the target object 5 moves and deforms in accordance with the movement of the palm and how far the palm is opened. When a real trash can is set as the target portion, the space surrounded by its inner surface and bottom surface is set as the receiving portion, and when the trash can is moved by hand or together with whatever it is placed on, the target object 5 moves accordingly. Of course, the real object may also be moved or deformed in accordance with calculated values derived from the actual measured values. In addition, the parameters relating to warmth and surface roughness may be set to values different from those of the real object. For example, even if thermography shows that the surface temperature of the container of the target portion and the table 6 is 15 degrees centigrade, the surface temperature can be set to a different temperature (for example, a body temperature of 36 degrees centigrade). Thus, when the article is chocolate, which melts at around 30 degrees centigrade, it softens or melts more easily than when the container of the target portion and the table 6 are at 15 degrees centigrade, thereby enhancing the entertainment. When the surface roughness of the article, the floor surface, and the table 6 is set higher than the actual surface roughness, the article becomes harder to roll and harder to bounce. As an example, the surface roughness can also be set by measuring the unevenness with the depth camera 13. For a real container such as a cup or a trash can, parameters such as surface roughness can be made different between the outer surface and the inner surface of the container. In addition, as an example, not just one but a plurality of target portions can be set in one live video 4.
The article is a virtual object and is displayed as the article object 3. Articles do not exist in the real space of the studio 10, and therefore parameters such as shape, inclination, position, and velocity are calculated from the initial values that are set. In addition, when an olfactory feedback device is provided, attribute parameters related to smell are set in the attribute database 42. Thus, for example, when the article is a food, the smell corresponding to that food can be fed back to the publisher X.
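As an illustration only (the patent does not define a concrete schema), the attribute database entries described above can be pictured as records that distinguish measured, set, and calculated values; the names `ValueKind`, `AttributeRecord`, and `ObjectAttributes` below are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict

class ValueKind(Enum):
    MEASURED = "measured"      # value obtained from a sensor (depth camera, thermography, ...)
    SET = "set"                # value chosen by the publisher, a user, or the service provider
    CALCULATED = "calculated"  # value derived from measured or set values

@dataclass
class AttributeRecord:
    """One attribute entry of the attribute database (hypothetical schema)."""
    name: str                  # e.g. "shape", "mass", "surface_roughness", "warmth"
    value: object
    kind: ValueKind

@dataclass
class ObjectAttributes:
    """Attribute parameters for one target portion, article, or studio object."""
    object_id: str
    is_virtual: bool
    attributes: Dict[str, AttributeRecord] = field(default_factory=dict)

    def set_attr(self, name: str, value, kind: ValueKind) -> None:
        self.attributes[name] = AttributeRecord(name, value, kind)

# Example: a real cup whose surface temperature is overridden for entertainment.
cup = ObjectAttributes(object_id="cup_5a", is_virtual=False)
cup.set_attr("shape", "cylinder(h=90mm, d=70mm)", ValueKind.MEASURED)
cup.set_attr("surface_temperature_c", 36.0, ValueKind.SET)   # actual reading was 15 degrees
cup.set_attr("surface_roughness", 0.8, ValueKind.SET)
```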
The data storage unit 43 is a hard disk or a nonvolatile memory, and stores a program for distributing the live video 4, image data constituting the article object 3 or the target object 5, and the like. For example, when the article object 3 and the target portion are virtual objects, the image data constituting the target object 5 is three-dimensional moving image data or image data, and can be displayed in accordance with the orientation.
The network IF44 exchanges data with the control device 17, the publisher terminal 18, and the user terminals 50, 70, and 80. The main memory 45 is, for example, a RAM, and temporarily stores the live video 4, the control program, and the like under distribution.
The control unit 46 is, for example, a CPU, and controls the overall operation of the server 40. For example, the control unit 46 functions as a distribution unit that distributes the live video 4 to the control device 17, the distributor terminal 18, and the user terminals 50, 70, and 80 via the network IF44.
For example, the control unit 46 functions as a setting unit that sets the position of the target portion: based on the three-dimensional position information from the depth camera 13, it sets the target position in the video captured by the RGB camera 12, adds the target object 5 at that position, and generates the live video 4. The control unit 46 also functions as a display control unit that specifies the position of an article in the video in accordance with the user operation information from the user terminals 50, 70, and 80 and generates the live video 4 to which the article object 3 is added. The control unit 46, as the display control unit, then determines whether or not the target position of the target object 5 overlaps with the article position of the article object 3, and generates the live video 4 in which the article object 3 enters the target object 5 when the article position overlaps with the target position. When the article object 3 and the target object 5 come into contact with each other, they are deformed in accordance with the attribute parameters. The control unit 46 also functions as a tactile control unit that controls the haptic device 16 worn by the publisher X in accordance with the state of the article with respect to the target portion.
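The overlap decision made by the display control unit can be summarized in a minimal sketch. This is not the patent's implementation; the names `TargetState` and `classify_throw`, and the use of a single radius for the container opening, are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class TargetState:
    position: Vec3           # target position set from the depth camera data
    radius: float            # simplified extent of the container opening
    acts_as_container: bool  # False outside the 1st special period

def classify_throw(article_pos: Vec3, target: TargetState) -> str:
    """Return which display state the display control unit should generate."""
    dx, dy, dz = (a - t for a, t in zip(article_pos, target.position))
    overlapping = (dx * dx + dy * dy + dz * dz) ** 0.5 <= target.radius
    if not overlapping:
        return "miss"        # article falls outside the target portion
    if not target.acts_as_container:
        return "bounce"      # touches the target but bounces off
    return "enter"           # article object enters the target object

# Example: a throw that lands inside the palms while they act as a container.
target = TargetState(position=(0.0, 1.0, 2.0), radius=0.15, acts_as_container=True)
print(classify_throw((0.05, 1.02, 2.05), target))   # -> "enter"
```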
For example, the control unit 46 functions as a period management unit that manages the 1st special period in which the target portion functions as a container for articles. For example, outside the 1st special period, the control unit 46 changes the parameter settings so that the target portion does not function as a container. In this case, as an example, even if an article touches the target portion outside the 1st special period, the article object 3 is displayed as bouncing off without entering the target object 5.
For example, the control unit 46 also functions as a period management unit that manages the 2nd special period in which a benefit is given to the user A, B, C. For example, during the 2nd special period, the control unit 46 changes a parameter so as to enlarge the target portion as a benefit, making it easier for the article to enter the container of the target portion.
In addition, as an example, a parameter may be changed so that the size of the target portion varies: when an article enters while the target portion has been made smaller, a score higher than the reference score used outside the 2nd special period is obtained (a benefit), and when an article enters while the target portion has been made larger, a score lower than that reference score may be set.
For example, the 1st special period and the 2nd special period may be the same period on the time axis, or the 2nd special period may lie within the 1st special period. The control unit 46 may manage only one of the 1st and 2nd special periods, or may manage other special periods as well.
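A minimal sketch of how the period management unit described above could gate the container behavior and adjust the score; the function names and the +1/-1 adjustments are assumptions chosen to match the example scores (1, 2, 3) given later in this description.

```python
def container_is_active(now: float, period1_start: float, period1_end: float) -> bool:
    """The target portion works as a container only inside the 1st special period."""
    return period1_start <= now <= period1_end

def score_for_entry(in_period2: bool, target_scale: float, reference_score: int = 2) -> int:
    """Hypothetical scoring: outside the 2nd special period the reference score applies;
    inside it, a shrunken target is rewarded and an enlarged target is discounted."""
    if not in_period2:
        return reference_score
    if target_scale < 1.0:      # target portion made smaller -> harder -> higher score
        return reference_score + 1
    if target_scale > 1.0:      # target portion made larger -> easier -> lower score
        return reference_score - 1
    return reference_score

print(score_for_entry(in_period2=True, target_scale=0.8))   # -> 3
print(score_for_entry(in_period2=True, target_scale=1.3))   # -> 1
print(score_for_entry(in_period2=False, target_scale=1.3))  # -> 2
```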
A plurality of articles can be accommodated in the target portion. When the first article enters the target portion, the control unit 46 displays the article bouncing or rolling inside the container in accordance with the attributes of the container of the target portion. The control unit 46 keeps displaying the article object 3 corresponding to an article that has entered the container, and when a further article enters, displays it colliding with the articles already present, in accordance with the attributes of the articles and of the container of the target portion. As an example, the two articles are shown moving inside the container as a result of the collision between the new article and the article that was already there. As an example, the control unit 46 continuously displays the state in which the article objects 3 have entered the target object 5 serving as a container. Then, for example, when the 1st and 2nd special periods end, the control unit 46 deletes the article objects 3 that have entered the target object 5 and resets the display of the articles in accordance with an operation signal from the user terminals 50, 70, and 80.
The server 40 may directly control the microphone 11, the RGB camera 12, the depth camera 13, the studio monitor 14, the speaker 15, the haptic device 16, and the like provided in the studio 10 without the control device 17.
The server 40 may perform a part of the above processing in cooperation with the user terminals 50, 70, and 80. For example, the live video acquired by the RGB camera 12, the three-dimensional position information acquired by the depth camera 13, and the like are transmitted to the user terminals 50, 70, and 80. The user terminals 50, 70, and 80 may then detect the motion of the user A, B, C and display the target object 5 and the like on the user terminals 50, 70, and 80, the studio monitor 14, and the publisher terminal 18 based on the live video, the three-dimensional position information, and the detection result of the motion of the user A, B, C.
[ user terminal 50]
As shown in fig. 4, the user terminal 50 is a device managed by the user a, and includes a desktop or notebook personal computer 50a and a smart watch 60, for example. As an example, the notebook personal computer 50a includes: an audio IF 51; display IF 52; a network IF 53; a communication IF 54; a data storage section 55; an operation IF 56; a main memory 57; and a control unit 58. The audio IF51 is connected to a sound output device such as a speaker, an earphone, or a headphone, or a sound input device such as a microphone. For example, the display IF52 is connected to the display unit 59 formed of a display device such as a liquid crystal display device.
For example, the network IF53 communicates with the server 40 via the network 2. The communication IF54 communicates with the smart watch 60, as an example. The communication IF54 and the smart watch 60 are connected by a wireless LAN or a wired LAN, and acceleration data, angle data, angular velocity data, and the like as user motion information are input from the smart watch 60. The data storage unit 55 is a nonvolatile memory, for example, a hard disk or a flash memory. The data storage unit 55 stores a program for reproducing the live video 4, a program for controlling communication with the smart watch 60, and the like. The operation IF56 is connected to an operation device such as a keyboard or a mouse. When the display unit 59 connected to the display IF52 is provided with a touch panel, the touch panel is also connected here. The main memory 57 is, for example, a RAM, and temporarily stores the live video 4 under distribution, the control program, and the like. For example, the control unit 58 is a CPU and controls the overall operation of the user terminal 50. For example, the control unit 58 transmits to the server 40 selection data that selects one or more articles from a list of articles. Further, for example, the control unit 58 transmits to the server 40 operation data such as acceleration data, angle data, and angular velocity data, which are pieces of user motion information detected by the smart watch 60.
The smart watch 60 is, for example, a wristwatch-type information processing terminal that is worn on the wrist of the dominant hand of the user A. The smart watch 60 includes: a sensor 61; a communication IF62; a data storage section 63; a main memory 64; a vibrator 65; and a control unit 66. The sensor 61 is, for example, an acceleration sensor or a gyro sensor. For example, the communication IF62 transmits the acceleration data, angle data, and angular velocity data of the smart watch 60 detected by the sensor 61 to the personal computer 50a. For example, when the user A performs a motion of throwing an object, the sensor 61 detects operation data such as acceleration data, angle data, and angular velocity data, which are user motion information relating to the swing of the wrist. The data storage unit 63 is a nonvolatile memory, for example, a hard disk or a flash memory. The data storage unit 63 stores a driver for driving the sensor 61, a communication control program for communicating with the personal computer 50a, and the like. For example, the control unit 66 is a CPU and controls the overall operation of the smart watch 60. For example, the control unit 66 drives the vibrator 65 to notify the user A by tactile sensation when the article enters the target portion or falls.
In addition, the terminal connected to the user terminal 50 may be a small-sized portable information processing terminal such as a smartphone having an acceleration sensor or a gyro sensor, instead of the smart watch 60.
[ user terminal 70]
For example, the user terminal 70 is a device managed by the user B, and is a smart device terminal such as a smartphone or a tablet computer. As an example, the user terminal 70 includes: an audio IF71; a display IF72; an operation IF73; a sensor 74; a network IF75; a data storage unit 76; a main memory 77; a control section 78; a display portion 79; and a vibrator 79a. The audio IF71 is connected to an audio output device such as a built-in speaker or an earphone, or an audio input device such as a built-in microphone. For example, the audio IF71 plays the audio data of the live video 4 from an audio output device such as a speaker or headphones. The display IF72 is connected to a small display 79 such as a built-in liquid crystal panel or organic EL panel. The display 79 is provided with a touch panel, and the operation IF73 is connected to the touch panel. The sensor 74 is, for example, an acceleration sensor or a gyro sensor. For example, the network IF75 communicates with the server 40 via the network 2. For example, when the user performs a motion of throwing an object, the network IF75 transmits to the server 40 operation data including the acceleration data, angle data, and angular velocity data relating to the swing of the wrist, which are detected by the sensor 74 as user motion information. When a slide operation is performed with a finger or a stylus on the display surface on which the live video 4 is displayed, toward the displayed target object 5, operation data such as its coordinate data is transmitted to the server 40 as user operation information. The data storage unit 76 is a nonvolatile memory, for example, a flash memory. The data storage unit 76 stores a playback program for the live video 4. The main memory 77 is, for example, a RAM, and temporarily stores the live video 4, the control program, and the like.
For example, the control unit 78 is a CPU and controls the overall operation of the user terminal 70. For example, when playing back the live video 4, the control unit 78 transmits to the server 40 selection data that selects one or more articles from the list of article objects. Further, for example, when the user holds the user terminal 70 and performs a motion or operation of throwing an object, the control unit 78 transmits to the server 40, as user motion information, operation data such as the acceleration data, angle data, and angular velocity data of the wrist swing and the coordinate data of the slide operation.
Since the user terminal 80 of the user C is the same type of smart device terminal as the user terminal 70, detailed description thereof is omitted.
[ video live broadcast preparation processing ]
The operation of the live video broadcast system 1 will be described below.
[ case where the target part is a container for a real object ]
Before the live video is distributed, the control device 17 and the distributor terminal 18 log in to the server 40 and set the target portion as preparation for the live video. The publisher X uses the publisher terminal 18 or the control device 17 as an operation terminal, and transmits an imaging start operation signal for the live video from the publisher terminal 18 or the control device 17 to the server 40. In the studio 10, the depth camera 13 acquires depth information at each location in the studio 10, calculates the person region, then calculates the skeleton positions within the person region, and thereby obtains depth information at each skeleton position. Thereafter, motion capture processing is performed using the depth camera 13.
As shown in fig. 5 (a) and (b), upon receiving the imaging start operation signal, the server 40 acquires the live video from the RGB camera 12 and the three-dimensional position information from the depth camera 13, and generates a live video 4a. Then, the server 40 distributes the live video 4a to the control device 17 and the distributor terminal 18 via the network 2. The live video 4a is displayed on the studio monitor 14 through the control device 17, and is displayed on the distributor terminal 18 either via the control device 17 or received directly through the network 2 without passing through the control device 17. The live video 4a is not distributed to the user terminals 50, 70, and 80; it is a video that the distributor X views in order to set the target portion, and it does not yet include the article object 3 or the target object 5.
Next, the publisher X performs an operation of setting the target portion. Fig. 5 (a) and (b) show an example in which a cup 5a as a real object placed on the table 6 in the studio 10 is used as the target portion. The cup 5a is the container that becomes the target when the user A, B, C throws an article, and is the container into which a thrown article enters. In step S1, the server 40 adds a target portion designation start button to the live video and displays the result on the display surfaces of the distributor terminal 18 and the studio monitor 14. Here, when the publisher X touches the area in which the target portion designation start button is displayed on the display surface of the publisher terminal 18 with a finger or a stylus pen, the publisher terminal 18 transmits a target portion designation start signal to the server 40.
In step S2, the server 40 enters the target portion specification mode when receiving the target portion specification start signal. Further, the control device 17 may transmit the target portion designation start signal to the server 40 by clicking a target portion designation start button with an operation portion such as a mouse connected to the control device 17 while viewing the studio monitor 14 connected to the control device 17. Then, the server 40 notifies that the target portion can be specified on the display surface of the publisher terminal 18 or the studio monitor 14.
In step S3, the server 40 adds a determination button for determining the target portion to the live video 4a, and displays the result on the display surfaces of the publisher terminal 18 and the studio monitor 14.
Next, the publisher terminal 18 designates an area of the target portion (designated area) as a target position. The designated area is designated by the publisher X tracing the outline of the object of the cup 5a displayed on the display surface of the publisher terminal 18 with a finger or a stylus pen. When the designated area is designated by the publisher X, the publisher terminal 18 transmits the designated area data indicating the designated area to the server 40, and the server 40 receives the designated area data in step S4. Further, the designated area may be designated by an operation unit such as a mouse connected to the control device 17 while viewing the studio monitor 14, and the designated area data may be transmitted from the control device 17 to the server 40.
In step S5, the server 40 specifies the designated area indicated by the designated area data in the live video 4a captured by the RGB camera 12, and detects and recognizes the cup 5a. The server 40 acquires the three-dimensional position information corresponding to the designated area, and recognizes the designated area in the studio 10 as the target object 5. The server 40 sets the area of the detected cup 5a as the target position. As an example, the space portion inside the cup 5a (the portion into which an article enters), that is, the space surrounded by the inner surface and the bottom surface of the cup 5a, is set as the target position. Moreover, the server 40 performs voice recognition on the sound collected by the microphone of the studio 10 and defines the target object 5 as a "cup". As an example, the space portion inside the cup 5a (the portion into which an article enters) can be defined as follows: the cup 5a is rotated by 360 degrees or so left and right and up and down with respect to the depth camera 13, the depth camera 13 detects all the surfaces of the cup 5a, such as the outer surface, the inner surface, and the bottom surface, and all the surfaces of the cup 5a are specified using the three-dimensional position information.
As a method for allowing the server 40 to recognize the target portion, the target portion may be recognized based on a difference between a state (non-captured state) in which the cup 5a as a real object of the target portion does not enter the viewing angle (imaging range) of the RGB camera 12 and the detection range of the depth camera 13 and a state (captured state) in which the cup 5a enters the viewing angle and the detection range. As an example, the target portion can be recognized from the difference between the state (not photographed state) where no actual cup 5a is placed on the table 6 and the state (photographed state) where the cup 5a is placed. Further, for example, the publisher X may hold the cup 5a with his hand and push the cup 5a toward the depth camera 13, bring the cup 5a closer to the depth camera 13 than any part of the publisher X, and detect a motion vector or the like to recognize the cup 5a as a target part. The server 40 may perform voice recognition of the sound of "cup" uttered by the publisher X with voice, extract an image of the cup from the live video 4a with reference to an image dictionary present on the network, and set the image as a target portion.
In step S6, the server 40 determines whether or not a re-designation operation of the designated area has been performed at the publisher terminal 18. Specifically, the server 40 determines whether the designated area data from the publisher terminal 18 has been received again. When the designated area is re-designated, the server 40 repeatedly executes the processing from step S4.
When the publisher X touches the area on which the determination button for determining the target portion is displayed with a finger or a stylus on the display surface of the publisher terminal 18, the publisher terminal 18 transmits a determination signal for determining the target portion to the server 40. When the server 40 receives the determination signal in step S7, it determines the target portion in the live video 4a in step S8. That is, the server 40 sets the cup 5a as the target object 5, and the target portion expressed by the target object 5 (the cup 5a) is defined as a container based on the registered contents of the attribute database 42 (parameters such as shape, position, inclination, velocity, acceleration, mass, hardness, surface roughness, warmth, and smell).
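The designation flow of steps S4 and S5, in which the outline traced on the publisher terminal 18 is combined with the depth camera's three-dimensional position information, might look as follows. This is a hedged sketch; `designated_region_to_target` and the stubbed depth lookup are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]
Point3D = Tuple[float, float, float]

def designated_region_to_target(outline: List[Pixel], depth_lookup) -> List[Point3D]:
    """Hypothetical steps S4/S5: convert the outline traced on the publisher terminal
    into a set of 3D points using the depth camera's per-pixel position data."""
    region: List[Point3D] = []
    for px in outline:
        point = depth_lookup(px)      # three-dimensional position for this pixel
        if point is not None:
            region.append(point)
    return region

# Example with a stubbed depth lookup (flat surface 1.5 m from the camera).
outline = [(320, 240), (330, 240), (330, 250), (320, 250)]
stub_depth = lambda p: (p[0] * 0.001, p[1] * 0.001, 1.5)
target_points = designated_region_to_target(outline, stub_depth)
print(len(target_points), "points define the target region")
```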
In the above example, although the case where one cup of a real object exists in the live image 4a has been described, a plurality of cups of real objects that can be target portions may exist in the live image 4 a. In this case, a parameter is set in the attribute database 42 for each cup. In the target portion specification mode, one cup may be selected from a plurality of cups existing in the live view image 4 a. The selection of the cup can be made by drawing or touching the selected cup with a finger or a stylus. Alternatively, when voice recognition is used, expressions such as "rightmost cup", "middle cup", and the like may be interpreted to identify the corresponding cup.
The server 40 tracks each skeleton position of the publisher X, the depth information of the skeleton positions, and the three-dimensional position of the cup 5a using the depth camera 13. In addition, the server 40 detects the cup 5a through the RGB camera 12. Therefore, when the publisher X moves while holding the cup 5a, the target position moves correspondingly. For example, when the publisher X places the cup on the table 6 and moves his hand away, the position where the cup 5a is placed becomes the target position. When the article position of an article thrown by the user A, B, C overlaps with the target position where the target object 5 is displayed, the server 40 can display the article object 3 on the user terminals 50, 70, and 80 so that the article object enters the container of the target object 5.
[ case where the target portion is set as a part of the publisher with a gesture ]
As shown in fig. 6 (a) and (b), in step S11, the server 40 adds a target portion designation start button to the live video 4a and displays the target portion designation start button on the display surfaces of the distributor terminal 18 and the studio monitor 14. In step S12, the server 40 enters the target portion specification mode when receiving the target portion specification start signal. In step S13, the server 40 adds a determination button for determining the target portion to the live video 4a and displays the result on the display surfaces of the distributor terminal 18 and the studio monitor 14.
When the target portion is set as a part of the publisher by a gesture, the server 40 manages the gesture and the target portion in association with each other within the program. For example, the server 40 defines the target portion as the palms of both hands when the publisher X performs gestures with the right hand, the left hand, or both hands in the order of stone, scissors, cloth, and defines the target portion as the head of the publisher X when the gestures are performed in the order of scissors, stone, cloth. The server 40 tracks each skeleton position of the publisher X (for example, the skeleton position of the palm) and the depth information of that skeleton position (for example, the depth information of the palm) by using the depth camera 13. In addition, the server 40 detects the palm using the RGB camera 12. In step S14, the server 40 detects a gesture performed by the publisher X based on these pieces of information.
In step S15, the server 40 recognizes a position associated with the detected gesture as a target portion. For example, when a gesture of "stone, scissors, cloth" is detected, palms of both hands are recognized as target portions. The server 40 recognizes the palms of both hands as the target portion using the voice data collected by the microphone 11 of the control device 17 and the microphone of the publisher terminal 18, for example, the voice data of "stone, scissors, cloth" specifying the target portion. That is, the palms of both hands become the target objects 5. The server 40 sets the detected palms of the hands as the target positions. As an example, a space portion (a portion into which an article enters) surrounded by palms of both hands is set as a target position.
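The association between gesture sequences and target portions that the server 40 manages within the program could be held in a simple lookup table, as in the following sketch (the dictionary keys and the function name are illustrative assumptions).

```python
from typing import Optional, Sequence

# Hypothetical mapping from a detected gesture sequence to the body part
# that becomes the target portion (names are illustrative only).
GESTURE_TO_TARGET = {
    ("stone", "scissors", "cloth"): "palms_of_both_hands",
    ("scissors", "stone", "cloth"): "head",
}

def target_part_for(gesture_sequence: Sequence[str]) -> Optional[str]:
    """Return the target portion associated with the recognized gesture sequence,
    or None if the sequence is not registered."""
    return GESTURE_TO_TARGET.get(tuple(gesture_sequence))

print(target_part_for(["stone", "scissors", "cloth"]))   # -> "palms_of_both_hands"
print(target_part_for(["cloth", "cloth", "cloth"]))      # -> None
```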
As an example of a method for causing the server 40 to recognize a gesture for setting the target portion, the target portion may be recognized based on a difference between a state in which palms of both hands performing the gesture are not placed on the table 6 (a state in which no image is captured) and a state in which palms performing the gesture are placed (a state in which an image is captured). For example, the palms of both hands that the publisher X has performed the gesture may be pushed out in the direction of the depth camera 13, and the palms of both hands may be moved closer to the depth camera 13 to perform the gesture, so that the server 40 can reliably recognize the gesture.
In step S16, the server 40 determines whether or not the operation of reassigning the target section is performed at the publisher terminal 18. Specifically, the server 40 determines whether the gesture is performed again. Then, when the target portion is re-designated, the server 40 repeatedly executes the processing from step S14.
When the publisher X touches an area on which a determination button for determining the target portion is displayed with a finger or a stylus on the display surface of the publisher terminal 18, the publisher terminal 18 transmits a determination signal for determining the target portion to the server 40. When the server 40 receives the determination signal in step S17, it determines the target portion in the live video 4a in step S18. That is, the server 40 sets the portion associated with the gesture as the target object 5. For example, when a gesture of "stone, scissors, cloth" is performed, palms of both hands are set as target portions. That is, the target portion expressed by the target object 5 (palms of both hands) is defined as a container based on the registered contents (parameters such as shape, position, inclination, velocity, acceleration, mass, hardness, surface roughness, warmth, and odor) of the attribute database 42.
Thus, the server 40 tracks the palms of both hands, and therefore, even if the publisher X moves the palms of both hands, the target portion moves in correspondence with the palms. When the article position of the article thrown from the user A, B, C overlaps the target positions of the palms of the hands, the server 40 can display the article on the user terminals 50, 70, and 80 so that the article enters the target object 5 (palms of the hands). As an example, even if the publisher X moves the palms of both hands and the target positions corresponding thereto change, when the thrown article position overlaps with the target position of the destination where the palms of both hands move, the article can be displayed on the user terminals 50, 70, 80 so that the article enters the target object 5 (palms of both hands).
[ case where the target part is a virtual object ]
As shown in fig. 7 and fig. 8 (a) to (c), in step S21, the server 40 adds a target portion designation start button to the live video 4a and displays the result on the display surfaces of the distributor terminal 18 and the studio monitor 14. In step S22, the server 40 enters the target portion specification mode when receiving the target portion designation start signal. The publisher X transmits, to the server 40 via the publisher terminal 18 or the like, selection data for selecting the target portion from virtual objects rather than from real objects.
As shown in fig. 8 (a), in step S23, the server 40 adds to the live video 4a a list object 7 for selecting the target object 5 of the virtual object to become the target portion, and displays the list object on the display surfaces of the publisher terminal 18 and the studio monitor 14. In the list object 7, a plurality of selection candidate objects constituting a selection candidate group are displayed for presenting, as target objects 5, virtual objects that do not exist in the studio 10. In the list object 7, a bucket object, a cup object, a trash box object, a decorative box object, and a moral box object are displayed as selection candidates. The list object 7 may also be scrolled left, right, up, and down so that other target objects 5 can be displayed and selected. The target object 5 may also be an object that the user has made himself and uploaded to the server 40. Here, the bucket, the cup, the trash box, the decorative box, and the moral box are all virtual objects that do not exist in the studio 10, and attribute parameters are registered in the attribute database 42 for each virtual object. As an example, the target object 5 of the virtual object is three-dimensional data.
When the publisher X touches a desired object with a finger or a stylus on the publisher terminal 18, the publisher terminal 18 transmits the selection data of that object to the server 40 in step S24. When receiving the selection data, the server 40 selects the object indicated by the selection data as the target object 5 representing the target portion. Then, in step S25, the server 40 adds a determination button for determining the target portion to the live video 4a and displays the result on the display surfaces of the publisher terminal 18 and the studio monitor 14. Accordingly, as shown in fig. 8 (b), the server 40 displays the target object 5 of the selected virtual object in a highlighted (e.g., enlarged) manner. Alternatively, only the target object 5 of the selected virtual object may be displayed.
In the state of step S25, only the target object 5 to be displayed as the target portion has been set, and the target position at which the target object 5 is displayed has not yet been set. The server 40 therefore displays an explanation of the transition to the processing for setting the target position. After confirming this, the publisher X performs, for example, a gesture for setting the target position at which the target object 5 is displayed. As an example, the gesture here is a motion of pushing out the palms of both hands, which represent the target object 5, in the direction of the depth camera 13. The server 40 tracks each skeleton position of the publisher X (for example, the skeleton position of the palm) and the depth information of that skeleton position (for example, the depth information of the palm) with the depth camera 13. In addition, the server 40 detects the palms of both hands with the RGB camera 12. In step S26, the server 40 detects the gesture performed by the publisher X based on these pieces of information. Then, in step S27, the server 40 adds the target object 5 to the palm located closest to the depth camera 13 (see fig. 8 (c)). In step S28, the server 40 sets the position of the target object 5 as the target position. As an example, the target position may be the position of the palm of the publisher X holding the bucket 5b, but here it is the position of the bucket 5b held by the publisher X. As an example, the server 40 sets the area surrounded by the outer shape of the added bucket 5b as the target position. As an example, the space portion inside the bucket 5b (the portion into which articles enter) is set as the target position. The space portion inside the bucket 5b (the portion into which articles enter) may be defined in advance in the attribute database 42. Thus, when an article enters the target portion from the opening of the bucket 5b held in the palm of the publisher X, it can be determined that the article has entered the target portion, whereas if the article only touches the outer surface of the bucket 5b, it does not enter the bucket, so the actual relationship between an article and a bucket can be reproduced more accurately.
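Step S27, which attaches the virtual target object to the palm pushed out toward the depth camera, amounts to picking the tracked palm with the smallest depth value. The following sketch assumes a coordinate convention in which the z component is the distance from the depth camera; the function name is hypothetical.

```python
from typing import Dict, Optional, Tuple

Point3D = Tuple[float, float, float]

def palm_closest_to_camera(palm_positions: Dict[str, Point3D]) -> Optional[str]:
    """Hypothetical step S27: pick the tracked palm closest to the depth camera
    (smallest depth value) and attach the virtual target object there."""
    if not palm_positions:
        return None
    # Assume the z component is the distance measured by the depth camera.
    return min(palm_positions, key=lambda name: palm_positions[name][2])

palms = {"left_palm": (0.2, 1.1, 1.8), "right_palm": (-0.1, 1.0, 1.4)}
anchor = palm_closest_to_camera(palms)
print(f"attach bucket object to: {anchor}")   # -> right_palm (pushed toward the camera)
```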
Note that, instead of performing the setting processing in step S26, the target position may be determined in advance as a predetermined position on the table 6. In this case, when the target object 5 is selected, the selected target object 5 is added to the target position decided in advance. Then, even if the palm is separated from the bucket 5b, the target portion becomes the bucket 5b, and the object can be thrown with the bucket 5b as the target.
In step S29, the server 40 determines whether or not an operation of re-specifying the target position is performed in the publisher terminal 18. Specifically, the server 40 determines whether the gesture is performed again. Then, when the target portion is re-designated, the server 40 repeatedly executes the processing from step S26.
When the publisher X touches an area on which a determination button for determining the target portion is displayed with a finger or a stylus on the display surface of the publisher terminal 18, the publisher terminal 18 transmits a determination signal for determining the target portion to the server 40. When the server 40 receives the determination signal in step S30, it determines the target portion in the live video 4a in step S31. That is, the target portion expressed with the target object 5 (bucket 5b) is defined as a container based on the registered contents (parameters of shape, position, inclination, velocity, acceleration, mass, hardness, surface roughness, warmth, smell, and the like) of the attribute database 42.
Therefore, when the publisher X moves his hand to move the bucket 5b, the target position of the bucket 5b is also moved correspondingly. Alternatively, the bucket 5b may be placed on the table 6 and the hand released from the bucket 5 b. In this case, the position of the target object 5 of the bucket 5b on the table 6 becomes the target position. As described above, even if the position of the bucket 5b changes, when the target position of the destination of movement overlaps with the article position of the article, the article enters the container of the target object 5.
[ video live broadcast processing ]
As shown in fig. 9 (a), when the setting of the target portion is completed, the server 40 becomes a state capable of distributing the live video 4 in which the target object 5 is added to the live video 4a in step S41. In addition, for example, when receiving the distribution start signal from the distributor terminal 18, the server 40 is in a state in which the live video 4 can be viewed at the user terminals 50, 70, and 80. In the following description, a case where the user B (user terminal 70) views the live video 4 will be described.
The user terminal 70 displays a list of the publishers X who are broadcasting live video. For example, a list of publisher objects including a photograph of the publisher's face, a portrait image, and the like is displayed. When the publisher object of a desired publisher is selected, the user terminal 70 transmits publisher selection data to the server 40. In step S42, when receiving the publisher selection data from the user terminal 70, the server 40 makes the live video 4 of the publisher indicated by the publisher selection data viewable at the user terminal 70 that transmitted the publisher selection data. The user terminal 70 can thus view the live video 4 of the selected publisher X.
In step S43, the server 40 determines whether or not a special period has started. For example, the special period is the 1st special period or the 2nd special period. The 1st special period is a period in which the target portion indicated by the target object 5 functions as a container. The 1st special period may be the period from the start of the live video broadcast by the publisher X to its end. The 1st special period may also start after a 1st period (for example, 1 minute) elapses from the start of viewing by the user B, continue for a 2nd period (for example, 10 minutes), and end when the 2nd period elapses. The 1st special period may also be set to start when the publisher X performs a start gesture signaling the start of the 1st special period (for example, raising a hand) and the server 40 detects that gesture, and to end when the publisher X performs an end gesture of raising the hand again (see fig. 9 (b)). Further, even if an article hits the target portion outside the 1st special period, the article object 3 can be displayed as bouncing off without entering the target object 5. As an example, the display of the target object 5 may start at the beginning of the 1st special period and end at its end.
The 2nd special period is a period in which a benefit is given to the user B. For example, when the user terminal 70 starts viewing the live video 4 of the selected distributor X, the target portion indicated by the target object 5 functions as a container at any time. When the 2nd special period starts, the server 40 changes the parameter settings so as to enlarge the shape of the target object 5 serving as the target portion (for example, the shape of its entrance), and as a user benefit the article enters the container of the target object 5 more easily. Alternatively, the size of the target portion (the shape of the entrance, etc.) may be made slightly smaller. In this case, when the target object 5 is enlarged, the score obtained when an article enters is set lower than the reference score, and when the target object 5 is made smaller, the score obtained when an article enters is set higher than the reference score as a user benefit. For example, the score is usually 2, but when the target object 5 is large the score is low (score 1), and when the target object 5 is small the score is high (score 3). The size of the target object 5 is changed as appropriate when the target portion is a virtual object rather than a real object. Even for a real object, if it is a collapsible cup or bucket, its size can be changed by an operation of the publisher X. As an example, even if the target portion is a real object whose shape does not change, only the "ease of entry" of the article may be changed during the 2nd special period. In this case, a display notifying the change in "ease of entry" or a display indicating the 2nd special period may be added to the real object or the like.
For example, when a special period such as the 1st special period or the 2nd special period is tied to music, as in karaoke, an interlude or the interval between songs may be used. In addition, the 2nd special period may lie within the 1st special period. This allows the user to buy and throw articles more enjoyably during the 1st and 2nd special periods. The 1st special period and the 2nd special period may be the same period on the time axis. The control unit 46 may manage only one of the 1st and 2nd special periods, or may manage other special periods as well.
The special period may also be ended when the number of articles that can be given (or purchased) during the special period is limited (for example, to 10 articles) and all the articles are sold, or when all the articles have entered the target portion. The number of articles that can be given may be determined by the publisher X or by the service provider. It may also be determined by the type of the selected target object 5, that is, by its capacity. That is, the special period can be ended when the container of the target object 5 is filled with articles. The criterion for determining whether the container of the target object 5 is filled with articles will be described later.
Then, the server 40 proceeds to step S44 when the special period starts. In step S44, the server 40 starts the process of receiving a gift, that is, an article, from the user B who is viewing the live video 4 of the publisher X on the user terminal 70. The gift giving process of step S44 will be described later. In step S45, the server 40 determines whether or not the special period has ended, and when it has ended, the process proceeds to step S46. In step S46, the server 40 transmits an end notification of the special period to the user terminal 70 of the user B who is viewing the live video 4.
After the end of the special period, the user B can delete all the article objects 3 displayed as having entered the target object 5. In this case, when the user B touches an all-article deletion button displayed on the display surface of the user terminal 70 with a finger or a stylus, the user terminal 70 transmits an all-article deletion signal to the server 40. When the server 40 receives the all-article deletion signal in step S47, the process of deleting the article objects 3 from the target object 5 is performed in step S48. If the all-article deletion operation is not performed, the processing of step S48 is omitted, and the article objects 3 continue to be displayed until the viewing of the live video 4 ends.
When the user B finishes viewing the live video 4, the user B touches an end button displayed on the display surface of the user terminal 70 with a finger or a stylus pen; when the end button is operated, the user terminal 70 disconnects from the server 40. Alternatively, the user B may simply close the display window. When the publisher X finishes the live video broadcast, a distribution end operation is performed at the publisher terminal 18. The publisher terminal 18 then transmits a distribution end signal to the server 40. In step S49, upon receiving the distribution end signal, the server 40 performs the end processing of the corresponding live video.
[ Gift giving processing ]
In the gift giving process, the user B first selects an article to be given to the publisher X. Thereafter, a motion or an operation of throwing the selected article is performed at the user terminal 70 with the target object 5 displayed on the user terminal 70 as the target, and the article is thereby given to the publisher X.
As shown in fig. 10 (a), in step S51, the server 40 displays, on the display screen of the user terminal 70, an article selection object 8 in which the articles selectable in the live video 4 are listed (see fig. 10 (b)). As an example, in fig. 10 (b), the article selection object 8 lists, in a row from the left, the following selection candidates: an article object of a bouquet, an article object to which a special effect showing the motion of the publisher X is added, an article object of a cat-ear headband, an article object of a background image for the live video 4, and an article object of a gem. In addition, the article selection object 8 may be scrolled left, right, up, and down so that other article objects (chocolate, etc.) can be displayed and selected. An article may also be one that the user has made himself and uploaded to the server 40.
The items listed in the item selection object 8 are a plurality of items prepared by the operator side. The prepared item may be different for each live broadcast or may be common to all live broadcasts. In addition, some items may be duplicated in a plurality of live broadcasts. The database of the server 40 manages the price of each item in association with the item ID. For example, the server 40 stores moving image data, audio data, music data, and the like as article data for displaying the article in association with the article ID. The article data is, for example, three-dimensional data.
Each article is a paid item; an amount of money is determined for each article, and the price is associated with the article ID. For example, the bouquet of article ID "A" is 200 yen. The article with the special effect of article ID "B" is 300 yen. The cat-ear headband of article ID "C" is 500 yen. The background image of article ID "D" is 1000 yen. The gem of article ID "E" is 2000 yen. As an example, the user A shown in the live broadcast database 41 of fig. 2 buys the bouquet of article ID "A" for 200 yen, the user B buys the headband of article ID "C" for 500 yen, and the user C buys the special effect of article ID "B" for 300 yen. In this manner, the user A, B, C can give an article to the publisher X by buying it via the user terminals 50, 70, and 80. This allows the publisher X and the operator to obtain sales corresponding to the articles given by the user A, B, C. All articles given as gifts by the user A, B, C become sales of the publisher X and the operator, regardless of whether the publisher X receives them (whether the articles enter the container of the target portion). There may also be free articles. In addition, in one live broadcast, one user can buy one article or a plurality of articles. In the live database 41, the total number of articles purchased by each user and the purchase amount are managed for each publisher.
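The bookkeeping described above (article IDs, prices, and the rule that every gifted article counts as sales whether or not it is received) could be represented as in the sketch below; the record and function names are assumptions, and the prices mirror the example amounts given above.

```python
from dataclasses import dataclass

# Hypothetical records mirroring the article price list described above (yen).
PRICES_YEN = {"A": 200, "B": 300, "C": 500, "D": 1000, "E": 2000}

@dataclass
class GiftRecord:
    user_id: str
    article_id: str
    received: bool = False   # whether the article entered the container of the target portion

def total_sales(gifts) -> int:
    """All gifted articles count as sales, whether or not they were received."""
    return sum(PRICES_YEN[g.article_id] for g in gifts)

gifts = [GiftRecord("userA", "A"), GiftRecord("userB", "C"), GiftRecord("userC", "B")]
print(total_sales(gifts))   # -> 1000 yen for this live broadcast
```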
When the user B selects one article from the list of the article selection object 8 at the user terminal 70, the user terminal 70 transmits to the server 40 article selection data including the user ID and the article ID of the selected article. In step S52, the server 40 performs article selection processing based on the article selection data and displays the selected article object 3 on the user terminal 70. Fig. 10 (c) shows, as an example, a state in which the object representing the gem article has been selected and is displayed on the display surface of the user terminal 70. In fig. 10 (c), the target object 5 is the container formed by the palms of both hands of the publisher X, set in the order described above [ in the case where the target portion is set as a part of the publisher with a gesture (see fig. 6 (a) and (b)) ].
Thereafter, the user of the user terminal 70 can give a gift to the publisher X located in the studio 10 by operating the user terminal 70. Specifically, the user B of the user terminal 70 can have the simulated experience of throwing the selected article toward the publisher X by holding the user terminal 70 and performing the motion of throwing an object. The article can also be given to the publisher X by performing a slide operation with a finger or a stylus on the display surface on which the live video 4 is displayed, tracing toward the displayed target object 5 (see fig. 11 (a)). The user A, B, C can freely adjust how the article flies, adjusting its direction and distance so that the article enters the target portion more easily, which improves the entertainment. On the other hand, the publisher X can hear the voices of the user A, B, C emitted from the speaker 15 and the like and know the reaction of the user A, B, C. This enables the publisher X to move the palms of both hands. The publisher X can also urge the user A, B, C to buy more articles by moving the palms of both hands so that articles thrown by the user A, B, C do not easily enter the palms. In fig. 11 (a) to (c), the target object 5 is the container formed by the palms of both hands of the publisher X, set in the order described above [ in the case where the target portion is set as a part of the publisher with a gesture (see fig. 6 (a) and (b)) ].
In step S53, the server 40 determines whether the user B has performed a throwing motion. When the operation of throwing the article is performed by a slide operation, the server 40 determines, upon receiving operation data such as coordinate data, that a throwing motion has been performed when the amount of movement exceeds a threshold value or the like. For example, when the distance between the start point and the end point of the slide operation on the touch panel exceeds a threshold value, it is determined that a throwing motion has been performed.
When the motion of throwing an object is performed by hand, the server 40 receives, per unit time, operation data such as acceleration data, angle data, and angular velocity data, which are user motion information transmitted from the user terminals 50 and 80. The server 40 stores a threshold value for determining that the user B has performed a throwing motion, and determines that a throwing motion has been performed at the user terminal 70 when the threshold value is exceeded. For example, the server 40 stores threshold values for the acceleration data, angle data, and angular velocity data in order to identify a throwing motion, and determines that a throwing motion has been performed when the acceleration data, angle data, angular velocity data, or the like exceeds the corresponding threshold value.
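The two threshold checks described above (slide distance on the touch panel, and acceleration or angular velocity of the hand swing) can be sketched as follows; the concrete threshold values are placeholders, not values taken from the patent.

```python
import math
from typing import Sequence, Tuple

def slide_is_throw(start: Tuple[float, float], end: Tuple[float, float],
                   min_distance_px: float = 120.0) -> bool:
    """Touch-panel case: the slide counts as a throw when the distance between
    the start point and the end point exceeds a threshold."""
    return math.dist(start, end) > min_distance_px

def swing_is_throw(acceleration: Sequence[float], angular_velocity: Sequence[float],
                   accel_threshold: float = 15.0, gyro_threshold: float = 6.0) -> bool:
    """Hand-swing case: acceleration or angular-velocity data from the terminal's
    sensors exceeding a stored threshold is treated as a throwing motion."""
    return (max(abs(a) for a in acceleration) > accel_threshold or
            max(abs(w) for w in angular_velocity) > gyro_threshold)

print(slide_is_throw((100, 400), (320, 180)))               # -> True
print(swing_is_throw([0.5, 18.2, 2.1], [0.2, 0.4, 0.1]))    # -> True
```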
In step S54, the server 40 analyzes the user motion (the throwing motion) and calculates the article position. When the slide operation was performed on the touch panel, the server 40 calculates the direction and trajectory of the thrown article and the article position as the falling position from the start point coordinates, the end point coordinates, and the velocity of the slide operation. When the motion of throwing an object was performed by hand, the server 40 analyzes the direction, speed, and the like of the user's wrist swing based on operation data such as the acceleration data, angle data, and angular velocity data relating to the wrist swing transmitted from the user terminal 70. The server 40 thereby calculates the trajectory that the thrown article would follow in the real space of the studio 10 and in the live video 4, and the article position as the drop position. Then, the server 40 adds the article object 3 at the position in the live video 4 specified from the calculated three-dimensional position information.
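One simple way to turn the launch data into a drop position, as described for step S54, is to treat the article as a projectile. The sketch below is an assumption about the physics model (the patent only states that the trajectory and drop position are calculated) and uses a coordinate frame in which z is height.

```python
def drop_position(x0: float, y0: float, z0: float,
                  vx: float, vy: float, vz: float,
                  floor_z: float = 0.0, g: float = 9.8):
    """Hypothetical trajectory step: treat the thrown article as a simple projectile
    launched from (x0, y0, z0) with velocity (vx, vy, vz) derived from the swipe or
    wrist-swing data, and return where it reaches the floor height."""
    # Solve z0 + vz*t - 0.5*g*t^2 = floor_z for the positive time of flight.
    a, b, c = -0.5 * g, vz, z0 - floor_z
    t = (-b - (b * b - 4 * a * c) ** 0.5) / (2 * a)
    return (x0 + vx * t, y0 + vy * t, floor_z)

# Example: a throw toward the target from a height of 1.2 m.
print(drop_position(0.0, 0.0, 1.2, 1.5, 3.0, 2.0))
```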
In step S55, the server 40 displays the article object 3 of the gem on the display surface of the user terminal 70 based on the analysis result. As a result, as shown in fig. 11 (b) and (c), the article object 3 is displayed in real time on the display surface of the user terminal 70 as it flies toward the publisher X. The same article object 3 is displayed in real time on the studio monitor 14 and the publisher terminal 18 of the studio 10. The article object 3 is displayed on the display surface of the user terminal 70, the studio monitor 14, and the publisher terminal 18 at the article position where it should be located in the real space of the studio 10. That is, since the position of the article in the real space of the studio 10 is specified from the three-dimensional position information, the article object 3 is displayed at the position where it should appear in the captured image even if the orientation of the RGB camera 12 is changed. When the article position is outside the imaging range of the RGB camera 12, the falling article object 3 is not displayed. In addition, the user ID of the user who threw the article is displayed near the article object 3 so that the publisher X can visually confirm who provided the article.
When a plurality of users who are watching the same live video 4 throw articles toward the publisher X, the articles thrown by the other users are also displayed. This enables the users to compete over the number of articles they throw to the publisher X.
Further, the publisher X can intentionally move the target object 5, make it difficult for the article object 3 of the article to enter the target object 5 of the target portion, and urge the user to buy and use more articles.
The article object 3 only needs to be displayed at least at the target position and the landing position; its state during flight, its trajectory, and the like up to the target position or landing position need not be displayed.
[ article object display processing ]
As shown in fig. 12 (a), in step S61, the server 40 determines whether the article position of the article thrown by the user B overlaps with the target position. That is, the server 40 determines whether or not the article object 3 overlaps the target object 5 (whether or not the article has entered the container of the target portion); if it overlaps, the process proceeds to step S62, and if it does not enter the container of the target portion, the process proceeds to step S91 via connector C (see fig. 15 (a)).
Here, the range of the article object 3 and the range of the target object 5 are determined based on, for example, the three-dimensional position information, the user motion information, and the type of the article, in accordance with the parameters set in the attribute database 42. As an example, the range of the article object is the range of the outer shape of the article; typically, the range of the article object 3 is represented by one point or a plurality of points. The range of the target object 5 is a three-dimensional space having a volume for accommodating articles. For example, in the case of a container such as a cup or a bucket, the range of the target object 5 is the space enclosed by the inner side surface and the bottom surface. The range of the target object 5 may also be represented by one point or a plurality of points within the target object 5. When determining whether the article object 3 and the target object 5 overlap, it is easier to judge whether a point overlaps a region made up of a plurality of points than to judge whether single points of the article object 3 and/or the target object 5 coincide. For example, when an article enters through the opening of the target portion and the article position overlaps the target position, it is determined that the article has entered the target portion. As another example, even when the article does not actually enter the target portion, it is determined to have entered the target portion as long as the target position overlaps the article position.
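One simple way to realize the representative-point overlap test described above is to check whether any representative point of the article object lies inside a box approximating the container's interior. A sketch under that assumption (the axis-aligned box is a simplification; the actual container shape may differ):

```python
def article_overlaps_target(article_points, container_min, container_max):
    """Return True if any representative point of the article object lies
    inside an axis-aligned box approximating the container's inner space."""
    return any(container_min[0] <= x <= container_max[0] and
               container_min[1] <= y <= container_max[1] and
               container_min[2] <= z <= container_max[2]
               for x, y, z in article_points)
```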
In step S62, the server 40 adds to the live video 4 the state immediately after the article enters the container formed by the palm of the target portion, in accordance with the attribute parameters of the article and the attribute parameters of the target portion. For example, when the gem article enters the container formed by the palm of the target portion, the gem strikes the palm with a slight impact. The server 40 adds to the live video 4 the state in which the gem first touches the palm. On the user terminal 70, the state in which the article object 3 of the gem reaches the target object 5 of the container formed by the palm is displayed in the live video 4, so that the user B can visually confirm that the gift he or she threw has reached the publisher X.
Note that no real gem actually reaches the container formed by the palm of the publisher X. The publisher X can only know that the article from the user B has entered the container formed by the palm by viewing the live video 4 displayed on the publisher terminal 18 and the studio monitor 14. Therefore, in step S63, in order to convey the impact of the gem touching the palm to the publisher X in a sensory manner, the server 40 drives the haptic device 16 worn by the publisher X so that the publisher X can perceive the gem touching the container formed by the palm. The vibration given to the publisher X by the haptic device 16 at this time depends on the attribute parameters of the article (weight, shape, etc.) and the parameters of the palm (size, surface roughness, softness, etc.). As an example, the heavier the article, the stronger the vibration fed back to the publisher X by the haptic device 16.
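The weight- and palm-dependent vibration could, for example, be mapped to a drive level for the haptic device 16 as in the following sketch; the formula and constants are assumptions made for illustration, not values specified by the system.

```python
def haptic_level(article_weight, palm_softness, base_gain=0.2, max_level=1.0):
    """Map article weight and palm softness to a vibration level in [0, 1].

    Heavier articles give stronger feedback; a softer palm damps it. The
    blending formula and constants are placeholders.
    """
    level = base_gain * article_weight * (1.0 - 0.5 * palm_softness)
    return max(0.0, min(max_level, level))
```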
Typically, after first touching the container formed by the palm, the gem rolls a little and comes to rest in the container formed by the palm. In step S64, the server 40 adds to the live video 4 the state in which the gem rolls in and comes to rest in the container formed by the palm, according to the attribute parameters of the article and the parameters of the palm. On the user terminal 70, the state in which the article object 3 of the gem reaches the target object 5 of the container formed by the palm is displayed in the live video 4, so that the user B can visually confirm that the gift he or she threw has been received by the publisher X. For example, the server 40 measures the contact time of the article object 3 with the target object 5, and continues to display the article object 3 inside the target object 5 when the contact time is equal to or greater than a threshold value. When the contact time is less than the threshold value, the article may, for example, enter the container once and then bounce and fly out of it.
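The contact-time threshold described above can be sketched as a small per-frame state update; the 0.5 s threshold is an assumed placeholder.

```python
def update_contact_state(state, in_contact, dt, threshold_s=0.5):
    """Accumulate the article's contact time with the target object per frame.

    state is a dict {"contact_time": float, "settled": bool}; dt is the frame
    interval in seconds. Once the accumulated contact time reaches the
    threshold, the article keeps being displayed inside the container;
    a shorter touch resets the timer and the article bounces back out.
    """
    if in_contact:
        state["contact_time"] += dt
        if state["contact_time"] >= threshold_s:
            state["settled"] = True
    elif not state["settled"]:
        state["contact_time"] = 0.0
    return state
```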
In step S65, in order to convey the impact of the gem rolling and coming to rest in the container formed by the palm in a sensory manner, the server 40 drives the haptic device 16 worn by the publisher X in accordance with the attribute parameters of the article and the parameters of the palm, so that the publisher X can perceive the article rolling and coming to rest in the container formed by the palm.
Further, when too many articles are given to the publisher X, not all of them can enter the target portion serving as a container. The number of articles that can enter the target portion is based on the settings for the target portion in the attribute database 42. When the target portion is full of articles (when it overflows), the server 40 may reset the received articles, that is, empty the target portion. As an example, the target portion is determined to be full when articles have been in contact with the space portion of the container of the target object 5 (its inner side surface and bottom surface) for a predetermined time or longer and the total volume of the articles in the container is equal to or greater than a threshold value that is at most the volume of the container. As another example, the target portion is determined to be full when articles have been in contact with the space portion of the container of the target object 5 (its inner side surface and bottom surface) for a predetermined time or longer and the total contact area between the articles and the inner side surface and bottom surface of the container is equal to or greater than a threshold value that is at most the inner surface area of the container. Both states can be regarded as the container being nearly filled with articles.
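The two fullness criteria (total volume and total contact area, each gated by a minimum contact time) might be combined as in the following sketch; the ratios and times are placeholders, not values taken from the attribute database 42.

```python
def target_is_full(articles, container_volume, container_inner_area,
                   min_contact_s=1.0, volume_ratio=0.8, area_ratio=0.8):
    """Two example criteria for judging that the target portion is full.

    Each article is a dict with 'volume', 'contact_area' (against the inner
    and bottom surfaces) and 'contact_time'. The thresholds are fractions of
    the container's own volume / inner area, so they never exceed them.
    """
    settled = [a for a in articles if a["contact_time"] >= min_contact_s]
    total_volume = sum(a["volume"] for a in settled)
    total_area = sum(a["contact_area"] for a in settled)
    full_by_volume = total_volume >= volume_ratio * container_volume
    full_by_area = total_area >= area_ratio * container_inner_area
    return full_by_volume or full_by_area
```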
All articles are virtual objects, whereas the target portion may be either a real object or a virtual object. When an article hits the target portion, the impact deforms the article, or the article and the target portion repel each other with a predetermined force and interfere with each other. Therefore, when the target portion is a real object, the article, which is a virtual object, is deformed or repelled in accordance with the attribute parameters of that real object. When both the target portion and the article are virtual objects, they are deformed or repelled in accordance with each other's attribute parameters. That is, the amount of deformation of the article and the target portion, the amount by which the article is repelled and bounces, and so on are calculated from each other's attribute parameters. The article object 3 and the target object 5 are then displayed according to the result, and the haptic device 16 is driven accordingly.
Further, an article may collide with an object other than the target portion, such as a table or the floor surface, and the impact may deform the article or cause the article and that object to repel each other with a predetermined force. Therefore, when the object other than the target portion, such as a table or the floor, is a real object, the article, which is a virtual object, is deformed or repelled in accordance with the attribute parameters of that real object. When the object other than the target portion, such as a table or the floor, is a virtual object, the article and the object are deformed or repelled in accordance with each other's attribute parameters. That is, the amount of deformation of the article, the amount by which the article is repelled and bounces, and so on are calculated from each other's attribute parameters. The article object 3 and virtual objects such as the table and the floor are then displayed according to the result, and the haptic device 16 is driven accordingly.
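The repulsion and deformation amounts computed from the mutual attribute parameters could be approximated with a restitution-style model such as the sketch below; the blending of the two parameters into a single restitution coefficient is an assumption made for illustration.

```python
def collision_response(v_normal_in, article_elasticity, surface_hardness):
    """Compute a rebound velocity and a crude deformation amount.

    v_normal_in is the velocity component along the surface normal (negative
    means moving into the surface). The restitution coefficient is blended
    from the two attribute parameters purely for illustration.
    """
    restitution = min(1.0, article_elasticity * surface_hardness)
    v_normal_out = -restitution * v_normal_in        # bounce back, losing energy
    deformation = (1.0 - restitution) * abs(v_normal_in)
    return v_normal_out, deformation
```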
The publisher X may leave the studio 10 or his or her seat (see fig. 12 (b)). In that case, the article, or the target portion such as the container formed by the palm of the publisher X holding the article, may move out of the angle of view (imaging range) of the RGB camera 12. Therefore, in step S66, the server 40 determines whether the article position and the target position are within the screen. When they are out of the frame, the article object 3 and the target object 5 are not displayed on the user terminal 70 in step S67 (fig. 12 (c)). When they remain within the frame, the process proceeds to step S68, and the user terminal 70 continues to display the article object 3 and the target object 5 at their current positions on the screen.
When the article object 3 and the target object 5 have gone out of the frame, the publisher X may move the RGB camera 12 or change its imaging direction so that the framed-out article and target portion come back into the frame. In this case, in step S69, the server 40 determines again whether the article position and the target position are within the screen. When they have come back into the frame, the article object 3 and the target object 5 are displayed again on the user terminal 70 in step S70. When the article position and the target position do not come back into the frame, the server 40 keeps the article object 3 and the target object 5 hidden in step S71.
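The frame-in/frame-out handling of steps S66 to S71 (and the analogous steps later) reduces to maintaining, per frame, the set of objects whose positions project into the current image. A minimal sketch:

```python
def update_visibility(visible_objects, obj_id, in_frame, position_uv):
    """Maintain the set of objects drawn in the current frame (steps S66-S71).

    visible_objects maps object id -> on-screen position for objects that are
    currently drawn. An object that frames out is removed from the map, but its
    world position keeps being tracked, so it is re-added when it frames back in.
    """
    if in_frame:
        visible_objects[obj_id] = position_uv   # display / re-display
    else:
        visible_objects.pop(obj_id, None)       # framed out: hide
    return visible_objects
```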
In addition, when a new gem (article) is thrown while the container (target portion) formed by the palm of the publisher X already holds one or more gems (articles), the server 40 may cause the new article object 3 to collide with an article object 3 already in the target object 5 of the container formed by the palm, so that one of them is knocked out of, or falls out of, the container formed by the palm. As shown in fig. 13 (a), in step S81, when a new gem (article) is thrown while one or more gems (articles) are already held in the container (target portion) formed by the palm of the publisher X, the server 40 adds to the live video 4 the state in which the new article object 3 strikes an article object 3 already in the target object 5 of the container formed by the palm and one or both of them bounce and are about to fall out of the container formed by the palm, and displays this on the user terminal 70 and the publisher terminal 18 (see fig. 13 (b) and (c)). In fig. 13 (b) and (c), the article object about to fall out is indicated by 3a. In step S82, the server 40 drives the haptic device 16 worn by the publisher X in accordance with the attribute parameters of the articles and the parameters of the palm, so that the publisher X can perceive the impact of the articles colliding with each other and the articles rolling and coming to rest in the container formed by the palm.
In step S83, the server 40 determines whether the article object 3 of the gem has fallen out of the container formed by the palm of the target object 5, based on the operation data from the user terminal 70 that threw the article, the attribute parameters of the article, and the parameters of the palm. The server 40 proceeds to step S84 when an article has fallen out, and repeats the processing from step S81 when no article has fallen out.
When an article falls out of the container formed by the palm, it rolls on the table 6 (see fig. 14 (a)), falls off the table 6, and rolls on the floor surface in accordance with the attribute parameters of the article, the floor surface, the table 6, and the like. The server 40 adds this series of states, from the article rolling on the table 6 to the article falling off the table 6 and rolling on the floor, to the live video 4, and displays the result on the user terminal 70 and the publisher terminal 18. In fig. 14 (a), the rolling article object is indicated by 3a.
In step S84, the server 40 determines whether the article position resulting from the rolling is within the screen. When the article has gone out of the frame, the article object 3 is not displayed on the user terminal 70 in step S85. When it remains within the frame, the process proceeds to step S86, and the user terminal 70 continues to display the article object 3 at its current position on the screen, determined from the attribute parameters of the article, the floor surface, the table 6, and the like.
When the article object 3 has gone out of the frame, the publisher X may move the RGB camera 12 or change its imaging direction so that the framed-out article comes back into the frame. When the article comes back into the frame, the article object 3a is displayed again on the user terminal 70 in step S87 (see fig. 14 (b) to (d)). When the article position does not come back into the frame, the server 40 keeps the article object 3a hidden.
Further, by acquiring all the three-dimensional position information in the studio 10 in advance as preparation for the live video, the position of the article object 3 can be specified even if it falls off the table 6 and rolls to a place that is normally not visible, such as under the table.
There are also cases where the article thrown by the user B misses the target portion, that is, where the article object 3 of the gem does not enter the target object 5 of the container formed by the palm (no in step S61 of fig. 12 (a), via C; see fig. 15 (b)). In this case, as shown in fig. 15 (a), the server 40 determines in step S91 whether the article position is within the screen, and when the article has gone out of the frame, the article object 3b is not displayed on the user terminal 70 in step S92. When it remains within the frame, the process proceeds to step S86, and the user terminal 70 continues to display the article object 3 at its current position on the screen, determined from the attribute parameters of the article, the floor surface, the table 6, and the like.
When the article object 3 has gone out of the frame, the publisher X may move the RGB camera 12 or change its imaging direction so that the framed-out article comes back into the frame. When the article comes back into the frame, the article object 3 is displayed again on the user terminal 70 in step S94. When the article position does not come back into the frame, the server 40 keeps the article object 3 hidden.
For example, when the detection range of the depth camera 13 is wider than the imaging range of the RGB camera 12, the article position and the target position may leave not only the imaging range of the RGB camera 12 but also the detection range of the depth camera 13. In this case, the server 40 holds the target position of the target portion and the article position as they were immediately before leaving the detection range, and updates them to the positions at which they return when they re-enter the detection range of the depth camera 13. The server 40 also drives the haptic device 16 in accordance with the state of the container or the article at the time of returning to the detection range of the depth camera 13.
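Holding the last known position while outside the detection range of the depth camera 13 and resuming updates on return can be sketched as follows; the measurement is assumed to be None while the object is undetected.

```python
class HeldPosition:
    """Hold the last known position while outside the detection range and
    resume live updates when the object is detected again."""

    def __init__(self):
        self.last_known = None

    def update(self, measured_xyz):
        # measured_xyz is None while the object is outside the detection range
        if measured_xyz is not None:
            self.last_known = measured_xyz
        return self.last_known
```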
According to the above-described video live system 1, the effects listed below can be obtained.
(1) In the live video system 1, the users A, B, and C want the publisher X to receive the feelings expressed through their articles, so they buy as many articles as possible and throw them so that the articles enter the container of the target portion. The fact that an article has entered the container of the publisher X is also conveyed to the publisher X, and the users A, B, and C can see the reaction of the publisher X. This makes the users A, B, and C want to buy as many articles as possible and throw them toward the publisher X. On the other hand, the publisher X can reliably sense the given articles and encourage the users A, B, and C to buy more articles. This can increase the number of publishers and users A, B, and C participating in the live video system 1, thereby invigorating the live video system 1.
(2) The users A, B, and C can throw articles toward the target portion set for the publisher X located in the studio 10 using the user terminals 50, 70, and 80. When the article position of an article overlaps the target position of the target portion, the article enters the container of the target portion. Thus, the users A, B, and C can give an article as a gift by virtually throwing it into the container of the target portion of the publisher X located at a remote place. In this way, the live video system 1 can enhance entertainment.
(3) When the users A, B, and C give a gift by performing a motion of throwing an article toward the studio 10 for the publisher X, operation data such as acceleration data, angle data, and angular velocity data relating to the wrist swing are transmitted from the user terminals 50, 70, and 80. Coordinate data and the like of a sliding operation are also transmitted. Therefore, the users A, B, and C can freely adjust how the article flies, adjusting its direction and distance so that it enters the target portion more easily, which enhances entertainment.
(4) The publisher X can move the container serving as the target portion by moving his or her own body. By moving the container of the target portion, the publisher X can also make it more difficult for the articles thrown by the users A, B, and C to enter the container, and thereby prompt the users A, B, and C to buy more articles.
(5) When the article object 3 overlaps the target object 5, this can be regarded as the article entering the target portion. This makes it possible to visually notify the users A, B, and C and the publisher X that the article has entered the target portion.
(6) By limiting the period during which articles can be presented to the publisher to the 1st special period, articles are bought and presented to the publisher X in a concentrated manner during the 1st special period.
(7) The publisher X can form the container serving as the target portion for thrown articles with the palm of his or her own hand. The publisher X may also hold a real object such as a cup or a bucket. This increases the publisher X's freedom in selecting the target portion, and because the target portion is a real container, the users A, B, and C can easily recognize it in the live video 4.
(8) The target position, that is, the position at which the container serving as the target portion is placed, can be set not only on a part of the publisher X (a palm, a hand-held real object such as a cup, or the like) but also at a position slightly away from the publisher X, such as on a desk by the publisher's seat. In other words, the freedom in setting the target position can be increased.
(9) The target portion is not limited to a real object and may be a virtual object that does not actually exist in the studio 10. This increases the freedom in selecting the target portion and enhances entertainment.
(10) The target portion can be selected from a candidate group. Since the types of target portions that can be set are limited, the attribute parameters of each target portion can be set in detail. The publisher X can also freely select a target portion from the candidates.
(11) The target position of the target portion set in association with the publisher in the studio 10 can be specified from the three-dimensional position information detected by the depth camera 13.
(12) Setting the attribute parameters of the target portion to values close to those of a real object enhances the sense of reality, while setting attribute parameters different from those of a real object makes the target portion more virtual and enhances entertainment.
(13) Setting the attribute parameters of the article to values close to those of a real object enhances the sense of reality, while setting attribute parameters different from those of a real object makes the article more virtual and enhances entertainment.
(14) By having the publisher wear the haptic device 16, sensory feedback can be given to the publisher. For example, the publisher can perceive, through the haptic device 16, an article entering the target portion, rolling in the container, and the container becoming lighter when the article falls out.
[ 2 nd embodiment ]
Fig. 16 shows embodiment 2 of the present invention. In the display control system according to embodiment 2, the subject is a guardian dog statue at a worship facility such as an outdoor shrine. The merit box placed in front of the dog statue is the target portion. The merit box of the target portion is a virtual object that users D and E who come to worship can view on a user terminal such as a head mounted display (HMD) or a smart device; it is not a real object placed in front of the dog statue, which is a real object. Here, the user D wears the HMD 110, and the user E holds the user terminal 140, which is a smart device. The users D and E actually come to the vicinity of the dog statue. For example, the users D and E stand near the dog statue, close enough to the subject to throw money. If the worship facility is crowded, however, it is difficult to throw offering money at a real merit box. When the users D and E start the application program on the HMD110 or the user terminal 140, they can view the target object 5 of the merit box placed in front of the dog statue on the HMD110 or the user terminal 140 and can throw offering money as a virtual article at the target object 5. The target object 5 may also be set at a place where users often throw coins, such as the Trevi Fountain. The target object 5 may also be set, for example, at a donation box placed in front of public facilities such as cultural properties and historic sites to collect donations for maintenance or repair. The target object 5 may also be the dog statue itself as the subject. The user terminal may also be a glasses-type information processing terminal such as smart glasses.
As shown in fig. 17, the HMD110, which is the user terminal of the user D, includes: a display unit 112 provided on a wearing portion worn on the head and positioned in front of the eyes of the user D; a control unit 115 that is incorporated in the wearing portion and controls the entire HMD110; a speaker 116; and an operation unit 117. The control unit 115 includes a network IF 118, a data storage unit 119, a main memory 120, and a control unit 121. The data storage unit 119 stores an application program for adding the article object 3 to the target object 5 serving as a virtual merit box by throwing a virtual article such as offering money, and the control unit 121 controls the overall operation according to the program.
When an optically transmissive display such as a transmissive display panel is used, for example, the display unit 112 adds the target object 5 on the transmissive display panel. As another example, a retinal projection display device that forms the image of the target object 5 directly on the retina may be used instead of the transmissive display panel. In the case of a video see-through or non-transmissive display, the target object 5 is added to the image of the real space shown on the display element. Specifically, the HMD110 displays the article object 3 of the offering money on the display unit 112. The HMD110 has an RGB camera 113 and a depth camera 114. The HMD110 detects the dog statue from the real space image of the RGB camera 113, acquires three-dimensional position information of the position of the dog statue with the depth camera 114, and adds the target object 5 in front of the dog statue on the display unit 112. The depth camera 114 acquires three-dimensional position information of the user D at various points throughout the detection range. When the offering money is thrown, the HMD110 adds the article object 3 of the offering money on the display unit 112 at the palm of the hand performing the throwing motion. When the user D performs the motion of throwing the offering money, the wrist, palm, and so on performing the motion are tracked by the depth camera 114 to detect the change in position, and the display unit 112 adds the article object 3 of the offering money flying in accordance with the detected user motion information.
When the article position of the article object 3 overlaps the target position of the target object 5, the HMD110 determines that the offering money has entered the merit box. When the article position does not overlap the target position, the HMD110 determines that the offering money has not entered the merit box. The article may be charged for when the article position overlaps the target position, that is, when the offering money enters the merit box. Alternatively, the user may be allowed to throw the offering money repeatedly without additional charge until it enters the merit box. The target position at which the target object 5 is displayed may or may not coincide exactly with the size of the merit box. For example, the target position for the merit box is set slightly larger than the size of the target object of the merit box. In this way, the offering money can enter the merit box even when the article object 3 of the offering money and the target object 5 do not overlap perfectly. The article charge in embodiment 1 is for profit, whereas the article charge in embodiment 2 is not for profit but is based on the user's goodwill and religious sentiment. Therefore, the article object 3 is preferably made easy to enter the target object 5. When the dog statue itself is set as the target object 5, the donation may be determined to be successful when the article object 3 of the offering money hits the target object 5.
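The slightly enlarged target region for the merit box can be expressed as an overlap test with a margin, as in this sketch; the 0.1 m margin and the box representation are assumptions made for illustration.

```python
def offering_enters_box(article_xyz, box_center, box_half_size, margin=0.1):
    """Judge whether the offering-money object enters the merit box, with the
    target region enlarged by a margin so that near misses still count."""
    return all(abs(article_xyz[i] - box_center[i]) <= box_half_size[i] + margin
               for i in range(3))
```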
The user terminal 140 of the user E is a smart device configured similarly to the user terminal 70 described above, but further includes an RGB camera 113 and a depth camera 114. The user E captures the dog statue with the RGB camera 113 of the user terminal 140 while the real space video is displayed on the display unit. The user terminal 140 then detects the dog statue from the real space image of the RGB camera 113, acquires three-dimensional position information of the position of the dog statue with the depth camera 114, and adds the target object 5 in front of the dog statue. To throw the offering money, the user E performs a sliding operation toward the target object 5. The user terminal 140 then adds the article object 3 based on the user operation information from the sliding operation. That is, the user terminal 140 displays on the display unit the state in which the article object 3 of the offering money flies toward the target object 5 of the merit box. When the article position of the article object 3 overlaps the target position of the target object 5, the user terminal 140 determines that the offering money has entered the merit box. When the article position does not overlap the target position, it determines that the offering money has not entered the merit box. The determination of whether the offering money has entered the merit box is the same as in the case of the HMD110.
The display control system also includes a server 130 that manages the offering money. The HMD110 transmits a charge signal to the server 130 when offering money is bought or when offering money enters the merit box. The server 130 manages the charge amount in association with the user ID and calculates the charge amount for each user ID based on the charge signals. The offering money is ultimately remitted to the organization, such as the religious corporation, that operates the worship facility.
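On the server 130 side, the per-user charge management might look like the following minimal sketch; the class and method names are hypothetical.

```python
from collections import defaultdict

class ChargeLedger:
    """Accumulate charge amounts per user ID from incoming charge signals."""

    def __init__(self):
        self.totals = defaultdict(int)

    def on_charge_signal(self, user_id, amount):
        self.totals[user_id] += amount
        return self.totals[user_id]      # running total for that user ID
```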
According to the display control system described above, the effects listed below can be obtained.
(1) The users D and E can virtually throw offering money into the merit box from a location some distance away, without actually coming right up to the dog statue. The time required for worship can therefore be shortened.
(2) The target object 5 can be set using only the user terminal managed by the user, and articles can be thrown at the target object 5. That is, the equipment in the studio 10 of embodiment 1 (the microphone 11, the RGB camera 12, the depth camera 13, the studio monitor 14, the speaker 15, the control device 17, the publisher terminal 18, and the like) can be omitted.
(3) In this display control system, a merit box that requires no attendant is set up only virtually, and virtual coins are thrown at it; no offering money is actually thrown into a merit box. This prevents a malicious person from taking the offering money out of the box and prevents the use of counterfeit coins. As a result, the operating cost of the non-profit fundraising activity conducted in front of the dog statue is reduced, and the activity can be continued.
(4) Since no offering money is actually thrown into the merit box, no money collection work is required, unlike with a real merit box.
The video live broadcast system can be appropriately modified and implemented as follows.
In embodiment 1, the haptic device 16 worn by the publisher X may be omitted, because whether an article has entered the target portion of the publisher X can be visually confirmed on the studio monitor 14 and the publisher terminal 18.
The attribute database 42 according to embodiment 1 can also be applied to embodiment 2. In both embodiment 1 and embodiment 2, the attribute database 42 may be omitted. The attribute database 42 may also manage only one of the target portion attribute parameters and the article attribute parameters.
In embodiment 1, the movement of the publisher X and the target portion in which the target object 5 is set (a real cup or bucket, the palm of the publisher X, or the like) are tracked by obtaining three-dimensional position information with the depth camera 13, but the depth camera 13 may be omitted and only the RGB camera 12 may be used. For example, the accuracy of detecting the target portion from the live video can be improved by using AI based on deep learning.
In addition, the detection and recognition of motion in the motion capture process (the movement of the skeleton and of each bone is referred to as a motion) may be performed by applying recognition processing based on the results of deep learning or similar methods to the image from the RGB camera 12.
The determination (detection) of the user operation to be identified (for example, the motion of throwing an article, or the direction and strength with which the article is thrown) is not limited to the detection results of the detection unit of the smart watch 60 or the detection unit of the smart device terminal 70a. For example, inter-frame differences and motion vectors may be calculated from video captured by a camera.
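A camera-only alternative to the wearable sensors could combine inter-frame differencing with dense optical flow, for example with OpenCV as sketched below; the thresholds are assumed values chosen for illustration.

```python
import cv2
import numpy as np

def detect_throw_motion(prev_gray, curr_gray, diff_threshold=25, flow_threshold=2.0):
    """Detect a throwing motion from two consecutive grayscale frames.

    Uses inter-frame differencing to confirm that something moved, and dense
    optical flow to estimate the dominant motion vector (direction and speed).
    """
    diff = cv2.absdiff(prev_gray, curr_gray)
    _, mask = cv2.threshold(diff, diff_threshold, 255, cv2.THRESH_BINARY)
    moving_pixels = cv2.countNonZero(mask)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mean_flow = flow.reshape(-1, 2).mean(axis=0)     # average motion vector
    speed = float(np.linalg.norm(mean_flow))
    if moving_pixels > 0 and speed > flow_threshold:
        return {"direction": (mean_flow / speed).tolist(), "speed": speed}
    return None
```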
In embodiment 1, a special period such as the 1st special period or the 2nd special period may not be provided. In this case, articles can be thrown into the target portion at any time while the live video is being viewed on the user terminal. In embodiment 2, a special period may be provided.
The target portion may be limited to a single preset type of real object (for example, only a real cup or bucket). The target portion may also be limited to a single preset type of virtual object (for example, only a virtual cup or bucket). Likewise, the article may be limited to a single preset type (for example, only coins or only gems). In embodiment 2, it is determined in advance that offering money is thrown into the merit box of the target portion. Therefore, the target portion may be preset as the merit box, and the article may be preset as a coin (the users D and E cannot select the type of article).
In addition, the special period is not limited to live video. For example, many people visit worship facilities such as shrines during the first visit of the year. Therefore, when many people are present, the merit box serving as the target portion may be set relatively large.
In embodiment 1, the video that can be viewed on the user terminal is not limited to live video. For example, the server 40 makes a previously recorded video of the publisher X (a video to which the target object 5 has been added) viewable by the user terminals when its distribution time arrives, and a user who accesses it from a user terminal after the distribution time can view the distributed video. The user can throw articles toward the target portion at the user terminal. In this case, the distribution is stopped when the distribution end time is reached.
In embodiment 1, the processing performed by the server 40 may be distributed to the edge servers.
The display control program is installed in the computer via a network or from a portable recording medium such as an optical disc or a semiconductor memory.
Description of the reference numerals
1 … video live system, 2 … network, 3 … item object, 4 … video live video, 4a … live video, 5 … target object, 5a … cup, 5b … bucket, 6 … table, 7 … summary object, 8 … item selection object, 9 … lighting, 10 … studio, 11 … microphone, 12 … RGB camera, 13 … depth camera, 14 … studio monitor, 15 … speaker, 16 … haptic device, 16a … actuator, 17 … control device, 18 … publisher terminal.
Claims (16)
1. A display control system is characterized by comprising:
an acquisition unit that acquires three-dimensional position information of a real space;
a setting unit that specifies a position of a target portion of an article in the real space using the three-dimensional position information, and sets the target portion at the position; and
and a display control unit that adds a target object corresponding to the target portion to a position of the target portion in a video, and adds an article object corresponding to the article to an article position of the article in the video, the article position being specified based on user motion information indicating a user motion for moving the article toward the target object.
2. The display control system according to claim 1,
the display control system further includes a distribution unit that distributes the video of the real space in which the subject is located,
adding the target object and the item object with respect to the image.
3. The display control system according to claim 2,
the display control unit moves the target object in conjunction with movement of the subject.
4. The display control system according to claim 1 or 2,
the target portion is a container for the article,
the display control unit displays that the article object enters the target object when the article position overlaps the position of the target portion.
5. The display control system according to claim 1 or 2,
further comprising a period management unit for managing a special period during which the target unit receives the article,
the display control unit displays that the article object enters the target object when the article position overlaps the target portion position during the special period.
6. The display control system according to claim 1 or 2,
the target part is a real object existing in the real space,
the position of the target portion is set to the position of the real object in the real space,
the target object is an object corresponding to the real object.
7. The display control system according to claim 1 or 2,
the target part is a virtual object that does not exist in the real space,
the position of the target portion is set in the real space,
the target object is an object corresponding to the virtual object.
8. The display control system according to claim 2,
the target portion is an object or a part of an object,
the position of the target portion is set as an object or a part of an object.
9. The display control system according to claim 1 or 2,
the setting unit sets a selection target unit selected from the selection candidate group of the target unit as the target unit,
the display control unit adds the target object corresponding to the target unit to the video.
10. The display control system according to claim 1,
the acquisition section acquires the three-dimensional position information based on depth information detected by a depth camera.
11. The display control system according to claim 2,
the acquisition unit acquires the three-dimensional position information based on an image captured by an RGB camera.
12. The display control system according to claim 1,
further comprising an attribute database for managing an object part attribute parameter indicating an attribute of the object part,
the display control unit adds the target object to the video according to the target portion attribute parameter.
13. The display control system according to claim 1,
there is also an attribute database that manages item attribute parameters that represent attributes of the item,
the display control unit adds the article object to the video according to the article attribute parameter.
14. The display control system according to claim 2,
further comprising a haptic control unit for controlling a haptic device worn on a person as the subject,
the haptic control unit controls the haptic device based on at least one of a target portion attribute parameter indicating an attribute of the target portion and an article attribute parameter indicating an attribute of the article.
15. A display control method characterized by comprising, in a display control unit,
the acquisition section is caused to acquire three-dimensional position information of the real space,
causing a setting unit to determine a position of a target portion of an accepted item in the real space using the three-dimensional position information and set the target portion at the position,
the display control unit adds a target object corresponding to the target portion to a position of the target portion specified by the three-dimensional position information in an image corresponding to the real space, and adds an article object corresponding to the article to an article position specified by user motion information indicating a user motion for moving the article toward the target object in the image.
16. A display control program for causing a computer to execute a display control method comprising:
the display control method includes a setting unit that specifies a position of a target portion in a real space, the target portion being a target portion of an article, using three-dimensional position information of the real space, and sets the target portion at the position, a display control unit that adds a target object corresponding to the target portion to the position of the target portion specified by the three-dimensional position information in an image corresponding to the real space, and adds an article object corresponding to the article to an article position specified by user operation information indicating a user operation for moving the article toward the target object in the image.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018143672A JP2020021225A (en) | 2018-07-31 | 2018-07-31 | Display control system, display control method, and display control program |
JP2018-143672 | 2018-07-31 | ||
PCT/JP2019/030086 WO2020027226A1 (en) | 2018-07-31 | 2019-07-31 | Display control system, display control method, and display control program |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112513782A true CN112513782A (en) | 2021-03-16 |
Family
ID=69231161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201980049828.7A Pending CN112513782A (en) | 2018-07-31 | 2019-07-31 | Display control system, display control method, and display control program |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP2020021225A (en) |
CN (1) | CN112513782A (en) |
TW (1) | TW202015430A (en) |
WO (1) | WO2020027226A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115061602A (en) * | 2022-06-07 | 2022-09-16 | 北京字跳网络技术有限公司 | Page display method, device, equipment, computer readable storage medium and product |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6704578B1 (en) * | 2019-04-12 | 2020-06-03 | 高光産業株式会社 | Worship support method, program, and storage medium |
JP7145266B2 (en) * | 2020-06-11 | 2022-09-30 | グリー株式会社 | Information processing system, information processing method and computer program |
TWI747333B (en) * | 2020-06-17 | 2021-11-21 | 光時代科技有限公司 | Interaction method based on optical communictation device, electric apparatus, and computer readable storage medium |
WO2022014184A1 (en) * | 2020-07-14 | 2022-01-20 | ソニーグループ株式会社 | Information processing device, information processing method, program, and system |
JPWO2022118748A1 (en) * | 2020-12-04 | 2022-06-09 | ||
CN115225915B (en) | 2021-04-15 | 2024-05-24 | 奥图码数码科技(上海)有限公司 | Live recording device, live recording system and live recording method |
WO2023129182A1 (en) | 2021-12-30 | 2023-07-06 | 17Live Japan Inc. | System, method and computer-readable medium for video processing |
TWI818665B (en) * | 2021-11-10 | 2023-10-11 | 財團法人工業技術研究院 | Method, processing device, and display system for information display |
WO2023204289A1 (en) * | 2022-04-22 | 2023-10-26 | ソニーグループ株式会社 | Information processing device and method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014044655A (en) * | 2012-08-28 | 2014-03-13 | Premium Agency Inc | Augmented reality system, video composition device, video composition method, and program |
WO2017159383A1 (en) * | 2016-03-16 | 2017-09-21 | ソニー株式会社 | Information processing device, information processing method, program, and moving-image delivery system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4926799B2 (en) * | 2006-10-23 | 2012-05-09 | キヤノン株式会社 | Information processing apparatus and information processing method |
JP2012120098A (en) * | 2010-12-03 | 2012-06-21 | Linkt Co Ltd | Information provision system |
JP5920862B2 (en) * | 2011-03-08 | 2016-05-18 | 株式会社ソニー・インタラクティブエンタテインメント | Information processing apparatus, information processing method, computer program, and information processing system |
JP6039915B2 (en) * | 2011-07-08 | 2016-12-07 | 株式会社ドワンゴ | Stage presentation system, presentation control subsystem, operation method of stage presentation system, operation method of presentation control subsystem, and program |
US10427039B2 (en) * | 2016-12-08 | 2019-10-01 | Immersion Corporation | Haptic surround functionality |
- 2018
  - 2018-07-31 JP JP2018143672A patent/JP2020021225A/en active Pending
- 2019
  - 2019-07-31 CN CN201980049828.7A patent/CN112513782A/en active Pending
  - 2019-07-31 TW TW108127288A patent/TW202015430A/en unknown
  - 2019-07-31 WO PCT/JP2019/030086 patent/WO2020027226A1/en active Application Filing
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014044655A (en) * | 2012-08-28 | 2014-03-13 | Premium Agency Inc | Augmented reality system, video composition device, video composition method, and program |
WO2017159383A1 (en) * | 2016-03-16 | 2017-09-21 | ソニー株式会社 | Information processing device, information processing method, program, and moving-image delivery system |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115061602A (en) * | 2022-06-07 | 2022-09-16 | 北京字跳网络技术有限公司 | Page display method, device, equipment, computer readable storage medium and product |
Also Published As
Publication number | Publication date |
---|---|
JP2020021225A (en) | 2020-02-06 |
WO2020027226A1 (en) | 2020-02-06 |
TW202015430A (en) | 2020-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112513782A (en) | Display control system, display control method, and display control program | |
TWI701628B (en) | Display control system and display control method for live broadcast | |
JP6463535B1 (en) | Program, information processing apparatus, and method | |
US11074737B2 (en) | Information processing apparatus and method | |
JP2018094326A (en) | Event control system, and event notification system and program | |
US20190073830A1 (en) | Program for providing virtual space by head mount display, method and information processing apparatus for executing the program | |
JP2018007828A (en) | Program and electronic apparatus | |
WO2014050957A1 (en) | Display device, control method, and control program | |
JP6945312B2 (en) | Operation control system, character screening system and program | |
JP7531011B2 (en) | program | |
US20210342886A1 (en) | Virtual Advertising Database with Command Input Data | |
JP2019192178A (en) | Program, information processing device, and method | |
CN114302762A (en) | Program, method, and information processing terminal | |
CN114340749A (en) | Program, method, and terminal device | |
US11941177B2 (en) | Information processing device and information processing terminal | |
JP6580748B1 (en) | Program, information processing apparatus, and method | |
US20180165713A1 (en) | Virtual Advertising Database with Command Input Data | |
JP2019101457A (en) | Method executed by computer for providing information via head mount device, program for causing computer to execute the same, and information processing device | |
JP2023016813A (en) | Program, method, and computer | |
JP2019192203A (en) | Program, information processing device, and method | |
US20230009322A1 (en) | Information processing device, information processing terminal, and program | |
CN116438801A (en) | Information processing device, information processing method, program, and information processing system | |
CN114945893A (en) | Information processing apparatus and information processing terminal | |
JP2019106192A (en) | Method executed by computer to provide information via head mount device, program causing the computer to execute the method, and information processing device | |
US11983324B2 (en) | Information processing device, information processing terminal, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |