WO2018155351A1

WO2018155351A1 - Reproduction method, reproduction system, and reproduction apparatus

Info

Publication number: WO2018155351A1
Application number: PCT/JP2018/005613
Authority: WO
Inventors: 旭谷口; 敦宏辻; 幸　裕弘; 坂井　剛; 羊佑塩田; 浩充森下
Original assignee: パナソニックＩｐマネジメント株式会社
Priority date: 2017-02-21
Filing date: 2018-02-19
Publication date: 2018-08-30

Abstract

This reproduction method comprises: acquiring (S21) a first content (C10) that is formed of a first video content (C11) and a first sound content (C12) independent of each other; acquiring (S22) a second content (C20) that is formed of a second video content (C21) and a second sound content (C22) independent of each other; and reproducing (S23) the acquired first content (C10), then reproducing the acquired second content (C20).

Description

REPRODUCTION METHOD, REPRODUCTION SYSTEM, AND REPRODUCTION DEVICE

The present disclosure relates to a reproduction method, a reproduction system, and a reproduction apparatus for reproducing video content and sound content.

Patent Document 1 discloses a moving image playback apparatus that smoothly switches a moving image provided by streaming.

Patent Document 1: JP 2010-41246 A

This disclosure provides a playback method that can reduce a sense of discomfort given to a user when video content and sound content are switched to different content.

The reproduction method according to the present disclosure acquires first content composed of first video content and first sound content that are independent of each other, acquires second content composed of second video content and second sound content that are independent of each other, and, after reproducing the acquired first content, reproduces the acquired second content.

These general or specific aspects may be realized by a system, an apparatus, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, integrated circuits, computer programs, and recording media.

The method according to the present disclosure can reduce a sense of discomfort given to the user when the video content and the sound content are switched to different content.

FIG. 1 is a schematic diagram of a reproduction system according to an embodiment.
FIG. 2 is a block diagram illustrating an example of a hardware configuration of the playback device.
FIG. 3 is a block diagram illustrating an example of a hardware configuration of the server.
FIG. 4 is a block diagram illustrating an example of a hardware configuration of the information processing apparatus.
FIG. 5 is a block diagram illustrating an example of a functional configuration of the reproduction system according to the embodiment.
FIG. 6 is a block diagram illustrating an example of a specific configuration of the playback unit.
FIG. 7 is a diagram illustrating an example of processing for switching from the first content to the second content.
FIG. 8 is a sequence diagram illustrating an example of a reproduction method by the reproduction system according to the embodiment.
FIG. 9 is a flowchart illustrating an example of the details of the reproduction process performed by the reproduction apparatus according to the embodiment.
FIG. 10 is a sequence diagram illustrating an example of a registration method by the reproduction system according to the embodiment.
FIG. 11 is a block diagram illustrating an example of a functional configuration of a reproduction system according to a modification of the embodiment.

Hereinafter, embodiments will be described in detail with reference to the drawings as appropriate. However, unnecessarily detailed description may be omitted. For example, detailed descriptions of well-known matters and repeated descriptions of substantially the same configuration may be omitted. This is to keep the following description from becoming unnecessarily redundant and to facilitate understanding by those skilled in the art.

In addition, the inventor provides the accompanying drawings and the following description so that those skilled in the art can fully understand the present disclosure; they are not intended to limit the claimed subject matter.

(Embodiment)
The embodiment will be described below with reference to FIGS.

[1-1. Configuration]
FIG. 1 is a schematic diagram of a reproduction system according to an embodiment.

Specifically, FIG. 1 shows a playback device 100, a server 200, a communication network 300, and an information processing apparatus 400. For example, the playback system 1 includes the playback device 100 and the server 200 among these components. The playback system 1 may further include the information processing apparatus 400. In the playback system 1, a plurality of playback devices 100 may be connected to the communication network 300. Likewise, a plurality of information processing apparatuses 400 may be connected to the communication network 300.

The playback system 1 is a system in which the server 200 provides, to a first user via the playback device 100, content composed of a combination of mutually independent video content and sound content. One playback device 100 may correspond to one first user or to a plurality of first users. When the playback system 1 includes a plurality of playback devices 100, the plurality of playback devices 100 may correspond to a plurality of first users on a one-to-one or one-to-many basis. Alternatively, a plurality of playback devices 100 may correspond to one first user. Similarly, one information processing apparatus 400 may correspond to one second user or to a plurality of second users. When the playback system 1 includes a plurality of information processing apparatuses 400, the plurality of information processing apparatuses 400 may correspond to a plurality of second users on a one-to-one or one-to-many basis. Alternatively, a plurality of information processing apparatuses 400 may correspond to one second user. For example, video content or sound content is provided to the server 200 via the information processing apparatus 400 by a second user such as a content creator.

Here, the independent content is content generated on the assumption that the content itself is reproduced independently. That is, the reproduction time for reproducing the video content constituting the content once from the beginning to the end is often different from the reproduction time for reproducing the sound content once from the beginning to the end. Further, in the video content and the sound content constituting the content, the creator of the video content and the creator of the sound content are often different.

As described above, the playback system 1 can generate a large amount of content by combining video content and sound content that are independent of each other. This makes it possible to reduce the shortage of content.

On the other hand, when video content and sound content that are independent of each other are combined, the playback time of the video content and the playback time of the sound content often differ, as described above. For this reason, when the first content is played back and playback is then switched to the second content, playback of one of the first video content and the first sound content constituting the first content may have finished while playback of the other has not. That is, when playback of video content and sound content is switched from the first content to the second content at a specified timing, either the switch to the second video content occurs partway through playback of the first video content, or the switch to the second sound content occurs partway through playback of the first sound content.

In this case, switching from the first video content to the second video content partway through playback is likely to give the user a greater sense of discomfort than switching from the first sound content to the second sound content partway through playback. For this reason, the present inventor has found that the sense of discomfort given to the user can be further reduced by performing a reproduction process that stops the sound content at the timing when the video content ends.

Hereinafter, the configuration of the playback system 1 for performing the playback process will be described in detail.

Next, the hardware configuration of the playback apparatus 100 will be described with reference to FIG.

FIG. 2 is a block diagram showing an example of the hardware configuration of the playback device.

As shown in FIG. 2, the playback device 100 includes, as its hardware configuration, a CPU (Central Processing Unit) 101, a main memory 102, a storage 103, a communication IF (Interface) 104, a display 105, and a speaker 106.

The CPU 101 is a processor that executes a control program stored in the storage 103 or the like.

The main memory 102 is a volatile storage area used as a work area used when the CPU 101 executes a control program.

The storage 103 is a non-volatile storage area that holds a control program, content, and the like.

The communication IF 104 is a communication interface that communicates with the server 200 via the communication network 300. The communication IF 104 is, for example, a wired LAN interface. The communication IF 104 may be a wireless LAN interface. Further, the communication IF 104 is not limited to a LAN interface, and may be any communication interface as long as it can establish a communication connection with the communication network 300.

The display 105 is a display device that displays a processing result in the CPU 101. The display 105 displays, for example, video obtained by playing video content. The display 105 is, for example, a liquid crystal display or an organic EL display.

The speaker 106 outputs the processing result of the CPU 101 as sound. The speaker 106 outputs, for example, sound or music obtained by playing sound content.

The hardware configuration of the server 200 will be described with reference to FIG.

FIG. 3 is a block diagram showing an example of the hardware configuration of the server.

As shown in FIG. 3, the server 200 includes a CPU 201 (Central Processing Unit), a main memory 202, a storage 203, and a communication IF (Interface) 204 as hardware configurations.

The CPU 201 is a processor that executes a control program stored in the storage 203 or the like.

The main memory 202 is a volatile storage area used as a work area used when the CPU 201 executes a control program.

The storage 203 is a non-volatile storage area that holds a control program, content, and the like.

The communication IF 204 is a communication interface that communicates with the playback apparatus 100 or the information processing apparatus 400 via the communication network 300. The communication IF 204 is, for example, a wired LAN interface. Note that the communication IF 204 may be a wireless LAN interface. The communication IF 204 is not limited to a LAN interface, and may be any communication interface as long as it can establish a communication connection with the communication network 300.

The hardware configuration of the information processing apparatus 400 will be described with reference to FIG.

FIG. 4 is a block diagram illustrating an example of a hardware configuration of the information processing apparatus.

As shown in FIG. 4, the information processing apparatus 400 includes, as its hardware configuration, a CPU (Central Processing Unit) 401, a main memory 402, a storage 403, a communication IF (Interface) 404, and an input IF (Interface) 405.

The CPU 401 is a processor that executes a control program stored in the storage 403 or the like.

The main memory 402 is a volatile storage area used as a work area used when the CPU 401 executes a control program.

The storage 403 is a non-volatile storage area that holds a control program, content, and the like.

The communication IF 404 is a communication interface that communicates with the server 200 via the communication network 300. The communication IF 404 is, for example, a wired LAN interface. Note that the communication IF 404 may be a wireless LAN interface. The communication IF 404 is not limited to a LAN interface, and may be any communication interface as long as it can establish a communication connection with the communication network 300.

The input IF 405 is an input device such as a numeric keypad, a keyboard, or a mouse.

Next, the functional configuration of the playback system 1 will be described with reference to FIG.

FIG. 5 is a block diagram illustrating an example of a functional configuration of the reproduction system according to the embodiment.

First, the functional configuration of the playback apparatus 100 will be described.

The playback apparatus 100 includes a communication unit 110 and a playback unit 130. The playback device 100 may further include a content DB (Database) 120.

The communication unit 110 acquires the first content from the server 200 via the communication network 300. The first content includes first video content and first sound content that are independent of each other. Further, the communication unit 110 acquires the second content from the server 200 via the communication network 300. The second content includes second video content and second sound content that are independent of each other. The communication unit 110 is realized by the CPU 101, the main memory 102, the storage 103, and the communication IF 104, for example.

The content DB 120 stores the first content and the second content acquired by the communication unit 110. The content DB 120 is realized by the storage 103, for example. The first content and the second content stored in the content DB 120 are not limited to content acquired by the communication unit 110 and may be content stored in advance. Content acquired by the communication unit 110 and content stored in advance may also be mixed. The content stored in advance is, for example, content stored in the content DB 120 before factory shipment.

Here, the reproducing unit 130 will be described with reference to FIGS. 6 and 7.

FIG. 6 is a block diagram showing an example of a specific configuration of the playback unit.

The reproduction unit 130 reproduces the first content C10 or the second content C20 acquired by the communication unit 110. Note that the playback unit 130 may perform streaming playback of the first content C10 or the second content C20 acquired by the communication unit 110, or may read the first content C10 or the second content C20 from the content DB 120 and play it back.

The playback unit 130 includes a video playback unit 131 and a sound playback unit 132.

The video playback unit 131 plays back video content. Specifically, the video reproduction unit 131 reproduces video content and displays the video obtained by the reproduction on the display 105.

The sound reproduction unit 132 reproduces sound content. Specifically, the sound reproduction unit 132 reproduces sound content and causes the speaker 106 to output sound obtained by the reproduction.

Specifically, the playback unit 130 plays back the second content C20 after playing back the first content C10, for example, as shown in FIG. For example, the playback unit 130 plays back the first video content C11 and the first sound content C12 in a first period as playback of the first content C10, and plays back the second video content C21 and the second sound content C22 in a second period after the first period as playback of the second content C20. At a specified timing, the playback unit 130 switches from playback of the first video content C11 to playback of the second video content C21 and from playback of the first sound content C12 to playback of the second sound content C22.

For example, at a first timing when playback of the first video content C11 of the first content C10 ends, the playback unit 130 may stop playback of the first sound content C12 and start playback of the second content C20. When the playback time of the first video content C11 is shorter than the playback time of the first sound content C12, the playback unit 130 stops playback of the first sound content C12 at the first timing even if playback of the first sound content C12 has not ended. Note that the playback time is the time required to play back the content once from beginning to end at normal (1x) speed. That is, each of the first and second video contents C11 and C21 and the first and second sound contents C12 and C22 is content with a playback time of finite length.

Note that each of the first and second sound contents C12 and C22 may be sound content that is played back in an infinite loop. Sound content played back in an infinite loop is, for example, content that includes control information for causing the playback device 100 to play the sound content again from its beginning at the timing when one playback ends. Sound content played back in an infinite loop is also, for example, content configured so that its end point and its start point are connected seamlessly during playback. Here, being connected seamlessly means, for example, that the sound at the end of the sound content and the sound at the start of the sound content are similar to each other. Similar sounds are sounds that both fall within a predetermined volume range and a predetermined frequency region.
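The similarity condition for seamless looping can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the description does not say how the sound at an end point is measured, so the sketch assumes each end point is summarized by one volume value and one dominant frequency. The names `SoundSnapshot` and `loop_points_similar` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Range = Tuple[float, float]  # (lower bound, upper bound), inclusive


@dataclass(frozen=True)
class SoundSnapshot:
    """Assumed summary of the sound at one end of the sound content."""
    volume_db: float     # representative volume at that point
    dominant_hz: float   # representative frequency at that point


def _inside(value: float, bounds: Range) -> bool:
    low, high = bounds
    return low <= value <= high


def loop_points_similar(end: SoundSnapshot, start: SoundSnapshot,
                        volume_range: Range, frequency_range: Range) -> bool:
    """Return True when both the end sound and the start sound fall within
    the predetermined volume range and the predetermined frequency region."""
    return all(
        _inside(s.volume_db, volume_range) and _inside(s.dominant_hz, frequency_range)
        for s in (end, start)
    )
```

With such a check, sound content whose end and start both lie in, say, a quiet mid-frequency band would be treated as loopable, while content ending quietly and starting loudly would not.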

Further, the reproduction unit 130 may repeat the reproduction of the first sound content C12 when the reproduction of the first video content C11 continues even after the reproduction of the first sound content C12 is completed. When the playback time of the first video content C11 is longer than the playback time of the first sound content C12, the playback unit 130 repeatedly plays back the first sound content C12, thereby continuing playback of the first sound content C12 throughout the first period in which the first video content C11 is played back. Further, the reproduction unit 130 may repeatedly reproduce the first sound content C12 until the reproduction of the first video content C11 is completed.
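The relationship between the first period and the repeated sound content can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation. It assumes that the first period equals the playback time of the first video content, that the sound content restarts immediately after each complete play, and that durations are given in seconds. The names `LoopPlan` and `plan_sound_loops` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopPlan:
    full_loops: int      # complete plays of the sound content within the first period
    cut_offset: float    # position (s) inside the sound content where playback is stopped
    starts: int          # how many times playback of the sound content is started


def plan_sound_loops(video_duration: float, sound_duration: float) -> LoopPlan:
    """Plan repetition of the first sound content over the first period.

    Assumption: the sound content is stopped at the first timing, i.e. when
    the first video content ends, even if a repetition is still in progress.
    """
    if video_duration <= 0 or sound_duration <= 0:
        raise ValueError("durations must be positive")
    full_loops = int(video_duration // sound_duration)
    cut_offset = video_duration - full_loops * sound_duration
    # A partial repetition adds one more start beyond the complete plays.
    starts = full_loops + (1 if cut_offset > 0 else 0)
    return LoopPlan(full_loops, cut_offset, starts)
```

For a 250-second video and a 100-second sound, the plan has two complete plays and a third play that is stopped 50 seconds in. For a 60-second video and the same sound, there are no complete plays and the single play is stopped at 60 seconds.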

Further, when the reproduction of the first sound content C12 has not ended at the first timing, the reproduction unit 130 may stop the reproduction of the first sound content C12 at the first timing by fading it out. Further, when reproducing the second content C20, the reproduction unit 130 may start the reproduction of the second sound content C22 by fading it in.

FIG. 7 is a diagram for explaining an example of processing for switching from the first content to the second content.

As shown in FIG. 7, the reproduction unit 130 reproduces the first sound content C12 during the first period Δt11 in which the first video content C11 is being reproduced. The reproduction unit 130 repeatedly reproduces the first sound content C12 during the first period Δt11. In the case of FIG. 7, the first period Δt11 is at least twice as long as the reproduction time Δt21 of the first sound content C12. Therefore, in the reproduction unit 130, the sound reproduction unit 132 starts reproduction of the first sound content C12 three times and, at timing t4 when the first period Δt11 ends partway through the third reproduction, switches to reproduction of the second sound content C22. The video playback unit 131 switches to reproduction of the next second video content C21 at timing t4 because the reproduction of the first video content C11 ends at timing t4. In addition, the sound reproduction unit 132 fades out the first sound content C12 and fades in the second sound content C22 around timing t4. Specifically, the sound reproduction unit 132 starts to decrease the reproduction volume of the first sound content C12, which has been reproduced at the first volume, at timing t3, which precedes timing t4 by the fade-out period, and decreases the volume to the second volume by timing t4. The sound reproduction unit 132 then starts reproduction of the second sound content C22 at the third volume at timing t4 and increases the reproduction volume to the fourth volume by timing t5, which follows timing t4 by the fade-in period. The first to fourth volumes may be average volumes over a predetermined period. The first volume and the fourth volume may be the same volume. Further, the second volume and the third volume may be the same volume.
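The fade-out and fade-in around timing t4 can be sketched as two volume envelopes in Python. The description only states the start and end volumes and the timings t3, t4, and t5. The linear ramp used here is an assumption, and the function names are hypothetical.

```python
def _lerp(a: float, b: float, u: float) -> float:
    """Linear interpolation between a and b for u in [0, 1]."""
    return a + (b - a) * u


def first_sound_volume(t: float, t3: float, t4: float,
                       first_volume: float, second_volume: float) -> float:
    """Volume of the first sound content at time t.

    Assumed linear fade-out from first_volume at t3 to second_volume at t4.
    Playback of the first sound content is stopped by the caller at t4.
    """
    if t <= t3:
        return first_volume
    if t >= t4:
        return second_volume
    return _lerp(first_volume, second_volume, (t - t3) / (t4 - t3))


def second_sound_volume(t: float, t4: float, t5: float,
                        third_volume: float, fourth_volume: float) -> float:
    """Volume of the second sound content at time t (playback starts at t4).

    Assumed linear fade-in from third_volume at t4 to fourth_volume at t5.
    """
    if t <= t4:
        return third_volume
    if t >= t5:
        return fourth_volume
    return _lerp(third_volume, fourth_volume, (t - t4) / (t5 - t4))
```

Other ramp shapes, such as logarithmic or equal-power curves, would satisfy the same description; the linear form is chosen only because it is the simplest to read.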

Note that, when credit information indicating the creator of the content is displayed in a predetermined period before the playback time of the first video content C11 ends, the playback unit 130 may use the period in which the credit information is displayed as the fade-out period. That is, the playback unit 130 may reduce the volume of the first sound content C12 from the first volume to the second volume during the period in which the credit information is displayed. The credit information may be included in the content related information. The credit information may or may not be included in the content related information of the video content. The credit information may or may not be included in the content related information of the sound content.

The reproduction unit 130 is realized by, for example, the CPU 101, the main memory 102, the storage 103, the display 105, and the speaker 106.

Referring back to FIG. 5, the functional configuration of the server 200 will be described.

The server 200 includes a database 210, a comparison unit 220, a generation unit 230, and a communication unit 240.

The database 210 includes a video content DB (Database) 211 and a sound content DB (Database) 212. The video content DB 211 stores a plurality of independent video contents. The video content DB 211 stores content related information corresponding to each of the plurality of video contents together with the plurality of video contents. The sound content DB 212 stores a plurality of independent sound contents. The sound content DB 212 stores content related information corresponding to each of the plurality of sound contents together with the plurality of sound contents. The video content DB 211 stores video content acquired from the information processing apparatus 400 via the communication network 300 by the communication unit 240. Similarly, the sound content DB 212 stores sound content acquired from the information processing apparatus 400 via the communication network 300 by the communication unit 240. Each of the video content DB 211 and the sound content DB 212 is realized by the storage 203, for example.

The content related information is, for example, content metadata (that is, attribute information). One set of metadata exists for each content, and includes information on the playback time, the author, the ambient degree, video ambient degree, or sound ambient degree, and the content genre. Details of the ambient degree, the video ambient degree, and the sound ambient degree will be described later.

The playback time is information indicating the length of time when the content is played back.

The author is information indicating the author of the content, and includes the author's name and contact information.

The ambient degree is an ambient degree associated with the content.

The video ambient degree is the ambient degree associated with the video part included in the content.

The sound ambient degree is an ambient degree associated with a sound part included in the content.

Thus, the ambient degree of content and the like can be set by metadata.
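One possible in-memory shape for a set of metadata is sketched below in Python. The description says only that metadata follows a predetermined format and does not give the format, so the class name `ContentMetadata`, the field names, and the validation rules are all assumptions. The 0 to 100 bound follows the example range this description uses for the ambient degree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentMetadata:
    """Hypothetical container for one content's metadata (one set per content)."""
    playback_time_s: float                         # playback time in seconds
    author_name: str                               # author's name
    author_contact: str                            # author's contact information
    genre: str                                     # content genre
    ambient_degree: Optional[float] = None         # ambient degree of the content
    video_ambient_degree: Optional[float] = None   # ambient degree of the video part
    sound_ambient_degree: Optional[float] = None   # ambient degree of the sound part

    def __post_init__(self) -> None:
        if self.playback_time_s <= 0:
            raise ValueError("playback time must be positive")
        for name in ("ambient_degree", "video_ambient_degree", "sound_ambient_degree"):
            value = getattr(self, name)
            # Degrees are optional; when present they are assumed to lie in 0..100.
            if value is not None and not 0 <= value <= 100:
                raise ValueError(f"{name} must lie in 0..100")
```

The degree fields are optional here because a given content may carry only the degree that applies to it, for example only a sound ambient degree for sound content.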

Metadata is created in a predetermined format. The index is obtained by analyzing the metadata according to the metadata format. The index is associated with the content and is expressed by a continuous value. An example of the index is an estimate of the degree of attention that the user directs to the content being played back. More specifically, the index may be one whose value is smaller the greater the degree of attention the user directs to the content being played back, or one whose value is larger the greater that degree of attention. Here, the former is also referred to as the ambient degree and the latter as the conscious degree. Content to which the user directs a greater degree of attention can be said to be, for example, content suited to watching the screen continuously from the beginning to the end of the playback time and to listening attentively to the output sound.

The index may include brightness, saturation, hue, or the like, which are indices related to the color of the video included in the content being played back, or may include volume, frequency distribution, or the like, which are indices of the sound included in the content being played back. Further, the index may include an index calculated from a plurality of these indices by a predetermined calculation method.

Hereinafter, the explanation uses the ambient degree as the index, but the same explanation holds when the conscious degree or another index is used. The ambient degree is an index expressed as a continuous value from 0 to 100, for example. An ambient degree of 0 means that the degree of attention the user is estimated to direct to the content is the largest, and an ambient degree of 100 means that this estimated degree of attention is the smallest.

The ambient degree associated with the content can be calculated from the video ambient degree that is the ambient degree associated with the video part of the content and the sound ambient degree that is the ambient degree associated with the sound part of the content. The video ambient degree is an example of a video index. The sound ambient degree is an example of a sound index.

The video ambient degree may be calculated based on, for example, the brightness, saturation or hue of the video of the content, or the scene change mode. More specifically, it is calculated as follows.

・ The higher the brightness of the content video, the lower the calculated ambient degree.

・ The higher the saturation of the content video, the lower the calculated ambient degree.

・ Based on the hue of the content video, the higher the proportion of warm colors such as red, orange, or yellow, the higher the calculated ambient degree, and the higher the proportion of cold colors such as blue or purple, the lower the calculated ambient degree.

・ The more scene changes the video contains, the lower the calculated ambient degree.

・ Regarding the mode of video switching at a scene change, the more gradually the image changes when switching from one scene to the next, as in a fade-out, fade-in, or cross-fade, the higher the calculated ambient degree. The more frequently the images are switched when moving from one scene to the next, the lower the calculated ambient degree.

In addition, the sound ambient degree may be calculated based on, for example, the volume of the sound of the content, the frequency distribution of the sound, or the change in volume. More specifically, it is calculated as follows.

・ The higher the volume of the content sound, the lower the calculated ambient degree.

・ Regarding the frequency distribution of the content sound, the more sound there is in the high range (for example, about 1 kHz to 20 kHz) or the low range (for example, about 20 Hz to 200 Hz), the higher the calculated ambient degree, and the more sound there is in the middle range (for example, about 200 Hz to 1 kHz), the lower the calculated ambient degree.

・ The steeper the change in volume, the lower the calculated ambient degree.

Note that any method can be adopted for calculating the ambient degree of the content from the video ambient degree and the sound ambient degree; for example, an average or a weighted average can be used. For example, when the weights of the weighted average range from 0 to 1 and the weight of the video ambient degree is α, the ambient degree of the content is expressed as (Formula 1) below.

Ambient degree of content = α × (video ambient degree) + (1 − α) × (sound ambient degree)   (Formula 1)
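(Formula 1) can be written directly as a small Python function. The function and parameter names are hypothetical; only the weighted-average form is taken from the formula itself.

```python
def content_ambient_degree(video_ambient: float, sound_ambient: float,
                           alpha: float) -> float:
    """Weighted average of (Formula 1).

    alpha is the weight of the video ambient degree and must lie in [0, 1];
    the sound ambient degree receives the complementary weight 1 - alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * video_ambient + (1.0 - alpha) * sound_ambient
```

With a video ambient degree of 80, a sound ambient degree of 40, and α = 0.25, the content ambient degree is 0.25 × 80 + 0.75 × 40 = 50. Setting α = 0.5 reduces the formula to the plain average.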

Here, the weights of the video ambient degree and the sound ambient degree are determined as follows, for example.

(1) When increasing the weight of the sound ambient degree
In general, a person who intends not to pay attention to video presented by the playback device 100 or the like only needs to close their eyes or turn their eyes or body away, which is relatively easy. In contrast, to avoid paying attention to sound presented by the playback device 100 or the like, a person can cover their ears, but this is not so easy, and even with the ears covered it is difficult to completely eliminate the sound that the user perceives. Therefore, for the video part of the content, the user can intentionally divert attention regardless of the video ambient degree, whereas for the sound part of the content, the user has no choice but to direct a degree of attention close to that implied by the sound ambient degree.

Therefore, it is effective to make the weight of the sound ambient degree heavier than the weight of the video ambient degree, that is, to make α smaller than 0.5. In this way, by making the contribution of the degree of attention a person directs to the sound relatively large in the ambient degree associated with the content, the behavior of that ambient degree can be brought closer to the user's own sense of how much attention they direct to the content.

(2) When increasing the weight of the video ambient degree
Although it was stated above that it is relatively easy for a person not to pay attention to the video presented by the playback device 100, a large display 105 makes it difficult to divert attention from the video presented by the playback device 100.

Therefore, it is effective to increase the weight of the video ambient degree as the size of the display 105 on which the content is assumed to be displayed increases. For example, a threshold may be set for the size of the display 105 on which the content is assumed to be displayed, and when the content is assumed to be displayed on a display 105 whose size exceeds the threshold, it is effective to make the weight of the video ambient degree heavier than the weight of the sound ambient degree, that is, to make α larger than 0.5. This threshold can be, for example, about 50 inches or 70 inches in terms of the diagonal length of the display 105.

In this way, by making the contribution of the degree of attention a person directs to the video relatively large in the index associated with the content, the behavior of the ambient degree associated with the content can be brought closer to the user's own sense of how much attention they direct to the content.
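The two weighting cases (1) and (2) can be combined into a simple selection rule for α, sketched below in Python. The description requires only that α be smaller than 0.5 when the sound is weighted more heavily and larger than 0.5 when the display exceeds the size threshold. The concrete values 0.3 and 0.7, the default threshold of 50 inches (one of the two example values given), and the function name `choose_alpha` are assumptions.

```python
def choose_alpha(display_diagonal_inches: float,
                 threshold_inches: float = 50.0,
                 small_display_alpha: float = 0.3,
                 large_display_alpha: float = 0.7) -> float:
    """Pick the video weight alpha from the assumed display size.

    Displays whose diagonal exceeds the threshold get alpha > 0.5 (video
    weighted more heavily); all others get alpha < 0.5 (sound weighted
    more heavily). The specific alpha values are illustrative only.
    """
    if not 0.0 <= small_display_alpha < 0.5 < large_display_alpha <= 1.0:
        raise ValueError("need small_display_alpha < 0.5 < large_display_alpha")
    if display_diagonal_inches > threshold_inches:
        return large_display_alpha
    return small_display_alpha
```

A 32-inch display yields α = 0.3 under these defaults, while a 65-inch display yields α = 0.7. Raising the threshold to 70 inches moves the 65-inch display back into the sound-weighted case.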

Note that α may be changed by an input from the operator of the playback system 1, the provider of the content, or the user. In this way, the weights of the video ambient degree and the sound ambient degree can be changed flexibly. As a result, there is an advantage that content suited to the user's sense can be specified more flexibly.

The video ambient degree and the sound ambient degree may each be classified into a plurality of ranks according to the magnitude of the ambient degree. In this case, the ranges of ambient degree that define the ranks of the video ambient degree and the ranges of ambient degree that define the ranks of the sound ambient degree need not coincide with each other. For example, the video ambient degree may be classified as rank A in the range of 0 to 20, and the sound ambient degree may be classified as rank A in the range of 0 to 30. That is, the same rank may correspond to the same ambient degree range or to different ambient degree ranges for the video ambient degree and the sound ambient degree.
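Rank classification with separate boundaries for video and sound can be sketched in Python with the standard `bisect` module. Only the rank A ranges (0 to 20 for video, 0 to 30 for sound) come from the description. The remaining boundaries, the rank labels B to D, and the choice of inclusive upper bounds are assumptions for illustration.

```python
from bisect import bisect_left
from typing import Sequence

RANKS = ("A", "B", "C", "D")
# Inclusive upper bound of each rank. Only the first entry of each tuple
# (rank A) is taken from the description; the rest are illustrative.
VIDEO_UPPER_BOUNDS = (20, 40, 70, 100)
SOUND_UPPER_BOUNDS = (30, 50, 80, 100)


def rank_of(degree: float, upper_bounds: Sequence[float],
            ranks: Sequence[str] = RANKS) -> str:
    """Return the rank whose range contains the given ambient degree."""
    if not 0 <= degree <= upper_bounds[-1]:
        raise ValueError("ambient degree out of range")
    # bisect_left finds the first bound that is >= degree.
    return ranks[bisect_left(upper_bounds, degree)]
```

Under these boundaries an ambient degree of 25 is rank B as a video ambient degree but rank A as a sound ambient degree, which shows how the same numeric value can fall into different ranks for video and sound.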

Also, the video ambient degree and the sound ambient degree may be normalized so that the minimum value and the maximum value coincide.
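The rank classification and the normalization described above can be sketched as follows. Only the rank A ranges (0 to 20 for video, 0 to 30 for sound) come from the example in the text; the remaining boundaries, the rank names B to D, and the 0 to 100 target scale are hypothetical.

```python
from bisect import bisect_left

RANKS = "ABCD"
# Inclusive upper bounds of ranks A, B, C; anything above falls into rank D.
VIDEO_RANK_BOUNDS = [20, 50, 80]   # rank A: 0 to 20 (from the text)
SOUND_RANK_BOUNDS = [30, 60, 85]   # rank A: 0 to 30 (from the text)


def rank_of(ambient_degree: float, bounds: list) -> str:
    """Map an ambient degree to a rank using per-medium boundaries."""
    return RANKS[bisect_left(bounds, ambient_degree)]


def normalize(values: list, lo: float = 0.0, hi: float = 100.0) -> list:
    """Rescale values linearly so that their minimum maps to lo and maximum to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [lo for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]
```

With these boundaries, an ambient degree of 25 is rank B for video but rank A for sound, which illustrates why the two sets of ranges need not coincide.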

There can be a variety of content. The content may be content that forms part of the environment and is not often watched closely by the user, such as a painting on a wall or part of the wallpaper, floor, or ceiling. Alternatively, the content may be content that is assumed to be viewed in order to obtain information on news or culture, or to obtain entertainment.

Note that the server 200 may calculate the ambient degree by the above method, using at least one of the content stored in the database 210 and the content-related information. When the ambient degree is calculated in this way, the content-related information need not include the ambient degree.

The comparison unit 220 compares the video attribute information included in each of the plurality of video contents with the sound attribute information included in each of the plurality of sound contents. For example, when the genre of the video content matches the genre of the sound content, the comparison unit 220 determines that they are similar to each other. The genre may include the author of the content and the date (or month, year) when the content was created. As another example, the comparison unit 220 compares the video ambient degree and the sound ambient degree using a predetermined method and determines whether or not they are similar. Specifically, when the rank to which the video ambient degree of the video content belongs and the rank to which the sound ambient degree of the sound content belongs are the same among the plurality of ranks classified according to the magnitude of the ambient degree, the comparison unit 220 determines that the video content and the sound content are similar to each other. The comparison unit 220 may calculate the video ambient degree from the metadata included in the video attribute information using the above method, and may calculate the sound ambient degree from the metadata included in the sound attribute information using the above method. The comparison unit 220 is realized by, for example, the CPU 201, the main memory 202, and the storage 203.

The generation unit 230 generates a plurality of contents composed of video content and sound content having attribute information similar to each other according to the comparison result by the comparison unit 220. That is, the generation unit 230 generates a plurality of contents composed of combinations of video content and sound content similar to each other. The generation unit 230 is realized by the CPU 201, the main memory 202, and the storage 203, for example.
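The cooperation of the comparison unit 220 and the generation unit 230 can be sketched as follows. This is an illustrative sketch only; the `Item` record, its field names, and the choice of pairing every similar video and sound are assumptions, and the disclosure does not limit the comparison to these two predicates.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Item:
    """Hypothetical record for one video content or one sound content."""
    content_id: str
    genre: str
    rank: str  # rank of the ambient degree, e.g. "A"


def same_rank(video: Item, sound: Item) -> bool:
    """Similar when both ambient degrees belong to the same rank."""
    return video.rank == sound.rank


def same_genre(video: Item, sound: Item) -> bool:
    """Similar when the genres match."""
    return video.genre == sound.genre


def generate_contents(videos: List[Item], sounds: List[Item],
                      similar: Callable[[Item, Item], bool] = same_rank
                      ) -> List[Tuple[str, str]]:
    """Return (video id, sound id) pairs judged similar by the comparison."""
    return [(v.content_id, s.content_id)
            for v, s in product(videos, sounds) if similar(v, s)]
```

Each returned pair corresponds to one content composed of a video content and a sound content that remain independent of each other.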

The communication unit 240 transmits two or more contents among the plurality of contents generated by the generation unit 230 to the playback device 100 via the communication network 300. When the communication unit 240 receives a content acquisition request from the playback device 100, the communication unit 240 may transmit the content corresponding to the acquisition request to the playback device 100. The communication unit 240 is realized by the communication IF 204, for example.

Next, the functional configuration of the information processing apparatus 400 will be described.

The information processing apparatus 400 includes a content DB 410, a registration unit 420, an input reception unit 430, and a communication unit 440.

The content DB 410 stores video content or sound content. The video content or the sound content is, for example, content created by a second user such as a content creator. When the creator of the video content and the creator of the sound content are different, there are a plurality of second users. The content DB 410 is realized by the storage 403, for example.

The registration unit 420 registers video content or sound content in the server 200 via the communication unit 440 according to information input by the second user to the input reception unit 430. For example, the registration unit 420 registers content-related information such as an ID for identifying the second user, content attribute information, and content playback time in association with the content. The registration unit 420 causes the communication unit 440 to transmit content related information and content to the server 200 via the communication network 300. The registration unit 420 is realized by, for example, the CPU 401, the main memory 402, and the storage 403.

The input reception unit 430 receives an input by the second user. Specifically, the input receiving unit 430 receives an input for the second user to register content in the server 200. The input receiving unit 430 is realized by the input IF 405, for example.

[1-2. Operation]
Next, the operation of the reproduction system 1 will be described.

FIG. 8 is a sequence diagram showing an example of a reproduction method by the reproduction system according to the embodiment.

The server 200 transmits the first content C10 to the playback device 100 via the communication network 300 (S11).

The playback device 100 receives the first content C10 transmitted by the server 200 via the communication network 300 (S21).

The server 200 transmits the second content C20 to the playback device 100 via the communication network 300 (S12).

The playback device 100 receives the second content C20 transmitted by the server 200 via the communication network 300 (S22).

Note that, although the server 200 transmits the first content C10 and the second content C20 separately in this example, the server 200 may transmit the first content C10 and the second content C20 to the playback device 100 together. In that case, the playback device 100 receives the first content C10 and the second content C20 together.

The playback device 100 plays back the received first content C10 and second content C20 (S23). Details of the reproduction processing by the reproduction apparatus 100 will be described later.

FIG. 9 is a flowchart showing an example of details of the reproduction processing by the reproduction apparatus according to the embodiment.

In the playback device 100, when the playback process is started, the playback unit 130 plays back the first content C10 (S31).

The video playback unit 131 of the playback unit 130 acquires the timing when the playback of the first video content C11 included in the first content C10 ends (S32). For example, the video reproduction unit 131 acquires the reproduction time of the first video content C11 from the content related information included in the first video content C11. Then, the video reproduction unit 131 sets the timing after the reproduction time of the first video content C11 from the timing when the reproduction of the first content C10 is started as the timing when the reproduction of the first video content C11 ends.

The sound reproducing unit 132 of the reproducing unit 130 acquires the timing when the reproduction of the first sound content C12 included in the first content C10 ends (S33). For example, the sound reproducing unit 132 acquires the reproduction time of the first sound content C12 from the content related information included in the first sound content C12. Then, the sound reproduction unit 132 sets the timing after the reproduction time of the first sound content C12 from the timing when the reproduction of the first content C10 is started as the timing when the reproduction of the first sound content C12 ends.

Next, the playback unit 130 determines whether or not the playback of the first video content C11 ends before the playback of the first sound content C12 (S34). That is, the playback unit 130 determines whether or not the timing at which the playback of the first video content C11 ends is earlier than the timing at which the playback of the first sound content C12 ends.

When it is determined that the reproduction of the first video content C11 ends before the reproduction of the first sound content C12 (Yes in S34), the reproduction unit 130 determines whether or not the reproduction of the first video content C11 has ended (S35).

When the playback unit 130 determines that the playback of the first video content C11 has ended (Yes in S35), the playback unit 130 stops the playback of the first sound content C12 and starts the playback of the second content C20 (S36). That is, at the timing when the playback of the first video content C11 ends, the playback unit 130 switches from playback of the first video content C11 to playback of the second video content C21, and switches from playback of the first sound content C12 to playback of the second sound content C22. Upon completion of step S36, the playback unit 130 may end the playback process, or may perform the same playback process on the third content next to the second content C20.

On the other hand, when the reproducing unit 130 determines that the reproduction of the first video content C11 has not ended (No in S35), the reproducing unit 130 repeats Step S35. Therefore, the reproducing unit 130 waits until the reproduction of the first video content C11 is completed.

In step S34, when the playback unit 130 determines that the playback of the first video content C11 ends after the timing when the playback of the first sound content C12 ends (No in S34), the playback unit 130 determines whether or not the playback of the first sound content C12 has ended (S37).

When the reproduction unit 130 determines that the reproduction of the first sound content C12 has ended (Yes in S37), the reproduction unit 130 repeats the reproduction of the first sound content C12 (S38), and returns to step S34. In this case, in the next step S34, it is determined whether or not the reproduction of the first video content C11 ends before the reproduction of the first sound content C12 that is repeatedly reproduced.

On the other hand, when the reproducing unit 130 determines that the reproduction of the first sound content C12 has not ended (No in S37), the reproducing unit 130 repeats Step S37. Therefore, the reproducing unit 130 stands by until the reproduction of the first sound content C12 is completed.
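The flow of steps S32 to S38 can be sketched as a calculation on the two reproduction times. This is a simplified sketch that plans the switch in advance instead of polling the playback state; the names `SwitchPlan` and `plan_switch` are hypothetical, and treating the case where both contents end at the same moment as "Yes in S34" is an assumption.

```python
from typing import NamedTuple


class SwitchPlan(NamedTuple):
    switch_time: float         # seconds from the start of the first content C10
    sound_repeats: int         # number of times C12 is restarted (S38)
    sound_cut_position: float  # position inside C12 when it is stopped (S36)


def plan_switch(video_duration: float, sound_duration: float) -> SwitchPlan:
    """Plan when to switch from the first content to the second content."""
    if video_duration <= 0 or sound_duration <= 0:
        raise ValueError("durations must be positive")
    video_end = video_duration   # S32: end timing of the first video content
    sound_end = sound_duration   # S33: end timing of the first sound content
    repeats = 0
    # S34 "No": the sound would end before the video, so it is repeated
    # (S37 -> S38) and the comparison is made again with the new end timing.
    while sound_end < video_end:
        sound_end += sound_duration
        repeats += 1
    # S34 "Yes" -> S35 -> S36: switch when the video ends, stopping the sound.
    cut_position = video_end - (sound_end - sound_duration)
    return SwitchPlan(video_end, repeats, cut_position)
```

For a 100-second video and a 30-second sound, the sound is restarted three times and is stopped 10 seconds into its fourth pass, at the moment the video ends.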

FIG. 10 is a sequence diagram showing an example of a registration method by the reproduction system according to the embodiment.

The registration unit 420 of the information processing device 400 selects one content from a plurality of video contents or a plurality of sound contents stored in the content DB 410 according to the input received by the input receiving unit 430 (S41).

The input receiving unit 430 receives input of content related information of the selected content (S42). As a result, the registration unit 420 associates the selected content with the received content-related information.

The communication unit 440 transmits the associated content related information together with the selected content to the server 200 via the communication network 300 (S43).

In the server 200, the communication unit 240 receives the content related information together with the content transmitted by the information processing apparatus 400 (S51).

The database 210 of the server 200 stores content related information together with the content received by the communication unit 240 (S52).
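The registration sequence of steps S41 to S52 can be sketched as follows. This is a minimal sketch in which the communication network 300 is replaced by a direct function call; the class names, field names, and the in-memory dictionary standing in for the database 210 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ContentRelatedInfo:
    """Hypothetical content-related information entered by the second user."""
    creator_id: str          # ID identifying the second user
    attributes: dict         # attribute information, e.g. genre
    playback_time_s: float   # playback time of the content


@dataclass
class Registration:
    content: bytes
    info: ContentRelatedInfo


class ServerDatabase:
    """Stand-in for the database 210 of the server 200."""

    def __init__(self) -> None:
        self._records: Dict[str, Registration] = {}

    def store(self, content_id: str, registration: Registration) -> None:
        self._records[content_id] = registration   # S52

    def get(self, content_id: str) -> Registration:
        return self._records[content_id]


def register(db: ServerDatabase, content_id: str, content: bytes,
             info: ContentRelatedInfo) -> Registration:
    """Associate the selected content with its information and store it."""
    registration = Registration(content, info)     # S41, S42
    db.store(content_id, registration)             # S43, S51 (network omitted)
    return registration
```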

[1-3. Effect etc.]
According to the playback method of the present embodiment, after the first content C10 composed of the first video content C11 and the first sound content C12 that are independent of each other is played back, the second content C20 composed of the second video content C21 and the second sound content C22 that are independent of each other is played back. Therefore, the first sound content C12 can be switched to the second sound content C22 at the timing of switching the first video content C11 to the second video content C21. This makes it possible to reduce the sense of discomfort given to the user when the video content and the sound content are switched to different content.

In the playback method, each of the first content C10 and the second content C20 is composed of a combination of video content and sound content having similar attribute information. For this reason, the impression given to the user can be a unified impression for the video content and the sound content. For this reason, even when the video content and the sound content independent from each other are combined and reproduced, the uncomfortable feeling given to the user can be effectively reduced.

In the playback method, at the timing when the playback of the first video content C11 ends, the playback is switched from the first video content C11 to the second video content C21 and from the first sound content C12 to the second sound content C22. For this reason, the discomfort given to the user can be effectively reduced.

Also, in the playback method, in the playback process, when the playback of the first video content C11 continues even after the playback of the first sound content C12 is completed, the playback of the first sound content C12 is repeated. For this reason, during the reproduction of the first video content C11, the reproduction of the first sound content C12 can be continued. Therefore, the uncomfortable feeling given to the user can be effectively reduced.

Also, in the playback method, in the playback process, when the playback of the first sound content C12 has not ended at the timing when the playback of the first video content C11 ends, the playback of the first sound content C12 is stopped at that timing by fading out. For this reason, switching of reproduction from the first sound content C12 to the second sound content C22 can be realized more naturally. Therefore, it is possible to effectively reduce the uncomfortable feeling given to the user when the video content and the sound content are switched to different content.

In the playback method, in the playback process, the playback of the second sound content C22 is started by fading in the playback of the second content C20. For this reason, switching of reproduction from the first sound content C12 to the second sound content C22 can be realized more naturally. Therefore, it is possible to effectively reduce the uncomfortable feeling given to the user when the video content and the sound content are switched to different content.
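The fade-out of the first sound content C12 and the fade-in of the second sound content C22 can be sketched as gain curves. The disclosure does not specify the shape or length of the fades; the linear ramp, the function names, and measuring the elapsed time from the start of each fade are assumptions made for illustration.

```python
def _clamp01(x: float) -> float:
    """Limit a value to the range 0.0 to 1.0."""
    return max(0.0, min(1.0, x))


def fade_in_gain(elapsed_s: float, fade_duration_s: float) -> float:
    """Gain applied to the second sound content, rising linearly from 0 to 1."""
    if fade_duration_s <= 0:
        raise ValueError("fade duration must be positive")
    return _clamp01(elapsed_s / fade_duration_s)


def fade_out_gain(elapsed_s: float, fade_duration_s: float) -> float:
    """Gain applied to the first sound content, falling linearly from 1 to 0."""
    return 1.0 - fade_in_gain(elapsed_s, fade_duration_s)
```

Multiplying each output sample by the corresponding gain would give a gradual transition rather than an abrupt cut between the two sound contents.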

[1-4. Modified example]
[1-4-1. Modification 1]
In the above-described embodiment, the server 200 includes the comparison unit 220 and the generation unit 230. However, instead, the playback device may include the comparison unit and the generation unit.

FIG. 11 is a block diagram illustrating an example of a functional configuration of a reproduction system according to a modification of the embodiment.

A reproduction system 1A according to the modification includes a server 200A, which does not include the comparison unit 220 and the generation unit 230, and a playback device 100A, which includes a comparison unit 140 corresponding to the comparison unit 220 and a generation unit 150 corresponding to the generation unit 230.

In the server 200A, the communication unit 240 transmits a plurality of video contents stored in the video content DB 211 and a plurality of sound contents stored in the sound content DB 212 to the playback device 100A via the communication network 300.

In the playback device 100A, the communication unit 110 receives a plurality of video contents and a plurality of sound contents transmitted by the server 200A via the communication network 300. The communication unit 110 stores the received plurality of video contents and the plurality of sound contents in the content DB 120. The comparison unit 140 compares the video attribute information included in each of the plurality of video contents with the sound attribute information included in each of the plurality of sound contents. The generation unit 150 generates a plurality of contents composed of video content and sound content having attribute information similar to each other according to the comparison result by the comparison unit 140, and stores the generated plurality of contents in the content DB 120.

Then, the reproducing unit 130 reproduces the second content C20 after reproducing the first content C10 among the plurality of contents stored in the content DB 120. Since the reproduction processing by the reproduction unit 130 is the same as that in the embodiment, description thereof is omitted.

[1-4-2. Modification 2]
When reproducing the contents C10 and C20, the reproducing apparatus 100 in the above embodiment may display an image related to the ambient degree together with the contents C10 and C20. The image may include at least one of an image indicating the ambient degree of the contents C10 and C20 and an image indicating the range of the ambient degree received by a receiving unit such as a remote controller (not shown).

By displaying an image relating to the ambient degree together with the contents C10 and C20 on the display 105, the user visually recognizes the image together with the reproduced contents C10 and C20. If the user visually recognizes an image indicating the degree of ambient, the user can recognize the degree of ambient of the contents C10 and C20 that are currently reproduced. Further, the user can recognize the range of the ambient degree designated by the user by visually recognizing the image indicating the range of the ambient degree. By recognizing these, for example, the user can instruct the playback device 100 to change the specified ambient degree higher or lower than the current degree through the reception unit.

Note that, instead of or together with presenting an image relating to the ambient degree, a sound relating to the ambient degree may be output by the speaker 106. In this case as well, the same effect as described above can be obtained.

[1-5. Other effects]
Further, according to the control method of the playback device shown in the present embodiment and this modification, the content to be played back by the playback device can be specified by specifying, within the range of the index, the index associated with the content. At that time, the user need not recall a search key. The user can specify the content to be played back by the playback device simply by specifying a rough value of the index associated with the content within the range. In this way, the playback device enables more flexible content specification. Also, since flexible content specification is possible, it is possible to avoid the increase in processing load and power consumption of the playback device that would occur if determination of content reflecting the user's intention failed.

Also, the playback device enables more flexible content specification by using, as a specific index, an estimated index that indicates the degree of attention that the user directs to the content being played back.

Also, the playback device, server, or information processing device calculates an index associated with the content based on the degree of attention that the user has directed to each of the video and sound included in the content. As a result, the content index can be calculated in consideration of the video and sound included in the content.

Also, the playback device, server, or information processing device calculates the index associated with the content as a weighted average of the video index and the sound index in which the weight of the sound index is increased. In general, it is relatively easy for a person not to pay attention to the video presented by the playback device, but it is not easy to intentionally not pay attention to the sound. In other words, it is difficult to intentionally turn one's attention away from the sound presented by the playback device. Therefore, by relatively increasing, in the index associated with the content, the contribution of the degree of attention that a person directs to the sound, the index used for specifying the content can be made to match the user's sense of the degree of attention.

Also, the playback device, server, or information processing device calculates the index associated with the content as a weighted average of the video index and the sound index in which the weight of the video index is increased. In general, when the size of the display screen on which the content is displayed is large, it is difficult for the user to turn attention away from the video. In such a case, by relatively increasing, in the index associated with the content, the contribution of the degree of attention that a person directs to the video, the index used for specifying the content can be made to match the user's sense of the degree of attention.

Also, the playback device, server, or information processing device can calculate the video index by specifically using the brightness, saturation, hue, or scene change mode of the video included in the content.

Also, the playback device, server, or information processing device can calculate the sound index by specifically using the volume, frequency distribution, or volume change mode of the sound included in the content.

Also, the playback device, server, or information processing device can cause the user to recognize the index of the content by presenting the index associated with the content along with the content being played back to the user. Then, it is possible to cause the user to make a determination as to whether or not the content that the user wants to present on the playback apparatus is compatible with the index range designated by the user.

In addition, when playing back both video content and sound content, the playback device, the server, or the information processing device can ensure that the indexes of both the video content and the sound content to be played back fall within the range specified by the user. Thus, the user can have the playback device play back video content and sound content that are both estimated to attract a similar degree of attention.
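Selecting content whose video index and sound index both fall within the range specified by the user can be sketched as follows. The dictionary keys `video_index` and `sound_index` and the inclusive treatment of the range boundaries are assumptions for illustration.

```python
from typing import Dict, List


def in_range(value: float, lo: float, hi: float) -> bool:
    """True when value lies within the inclusive range [lo, hi]."""
    return lo <= value <= hi


def select_contents(contents: List[Dict], lo: float, hi: float) -> List[Dict]:
    """Keep contents whose video index and sound index are both in range."""
    return [c for c in contents
            if in_range(c["video_index"], lo, hi)
            and in_range(c["sound_index"], lo, hi)]
```

A content whose video index is in range but whose sound index is not is excluded, so that both media attract a comparable degree of attention.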

Also, the playback device can cause the content provider to recognize the index associated with the content by presenting the index when the content is stored in the server in advance.

Also, the playback device can make the content provider recognize the adjusted content index after adjusting the content. The content provider can recognize the index of the adjusted content, confirm the result of the adjustment made to the content that it provides, and, based on that result, take actions such as deciding whether to store the content in the server.

(Other embodiments)
In each of the above embodiments, each component is realized by executing a software program suitable for each component, but may be configured by dedicated hardware. Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory. Here, the software that realizes the reproduction method of each of the above embodiments is the following program.

That is, this program causes a computer to execute a reproduction method of acquiring first content composed of first video content and first sound content that are independent of each other, acquiring second content composed of second video content and second sound content that are independent of each other, and reproducing the acquired second content after reproducing the acquired first content.

As described above, the playback method, playback system, and playback device according to one or more aspects of the present invention have been described based on the embodiment, but the present invention is not limited to this embodiment. Unless they depart from the gist of the present invention, forms obtained by applying various modifications conceivable by those skilled in the art to the embodiment, and forms constructed by combining components of different embodiments, may also be included within the scope of one or more aspects of the present invention.

In the playback device 100 according to the above embodiment, the playback unit 130 stops the playback of the first sound content C12 and starts the playback of the second content C20 at the first timing when the playback of the first video content C11 of the first content C10 ends, but the present invention is not limited to this. For example, when the ambient degree of the video content is larger than a predetermined value and the ambient degree of the sound content is smaller than the predetermined value, the uncomfortable feeling given to the user is small even if the first video content C11 is switched to the second video content C21 partway through, at the timing when the reproduction of the first sound content C12 ends.

Therefore, the playback unit 130 may determine whether the video ambient degree is larger than a predetermined value (or a predetermined rank) and whether the sound ambient degree is smaller than the predetermined value (or the predetermined rank). Then, as a result of the determination, when the video ambient degree is larger than the predetermined value (or predetermined rank) and the sound ambient degree is smaller than the predetermined value (or predetermined rank), the reproducing unit 130 may stop the reproduction of the first video content C11 and start the reproduction of the second content C20 at the timing when the reproduction of the first sound content C12 of the first content C10 ends.
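The determination described above can be sketched as a choice of which content's end timing triggers the switch. The function name, the string return values, and the option of using separate thresholds for video and sound are hypothetical; by default a single predetermined value is used for both, as in the text.

```python
from typing import Optional


def switch_trigger(video_ambient: float, sound_ambient: float,
                   video_threshold: float,
                   sound_threshold: Optional[float] = None) -> str:
    """Return which content's end timing triggers the switch to the next content.

    "sound_end": the video is highly ambient and the sound is not, so the
                 switch is made when the first sound content ends.
    "video_end": the behavior of the embodiment, switching when the first
                 video content ends.
    """
    if sound_threshold is None:
        sound_threshold = video_threshold
    if video_ambient > video_threshold and sound_ambient < sound_threshold:
        return "sound_end"
    return "video_end"
```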

For example, in the above-described embodiment, the sound ambient degree is described as being based on the volume of the sound of the content, the frequency distribution of the sound, or the change of the volume. However, the present invention is not limited to this. The sound ambient degree may also be based on, among the frequency characteristics of the sound, the degree of approximation to the so-called "1/f fluctuation" characteristic, the number of overtone components, the regularity of the timbre waveform (in the frequency range of several Hz or less), and the like.
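One conceivable way to quantify the degree of approximation to the "1/f fluctuation" characteristic is to fit the slope of the power spectrum on a log-log scale, where an ideal 1/f spectrum has a slope of -1. The disclosure does not state how this approximation is measured; the least-squares fit and both function names below are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def spectral_slope(freqs_hz: Sequence[float], powers: Sequence[float]) -> float:
    """Least-squares slope of log10(power) against log10(frequency)."""
    if len(freqs_hz) != len(powers) or len(freqs_hz) < 2:
        raise ValueError("need at least two (frequency, power) pairs")
    xs = [math.log10(f) for f in freqs_hz]
    ys = [math.log10(p) for p in powers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def distance_from_one_over_f(freqs_hz: Sequence[float],
                             powers: Sequence[float]) -> float:
    """Smaller values mean the spectrum is closer to the 1/f characteristic."""
    return abs(spectral_slope(freqs_hz, powers) + 1.0)
```

A flat (white) spectrum gives a slope near 0 and therefore a distance near 1, while a spectrum whose power halves each time the frequency doubles gives a distance near 0.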

Note that the sound ambient degree is an index still at the research stage compared with the video ambient degree, but it is known that mid-range sound around 200 Hz corresponds to vocals and human speech and is easily picked up by human hearing. Therefore, for such sound, it is considered that the degree of attention directed by the user increases, and the degree of consciousness increases (the ambient degree decreases).

Human beings live while listening to sounds in a wide band that exist in nature (not artificially processed), but the brain always processes these wide band sounds unconsciously. The human brain discriminates unusual sounds using clues such as overtone structure changes and subtle delays, and the degree of attention increases in order to detect danger. That is, it is considered that the degree of consciousness increases (the degree of ambient decreases).

In addition, the human brain tries to understand what differs from nature by unconsciously complementing it, so listening to sounds that differ from those of the natural world uses brain resources, and it is thought that the degree of consciousness increases (the ambient degree decreases). Therefore, not only does music composed to attract the user's attention have a high degree of consciousness (a low ambient degree), but even sounds that exist in the natural world, such as the murmur of a river, may have a reduced ambient degree depending on the recording environment (such as the performance of the microphone or the recording device).

This disclosure can be applied to a playback method or the like that can reduce a sense of discomfort given to a user when video content and sound content are switched to different content.

1, 1A Playback system
100, 100A Playback device
101 CPU
102 Main memory
103 Storage
104 Communication IF
105 Display
106 Speaker
110 Communication unit
120 Content DB
130 Playback unit
131 Video playback unit
132 Sound playback unit
140 Comparison unit
150 Generation unit
200, 200A Server
201 CPU
202 Main memory
203 Storage
204 Communication IF
210 Database
211 Video content DB
212 Sound content DB
220 Comparison unit
230 Generation unit
240 Communication unit
300 Communication network
400 Information processing device
401 CPU
402 Main memory
403 Storage
404 Communication IF
405 Input IF
410 Content DB
420 Registration unit
430 Input reception unit
440 Communication unit
C10 First content
C11 First video content
C12 First sound content
C20 Second content
C21 Second video content
C22 Second sound content

Claims

Obtaining first content composed of first video content and first sound content independent of each other;
Obtaining second content composed of second video content and second sound content independent of each other;
A playback method for playing back the acquired second content after playing back the acquired first content.
The playback method according to claim 1, further comprising:
comparing video attribute information included in each of a plurality of video contents with sound attribute information included in each of a plurality of sound contents; and
generating, according to the comparison result, a plurality of contents each composed of video content and sound content having attribute information similar to each other,
wherein, in the acquisition of the first content, the first content is acquired from the plurality of generated contents, and
in the acquisition of the second content, the second content is acquired from the plurality of generated contents.
The playback method according to claim 2, wherein the video attribute information includes a video index that is an index associated with a video included in the video content and is an estimation index indicating a degree of attention directed by the user to the video content being played back,
the sound attribute information includes a sound index that is an index associated with a sound included in the sound content and is an estimation index indicating a degree of attention directed by the user to the sound content being played back, and
in the comparison, the video index is compared with the sound index.
The playback method according to any one of claims 1 to 3, wherein, in the playback, at a first timing when the playback of the first video content of the first content ends, the playback of the first sound content is stopped and the playback of the second content is started.
The playback method according to claim 4, wherein, in the playback, when the playback of the first video content continues even after the playback of the first sound content ends, the playback of the first sound content is repeated.
The playback method according to claim 4, wherein if the playback of the first sound content does not end at the first timing, the playback of the first sound content is stopped at the first timing by fading out.
The reproduction method according to any one of claims 4 to 6, wherein in the reproduction, reproduction of the second sound content is started by fading in the reproduction of the second content.
A playback system comprising:
a server storing a plurality of video contents and a plurality of sound contents that are independent of each other; and
a playback device connected to the server via a communication network,
wherein the playback device includes:
an acquisition unit that acquires, from the server via the communication network, first content composed of first video content among the plurality of video contents and first sound content among the plurality of sound contents, and second content composed of second video content among the plurality of video contents and second sound content among the plurality of sound contents; and
a playback unit that plays back the second video content and the second sound content constituting the acquired second content after playing back, at least once, the first video content and the first sound content constituting the first content acquired by the acquisition unit.
A playback device comprising:
an acquisition unit that acquires, from an external server via a communication network, first content composed of first video content and first sound content that are independent of each other, and second content composed of second video content and second sound content that are independent of each other; and
a playback unit that plays back the second video content and the second sound content constituting the acquired second content after playing back, at least once, the first video content and the first sound content constituting the first content acquired by the acquisition unit.