WO2021131782A1

WO2021131782A1 - Information processing system, information processing method, and program

Info

Publication number: WO2021131782A1
Application number: PCT/JP2020/046248
Authority: WO
Inventors: 義行小林; 高林　和彦; 山岸　靖明
Original assignee: ソニーグループ株式会社
Priority date: 2019-12-26
Filing date: 2020-12-11
Publication date: 2021-07-01

Abstract

The present disclosure relates to an information processing system, an information processing method, and a program which enable migration to be performed more efficiently. The information processing system is provided with: a plurality of virtual servers which form a cloud computing environment; and an orchestrator which manages the plurality of virtual servers and executes the migration of an encoder instance from a first virtual server to a second virtual server. In addition, when executing the migration, only data necessary for a takeover is acquired from among data held in the encoder instance of the first virtual server, and the data is set to an encoder instance of the second virtual server. The present technology can be applied to, for example, a video and voice encoding processing system that performs live streaming.

Description

Information processing system and information processing method, and program

This disclosure relates to an information processing system, an information processing method, and a program, and more particularly to an information processing system, an information processing method, and a program that enable more efficient migration.

Generally, cloud computing is a concept in which a large number of computers connected by a computer network such as the Internet are used, mainly for the purpose of providing services via the network. The services provided by cloud computing appear to be provided by physical servers, but they are actually provided by virtual hardware that is simulated or emulated by software running on one or more physical servers.

For example, this virtual hardware is called a virtual machine. The state in which a virtual machine is deployed and executed on a physical server is called a virtual machine instance.

Furthermore, there is a technology called migration that moves virtual machine instances between physical servers. Migration is a technology for continuing service operation on a physical server in a cloud computing environment without stopping the service on the virtual machine instance when an operating system is upgraded or a hardware failure occurs.

For example, Patent Document 1 discloses a method in which "mobility configuration data" is defined in a virtual machine instance execution environment, and migration is performed not on the virtual machine instance itself but within a limited scope such as an application or a process.

In addition, Patent Document 2 discloses an approach of virtualizing an FPGA (Field Programmable Gate Array) by implementing an FPGA emulator, and a method of continuing to use hardware acceleration by the FPGA at both the migration source and the migration destination.

Further, Patent Document 3 discloses a method for improving efficiency, when migrating a multifunction device equipped with a virtual machine, by moving only the necessary data without moving the virtual machine instance itself.

Patent Document 1: Japanese Unexamined Patent Publication No. 2014-96136
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-5576
Patent Document 3: Japanese Unexamined Patent Publication No. 2010-72872

However, when video / audio encoding processing is migrated in a cloud computing environment as described above, it has been difficult to maintain the real-time property of the encoding processing. Likewise, for video / audio encoding processing that uses hardware support, it has been difficult to continue the hardware support in the migration destination virtual machine instance.

Therefore, in a cloud computing environment, there is a demand for a video / audio encoding migration method that is performed with high efficiency so that real-time performance and hardware support can be continued without the need for migration of the virtual machine instance itself.

This disclosure was made in view of such a situation, and is intended to enable more efficient migration.

The information processing system of one aspect of the present disclosure includes a plurality of virtual servers constituting a cloud computing environment, and an orchestrator that manages the plurality of virtual servers and executes migration of an encoder instance from a first virtual server to a second virtual server. At the time of migration, only the data necessary for takeover is acquired from the data held internally by the encoder instance of the first virtual server, and that data is set in the encoder instance of the second virtual server.

In the information processing method of one aspect of the present disclosure, an information processing system that includes a plurality of virtual servers constituting a cloud computing environment, and an orchestrator that manages the plurality of virtual servers and executes migration of an encoder instance from a first virtual server to a second virtual server, performs the following at the time of migration: acquiring only the data necessary for takeover from the data held internally by the encoder instance of the first virtual server, and setting that data in the encoder instance of the second virtual server.

The program of one aspect of the present disclosure causes a computer of an information processing system that includes a plurality of virtual servers constituting a cloud computing environment, and an orchestrator that manages the plurality of virtual servers and executes migration of an encoder instance from a first virtual server to a second virtual server, to execute information processing that includes, at the time of migration, acquiring only the data necessary for takeover from the data held internally by the encoder instance of the first virtual server and setting that data in the encoder instance of the second virtual server.

In one aspect of the present disclosure, at the time of migration, only the data necessary for takeover is acquired from the data held internally by the encoder instance of the first virtual server, and that data is set in the encoder instance of the second virtual server.

FIG. 1 is a block diagram showing a configuration example of a cloud computing environment.
FIG. 2 is a diagram illustrating migration of a virtual machine instance.
FIG. 3 is a block diagram showing a configuration example of an embodiment of a video / audio encoding processing system to which the present technology is applied.
FIG. 4 is a diagram illustrating migration of an audio encoder instance.
FIG. 5 is a flowchart illustrating the migration process.
FIG. 6 is a diagram illustrating CELT audio encoding.
FIG. 7 is a diagram illustrating input and output in CELT.
FIG. 8 is a diagram illustrating copying of only the necessary data in CELT.
FIG. 9 is a diagram illustrating the reference relationship between frames in MPEG-2 video encoding.
FIG. 10 is a diagram illustrating rearrangement of video frames.
FIG. 11 is a diagram illustrating a case where migration occurs between the B5 picture and the B6 picture of the input video frames.
FIG. 12 is a diagram illustrating a modification of the migration process in CELT audio encoding.
FIG. 13 is a diagram illustrating a modification of the migration process in MPEG-2 video encoding.
FIG. 14 is a block diagram showing a configuration example of an embodiment of a computer to which the present technology is applied.

Hereinafter, specific embodiments to which the present technology is applied will be described in detail with reference to the drawings.

<About migration of cloud computing environment and virtual machine instance>
First, before explaining the video / audio encoding processing system to which the present technology is applied, a general cloud computing environment and migration of a virtual machine instance will be described with reference to FIGS. 1 and 2.

FIG. 1 is a block diagram showing a configuration example of a cloud computing environment.

As shown in FIG. 1, the cloud computing environment 11 is configured by connecting an orchestrator 21, a plurality of virtual servers 22, and a virtual network switch 23 via a network. Note that FIG. 1 shows a configuration example in which two virtual servers 22-1 and 22-2 are connected, but a single virtual server 22, or three or more virtual servers 22, may be connected. In the following, when it is not necessary to distinguish between the virtual servers 22-1 and 22-2, they are simply referred to as the virtual server 22.

The orchestrator 21 transmits resource allocation parameters and instance initialization parameters to the virtual server 22, and sets routing for the virtual network switch 23.

The virtual server 22 deploys and executes the virtual machine instance 32 on the physical server 31 according to the resource allocation parameters and the instance initialization parameters, and the virtual machine instance 32 can execute various application programs 33. For example, in the virtual server 22-1, one virtual machine instance 32-1 is deployed to the physical server 31-1, and two application programs 33A-1 and 33B-1 are executed by the virtual machine instance 32-1. In the virtual server 22-2, two virtual machine instances 32A-2 and 32B-2 are deployed to the physical server 31-2; the application program 33A-2 is executed by the virtual machine instance 32A-2, and the application program 33B-2 is executed by the virtual machine instance 32B-2.

The virtual network switch 23 routes packets according to the routing settings by the orchestrator 21 for network virtualization.

In the cloud computing environment 11 having such a configuration, different operating systems can be operated on one physical server 31 (server computer as a physical entity) by using the virtual machine technology. Then, in the cloud computing environment 11, various application programs 33 can be executed in the operating system, and software that normally operates in a different computer architecture environment can be executed.

Further, in the cloud computing environment 11, the virtual machine technology can be used to move the service point or increase or decrease the resources (CPU, memory, etc.) allocated to the service without making the end user aware of it.

The migration of the virtual machine instance 32 will be described with reference to the cloud computing environment 11 shown in FIG. 2.

For example, in the cloud computing environment 11, the migration source virtual server 22-1 takes a memory snapshot of the virtual machine instance 32-1 while the virtual machine instance 32-1 is running, and transfers it to the virtual server 22-2. Subsequently, the virtual server 22-1 acquires the network settings, connection status, and the like of the virtual machine instance 32-1 and transfers them to the virtual server 22-2. In response, the migration destination virtual server 22-2 sets up the memory snapshot, network settings, connection status, and the like on the virtual machine instance 32-2.

Further, in the cloud computing environment 11, the local storage of the migration source virtual server 22-1 is copied to the virtual machine instance 32-2 of the migration destination virtual server 22-2. Then, the migration is realized by restarting the virtual machine instance 32-2 on the virtual server 22-2. After that, the virtual server 22-1 stops the virtual machine instance 32-1.

Note that such migration requires a required time of several seconds to several minutes depending on the size (scale) of the virtual machine instance 32. Therefore, various techniques for improving the efficiency of migration have already been proposed.

For example, when a memory copy is performed for migration, the memory information keeps changing, because the virtual machine instance 32-1 continues to operate on the migration source physical server 31-1 during the memory copy. Therefore, after a memory copy covering the memory area used by the virtual machine instance 32-1 to be moved has been executed on the migration source physical server 31-1, it is necessary to copy the difference, that is, the memory information that changed during the memory copy, and send it to the migration destination physical server 31-2.

Since the memory state also changes while this difference is being copied, the new difference must be copied in turn. When the memory difference has become small after the memory copy and several difference copies, the virtual machine instance 32-1 is stopped at the migration source and the last remaining difference is copied. After that, the virtual machine instance 32-2 is started at the migration destination, and processing is resumed by the virtual machine instance 32-2. Here, the time required from the stop of the virtual machine instance 32-1 to the restart of the virtual machine instance 32-2 is referred to as downtime. Since a long downtime blacks out the system, various techniques for shortening the downtime have already been proposed.
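The iterative copy described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `simulate_precopy`, the page-based accounting, and the fixed percentage of pages dirtied per round are hypothetical and are not taken from the disclosure.

```python
def simulate_precopy(total_pages, dirty_percent, stop_threshold, max_rounds=30):
    """Simulate iterative memory copy for a running virtual machine instance.

    Assumption (hypothetical model): while `pending` pages are being copied,
    the running instance dirties `dirty_percent` percent of that many pages.
    Returns (live copy rounds, pages copied during downtime, total pages sent).
    """
    pending = total_pages      # the first round copies the whole memory area
    sent = 0
    rounds = 0
    # Keep copying differences while the instance is still running.
    while pending > stop_threshold and rounds < max_rounds:
        sent += pending
        rounds += 1
        pending = pending * dirty_percent // 100   # difference created meanwhile
    # The difference is now small: stop the source instance and copy the rest.
    downtime_pages = pending
    sent += downtime_pages
    return rounds, downtime_pages, sent


rounds, downtime_pages, sent = simulate_precopy(100_000, 20, 100)
print(rounds, downtime_pages, sent)
```

In this model the downtime is proportional to `downtime_pages`, which is why reducing the amount of data that must be moved while the instance is stopped shortens the blackout.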

That is, the above-mentioned Patent Document 1 discloses a technique for reducing the amount of data transferred between physical servers by limiting the scope of migration to individual application programs or processes. Further, Patent Document 3 described above, although its implementation is in a multifunction device, shows that the amount of data transferred can be reduced further by moving only the data required for the processing being executed in the virtual machine instance, without moving the virtual machine instance itself. In addition, when a process with hardware acceleration is being executed, it is uncertain whether the migration destination provides the same hardware acceleration function. Therefore, Patent Document 2 discloses a technique for continuing hardware support even in the migration destination virtual machine instance.

As described above, in the conventional technology, a migration technology that suppresses the time to stop the virtual machine instance by suppressing the amount of data transfer between virtual servers is shown. In particular, container-based virtualization technology has made it easier to extract only memory areas related to specific application programs and processes from memory snapshots of virtual machine instances.

However, with the prior art, it was not easy to separate the code part and the data part from the memory area. Therefore, for example, as disclosed in Patent Document 3, it is necessary to design in advance the minimum data set required at the time of migration according to the contents of the application program, the process, and the like.

Also, in the conventional technology, the technology for continuing hardware support even in the migration destination virtual machine instance is shown. However, when the application program or process to be migrated is video / audio encoding processing, software processing and hardware processing may coexist. Therefore, in that case, the encoding process could not be taken over only by the state of the CPU register and the snapshot transfer of the memory.

Furthermore, in all conventional technologies, when transferring data between virtual servers, the migration source virtual machine instance is stopped, so there is a system blackout period. This is a fatal drawback in the video / audio encoding process that requires real-time processing.

In particular, Patent Document 3 discloses a technique for transferring only data necessary for migration. However, the technique disclosed in Patent Document 3 is characterized in that processing can be continued without depending on the processing method of the processing block by transferring the input data for a specific processing block in the multifunction device. This is different from the technology described below.

That is, as described below, the present technology is characterized in that the data existing inside the processing block is taken out and transferred. This focuses on the fact that in general video / audio encoding processing, data compression is performed using the correlation of the video / audio data in the time direction.

Conversely, if only the input data to the video / audio encoder is transferred, then, because the encoder internally uses the temporal correlation of the data, a discontinuity may occur in the output video / audio data when the encoder instance is switched, even if an encoder instance initialized with the same encoding parameters is prepared. In particular, depending on the video / audio coding method, a discontinuity in the output video / audio data is certain to occur. That is, noise is generated in the video or audio by a switching of encoder instances that is executed where the end user cannot see it.

On the other hand, in this technology, it is possible to avoid the occurrence of such discontinuities in the output video / audio data and reproduce the video or audio without generating noise.

<Configuration example of video / audio encoding processing system>
FIG. 3 is a block diagram showing a configuration example of an embodiment of the video / audio encoding processing system 51 to which the present technology is applied.

As shown in FIG. 3, the video / audio encoding processing system 51 is configured by connecting an orchestrator 61, a virtual server 62, a virtual network switch 63, an audio input unit 64, and an audio output unit 65 via a network. For example, the video / audio encoding processing system 51 starts the video / audio encoding processing when the video / audio encoding processing definition document is posted to the orchestrator 61 by the end user. Here, for example, an input URL (Uniform Resource Locator), an output URL, an encoding parameter, and the like are described in the video / audio encoding processing definition document.

The orchestrator 61 manages one or more virtual servers 62. For example, the orchestrator 61 transmits the resource allocation parameter and the instance initialization parameter to the virtual server 62 according to the contents of the video / audio encoding processing definition document. As a result, the orchestrator 61 allocates resources for the virtual machine instance 72 running on the virtual server 62, starts and initializes the application program executed by the virtual machine instance 72, and the like. Further, the orchestrator 61 carries out the migration performed between the plurality of virtual servers 62.

The virtual server 62 has a configuration in which the operating system, the virtual machine instance 72, and the audio encoder instance 73 (application program) operate on the physical server 71. For example, the virtual machine instance 72 may be a hypervisor type virtual environment, a host type virtual environment, or a container type virtual environment. Further, the virtual machine instance 72 may be a virtual environment in which virtualization is performed directly on the physical server 71, and in this case, no operating system is required.

The main function of the virtual network switch 63 is to perform network virtualization and route packets for that purpose. For example, the virtual network switch 63 acquires the audio data input via the audio input unit 64 according to the routing settings (input URL, output URL, etc.) made by the video / audio encoding processing system 51, and outputs the encoded audio data to the audio output unit 65.

Then, in the video / audio encoding processing system 51, the audio encoder instance 73 of the virtual server 62 executes the audio encoding processing for the audio data input from the audio input unit 64 via the virtual network switch 63. After that, in the video / audio encoding processing system 51, the audio data encoded by the audio encoder instance 73 is output to the audio output unit 65 via the virtual network switch 63. The video / audio encoding processing system 51 can execute the video encoding processing by executing the video encoder instance 75 as shown in FIG. 10 described later on the virtual server 62.

Further, when the audio encoder instance 73 is migrated in the video / audio encoding processing system 51, the packet sequence number being processed by the audio encoder instance 73 is supplied to the orchestrator 61. As a result, the orchestrator 61 can acquire only the data required for migration from the audio encoder instance 73 and set that data in the migration destination.

Note that the input unit of the audio encoder instance 73 and the network packet identified by the packet sequence number being processed do not necessarily coincide, because video samples may be fragmented and audio samples may be grouped. Therefore, when migrating the audio encoder instance 73, the video / audio encoding processing system 51 transfers between the virtual servers 62 only the minimum data set that can ensure the continuity and consistency of the video / audio output at the migration source and the migration destination.

The migration of the audio encoder instance 73 will be described with reference to the video / audio encoding processing system 51 shown in FIG. 4.

In the video / audio encoding processing system 51 shown in FIG. 4, the virtual server 62-1 is the migration source, and the virtual server 62-2 is the migration destination. Further, the audio encoder instances 73-1 and 73-2 have intermediate state buffers 74-1 and 74-2 for recording only the data necessary for taking over among the data held inside.

Then, when the migration is performed in the video / audio encoding processing system 51, only the data recorded in the intermediate state buffer 74-1 of the audio encoder instance 73-1 is copied to the intermediate state buffer 74-2 of the audio encoder instance 73-2.

For example, the video / audio encoding processing system 51 can separate the code and the data in the application program or process and execute the migration only in the data part. As a result, in the video / audio encoding processing system 51, the data transfer time can be shortened. Further, the video / audio encoding processing system 51 can receive hardware support even at the migration destination because the code is not migrated.
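As a rough illustration of migrating only the data part, the following sketch serializes an intermediate state buffer into a compact byte string and restores it on another instance. The class and field names (`IntermediateStateBuffer`, `sequence_number`, `history`) and the wire layout are assumptions made for this example. The disclosure does not specify the buffer contents at this point.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class IntermediateStateBuffer:
    """Hypothetical buffer holding only the data needed for takeover."""
    sequence_number: int = 0                       # last processed packet number
    history: list = field(default_factory=list)    # e.g. samples kept from the past

    def to_bytes(self) -> bytes:
        # Assumed layout: 16-bit sequence number, 32-bit count, 32-bit floats.
        header = struct.pack("!HI", self.sequence_number, len(self.history))
        return header + struct.pack(f"!{len(self.history)}f", *self.history)

    @classmethod
    def from_bytes(cls, blob: bytes) -> "IntermediateStateBuffer":
        sequence_number, count = struct.unpack_from("!HI", blob, 0)
        history = list(struct.unpack_from(f"!{count}f", blob, 6))
        return cls(sequence_number, history)


class AudioEncoderInstance:
    """Stand-in encoder; the code itself is never transferred, only its buffer."""
    def __init__(self):
        self.buffer = IntermediateStateBuffer()


source = AudioEncoderInstance()
source.buffer = IntermediateStateBuffer(4711, [0.5, -0.25, 0.125])

destination = AudioEncoderInstance()   # started from the destination's own code
blob = source.buffer.to_bytes()        # the only payload that crosses servers
destination.buffer = IntermediateStateBuffer.from_bytes(blob)
```

Because the destination instance is started from its own copy of the encoder code, it is free to use whatever hardware support its physical server offers, and only the few bytes in `blob` have to be moved.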

<Processing example of migration processing>
An example of the migration process in the video / audio encoding processing system 51 will be described with reference to the flowchart shown in FIG. 5. In the following, the processing of audio frames by the audio encoder instance 73 shown in FIG. 4 will be described, but the same processing is also performed on video frames by the video encoder instance 75 (FIG. 10 described later).

In step S11, the orchestrator 61 transmits, to the migration source virtual server 62-1, a stop preparation instruction command instructing it to prepare to stop the audio encoder instance 73-1 on the virtual server 62-1.

In step S12, the virtual server 62-1 receives the stop preparation instruction command transmitted in step S11, and executes a process of combining or separating network packets as described below as the stop preparation process.

For example, the virtual server 62-1 receives the last audio frame before migration. At this time, the last audio frame may be divided and transmitted as a plurality of network packets, or may be combined and a plurality of frames may be transmitted in one network packet. Therefore, for example, the virtual server 62-1 completes the join processing when the payload of the network packet is in the middle of the join process.

After that, the virtual server 62-1 inputs the audio frame (or group of frames) obtained by the join process to the audio encoder instance 73-1 and performs the encoding process. At the same time, the virtual server 62-1 performs a flush process on the audio encoder instance 73-1 and, if the audio encoder instance 73-1 retains output internally, forces that data to be output. Further, the virtual server 62-1 acquires the initial setting parameters of the encoding from the migration source audio encoder instance 73-1. Then, when the stop preparation process for preparing the instance stop on the virtual server 62-1 is completed, the process proceeds to step S13.

In step S13, the virtual server 62-1 transmits to the orchestrator 61 a completion notification indicating that the stop preparation process executed according to the stop preparation instruction command is completed, together with the initial setting parameters acquired from the audio encoder instance 73-1. As a result, the orchestrator 61 acquires the initial setting parameters.

In step S14, the orchestrator 61 transmits an acquisition request command for requesting (inquiring) acquisition of the packet sequence number of the last processed network packet to the audio encoder instance 73-1. Here, as the packet sequence number of the network packet, for example, when using the RTP (Real-time Transport Protocol) protocol, the value in the sequence number field of the RTP packet header can be referred to and the value can be used.
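For reference, reading the sequence number field of an RTP packet header could look like the sketch below. In the RTP fixed header, the 16-bit sequence number occupies the third and fourth bytes in network byte order. The function names are hypothetical, and the sketch does not show how the encoder instance actually obtains the packet.

```python
import struct


def rtp_sequence_number(packet: bytes) -> int:
    """Return the 16-bit sequence number from an RTP fixed header."""
    if len(packet) < 12:
        raise ValueError("an RTP fixed header is 12 bytes long")
    # Byte 0: version/padding/extension/CSRC count, byte 1: marker/payload type,
    # bytes 2-3: sequence number (big-endian).
    first_byte, _marker_and_type, sequence = struct.unpack_from("!BBH", packet, 0)
    if first_byte >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return sequence


def next_sequence_number(sequence: int) -> int:
    """The sequence number wraps around after 65535."""
    return (sequence + 1) & 0xFFFF


# Hand-built header for illustration: version 2, payload type 96,
# sequence number 4711, timestamp 0, SSRC 0x1234.
example_packet = bytes([0x80, 0x60]) + struct.pack("!HII", 4711, 0, 0x1234)
print(rtp_sequence_number(example_packet))
```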

In step S15, the audio encoder instance 73-1 transmits the packet sequence number of the last processed network packet to the orchestrator 61 in response to the acquisition request command transmitted in step S14. As a result, the orchestrator 61 acquires the packet sequence number of the last processed network packet.

In step S16, the orchestrator 61 determines a new routing in order to transfer the network packet from the migration source audio encoder instance 73-1 to the migration destination audio encoder instance 73-2.

In step S17, the orchestrator 61 transmits a start instruction command instructing the start of the audio encoder instance 73-2 together with resource allocation in the virtual server 62-2 to the virtual server 62-2 to be the migration destination.

In step S18, the virtual server 62-2 receives the start instruction command transmitted in step S17, turns on the power of the physical server 71-2, and performs a start process for starting the virtual machine instance 72-2. At that time, the virtual server 62-2 allocates hardware resources such as CPU and memory based on the contents described in the service definition document. After that, the virtual server 62-2 loads the software and starts the audio encoder instance 73-2.

When the start-up process of step S18 is completed, the process proceeds to step S19, and the virtual server 62-2 sends a completion notification to the orchestrator 61 notifying that the start-up process executed according to the start-up instruction command is completed.

In step S20, the orchestrator 61 transmits an initialization instruction command including an encoding initialization parameter to the virtual server 62-2.

In step S21, the virtual server 62-2 receives the initialization instruction command transmitted in step S20 and executes the initialization process of setting the initialization parameters of the encoding in the audio encoder instance 73-2.

When the initialization process of step S21 is completed, the process proceeds to step S22, and the virtual server 62-2 transmits a completion notification to the orchestrator 61 notifying that the initialization process executed according to the initialization instruction command is completed.

In step S23, the orchestrator 61 transmits an acquisition instruction command instructing acquisition of the data recorded in the intermediate state buffer 74 of the audio encoder instance 73-1 as the migration source to the virtual server 62-1.

In step S24, the virtual server 62-1 receives the acquisition instruction command transmitted in step S23 and extracts the data recorded in the intermediate state buffer 74-1 of the audio encoder instance 73-1.

In step S25, the virtual server 62-1 transmits the data extracted in step S24 to the orchestrator 61. As a result, the orchestrator 61 acquires the data recorded in the intermediate state buffer 74-1 of the audio encoder instance 73-1.

In step S26, the orchestrator 61 transmits to the virtual server 62-2 a setting instruction command instructing to set the data in the intermediate state buffer 74-2 together with the data acquired in step S25.

In step S27, the virtual server 62-2 acquires the data and the setting instruction command transmitted in step S26 and executes the setting process of setting the data in the intermediate state buffer 74-2.

In step S28, the virtual server 62-2 sends a completion notification to the orchestrator 61 to notify that the setting process executed according to the setting instruction command is completed.

In step S29, the orchestrator 61 transmits a termination instruction command instructing the termination of the migration source audio encoder instance 73-1 to the virtual server 62-1.

In step S30, the virtual server 62-1 receives the termination instruction command transmitted in step S29 and executes the termination process of terminating the audio encoder instance 73-1. At this time, the virtual server 62-1 may stop the virtual machine instance 72-1.

When the termination process of step S30 is completed, the process proceeds to step S31, and the virtual server 62-1 sends a completion notification to the orchestrator 61 notifying that the termination process executed according to the termination instruction command is completed.

In step S32, the orchestrator 61 transmits a change instruction command instructing the change of the routing to the virtual network switch 63 together with the setting of the new routing determined in step S16.

In step S33, the virtual network switch 63 receives the new routing setting and the change instruction command transmitted in step S32, and executes the routing change process. For example, the virtual network switch 63 switches its routing setting so that the network packets that were input to the migration source virtual machine instance 72-1 are input to the migration destination virtual machine instance 72-2.

When the routing change processing of step S33 is completed, the processing proceeds to step S34, and the virtual network switch 63 transmits a completion notification to the orchestrator 61 notifying that the routing change processing executed according to the change instruction command is completed.
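The order of steps S11 to S34 can be summarized in a compact sketch. All classes and the `migrate` function below are hypothetical in-memory stand-ins. The commands and completion notifications exchanged over the network are reduced to plain assignments, and packet joining and flushing in step S12 are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EncoderInstance:
    """Hypothetical stand-in for an audio encoder instance."""
    params: dict = field(default_factory=dict)
    intermediate_state: dict = field(default_factory=dict)
    last_sequence_number: int = 0
    running: bool = True


@dataclass
class VirtualServer:
    """Hypothetical stand-in for a virtual server."""
    name: str
    encoder: Optional[EncoderInstance] = None


@dataclass
class VirtualNetworkSwitch:
    """Hypothetical stand-in; `route` names the server that receives input packets."""
    route: str


def migrate(source, destination, switch):
    """Follow the order of steps S11 to S34; returns the last processed packet number."""
    # S11-S13: stop preparation, then the source reports its initialization parameters.
    params = dict(source.encoder.params)
    # S14-S15: ask for the sequence number of the last processed network packet.
    last_sequence_number = source.encoder.last_sequence_number
    # S16: decide the new routing.
    new_route = destination.name
    # S17-S19: allocate resources and start the encoder instance at the destination.
    destination.encoder = EncoderInstance()
    # S20-S22: initialize it with the same encoding parameters.
    destination.encoder.params = params
    # S23-S25: extract only the intermediate state buffer from the source.
    state = dict(source.encoder.intermediate_state)
    # S26-S28: set that data in the destination's intermediate state buffer.
    destination.encoder.intermediate_state = state
    # S29-S31: terminate the source encoder instance.
    source.encoder.running = False
    # S32-S34: switch the routing so that input packets reach the destination.
    switch.route = new_route
    return last_sequence_number


source = VirtualServer("62-1", EncoderInstance({"bitrate": 128000},
                                               {"previous_frame": [0.1, 0.2]}, 4711))
destination = VirtualServer("62-2")
switch = VirtualNetworkSwitch("62-1")
last_processed = migrate(source, destination, switch)
```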

As described above, by executing the migration process, the video / audio encoding processing system 51 can ensure the continuity and consistency of the video / audio output. That is, when video / audio encoding processing operated in a cloud computing environment is migrated because of maintenance of the operating environment or a service handover in a 5G core network, the encoding processing can continue seamlessly and output consistent video or audio. The video / audio encoding processing system 51 is therefore particularly effective for the production and distribution (streaming) of live content.

Further, when video / audio encoding processing is operated in a cloud computing environment, even if a failure occurs in the operating environment, the video / audio encoding processing system 51 can move the encoding processing to another virtual server 62 and recover the video or audio from partway through. As a result, the video / audio encoding processing system 51 is effective not only for the production and distribution (streaming) of live content, but also for, for example, on-demand content production.

Further, when executing video / audio encoding processing in a multi-tenant cloud computing environment, the video / audio encoding processing system 51 can migrate only the video / audio encoding processing while avoiding any effect on other processing on the same virtual server 62.

In addition, the video / audio encoding processing system 51 can keep the migration source virtual machine instance 72-1 and audio encoder instance 73-1 and the migration destination virtual machine instance 72-2 and audio encoder instance 73-2 running in duplicate. As a result, the video / audio encoding processing system 51 can avoid the occurrence of a so-called blackout period, and can stably switch the virtual server 62.

<About CELT audio encoding>
The data recorded in the intermediate state buffer 74 in CELT audio encoding will be described with reference to FIGS. 6 to 8.

FIG. 6 is a diagram illustrating CELT (Constrained Energy Lapped Transform) audio encoding.

CELT is an audio coding method used in live streaming. For example, the audio encoder instance 73 supports CELT encoding.

Generally, in audio coding, a time-frequency transform called MDCT (Modified Discrete Cosine Transform) is used. For example, as shown in FIG. 6, in MDCT processing, the input audio signal is divided into audio frames having a predetermined time width, and MDCT processing blocks of 2N samples (the hatched blocks in FIG. 6), each overlapping the adjacent audio frame by half, are generated. Then, MDCT is performed for each MDCT processing block, and N MDCT coefficients are acquired. In the inverse transformation, 2N signal samples are acquired from N MDCT coefficients. After the MDCT process, the MDCT coefficients obtained by performing MDCT on the MDCT processing blocks obtained from adjacent audio frames are compressed.
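The relationship between 2N-sample blocks, N coefficients, and the half overlap can be illustrated with the following minimal sketch. It assumes the textbook MDCT definition with a sine window and a 2/N scaling in the inverse transform; the function names, the frame size N = 4, and the test signal are choices made here for illustration and are not the transform of any particular encoder.

```python
import math


def sine_window(n_half):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half)) for n in range(2 * n_half)]


def mdct(block):
    """Forward MDCT: 2N windowed samples -> N coefficients."""
    n_half = len(block) // 2
    w = sine_window(n_half)
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n in range(2 * n_half):
            phase = math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            acc += w[n] * block[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples."""
    n_half = len(coeffs)
    w = sine_window(n_half)
    samples = []
    for n in range(2 * n_half):
        acc = 0.0
        for k in range(n_half):
            phase = math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            acc += coeffs[k] * math.cos(phase)
        samples.append(w[n] * (2.0 / n_half) * acc)
    return samples


N = 4
frames = [[float(i * N + j) for j in range(N)] for i in range(4)]
blocks = [frames[i] + frames[i + 1] for i in range(3)]  # blocks overlap by half
decoded = [imdct(mdct(b)) for b in blocks]
# Overlap-add: second half of one block plus first half of the next block.
recovered_f1 = [decoded[0][N + j] + decoded[1][j] for j in range(N)]
```

A single decoded block contains time-domain aliasing; only the sum of the overlapping halves of two adjacent blocks restores the frame, which is why an encoder switch that loses one of the two blocks produces a discontinuity.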

For example, as shown in FIG. 7, in CELT encoding, two temporally adjacent uncompressed audio frames are input to the audio encoder instance 73, and one compressed audio frame is output from the audio encoder instance 73.

Now suppose that, in audio encoding that uses MDCT processing, the uncompressed audio frames input to the audio encoder instance 73-1 are switched so as to be input to the audio encoder instance 73-2 from a certain point in time. In this case, the compressed audio frame output from the audio encoder instance 73-1 and the compressed audio frame output from the audio encoder instance 73-2 may not be continuous as audio.

For example, in order to avoid the discontinuity due to input switching, an MDCT processing block in which the adjacent audio frames at the input switching point overlap by half is required. To acquire such an MDCT processing block, the two methods described below are considered.

The first method is to re-input all the audio frames to the migration destination audio encoder instance 73-2 in order, starting from the first audio frame input to the migration source audio encoder instance 73-1. The second method is to copy the last generated MDCT processing block held by the migration source audio encoder instance 73-1 to the migration destination audio encoder instance 73-2, and to generate the next MDCT processing block using the data of the latter half of that MDCT processing block.

The first method is not realistic considering the processing cost and the applicability to live encoding. Therefore, in the video / audio encoding processing system 51, the second method, that is, the method of acquiring the MDCT processing block last generated at the migration source and setting it at the migration destination, is adopted for the migration of the audio encoder instance 73 that performs CELT encoding.

For example, as shown in FIG. 8, only the internal state data of the audio encoder instance 73-1, that is, only the output of the final frame in the middle of encoding is moved to the audio encoder instance 73-2. FIG. 8 shows an example in which migration is performed between the 6th frame and the 7th frame of the input audio frame sequence.

In this example, the MDCT processing block W (6), which is the temporarily stored data after the MDCT calculation is executed at the time of output of the fifth frame, is transmitted from the audio encoder instance 73-1 to the audio encoder instance 73-2. At this time, by making the audio encoder instance 73-2 hot standby in the virtual server 62-2, it is possible to avoid the occurrence of downtime. Then, the audio encoder instance 73-2 can obtain a complete audio output even after the sixth frame by using the MDCT processing block W (6).

For example, by adopting such a method, it is possible to output audio without a gap. However, it is necessary to acquire the MDCT processing block generated last at the migration source and add an implementation to set it at the migration destination.
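The second method can be sketched as follows. This is a simplified model, not the CELT implementation: `LappedEncoder`, `export_state`, and `import_state` are hypothetical names introduced here, the transform is an unwindowed MDCT, and no quantization or entropy coding is performed. The sketch only checks that handing over the last MDCT processing block makes the destination output identical to that of an encoder that was never migrated.

```python
import math


def mdct(block):
    """Unwindowed MDCT of a 2N-sample block, returning N coefficients."""
    n_half = len(block) // 2
    coeffs = []
    for k in range(n_half):
        acc = 0.0
        for n in range(2 * n_half):
            phase = math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)
            acc += block[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs


class LappedEncoder:
    """Hypothetical encoder whose output depends on the previous and current frame."""

    def __init__(self, frame_size):
        self.frame_size = frame_size
        self.last_block = [0.0] * (2 * frame_size)  # last MDCT processing block

    def encode(self, frame):
        # New block = latter half of the previous block + the incoming frame.
        self.last_block = self.last_block[self.frame_size:] + list(frame)
        return mdct(self.last_block)

    def export_state(self):
        return list(self.last_block)

    def import_state(self, block):
        self.last_block = list(block)


N = 4
frames = [[math.sin(0.3 * (i * N + j)) for j in range(N)] for i in range(10)]

reference = LappedEncoder(N)                       # never migrated
expected = [reference.encode(f) for f in frames]

source, destination = LappedEncoder(N), LappedEncoder(N)
migrated = [source.encode(f) for f in frames[:6]]  # frames 1 to 6 at the source
destination.import_state(source.export_state())    # hand over only the last block
migrated += [destination.encode(f) for f in frames[6:]]

cold = LappedEncoder(N)                            # destination without hand-over
cold_first = cold.encode(frames[6])
```

In this model `migrated` equals `expected`, whereas `cold_first` differs from the seventh reference output, which corresponds to the audible discontinuity the hand-over is meant to avoid.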

The coding methods that include MDCT processing in the audio encoding processing include MP3, Dolby Digital (AC-3), Vorbis (Ogg), Windows Media Audio (WMA), ATRAC, Cook, Advanced Audio Coding (AAC), LDAC, Dolby AC-4, and MPEG-H 3D Audio.

In addition, the present technology can also be applied to encoding processes that include linear predictive coding, such as Adaptive Differential Pulse Code Modulation (ADPCM) and lossless compression such as Free Lossless Audio Codec (FLAC). For example, linear predictive coding includes forward prediction, which predicts an audio sample at a certain time from a group of past audio samples, and backward prediction, which predicts an audio sample at a certain time from a group of future audio samples. Both kinds of prediction clearly require audio samples held by past or future audio frames.
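The dependence of forward prediction on past samples can be illustrated with the following minimal sketch. It assumes a fixed second-order predictor that estimates each sample as 2*x[n-1] - x[n-2]; this predictor, the function name `residuals`, and the sample values are illustrative assumptions and are not presented as the exact algorithm of ADPCM or FLAC.

```python
def residuals(samples, history):
    """Forward prediction with a fixed order-2 predictor.

    `history` holds the two most recent past samples (older, newer).
    Returns the prediction errors and the updated history.
    """
    older, newer = history
    errors = []
    for x in samples:
        errors.append(x - (2 * newer - older))  # prediction error to be coded
        older, newer = newer, x
    return errors, (older, newer)


signal = [3, 5, 8, 12, 17, 23, 30, 38, 47, 57]

full, _ = residuals(signal, (0, 0))                  # never migrated

first, carried = residuals(signal[:5], (0, 0))       # migration source
second, _ = residuals(signal[5:], carried)           # destination with past samples

cold, _ = residuals(signal[5:], (0, 0))              # destination without them
```

Here the data necessary for taking over is just the two most recent samples; with them, `first + second` matches `full`, and without them the first residuals at the destination differ.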

<About MPEG-2 video encoding>
The data recorded in the intermediate state buffer 74 in MPEG-2 (Moving Picture Experts Group -2) video encoding will be described with reference to FIGS. 9 to 11.

FIG. 9 is a diagram illustrating a reference relationship between frames in MPEG-2 video encoding.

For example, it is assumed that the MPEG-2 video encoder is configured to encode the input uncompressed video frames in the order I1, B2, B3, P4, B5, B6, P7, B8, B9, P10. Here, I denotes an I picture, P denotes a P picture, B denotes a B picture, and the number denotes the presentation order.

An "I picture" is a video frame that does not refer to other frames, a "P picture" is a video frame that records only the difference from the temporally preceding video frame, and a "B picture" is a video frame that records only the differences from the two temporally preceding and following video frames. Therefore, an uncompressed video frame encoded as an I picture does not refer to another video frame.

Also, the MPEG-2 video encoder performs reordering processing. The reordering process is a process of rearranging the order of video frames from the presentation order to the decoding order at the time of output of the encoder. For example, as shown in FIG. 10, the video frame sequences of I1, B2, B3, P4, B5, B6, P7, B8, B9, P10 are rearranged inside the video encoder instance 75, and I1, P4, It is output in the order of B2, B3, P7, B5, B6, P10, B8, B9.
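The reordering described above can be sketched as follows. This is a simplified model that works on picture labels only; the function name `to_decoding_order` is introduced here for illustration, and the rule that B pictures are output immediately after the next I or P picture is an assumption consistent with the order given in FIG. 10.

```python
def to_decoding_order(presentation):
    """Reorder picture labels such as 'B2' from presentation to decoding order."""
    out, waiting_b = [], []
    for pic in presentation:
        if pic[0] == "B":
            waiting_b.append(pic)   # B pictures wait for the next I/P picture
        else:
            out.append(pic)         # the I or P picture is output first ...
            out.extend(waiting_b)   # ... followed by the B pictures that precede it
            waiting_b = []
    return out + waiting_b          # flush any trailing B pictures


presentation = ["I1", "B2", "B3", "P4", "B5", "B6", "P7", "B8", "B9", "P10"]
decoding = to_decoding_order(presentation)
```

For the sequence above, `decoding` is I1, P4, B2, B3, P7, B5, B6, P10, B8, B9, matching the output order described for the video encoder instance 75.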

For these technical reasons, a video frame encoded as an I picture or a P picture is predictive reference data that may be referenced by other video frames, so it is retained inside the video encoder instance 75 and referenced as necessary. Therefore, when migrating the video encoder instance 75 of an MPEG-2 video encoder, the encoding process breaks down unless the video frames held internally by the migration source video encoder instance 75-1 are copied to the migration destination video encoder instance 75-2.

For example, with reference to FIG. 11, a case where migration occurs between the B5 picture and the B6 picture of the input video frame will be described. That is, in this case, the B5 picture is input to the video encoder instance 75-1 which is the migration source, and the B6 picture is input to the video encoder instance 75-2 which is the migration destination.

At this time, the P4 picture and the B5 picture held inside the video encoder instance 75-1 are copied to the video encoder instance 75-2 as processing data. Here, the P4 picture is copied as a referenced picture, while the B5 picture is copied in a processing-waiting state, that is, as a picture that is still to be encoded. Then, the video encoder instance 75-2 sets the P4 picture and the B5 picture, and waits for the input of the B6 picture. The B2 picture and the B3 picture are output from the video encoder instance 75-1.

Similarly, the case where migration occurs between I1, B2, B3, P4, B5, B6, P7, B8, B9, and P10 will be described below.

First, a case where migration occurs between I1 picture and B2 picture of the input video frame will be described. That is, in this case, the I1 picture is input to the video encoder instance 75-1 which is the migration source, and the B2 picture is input to the video encoder instance 75-2 which is the migration destination. At this time, the I1 picture held inside the video encoder instance 75-1 is copied to the video encoder instance 75-2 as processing data.

In addition, a case where migration occurs between B2 picture and B3 picture of the input video frame will be described. That is, in this case, the B2 picture is input to the video encoder instance 75-1 which is the migration source, and the B3 picture is input to the video encoder instance 75-2 which is the migration destination. At this time, the I1 picture and the B2 picture held inside the video encoder instance 75-1 are copied to the video encoder instance 75-2 as processing data. However, the I1 picture is copied as a referenced picture, but the B2 picture is copied as a process waiting state to be encoded.

In addition, a case where migration occurs between the B3 picture and the P4 picture of the input video frame will be described. That is, in this case, the B3 picture is input to the video encoder instance 75-1 which is the migration source, and the P4 picture is input to the video encoder instance 75-2 which is the migration destination. At this time, the I1 picture, the B2 picture, and the B3 picture held inside the video encoder instance 75-1 are copied to the video encoder instance 75-2 as processing data. However, the I1 picture is copied as a referenced picture, but the B2 and B3 pictures are copied as a process waiting state to be encoded.

In addition, a case where migration occurs between the P4 picture and the B5 picture of the input video frame will be described. That is, in this case, the P4 picture is input to the video encoder instance 75-1 which is the migration source, and the B5 picture is input to the video encoder instance 75-2 which is the migration destination. At this time, only the P4 picture held inside the video encoder instance 75-1 is copied to the video encoder instance 75-2 as processing data, and it is copied as a referenced picture.

In addition, a case where migration occurs between B6 picture and P7 picture of the input video frame will be described. That is, in this case, the B6 picture is input to the video encoder instance 75-1 which is the migration source, and the P7 picture is input to the video encoder instance 75-2 which is the migration destination. At this time, the P4 picture, the B5 picture, and the B6 picture held inside the video encoder instance 75-1 are copied to the video encoder instance 75-2 as processing data. However, the P4 picture is copied as a referenced picture, but the B5 picture and B6 picture are copied as a processing waiting state to be encoded.

Note that the case where migration occurs between P7 picture and B8 picture of the input video frame is processed in the same way as the case where migration occurs between P4 picture and B5 picture of the input video frame. Further, the case where the migration occurs between the B8 picture and the B9 picture of the input video frame is processed in the same manner as the case where the migration occurs between the B5 picture and the B6 picture of the input video frame. Further, the case where the migration occurs between the B9 picture and the P10 picture of the input video frame is processed in the same manner as the case where the migration occurs between the B6 picture and the P7 picture of the input video frame. Therefore, the description of the processing for those cases will be omitted.
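The cases enumerated above follow one pattern, which can be summarized by the following minimal sketch. It is a simplified model that works on picture labels only; the function name `handover_state` and the rule it encodes (the most recent I or P picture is the referenced picture, and the B pictures input after it are waiting to be encoded) are inferred here from the listed cases and are not stated as a general rule in the embodiment.

```python
def handover_state(presentation, split):
    """Return (referenced, waiting) for a migration after `split` input pictures."""
    referenced, waiting = None, []
    for pic in presentation[:split]:
        if pic[0] in "IP":
            referenced, waiting = pic, []  # new reference; earlier B pictures are done
        else:
            waiting.append(pic)            # B picture waits for the next I/P picture
    return referenced, waiting


presentation = ["I1", "B2", "B3", "P4", "B5", "B6", "P7", "B8", "B9", "P10"]
between_b5_b6 = handover_state(presentation, 5)
between_b3_p4 = handover_state(presentation, 3)
```

For example, `between_b5_b6` yields P4 as the referenced picture and B5 as the waiting picture, and `between_b3_p4` yields I1 as the referenced picture with B2 and B3 waiting.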

Note that the present technology can be applied to video encoding processes that calculate prediction differences in the time direction, such as those using I frames, P frames, and B frames. Methods for performing such video encoding include MPEG-4 AVC (Advanced Video Coding) and HEVC (High Efficiency Video Coding).

<Modification example of migration processing of video / audio encoding processing system>
A modification of the migration process performed in the video / audio encoding processing system 51 will be described with reference to FIGS. 12 and 13.

For example, in the video / audio encoding processing system 51, it is possible to perform the migration without adding an interface for acquiring the internal state of the audio encoder instance 73-1 and setting it in the audio encoder instance 73-2.

For example, when the audio encoder instance 73-1, which is the migration source, receives the final input, the orchestrator 61 instructs the virtual server 62-1 to perform termination processing of the output data. Then, when the audio encoder instance 73-2, which is the migration destination, receives the first output, the orchestrator 61 instructs the virtual server 62-2 to perform the initialization processing of the output data.

That is, in the migration process described with reference to the flowchart shown in FIG. 5, after the processes from step S11 to step S22 are performed in the same manner, the orchestrator 61 transmits, instead of performing the process in step S23, a termination processing instruction command instructing termination processing of the output data to the virtual server 62-1.

For example, the termination processing of the output data in CELT audio encoding is to fade out the final output by inputting silent data after the input voice, as described later with reference to FIG. 12. At the same time, flush processing of the audio encoder instance 73-1 is performed, and if there is data whose output is held internally by the audio encoder instance 73-1, it is forcibly output.

Further, the termination processing of the output data in MPEG-2 video encoding is to forcibly encode the data as a GOP end, as described later with reference to FIG. 13. At the same time, flush processing of the video encoder instance 75-1 is performed, and if there is data whose output is held internally by the video encoder instance 75-1, it is forcibly output.

After that, the routing change processing in the virtual network switch 63 is performed in the same manner as the processing in steps S32 to S34 of FIG.

Then, the orchestrator 61 transmits a start point processing instruction command instructing the start point processing of the output data to the virtual server 62-2.

For example, the start point processing of the output data in CELT audio encoding is to fade in the first output by inputting silent data prior to the input voice, as described later with reference to FIG. 12.

Further, the start point processing of the output data in MPEG-2 video encoding is to forcibly encode the data as a GOP start, as described later with reference to FIG. 13.

After that, the termination process for terminating the audio encoder instance 73-1 is performed in the same manner as the processes in steps S29 to S31 of FIG.

By executing such a migration process in the video / audio encoding processing system 51, the coding boundary can be forcibly generated.

A modified example of the migration process in CELT audio encoding will be described with reference to FIG.

FIG. 12 shows an example in which migration is performed between the 6th frame and the 7th frame of the input audio frame sequence. First, the sixth frame is input to the audio encoder instance 73-1, and then silence data is input to forcibly generate the final frame. Therefore, the final frame output from the audio encoder instance 73-1 is in a state of being faded out.

On the other hand, the coding parameters are transmitted to the audio encoder instance 73-2, and the audio encoder instance 73-2 is initialized using the received coding parameters and, after silent data is input, waits for the input of the seventh and subsequent frames. Then, the audio encoder instance 73-2 discards the first output frame and outputs from the second output frame. At this time, the second output frame is in a state of being faded in. Further, by making the audio encoder instance 73-2 hot standby in the virtual server 62-2, it is possible to avoid the occurrence of downtime.

For example, by adopting such a method, one frame is discarded, but audio can be output without generating noise. In addition, it is not necessary to add the implementation described above for acquiring the MDCT processing block last generated at the migration source and setting it at the migration destination.
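The frame flow of this modification can be traced with the following symbolic sketch. It is only a model: `PairEncoder` is a hypothetical class introduced here whose output is simply the pair of inputs that a lapped encoder would combine, so no actual audio is encoded and the fade itself is not computed.

```python
class PairEncoder:
    """Hypothetical lapped encoder: each output covers (previous input, current input)."""

    def __init__(self):
        self.previous = "init"

    def encode(self, frame):
        out = (self.previous, frame)
        self.previous = frame
        return out


frames = [f"f{i}" for i in range(1, 11)]
SILENCE = "silence"

source = PairEncoder()
source_out = [source.encode(f) for f in frames[:6]]  # frames 1 to 6
source_out.append(source.encode(SILENCE))            # forced final frame (fade-out)

destination = PairEncoder()                          # set up from coding parameters only
dest_out = [destination.encode(SILENCE)]             # silent data before the 7th frame
dest_out += [destination.encode(f) for f in frames[6:]]
dest_out = dest_out[1:]                              # discard the first output frame
```

In this trace the last source output pairs the sixth frame with silence and the first retained destination output pairs silence with the seventh frame, so no internal state has to be transferred between the two instances.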

A modified example of the migration process in MPEG-2 video encoding will be described with reference to FIG.

FIG. 13 shows an example in which migration occurs between the B5 picture and the B6 picture of the input video frame. First, the video encoder instance 75-1 creates a GOP boundary by cutting the back reference. As a result, I1, P4, B2, B3, and P5 are output as GOP # 1 from the video encoder instance 75-1.

On the other hand, the video encoder instance 75-2 generates a GOP boundary by cutting the forward reference. As a result, I6, P9, B7, B8, P12, B10, B11, ... are output from the video encoder instance 75-2 as GOP # 2.
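The forced GOP boundary can be traced with the following sketch on picture labels. It rests on assumptions that are not stated in the embodiment: cutting the backward reference is modeled by re-typing a final B picture at the source as a P picture, and the destination is modeled as restarting an I, B, B, P pattern from its first input. The function names are hypothetical.

```python
def to_decoding_order(pictures):
    """Output each I/P picture before the B pictures that precede it."""
    out, waiting_b = [], []
    for pic in pictures:
        if pic[0] == "B":
            waiting_b.append(pic)
        else:
            out.append(pic)
            out.extend(waiting_b)
            waiting_b = []
    return out + waiting_b


def split_at_gop_boundary(presentation, split):
    """Return the decoding-order output of the source and of the destination."""
    source = list(presentation[:split])
    if source[-1][0] == "B":
        source[-1] = "P" + source[-1][1:]    # assumed way of cutting the backward reference
    destination = []
    for offset, pic in enumerate(presentation[split:]):
        kind = "I" if offset == 0 else "BBP"[(offset - 1) % 3]
        destination.append(kind + pic[1:])   # restart the GOP pattern at the destination
    return to_decoding_order(source), to_decoding_order(destination)


presentation = ["I1", "B2", "B3", "P4", "B5", "B6",
                "P7", "B8", "B9", "P10", "B11", "B12"]
gop1, gop2 = split_at_gop_boundary(presentation, 5)
```

With the migration placed after the fifth input picture, `gop1` is I1, P4, B2, B3, P5 and `gop2` is I6, P9, B7, B8, P12, B10, B11, which agrees with the sequences given for GOP # 1 and GOP # 2.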

<Computer configuration example>
Next, the series of processes (information processing method) described above can be performed by hardware or software. When a series of processes is performed by software, the programs constituting the software are installed on a general-purpose computer or the like.

FIG. 14 is a block diagram showing a configuration example of an embodiment of a computer on which a program for executing the above-mentioned series of processes is installed.

The program can be recorded in advance on the hard disk 105 or ROM 103 as a recording medium built in the computer.

Alternatively, the program can be stored (recorded) in the removable recording medium 111 driven by the drive 109. Such a removable recording medium 111 can be provided as so-called package software. Here, examples of the removable recording medium 111 include a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, a semiconductor memory, and the like.

The program can be installed on the computer from the removable recording medium 111 as described above, or can be downloaded to the computer via a communication network or a broadcasting network and installed on the built-in hard disk 105. That is, for example, the program can be transferred wirelessly from a download site to the computer via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet.

The computer has a built-in CPU (Central Processing Unit) 102, and the input / output interface 110 is connected to the CPU 102 via the bus 101.

When a command is input by the user by operating the input unit 107 or the like via the input / output interface 110, the CPU 102 executes a program stored in the ROM (Read Only Memory) 103 accordingly. Alternatively, the CPU 102 loads the program stored in the hard disk 105 into the RAM (Random Access Memory) 104 and executes it.

As a result, the CPU 102 performs processing according to the above-mentioned flowchart or processing performed according to the above-mentioned block diagram configuration. Then, the CPU 102 outputs the processing result from the output unit 106, transmits it from the communication unit 108, or records it on the hard disk 105, if necessary, via the input / output interface 110, for example.

The input unit 107 is composed of a keyboard, a mouse, a microphone, and the like. Further, the output unit 106 is composed of an LCD (Liquid Crystal Display), a speaker, or the like.

Here, in the present specification, the processing performed by the computer according to the program does not necessarily have to be performed in chronological order in the order described as the flowchart. That is, the processing performed by the computer according to the program also includes processing executed in parallel or individually (for example, parallel processing or processing by an object).

Further, the program may be processed by one computer (processor) or may be distributed processed by a plurality of computers. Further, the program may be transferred to a distant computer and executed.

Further, in the present specification, the system means a set of a plurality of components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a device in which a plurality of modules are housed in one housing are both systems.

Further, for example, the configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units). On the contrary, the configurations described above as a plurality of devices (or processing units) may be collectively configured as one device (or processing unit). Further, of course, a configuration other than the above may be added to the configuration of each device (or each processing unit). Further, if the configuration and operation of the entire system are substantially the same, a part of the configuration of one device (or processing unit) may be included in the configuration of another device (or other processing unit).

Further, for example, this technology can have a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.

Further, for example, the above-mentioned program can be executed in any device. In that case, the device may have necessary functions (functional blocks, etc.) so that necessary information can be obtained.

Further, for example, each step described in the above flowchart can be executed by one device or can be shared and executed by a plurality of devices. Further, when a plurality of processes are included in one step, the plurality of processes included in the one step can be executed by one device or shared by a plurality of devices. In other words, a plurality of processes included in one step can be executed as processes of a plurality of steps. On the contrary, the processes described as a plurality of steps can be collectively executed as one step.

In the program executed by the computer, the processing of the steps describing the program may be executed in chronological order according to the order described in this specification, or may be executed in parallel or individually at a necessary timing, such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the above-mentioned order. Further, the processing of the steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program.

It should be noted that each of the present technologies described in the present specification can be implemented independently as long as there is no contradiction. Of course, any plurality of the present technologies can be used in combination. For example, some or all of the techniques described in any of the embodiments may be combined with some or all of the techniques described in other embodiments. It is also possible to carry out a part or all of any of the above-mentioned techniques in combination with other techniques not described above.

<Example of configuration combination>
The present technology can also have the following configurations.
(1)
An information processing system including:
a plurality of virtual servers that constitute a cloud computing environment; and
an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server,
in which, at the time of migration,
of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over is acquired, and
the data is set in the encoder instance of the second virtual server.
(2)
The information processing system according to (1) above, wherein when the encoder instance performs MDCT (Modified Discrete Cosine Transform) processing of an audio frame, the data required for taking over is an MDCT processing block.
(3)
The information processing system according to (1) or (2) above, wherein when the encoder instance performs linear predictive coding processing of a video frame, the data required for taking over is predictive reference data.
(4)
The information processing system according to (3) above, wherein when the encoder instance performs a video frame reordering process, the data required for taking over is a referenced video frame.
(5)
An information processing method in which an information processing system including a plurality of virtual servers constituting a cloud computing environment and an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server performs, at the time of migration:
acquiring, of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over; and
setting the data in the encoder instance of the second virtual server.
(6)
A program for causing a computer of an information processing system including a plurality of virtual servers constituting a cloud computing environment and an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server to execute information processing including, at the time of migration:
acquiring, of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over; and
setting the data in the encoder instance of the second virtual server.

Note that the present embodiment is not limited to the above-described embodiment, and various changes can be made without departing from the gist of the present disclosure. Further, the effects described in the present specification are merely examples and are not limited, and other effects may be obtained.

11 cloud computing environment, 21 orchestrator, 22 virtual server, 23 virtual network switch, 31 physical server, 32 virtual machine instance, 33 application program, 51 video / audio encoding processing system, 61 orchestrator, 62 virtual server, 63 virtual network switch, 64 voice input unit, 65 voice output unit, 71 physical server, 72 virtual machine instance, 73 audio encoder instance, 74 intermediate state buffer

Claims

1. An information processing system including:
a plurality of virtual servers that constitute a cloud computing environment; and
an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server,
in which, at the time of migration,
of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over is acquired, and
the data is set in the encoder instance of the second virtual server.
2. The information processing system according to claim 1, wherein when the encoder instance performs MDCT (Modified Discrete Cosine Transform) processing of an audio frame, the data required for taking over is an MDCT processing block.
3. The information processing system according to claim 1, wherein when the encoder instance performs linear predictive coding processing of a video frame, the data required for taking over is predictive reference data.
4. The information processing system according to claim 3, wherein when the encoder instance performs a video frame reordering process, the data required for the takeover is a referenced video frame.
5. An information processing method in which an information processing system including a plurality of virtual servers constituting a cloud computing environment and an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server performs, at the time of migration:
acquiring, of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over; and
setting the data in the encoder instance of the second virtual server.
6. A program for causing a computer of an information processing system including a plurality of virtual servers constituting a cloud computing environment and an orchestrator that manages the plurality of virtual servers and migrates an encoder instance from a first virtual server to a second virtual server to execute information processing including, at the time of migration:
acquiring, of the data held internally by the encoder instance of the first virtual server, only the data necessary for taking over; and
setting the data in the encoder instance of the second virtual server.