WO2019107631A1

WO2019107631A1 - Device and method for producing content

Info

Publication number: WO2019107631A1
Application number: PCT/KR2017/014090
Authority: WO
Inventors: 황벽주; 김민영; 정병준; 배성욱
Original assignee: (주) 유윈인포시스
Priority date: 2017-11-29
Filing date: 2017-12-04
Publication date: 2019-06-06
Also published as: KR101981955B1

Abstract

A content producing device according to an embodiment of the present invention comprises: a content providing unit for receiving an original content; a control unit for receiving an input of a reproduction magnification from a user; an image processing unit for generating and outputting, according to the reproduction magnification, a slow motion image corresponding to an original image of the original content; and an audio processing unit for calculating the number of audio samples according to a type of a target frame of the slow motion image, and generating and outputting slow audio data including an audio sample of the original content according to the number of the audio samples.

Description

Apparatus and method for producing content

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a technique for producing content, and more particularly, to a technique for producing content that includes slow audio data.

To provide slow motion content, both a slow motion image and audio corresponding to the slow motion must be generated. In general, when audio data corresponding to the slow motion (hereinafter referred to as slow audio data) is generated, the number of audio samples allotted to each frame needs to be matched to that frame.

However, when the slow audio data is generated for slow motion at a magnification set arbitrarily by the user rather than at a fixed magnification, the audio samples corresponding to the end of each frame and the audio samples corresponding to the beginning of the next frame are likely to be discontinuous, so there is a high possibility that a plosive sound is produced by a speaker reproducing the slow audio data.

SUMMARY OF THE INVENTION

The present invention provides a content production apparatus and method for generating slow audio data.

According to an aspect of the present invention, there is provided a content production apparatus including: a content providing unit for receiving original content; a control unit for receiving a reproduction magnification from a user; an image processing unit for generating, from an original image of the original content, a slow motion image according to the reproduction magnification and outputting the slow motion image; and an audio processing unit for calculating the number of audio samples according to the type of a target frame of the slow motion image, and generating and outputting slow audio data including audio samples of the original content according to the number of audio samples.

The audio processing unit may include: a cache for storing basic audio samples and spare audio samples corresponding to the reproduction magnification; a pitch shift unit for calculating the number of audio samples according to the type of the target frame of the slow motion image; and a sliding audio buffer for generating and outputting slow audio data including audio samples corresponding to the target frame according to the number of audio samples.

The sliding audio buffer may store the basic audio samples first and, if the number of audio samples is greater than the number of basic audio samples, extract from the spare audio samples in the cache the additional audio samples needed to reach the number of audio samples, and may generate and output slow audio data in which the additional audio samples follow the basic audio samples.

The pitch shift unit may calculate the number of audio samples as follows. When the target frame is an original frame, the number of audio samples is the number of original frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded down. When the target frame is an interpolation frame, the number of audio samples is the number of interpolated frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded up. When the target frame is an additional interpolation frame, the number of audio samples is the value obtained by subtracting the number of original frame samples and the number of interpolated frame samples from the number of 1x audio samples and dividing the result by the number of additional interpolation frames.

According to another aspect of the present invention, there is provided a method of producing content by a content production apparatus, the method including: receiving original content; receiving a reproduction magnification from a user; generating, from an original image of the original content, a slow motion image according to the reproduction magnification; and calculating the number of audio samples according to the type of a target frame of the slow motion image, and generating and outputting slow audio data including audio samples of the original content according to the number of audio samples.

The step of calculating the number of audio samples according to the type of the target frame of the slow motion image and generating and outputting the slow audio data may include: storing basic audio samples and spare audio samples in a cache; calculating the number of audio samples according to the type of the target frame of the slow motion image; and generating and outputting slow audio data including audio samples corresponding to the target frame according to the number of audio samples.

The step of generating and outputting slow audio data including audio samples corresponding to the target frame may include: storing the basic audio samples first in a sliding audio buffer; if the number of audio samples is greater than the number of basic audio samples, extracting from the spare audio samples in the cache the additional audio samples needed to reach the number of audio samples and storing them after the basic audio samples; and generating and outputting slow audio data including the basic audio samples and the additional audio samples.

The step of calculating the number of audio samples according to the type of the target frame of the slow motion image may include: when the target frame is an original frame, calculating, as the number of audio samples, the number of original frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded down; when the target frame is an interpolation frame, calculating, as the number of audio samples, the number of interpolated frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded up; and when the target frame is an additional interpolation frame, calculating, as the number of audio samples, the value obtained by subtracting the number of original frame samples and the number of interpolated frame samples from the number of 1x audio samples and dividing the result by the number of additional interpolation frames.

As described above, according to the present invention, it is possible to prevent a plosive sound generated when reproducing slow audio data according to an arbitrary reproduction magnification.

Further, according to the present invention, slow audio data can be generated at an arbitrary reproduction magnification, without being restricted to designated reproduction magnifications in order to prevent a plosive sound.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating a content production apparatus according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating an audio processing unit of a content production apparatus according to an embodiment of the present invention.

FIG. 3 is a conceptual illustration of the frames of a slow motion image generated by a content production apparatus according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating an original image and original audio input to a content production apparatus according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating an audio sequence when the number of audio samples is not adjusted according to the type of each frame, and an audio sequence when the content production apparatus according to an embodiment of the present invention adjusts the number of audio samples according to the type of each frame.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail below. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

FIG. 1 is a diagram illustrating a content production apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a content production apparatus according to an exemplary embodiment of the present invention includes a content providing unit 110, a control unit 120, an image processing unit 130, and an audio processing unit 140.

The content providing unit 110 receives original content including video and audio from a communication network or a storage medium, transmits the video of the original content (hereinafter referred to as the original image) to the image processing unit 130, and transmits the audio of the original content (hereinafter referred to as the original audio) to the audio processing unit 140.

The control unit 120 receives, from a designated input device (for example, a mouse, a keyboard, or a shuttle interface), range information indicating the portion of the original content whose playback speed is to be changed and the reproduction magnification to be applied to that portion, and transmits the range information and the reproduction magnification to the image processing unit 130 and the audio processing unit 140. The reproduction magnification may be a value that the user inputs arbitrarily, rather than one selected by the user from a set of preset reproduction magnifications.

The image processing unit 130 generates a slow motion image, according to the reproduction magnification, from the part of the original image that corresponds to the range information, and outputs the slow motion image. The image processing unit 130 may generate the slow motion image by inserting interpolation frames between the frames of the original image (hereinafter referred to as original frames) according to the reproduction magnification. In addition, if, after the number of interpolation frames corresponding to the reproduction magnification has been inserted between the original frames, the time at which some original frames are reproduced does not match the reproduction magnification, the image processing unit 130 may generate the slow motion image by inserting additional interpolation frames between those original frames. The process of creating and inserting the interpolation frames and the additional interpolation frames may follow a known method.

The audio processing unit 140 generates slow audio data so that the audio samples of the original audio that correspond to the range information are matched to each frame of the slow motion image according to the reproduction magnification, and outputs the slow audio data. The process by which the audio processing unit 140 generates the slow audio data is described in detail below with reference to FIGS. 2 and 3.

FIG. 2 is a diagram illustrating an audio processing unit of a content production apparatus according to an embodiment of the present invention, and FIG. 3 conceptually illustrates the frames of a slow motion image generated by a content production apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the audio processing unit 140 of a content production apparatus according to an exemplary embodiment of the present invention includes a cache 210, a pitch shift unit 220, and a sliding audio buffer 230.

The cache 210 receives and stores a predetermined number of audio samples (hereinafter referred to as basic audio samples) and a further predetermined number of audio samples (hereinafter referred to as spare audio samples). The cache 210 may store the basic audio samples and the spare audio samples for one frame of the slow motion image.

The pitch shift unit 220 calculates the number of audio samples corresponding to each frame within the range information. For example, the pitch shift unit 220 may identify the type of the frame for which slow audio data is to be generated (hereinafter referred to as the target frame), which is one of an original frame, an interpolation frame, and an additional interpolation frame, and may calculate the number of audio samples corresponding to that type. Suppose that an original image of 0.07 seconds composed of four frames at 1x speed is changed to 0.31x speed according to the user's input, so that the slow motion image consists of four original frames 310, four interpolation frames 320, and additional interpolation frames 330, for a total of 13 frames. The pitch shift unit 220 may calculate, as the number of original frame samples, the value obtained by dividing the number of audio samples corresponding to all frames in the range of the original image at 1x speed (hereinafter referred to as the number of 1x audio samples) by the total number of frames of the slow motion image, rounded down. That is, when the number of audio samples corresponding to one frame is 1602, the pitch shift unit 220 divides 6408, which is four times 1602, by 13, the total number of frames of the slow motion image, and rounds the result down to obtain 492 as the number of original frame samples. In addition, the pitch shift unit 220 may calculate, as the number of interpolated frame samples, the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded up. That is, the pitch shift unit 220 may calculate the number of interpolated frame samples as 493, which is 6408 divided by 13, rounded up.
In addition, the pitch shift unit 220 may calculate, as the number of additional interpolation frame samples, the value obtained by subtracting the number of original frame samples and the number of interpolated frame samples from the number of 1x audio samples and dividing the result by the number of additional interpolation frames. The pitch shift unit 220 transmits the number of original frame samples, the number of interpolated frame samples, or the number of additional interpolation frame samples to the sliding audio buffer 230 according to the type of the target frame. In addition, when the target frame is an interpolation frame or an additional interpolation frame, the pitch shift unit 220 may cause the cache 210 to transmit to the sliding audio buffer 230, from among the spare audio samples stored in the cache 210, as many audio samples as the number of interpolated frame samples or the number of additional interpolation frame samples minus the number of original frame samples (hereinafter referred to as additional audio samples).
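The per-type calculation above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `samples_per_frame_type` and its parameters are hypothetical, and it assumes that the subtraction for additional interpolation frames uses the totals over all original frames and all interpolation frames, so that no sample in the range is lost. The text does not say how a fractional result for additional interpolation frames is rounded, so the sketch returns it as an exact fraction.

```python
from fractions import Fraction
from math import ceil, floor


def samples_per_frame_type(samples_1x, n_original, n_interpolation, n_additional):
    """Return the per-frame audio sample count for each frame type.

    samples_1x is the number of 1x audio samples in the selected range;
    the other arguments are the frame counts of the slow motion image.
    """
    total_frames = n_original + n_interpolation + n_additional
    quotient = Fraction(samples_1x, total_frames)

    original = floor(quotient)      # quotient rounded down
    interpolation = ceil(quotient)  # quotient rounded up
    counts = {"original": original, "interpolation": interpolation}

    if n_additional > 0:
        # Assumption: subtract the totals for all original and all
        # interpolation frames, then share the rest among the additional
        # interpolation frames. Rounding of this value is not specified.
        remaining = (samples_1x
                     - original * n_original
                     - interpolation * n_interpolation)
        counts["additional"] = Fraction(remaining, n_additional)
    return counts


# Example from the description: 4 frames x 1602 samples, 13 slow motion frames.
counts = samples_per_frame_type(6408, n_original=4, n_interpolation=4, n_additional=5)
print(counts["original"], counts["interpolation"], counts["additional"])
```

With the figures from the description, the sketch gives 492 samples per original frame and 493 per interpolation frame, matching the example; under the stated assumption, 2468 samples remain to be shared among the five additional interpolation frames.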

The sliding audio buffer 230 receives and stores the basic audio samples from the content providing unit 110. In addition, the sliding audio buffer 230 receives the additional audio samples from the cache 210, and generates and outputs slow audio data in which the additional audio samples follow the stored basic audio samples.
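A minimal sketch of this buffer behavior is shown below. The function name `fill_sliding_buffer` and the use of Python lists as sample containers are assumptions made for illustration. The sketch also assumes that when the requested count does not exceed the number of basic audio samples, the basic audio samples are output unchanged, a case the description does not spell out.

```python
def fill_sliding_buffer(basic_samples, spare_samples, sample_count):
    """Assemble the slow audio data for one target frame.

    The basic samples are stored first; if sample_count exceeds their
    number, the shortfall is taken from the start of the spare samples
    and placed after the basic samples.
    """
    buffer = list(basic_samples)  # basic audio samples go in first
    shortfall = sample_count - len(buffer)
    if shortfall > 0:
        if shortfall > len(spare_samples):
            raise ValueError("not enough spare samples cached")
        # These are the "additional audio samples" taken from the cache.
        buffer.extend(spare_samples[:shortfall])
    return buffer


# An interpolation frame that needs one sample more than the basic count.
print(fill_sliding_buffer([10, 11], [12, 13, 14], sample_count=3))
```

Because the additional samples are taken from the samples that immediately follow the basic ones, the output for the frame stays contiguous with the original audio.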

FIG. 4 is a diagram illustrating an original image and original audio input to a content production apparatus according to an exemplary embodiment of the present invention, and FIG. 5 is a diagram illustrating the audio sequence in the case where the number of audio samples is not adjusted and the audio sequence in the case where the content production apparatus according to an exemplary embodiment of the present invention adjusts the number of audio samples according to the type of each frame.

Referring to FIG. 4, the content production apparatus may receive content in which five audio samples are matched to each frame of an image composed of twelve frames. As illustrated in FIG. 4, the audio sequence of the content may be linear. When the content illustrated in FIG. 4 is reproduced at 0.5x speed, an interpolation frame may be inserted after each original frame, as shown in FIG. 5. That is, a total of five audio samples are to be matched to frame 1, which is an original frame, and interpolation 1-1, which is an interpolation frame. However, an equal split would require 2.5 audio samples for each of frame 1 and interpolation 1-1, and when audio samples are matched to each frame on that basis in the general manner, the audio sequence is deformed into a non-linear shape, as shown at 510 in FIG. 5. The content production apparatus according to an embodiment of the present invention matches two audio samples to each original frame and three audio samples to each interpolation frame, and can thereby generate slow audio data whose audio sequence is linear, like that of the original content. Accordingly, whereas general low-speed reproduction produces a non-linear audio sequence such as 510, so that a plosive sound is generated in the speaker during audio reproduction, the content production apparatus according to an embodiment of the present invention adjusts the number of audio samples for each frame as in 520, and can thereby prevent a plosive sound even during low-speed reproduction.
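The 0.5x example can be checked with a short Python sketch. The helper `split_audio` is hypothetical, and consecutive integers stand in for the samples of a linear audio sequence; the sketch only illustrates that allotting two samples to each original frame and three to each interpolation frame keeps the sequence contiguous across frame boundaries.

```python
def split_audio(samples, frame_types, counts):
    """Allot consecutive samples to frames according to each frame's type."""
    frames, cursor = [], 0
    for frame_type in frame_types:
        n = counts[frame_type]
        frames.append(samples[cursor:cursor + n])
        cursor += n
    return frames


# 12 original frames with 5 samples each; integers model a linear sequence.
original_audio = list(range(12 * 5))

# At 0.5x speed, each original frame is followed by one interpolation frame.
frame_types = ["original", "interpolation"] * 12
frames = split_audio(original_audio, frame_types,
                     {"original": 2, "interpolation": 3})

# Each frame starts exactly where the previous one ended.
for previous, current in zip(frames, frames[1:]):
    assert current[0] == previous[-1] + 1
print(frames[0], frames[1])
```

Every original and interpolation pair consumes exactly five samples, so the 24 slow motion frames together carry the same 60 samples as the original content, in the same order.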

FIG. 6 is a flowchart illustrating a method of producing content by a content production apparatus according to an exemplary embodiment of the present invention. Each step described below is performed by one of the functional units constituting the content production apparatus; for brevity and clarity, the subject of each step is collectively referred to as the content production apparatus.

Referring to FIG. 6, in step 610, the content production apparatus receives original content. At this time, the content production apparatus can receive the playback magnification and range information of the slow audio data from the user.

In step 620, the content production apparatus stores the basic audio samples and the spare audio samples of the original content in the cache 210. At this time, the content production apparatus may cache a predetermined number of basic audio samples and spare audio samples, according to the reproduction magnification, starting from the earliest of the audio samples of the original content that have not yet been included in the slow audio data. Accordingly, the content production apparatus stores in the cache 210 a predetermined number of audio samples, according to the reproduction magnification, that follow the audio samples included in the slow audio data for the previous target frame.

In step 630, the content production apparatus calculates the number of audio samples corresponding to the target frame of the slow motion image. For example, the content production apparatus determines whether the target frame of the slow motion image generated according to the reproduction magnification is an original frame, an interpolation frame, or an additional interpolation frame, and calculates the number of audio samples corresponding to that type of target frame.

In step 640, the content production apparatus generates and outputs slow audio data including audio samples of the number of audio samples corresponding to the target frame. That is, the content production apparatus generates and outputs slow audio data including the number of audio samples corresponding to the type of the target frame.

In step 650, the content production apparatus confirms whether the target frame is the last frame corresponding to the range information of the original content.

If the target frame is the last frame corresponding to the range information of the original content in step 650, the content production apparatus ends the production process of the content.

If the target frame is not the last frame corresponding to the range information of the original content in step 650, the content production apparatus repeats the process from step 620.
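Steps 610 to 650 can be sketched as a loop in Python. This is an illustrative sketch under stated assumptions rather than the apparatus itself: the function `produce_slow_audio`, the parameters `basic_size` and `spare_size` (the "predetermined numbers" of cached samples), and the list-based cache are all hypothetical, and the per-type sample counts are passed in already calculated as whole numbers.

```python
def produce_slow_audio(original_audio, frame_types, counts, basic_size, spare_size):
    """Generate slow audio data frame by frame (steps 610 to 650)."""
    # Step 610: the original audio and per-type counts arrive as arguments.
    slow_audio = []
    cursor = 0  # first original sample not yet included in the slow audio data
    for index, frame_type in enumerate(frame_types):
        # Step 620: cache basic and spare samples starting at the cursor.
        basic = original_audio[cursor:cursor + basic_size]
        spare_start = cursor + basic_size
        spare = original_audio[spare_start:spare_start + spare_size]

        # Step 630: number of audio samples for this type of target frame.
        sample_count = counts[frame_type]

        # Step 640: basic samples first, then any additional samples
        # taken from the spare samples.
        additional = spare[:max(0, sample_count - len(basic))]
        frame_audio = basic + additional
        slow_audio.extend(frame_audio)
        cursor += len(frame_audio)

        # Step 650: stop after the last frame in the selected range.
        if index == len(frame_types) - 1:
            break
    return slow_audio


# Two original frames of five samples each, reproduced at 0.5x speed.
audio = list(range(10))
types = ["original", "interpolation", "original", "interpolation"]
result = produce_slow_audio(audio, types, {"original": 2, "interpolation": 3},
                            basic_size=2, spare_size=1)
print(result)
```

Refilling the cache from the cursor on every pass mirrors the return to step 620 for each new target frame, so each frame's samples begin immediately after those output for the previous frame.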

The present invention has been described above with reference to the embodiments thereof. Many embodiments other than the above-described embodiments are within the scope of the claims of the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The disclosed embodiments should, therefore, be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the appended claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

Claims

1. A content production apparatus comprising:

a content providing unit for receiving original content;

a control unit for receiving a reproduction magnification from a user;

an image processing unit for generating, from an original image of the original content, a slow motion image according to the reproduction magnification and outputting the slow motion image; and

an audio processing unit for calculating the number of audio samples according to a type of a target frame of the slow motion image, and generating and outputting slow audio data including audio samples of the original content according to the number of audio samples.
2. The content production apparatus of claim 1, wherein the audio processing unit comprises:

a cache for storing basic audio samples and spare audio samples corresponding to the reproduction magnification;

a pitch shift unit for calculating the number of audio samples according to the type of the target frame of the slow motion image; and

a sliding audio buffer for generating and outputting slow audio data including audio samples corresponding to the target frame according to the number of audio samples.
3. The content production apparatus of claim 2, wherein the sliding audio buffer:

stores the basic audio samples first;

if the number of audio samples is greater than the number of the basic audio samples, extracts from the spare audio samples in the cache the additional audio samples needed to reach the number of audio samples and stores them after the basic audio samples; and

generates and outputs slow audio data including the basic audio samples and the additional audio samples.
4. The content production apparatus of claim 2, wherein the pitch shift unit:

calculates, as the number of audio samples when the target frame is an original frame, the number of original frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded down;

calculates, as the number of audio samples when the target frame is an interpolation frame, the number of interpolated frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded up; and

calculates, as the number of audio samples when the target frame is an additional interpolation frame, the value obtained by subtracting the number of original frame samples and the number of interpolated frame samples from the number of 1x audio samples and dividing the result by the number of additional interpolation frames.
5. A method of producing content by a content production apparatus, the method comprising:

receiving original content;

receiving a reproduction magnification from a user;

generating, from an original image of the original content, a slow motion image according to the reproduction magnification; and

calculating the number of audio samples according to a type of a target frame of the slow motion image, and generating and outputting slow audio data including audio samples of the original content according to the number of audio samples.
6. The method of claim 5, wherein the step of calculating the number of audio samples according to the type of the target frame of the slow motion image and generating and outputting the slow audio data including the audio samples of the original content according to the number of audio samples comprises:

storing basic audio samples and spare audio samples corresponding to the reproduction magnification in a cache;

calculating the number of audio samples according to the type of the target frame of the slow motion image; and

generating and outputting slow audio data including audio samples corresponding to the target frame according to the number of audio samples.
7. The method of claim 6, wherein the step of generating and outputting slow audio data including audio samples corresponding to the target frame according to the number of audio samples comprises:

storing the basic audio samples first in a sliding audio buffer;

if the number of audio samples is greater than the number of basic audio samples, extracting from the spare audio samples in the cache the additional audio samples needed to reach the number of audio samples and storing them after the basic audio samples; and

generating and outputting slow audio data including the basic audio samples and the additional audio samples.
8. The method of claim 6, wherein the step of calculating the number of audio samples according to the type of the target frame of the slow motion image comprises:

when the target frame is an original frame, calculating, as the number of audio samples, the number of original frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded down;

when the target frame is an interpolation frame, calculating, as the number of audio samples, the number of interpolated frame samples, which is the value obtained by dividing the number of 1x audio samples by the total number of frames of the slow motion image, rounded up; and

when the target frame is an additional interpolation frame, calculating, as the number of audio samples, the value obtained by subtracting the number of original frame samples and the number of interpolated frame samples from the number of 1x audio samples and dividing the result by the number of additional interpolation frames.