CN103928036A - Method and device for generating audio file according to image - Google Patents

Publication number
CN103928036A
Authority
CN
Grant status
Application
Prior art keywords: image, audio, file, according, method
Application number
CN 201310013003
Other languages
Chinese (zh)
Inventor
谢巍
Original Assignee
联想(北京)有限公司

Abstract

The invention discloses a method for generating an audio file from an image, and relates to the field of electronic technology. The image can be expressed through an audio file, which makes the user experience more diverse. The method comprises: acquiring a luminance-chrominance image, wherein the image contains three factor values for each pixel; calculating the pitch (tone) and note duration corresponding to each pixel from any two factor values of that pixel; and recording the pitch and duration corresponding to each pixel in the image to generate the audio file. The method and apparatus are mainly used in the process of generating an audio file from an image.

Description

A method and apparatus for generating an audio file from an image

Technical Field

[0001] The present invention relates to the field of electronic technology, and in particular to a method and apparatus for generating an audio file from an image.

Background

[0002] Multimedia electronic products have brought many different experiences to people's lives and work; for example, photos, video, and audio can all be enjoyed on electronic products.

[0003] Image data such as photos and video is represented by factor values for each pixel. An image may be, for example, a red-green-blue (RGB) image or a luminance-chrominance (YUV) image. Taking a YUV image as an example, every pixel in the image is represented by a Y value, a U value, and a V value, where Y represents the pixel's luminance and U and V represent its chrominance. A display device renders the image from the Y, U, and V values of each pixel.
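
By way of a non-limiting illustration, the relationship between the two pixel representations mentioned above can be sketched as follows. The document does not state which YUV variant is intended; the sketch assumes the full-range 8-bit JFIF (BT.601) conversion, and the function name `rgb_to_yuv` is hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to full-range 8-bit (Y, U, V).

    Assumption: JFIF-style BT.601 coefficients, with U = Cb and V = Cr.
    The source text does not specify a particular YUV variant.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

    def clamp(x):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, round(x)))

    return clamp(y), clamp(u), clamp(v)
```

Under this assumption every pixel yields three factor values in the range 0 to 255, which is the form the later paragraphs rely on.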

[0004] However, even when the image is displayed, the user can only appreciate it visually, so the user experience is limited to a single sense.

Summary

[0005] Embodiments of the present invention provide a method and apparatus for generating an audio file from an image, so that an image can be expressed through audio and the user experience becomes more diverse.

[0006] To achieve the above object, embodiments of the present invention adopt the following technical solutions:

[0007] In a first aspect of the present invention, a method for generating an audio file from an image is provided, comprising:

[0008] acquiring a luminance-chrominance image, wherein the image contains three factor values for each pixel;

[0009] calculating, from any two factor values of each pixel in the image, the pitch and note duration corresponding to that pixel;

[0010] recording the pitch and duration corresponding to each pixel in the image to generate an audio file.

[0011] With reference to the first aspect of the present invention, in one possible implementation, the image is a red-green-blue (RGB) image, and its three factor values are the red channel R, the green channel G, and the blue channel B.

[0012] With reference to the first aspect of the present invention, in one possible implementation, the image is a luminance-chrominance (YUV) image, and its three factor values are the luminance Y and the chrominance components U and V.

[0013] With reference to the first aspect of the present invention, in one possible implementation, acquiring the image comprises:

[0014] acquiring a picture file as the image;

[0015] or acquiring one frame from a video file as the image.

[0016] With reference to the first aspect of the present invention and the foregoing possible implementations, in another possible implementation, calculating the pitch and duration corresponding to the pixel from any two factor values of the pixel comprises:

[0017] determining the pitch corresponding to the pixel from the pixel's first factor value;

[0018] determining the duration corresponding to the pixel from the pixel's second factor value.

[0019] With reference to the first aspect of the present invention and the foregoing possible implementations, in another possible implementation, after the pitch and duration corresponding to the pixel have been calculated from any two factor values of the pixel, the method further comprises:

[0020] determining the performance tempo of the audio file from the third factor value of each pixel in the image.

[0021] With reference to the first aspect of the present invention and the foregoing possible implementations, in another possible implementation, after the image is acquired, the method further comprises:

[0022] classifying the pixels in the image according to the value intervals of the three factor values;

[0023] wherein calculating the pitch and duration corresponding to the pixel from any two factor values of the pixel specifically comprises: for each class, calculating the pitch and duration corresponding to each pixel in that class;

[0024] and recording the pitch and duration corresponding to each pixel in the image specifically comprises: treating each class as one voice part, recording the pitch and duration corresponding to each pixel in each class, and generating the audio file.

[0025] In a second aspect of the present invention, an apparatus for generating an audio file from an image is provided, comprising:

[0026] an acquiring unit, configured to acquire an image, wherein the image contains three factor values for each pixel;

[0027] a calculating unit, configured to calculate, from any two factor values of each pixel in the image acquired by the acquiring unit, the pitch and duration corresponding to that pixel;

[0028] a generating unit, configured to record the pitch and duration calculated by the calculating unit for each pixel in the image, and to generate an audio file.

[0029] With reference to the second aspect of the present invention, in one possible implementation, the image is a red-green-blue (RGB) image, and its three factor values are the red channel R, the green channel G, and the blue channel B.

[0030] With reference to the second aspect of the present invention, in one possible implementation, the image is a luminance-chrominance (YUV) image, and its three factor values are the luminance Y and the chrominance components U and V.

[0031] With reference to the second aspect of the present invention, in one possible implementation, the acquiring unit is further configured to:

[0032] acquire a picture file as the image;

[0033] or acquire one frame from a video file as the image.

[0034] With reference to the second aspect of the present invention and the foregoing possible implementations, in another possible implementation, the calculating unit comprises:

[0035] a pitch subunit, configured to determine the pitch corresponding to a pixel from the first factor value of the pixel acquired by the acquiring unit;

[0036] a duration subunit, configured to determine the duration corresponding to the pixel from the second factor value of the pixel acquired by the acquiring unit.

[0037] With reference to the second aspect of the present invention and the foregoing possible implementations, in another possible implementation, the apparatus further comprises:

[0038] a tempo unit, configured to determine, after the calculating unit has determined the pitch and duration corresponding to each pixel from any two factor values of the pixel, the performance tempo of the audio file from the third factor value of each pixel in the image acquired by the acquiring unit.

[0039] With reference to the second aspect of the present invention and the foregoing possible implementations, in another possible implementation, the apparatus further comprises:

[0040] a classifying unit, configured to classify the pixels in the image according to the value intervals of the three factor values;

[0041] wherein the calculating unit is specifically configured to: for each class, calculate the pitch and duration corresponding to each pixel in that class;

[0042] and the generating unit is specifically configured to: treat each class as one voice part, record the pitch and duration corresponding to each pixel in each class, and generate the audio file.

[0043] The method and apparatus provided by the embodiments of the present invention acquire an image and calculate, from any two factor values of each pixel in the image, the pitch and duration corresponding to that pixel, thereby generating an audio file corresponding to the image. The content of the image is thus expressed through audio, so that the user can perceive it by hearing, which makes the user experience more diverse.

Brief Description of the Drawings

[0044] To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

[0045] FIG. 1 is a flowchart of a method for generating an audio file from an image in Embodiment 1 of the present invention;

[0046] FIG. 2 is a flowchart of a method for generating an audio file from a YUV image in Embodiment 2 of the present invention;

[0047] FIG. 3 is a flowchart of a method for generating an audio file from a YUV image in Embodiment 3 of the present invention;

[0048] FIG. 4 is a flowchart of a method for generating an audio file from an RGB image in Embodiment 4 of the present invention;

[0049] FIG. 5 is a schematic diagram of the composition of an apparatus for generating an audio file from an image in Embodiment 5 of the present invention.

Detailed Description

[0050] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

[0051] Embodiment 1

[0052] This embodiment of the present invention provides a method for generating an audio file from an image. As shown in FIG. 1, the method may include:

[0053] 101. Acquire an image, wherein the image contains three factor values for each pixel.

[0054] The image may be a red-green-blue (RGB) image, whose three factor values are the red channel R, the green channel G, and the blue channel B. Alternatively, the image may be a luminance-chrominance (YUV) image, whose three factor values are the luminance Y and the chrominance components U and V. Acquiring the image includes: acquiring a picture file as the image; or acquiring one frame from a video file as the image. The method of this embodiment can convert a single picture into an audio file, and can also convert a video composed of multiple frames into an audio file.

[0055] 102. Calculate the pitch and duration corresponding to each pixel from any two factor values of that pixel in the image.

[0056] Calculating the pitch and duration corresponding to the pixel from any two factor values of each pixel in the image includes: determining the pitch corresponding to the pixel from the pixel's first factor value; and determining the duration corresponding to the pixel from the pixel's second factor value.

[0057] Further, after the pitch and duration corresponding to each pixel have been calculated, the performance tempo of the audio file may be determined from the third factor value of each pixel in the image.

[0058] 103. Record the pitch and duration corresponding to each pixel in the image to generate an audio file.

[0059] Further, to improve the audio effect, the pixels may be classified before the pitch and duration are calculated, so that a multi-voice audio file is generated. Specifically, the pixels in the image may be classified according to the value intervals of the three factor values. In that case, calculating the pitch and duration corresponding to the pixel from any two factor values of each pixel specifically includes: for each class, calculating the pitch and duration corresponding to each pixel in that class from any two factor values of the pixel. Recording the pitch and duration corresponding to each pixel in the image specifically includes: treating each class as one voice part, recording the pitch and duration corresponding to each pixel in each class, and generating the audio file.

[0060] The method provided by this embodiment of the present invention acquires an image and calculates, from any two factor values of each pixel in the image, the pitch and duration corresponding to that pixel, thereby generating an audio file corresponding to the image. The content of the image is thus expressed through audio, so that the user can perceive it by hearing, which makes the user experience more diverse.

[0061] It should be noted that the YUV approach requires a lower sampling frequency, so converting a YUV image into an audio file is more efficient and gives a better user experience. The RGB approach currently requires a higher sampling frequency, and the generated audio gives a poorer user experience. The present invention is, however, not limited in this respect.

[0062] Embodiment 2

[0063] This embodiment of the present invention provides a method for generating an audio file from an image. As shown in FIG. 2, the method may include:

[0064] 201. Acquire a luminance-chrominance (YUV) image.

[0065] Acquiring the YUV image includes: acquiring a picture file as the YUV image; or acquiring one frame from a video file as the YUV image. The method of this embodiment can convert a single picture into an audio file, and can also convert a video composed of multiple frames into an audio file. The YUV image contains three factor values for each pixel: a first factor value, a second factor value, and a third factor value. For example, the first factor value may be the Y value representing luminance, the second factor value may be the U value, and the third factor value may be the V value, where U and V are the two chrominance components.

[0066] It is worth noting that, when a YUV image is acquired, the U and V factor values can be sampled after the factor value of the Y channel has been determined. The required sampling density is low and the sampled YUV image data is small, so computing the pitch, duration, and so on for each pixel of the YUV image also takes less computation, which improves the efficiency of converting the image into an audio file. In particular, when consecutive frames of a video file are converted into an audio file, a high conversion efficiency can be achieved, which in turn improves the user experience.

[0067] 202. Determine the pitch corresponding to the pixel from the pixel's first factor value.

[0068] Pitch is usually divided by height into three registers: the high register, the middle register, and the low register. Each register contains three octaves, and each note in turn has a sharp form, a natural form, and a flat form. In total there can therefore be 3 registers × 3 octaves × 7 notes × 3 forms = 189 pitches.
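
The count in the preceding paragraph can be checked with a short enumeration. This is only a sketch: the register labels, the octave numbering, and the note letters C through B are assumptions introduced for illustration, since the document names neither the notes nor an ordering.

```python
from itertools import product

# Hypothetical labels; the source only gives the counts 3 x 3 x 7 x 3.
REGISTERS = ("low", "middle", "high")
OCTAVES = (1, 2, 3)
NOTES = ("C", "D", "E", "F", "G", "A", "B")
FORMS = ("flat", "natural", "sharp")

# One entry per pitch, listed from the nominally lowest to the highest.
PITCHES = list(product(REGISTERS, OCTAVES, NOTES, FORMS))

print(len(PITCHES))  # 189
```

An index into `PITCHES` (0 to 188) is then a convenient stand-in for "one of the 189 pitches" when a factor value is mapped to a pitch.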

[0069] In this embodiment, a dynamic pitch mapping may be used. The largest and smallest first factor values among all pixels of the YUV image are mapped to the highest and lowest of the 189 pitches, respectively, and the first factor values of the remaining pixels are mapped proportionally, according to the distribution of first factor values in the image, onto the pitches in between, which gives the pitch corresponding to each pixel. For example, if the brightest pixel in the YUV image has a Y value of 200 and the darkest pixel has a Y value of 20, then 20 is mapped to the lowest pitch and 200 to the highest pitch, so that the Y values of all pixels in the YUV image are normalized onto the 189 pitches and the pitch corresponding to each pixel is obtained.
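
The dynamic mapping described above can be sketched as a linear rescaling of the first factor value onto pitch indices 0 to 188. The function name `map_pitch_dynamic`, the use of rounding, and the handling of an image whose pixels all share one value are assumptions; the document only requires a proportional mapping between the extremes.

```python
def map_pitch_dynamic(values, n_pitches=189):
    """Map factor values to pitch indices 0..n_pitches-1.

    The smallest value in `values` maps to index 0 (lowest pitch) and the
    largest to n_pitches-1 (highest pitch); everything else is placed
    proportionally in between.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a uniform image maps every pixel to the lowest pitch.
        return [0 for _ in values]
    return [round((v - lo) * (n_pitches - 1) / (hi - lo)) for v in values]


# Example from the text: Y values between 20 and 200.
print(map_pitch_dynamic([20, 110, 200]))  # [0, 94, 188]
```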

[0070] Optionally, a fixed mapping table may be preset instead, assigning in advance a pitch to every value in the range of the first factor value, so that the pitch corresponding to each pixel is obtained by looking up the pixel's first factor value in the table. For example, Y ranges from 0 to 255; 0 is mapped to the lowest pitch and 255 to the highest pitch, so that the values of Y are normalized onto the 189 pitches and the mapping is established.
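
A fixed table of this kind might be precomputed once and then used as a plain lookup. The linear spacing and rounding below are assumptions; the document only fixes the endpoints (0 to the lowest pitch, 255 to the highest).

```python
N_PITCHES = 189

# PITCH_TABLE[y] is the pitch index (0..188) preset for the Y value y.
PITCH_TABLE = [round(y * (N_PITCHES - 1) / 255) for y in range(256)]


def map_pitch_fixed(y):
    """Look up the preset pitch index for an 8-bit Y value."""
    return PITCH_TABLE[y]
```

Unlike the dynamic mapping, the same Y value always yields the same pitch regardless of the rest of the image, so a low-contrast image uses only a narrow band of pitches.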

[0071] 203. Determine the duration corresponding to the pixel from the pixel's second factor value.

[0072] Audio is shaped not only by pitch; another important factor is the duration of each note. In musical performance there are 11 kinds of note duration, for example 2/4 beat and 3/4 beat. The differing durations of the pixels give the audio its rhythm. Specifically, the second factor value of each pixel may be normalized onto the 11 durations. For example, a fixed duration mapping table is preset: the value range of the chrominance U is divided evenly into 11 intervals, and, in ascending order of U, the 11 intervals are mapped to the 11 durations from shortest to longest.
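
The even division into 11 intervals can be sketched as integer binning of an 8-bit U value. The document does not list the 11 durations, so the values in `DURATIONS_IN_BEATS` are purely hypothetical placeholders, as are the function names.

```python
# Hypothetical durations in beats, shortest to longest; the source only
# says that there are 11 of them.
DURATIONS_IN_BEATS = (0.125, 0.25, 0.375, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def duration_index(u):
    """Return the index (0..10) of the interval that an 8-bit U value falls in."""
    return u * len(DURATIONS_IN_BEATS) // 256


def map_duration(u):
    """Map an 8-bit U value to a note duration in beats."""
    return DURATIONS_IN_BEATS[duration_index(u)]
```

With this binning each interval covers 23 or 24 consecutive U values, and larger U values never produce shorter notes.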

[0073] 204. Determine the performance tempo of the audio file from the third factor value of each pixel in the YUV image.

[0074] Once the pitch and duration corresponding to each pixel have been determined, the audio still sounds different depending on the performance tempo. In this embodiment, the third factor value may be mapped to the performance tempo. Preferably, so that the audio remains reasonably stable and suits people's listening habits, the V value, which varies less, is used as the third factor value. Because a single tempo can be determined for the entire audio file, the third factor values of all pixels may be taken and averaged to obtain the tempo. For example, with the tempo range set between 15 and 200, either the Y value representing luminance or the V value representing chrominance may serve as the determining factor for the tempo. The mean of the V values of all pixels in the YUV image is calculated and used as the tempo. Optionally, the difference between the largest and smallest V values among all pixels may be used as the tempo instead.
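
Both tempo rules described above fit in a few lines. The text gives a tempo range of 15 to 200 but does not say what happens when the mean or the spread falls outside it; clamping to that range is an assumption made here, and the function names are hypothetical.

```python
def tempo_from_mean(v_values, lo=15, hi=200):
    """Tempo = mean of the V values, clamped to [lo, hi] (clamping assumed)."""
    mean = sum(v_values) / len(v_values)
    return max(lo, min(hi, round(mean)))


def tempo_from_spread(v_values, lo=15, hi=200):
    """Tempo = max V minus min V, clamped to [lo, hi] (clamping assumed)."""
    return max(lo, min(hi, max(v_values) - min(v_values)))
```

The mean reflects the overall colour cast of the image, while the spread reflects how much the chrominance varies, so the two rules can give quite different tempi for the same picture.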

[0075] 205. Record the pitch and duration corresponding to each pixel in the YUV image, together with the performance tempo, to generate an audio file.
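
The document does not specify the audio file format or how a pitch index becomes a sound. As one possible sketch, the recorded notes can be rendered as sine tones into a mono 16-bit WAV file using Python's standard `wave` module. The sample rate, the frequency span `F_LOW` to `F_HIGH`, the even spacing of the 189 pitches on a logarithmic frequency axis, and the names `pitch_to_hz` and `render_wav` are all assumptions made for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000            # assumption: low rate keeps the example small
F_LOW, F_HIGH = 65.0, 2000.0  # assumption: frequency span of the 189 pitches
N_PITCHES = 189


def pitch_to_hz(index):
    """Spread pitch indices 0..188 evenly on a log-frequency axis (assumed)."""
    return F_LOW * (F_HIGH / F_LOW) ** (index / (N_PITCHES - 1))


def render_wav(notes, tempo_bpm):
    """Render (pitch_index, duration_in_beats) pairs to WAV bytes."""
    seconds_per_beat = 60.0 / tempo_bpm
    frames = bytearray()
    for pitch_index, beats in notes:
        n_samples = int(SAMPLE_RATE * beats * seconds_per_beat)
        freq = pitch_to_hz(pitch_index)
        for n in range(n_samples):
            sample = 0.5 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            frames += struct.pack("<h", int(sample * 32767))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()
```

For example, `render_wav([(0, 1.0), (188, 0.5)], tempo_bpm=120)` returns the bytes of a short two-note file that can be written to disk with an ordinary binary file write.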

[0076] The method provided by this embodiment of the present invention acquires a YUV image and calculates, from any two factor values of each pixel in the YUV image, the pitch and duration corresponding to that pixel, thereby generating an audio file corresponding to the YUV image. The content of the image is thus expressed through audio, so that the user can perceive it by hearing, which makes the user experience more diverse.

[0077] Moreover, because a YUV image requires a lower sampling rate, the amount of computation needed for the pitch, duration, and so on is reduced. This improves the efficiency of converting the image into an audio file and makes the conversion closer to real time, which further improves the user experience.

[0078] Embodiment 3

[0079] This embodiment of the present invention provides a method for generating an audio file from an image. As shown in FIG. 3, the method may include:

[0080] 301. Acquire a luminance-chrominance (YUV) image.

[0081] Acquiring the YUV image includes: acquiring a picture as the YUV image; or acquiring one frame from a video as the YUV image. The method of this embodiment can convert a single picture into an audio file, and can also convert a video composed of multiple frames into an audio file.

[0082] 302. Classify all pixels in the YUV image according to the value intervals of the three factor values.

[0083] The pixels may be divided into different classes according to the value ranges of the three factor values. Specifically, all pixels in the YUV image may be divided into two classes, black and white, according to the color interval determined by the U and V values, with the black class and the white class each treated as one voice part, for example a male voice and a female voice. The pitch, duration, and performance tempo are calculated separately for the two voice parts. Finally, the audio determined for the two voice parts is combined into one audio file with a harmony effect. Alternatively, all pixels in the YUV image may be divided into three color spaces (red, green, and blue) that serve as three voice parts. These are only a few examples of how pixels may be classified; other classifications are possible in practice, and this embodiment of the present invention does not limit them.

[0084] 303. For each class, calculate the pitch and duration corresponding to each pixel in that class.

[0085] Each class is processed in turn to generate the audio corresponding to that class. Calculating the pitch and duration corresponding to the pixel from any two factor values of each pixel in the YUV image includes: determining the pitch corresponding to the pixel from the pixel's first factor value; and determining the duration corresponding to the pixel from the pixel's second factor value.

[0086] For example, in this embodiment a dynamic pitch mapping may be used: the largest and smallest first factor values among all pixels of a given class are mapped to the highest and lowest of the 189 pitches, respectively, and the first factor values of the remaining pixels are mapped proportionally, according to the distribution of first factor values in that class, onto the pitches in between, which gives the pitch corresponding to each pixel. For the specific way in which the pixels of each class are converted into audio, refer to steps 202 to 204 above; the details are not repeated here.

[0087] 304. Treat each class as one voice part, record the pitch and duration corresponding to each pixel in each class, and generate the audio file.

[0088] For each voice part obtained by classification, an audio generation flow similar to steps 202 to 204 is performed, giving the audio of the individual voice parts. Finally, the voice parts are combined into one audio file according to the order of the pixels in the YUV image, giving a multi-voice audio file.
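
The per-class processing of steps 302 to 304 can be sketched as follows. The classification rule `classify_by_v` (a threshold on the V value) is a hypothetical stand-in, since the document leaves the rule open; keeping each pixel's original index is one assumed way of retaining the pixel order when the voice parts are later combined.

```python
def classify_by_v(pixel):
    """Hypothetical rule: split (Y, U, V) pixels on the V value."""
    return "voice_a" if pixel[2] >= 128 else "voice_b"


def split_into_voices(pixels, classify):
    """Group (index, pixel) pairs by class label, preserving pixel order."""
    voices = {}
    for index, pixel in enumerate(pixels):
        voices.setdefault(classify(pixel), []).append((index, pixel))
    return voices


def voice_pitches(voice, n_pitches=189):
    """Dynamic pitch mapping within one voice part, using Y (pixel[0])."""
    ys = [pixel[0] for _, pixel in voice]
    lo, span = min(ys), max(ys) - min(ys)
    return [
        (index, 0 if span == 0 else round((y - lo) * (n_pitches - 1) / span))
        for (index, _), y in zip(voice, ys)
    ]
```

Because the minimum and maximum are taken per class, each voice part spans the full pitch range even if its pixels cover only part of the image's luminance range.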

[0089] In the method for generating an audio file from an image provided by this embodiment of the invention, a YUV image is acquired, the pitch and duration corresponding to each pixel are calculated from any two factor values of that pixel in the YUV image, and the audio file corresponding to the YUV image is generated. The content of the image can thus be expressed as audio, so that the user can perceive the image content by hearing, which diversifies the user experience.

[0090] In addition, all pixels of the image are divided into several classes and each class corresponds to one voice, giving a multi-voice audio file. The audio obtained by the conversion therefore has a harmony effect, which improves the user's auditory experience of the image.

[0091] Embodiment 4

[0092] This embodiment of the invention provides a method for generating an audio file from an image. As shown in FIG. 4, the method may include:

[0093] 401. Acquire a red-green-blue (RGB) image.

[0094] Acquiring the RGB image includes: acquiring a picture file as the RGB image; or acquiring one frame from a video file as the RGB image. With the method of this embodiment, a single picture can be converted into an audio file, and a video consisting of multiple frames can also be converted into an audio file.

[0095] 402. Classify all pixels of the RGB image according to the value intervals of the three factor values.

[0096] For example, all pixels of the image may be divided into three major classes, a red family, a green family, and a blue family, so that the audio file finally generated has a three-voice harmony effect. Alternatively, the pixels may be divided by region, with the upper half of the image as one voice, the lower half as another voice, and so on. This embodiment of the invention does not limit the rule used to divide the pixels.
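The two example division rules in [0096] can be sketched as follows. The text does not fix how a pixel is assigned to a colour family, so the dominant-channel rule used here is an assumption, as are the function names `classify_by_dominant_channel` and `classify_by_region` and the voice labels.

```python
def classify_by_dominant_channel(pixels):
    """Group pixels into three voices by their largest channel.

    pixels is a flat list of (r, g, b) tuples in image order. Each voice
    keeps (pixel_index, rgb) pairs so the original order can be restored.
    Assumption: ties go to the earlier channel (R before G before B).
    """
    names = ("red", "green", "blue")
    voices = {name: [] for name in names}
    for index, rgb in enumerate(pixels):
        dominant = max(range(3), key=lambda channel: rgb[channel])
        voices[names[dominant]].append((index, rgb))
    return voices


def classify_by_region(pixels, width):
    """Alternative rule: rows in the upper half form one voice, the rest another."""
    height = len(pixels) // width
    split = (height // 2) * width  # index of the first pixel in the lower half
    indexed = list(enumerate(pixels))
    return {"upper": indexed[:split], "lower": indexed[split:]}


# A hypothetical 2x2 image, flattened row by row.
sample = [(200, 10, 10), (10, 200, 10), (10, 10, 200), (50, 50, 50)]
by_colour = classify_by_dominant_channel(sample)
by_region = classify_by_region(sample, width=2)
```

Either rule produces a mapping from voice name to pixels, which is the input the per-voice pitch and duration calculation needs.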

[0097] 403. For each class, calculate the pitch and duration corresponding to each pixel in that class.

[0098] 404. Taking each class as one voice, record the pitch and duration corresponding to each pixel in each class, and generate the audio file.

[0099] As with the YUV image in the embodiment of FIG. 3, for an RGB image an audio generation flow similar to steps 202-204 may be carried out separately for each voice obtained by the classification, producing audio for several voices. The difference is that the three factor values here are the R factor, the G factor, and the B factor. Finally, the voices may be combined into one audio file according to the order in which the pixels are arranged in the RGB image, giving a multi-voice audio file.

[0100] In the method for generating an audio file from an image provided by this embodiment of the invention, an RGB image is acquired, the pitch and duration corresponding to each pixel are calculated from any two factor values of that pixel in the RGB image, and the audio file corresponding to the RGB image is generated. The content of the image can thus be expressed as audio, so that the user can perceive the image content by hearing, which diversifies the user experience.

[0101] In addition, all pixels of the image are divided into several classes and each class corresponds to one voice, giving a multi-voice audio file. The audio obtained by the conversion therefore has a harmony effect, which improves the user's auditory experience of the image.

[0102] Embodiment 5

[0103] This embodiment of the invention provides a device for generating an audio file from an image. As shown in FIG. 5, the device may include an acquiring unit 51, a calculation unit 52, and a generating unit 53.

[0104] The acquiring unit 51 is configured to acquire an image, where the image contains three factor values for each pixel;

[0105] The calculation unit 52 is configured to calculate, from any two factor values of a pixel in the image acquired by the acquiring unit 51, the pitch and duration corresponding to that pixel;

[0106] The generating unit 53 is configured to record the pitch and duration, calculated by the calculation unit 52, corresponding to each pixel in the image, and to generate the audio file.
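The division of work among the three units can be illustrated with a small sketch. The class name `AudioFromImageDevice`, the callable-based design, and the note representation are hypothetical; the text describes the units only functionally, and the step that writes an actual audio file is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]  # three factor values per pixel
Note = Tuple[int, float]      # (pitch, duration); representation is an assumption


@dataclass
class AudioFromImageDevice:
    acquire: Callable[[], List[Pixel]]   # role of the acquiring unit 51
    calculate: Callable[[Pixel], Note]   # role of the calculation unit 52

    def generate(self) -> List[Note]:
        """Role of the generating unit 53: record one note per pixel."""
        return [self.calculate(pixel) for pixel in self.acquire()]


# Hypothetical wiring: the first factor is used as pitch, the second as duration.
device = AudioFromImageDevice(
    acquire=lambda: [(60, 1, 0), (62, 2, 0)],
    calculate=lambda pixel: (pixel[0], float(pixel[1])),
)
notes = device.generate()
```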

[0107] Further optionally, the image may be a red-green-blue (RGB) image, whose three factor values are the red channel R, the green channel G, and the blue channel B. Optionally, the image is a luminance-chrominance (YUV) image, whose three factor values are the luminance Y and the chrominance components U and V.

[0108] Further, the acquiring unit 51 is also configured to acquire a picture file as the image, or to acquire one frame from a video file as the image.

[0109] Further, the calculation unit 52 includes a pitch subunit 521 and a duration subunit 522.

[0110] The pitch subunit 521 is configured to determine the pitch corresponding to a pixel from the first factor value of the pixel acquired by the acquiring unit 51;

[0111] The duration subunit 522 is configured to determine the duration corresponding to the pixel from the second factor value of the pixel acquired by the acquiring unit 51.
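A minimal sketch of what the duration subunit might do is given below. The table `DURATIONS`, its values in beats, the 0-255 factor range, and the equal-width bucketing are all assumptions; the text here says only that the duration is determined from the second factor value.

```python
DURATIONS = (0.25, 0.5, 1.0, 2.0)  # hypothetical note lengths, in beats


def duration_from_second_factor(value, factor_max=255):
    """Pick a note length by splitting the factor range into equal buckets."""
    bucket = value * len(DURATIONS) // (factor_max + 1)
    # Guard against values above factor_max.
    bucket = min(bucket, len(DURATIONS) - 1)
    return DURATIONS[bucket]


shortest = duration_from_second_factor(0)
longest = duration_from_second_factor(255)
```

Quantizing to a small table keeps the result close to conventional note values; a continuous mapping would be an equally valid reading of the text.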

[0112] Further, the device also includes a speed unit 54.

[0113] The speed unit 54 is configured to determine the playing speed (tempo) of the audio file from the third factor value of each pixel in the image acquired by the acquiring unit 51, after the calculation unit 52 has determined the pitch and duration corresponding to each pixel from any two factor values of that pixel in the image.
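The text does not say how the per-pixel third factor values are reduced to a single playing speed. The sketch below assumes one simple option: average the third factor over the image and map the mean linearly onto a tempo range. The function name `tempo_from_third_factor`, the 60-180 beats-per-minute range, and the 0-255 factor range are hypothetical.

```python
def tempo_from_third_factor(third_factors, min_bpm=60, max_bpm=180, factor_max=255):
    """Map the mean third-factor value of the image linearly onto a tempo in BPM."""
    mean = sum(third_factors) / len(third_factors)
    return min_bpm + (max_bpm - min_bpm) * mean / factor_max


slow = tempo_from_third_factor([0, 0, 0])
fast = tempo_from_third_factor([255, 255])
```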

[0114] Further, the device also includes a classification unit 55.

[0115] The classification unit 55 is configured to classify all pixels in the image according to the value intervals of the three factor values;

[0116] The calculation unit 52 is specifically configured to calculate, for each class, the pitch and duration corresponding to each pixel in that class;

[0117] The generating unit 53 is specifically configured to take each class as one voice, record the pitch and duration corresponding to each pixel in each class, and generate the audio file.

[0118] In the device for generating an audio file from an image provided by this embodiment of the invention, an image is acquired, the pitch and duration corresponding to each pixel are calculated from any two factor values of that pixel in the image, and the audio file corresponding to the image is generated. The content of the image can thus be expressed as audio, so that the user can perceive the image content by hearing, which diversifies the user experience.

[0119] In addition, all pixels of the image are divided into several classes and each class corresponds to one voice, giving a multi-voice audio file. The audio obtained by the conversion therefore has a harmony effect, which improves the user's auditory experience of the image.

[0120] From the description of the embodiments above, a person skilled in the art will clearly understand that the invention can be implemented by software together with the necessary general-purpose hardware, and of course also by hardware alone, although in many cases the former is the better implementation. On this understanding, the part of the technical solution of the invention that is essential, or that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, hard disk, or optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the invention.

[0121] The above are only specific embodiments of the invention, and the scope of protection of the invention is not limited to them. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims (14)

  1. A method for generating an audio file from an image, characterized by comprising: acquiring an image, wherein the image contains three factor values for each pixel; calculating, from any two factor values of a pixel in the image, the pitch and duration corresponding to the pixel; and recording the pitch and duration corresponding to each pixel in the image to generate an audio file.
  2. The method for generating an audio file from an image according to claim 1, characterized in that the image is a red-green-blue (RGB) image, and the three factor values of the image are the red channel R, the green channel G, and the blue channel B.
  3. The method for generating an audio file from an image according to claim 1, characterized in that the image is a luminance-chrominance (YUV) image, and the three factor values of the image are the luminance Y and the chrominance components U and V.
  4. The method for generating an audio file from an image according to claim 3, characterized in that acquiring the image comprises: acquiring a picture file as the image; or acquiring one frame from a video file as the image.
  5. The method for generating an audio file from an image according to claim 1, characterized in that calculating, from any two factor values of the pixel in the image, the pitch and duration corresponding to the pixel comprises: determining the pitch corresponding to the pixel from a first factor value of the pixel; and determining the duration corresponding to the pixel from a second factor value of the pixel.
  6. The method for generating an audio file from an image according to claim 5, characterized in that, after calculating, from any two factor values of the pixel in the image, the pitch and duration corresponding to the pixel, the method further comprises: determining the playing speed of the audio file from a third factor value of each pixel in the image.
  7. The method for generating an audio file from an image according to any one of claims 1-6, characterized in that, after acquiring the image, the method further comprises: classifying all pixels in the image according to the value intervals of the three factor values; wherein calculating, from any two factor values of the pixel in the image, the pitch and duration corresponding to the pixel specifically comprises: calculating, for each class, the pitch and duration corresponding to each pixel in that class; and recording the pitch and duration corresponding to each pixel in the image specifically comprises: taking each class as one voice, recording the pitch and duration corresponding to each pixel in each class, and generating the audio file.
  8. A device for generating an audio file from an image, characterized by comprising: an acquiring unit, configured to acquire an image, wherein the image contains three factor values for each pixel; a calculation unit, configured to calculate, from any two factor values of a pixel in the image acquired by the acquiring unit, the pitch and duration corresponding to the pixel; and a generating unit, configured to record the pitch and duration, calculated by the calculation unit, corresponding to each pixel in the image, and to generate an audio file.
  9. The device for generating an audio file from an image according to claim 8, characterized in that the image is a red-green-blue (RGB) image, and the three factor values of the image are the red channel R, the green channel G, and the blue channel B.
  10. The device for generating an audio file from an image according to claim 8, characterized in that the image is a luminance-chrominance (YUV) image, and the three factor values of the image are the luminance Y and the chrominance components U and V.
  11. The device for generating an audio file from an image according to claim 10, characterized in that the acquiring unit is further configured to: acquire a picture file as the image; or acquire one frame from a video file as the image.
  12. The device for generating an audio file from an image according to claim 11, characterized in that the calculation unit comprises: a pitch subunit, configured to determine the pitch corresponding to a pixel from a first factor value of the pixel acquired by the acquiring unit; and a duration subunit, configured to determine the duration corresponding to the pixel from a second factor value of the pixel acquired by the acquiring unit.
  13. The device for generating an audio file from an image according to claim 12, characterized by further comprising: a speed unit, configured to determine the playing speed of the audio file from a third factor value of each pixel in the image acquired by the acquiring unit.
  14. The device for generating an audio file from an image according to any one of claims 8-13, characterized by further comprising: a classification unit, configured to classify all pixels in the image according to the value intervals of the three factor values; wherein the calculation unit is specifically configured to calculate, for each class, the pitch and duration corresponding to each pixel in that class; and the generating unit is specifically configured to take each class as one voice, record the pitch and duration corresponding to each pixel in each class, and generate the audio file.
CN 201310013003 2013-01-14 2013-01-14 Method and device for generating audio file according to image CN103928036A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201310013003 CN103928036A (en) 2013-01-14 2013-01-14 Method and device for generating audio file according to image

Publications (1)

Publication Number Publication Date
CN103928036A (en) 2014-07-16

Family

ID=51146233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201310013003 CN103928036A (en) 2013-01-14 2013-01-14 Method and device for generating audio file according to image

Country Status (1)

Country Link
CN (1) CN103928036A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104918059A (en) * 2015-05-19 2015-09-16 京东方科技集团股份有限公司 Method and device for image transmission and terminal device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5286908A (en) * 1991-04-30 1994-02-15 Stanley Jungleib Multi-media system including bi-directional music-to-graphic display interface
CN1287320A (en) * 1999-09-03 2001-03-14 北京航空航天大学 Method of converting image information into music
CN1862656A (en) * 2005-05-13 2006-11-15 杭州波导软件有限公司 Method for converting musci score to music output and apparatus thereof
CN102289778A (en) * 2011-05-10 2011-12-21 南京大学 An image conversion method to music

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
RJ01