WO2015131573A1 - Method and device for producing image having sound, and computer storage medium - Google Patents

Method and device for producing image having sound, and computer storage medium

Info

Publication number
WO2015131573A1
WO2015131573A1 PCT/CN2014/092391 CN2014092391W WO2015131573A1 WO 2015131573 A1 WO2015131573 A1 WO 2015131573A1 CN 2014092391 W CN2014092391 W CN 2014092391W WO 2015131573 A1 WO2015131573 A1 WO 2015131573A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
picture
audio
tag
audio file
Prior art date
Application number
PCT/CN2014/092391
Other languages
French (fr)
Chinese (zh)
Inventor
周江
吴钊
陈瑞
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2015131573A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/203 Drawing of straight lines or curves

Definitions

  • the present invention relates to the field of information processing, and in particular, to a method and apparatus for making an audio picture and a computer storage medium.
  • embodiments of the present invention are directed to a method and apparatus for making an audio picture, which can retain picture information and audio information in a recording and collection environment with a small amount of data.
  • a first aspect of the embodiments of the present invention provides a method for making an audio picture, the method comprising:
  • the picture information, the picture associated information related to the picture information analysis, and the audio information are added to the audio file.
  • the picture association information includes at least an APIC tag and a MIME type
  • Adding the picture information and the picture association information to the audio file including:
  • the length of the information is added to the tag.
  • the method further includes:
  • the method further includes updating a label length of the label according to an information length of the label.
  • the audio information includes noise information
  • the method further includes:
  • the noise information in the audio information is deleted according to a predetermined policy.
  • the noise information includes: a camera sound formed by the electronic device collecting the picture information and an environmental noise specified by the user.
  • a second aspect of the embodiments of the present invention provides a device for making an audio picture.
  • the device includes:
  • An image acquisition unit configured to collect picture information
  • the audio collection unit is configured to collect audio information of the collection environment where the picture information is located;
  • the audio forming unit is configured to add the picture information, the picture associated information related to the picture information analysis, and the audio information to an audio file.
  • the picture association information includes at least an APIC tag and a MIME type
  • the audio forming unit is configured to add an APIC tag and a MIME type of the picture information in a tag of the audio file; and add the picture information to the tag.
  • the audio forming unit is configured to determine an information length of the picture information and the picture association information according to the picture information, the APIC tag, and a MIME type; and add the information length to the tag.
  • the audio forming unit is further configured to update the tag length of the tag according to the information length of the tag before adding the audio information to the audio file.
  • the audio information includes noise information
  • the device also includes:
  • a noise processing unit configured to delete noise information in the audio information according to a predetermined policy before adding the audio information to the audio file.
  • the third aspect of the embodiments of the present invention further provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute at least one of the methods of the first aspect of the embodiments of the present invention.
  • the collected picture information and audio information form an audio file carrying the picture information; because the picture information is carried in the audio file, many existing audio players that can play audio files carrying picture information can output the picture information and the audio information at the same time.
  • compared with using video to store the picture information and audio information, this reduces the amount of data; and when the picture information is stored in a tag of the audio file (specifically, such as an ID3 tag), the resulting audio file conforms to the information format of existing audio files.
  • an existing electronic device can output the audio file by using an ordinary audio application that can display picture information, without needing a terminal or platform with a specific algorithm installed; this avoids the difficulty of access caused by a specific algorithm and gives strong compatibility with the prior art and good versatility.
  • FIG. 1 is a schematic flow chart of a method for making an audio picture according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of forming an audio file according to an embodiment of the present invention.
  • FIG. 3 is a second schematic flowchart of a method for producing an audio picture according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for manufacturing an audio picture according to an embodiment of the present invention.
  • FIG. 5 is a second schematic structural diagram of an apparatus for making an audio picture according to an embodiment of the present invention.
  • FIG. 6 is a schematic flow chart of making an audio picture according to an example of the present invention.
  • Embodiment 1:
  • this embodiment provides a method for making an audio picture, where the method includes:
  • Step S110 collecting picture information
  • Step S120 collecting audio information of the collection environment where the picture information is located
  • Step S130 Add the picture information, the picture associated information related to the picture information analysis, and the audio information to the audio file.
  • the method described in this embodiment is applied to an electronic device carrying a camera function and an audio collection function; for example, a mobile phone or a tablet computer.
  • when collecting a picture, the electronic device synchronously collects the ambient sound of its current environment to form the audio information, so that the auditory information (audio information) is recorded in sync with the visual information (picture information) of the collection environment captured by the photo.
  • the audio information and the picture information collected by the electronic device are fused and stored in the audio file, so that the formed audio file can output both the audio information and the picture information, while carrying less information than a video.
  • the audio file includes a tag and an audio data portion for outputting audio;
  • the tag includes information associated with the audio information, such as information such as a singer name, an album name, a genre, and the like.
  • the label is preferably an ID3 tag.
  • the ID3 tag is an integral part of an audio file; in the prior art it is mainly used to store information associated with the audio, such as the singer, title, album name, year, and style; this information is not the audio content of the audio information.
  • when the electronic device outputs audio according to the audio content, this information can be output in the form of text and/or pictures.
  • the audio information is added to the audio data portion; the picture information may be added to the audio data portion or to the tag; preferably, the picture information and the picture association information are added to the tag, so that the resulting audio file conforms to the information format of existing audio files and the electronic device can parse and output the audio file formed in step S130 without a proprietary algorithm or a dedicated application.
  • the method may include: adding the picture information and the picture association information to a label of the audio file; and adding the audio information to the audio data portion of the audio file. Specifically, the picture information and the picture associated information are written in the ID3 tag; the audio information is written in the audio data portion.
  • the picture information is written in a tag of the audio file, and the picture association information of the picture information is recorded in the tag; the picture association information includes a MIME type; according to the MIME type, the electronic device can determine, when parsing and outputting the picture, the information format of the picture information, how to open it, and so on.
  • ID3 tags include ID3V1, ID3V2, and ID3V2.3, where V1 denotes version 1, V2 denotes version 2, and V2.3 denotes version 2.3.
  • the ID3V1 is located at the end of the audio file; the ID3V2 is located at the beginning of the audio file.
  • a picture of an audio file can be stored in an ID3 tag of ID3V2 or above ID3V2. Therefore, in the embodiment, the ID3 tag adopts a version of ID3V2 or ID3V2.3 or higher.
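The tag placement described above can be checked with a few byte comparisons. The following is a minimal Python sketch, assuming the standard ID3 layouts (an ID3v1 tag is the last 128 bytes of the file and starts with `TAG`; an ID3v2 tag starts the file with `ID3` followed by a major version byte and a revision byte). The function name `detect_id3` is hypothetical and does not come from the patent.

```python
def detect_id3(data: bytes) -> dict:
    """Report which ID3 tag versions appear to be present in an audio file's bytes."""
    # ID3v2: 10-byte header at the start of the file, beginning with b"ID3".
    has_v2 = len(data) >= 10 and data[:3] == b"ID3"
    # ID3v1: fixed 128-byte block at the end of the file, beginning with b"TAG".
    has_v1 = len(data) >= 128 and data[-128:-125] == b"TAG"
    # Bytes 3 and 4 of an ID3v2 header hold the major version and revision,
    # so an ID3v2.3 tag yields (3, 0).
    version = (data[3], data[4]) if has_v2 else None
    return {"v1": has_v1, "v2": has_v2, "v2_version": version}
```

A player that finds an ID3v2 tag with major version 3 can then look for a picture frame inside it; a file with only an ID3v1 tag has no place to store a picture.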
  • the picture information collected by the electronic device is first merged into the audio file. While the audio file is being played, the electronic device reads the ID3 tag of the audio file, and outputs the picture information by parsing and outputting the ID3 tag.
  • the audio file preferably uses an audio file in an mp3 or AAC format.
  • the picture information is preferably a jpeg format picture.
  • the jpeg format has the advantages of a high compression ratio and realistic results after decompression.
  • the mp3 and AAC formats likewise offer a high compression ratio with little audio distortion after decompression, which facilitates the processing and storage of the information.
  • User A takes a group photo with friends.
  • User A wants to record this cheerful scene or share the cheerful scene with other friends.
  • with only the static information of a photo or picture, the cheerful atmosphere is obviously halved; the method described in this embodiment can synchronously collect picture information and ambient sound and form an audio file that friends can open without installing a specific algorithm or application.
  • when the audio file formed by the method described in this embodiment is output, an existing audio or video application, such as the Kugou Music application, may be used.
  • the step S130 may specifically include:
  • Step S131 adding an APIC tag and a MIME type of the picture information to the tag;
  • Step S132 Add the picture information to the tag.
  • step S130 further includes step S133 and step S134:
  • Step S133 determining, according to the picture information and the APIC tag, the MIME type, the information length of the picture information and the picture association information;
  • the step S134 is: adding the information length to the label.
  • the steps S133 and S134 may be performed after the step S132, or before or in synchronization with the step S132; the order is not limited to that shown in FIG. 2.
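Steps S131 to S134 can be illustrated with a short Python sketch. It assumes the ID3v2.3 frame layout (a 4-byte frame ID, a 4-byte big-endian length, 2 flag bytes, then the body) and ISO-8859-1 text encoding. The function name `build_apic_frame` and its default arguments are hypothetical; the patent does not prescribe an implementation.

```python
import struct


def build_apic_frame(picture: bytes, mime: str = "image/jpeg",
                     picture_type: int = 0x03, description: str = "") -> bytes:
    """Return an ID3v2.3 APIC frame: a 10-byte frame header followed by the body."""
    body = (
        b"\x00"                                     # text encoding: 0 = ISO-8859-1
        + mime.encode("latin-1") + b"\x00"          # MIME type, null-terminated (S131)
        + bytes([picture_type])                     # picture type, 3 = front cover
        + description.encode("latin-1") + b"\x00"   # description, null-terminated
        + picture                                   # raw picture bytes (S132)
    )
    # S133/S134: the information length is the body length, stored in 4 bytes
    # right after the APIC marker, followed by two zeroed flag bytes.
    header = b"APIC" + struct.pack(">I", len(body)) + b"\x00\x00"
    return header + body
```

Computing the length from the assembled body, rather than tracking it separately, keeps the 4 reserved bytes consistent with the data regardless of the order in which the fields are prepared.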
  • the APIC tag indicates that the current location in the audio file is followed by the picture information and the picture association information.
  • the APIC tag is followed by a tag length; this tag length is the information length described in steps S133 and S134; M bytes are reserved in the ID3 tag to record the information length of the picture information and the picture association information; typically M is equal to 4.
  • the reserved M bytes are updated according to the information length; when the electronic device outputs the audio file, it knows from the APIC tag and the information length which parts of the ID3 tag are picture information and which are associated information of the audio file, such as the audio recording time.
  • the information length may be determined according to steps S133 and S134 and added to the tag before the picture information is added; the order is not limited to the sequence shown in FIG. 2.
  • the electronic device can know, according to the APIC tag and the length of the information, which bytes in the ID3 tag store the picture information and the picture associated information.
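The reading side can be sketched in the same way: walk the frames of the tag, use each frame's length to skip ahead, and stop at the APIC frame. This Python sketch assumes ID3v2.3 frames (10-byte frame headers with big-endian lengths) and handles only ISO-8859-1 text encoding. The function name `find_apic` is hypothetical.

```python
import struct


def find_apic(frames: bytes):
    """Return (mime_type, picture_bytes) for the first APIC frame, or None.

    `frames` is the frame area of an ID3v2.3 tag, i.e. the bytes that follow
    the 10-byte tag header.
    """
    pos = 0
    while pos + 10 <= len(frames):
        frame_id = frames[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":        # padding: no more frames
            break
        (size,) = struct.unpack(">I", frames[pos + 4:pos + 8])
        body = frames[pos + 10:pos + 10 + size]
        if frame_id == b"APIC":
            if body[0] != 0:
                raise ValueError("this sketch only handles ISO-8859-1 text")
            mime_end = body.index(b"\x00", 1)       # end of the MIME string
            mime = body[1:mime_end].decode("latin-1")
            desc_start = mime_end + 2               # skip the picture type byte
            desc_end = body.index(b"\x00", desc_start)
            return mime, body[desc_end + 1:]        # the rest is picture data
        pos += 10 + size                            # jump to the next frame
    return None
```

The length field is what lets the reader skip frames it does not care about, such as title or album frames, without parsing their contents.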
  • the picture association information in the ID3 tag may further include information such as a picture type, a text encoding identifier, a memo string, and a frame flag; for details, refer to the information format of existing ID3 tags, which is not repeated here.
  • the picture association information includes at least a MIME type; the MIME type corresponds to one or more bytes in the ID3 tag, and the corresponding data type may be a character string.
  • for example, when the character string of the bytes corresponding to the MIME type is jpeg, the electronic device, on interpreting the MIME type, knows that the file format of the picture information is jpeg and therefore which data format to use to parse and display the picture information.
  • here the MIME type indicates an image.
  • so that the electronic device, when outputting the audio file, can conveniently determine which data belongs to the ID3 tag and which belongs to the audio content, the ID3 tag also includes a tag length indicating the length of the ID3 tag information;
  • the method therefore also includes:
  • before adding the audio information to the audio file, updating the tag length of the tag according to the information length of the tag.
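In ID3v2 the overall tag length is stored in the 10-byte tag header as a 28-bit "synchsafe" integer, meaning four bytes that each carry 7 bits. The Python sketch below shows how that field could be written and read under this assumption; the helper names are hypothetical and the patent itself does not describe the encoding.

```python
def encode_synchsafe(n: int) -> bytes:
    """Encode a tag length as 4 bytes with 7 payload bits each (top bit clear)."""
    if not 0 <= n < (1 << 28):
        raise ValueError("length does not fit in 28 bits")
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])


def decode_synchsafe(b: bytes) -> int:
    """Decode the 4-byte synchsafe length from an ID3v2 tag header."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def build_id3v23_header(frames_length: int) -> bytes:
    """Build a 10-byte ID3v2.3 header for a tag whose frames total `frames_length` bytes."""
    # b"ID3", version 2.3.0, no flags, then the tag length (excluding this header).
    return b"ID3" + b"\x03\x00" + b"\x00" + encode_synchsafe(frames_length)
```

A player reads this header, decodes the length, and skips that many bytes to reach the first audio frame, which is why the field has to be updated after the picture is written.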
  • the audio information includes noise information
  • the method further includes:
  • Step S121 deleting noise information in the audio information according to a predetermined policy.
  • the collected audio information may include some noise information that the user does not want.
  • noise filtering is also performed through the step S121 to delete the noise information, so that the information desired by the user can be saved.
  • the noise information includes: a camera sound formed by the electronic device in collecting the picture information and an environmental noise specified by the user.
  • during image acquisition, the electronic device usually emits a short shutter-like prompt sound to tell the user that the image has been captured. If the photo is taken silently, the user may be unable to tell accurately whether the capture has completed; if the prompt tone is kept, it will be collected into the audio information as ambient sound. In this embodiment, the camera sound produced when collecting the picture information can be removed by step S121, for example in the following manner:
  • information in the audio information whose difference from a comparison sound meets a preset condition is deleted, which removes the camera sound.
  • the collected audio information may be compared with a sound sample in order to delete the sound; in a specific implementation, after the comparison result is formed, the comparison result may also be used to generate and output prompt information.
  • whether the camera sound or the environmental noise needs to be deleted is then determined according to the user's input based on the prompt information.
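The comparison-based deletion can be illustrated with a deliberately simple Python sketch. The patent does not say how the difference is measured or what the preset condition is; the mean absolute difference over a sliding window, the `threshold` parameter, and the function name `remove_matching_sound` are all assumptions made for illustration.

```python
def remove_matching_sound(audio, sample, threshold):
    """Return a copy of `audio` with every window that closely matches `sample` removed.

    `audio` and `sample` are sequences of numeric PCM sample values. A window is
    treated as a match when its mean absolute difference from `sample` is at most
    `threshold` (the assumed "preset condition").
    """
    n = len(sample)
    if n == 0:
        return list(audio)
    kept = []
    i = 0
    while i < len(audio):
        window = audio[i:i + n]
        if len(window) == n:
            diff = sum(abs(a - s) for a, s in zip(window, sample)) / n
            if diff <= threshold:
                i += n          # drop the whole matching window
                continue
        kept.append(audio[i])   # no match here: keep this sample and move on
        i += 1
    return kept
```

A practical implementation would more likely compare spectra or use cross-correlation, since a recorded shutter sound rarely matches the reference sample value for value.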
  • the embodiment provides a method for making an audio picture, which carries the picture information in the label of the audio information, has strong compatibility with the prior art, and has the advantages of universality; the electronic device does not need to install the specified application. It can simultaneously output the picture information and audio information collected separately.
  • Embodiment 2:
  • the embodiment provides a device for making an audio picture.
  • the device includes:
  • the image collection unit 110 is configured to collect picture information.
  • the audio collection unit 120 is configured to collect audio information of the collection environment where the picture information is located;
  • the audio forming unit 130 is configured to add the picture information, the picture associated information related to the picture information analysis, and the audio information to an audio file.
  • the device may be an electronic device such as a mobile phone including a camera and a recording function, and a tablet computer.
  • the image acquisition unit 110 may specifically include a camera component or the like that can perform image acquisition.
  • the audio collection unit 120 may include a recorder or the like.
  • the audio forming unit 130 may include a processor and a storage medium; the processor may be an electronic device such as a central processing unit (CPU), a microprocessor (MCU), a digital signal processor (DSP), or a programmable processor (PLC).
  • the storage medium is for storing the formed audio file. In a specific implementation, the storage medium may also be used to store the picture information, audio information, and the like.
  • the audio forming unit 130 is configured to write an APIC tag and a MIME type of the picture information in a tag of the audio file (such as an ID3 tag); and add the picture information to the tag.
  • the audio forming unit 130 specifically writes the picture information and the picture association information in the audio file; writing them in the tag matches the information format of existing audio files well, so no specific algorithm or application is needed to parse the audio file, which gives the advantage of versatility.
  • the audio forming unit 130 is further configured to determine an information length of the picture information and the picture association information according to the picture information, the APIC tag, and the MIME type; and add the information length to the tag.
  • the specific content of the MIME type, the APIC tag, and the tag can be referred to in Embodiment 1, and will not be repeated here.
  • the label is preferably an ID3 tag.
  • the audio forming unit 130 is further configured to update the tag length of the tag according to the information length of the tag before adding the audio information to the audio file.
  • the audio forming unit 130 distinguishes the picture information from other information in the tag according to the length of the picture information, and writes the tag length in the tag so that the electronic device can tell the tag from the audio information when outputting the audio file, which facilitates the subsequent output of the audio information.
  • the audio information includes noise information
  • the device further includes:
  • the noise processing unit 140 is configured to delete the noise information in the audio information according to a predetermined policy before adding the audio information to the audio file.
  • the specific structure of the noise processing unit 140 may include various types of processors, and the specific structure may also be a structure such as a noise processor in the prior art.
  • the noise processing unit 140 is added, and the noise in the audio file can be filtered, and only the environmental sound desired by the user can be retained, thereby improving the intelligence of the electronic device and the satisfaction of the user.
  • the noise information includes: a camera sound formed by the electronic device in collecting the picture information and an environmental noise specified by the user.
  • the device in this embodiment provides the implementing hardware for the method of Embodiment 1; when a photo is taken with sound, it forms an audio file that includes picture information, rather than, as in the prior art, a picture file including audio information that requires a dedicated application to decode and output, which improves versatility and compatibility.
  • the method includes:
  • Step 1 Use the smartphone camera function to generate a jpeg picture and simultaneously record an MP3 file; in a specific implementation, when forming the jpeg picture (i.e., the picture information above) and the MP3 file (i.e., the audio information above), prompt information may also be output to prompt the user to collect the picture information and the audio information simultaneously, so that the electronic device performs the subsequent steps after receiving the operation the user performs in response to the prompt.
  • Step 2 Build the ID3V2.3 information of the new MP3 file and locate the end of the ID3 tag.
  • Step 3 Add information such as APIC tag, text encoding identifier, MIME type, image type, and remarks to the ID3 tag.
  • Step 4 Open the jpeg picture, write its picture data to the data area of the ID3 tag, and update the tag length of ID3;
  • Step 5 Write the audio data in the MP3 file in step 1 to the MP3 file in which the picture data is written.
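Steps 2 to 5 can be put together in one short Python sketch. It assumes an ID3v2.3 tag containing a single APIC frame, ISO-8859-1 text, a front-cover picture type, and recorded MP3 data that does not already carry an ID3 tag. The function name `make_audio_picture` is hypothetical and the code is an illustration, not the applicant's implementation.

```python
import struct


def make_audio_picture(jpeg: bytes, mp3_audio: bytes) -> bytes:
    """Return the bytes of an MP3 file whose ID3v2.3 tag embeds `jpeg`."""
    # Step 3: APIC fields: text encoding, MIME type, picture type, empty remark.
    apic_body = b"\x00" + b"image/jpeg\x00" + b"\x03" + b"\x00" + jpeg  # Step 4: picture data
    apic = b"APIC" + struct.pack(">I", len(apic_body)) + b"\x00\x00" + apic_body

    # Step 4 (continued): update the tag length, stored as a 28-bit synchsafe integer.
    size = len(apic)
    if size >= 1 << 28:
        raise ValueError("picture too large for an ID3v2 tag")
    tag_length = bytes([(size >> 21) & 0x7F, (size >> 14) & 0x7F,
                        (size >> 7) & 0x7F, size & 0x7F])

    # Step 2: ID3v2.3 header (b"ID3", version 3.0, no flags) placed at the file start.
    header = b"ID3" + b"\x03\x00" + b"\x00" + tag_length

    # Step 5: the recorded audio data follows the tag.
    return header + apic + mp3_audio
```

Writing the result to a file with an `.mp3` extension should let a player that supports embedded cover art show the photo while the recording plays.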
  • the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute at least one of the methods of the embodiments of the present invention, for example at least one of the methods shown in the flowchart figures such as FIG. 1 and FIG. 2.
  • the computer storage medium may be a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like, which can store program code.
  • the computer storage medium can be selected as a non-transitory storage medium.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling, direct coupling, or communication connection of the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units; some or all of the units may be selected according to actual needs to achieve the objectives of the solution of this embodiment.
  • the functional units in the embodiments of the present invention may all be integrated into one processing module, or each unit may serve separately as one unit, or two or more units may be integrated into one unit; the integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the foregoing program may be stored in a computer readable storage medium and, when executed, performs the steps of the foregoing method embodiments; the foregoing storage medium includes any medium that can store program code, such as a mobile storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

Disclosed are a method and device for producing an image having sound. The method comprises: image information is collected; audio information of a collection environment in which the image information is located is collected; the image information, image linked information related to image information analysis, and the audio information are added in an audio file. Also disclosed is a computer storage medium.

Description

Method and device for making audio picture and computer storage medium
Technical field
The present invention relates to the field of information processing, and in particular, to a method and apparatus for making an audio picture and a computer storage medium.
Background
In daily life and work, many occasions require pictures to record visual information and audio to record auditory information, which calls for storing pictures and sound together. Existing methods usually use video to achieve this mixed storage, but video files carry a large amount of information, which makes storage and sharing inconvenient.
Therefore, providing an audio picture that records visual and auditory information at the same time while keeping the amount of data small is an urgent problem to be solved in the prior art.
Summary of the invention
In view of this, embodiments of the present invention are expected to provide a method and apparatus for making an audio picture that can retain the picture information and audio information of the collection environment with a small amount of data.
In order to achieve the above object, the technical solution of the present invention is implemented as follows:
A first aspect of the embodiments of the present invention provides a method for making an audio picture, the method comprising:
collecting picture information;
collecting audio information of the collection environment where the picture information is located;
adding the picture information, picture association information related to parsing the picture information, and the audio information to an audio file.
Based on the above solution,
the picture association information includes at least an APIC tag and a MIME type;
adding the picture information and the picture association information to the audio file includes:
adding the APIC tag and the MIME type of the picture information to a tag of the audio file;
adding the picture information to the tag of the audio file.
Based on the above solution,
adding the picture information and the picture association information to the audio file further includes:
determining, according to the picture information, the APIC tag, and the MIME type, the information length of the picture information and the picture association information;
adding the information length to the tag.
Based on the above solution,
the method further includes:
before adding the audio information to the audio file, updating the tag length of the tag according to the information length of the tag.
Based on the above solution,
the audio information includes noise information;
the method further includes:
before adding the audio information to the audio file, deleting the noise information in the audio information according to a predetermined policy.
Based on the above solution,
the noise information includes: a camera sound produced when the electronic device collects the picture information, and environmental noise specified by the user.
A second aspect of the embodiments of the present invention provides a device for making an audio picture,
the device comprising:
an image acquisition unit configured to collect picture information;
an audio collection unit configured to collect audio information of the collection environment where the picture information is located;
an audio forming unit configured to add the picture information, picture association information related to parsing the picture information, and the audio information to an audio file.
Based on the above solution,
the picture association information includes at least an APIC tag and a MIME type;
the audio forming unit is configured to add the APIC tag and the MIME type of the picture information to a tag of the audio file, and to add the picture information to the tag.
Based on the above solution,
the audio forming unit is configured to determine the information length of the picture information and the picture association information according to the picture information, the APIC tag, and the MIME type, and to add the information length to the tag.
Based on the above solution,
the audio forming unit is further configured to update the tag length of the tag according to the information length of the tag before adding the audio information to the audio file.
Based on the above solution,
the audio information includes noise information;
the device further includes:
a noise processing unit configured to delete the noise information in the audio information according to a predetermined policy before the audio information is added to the audio file.
A third aspect of the embodiments of the present invention further provides a computer storage medium, where the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute at least one of the methods of the first aspect of the embodiments of the present invention.
With the method and device for making an audio picture and the computer storage medium of the embodiments of the present invention, the collected picture information and audio information form a single audio file that carries the picture information. Because the picture information is carried in the audio file, many existing audio players that can play audio files carrying picture information can output the picture information and the audio information at the same time. Compared with using video to store the picture information and audio information, this reduces the amount of data. Moreover, when the picture information is stored in a tag of the audio file (specifically, such as an ID3 tag), the resulting audio file conforms to the information format of existing audio files. An existing electronic device can output the audio file with an ordinary audio application that can display picture information, without needing a terminal or platform on which a specific algorithm is installed. This avoids the difficulty of access caused by a specific algorithm and gives strong compatibility with the prior art and good versatility.
Brief Description of the Drawings
FIG. 1 is a first schematic flowchart of the method for making an audio picture according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of forming the audio file according to an embodiment of the present invention;
FIG. 3 is a second schematic flowchart of the method for making an audio picture according to an embodiment of the present invention;
FIG. 4 is a first schematic structural diagram of the device for making an audio picture according to an embodiment of the present invention;
FIG. 5 is a second schematic structural diagram of the device for making an audio picture according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of making an audio picture according to an example of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments described below are intended only to illustrate and explain the present invention and are not intended to limit it.
Embodiment 1:
As shown in FIG. 1, this embodiment provides a method for making an audio picture, the method including:
Step S110: collecting picture information;
Step S120: collecting audio information of the collection environment in which the picture information is located;
Step S130: adding the picture information, picture association information related to parsing the picture information, and the audio information to an audio file.
The method of this embodiment is applied to an electronic device that has both a camera function and an audio capture function, for example a mobile phone or a tablet computer.
When capturing a picture, the electronic device synchronously captures the ambient sound of its current environment to form the audio information. In this way, while the visual information (picture information) of the collection environment is recorded through the photo, the auditory information (audio information) is recorded as well.
To solve the above problem, in this embodiment the audio information and picture information collected by the electronic device are merged and stored in an audio file. The resulting audio file can output both the audio information and the picture information when played, yet carries less data than a video.
The audio file includes a tag and an audio data portion used for outputting audio. The tag includes information associated with the audio information, such as the singer name, album name, and genre. The tag is preferably an ID3 tag.
The ID3 tag is a component of an audio file. In the prior art it is mainly used to store information associated with the audio, such as the singer, title, album name, year, and genre. This information is not part of the audio content of the audio information; when the electronic device produces audio from the audio content, this information can be output synchronously in the form of text and/or pictures.
The audio information is added to the audio data portion. The picture information may be added either to the audio data portion or to the tag; preferably, the picture information and the picture association information are added to the tag. An audio file formed in this way conforms to the existing audio file format, so an electronic device does not need a proprietary algorithm or a dedicated application to parse and output the audio file formed in step S130.
Accordingly, step S130 may specifically include: adding the picture information and the picture association information to the tag of the audio file; and adding the audio information to the audio data portion of the audio file. For example, the picture information and picture association information are written into the ID3 tag, and the audio information is written into the audio data portion.
In this embodiment, the picture information is written into the tag of the audio file, and the picture association information of the picture information is also recorded in the tag. The picture association information includes a MIME type; based on the MIME type, the electronic device can determine the format of the picture information and how to open it when parsing and outputting the picture.
There are many kinds of tags, and the tag is not limited to the ID3 tag mentioned above; the ID3 tag is described in detail below. ID3 tags include ID3V1, ID3V2, and ID3V2.3, where V1 denotes version 1, V2 denotes version 2, and V2.3 denotes version 2.3. An ID3V1 tag is located at the end of the audio file; an ID3V2 tag is located at the beginning of the audio file. A picture can be stored in an ID3 tag of version ID3V2 or later. Therefore, in this embodiment the ID3 tag uses ID3V2, or ID3V2.3 or a later version.
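The tag positions described above can be checked with a few byte comparisons. The following is a minimal sketch, not part of the patent: the function name `detect_id3` is hypothetical, and it assumes the usual markers, namely that an ID3V2 tag starts with the bytes `ID3` followed by a major version byte and a revision byte, and that an ID3V1 tag occupies the last 128 bytes of the file and starts with `TAG`.

```python
def detect_id3(data: bytes) -> dict:
    """Report which ID3 tags a byte string appears to carry (hypothetical helper)."""
    # ID3V2: 10-byte header at the very beginning, starting with b"ID3".
    has_v2 = len(data) >= 10 and data[:3] == b"ID3"
    # ID3V1: fixed 128-byte block at the very end, starting with b"TAG".
    has_v1 = len(data) >= 128 and data[-128:-125] == b"TAG"
    return {
        "id3v2": has_v2,
        # (major, revision), e.g. (3, 0) for ID3V2.3
        "id3v2_version": (data[3], data[4]) if has_v2 else None,
        "id3v1": has_v1,
    }
```

Only the ID3V2 branch matters for storing a picture, which is why the embodiment requires ID3V2 or later.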
In this embodiment, the picture information collected by the electronic device is first merged into the audio file. When the audio file is played, the electronic device reads the ID3 tag of the audio file and outputs the picture information by parsing the ID3 tag.
In a specific implementation, the audio file is preferably in mp3 or AAC format, and the picture information is preferably a jpeg picture. The jpeg format offers a high compression ratio and realistic decompressed images. Likewise, the mp3 and AAC formats offer a high compression ratio and low audio distortion after decompression, which makes the information easier to process and store.
A specific application scenario of the method of this embodiment is given below:
User A takes a group photo of some friends, who cheerfully shout a slogan such as "qiezi" (literally "eggplant") at the moment the photo is taken. User A wants to record this cheerful scene or share it with other friends. With only static information such as a photo, the cheerful atmosphere is clearly diminished. With the method of this embodiment, the picture information and the ambient sound can be collected synchronously to form an audio file carrying the picture information. Friends can open this file without installing a specific algorithm or application and experience both the visual and the auditory information of the moment user A took the photo. Because the tag of the audio file formed by the method of this embodiment carries the picture information, the file can be output with an existing audio playback application or video application that can display picture information, such as the Kugou Music application.
As shown in FIG. 2, step S130 may specifically include:
Step S131: adding an APIC marker (tag) and the MIME type of the picture information to the tag;
Step S132: adding the picture information to the tag.
Based on the above solution, step S130 further includes steps S133 and S134:
Step S133: determining the information length of the picture information and the picture association information according to the picture information, the APIC marker, and the MIME type;
Step S134: adding the information length to the tag.
Steps S133 and S134 may be performed after step S132, or before or in parallel with step S132; the order is not limited to that shown in FIG. 2.
The APIC marker indicates that what follows the current position in the audio file is picture information and picture association information. In a specific implementation, the APIC marker is immediately followed by a tag length, which is the information length described in steps S133 and S134. M bytes are reserved in the ID3 tag to record the information length of the picture information and the picture association information; typically M equals 4.
After the picture information and picture association information have been added to the ID3 tag and the information length has been determined, the reserved M bytes are updated according to the information length. When outputting the audio file, the electronic device uses the APIC marker and the information length to tell which parts of the ID3 tag are picture information and which are information associated with the audio file, such as the audio recording time.
In a specific implementation, the information length may instead be determined according to steps S133 and S134 before the picture information is added, and written into the tag before the picture information; the order is not limited to the flow shown in FIG. 2.
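Steps S131 to S134 can be sketched as building a single APIC frame. This is an illustration under stated assumptions rather than the patent's own implementation: the function name `build_apic_frame` is hypothetical, and the layout follows the ID3V2.3 frame format (4-byte frame identifier, 4-byte big-endian length, 2 flag bytes, then a body consisting of text encoding, MIME type, picture type, description, and picture data). The 4-byte length field corresponds to the M = 4 reserved bytes described above.

```python
import struct


def build_apic_frame(image: bytes, mime: str = "image/jpeg",
                     picture_type: int = 0, description: str = "") -> bytes:
    """Build an ID3V2.3 APIC frame (hypothetical helper, ISO-8859-1 text only)."""
    body = (
        b"\x00"                                     # text encoding: ISO-8859-1
        + mime.encode("latin-1") + b"\x00"          # S131: MIME type, null-terminated
        + bytes([picture_type])                     # picture type (0 means "other")
        + description.encode("latin-1") + b"\x00"   # description, null-terminated
        + image                                     # S132: the picture data itself
    )
    # S133/S134: the information length is the size of everything after the
    # 10-byte frame header, stored in the 4 bytes that follow the marker.
    header = b"APIC" + struct.pack(">I", len(body)) + b"\x00\x00"
    return header + body
```

Computing the length after assembling the body matches the order of FIG. 2; writing a placeholder first and patching it later would be equally valid, as the text notes.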
From the APIC marker and the information length, the electronic device can tell which bytes of the ID3 tag currently store picture information and picture association information. In a specific implementation, if multiple pictures are stored, the picture association information in the ID3 tag may further include information such as a picture type, a text encoding identifier, a description string, and frame flags. For details, refer to the existing ID3 tag format; they are not repeated here.
The picture association information includes at least the MIME type. The MIME type occupies one or more bytes in the ID3 tag, and its data type may be a string. For example, if the string stored in the bytes corresponding to the MIME type is "jpeg", the electronic device, on reading the MIME type, knows that the file format of the picture information is jpeg and therefore which data format to use to parse and display the picture information. In a specific implementation, when the file format of the picture information is image, the MIME type is image.
Similarly, so that the electronic device can determine, when outputting the audio file, which data belongs to the ID3 tag and which belongs to the audio content, the ID3 tag also includes a tag length that indicates the length of the ID3 tag information. Therefore, in this embodiment the method further includes:
before adding the audio information to the audio file, updating the tag length of the tag according to the information length of the tag.
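As an illustration of updating the tag length, the sketch below writes an ID3V2.3 tag header. It assumes the ID3V2 convention that the overall tag size is stored as a 4-byte "synchsafe" integer (7 usable bits per byte) and excludes the 10-byte header itself; the function names are hypothetical.

```python
def to_synchsafe(n: int) -> bytes:
    """Encode n as 4 bytes with the top bit of each byte cleared."""
    if not 0 <= n < (1 << 28):
        raise ValueError("size does not fit in 28 bits")
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])


def from_synchsafe(b: bytes) -> int:
    """Decode a 4-byte synchsafe integer."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def id3v2_header(frames_size: int) -> bytes:
    """10-byte ID3V2.3 header for a tag whose frames occupy frames_size bytes."""
    # b"ID3", version 2.3.0, no flags, then the updated tag length.
    return b"ID3" + b"\x03\x00" + b"\x00" + to_synchsafe(frames_size)
```

A player that reads this header can skip `10 + from_synchsafe(header[6:10])` bytes to reach the audio content, which is the separation of tag and audio that the tag length is meant to provide.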
Based on the above solution, the audio information includes noise information;
As shown in FIG. 3, before step S130, the method further includes:
Step S121: deleting the noise information from the audio information according to a predetermined policy.
The collected audio information may include noise that the user does not want. In this embodiment, noise filtering is performed in step S121 to delete the noise information, so that only the information the user wants is saved.
Specifically, the noise information includes the shutter sound produced by the electronic device when collecting the picture information, and ambient noise specified by the user.
During image capture, an electronic device usually emits a sound such as a "click" to tell the user that capture has completed. If photos are taken silently, the user may be unable to tell whether the photo has been taken; if the prompt sound is kept, it is captured into the audio information as ambient sound. In this embodiment, the shutter sound produced when collecting the picture information can be removed in step S121, for example in the following way:
pre-storing a reference sound corresponding to the shutter sound;
comparing the collected audio information with the reference sound to obtain a comparison result;
according to the comparison result, deleting the information in the audio information whose difference from the reference sound satisfies a preset condition, thereby removing the shutter sound.
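The three comparison steps above can be sketched as a sliding-window match. This is only a toy illustration: the patent does not specify the comparison metric or the preset condition, so the mean absolute difference, the threshold parameter, and the function name `remove_reference_sound` are all assumptions. A real implementation would more likely work on spectral features and would have to decide whether to cut or attenuate the matched stretch.

```python
def remove_reference_sound(samples, reference, max_mean_abs_diff):
    """Drop every stretch of `samples` that closely matches `reference`.

    samples, reference: sequences of PCM sample values.
    max_mean_abs_diff: assumed "preset condition"; a window is treated as the
    shutter sound when its mean absolute difference from the reference is at
    or below this value.
    """
    n = len(reference)
    if n == 0:
        return list(samples)
    out = []
    i = 0
    while i < len(samples):
        window = samples[i:i + n]
        if len(window) == n:
            # Comparison result for the window starting at position i.
            diff = sum(abs(a - b) for a, b in zip(window, reference)) / n
            if diff <= max_mean_abs_diff:
                i += n  # matched the reference sound: skip the whole stretch
                continue
        out.append(samples[i])
        i += 1
    return out
```

Note that cutting samples shortens the recording; replacing the matched stretch with silence would be an alternative design choice.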
When the user takes a photo, there may be other unexpected noise interference, such as the long horn blast of a passing car, the sound of a train on the rails, or the noise of machinery on a construction site. These are examples of user-specified ambient noise.
The specified ambient noise can likewise be removed by comparing the collected audio information with sound samples. In a specific implementation, after the comparison result is obtained, prompt information may be generated and output according to the comparison result, and whether to delete the shutter sound or ambient noise is then determined from the user's input in response to the prompt information.
In actual use, a user may take a photo beside a railway and want to keep the sound of the rails. If comparison with the noise samples finds rail sound in the audio information and it is directly treated as specified ambient noise, an ambient sound that the user wanted may be deleted. In this case, the user can be informed through text, images, or audio that rail sound has been detected and asked whether to delete it as ambient noise; the user instructs the electronic device by an input such as confirm or cancel.
In summary, this embodiment provides a method for making an audio picture in which the picture information is carried in the tag of the audio file, which gives strong compatibility with the prior art and good versatility. The electronic device can output the separately collected picture information and audio information at the same time without installing a designated application.
Embodiment 2:
As shown in FIG. 4, this embodiment provides a device for making an audio picture,
the device including:
an image acquisition unit 110, configured to collect picture information;
an audio collection unit 120, configured to collect audio information of the collection environment in which the picture information is located;
an audio forming unit 130, configured to add the picture information, picture association information related to parsing the picture information, and the audio information to an audio file.
The device may be an electronic device that has both a camera and a recording function, such as a mobile phone or a tablet computer.
The image acquisition unit 110 may specifically include a camera component capable of image capture. The audio collection unit 120 may include a sound recorder or the like.
The audio forming unit 130 may include a processor and a storage medium. The processor may be an electronic component such as a central processing unit (CPU), a microprocessor (MCU), a digital signal processor (DSP), or a programmable processor (PLC). The storage medium is used to store the formed audio file. In a specific implementation, the storage medium may also be used to store the picture information, the audio information, and the like.
Based on the above solution, the audio forming unit 130 is configured to write the APIC marker and the MIME type of the picture information into the tag of the audio file (for example, an ID3 tag), and to add the picture information to the tag. This specifies where in the audio file the audio forming unit 130 writes the picture information and picture association information. Writing them into the tag matches the existing audio file format well, so no specific algorithm or application is needed to parse the audio file, which gives good versatility.
In addition, the audio forming unit 130 is further configured to determine the information length of the picture information and the picture association information according to the picture information, the APIC marker, and the MIME type, and to add the information length to the tag.
For details of the MIME type, the APIC marker, and the tag, see Embodiment 1; they are not repeated here. The tag is preferably an ID3 tag.
Based on the above solution, the audio forming unit 130 is further configured to update the tag length of the tag according to the information length of the tag before the audio information is added to the audio file.
In this embodiment, the audio forming unit 130 uses the information length written in the tag to distinguish the picture information from the other information in the tag, and writes the tag length into the tag so that the electronic device can distinguish the tag from the audio information when outputting the audio file. This facilitates the subsequent output of the audio information.
Based on the above solution, the audio information includes noise information;
As shown in FIG. 5, the device further includes:
a noise processing unit 140, configured to delete the noise information from the audio information according to a predetermined policy before the audio information is added to the audio file.
The noise processing unit 140 may include various types of processors, or may be a structure such as a noise processor known in the prior art.
Adding the noise processing unit 140 in this embodiment makes it possible to filter noise out of the audio file and keep only the ambient sound the user wants, which improves the intelligence of the electronic device and user satisfaction.
Specifically, the noise information includes the shutter sound produced by the electronic device when collecting the picture information, and ambient noise specified by the user.
The device of this embodiment provides hardware for implementing the method of Embodiment 1. When a photo is taken with sound, what is formed is an audio file that includes picture information, rather than, as in the prior art, a picture file that includes audio information and requires a dedicated application to decode and output. This improves versatility and compatibility.
A specific application example is given below in conjunction with the embodiments of the present invention. As shown in FIG. 6, the method includes:
Step 1: taking a photo with the camera function of a smartphone to generate a jpeg picture, and recording at the same time to generate an MP3 file. In a specific implementation, after the jpeg picture (the picture information above) and the MP3 file (the audio information above) have been formed, prompt information may be output to tell the user that picture information and audio information have been collected simultaneously, so that the electronic device performs the subsequent steps after receiving the user's operation in response to the prompt.
Step 2: constructing the ID3V2.3 information of a new MP3 file and locating the end of the ID3 tag.
Step 3: adding the APIC marker, text encoding identifier, MIME type, picture type, description, and other information to the ID3 tag.
Step 4: opening the jpeg picture, writing its picture data into the data area of the ID3 tag, and updating the tag length of the ID3 tag;
Step 5: writing the audio data of the MP3 file from step 1 into the MP3 file into which the picture data has been written.
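Steps 2 to 5 can be put together in a short end-to-end sketch. It is offered as an illustration under assumptions, not as the patent's implementation: the function names are hypothetical, the layout follows the ID3V2.3 format (synchsafe tag size in the header, plain 4-byte big-endian size in the frame), the text encoding is fixed to ISO-8859-1, and `mp3_audio` is assumed to contain only the encoded audio data from step 1, without a tag of its own.

```python
import struct


def synchsafe(n: int) -> bytes:
    """4-byte ID3V2 tag size with 7 usable bits per byte."""
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])


def make_audio_picture(jpeg: bytes, mp3_audio: bytes, description: str = "") -> bytes:
    """Return the bytes of a new MP3 file that carries `jpeg` in its ID3V2.3 tag."""
    # Step 3: text encoding, MIME type, picture type (0 = other), description.
    meta = b"\x00" + b"image/jpeg\x00" + b"\x00" + description.encode("latin-1") + b"\x00"
    # Step 4: picture data goes into the frame's data area; lengths are updated.
    body = meta + jpeg
    frame = b"APIC" + struct.pack(">I", len(body)) + b"\x00\x00" + body
    # Step 2: ID3V2.3 header; the tag length covers everything after the header.
    header = b"ID3\x03\x00\x00" + synchsafe(len(frame))
    # Step 5: audio data follows the tag.
    return header + frame + mp3_audio


def read_first_picture(data: bytes) -> tuple:
    """Return (mime, picture_bytes), assuming the tag's first frame is APIC."""
    if data[:3] != b"ID3" or data[10:14] != b"APIC":
        raise ValueError("no leading APIC frame")
    size = struct.unpack(">I", data[14:18])[0]
    body = data[20:20 + size]
    mime_end = body.index(b"\x00", 1)             # end of the MIME string
    desc_end = body.index(b"\x00", mime_end + 2)  # skip picture type, find end of description
    return body[1:mime_end].decode("latin-1"), body[desc_end + 1:]
```

For example, `open("photo.mp3", "wb").write(make_audio_picture(jpeg_bytes, audio_bytes))` would produce a file in which the picture sits in the tag and the audio follows it, so a player that understands ID3V2 pictures can show the photo while playing the sound.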
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute at least one of the methods of the embodiments of the present invention, for example at least one of the methods shown in FIG. 1, FIG. 2, and FIG. 3.
The computer storage medium may be any medium that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. The computer storage medium is optionally a non-transitory storage medium.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing module, each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any modification made in accordance with the principles of the present invention shall be understood as falling within the scope of protection of the present invention.

Claims (12)

  1. A method for making an audio picture, the method comprising:
    collecting picture information;
    collecting audio information of the collection environment in which the picture information is located;
    adding the picture information, picture association information related to parsing the picture information, and the audio information to an audio file.
  2. The method according to claim 1, wherein
    the picture association information includes at least an APIC marker and a MIME type;
    the adding of the picture information and the picture association information to the audio file comprises:
    adding the APIC marker and the MIME type of the picture information to a tag of the audio file;
    adding the picture information to the tag of the audio file.
  3. The method according to claim 2, wherein
    the adding of the picture information and the picture association information to the audio file further comprises:
    determining the information length of the picture information and the picture association information according to the picture information, the APIC marker, and the MIME type;
    adding the information length to the tag.
  4. The method according to claim 3, wherein
    the method further comprises:
    before adding the audio information to the audio file, updating the tag length of the tag according to the information length of the tag.
  5. The method according to any one of claims 1 to 4, wherein
    the audio information includes noise information;
    the method further comprises:
    before adding the audio information to the audio file, deleting the noise information from the audio information according to a predetermined policy.
  6. The method according to claim 5, wherein
    the noise information includes: the shutter sound produced by the electronic device when collecting the picture information, and ambient noise specified by the user.
  7. 一种制作有声图片的装置,所述装置包括:A device for making a sound picture, the device comprising:
    图像采集单元,配置为采集图片信息;An image acquisition unit configured to collect picture information;
    音频采集单元,配置为采集所述图片信息所在采集环境的音频信息;The audio collection unit is configured to collect audio information of the collection environment where the picture information is located;
    音频形成单元,配置为在音频文件中添加所述图片信息、与所述图片信息解析相关的图片关联信息及所述音频信息。The audio forming unit is configured to add the picture information, the picture associated information related to the picture information analysis, and the audio information to an audio file.
  8. 根据权利要求7所述的装置,其中,The apparatus according to claim 7, wherein
    所述图片关联信息至少包括APIC标记以及MIME类型;The picture association information includes at least an APIC tag and a MIME type;
    所述音频形成单元,配置为在所述音频文件的标签中添加APIC标记以及所述图片信息的MIME类型;在所述标签中添加所述图片信息。The audio forming unit is configured to add an APIC tag and a MIME type of the picture information in a tag of the audio file; and add the picture information to the tag.
  9. The device according to claim 8, wherein
    the audio forming unit is configured to determine an information length of the picture information and the picture association information according to the picture information, the APIC tag, and the MIME type; and to add the information length to the tag.
  10. The device according to claim 8, wherein
    the audio forming unit is further configured to, before adding the audio information to the audio file, update the tag length of the tag according to the information length of the tag.
  11. The device according to any one of claims 7 to 10, wherein
    the audio information includes noise information;
    the device further comprises:
    a noise processing unit configured to delete the noise information from the audio information according to a predetermined policy before the audio information is added to the audio file.
  12. A computer storage medium storing computer-executable instructions, the computer-executable instructions being used to perform at least one of the methods according to claims 1 to 6.
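
The tag handling recited in claims 8 to 10 can be illustrated with a minimal sketch. It assumes the tag is an ID3v2.3 tag placed in front of MP3 audio data; the claims name only the APIC tag and the MIME type, so the exact byte layout, the function names (`syncsafe`, `build_apic_frame`, `build_audio_picture`) and the default `image/jpeg` MIME type are assumptions made for illustration, not the applicant's implementation.

```python
import struct


def syncsafe(n: int) -> bytes:
    """Encode n as a 28-bit ID3v2 'syncsafe' integer (7 bits per byte)."""
    if not 0 <= n < (1 << 28):
        raise ValueError("size does not fit in 28 bits")
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])


def build_apic_frame(picture: bytes, mime: str = "image/jpeg") -> bytes:
    """Build an APIC frame: frame ID, information length, MIME type, picture data."""
    body = (
        b"\x00"                              # text encoding: ISO-8859-1
        + mime.encode("latin-1") + b"\x00"   # MIME type, null-terminated
        + b"\x00"                            # picture type 0 ("Other")
        + b"\x00"                            # empty description, null-terminated
        + picture                            # the picture information itself
    )
    # Information length (claim 9): in ID3v2.3 a plain 32-bit big-endian size.
    header = b"APIC" + struct.pack(">I", len(body)) + b"\x00\x00"
    return header + body


def build_audio_picture(picture: bytes, audio: bytes, mime: str = "image/jpeg") -> bytes:
    """Return tag (with embedded picture) followed by the audio information."""
    frame = build_apic_frame(picture, mime)
    # Tag length (claim 10): set before the audio data is appended; it counts
    # everything in the tag after the 10-byte tag header.
    tag_header = b"ID3" + b"\x03\x00" + b"\x00" + syncsafe(len(frame))
    return tag_header + frame + audio
```

Under these assumptions, calling `build_audio_picture(jpeg_bytes, mp3_bytes)` yields a byte string that can be written to a file; a player that ignores the tag plays the sound, while a reader that understands APIC frames can recover the picture from the MIME type and information length.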
PCT/CN2014/092391 2014-09-24 2014-11-27 Method and device for producing image having sound, and computer storage medium WO2015131573A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410494907.8A CN105513103A (en) 2014-09-24 2014-09-24 Method and apparatus for producing audio images
CN201410494907.8 2014-09-24

Publications (1)

Publication Number Publication Date
WO2015131573A1 (en) 2015-09-11

Family

ID=54054454

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/092391 WO2015131573A1 (en) 2014-09-24 2014-11-27 Method and device for producing image having sound, and computer storage medium

Country Status (2)

Country Link
CN (1) CN105513103A (en)
WO (1) WO2015131573A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106375681A (en) * 2016-09-29 2017-02-01 维沃移动通信有限公司 Static-dynamic image production method, and mobile terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101174448A (en) * 2007-12-10 2008-05-07 北京炬力北方微电子有限公司 Talking picture playing method and device, method for generating index file of talking picture
US20100194761A1 (en) * 2009-02-02 2010-08-05 Phillip Rhee Converting children's drawings into animated movies
CN102129812A (en) * 2010-01-12 2011-07-20 微软公司 Viewing media in the context of street-level images
CN102609968A (en) * 2012-03-05 2012-07-25 信源通科技(深圳)有限公司 Method and system for realizing audio picture

Also Published As

Publication number Publication date
CN105513103A (en) 2016-04-20

Similar Documents

Publication Publication Date Title
TWI321779B (en) Information processor, contents recording method, program, and storage medium
US10255929B2 (en) Media presentation playback annotation
CN107613357B (en) Sound and picture synchronous optimization method and device and readable storage medium
US9779775B2 (en) Automatic generation of compilation videos from an original video based on metadata associated with the original video
US9304657B2 (en) Audio tagging
EP2940940B1 (en) Methods for sending and receiving video short message, apparatus and handheld electronic device thereof
JP2009163496A (en) Content reproduction system
US20150243325A1 (en) Automatic generation of compilation videos
WO2016119370A1 (en) Method and device for implementing sound recording, and mobile terminal
CN109547841B (en) Short video data processing method and device and electronic equipment
CN111046226B (en) Tuning method and device for music
WO2016202176A1 (en) Method, device and apparatus for synthesizing media file
US20090177700A1 (en) Establishing usage policies for recorded events in digital life recording
WO2013178140A1 (en) Driving record processing method, apparatus, and device
CN111527746A (en) Electronic device for linking music to photographing and control method thereof
CN106095881A (en) Method, system and the mobile terminal of a kind of display photos corresponding information
JPWO2010073695A1 (en) Edit information presenting apparatus, edit information presenting method, program, and recording medium
DE102018119101A1 (en) IMPLEMENTING ACTION ON ACTIVE MEDIA CONTENT
Koenig et al. Forensic authenticity analyses of the header data in re-encoded WMA files from small Olympus audio recorders
WO2015131573A1 (en) Method and device for producing image having sound, and computer storage medium
US20120284267A1 (en) Item Randomization with Item Relational Dependencies
JP2007026329A (en) Content transmitting and receiving system
JP5060636B1 (en) Electronic device, image processing method and program
WO2011011180A2 (en) Media processing comparison system and techniques
WO2015131700A1 (en) File storage method and device

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 14884896

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 14884896

Country of ref document: EP

Kind code of ref document: A1