KR20230008687A

KR20230008687A - Method and apparatus for automatic picture labeling and recording in smartphone

Info

Publication number: KR20230008687A
Application number: KR1020220188194A
Authority: KR
Inventors: 오영식
Original assignee: 오영식
Priority date: 2021-02-25
Filing date: 2022-12-29
Publication date: 2023-01-16
Also published as: KR20220121667A

Abstract

An apparatus and method for automatically labeling and recording pictures in a smartphone are disclosed. The apparatus may include: a camera lens for taking a picture; a shooting button for instructing that a picture be taken; a display unit for displaying the picture taken; a voice input button for instructing voice input and text conversion; a voice recognition unit for converting the input voice into a character or character string; and a file storage unit for storing, under the control of a control unit, the converted character or character string as the file name of the picture file taken.

Description

Apparatus and method for automatic photo labeling and recording on a smartphone {METHOD AND APPARATUS FOR AUTOMATIC PICTURE LABELING AND RECORDING IN SMARTPHONE}

The following embodiments relate to a technology for automatically naming pictures taken on a smartphone or recording sound for them.

People take many pictures in daily life with smartphones and digital cameras. The subject of a photograph may be a person, an object, or an event. Typically, when a picture is taken with a smartphone or camera, either the shooting date and time from the built-in clock or a shooting serial number is saved as the photo file name. Some cameras and smartphones also store the shooting time, camera property information, shooting location information, and the like as metadata included in the photo.

1. Simultaneous photo/voice editing method on a mobile phone equipped with a camera, registration number 10-0341987, registration date June 12, 2002
2. Digital image capture session and metadata association, application number 10-2020-7006887, application date September 10, 2018

Patent Document 1 relates to a technique that stores a sound file together with a photo file so that the sound and the photo are output simultaneously when the photo is played back. Patent Document 2 relates to a technique that recognizes metadata such as the name of an object from an image of the object input to a smartphone camera, connects to a server through a communication network, and receives and displays information such as the price; in this process, the user's voice is optionally converted into text to additionally recognize meta information such as the name of the object.

However, the function that smartphone camera users need most is to save, in the photo file itself, a phrase that lets them tell several different photos apart, or information about the photo, in the form of the photo's name or metadata. It is therefore very important to be able to distinguish photos or find out information about them later by looking at the file name or meta information, and to search multiple photo files for one or more photos matching a desired keyword.

To do this on current cameras and smartphones, once a photo has been taken and the photo file stored on the device, the user must use a file manager or photo playback function and edit the existing file name with a text editing function. On a camera or smartphone, text editing is generally inconvenient because the device is small. Editing the photo file name with a text editing function every time a photo is taken is also very inconvenient. And if the user tries to rename files all at once after taking several pictures, it is difficult to remember and distinguish the exact content of each picture.

To relieve this inconvenience, an audio file is sometimes stored together with the photo file, as in Patent Document 1. However, this is very inconvenient because each audio file must be played back to check the background information on a photo, and the function becomes unusable if the link between the photo file and the audio file changes or the audio file is deleted. Patent Document 2, in turn, is not intended for distinguishing photo files but for recognizing, searching for, and purchasing objects contained in a photo, so its purpose and method are entirely different.

Voice recognition technology has recently advanced and its accuracy has improved. If, after taking a picture with a camera or smartphone, the user speaks the name of the picture, and the device recognizes the speech either on its own or through a remote voice recognition server, converts it to text, and saves it as the name of the photo file or as metadata of the photo, all of the above problems are solved. In addition, a voice or sound file corresponding to a photo or video can be created and integrated with the photo or video, so that the scene at the time of shooting is conveyed more vividly during playback.

An apparatus for automatically labeling and recording photos in a smartphone according to an embodiment may include: a camera lens for taking a photo; a shooting button for instructing that a photo be taken; a display unit for displaying the photo taken; a voice input button for instructing voice input and text conversion; a voice recognition unit for converting the input voice into a character or character string; and a file storage unit for storing, under the control of a control unit, the converted character or character string as the file name of the photo file taken.

The apparatus may further include a record button for recording, at the user's discretion, a sound signal related to the photo taken.

The apparatus may further include a function in which the user presses the voice input button at the same time as, before, or after taking a photo and says the name of the photo; the voice signal input through the microphone is processed by the voice recognition unit and converted into a character or character string; and the converted character or character string is automatically saved as the name or metadata of the photo file taken.

The apparatus may further include a function in which clicking the voice input button starts character string recognition of the voice signal and clicking the button again ends voice recognition, or in which clicking the voice input button starts voice recognition and voice recognition ends automatically once a predetermined time has elapsed or no further voice signal is input.

The apparatus may further include a function of generating a file name by performing character string recognition through voice recognition for a predetermined time after a photo or video is taken.

The apparatus may further include a function in which voice recognition is either performed by the smartphone's own voice recognition unit or, under the direction of the voice recognition unit, the voice or sound signal data is transmitted through the smartphone's communication network to a remote conversion server, converted into a character or character string by the conversion server, and then received back by the smartphone.

The apparatus may further include a function in which, when a plurality of features (a specific object, place, person, or event (an act or incident)) are extracted from a photo taken, the name of each feature is saved as the name of one of a plurality of photo files, or the features are saved as a plurality of metadata entries in a single photo file.

The apparatus may further include a function in which photo files taken in the above manner are automatically classified and stored in different folders according to the names or metadata assigned to them, or a serial number is added before or after the name according to the shooting time. Alternatively, if a photo with the same file name already exists, a serial number based on the shooting time or the shooting date and time may be appended to the file name to distinguish the files.

The apparatus may further include a function in which the sound signal input to the smartphone is recorded using the record button and either saved as a separate sound file or included in the photo file as metadata. When saved as a separate sound file, the sound file is stored with the same file name or metadata as the corresponding photo or video file. A photo may also be converted to a video format that includes the recorded sound signal, producing a single file.

The apparatus may further include a function in which, when the sound file is included in the photo or video file or a corresponding sound file exists, the metadata of the photo or video file records that fact or the name of the sound file.

The apparatus may further include a function in which thumbnails may be displayed in the list of photos or videos taken. Clicking the thumbnail or file name of a photo displays the photo on the screen and plays the recorded sound file at the same time. Clicking the thumbnail or file name of a video displays the first frame of the video as a still image while the sound file plays, and then plays the rest of the video.

The apparatus may further include a function in which, when a photo file whose name or metadata contains a predetermined keyword is created on a smartphone owned by one member of a predetermined group of users, the file is automatically shared with and synchronized to the smartphones owned by the members of that group.

The apparatus may further include a function in which, when a photo file whose name or metadata contains a keyword set in advance by a specific person is created on a smartphone owned by one member of a predetermined group of users, the file is automatically shared with and synchronized to the smartphone owned by that specific person.

A method for automatically labeling and recording photos in a smartphone includes: taking a photo or video with the camera of the smartphone; inputting voice into the smartphone, either by pressing the voice input button of the smartphone or during a certain period after the shooting, having the voice recognized by the smartphone or an external server and converted into a character string, and saving the character string as the file name or metadata of the photo or video; if the generated file name is inaccurate or a change is desired, inputting voice again to correct the file name; optionally recording a voice or sound signal using the record button and saving it as a separate sound file with the same file name as the photo file; and, when a thumbnail or the file name is selected from the list of photos or videos taken, playing the recorded sound file together with the photo.

According to an embodiment, voice information can be recognized as text when a photo or video is taken and stored as the name or metadata of the corresponding file.

According to an embodiment, the name or metadata of the stored file can be displayed so that the user can distinguish files or find out information about a file.

According to an embodiment, the names or metadata of the stored files can be used to search for or classify files that match a specific keyword.

According to an embodiment, a sound signal can additionally be recorded and stored when a photo or video is taken, so that it can be heard when the photo or video is played back.

FIG. 1 is a diagram illustrating the configuration of an apparatus for automatically labeling and recording photos in a smartphone according to an embodiment.
FIG. 2 shows diagrams of the smartphone screens of the apparatus for automatically labeling and recording photos in a smartphone according to an embodiment.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. However, since various changes can be made to the embodiments, the scope of rights of the patent application is not restricted or limited by these embodiments. It should be understood that all changes, equivalents, and substitutes to the embodiments are included within the scope of rights.

The terms used in the embodiments are for description only and should not be construed as limiting. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in the present application, are not to be interpreted in an idealized or excessively formal sense.

In addition, in the description with reference to the accompanying drawings, the same components are given the same reference numerals regardless of the figure in which they appear, and overlapping descriptions thereof are omitted. In describing the embodiments, a detailed description of related known technology is omitted where it is determined that it could unnecessarily obscure the gist of the embodiments.

In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the components of the embodiments. These terms serve only to distinguish one component from another, and the nature, sequence, or order of the component is not limited by the term. When a component is described as being "connected", "coupled", or "joined" to another component, that component may be directly connected or joined to the other component, but it should be understood that a further component may also be "connected", "coupled", or "joined" between them.

A component included in one embodiment and a component having a common function are described using the same name in other embodiments. Unless stated otherwise, the description given for one embodiment may be applied to other embodiments, and detailed descriptions are omitted to the extent that they overlap.

FIG. 1 is a diagram illustrating the configuration of an apparatus for automatically labeling and recording photos in a smartphone according to an embodiment. The smartphone user takes a picture of a subject with the camera built into the smartphone by pressing the shooting button, using a remote control, giving a voice or sound signal, making a gesture captured by the camera, moving the smartphone, or sending a control command from a remote location over a communication link. The picture may be a still image or a video. The user presses the voice input button at the same time as, before, or after taking the picture and says the name of the picture, and the voice signal input through the microphone is processed by the voice recognition unit and converted into a character or character string. The converted character or character string is automatically saved as the name or metadata of the photo file taken. Metadata here means information in the photo file other than the image data. The file name or metadata may be deleted or modified in a user confirmation step. Video files are handled in the same way.

If the photo file name generated by voice recognition is automatically included in the metadata as well, the character string or file name originally input by voice recognition can be checked or restored even if the file name of the photo or video is later changed.
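The restore behavior described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the `PhotoRecord` class, the `voice_label` metadata key, and the `.jpg` extension are hypothetical names chosen only for illustration, and a real implementation would write to the photo file's actual metadata container.

```python
from dataclasses import dataclass, field


@dataclass
class PhotoRecord:
    """Hypothetical in-memory stand-in for a photo file and its metadata."""
    file_name: str
    metadata: dict = field(default_factory=dict)


def apply_voice_label(record, recognized_text, extension=".jpg"):
    # Keep the recognized text in metadata as well as in the file name,
    # so it survives a later rename. "voice_label" is an assumed key.
    record.metadata["voice_label"] = recognized_text
    record.file_name = recognized_text + extension
    return record


def restore_voice_label(record, extension=".jpg"):
    # Rebuild the file name from the label kept in metadata, if any.
    label = record.metadata.get("voice_label")
    if label is not None:
        record.file_name = label + extension
    return record
```

Because the label is stored twice, renaming the file through a file manager does not lose the text the user originally spoke.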

Alternatively, a sound signal picked up from the surroundings when the voice input button is pressed may be converted into a character or character string and saved as the name or metadata of the photo file.

Alternatively, clicking the voice input button may start character string recognition of the voice signal and clicking the button again may end voice recognition; or clicking the voice input button may start voice recognition, which then ends automatically once a predetermined time has elapsed or no further voice signal is input.
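The start and stop conditions above can be sketched as a small state holder. This is an illustrative assumption rather than the disclosed implementation: the class name `RecognitionSession`, the 5-second limit, and the 1.5-second silence timeout are hypothetical values, and voice activity detection is assumed to be supplied by the caller as a boolean per audio frame.

```python
from dataclasses import dataclass


@dataclass
class RecognitionSession:
    """Tracks whether voice recognition is running (illustrative sketch)."""
    max_duration_s: float = 5.0      # assumed predetermined time limit
    silence_timeout_s: float = 1.5   # assumed "no further voice input" window
    active: bool = False
    started_at: float = 0.0
    last_voice_at: float = 0.0

    def on_button_click(self, now):
        # First click starts recognition; a second click ends it.
        if self.active:
            self.active = False
        else:
            self.active = True
            self.started_at = now
            self.last_voice_at = now
        return self.active

    def on_audio_frame(self, now, has_voice):
        # Called for each audio frame; ends the session automatically
        # on timeout or after a period without voice.
        if not self.active:
            return False
        if has_voice:
            self.last_voice_at = now
        if now - self.started_at >= self.max_duration_s:
            self.active = False
        elif now - self.last_voice_at >= self.silence_timeout_s:
            self.active = False
        return self.active
```

Passing the clock value in as `now` keeps the logic independent of any particular timer or audio API.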

Alternatively, after a photo or video is taken, the smartphone may automatically emit a specific sound and start voice recognition, which is then stopped after a predetermined time or when the user clicks the voice input button.

Instead of clicking the voice input button, voice recognition may be triggered by using a remote control; by signaling the smartphone with a specific word, voice, or sound; by a specific body gesture captured by the smartphone camera; by a specific movement pattern of the smartphone; by pressing a predetermined settings button of the smartphone; or by a control command transmitted from a remote location over a communication link. The voice signal input through the microphone is then processed by the voice recognition unit and converted into text.

While voice recognition is in progress, the color of the voice input button may change so that the user can tell that voice is being recognized.

Even if the user does not explicitly issue a voice recognition command, the smartphone may, as a default function, recognize the voice signal input to it for a certain period while a picture is being taken, or while voice above a certain level is detected, convert it into a character or character string, and save it as the name or metadata of the photo file.

After setting the file name or metadata by voice recognition following a photo or video shoot, the user may press the voice input button again to generate a new character string by voice recognition in the same manner, check the file name shown on the display unit, and correct the previous file name or metadata.

The main functions described above (taking the photo, converting speech to text, and saving the photo file) may be performed by the smartphone or camera itself, by a separate software app stored on the device, or by a combination of the two with the functions divided between them.

Alternatively, the voice or sound signal data may be transmitted through the smartphone's communication network to a remote conversion server, converted into a character or character string by the conversion server, and then received back by the smartphone.

The photo-taking function may also be performed by a smartphone camera at a remote location, while voice or sound input is performed locally and text conversion and photo file storage are performed by a local conversion server.

Recently, large numbers of photos of many kinds have been collected on cloud servers. By applying semi-supervised learning, when a photo featuring a specific object, place, person, or event (an act or incident) is input, its features can be extracted automatically by comparing it with similar photos already stored.

Accordingly, the features (a specific object, place, person, or event (an act or incident)) of a photo taken on a smartphone can be recognized by the smartphone or a remote conversion server, and the names of those features can be extracted and saved as the name or metadata of the photo file. A step for user confirmation or correction by voice recognition may also be provided here.

When a plurality of features (a specific object, place, person, or event (an act or incident)) are extracted from a photo taken, the name of each feature may be saved as the name of one of a plurality of photo files, or the plurality of features may be saved as metadata in a single photo file.

Because the voice recognition rate may drop when the user's speech is unclear or there is ambient noise, processing the feature-name extraction and voice recognition together can further increase the accuracy of the file name the user intends.
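The specification does not say how the two sources are combined. One possible approach, shown here purely as an assumption, is to re-rank the speech recognizer's candidate transcripts by how many of their words also appear among the feature names extracted from the image. The function name `pick_label`, the `(text, confidence)` input format, and the 0.5 bonus weight are all hypothetical.

```python
def pick_label(speech_hypotheses, image_feature_names, bonus=0.5):
    """Choose the transcript best supported by the image features.

    speech_hypotheses: list of (text, confidence) pairs from a recognizer.
    image_feature_names: names extracted from the photo (objects, places...).
    bonus: assumed weight added per word that matches a feature name.
    """
    features = {name.lower() for name in image_feature_names}
    best_text, best_score = None, float("-inf")
    for text, confidence in speech_hypotheses:
        words = set(text.lower().split())
        score = confidence + bonus * len(words & features)
        if score > best_score:
            best_text, best_score = text, score
    return best_text
```

For example, with the candidates `("see food plate", 0.6)` and `("seafood plate", 0.55)` and the image feature `"seafood"`, the second candidate is chosen even though its recognizer confidence is lower.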

Photo files taken in the above manner may be automatically classified and stored in different folders according to the names or metadata assigned to them, and a serial number may be added before or after the name according to the shooting time. Alternatively, if a photo with the same file name already exists, a serial number based on the shooting time or the shooting date and time may be appended to the file name to distinguish the files.
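A minimal sketch of this naming and classification step follows. It is an illustration under stated assumptions, not the disclosed implementation: the function names, the `.jpg` extension, the timestamp format, the set of replaced characters, and the keyword-to-folder mapping are all hypothetical.

```python
import re
from datetime import datetime


def label_to_file_name(label, existing_names, taken_at, extension=".jpg"):
    """Turn recognized text into a file name that does not collide."""
    # Replace characters that common file systems reject, then join words.
    base = re.sub(r'[\\/:*?"<>|]+', " ", label)
    base = re.sub(r"\s+", "_", base.strip())
    stamp = taken_at.strftime("%Y%m%d_%H%M%S")
    if not base:
        base = stamp  # fall back to the date and time when nothing was recognized
    candidate = f"{base}{extension}"
    if candidate not in existing_names:
        return candidate
    # Same name already exists: append the shooting date and time.
    candidate = f"{base}_{stamp}{extension}"
    serial = 2
    # Still taken (for example, two shots in the same second): add a serial number.
    while candidate in existing_names:
        candidate = f"{base}_{stamp}_{serial}{extension}"
        serial += 1
    return candidate


def choose_folder(label, keyword_folders, default="Unsorted"):
    """Pick a folder from the first keyword found in the label."""
    lowered = label.lower()
    for keyword, folder in keyword_folders.items():
        if keyword.lower() in lowered:
            return folder
    return default
```

Appending the shooting time before falling back to a serial number keeps names readable while still guaranteeing uniqueness within `existing_names`.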

Meanwhile, during or after the taking of a photo or video, the sound signal input to the smartphone can be recorded using the smartphone's record button and either saved as a separate sound file or included in the photo file as metadata. When saved as a separate sound file, the sound file is stored with the same file name or metadata as the corresponding photo or video file.
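As an illustrative sketch only, the pairing of a photo with a separately stored sound file of the same name can be expressed as follows. The function names and the `has_audio` and `audio_file` metadata keys are hypothetical, and the `.mp3` extension is an assumption chosen for the example.

```python
from pathlib import PurePosixPath


def sidecar_audio_name(photo_name, audio_extension=".mp3"):
    """Return a sound file name that matches the photo's name."""
    return str(PurePosixPath(photo_name).with_suffix(audio_extension))


def link_audio(metadata, photo_name):
    """Return a copy of the photo metadata that records the sound file."""
    updated = dict(metadata)
    updated["has_audio"] = True                             # assumed key
    updated["audio_file"] = sidecar_audio_name(photo_name)  # assumed key
    return updated
```

Recording the sound file name in the photo's metadata lets a gallery view show that audio exists without scanning the folder for matching names.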

Alternatively, separate link information may be created between the photo or video file and the corresponding sound file.

Alternatively, a photo may be converted to a video format that includes the recorded sound signal, producing a single file.

The voice signal that the user spoke when the file name was converted into a character string through voice recognition may also be saved as a sound file.

When the sound file is included in the photo or video file, or a corresponding sound file exists, the metadata of the photo or video file may record that fact or the name of the sound file.

When a list of the photo and video files taken in the above manner is displayed, the list may indicate whether a sound signal exists for or is included in each file.

The list may additionally show the date and time or the place of shooting, taken from the metadata of the photo or video file.

Thumbnails may also be displayed in the list. Clicking the thumbnail or file name of a photo displays the photo on the screen and plays the recorded sound file at the same time. Clicking the thumbnail or file name of a video displays the first frame of the video as a still image while the sound file plays, and then plays the rest of the video.

사전에 정해진 여러 명의 사용자 그룹 중에 한 명이 소유한 스마트폰에, 사전에 정해진 키워드가 사진 파일의 명칭 또는 메타 데이터에 포함된 사진 파일이 생성되면, 자동으로 해당 그룹 구성원들이 소유한 스마트폰에 공유되어 동기화가 될 수 있다.When a photo file containing a pre-determined keyword in the name or metadata of a photo file is created on a smartphone owned by one of several pre-determined user groups, it is automatically shared with the smartphones owned by members of the group. can be synchronized.

Alternatively, when a photo file whose name or metadata contains a keyword predetermined by a specific person is created on a smartphone owned by one member of a predetermined group of users, the file may be automatically shared with and synchronized to the smartphone owned by that specific person.
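The two keyword-based sharing rules above can be illustrated with a small matching function. This is a sketch, not the disclosed mechanism: `share_targets` and its parameters are hypothetical, matching is a case-insensitive substring test, and excluding the creator from the recipients is an assumption of the example.

```python
from typing import Dict, Iterable, List


def share_targets(file_name: str,
                  metadata: Dict[str, str],
                  owner: str,
                  group_members: Iterable[str],
                  group_keywords: Iterable[str],
                  personal_keywords: Dict[str, List[str]]) -> List[str]:
    """Return the users whose smartphones should receive the new photo file."""
    members = set(group_members)
    if owner not in members:
        return []  # both rules apply only to files created by a group member

    # Search the file name and every metadata value for keywords.
    text = " ".join([file_name, *map(str, metadata.values())]).lower()

    targets = set()
    # Rule 1: a group-wide keyword shares the file with every group member.
    if any(k.lower() in text for k in group_keywords):
        targets |= members
    # Rule 2: a keyword set by a specific person shares it with that person.
    for person, keywords in personal_keywords.items():
        if any(k.lower() in text for k in keywords):
            targets.add(person)

    targets.discard(owner)  # assumption: the creator already has the file
    return sorted(targets)
```

For example, with members `alice`, `bob`, and `carol` and the group keyword `family`, a photo named `family_picnic.jpg` created by `alice` would be routed to `bob` and `carol`.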

FIG. 2 shows the smartphone screens of an apparatus for automatic photo labeling and recording on a smartphone according to an embodiment.

FIG. 2(a) is the initial screen of a smartphone app according to an embodiment and has four buttons. Pressing the photo capture button switches to the camera app, where the photo is taken. Once the photo is taken, the screen switches automatically to (b). If the user is satisfied with the photo, the user presses the "Save" button; to shoot again, the user presses the "Retake" button, which returns to the camera app.

Pressing "Save" switches to screen (c), where the current file name is set to the date and time information by default. To change the file name, the user presses the "Voice File Name" button and says the desired file name; the speech is recognized as a character string and the current file name is changed automatically. If the recognized file name is wrong, or the user wants a different name, the user presses "Voice File Name" again and speaks. To additionally record a voice or ambient sound, the user presses the "Voice Recording" button. Recording either continues while the button is held down, or starts at the first press of the button and stops at the second press.

Screen (d) displays a list of captured photo or video files. Thumbnails are shown along with the file name, the file format, whether a corresponding sound file exists, the date and time, and the place. Clicking a thumbnail or a list entry plays the photo or video together with the recorded sound file.

Screen (e) is the settings screen. When "Auto Recognition Mode" is set to "on", the user does not need to press the "Voice File Name" button to have the photo file name recognized by voice: after a photo is taken, pressing the "Save" button on screen (b) automatically converts the voice signal input over the next 5 seconds into a character string and displays it as the "Current File Name" on screen (c). When it is "off", the app operates in the basic mode described above. When "Photo/Sound Synthesis" is set to "on", the photo and the recorded sound are combined into a single video file (for example, mpeg). When it is "off", a separate mp3 file with the same file name as the photo is created.
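How the two settings on screen (e) affect naming and output files can be sketched as follows. This is a hedged illustration only: the `Settings` type, the helper names, the timestamp pattern, the `.jpg` extension, and the rule for turning recognized speech into a file name are all assumptions; only the 5-second window, the mpeg example, and the same-named mp3 file come from the description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

AUTO_LISTEN_SECONDS = 5  # listening window after "Save" in auto recognition mode


@dataclass
class Settings:
    auto_recognition: bool = False       # "Auto Recognition Mode"
    photo_sound_synthesis: bool = False  # "Photo/Sound Synthesis"


def listen_seconds_after_save(settings: Settings) -> int:
    """Seconds of speech captured automatically once "Save" is pressed."""
    return AUTO_LISTEN_SECONDS if settings.auto_recognition else 0


def label_from_speech(text: str) -> str:
    """Turn recognized speech into a file-name-safe label (assumed rule)."""
    cleaned = "".join(c if c.isalnum() else "_" for c in text.strip())
    return "_".join(part for part in cleaned.split("_") if part)


def output_files(taken_at: datetime, recognized_text: Optional[str],
                 has_recording: bool, settings: Settings) -> List[str]:
    """Return the names of the files written for one captured photo."""
    base = label_from_speech(recognized_text) if recognized_text else ""
    if not base:
        # Default file name is the capture date and time.
        base = taken_at.strftime("%Y%m%d_%H%M%S")
    if has_recording and settings.photo_sound_synthesis:
        return [base + ".mpeg"]          # photo and sound merged into one video
    files = [base + ".jpg"]
    if has_recording:
        files.append(base + ".mp3")      # separate sound file, same base name
    return files
```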

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art can apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method of recording a sound signal in a photo or video file, the method comprising:
recording an input sound signal using a record button;
storing the sound signal as a separate sound file or including it as metadata in the photo file;
when the sound signal is stored as a separate sound file, saving the sound file with the same file name or metadata as the corresponding photo or video file; and
when displaying a list of photo and video files captured in the above manner, displaying whether a sound signal exists or is included.

The method of claim 1, further comprising:
when the sound file is included in the photo or video file or the corresponding sound file exists, including that fact or the name of the sound file in the metadata of the photo or video file.

The method of claim 1, further comprising:
generating separate link information between the photo or video file and the corresponding sound file.

The method of claim 1, further comprising:
in the case of a photo, converting the photo into a video format and including the recorded sound signal to generate a single file.

The method of claim 1, further comprising:
displaying thumbnails in the list of captured photos or videos; when the thumbnail or file name of a photo is clicked, showing the photo on the screen and playing the recorded sound file at the same time; and, in the case of a video, when the thumbnail or file name is clicked, showing the first frame of the video as a still image while playing the sound file, and then playing the rest of the video.

The method of claim 5, further comprising:
additionally displaying, in the list, the date and time or the place of capture based on the metadata of the photo or video file.

The method of claim 1, further comprising:
when a photo file whose name or metadata contains a predetermined keyword is created on a smartphone owned by one member of a predetermined group of users, automatically sharing the file with, and synchronizing it to, the smartphones owned by the members of that group.

The method of claim 1, further comprising:
when a photo file whose name or metadata contains a keyword predetermined by a specific person is created on a smartphone owned by one member of a predetermined group of users, automatically sharing the file with, and synchronizing it to, the smartphone owned by that specific person.

A device for recording a sound signal in a photo or video file, the device comprising:
a record button for recording a sound signal input to a smartphone;
a file storage unit capable of storing the sound signal as a separate sound file or including it as metadata in the photo file, of saving the sound file, when it is stored separately, with the same file name or metadata as the corresponding photo or video file, and of generating a single file by converting to a video format and including the recorded sound signal; and
a display unit for displaying whether a sound signal exists or is included when a list of photo and video files captured in the above manner is displayed.

The device of claim 9, wherein the file storage unit further has
a function of including, in the metadata of the photo or video file, the fact that a sound file exists or the name of the sound file, when the sound file is included in the photo or video file or the corresponding sound file exists.

The device of claim 9, wherein the file storage unit further has
a function of converting a photo into a video format and including the recorded sound signal to generate a single file.

The device of claim 9, wherein the display unit further has
a function of displaying thumbnails in the list of captured photos or videos; of showing a photo on the screen and playing the recorded sound file at the same time when the thumbnail or file name of the photo is clicked; and, in the case of a video, of showing the first frame of the video as a still image while playing the sound file when the thumbnail or file name is clicked, and then playing the rest of the video.