KR20220121667A

KR20220121667A - Method and apparatus for automatic picture labeling and recording in smartphone

Info

Publication number: KR20220121667A
Application number: KR1020210101086A
Authority: KR
Inventors: 오영식
Original assignee: 오영식
Priority date: 2021-02-25
Filing date: 2021-08-01
Publication date: 2022-09-01
Also published as: KR20230008687A

Abstract

Disclosed are a method and apparatus for automatically labeling a photo and recording sound on a smartphone. The apparatus may include: a camera lens for taking a photo; a shutter button for commanding capture; a display unit for displaying the captured photo; a voice input button for commanding voice input and text conversion; a voice recognition unit for converting the input voice into a character or character string; and a file storage unit for storing, under the control of a control unit, the converted character or character string as the file name of the captured photo file.

Description

Apparatus and method for automatic photo labeling and recording on a smartphone

The following embodiments relate to technology for automatically naming a photo taken with a smartphone, or for recording sound to accompany it.

People take many photos in daily life with smartphones and digital cameras. The subject of a photo may be a person, an object, or an event. Typically, when a photo is taken with a smartphone or camera, the file is named either with the capture date and time from the built-in clock or with a capture serial number. Some cameras and smartphones also store the capture time, camera properties, and capture location as metadata embedded in the photo.

1. Method for simultaneously editing photo and voice on a mobile phone equipped with a camera, Registration No. 10-0341987, registered June 12, 2002
2. Digital image capture session and metadata association, Application No. 10-2020-7006887, filed September 10, 2018

Patent Document 1 relates to a technology that stores a voice file together with a photo file so that the voice and the photo are output at the same time when the photo is played back. Patent Document 2 relates to a technology that recognizes metadata, such as the name of an object, from an image of the object input to a smartphone camera, connects to a server over a communication network, and receives and displays information such as price; in this process, the user's voice may optionally be converted into text to additionally recognize meta information such as the name of the object.

However, the function smartphone camera users need most is to store, in the photo file itself, a phrase that lets them tell one photo from another, or information about the photo, in the form of the file name or metadata. It is then possible to distinguish photos or learn about them later by looking at the file name or meta information, or to search many photo files for one or more photos that match a desired keyword.

To do this with current cameras and smartphones, once a photo has been taken and the file stored on the device, the user must open a file manager or the photo viewer and edit the existing file name with a text editing function. On a camera or smartphone, text editing is generally inconvenient because the device is small. Editing the file name by hand every time a photo is taken is also very inconvenient. And if the user takes several photos and tries to rename them all later, it is hard to remember and distinguish exactly what each photo was about.

To address this inconvenience, a voice file may be stored together with the photo file, as in Patent Document 1. However, the voice file must be played back each time to check the background information about a photo, which is very inconvenient, and the function becomes unusable if the link between the photo file and the voice file changes or the voice file is deleted. Patent Document 2 is not intended for distinguishing photo files; its purpose is to recognize, search for, and purchase objects contained in a photo, so its use and method are entirely different.

Voice recognition technology has recently advanced and its accuracy has improved. If, after taking a photo with a camera or smartphone, the user speaks a name for the photo, and the speech is recognized either on the device itself or by a remote voice recognition server, converted into text, and stored as the file name or metadata of the photo, the problems above can be resolved. In addition, a voice or sound file corresponding to the photo or video can be created and combined with it, so that the scene at the time of capture is conveyed more vividly when the photo or video is played back.

An apparatus for automatic photo labeling and recording on a smartphone according to an embodiment may include: a camera lens for taking a photo; a shutter button for commanding capture; a display unit for displaying the captured photo; a voice input button for commanding voice input and text conversion; a voice recognition unit for converting the input voice into a character or character string; and a file storage unit for storing, under the control of a control unit, the converted character or character string as the file name of the captured photo file.

The apparatus may further include a record button for recording, at the user's discretion, a sound signal related to the captured photo.

The apparatus may further include a function in which the user presses the voice input button at the same time as, before, or after taking the photo and speaks a name for the photo; the voice signal input through the microphone is processed by the voice recognition unit and converted into a character or character string; and the converted character or character string is automatically stored as the name or metadata of the captured photo file.

The apparatus may further include a function in which clicking the voice input button starts character string recognition of the voice signal and clicking the same button again ends it, or in which clicking the voice input button starts voice recognition and recognition ends automatically when a predetermined time has elapsed or when no further voice signal is input.

The apparatus may further include a function of generating a file name by performing character string recognition through voice recognition for a predetermined time after a photo or video is taken.

The apparatus may further include a function in which voice recognition is either performed by the smartphone's own voice recognition unit, or, under the direction of the voice recognition unit, the voice or sound signal data is transmitted over the smartphone's communication network to a remote conversion server, converted into a character or character string there, and received back by the smartphone.

The apparatus may further include a function in which, when a plurality of features (a specific object, place, person, or event such as an action or incident) are extracted from a captured photo, the name of each feature is stored as the name of one of a plurality of photo files, or the features are stored as multiple metadata entries in a single photo file.

According to the names or metadata assigned to photo files captured in this way, the photo files may be automatically classified and stored in different folders, and a serial number may be added before or after the name according to the capture time. The apparatus may further include a function in which, when a photo with the same file name already exists, a serial number based on capture time, or the capture date and time, is appended to the file name to distinguish the files.

The apparatus may further include a function in which the record button is used to record a sound signal input to the smartphone, and the recording is either stored as a separate sound file or included in the photo file as metadata. When stored as a separate sound file, the sound file is saved with the same file name or metadata as the corresponding photo or video file. In the case of a photo, the photo may instead be converted to a video format that includes the recorded sound signal, producing a single file.

The apparatus may further include a function in which, when the sound file is included in the photo or video file or a corresponding sound file exists, the metadata of the photo or video file records that fact or the name of the sound file.

The apparatus may further include a function in which thumbnails may be displayed in the list of captured photos or videos; clicking the thumbnail or file name of a photo displays the photo on the screen and plays the recorded sound file at the same time; and clicking the thumbnail or file name of a video displays the first frame of the video as a still image while the sound file plays, and then plays the rest of the video.

The apparatus may further include a function in which, when a photo file whose name or metadata contains a predetermined keyword is created on a smartphone owned by one member of a predetermined group of users, the file is automatically shared with and synchronized to the smartphones owned by the other members of the group.

The apparatus may further include a function in which, when a photo file whose name or metadata contains a keyword set in advance by a specific person is created on a smartphone owned by one member of a predetermined group of users, the file is automatically shared with and synchronized to the smartphone owned by that specific person.

A method for automatic photo labeling and recording on a smartphone includes: taking a photo or video with the smartphone's camera; inputting voice to the smartphone, either by pressing the voice input button or during a fixed period after capture, having the smartphone or an external server recognize the voice and convert it into a character string, and storing the string as the file name or metadata of the photo or video; correcting the file name by inputting voice again if the generated file name is inaccurate or a change is desired; optionally recording a voice or sound signal using the record button and storing it as a separate sound file with the same file name as the photo file; and, when a thumbnail or file name is selected from the list of captured photos or videos, playing back the recorded sound file together with the photo.

According to an embodiment, voice information captured when a photo or video is taken can be recognized as text and stored as the file name or metadata of that file.

According to an embodiment, the stored file name or metadata can be displayed so that the user can distinguish files or understand what a file contains.

According to an embodiment, the stored file name or metadata can be used to search for or classify files that match a specific keyword.

According to an embodiment, a sound signal can additionally be recorded and stored when a photo or video is taken, so that it can be heard when the photo or video is played back.

FIG. 1 is a diagram illustrating the configuration of an apparatus for automatic photo labeling and recording on a smartphone according to an embodiment.
FIG. 2 shows the smartphone screens of an apparatus for automatic photo labeling and recording on a smartphone according to an embodiment.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. However, various changes may be made to the embodiments, and the scope of the patent application is not restricted or limited by these embodiments. All modifications, equivalents, and substitutes for the embodiments should be understood to fall within the scope of rights.

The terms used in the embodiments are used for description only and should not be construed as limiting. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to exclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

In the description with reference to the accompanying drawings, identical components are given identical reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted. Where a detailed description of related known technology could unnecessarily obscure the gist of an embodiment, that detailed description is omitted.

In describing the components of the embodiments, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another, and do not limit the nature, sequence, or order of the components. When a component is described as being "connected", "coupled", or "joined" to another component, it may be directly connected or joined to that other component, but it should be understood that a further component may also be "connected", "coupled", or "joined" between them.

A component included in one embodiment and a component having the same function are described using the same name in other embodiments. Unless stated otherwise, a description given for one embodiment also applies to other embodiments, and detailed descriptions are omitted where they would overlap.

FIG. 1 is a diagram illustrating the configuration of an apparatus for automatic photo labeling and recording on a smartphone according to an embodiment. The smartphone user takes a photo of a subject with the smartphone's built-in camera by pressing the shutter button, using a remote control, giving a voice or sound signal, making a gesture in front of the camera, moving the smartphone, or sending a control command from a remote location over a communication link. "Photo" here covers both still images and video. The user presses the voice input button at the same time as, before, or after taking the photo and speaks a name for the photo; the voice signal input through the microphone is processed by the voice recognition unit and converted into a character or character string. The converted character or character string is automatically stored as the name or metadata of the captured photo file. Metadata refers to information in the photo file other than the image data. At this point, the file name or metadata may be deleted or modified through a user confirmation step. Video files are handled in the same way.

If the photo file name generated by voice recognition is also automatically included in the metadata, the character string or file name originally entered by voice can be checked or restored even if the photo or video file is later renamed.
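The idea of keeping the voice-recognized name in the metadata can be illustrated with a minimal sketch. It assumes an in-memory record rather than a real image container; the `PhotoRecord` class, the `voice_label` metadata key, and the `.jpg` extension are hypothetical choices made for illustration and are not specified in the source.

```python
from dataclasses import dataclass, field


@dataclass
class PhotoRecord:
    """In-memory stand-in for a photo file: a name plus a metadata dict."""
    file_name: str
    metadata: dict = field(default_factory=dict)


def apply_voice_label(record: PhotoRecord, recognized_text: str,
                      extension: str = ".jpg") -> PhotoRecord:
    """Use the recognized text as the file name and keep a copy in metadata."""
    record.file_name = recognized_text + extension
    record.metadata["voice_label"] = recognized_text  # hypothetical key
    return record


def restore_voice_label(record: PhotoRecord,
                        extension: str = ".jpg") -> PhotoRecord:
    """Rebuild the file name from the stored label, if one was recorded."""
    label = record.metadata.get("voice_label")
    if label is not None:
        record.file_name = label + extension
    return record


photo = PhotoRecord("20210801_101500.jpg")
apply_voice_label(photo, "birthday cake")
photo.file_name = "IMG_0001.jpg"       # renamed later by some other tool
restore_voice_label(photo)
print(photo.file_name)
```

Because the label lives in the metadata rather than only in the file name, a later rename does not lose it.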

Alternatively, when the voice input button is pressed, a sound signal picked up from the surroundings may be converted into a character or character string and stored as the name or metadata of the photo file.

Alternatively, clicking the voice input button may start character string recognition of the voice signal and clicking the same button again may end it; or clicking the voice input button may start voice recognition, with recognition ending automatically when a predetermined time has elapsed or when no further voice signal is input.
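The three ways of ending a recognition session described above (a second button click, a time limit, or no further voice input) can be sketched as a single check. This is only an illustration: the function name and the default limits of 5.0 and 1.5 seconds are assumptions, since the source says only "a predetermined time".

```python
def should_stop_recognition(button_clicked_again: bool,
                            elapsed_s: float,
                            silence_s: float,
                            max_duration_s: float = 5.0,
                            max_silence_s: float = 1.5) -> bool:
    """Return True when any of the three end conditions holds.

    button_clicked_again: the user clicked the voice input button a second time.
    elapsed_s: seconds since recognition started.
    silence_s: seconds since the last voice input was detected.
    The default limits are illustrative assumptions.
    """
    if button_clicked_again:
        return True
    if elapsed_s >= max_duration_s:
        return True
    if silence_s >= max_silence_s:
        return True
    return False


print(should_stop_recognition(False, elapsed_s=2.0, silence_s=0.3))
print(should_stop_recognition(False, elapsed_s=2.0, silence_s=2.0))
```

A capture loop would call this check on each audio frame and close the session the first time it returns True.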

Alternatively, after a photo or video is taken, the smartphone may automatically emit a specific sound signal and start voice recognition, and recognition may be stopped after a predetermined time or when the user clicks the voice input button.

Instead of clicking the voice input button, the user may trigger recognition by using a remote control; by signaling the smartphone with a specific word, voice, or sound; by making a specific body gesture in front of the smartphone camera; by moving the smartphone in a specific pattern; by pressing a predetermined settings button on the smartphone; or by sending a control command from a remote location over a communication link. In each case the voice signal input through the microphone is processed by the voice recognition unit and converted into text.

While voice recognition is in progress, the color of the voice input button may change so that the user can tell that speech is being recognized.

Even if the user does not explicitly issue a voice recognition command, the smartphone may, as a default behavior, recognize the voice signal input to it for a fixed period while a photo is being taken, or while voice above a certain level is detected, convert it into a character or character string, and store it as the name or metadata of the photo file.

After a file name or metadata has been set by voice recognition following capture of a photo or video, the user may press the voice input button again to generate a new character string by voice recognition in the same way, check the file name shown on the display unit, and replace the previous file name or metadata.

The main functions described above, namely taking the photo, converting speech to text, and storing the photo file, may be performed by the smartphone or camera itself, by a separate software app stored on the device, or split between the device and the app.

Alternatively, the voice or sound signal data may be transmitted over the smartphone's communication network to a remote conversion server, converted into a character or character string there, and then received back by the smartphone.

In addition, the photo may be taken by a remote smartphone camera while the voice or sound signal is input locally, and the text conversion and storage of the photo file may be performed by a local conversion server.

Many photos of many kinds are now being collected on cloud servers. By applying semi-supervised learning to them, when a photo featuring a specific object, place, person, or event (an action or incident) is input, its features can be extracted automatically by comparing it with similar photos already stored.

Accordingly, the features of a photo taken on the smartphone (a specific object, place, person, or event such as an action or incident) can be recognized on the smartphone or on a remote conversion server, and the names of those features can be extracted and stored as the name or metadata of the photo file. This step may also include user confirmation or correction by voice recognition.

When a plurality of features (a specific object, place, person, or event such as an action or incident) are extracted from a captured photo, the name of each feature may be stored as the name of one of a plurality of photo files, or the features may be stored as metadata in a single photo file.
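The two storage options for multiple extracted features can be sketched as follows. This is an illustrative assumption about data layout, not the patent's implementation: files are represented as plain dictionaries, and the `features` metadata key and `.jpg` extension are hypothetical.

```python
def label_with_features(base_name: str, features: list[str],
                        one_file_per_feature: bool) -> list[dict]:
    """Return file descriptions for a photo with several extracted features.

    one_file_per_feature=True  -> one file per feature, named after it.
    one_file_per_feature=False -> a single file carrying all features
                                  as metadata.
    """
    if one_file_per_feature:
        return [
            {"file_name": f"{feature}.jpg", "metadata": {"features": [feature]}}
            for feature in features
        ]
    return [{"file_name": base_name, "metadata": {"features": list(features)}}]


features = ["Namsan Tower", "picnic"]
print(label_with_features("20210801_101500.jpg", features, True))
print(label_with_features("20210801_101500.jpg", features, False))
```

The first option makes every feature visible in a plain file listing at the cost of duplicate image data; the second keeps one file but requires a metadata-aware search.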

Because the voice recognition rate can drop when the user's speech is unclear or there is ambient noise, combining the feature name extraction described above with voice recognition can further improve how accurately the file name matches what the user intended.
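The source does not say how the two signals are combined. One simple possibility, shown here purely as an assumed sketch, is to give a fixed bonus to any speech recognition candidate that contains a feature name extracted from the image. The function name, the score scale, and the bonus of 0.2 are all hypothetical.

```python
def pick_file_name(speech_candidates: list[tuple[str, float]],
                   image_features: list[str],
                   bonus: float = 0.2) -> str:
    """Choose the speech candidate with the best combined score.

    speech_candidates: (text, confidence) pairs from a speech recognizer.
    image_features: feature names extracted from the photo.
    A candidate that contains any feature name gets `bonus` added
    (an assumed rescoring rule). Raises ValueError if there are no candidates.
    """
    def combined_score(candidate: tuple[str, float]) -> float:
        text, confidence = candidate
        agrees = any(f.lower() in text.lower() for f in image_features)
        return confidence + (bonus if agrees else 0.0)

    return max(speech_candidates, key=combined_score)[0]


candidates = [("see food", 0.55), ("seafood", 0.50)]
print(pick_file_name(candidates, image_features=["seafood"]))
print(pick_file_name(candidates, image_features=[]))
```

With the image feature present, the lower-confidence transcript that agrees with the picture wins; without it, the recognizer's own ranking is kept.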

According to the names or metadata assigned to photo files captured in this way, the photo files may be automatically classified and stored in different folders, and a serial number may be added before or after the name according to the capture time. Alternatively, when a photo with the same file name already exists, a serial number based on capture time, or the capture date and time, may be appended to the file name to distinguish the files.
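Appending a serial number when a name is already taken can be sketched as below. The `_2`, `_3` suffix format and the `.jpg` extension are assumptions for illustration; the source only says that a serial number or the capture date and time is appended.

```python
def unique_file_name(label: str, existing_names: set[str],
                     extension: str = ".jpg") -> str:
    """Return a file name for `label` that does not collide with existing ones.

    The first photo keeps the plain label; later photos with the same label
    get "_2", "_3", ... appended (an assumed suffix format).
    """
    candidate = f"{label}{extension}"
    serial = 1
    while candidate in existing_names:
        serial += 1
        candidate = f"{label}_{serial}{extension}"
    return candidate


names: set[str] = set()
for _ in range(3):
    name = unique_file_name("receipt", names)
    names.add(name)
    print(name)
```

Since photos are named in the order they are taken, the serial numbers follow capture time, which matches the ordering described above.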

Meanwhile, during or after the capture of a photo or video, the smartphone's record button may be used to record a sound signal input to the smartphone, and the recording may be stored as a separate sound file or included in the photo file as metadata. When it is stored as a separate sound file, the sound file is saved with the same file name or metadata as the corresponding photo or video file.
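Saving the recording under the same name as the photo amounts to swapping the file extension. A minimal sketch using the standard library's `pathlib` follows; the function name and the default `.mp3` extension are assumptions chosen for illustration.

```python
from pathlib import PurePosixPath


def sound_path_for(photo_path: str, sound_extension: str = ".mp3") -> str:
    """Return the path of the sound file paired with a photo or video.

    The sound file shares the directory and base name of the media file
    and differs only in its extension (assumed here to be ".mp3").
    """
    return str(PurePosixPath(photo_path).with_suffix(sound_extension))


print(sound_path_for("DCIM/birthday cake.jpg"))
print(sound_path_for("DCIM/school play.mp4"))
```

A viewer can then find the recording for any photo without a separate index, although the pairing breaks if only one of the two files is renamed.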

Alternatively, separate link information may be created between the photo or video file and the corresponding sound file.

Alternatively, a captured photo may be converted to a video format that includes the recorded sound signal, producing a single file.

The voice signal the user spoke when the file name was generated by voice recognition may also be stored as a sound file.

In addition, when the sound file is included in the photo or video file, or a corresponding sound file exists, the metadata of the photo or video file may record that fact or the name of the sound file.

When a list of photo and video files captured in this way is displayed, the list may indicate whether a sound signal exists for, or is included in, each file.

The list may additionally show the capture date and time or the capture location, taken from the metadata of the photo or video file.

Thumbnails may also be displayed in the list. Clicking the thumbnail or file name of a photo displays the photo on the screen and plays the recorded sound file at the same time. Clicking the thumbnail or file name of a video displays the first frame of the video as a still image while the sound file plays, and then plays the rest of the video.

When a photo file whose name or metadata contains a predetermined keyword is created on a smartphone owned by one member of a predefined group of several users, the file may be automatically shared with, and synchronized to, the smartphones owned by the members of that group.

Alternatively, when a photo file whose name or metadata contains a keyword predetermined by a specific person is created on a smartphone owned by one member of a predefined group of several users, the file may be automatically shared with, and synchronized to, the smartphone owned by that specific person.
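The two sharing rules above can be sketched as a function that decides which group members should receive a newly created photo file. Everything named here is hypothetical (`share_targets`, its parameters, and the example keywords), and case-insensitive substring matching is an assumption; the embodiment only says that the keyword is "included" in the name or metadata.

```python
def share_targets(file_name: str,
                  metadata_values: list,
                  owner: str,
                  group_members: set,
                  group_keywords: list,
                  personal_keywords: dict) -> set:
    """Return the group members whose smartphones should receive the file."""
    # Search the file name and all metadata values as one lowercase text.
    text = " ".join([file_name, *metadata_values]).lower()
    targets = set()
    # Rule 1: a group-wide keyword shares the file with every member.
    if any(keyword.lower() in text for keyword in group_keywords):
        targets.update(group_members)
    # Rule 2: a keyword set by a specific person shares it with that person.
    for person, keywords in personal_keywords.items():
        if any(keyword.lower() in text for keyword in keywords):
            targets.add(person)
    targets.discard(owner)  # the file already exists on the owner's phone
    return targets
```

For instance, with group keyword `"family"` and a personal keyword `"receipt"` registered by one member, a file named `family_picnic.jpg` would go to every other member, while `receipt_0801.jpg` would go only to the member who registered `"receipt"`.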

FIG. 2 shows the screens of the smartphone in the apparatus for automatic photo labeling and recording in a smartphone according to an embodiment.

(a) of FIG. 2 is the initial screen of the smartphone app according to an embodiment and has four buttons. Pressing the photo-taking button opens the camera app so that a picture can be taken. Once a picture is taken, the display automatically switches to screen (b). If the user is satisfied with the picture, the user presses the “Save” button; to shoot again, the user presses the “Retake” button, which returns to the camera app.

Pressing “Save” switches to screen (c), where the current file name is set to the date and time information by default. To change the file name, the user presses the “Voice File Name” button and says the desired file name; the voice is recognized as a character string and the current file name is changed automatically. If the recognized file name is wrong, or the user wants a different name, the user presses “Voice File Name” again and speaks. To additionally record a voice or surrounding sounds, the user presses the “Voice Recording” button. Recording either continues while the button is held down, or starts on the first press of the button and stops on the second press.

(d) is a screen that displays a list of the captured photo and video files. It shows thumbnails together with the file name, the file format, whether a corresponding sound file exists, and the date, time, and place. Clicking a thumbnail or a list entry plays the photo or video together with the recorded sound file.

(e) is the settings screen. When “Auto Recognition Mode” is set to “on”, the user does not need to press the “Voice File Name” button to have the photo file name recognized by voice: after a picture is taken and the “Save” button on screen (b) is pressed, the voice signal input during the next 5 seconds is automatically converted into a character string and displayed as the “current file name” on screen (c). When it is “off”, the app operates in the basic mode described above. In addition, when “Photo/Sound Composite” is set to “on”, the recorded sound is combined with the photo into a single video file (for example, MPEG). When it is “off”, a separate mp3 file with the same file name as the photo is created.
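The file naming on screen (c) can be sketched as follows: a default name derived from the shooting date and time, replaced by the recognized character string when one is available. The function names, the `%Y%m%d_%H%M%S` pattern, and the set of characters stripped from the recognized text are assumptions for illustration; the embodiment only states that the default name is date and time information.

```python
import re
from datetime import datetime
from typing import Optional


def default_file_name(taken_at: datetime, ext: str = ".jpg") -> str:
    """Default name built from the shooting date and time (assumed pattern)."""
    return taken_at.strftime("%Y%m%d_%H%M%S") + ext


def file_name_from_speech(text: str, ext: str = ".jpg") -> Optional[str]:
    """Turn a recognized character string into a file name, or None if empty."""
    # Drop characters that common file systems reject in file names.
    cleaned = re.sub(r'[\\/:*?"<>|]', "", text).strip()
    cleaned = re.sub(r"\s+", "_", cleaned)
    return cleaned + ext if cleaned else None


def choose_file_name(taken_at: datetime,
                     recognized_text: Optional[str],
                     ext: str = ".jpg") -> str:
    """Prefer the spoken name; fall back to the date-and-time default."""
    if recognized_text:
        name = file_name_from_speech(recognized_text, ext)
        if name:
            return name
    return default_file_name(taken_at, ext)
```

Calling `choose_file_name` again with a newly recognized string corresponds to pressing “Voice File Name” a second time to correct the name.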

As described above, although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, apparatus, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims below.

Claims

An apparatus for automatic photo labeling and recording in a smartphone, comprising:
a camera lens for taking pictures;
a shooting button for instructing that a picture be taken;
a display unit for displaying the photographed picture;
a voice input button for instructing voice input and text conversion;
a voice recognition unit for converting the input voice into text or a character string; and
a file storage unit for storing, under the control of a control unit, the converted text or character string as the file name of the photographed photo file.

The apparatus according to claim 1, further comprising a record button for recording a sound signal related to a picture taken by the user.

The apparatus according to claim 1, further comprising a function whereby, when the user presses the voice input button at the same time as, before, or after taking the picture and says a name for the picture, the voice signal input to the microphone is processed by the voice recognition unit and converted into text or a character string, and the converted text or character string is automatically saved as the file name or metadata of the photographed picture.

The apparatus according to claim 1, further comprising a function whereby text recognition of the voice signal starts when the voice input button is clicked and ends when the button is clicked again, or whereby voice recognition starts when the voice input button is clicked and ends automatically when a predetermined time has passed or when no further voice signal is input.

The apparatus according to claim 1, further comprising a function of creating the file name by performing voice recognition into a character string for a predetermined period of time after a photo or video is taken.

The apparatus according to claim 1, further comprising a function whereby voice recognition is performed by the voice recognition unit of the smartphone itself, or whereby, under the direction of the voice recognition unit, the voice or sound signal data is transmitted over the communication network of the smartphone to a remote conversion server, converted into text or a character string by the conversion server, and then received back by the smartphone.

The apparatus according to claim 1, further comprising a function whereby, when a plurality of features (a specific object, place, person, or event (an act or occurrence)) are extracted from a photographed photo, the name of each feature is stored as the name of one of a plurality of photo files, or the names are stored as a plurality of metadata items in a single photo file.

The apparatus according to claim 1, further comprising a function whereby photo files taken in the above manner are automatically classified and stored in different folders according to the names or metadata given to them, with a serial number attached before or after the name according to the time taken, or whereby, if a photo with the same file name already exists, the files are distinguished by appending a serial number according to the shooting time, or the shooting date and time information, to the end of the file name.

The apparatus according to claims 1 and 2, further comprising a function whereby a sound signal input to the smartphone is recorded using the record button and is saved as a separate sound file or included as metadata in the photo file, whereby, when it is saved as a separate sound file, the sound file is saved with the same file name or metadata as the corresponding photo or video file, and whereby, in the case of a photo, the photo is converted into a video format that includes the recorded sound signal so that a single file is created.

The apparatus according to claims 1 and 2, further comprising a function of including, when the sound file is included in the photo or video file or a corresponding sound file exists, that fact or the name of the sound file in the metadata of the photo or video file.

The apparatus according to claims 1 and 2, further comprising a function whereby thumbnails are displayed in the list of captured photos or videos, whereby, when the thumbnail or file name of a photo is clicked, the photo is displayed on the screen and the recorded sound file is played at the same time, and whereby, when the thumbnail or file name of a video is clicked, the first frame of the video is displayed as a still image while the sound file is played, and then the rest of the video is played.

The apparatus according to claim 1, further comprising a function whereby, when a photo file whose name or metadata contains a predetermined keyword is created on a smartphone owned by one member of a predefined group of several users, the file is automatically shared with, and synchronized to, the smartphones owned by the members of that group.

The apparatus according to claim 1, further comprising a function whereby, when a photo file whose name or metadata contains a keyword predetermined by a specific person is created on a smartphone owned by one member of a predefined group of several users, the file is automatically shared with, and synchronized to, the smartphone owned by that specific person.

A method for automatic photo labeling and recording in a smartphone, comprising:
taking a photo or video with the camera of the smartphone;
when the voice input button of the smartphone is pressed, or for a predetermined period of time after the shooting, receiving a voice input at the smartphone, having the smartphone or an external server recognize the voice and convert it into a character string, and saving the character string as the file name or metadata of the photo or video;
correcting the file name by inputting a voice again if the file name generated from the character string is incorrect or a change is desired;
optionally, recording a voice or sound signal using the record button and storing it as a separate sound file with the same file name as the photo file; and
playing the recorded sound file together with the photo when a thumbnail or file name is selected from the list of captured photos or videos.