KR20240040807A

KR20240040807A - Low-power machine learning using regions of interest captured in real time

Info

Publication number: KR20240040807A
Application number: KR1020247007105A
Authority: KR
Inventors: 샤미크 강글리; 에밀리 바바라 쿠퍼; 게리 리 주니어 본드란
Original assignee: 구글 엘엘씨
Priority date: 2021-08-12
Filing date: 2022-08-12
Publication date: 2024-03-28
Also published as: WO2023019247A1; WO2023019249A1

Abstract

Systems and methods for generating image content are described. The systems and methods may include, in response to receiving a request to cause a sensor of a computing device to identify image content associated with optical data captured by the sensor, detecting a first sensor data stream having a first image resolution and detecting a second sensor data stream having a second image resolution. The systems and methods may also include identifying, by processing circuitry of the computing device, at least one region of interest in the first sensor data stream; determining cropping coordinates that define a first plurality of pixels in the at least one region of interest in the first sensor data stream; and generating a cropped image representing the at least one region of interest.

Description

Low-power machine learning using regions of interest captured in real time

This application is a continuation of, and claims priority to, U.S. Application No. 17/819,170, and is a continuation of, and claims priority to, U.S. Application No. 17/819,154, filed on August 11, 2022, both of which claim the benefit of U.S. Provisional Application No. 63/260,206, filed on August 12, 2021, and the benefit of U.S. Provisional Application No. 63/260,207. The disclosures of all of these applications are incorporated herein by reference in their entirety.

This application also claims priority to U.S. Provisional Application No. 63/260,206, filed on August 12, 2021, and U.S. Provisional Application No. 63/260,207, filed on August 12, 2021, the disclosures of which are incorporated herein by reference in their entirety.

This description generally relates to methods, devices, and algorithms used to process image content.

Computer vision techniques enable computers to analyze and extract information from images. Such techniques can be costly in terms of processing and power consumption. As the demand to balance performance and power consumption in mobile computing devices continues to grow, device manufacturers are tasked with configuring devices to balance device performance against image degradation in order to avoid exceeding the limits of mobile computing devices.

A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that, in operation, causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by a data processing apparatus, cause the apparatus to perform the actions.

In a first general aspect, an image processing method is described. The method may include, in response to receiving a request to cause a sensor of a wearable or non-wearable computing device to identify image content associated with optical data captured by the sensor, detecting a first sensor data stream having a first image resolution, the first sensor data stream being based on the optical data, and detecting a second sensor data stream having a second image resolution, the second sensor data stream being based on the optical data. The method may further include identifying, by processing circuitry of the wearable or non-wearable computing device, at least one region of interest in the first sensor data stream; determining, by the processing circuitry, cropping coordinates that define a first plurality of pixels in the at least one region of interest in the first sensor data stream; and generating, by the processing circuitry, a cropped image representing the at least one region of interest. The generating may include identifying a second plurality of pixels in the second sensor data stream using the cropping coordinates that define the first plurality of pixels in the at least one region of interest in the first sensor data stream, and cropping the second sensor data stream to the second plurality of pixels.
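The mapping from cropping coordinates in the first (lower-resolution) stream to pixels in the second (higher-resolution) stream can be illustrated with a minimal sketch. It assumes that both frames cover the same field of view, so that coordinates scale linearly with resolution; the names `Box`, `scale_box`, and `crop` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned cropping coordinates; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def scale_box(box, low_size, high_size):
    """Map a box from low-resolution to high-resolution pixel coordinates.

    Assumption: both frames show the same field of view, so the mapping
    is a plain per-axis scale factor.
    """
    low_w, low_h = low_size
    high_w, high_h = high_size
    # Floor the top-left corner and ceil the bottom-right corner so the
    # scaled box never cuts into the region of interest.
    left = box.left * high_w // low_w
    top = box.top * high_h // low_h
    right = -(-box.right * high_w // low_w)
    bottom = -(-box.bottom * high_h // low_h)
    return Box(left, top, min(right, high_w), min(bottom, high_h))


def crop(frame, box):
    """Crop a frame stored as a list of pixel rows."""
    return [row[box.left:box.right] for row in frame[box.top:box.bottom]]


# Usage: an ROI found in a 4x3 frame is applied to an 8x6 frame.
low_roi = Box(left=1, top=1, right=3, bottom=2)
high_roi = scale_box(low_roi, low_size=(4, 3), high_size=(8, 6))
high_frame = [[y * 8 + x for x in range(8)] for y in range(6)]
cropped = crop(high_frame, high_roi)
```

Only the cropped pixels would then be passed on for further analysis, which is the source of the savings the method describes.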

Implementations may include any of the following features, alone or in combination.

The method may also include performing, by the processing circuitry, optical character resolution on the cropped image to generate a machine-readable version of the cropped image; performing, by the processing circuitry, a search query using the machine-readable version of the cropped image to generate a plurality of search results; and displaying the search results on a display of the computing device.

In some implementations, the second sensor data stream is stored in memory on the wearable or non-wearable computing device, and the method may further include, in response to identifying the at least one region of interest in the first sensor data stream, retrieving a corresponding at least one region of interest in the second sensor data stored in the memory, and restricting access to the first sensor data stream while continuing to detect and access the second sensor data stream.

In some implementations, the method may further include, in response to generating the cropped image representing the at least one region of interest, transmitting the generated cropped image to a mobile device in communication with the wearable or non-wearable computing device; receiving, from the mobile device, information about the at least one region of interest; and causing display of the information on a display of the wearable or non-wearable computing device.

In some implementations, the computing device is a battery-powered computing device, the first image resolution is a low image resolution, and the second image resolution is a high image resolution. In some implementations, identifying the at least one region of interest further includes identifying, by a machine learning algorithm executing on the wearable or non-wearable computing device and using the first sensor data as input, text or at least one object represented in the first sensor data.

In some implementations, the processing circuitry includes at least a first image processor configured to perform image signal processing on the first sensor data stream and a second image processor configured to perform image signal processing on the second sensor data stream, and the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.

In some implementations, generating the cropped image representing the at least one region of interest is performed in response to detecting that the at least one region of interest meets a threshold condition, the threshold condition including detecting that the second plurality of pixels in the second sensor data stream have low blur.
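The specification does not state how blur is measured. As one possible illustration only, the sketch below uses the variance of a 4-neighbor Laplacian, a common sharpness heuristic that is swapped in here as an assumption; the function names and the threshold value of 100.0 are hypothetical.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbor Laplacian over a grayscale image.

    `gray` is a list of rows of intensity values. Higher variance
    suggests more edges, i.e. less blur. This metric is an assumption;
    the specification does not name one.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def has_low_blur(gray, threshold=100.0):
    """Hypothetical threshold condition: crop only if the pixels are sharp."""
    return laplacian_variance(gray) >= threshold


# Usage: a flat patch fails the condition, a checkerboard passes it.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
```

A check of this kind would run on the candidate high-resolution pixels before the cropped image is generated, so that blurred frames are skipped rather than analyzed.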

In a second general aspect, a wearable computing device is described. The wearable computing device includes at least one processing device, at least one image sensor configured to capture optical data, and memory storing instructions that, when executed, cause the wearable computing device to perform operations. The operations include, in response to receiving a request to cause the at least one image sensor to identify image content associated with the optical data, detecting a first sensor data stream having a first image resolution, the first sensor data stream being based on the optical data, and detecting a second sensor data stream having a second image resolution, the second sensor data stream being based on the optical data.

The operations include identifying at least one region of interest in the first sensor data stream; determining cropping coordinates that define a first plurality of pixels in the at least one region of interest in the first sensor data stream; and generating a cropped image representing the at least one region of interest, where the generating includes identifying a second plurality of pixels in the second sensor data stream using the cropping coordinates that define the first plurality of pixels in the at least one region of interest in the first sensor data stream, and cropping the second sensor data stream to the second plurality of pixels.

Implementations may include any of the following features, alone or in combination. In some implementations, the at least one image sensor is a dual stream image sensor configured to operate in a low image resolution mode until triggered to switch to operating in a high image resolution mode. In some implementations, the operations further include performing optical character resolution on the cropped image to generate a machine-readable version of the cropped image; performing a search query using the machine-readable version of the cropped image to generate a plurality of search results; and causing display of the search results on a display of the wearable computing device. In some implementations, the wearable computing device may cause an audio output of the performed optical character resolution from a speaker of the wearable computing device.
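The mode-switching behavior of the dual stream image sensor can be sketched as a small state machine. The trigger source is not specified at this point in the description; the sketch assumes, for illustration only, that detecting a region of interest is the trigger and that the sensor can later be returned to the low-resolution mode. All names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    LOW_RES = "low"
    HIGH_RES = "high"


class DualStreamSensorSketch:
    """Stays in low-resolution mode until a trigger arrives."""

    def __init__(self):
        self.mode = Mode.LOW_RES

    def trigger(self):
        # Assumption: the trigger is an ROI detection on the low-res stream.
        self.mode = Mode.HIGH_RES

    def reset(self):
        # Assumption: the sensor can fall back to the cheaper mode.
        self.mode = Mode.LOW_RES


# Usage: the sensor idles in low resolution, then switches once triggered.
sensor = DualStreamSensorSketch()
initial_mode = sensor.mode
sensor.trigger()
triggered_mode = sensor.mode
sensor.reset()
```

Keeping the default state in the low-resolution mode reflects the stated goal of spending power on high-resolution capture only when it is needed.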

In some implementations, the second sensor data stream is stored in memory on the wearable computing device, and the operations further include, in response to identifying the at least one region of interest in the first sensor data stream, retrieving a corresponding at least one region of interest in the second sensor data stored in the memory, and restricting access to the first sensor data stream while continuing to detect and access the second sensor data stream.

In some implementations, the operations further include, in response to generating the cropped image representing the at least one region of interest, transmitting the generated cropped image to a mobile device in communication with the wearable computing device; receiving, from the mobile device, information about the at least one region of interest; and causing display of the information on a display of the wearable computing device. In some implementations, the operations include causing output (e.g., audio, visual, tactile, etc.) of the information at the wearable computing device.

In some implementations, the first image resolution is a low image resolution and the second image resolution is a high image resolution, and identifying the at least one region of interest further includes identifying, by a machine learning algorithm executing on the wearable computing device and using the first sensor data as input, text or at least one object represented in the first sensor data.

In some implementations, the at least one processing device includes a first image processor configured to perform image signal processing on the first sensor data stream and a second image processor configured to perform image signal processing on the second sensor data stream, and the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.

Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium. The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description, the drawings, and the claims.

FIG. 1 is an example of a wearable computing device for generating and processing image content using limited computational and/or power resources, according to implementations described throughout this disclosure.
FIG. 2 illustrates a system for performing image processing on a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 3A and 3B illustrate an example of a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 4A to 4C illustrate example flow diagrams for performing image processing tasks on a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 5A and 5B illustrate example flow diagrams for performing image processing tasks on a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 6A to 6C illustrate example flow diagrams for performing image processing tasks on a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 7A to 7C illustrate example flow diagrams for performing image processing tasks on a wearable computing device, according to implementations described throughout this disclosure.
FIGS. 8A and 8B illustrate example flow diagrams for emulating a dual resolution image sensor with a single image sensor to perform image processing tasks, according to implementations described throughout this disclosure.
FIG. 9 is a flow chart diagramming one example of a process for performing image processing tasks on a wearable computing device, according to implementations described throughout this disclosure.
FIG. 10 illustrates an example of a computer device and a mobile computer device that may be used with the techniques described herein.
Like reference symbols in the various drawings indicate like elements.

This disclosure describes systems and methods for performing image processing that may enable wearable computing devices to perform computer vision tasks efficiently. For example, the systems and methods described herein may utilize processors, sensors, neural networks, and/or image analysis algorithms to recognize text, symbols, and/or objects in images and/or to extract information from the images. In some implementations, the systems and methods described herein can perform such tasks while operating in a reduced computational mode and/or a reduced power mode. For example, the systems and methods described herein can minimize the amount of image data used when performing computer vision and image analysis tasks, thereby reducing the use of particular hardware and/or device resources (e.g., memory, processors, network bandwidth, etc.).

Configuring wearable computing devices for reduced resource use can provide the advantage of allowing such devices to carry out relatively low-power computational tasks onboard, without sending the tasks to other devices (e.g., servers, mobile devices, other computers, etc.). For example, instead of analyzing an entire image, the systems and methods described herein may identify regions of interest (ROIs) in image data, and one or more of those regions may be analyzed onboard the wearable computing device. Reducing the amount of information to analyze while still maintaining accurate results allows complex image processing tasks to be performed on the wearable computing device without the resource burden of analyzing the entire image.

In some implementations, the systems and methods described herein may be configured to execute on a device to ensure that the device can perform efficient image processing tasks while reducing computational load and/or power. For example, the systems and methods described herein may enable a wearable device to perform object detection tasks, optical character recognition (OCR) tasks, and/or other image processing tasks while utilizing particular techniques to reduce power, memory, and/or processing consumption.

In some implementations, the systems and methods described herein may ensure that complex image processing computations can be executed on a wearable device without assistance from other resources and/or devices. For example, conventional systems may request assistance from other communicably coupled mobile devices, servers, and/or offboard systems to carry out computationally heavy image processing tasks. The systems and methods described herein provide the advantage of, for example, generating portions of images (e.g., ROIs, objects, cropped image content, etc.) that can be operated on by a device with less computational capability than a server while still providing full and accurate image processing capabilities.

The systems and methods described herein may enable a wearable computing device to use machine learning intelligence (e.g., neural networks, algorithms, etc.) that utilizes low power consumption and/or low processing consumption to carry out computer vision tasks such as object detection, movement tracking, facial recognition, OCR tasks, and the like.

In some implementations, the systems and methods described herein may employ scalable levels of image processing to utilize fewer processing and/or power resources. For example, a low-resolution image signal processor may be used to carry out low-level image processing tasks on a low-resolution image stream, while a high-resolution image signal processor may be used to carry out high-level image processing tasks on a high-resolution image stream. In some implementations, output from the low-resolution image signal processor may be provided as input to additional sensors, detectors, machine learning networks, and/or processors. Such output may represent a portion of an image (e.g., pixels, regions of interest, recognized image content, etc.) that can be used to identify content in a high-resolution image, where that content includes the corresponding pixels, regions of interest, and/or image content representing the image content recognized in the low-resolution image.

In some implementations, the systems and methods described herein may utilize one or more sensors onboard a wearable computing device to split detection and/or processing tasks. For example, a wearable computing device may include a dual stream sensor (e.g., a camera) that can retrieve both a high-resolution image stream and a low-resolution image stream from a plurality of images. For example, the low-resolution image stream may be captured by the sensor functioning as a scene camera, and the high-resolution image stream may be captured by the same sensor functioning as a detail camera. One or both streams may be provided to different processors to carry out image processing tasks. In some implementations, each stream may be used to generate different information for use in carrying out computer vision tasks, machine learning tasks, and/or other image processing tasks.

FIG. 1 is an example of a wearable computing device 100 for generating and processing image content using limited computational and/or power resources, according to implementations described throughout this disclosure. In this example, the wearable computing device 100 is depicted in the form of AR smart glasses. However, any form factor of a battery-powered device may be substituted and combined with the systems and methods described herein. In some implementations, the wearable computing device 100 includes a system-on-chip (SOC) architecture (not shown) in combination with any or all of one or more sensors, low-power island processors, high-power island processors, core processors, encoders, and the like.

In operation, the wearable computing device 100 may capture a scene 102 by means of a camera, an image sensor, or the like. The wearable computing device 100 may be worn and operated by a user 104. The scene 102 may include physical content as well as augmented reality (AR) content. In some implementations, the wearable computing device 100 may be communicably coupled with other devices, such as a mobile computing device 106.

In operation, the mobile computing device 106 may include a dual stream image sensor 108 (e.g., a dual resolution image sensor). The dual stream image sensor 108 may provide a low-resolution image stream 110 to a low-resolution (e.g., lightweight) image signal processor 112. The dual stream image sensor 108 may simultaneously provide a high-resolution image stream 114 to a buffer 116. A high-resolution image signal processor 118 may obtain the high-resolution image stream 114 from the buffer 116 to perform additional processing. The low-resolution image stream 110 may be provided to an ML model that identifies one or more ROIs (e.g., text, objects, hands, paragraphs, etc.), for example via an ROI detector 120. The processor 112 may determine ROI coordinates 122 (e.g., cropping coordinates) associated with the one or more ROIs.

The high-resolution image signal processor 118 may receive (or request) the ROI coordinates 122 and may use the ROI coordinates 122 to crop one or more high-resolution images from the high-resolution image stream 114. For example, the processor 112 may retrieve the ROI coordinates 122 associated with the low-resolution image stream 110 to determine which frames and regions within the high-resolution image stream 114 match the capture time of the associated low-resolution image frame in which the ROI was identified. The high-resolution image signal processor 118 may then crop the image to the same cropping ROI coordinates 122. The cropped image 124 may be provided to an ML engine (e.g., ML processor 126) to carry out computer vision tasks including, but not limited to, OCR, object recognition, deep paragraphing, hand detection, face detection, symbol detection, and the like.
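The step of matching a high-resolution frame to the capture time of the low-resolution frame in which an ROI was found can be sketched with a bounded buffer of timestamped frames. This is a minimal illustration, not the design of buffer 116: the class name `FrameBuffer`, the millisecond timestamps, and the tolerance parameter are all assumptions.

```python
from collections import deque


class FrameBuffer:
    """Bounded buffer of (timestamp_ms, frame) pairs; oldest entries drop first."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def push(self, timestamp_ms, frame):
        self._frames.append((timestamp_ms, frame))

    def closest(self, timestamp_ms, tolerance_ms=0):
        """Return the buffered entry nearest to the requested capture time.

        Returns None when the buffer is empty or when no frame falls
        within the (hypothetical) tolerance.
        """
        best = min(
            self._frames,
            key=lambda entry: abs(entry[0] - timestamp_ms),
            default=None,
        )
        if best is None or abs(best[0] - timestamp_ms) > tolerance_ms:
            return None
        return best


# Usage: three frames arrive, but the buffer only keeps the newest two.
buffer = FrameBuffer(capacity=2)
buffer.push(0, "high-res frame A")
buffer.push(33, "high-res frame B")
buffer.push(66, "high-res frame C")
match = buffer.closest(34, tolerance_ms=5)
evicted = buffer.closest(0, tolerance_ms=5)
```

A bounded buffer keeps memory use fixed, at the cost that an ROI detected too late may refer to a high-resolution frame that has already been dropped.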

In a non-limiting example, the wearable computing device 100 may be triggered to begin real-time image processing in response to a request to identify image content associated with optical data captured by the dual stream image sensor 108. The request may come from a user wearing the wearable computing device 100. The real-time image processing may begin with detecting a first sensor data stream (e.g., the low-resolution image stream 110) and a second sensor data stream (e.g., the high-resolution image stream 114). The wearable computing device 100 may then identify at least one region of interest (e.g., ROI 128 and ROI 130). In this example, ROI 128 includes an analog clock, and ROI 130 includes text and an image from a presentation. The device 100 may analyze the first stream (e.g., the low-resolution image stream 110) to determine cropping coordinates for ROIs 128 and 130.

The wearable computing device 100 may obtain the high-resolution image stream 114 from the buffer 116 and identify within it the same corresponding ROIs 128 and 130 (e.g., same capture time, same crop coordinates within the frame), so that one or more image frames of the high-resolution image stream 114 can be cropped to the ROIs 128 and 130 found in the low-resolution image stream 110. The one or more cropped image frames may be provided to the onboard ML processor 126 for further processing. Because cropping provides a smaller amount of image content for analysis (e.g., less visual sensor data, a portion of an image), ML processing can be performed with less power and lower latency than with a conventional ML processor that is asked to analyze the entire image. Accordingly, the wearable computing device 100 can perform ML processing in real time for tasks such as OCR, object recognition, deep paragraphing, hand detection, face detection, and symbol detection.

For example, the wearable computing device 100 may identify ROI 130, perform image cropping, and use ML processing to determine that the hand of the user 104 is pointing at the image and text in ROI 130. The processing may perform hand detection and OCR on ROI 130. Similarly, if audio content is available for the presentation, the wearable computing device 100 may evaluate ROI 128 to obtain a timestamp for the moment the user 104 pointed at ROI 130. The timestamp (e.g., 9:05) may be used to retrieve and analyze audio data (e.g., a transcription) associated with ROI 130. Such audio data may be converted to visual text for presentation on the display of the wearable computing device 100. Thus, the ML processor 126 may not only select ROIs but also correlate them, as shown in output 132. Output 132 may be presented on the display of the wearable computing device 100.

Although multiple processor blocks are shown (e.g., processor 112, processor 118, processor 126), a single processor may be used to perform all processing tasks on the wearable computing device 100. That is, each of the processor blocks 112, 118, and 126 may represent a different algorithm or code snippet that can be executed on a single processor onboard the wearable computing device 100.

In some implementations, objects and/or ROIs may be cropped at the dual stream image sensor 108 rather than by processor 112 or 118. In some implementations, as detailed above, objects and/or ROIs may be cropped at the image signal processor 112 or the image signal processor 118. In some implementations, the ROI may instead be prepared and transmitted to a mobile device in communication with the wearable computing device 100. The mobile device may crop the ROI and provide the cropped image back to the wearable computing device 100.

FIG. 2 illustrates a system 200 for performing image processing on the wearable computing device 100 according to implementations described throughout this disclosure. In some implementations, image processing is performed on the wearable computing device 100. In some implementations, image processing is shared among one or more devices. For example, image processing may be partially completed on the wearable computing device 100 and partially completed on the mobile computing device 202 (e.g., mobile computing device 106) and/or the server computing device 204. In some implementations, image processing is performed on the wearable computing device 100 and the output of that processing is provided to the mobile computing device 202 and/or the server computing device 204.

In some implementations, the wearable computing device 100 includes one or more computing devices, where at least one of the devices is a display device that can be worn on or near a person's skin. In some examples, the wearable computing device 100 is or includes one or more wearable computing device components. In some implementations, the wearable computing device 100 may include a head-mounted display (HMD) device, such as an optical head-mounted display (OHMD) device, a transparent heads-up display (HUD) device, a virtual reality (VR) device, or an augmented reality (AR) device, or another device such as goggles or a headset with sensors, a display, and computing capabilities. In some implementations, the wearable computing device 100 includes AR glasses (e.g., smart glasses). AR glasses are an optical head-mounted display device designed in the shape of a pair of eyeglasses. In some implementations, the wearable computing device 100 is or includes a smart watch. In some implementations, the wearable computing device 100 is or includes jewelry. In some implementations, the wearable computing device 100 is or includes a ring controller device or other wearable controller. In some implementations, the wearable computing device 100 is or includes earbuds/headphones or smart earbuds/headphones.

As shown in FIG. 2, the system 200 includes the wearable computing device 100 communicatively coupled to a mobile computing device 202 and, optionally, a server computing device 204. In some implementations, the communicative coupling may occur over a network 206. In some implementations, the communicative coupling may occur directly between the wearable computing device 100, the mobile computing device 202, and/or the server computing device 204.

The wearable computing device 100 includes one or more processors 208, which may be formed in a substrate and are configured to execute one or more machine-executable instructions or pieces of software, firmware, or a combination thereof. The processors 208 may be semiconductor-based and may include semiconductor material capable of performing digital logic. The processors 208 may include CPUs, GPUs, and/or DSPs, to name a few examples.

The wearable computing device 100 may also include one or more memory devices 210. The memory devices 210 may include any type of storage device that stores information in a format that can be read and/or executed by the processor(s) 208. The memory devices 210 may store applications and modules that, when executed by the processor(s) 208, perform certain operations. In some examples, the applications and modules may be stored on an external storage device and loaded into the memory devices 210. The memory 210 may include or have access to a buffer 212, for example, to store and retrieve image content and/or audio content for the wearable computing device 100.

The wearable computing device 100 includes a sensor system 214. The sensor system 214 includes one or more image sensors 216 configured to detect and/or acquire image data. In some implementations, the sensor system 214 includes multiple image sensors 216. The image sensor 216 may capture and record images (e.g., pixels, frames, and/or portions of images) and video.

In some implementations, the image sensor 216 is a red green blue (RGB) camera. In some examples, the image sensor 216 includes a pulsed laser sensor (e.g., a LiDAR sensor) and/or a depth camera. For example, the image sensor 216 may be a camera configured to detect and convey the information used to make an image, which is represented by an image frame 226. The image sensor 216 can capture and record both images and video.

In operation, the image sensor 216 is configured to acquire (e.g., capture) image data (e.g., optical sensor data) continuously or periodically while the wearable computing device 100 is activated. In some implementations, the image sensor 216 is configured to operate as an always-on sensor. In some implementations, the image sensor 216 may be activated in response to detection of an object of interest or a region of interest.

In some implementations, the image sensor 216 is a single sensor configured to function in a dual-resolution streaming mode. For example, the image sensor 216 may operate in a low-resolution streaming mode that captures (e.g., receives, obtains) a plurality of images at a low image resolution. The sensor 216 may simultaneously operate in a high-resolution streaming mode that captures (e.g., receives, obtains) the same plurality of images at a high image resolution. The dual mode may capture or retrieve the plurality of images as a first image stream with the low image resolution and a second image stream with the high image resolution.
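As a rough software analogy for the dual-resolution mode described above, the sketch below derives a low-resolution frame from each high-resolution frame by block averaging and yields both as a pair. This is an assumption made purely for illustration: the disclosure describes a sensor that produces the two streams itself, and `downsample`, `dual_streams`, and the `factor` parameter are hypothetical names.

```python
from typing import Iterable, Iterator, List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values


def downsample(frame: Image, factor: int) -> Image:
    """Average each factor x factor block of pixels into one output pixel."""
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be multiples of factor")
    low: Image = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [frame[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) // len(block))  # integer mean of the block
        low.append(row)
    return low


def dual_streams(frames: Iterable[Image],
                 factor: int) -> Iterator[Tuple[Image, Image]]:
    """Yield (low_resolution, high_resolution) pairs for the same capture."""
    for frame in frames:
        yield downsample(frame, factor), frame


high_frame = [[0, 2, 4, 6],
              [2, 4, 6, 8],
              [10, 10, 20, 20],
              [10, 10, 20, 20]]
low_frame, same_high_frame = next(dual_streams([high_frame], factor=2))
```

Pairing the two resolutions per capture keeps the later step of matching an ROI found in the low-resolution frame to its high-resolution counterpart trivial in this sketch.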

In some implementations, the wearable computing device 100 may employ the image sensor 216 with scalable levels of image processing in order to use fewer processing and/or power resources. For example, the systems and methods described herein may use low-level image processing, mid-level image processing, high-level image processing, and/or any combination of these levels by using two or more different types of image signal processors. In some implementations, the systems and methods described herein may also vary the resolution of the images when performing image processing techniques.

As used herein, high-level processing includes image processing on high-resolution image content. For example, computer vision techniques for performing content (e.g., object) recognition may be considered high-level processing. Mid-level image processing includes processing tasks that extract scene descriptions from image metadata, which may indicate, for example, the positions and/or shapes of portions of a scene. Low-level image processing includes extracting descriptions from images while ignoring scene objects and observer details.

The sensor system 214 may also include an inertial motion unit (IMU) sensor 218. The IMU sensor 218 may detect motion, movement, and/or acceleration of the wearable computing device 100. The IMU sensor 218 may include various types of sensors, such as an accelerometer, a gyroscope, a magnetometer, and other such sensors.

In some implementations, the sensor system 214 may also include an audio sensor 220 configured to detect audio received by the wearable computing device 100. The sensor system 214 may include other types of sensors, such as a light sensor, a distance and/or proximity sensor, a contact sensor such as a capacitive sensor, a timer, and/or other sensors and/or different combination(s) of sensors. The sensor system 214 may be used to obtain information associated with the position and/or orientation of the wearable computing device 100.

The wearable computing device 100 may also include a low-resolution image signal processor 222 to process images from a low-resolution image stream (e.g., low-resolution image stream 110). The low-resolution image signal processor 222 may also be referred to as a low-power image signal processor. The wearable computing device 100 may also include a high-resolution image signal processor 224 to process images from a high-resolution image stream (e.g., high-resolution image stream 114). The high-resolution image signal processor 224 may also be referred to as a high-power image signal processor (relative to the low-resolution image signal processor 222). The image streams may include image frames 226, each having a particular image resolution 228. In operation, the image sensor 216 may be dual function, operating in either or both of a low-power, low-resolution (LPLR) mode and a high-power, high-resolution (HPLR) mode.

The wearable computing device 100 may include a region of interest (ROI) detector 230 configured to detect ROIs 232 and/or objects 234. In addition, the wearable computing device 100 may include a cropper 236 configured to crop image content (e.g., image frames 226) and ROIs 232 according to crop coordinates 238. In some implementations, the wearable computing device 100 includes an encoder 240 configured to compress particular pixel blocks, image content, and the like. The encoder 240 may receive image content for compression from the buffer 212, for example.

The low-resolution image signal processor 222 may perform low-power computations to analyze the image stream generated by the sensor 216 and detect objects and/or regions of interest within the image(s). The detection may include hand detection, object detection, paragraph detection, and the like. When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition 256, a bounding box and/or other coordinates associated with the object and/or region may be identified. The threshold condition may represent a particular high level of quality for an object or region in an image frame of the image stream. For example, the threshold condition may pertain to some or all of the following: determining whether the object or region is in focus, is still, has low blur (e.g., low motion blur), and/or has correct exposure, and/or whether a particular object is detected in the view of the computing device.
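One way to picture such a threshold condition is as a simple gate over per-detection quality measurements. The sketch below is illustrative only: the metric names, the 0 to 1 scales, and the default limits are all assumptions, and it checks every criterion, whereas the text allows some or all of them to apply.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RoiQuality:
    sharpness: float       # 0..1, higher means better focus (hypothetical metric)
    motion_blur: float     # 0..1, lower means less blur (hypothetical metric)
    exposure: float        # 0..1, with 0.5 taken as nominal exposure
    object_detected: bool  # whether the target object was found in view


@dataclass
class ThresholdCondition:
    min_sharpness: float = 0.6                        # assumed limit
    max_motion_blur: float = 0.3                      # assumed limit
    exposure_range: Tuple[float, float] = (0.3, 0.7)  # assumed limits
    require_object: bool = True


def meets_threshold(quality: RoiQuality, cond: ThresholdCondition) -> bool:
    """Return True only if the detection satisfies every enabled criterion."""
    low, high = cond.exposure_range
    return (quality.sharpness >= cond.min_sharpness
            and quality.motion_blur <= cond.max_motion_blur
            and low <= quality.exposure <= high
            and (quality.object_detected or not cond.require_object))


sharp_roi = RoiQuality(sharpness=0.8, motion_blur=0.1,
                       exposure=0.5, object_detected=True)
blurry_roi = RoiQuality(sharpness=0.8, motion_blur=0.6,
                        exposure=0.5, object_detected=True)
```

Only detections that pass the gate would go on to have bounding boxes or crop coordinates computed, which keeps low-quality frames away from the more expensive high-resolution path.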

The wearable computing device 100 may also include one or more antennas 242 configured to communicate with other computing devices via wireless signals. For example, the wearable computing device 100 may receive one or more wireless signals and use them to communicate with other devices, such as the mobile computing device 202 and/or the server computing device 204, or with other devices within range of the antennas 242. The wireless signals may be triggered via a wireless connection, such as a short-range connection (e.g., a Bluetooth connection or a near-field communication (NFC) connection) or an Internet connection (e.g., Wi-Fi or a mobile network).

The wearable computing device 100 includes a display 244. The display 244 may include an LCD, an LED display, an OLED display, an electrophoretic display (EPD), or a micro-projection display employing an LED light source. In some examples, the display 244 is projected into the user's field of view. In some examples, in the case of AR glasses, the display 244 may provide a transparent or semi-transparent display so that a user wearing the AR glasses can see the images provided by the display 244 as well as information located in the field of view of the AR glasses behind the projected images.

The wearable computing device 100 also includes a control system 246 that includes various control system devices to facilitate operation of the wearable computing device 100. The control system 246 may use the processors 208, the sensor system 214, and/or processors 248 (e.g., CPUs, GPUs, DSPs) operably coupled to the components of the wearable computing device 100.

The wearable computing device 100 also includes a UI renderer 250. The UI renderer 250 may function together with the display 244 to depict user interface objects or other content to the user of the wearable computing device 100. For example, the UI renderer 250 may receive images captured by the wearable computing device 100 in order to generate and render additional user interface content on the display 244.

The wearable computing device 100 also includes a communication module 252. The communication module 252 may enable the wearable computing device 100 to communicate and exchange information with other computing devices within range of the wearable computing device 100. For example, the wearable computing device 100 may be operably coupled to another computing device to facilitate communication via a wired connection, a wireless connection (e.g., Wi-Fi or Bluetooth), or another type of connection.

In some implementations, the wearable computing device 100 is configured to communicate with the server computing device 204 and/or the mobile computing device 202 over the network 206. The server computing device 204 may represent one or more computing devices that take the form of a number of different devices, for example a standard server, a group of such servers, or a rack server system. In some implementations, the server computing device 204 is a single system sharing components such as processors and memory. The network 206 may include the Internet and/or other types of data networks, such as a local area network (LAN), a wide area network (WAN), a cellular network, a satellite network, or other types of data networks. The network 206 may also include any number of computing devices (e.g., computers, servers, routers, network switches) configured to receive and/or transmit data within the network 206.

In some implementations, the wearable computing device 100 also includes neural networks (NNs) 254 for performing machine learning tasks onboard the wearable computing device 100. The neural networks may be used with machine learning models and/or operations to generate, modify, predict, and/or detect particular image content. In some implementations, the image processing performed on the sensor data obtained by the sensor(s) 216 is referred to as a machine learning (ML) inference operation. An inference operation may refer to an image processing operation, step, or sub-step that involves an ML model that makes (or leads to) one or more predictions. Certain types of image processing performed by the wearable computing device 100 may use ML models to make predictions. For example, machine learning may use statistical algorithms that learn from existing data in order to render a decision about new data, a process called inference. In other words, inference refers to the process of taking a model that is already trained and using that trained model to make predictions. Some examples of inference include image recognition (e.g., OCR text recognition, facial recognition, and facial, body, or object tracking) and/or perception (e.g., always-on audio sensing and voice-input request audio sensing).

In some implementations, the ML models include one or more NNs 254. An NN 254 transforms an input, received by an input layer, through a series of hidden layers and produces an output via an output layer. Each layer is made up of a subset of the set of nodes. The nodes in a hidden layer are fully connected to all nodes in the previous layer and provide their output(s) to all nodes in the next layer. The nodes in a single layer function independently of each other (i.e., they do not share connections). The nodes in the output layer provide the transformed input to the requesting process. In some implementations, the NN 254 used by the wearable computing device 100 is a convolutional neural network, which is an NN that is not fully connected.
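The layer-by-layer transformation described above, for the fully connected case, can be sketched in a few lines. This is a generic illustration rather than the NN 254 itself: the ReLU activation, the weight layout (one row of weights per output node), and all names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]  # one row of weights per output node


def dense(inputs: Vector, weights: Matrix, biases: Vector) -> Vector:
    """Fully connected layer: every output node sees every input node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values: Vector) -> Vector:
    """Assumed activation; the text does not name one."""
    return [max(0.0, v) for v in values]


def forward(inputs: Vector, layers: Sequence[Tuple[Matrix, Vector]]) -> Vector:
    """Pass the input through each hidden layer, then the output layer."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # no activation after the output layer
            activations = relu(activations)
    return activations


# Two inputs -> two hidden nodes -> one output node.
example_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer
    ([[1.0, 2.0]], [0.5]),                    # output layer
]
prediction = forward([2.0, 1.0], example_layers)
```

Note that nodes within one layer never read each other's values here, matching the statement that nodes in a single layer do not share connections. A convolutional layer would differ by connecting each output only to a local window of inputs.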

In general, the wearable computing device 100 may use NNs and classifiers (not shown) to generate training data and perform ML tasks. For example, the processor 208 may be configured to execute a classifier that detects whether an ROI 232 is included within an image frame 226 captured by the image sensor 216. The ROI 232 may also be referred to as an object of interest. The classifier may include or be defined by an ML model. The ML model may define a number of parameters that the ML model uses to make a prediction (e.g., whether the ROI 232 is included within the image frame 226). The ML model may be relatively small and may be configured to operate on ROIs rather than on entire images. Accordingly, the classifier may be configured to save power and latency by means of its relatively small ML model.

In some implementations, the wearable computing device 100 may use an ROI classifier that executes an ML model to compute an ROI dataset (e.g., ROI 232). In some examples, the ROI 232 is an example of object location data and/or a bounding box dataset. The ROI 232 may be data that defines where the ROI is located within the image frame(s) 226.

In operation, the wearable computing device 100 may receive or capture image content (e.g., a plurality of images/frames) at a dual-resolution image sensor (e.g., image sensor 216). The image content may be sent as low-resolution images to the low-resolution image signal processor 222 while, at the same time, high-resolution images of the same plurality of images/frames are sent to the high-resolution image signal processor 224. The low-resolution images may then be sent to an onboard machine learning model, such as the neural network 254. The NN 254 may use the ROI detector 230 to identify one or more ROIs (e.g., text, objects, images, frames, pixels). Each identified ROI may be associated with determined crop coordinates 238. The wearable computing device 100 may send these crop coordinates 238 to the high-resolution image signal processor 224 to trigger cropping of the high-resolution image based on the crop coordinates of the low-resolution image that correspond to the detected ROI. The cropped image is smaller than the original high-resolution image, so the wearable computing device 100 can perform machine learning activities (e.g., OCR, object detection) on the cropped image using the NN 254.
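The hand-off of crop coordinates from the low-resolution path to the high-resolution path can be sketched as below. The text does not say how coordinates are converted between resolutions, so the proportional scaling here is an assumption, as are the `Box` layout (left, top, right, bottom with exclusive right and bottom edges) and the function names.

```python
import math
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Size = Tuple[int, int]           # (width, height)


def scale_box(box: Box, low_size: Size, high_size: Size) -> Box:
    """Map a box from low-resolution to high-resolution pixel coordinates."""
    sx = high_size[0] / low_size[0]
    sy = high_size[1] / low_size[1]
    left, top, right, bottom = box
    return (math.floor(left * sx),
            math.floor(top * sy),
            min(high_size[0], math.ceil(right * sx)),    # clamp to frame width
            min(high_size[1], math.ceil(bottom * sy)))   # clamp to frame height


def crop(frame: Image, box: Box) -> Image:
    """Return only the pixels inside the box."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


# An 8x8 high-resolution frame whose pixel values encode their position.
high_res = [[y * 8 + x for x in range(8)] for y in range(8)]
roi_low = (1, 1, 3, 2)  # ROI found on the matching 4x4 low-resolution frame
roi_high = scale_box(roi_low, low_size=(4, 4), high_size=(8, 8))
cropped = crop(high_res, roi_high)
```

In this toy case the cropped region holds 8 of the frame's 64 pixels, which illustrates why running a model on the crop rather than the full frame reduces the amount of data to process.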

In some implementations, the wearable computing device 100 may perform other vision-based algorithms on the cropped region. For example, object recognition may be performed on the cropped region to identify and use information associated with the recognized object(s). For example, particular products, landmarks, storefronts, plants, animals, and/or various other generic objects and/or object classifiers may be recognized and used as input to determine additional information for the user. In some implementations, barcode recognition and reading may be performed on the cropped region to identify and provide additional information for the wearable computing device 100 that may be displayed, audibly presented, or otherwise output. In some implementations, face recognition may be performed on the cropped region to enable placement of AR and/or VR content on a recognized face, capture of particular features of the recognized face, and computation of particular features of the face. In some implementations, feature tracking (e.g., feature point detection) may be performed on the cropped region to track object motion and/or user motion.

FIGS. 3A and 3B show various views of an example of an AR wearable computing device according to implementations described throughout this disclosure. FIG. 3A is a front view of an example of a wearable computing device according to implementations described throughout this disclosure. In this example, the wearable computing device is a pair of AR glasses 300A (e.g., wearable computing device 100 of FIG. 1). In general, AR glasses 300A may include any or all components of system 200. AR glasses 300A may also be referred to as smart glasses, representing an optical head-mounted display device designed in the form of glasses. For example, smart glasses are glasses that add information (e.g., a projected display) alongside what the wearer sees through the glasses.

Although AR glasses 300A are shown as the wearable computing device described herein, other types of wearable computing devices are possible. For example, the wearable computing device may include any battery-powered device, including but not limited to a head-mounted display (HMD) device such as an optical head-mounted display (OHMD) device, a transparent heads-up display (HUD), or an augmented reality (AR) device, or another device such as goggles or a headset having sensors, a display, and computing capabilities. In some examples, wearable computing device 300A may be a watch, a mobile device, a piece of jewelry, a ring controller, or another wearable controller.

As shown in FIG. 3A, AR glasses 300A include a frame 302 and a display device 304 coupled to frame 302 (or to a glass portion of frame 302). AR glasses 300A also include an audio output device 306, an illumination device 308, a sensing system 310, a control system 312, at least one processor 314, and a camera 316.

Display device 304 may include a see-through near-eye display, such as one using birdbath or waveguide optics. For example, such an optical design may project light from a display source onto a portion of teleprompter glass that functions as a beam splitter seated at a 45-degree angle. The beam splitter may have reflection and transmission values that allow light from the display source to be partially reflected while the remaining light is transmitted through. Such an optical design may allow a user to see both physical (real-world) items and digital images (e.g., UI elements, virtual content, etc.) generated by the display next to them. In some implementations, waveguide optics may be used to depict content on display device 304 of AR glasses 300A.

Audio output device 306 (e.g., one or more speakers) may be coupled to frame 302. Sensing system 310 may include various sensing devices and a control system 312, which includes various control system devices, to facilitate operation of AR glasses 300A. Control system 312 may include a processor 314 operably coupled to the components of control system 312.

Camera 316 may capture still images and/or moving images. In some implementations, camera 316 may be a depth camera that can collect data related to the distances of external objects from camera 316. In some implementations, camera 316 may be a point tracking camera that can, for example, detect and follow one or more optical markers on an external device, such as optical markers on an input device or a finger on a screen. In some implementations, AR glasses 300A may include an illumination device 308 that may selectively operate with camera 316, for example, to detect objects (e.g., virtual and physical) in the field of view of camera 316.

AR glasses 300A may include a communication module (e.g., communication module 252) in communication with processor 314 and control system 312. The communication module may provide for communication between devices housed within AR glasses 300A as well as communication with external devices such as, for example, controllers, mobile devices, servers, and/or other computing devices. The communication module may enable AR glasses 300A to communicate and exchange information with another computing device and to authenticate other devices within range of AR glasses 300A or other identifiable elements in the environment. For example, AR glasses 300A may be operably coupled to another computing device to facilitate communication via, for example, a wired connection, a wireless connection (e.g., Wi-Fi or Bluetooth), or another type of connection.

FIG. 3B is a rear view 300B of AR glasses 300A according to implementations described throughout this disclosure. AR glasses 300B may be an example of wearable computing device 100 of FIG. 1. AR glasses 300B are glasses that add information (e.g., by projecting display 320) alongside what the wearer sees through the glasses. In some implementations, instead of projecting information, display 320 is an in-lens micro display. In some implementations, AR glasses 300B (e.g., eyeglasses or spectacles) are vision aids that include lenses 322 (e.g., glass or hard plastic lenses) mounted in a frame 302 that holds the lenses in front of a person's eyes, utilizing a bridge 324 over the nose and bows 326 (e.g., temples or temple pieces) that rest over the ears.

FIGS. 4A-4C show example flow diagrams 400A, 400B, and 400C for performing image processing tasks on a wearable computing device according to implementations described throughout this disclosure. For example, flow diagrams 400A-400C may be performed by wearable computing device 100 using one or more image sensors 216. In the examples of flow diagrams 400A-400C, sensor 402 may function in a dual-stream mode in which sensor 402 continuously and simultaneously processes both an incoming (or captured) low-resolution image stream and an incoming (or captured) high-resolution image stream.

FIG. 4A is a flow diagram 400A illustrating use of an image sensor 402 installed in a wearable computing device and configured to capture or receive streams of image data. Sensor 402 may capture and output two streams of image content. For example, sensor 402 may obtain a plurality of high-resolution image frames captured with a full field of view (e.g., at the full resolution of camera/sensor 402). The plurality of high-resolution image frames may be stored in buffer 404. In general, high-power image signal processor 406 may optionally be used to perform analysis of the high-resolution image frames.

In addition, sensor 402 may (simultaneously) obtain a plurality of low-resolution image frames captured with the full field of view. The low-resolution image frames may be provided to low-power image signal processor 408. Processor 408 processes the low-resolution image frames to generate an image stream 410 (e.g., one or more images) of debayered, color-corrected, and shade-corrected images (e.g., a YUV-format image stream).

Low-power image signal processor 408 may perform low-power compute 412 to analyze the generated image stream 410 and detect an object and/or region of interest 414 within image(s) 410. The detection may include hand detection, object detection, paragraph detection, and the like. When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition, a bounding box of the object and/or region is identified. The threshold condition may represent a particular high level of quality for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to any or all of determining that the object or region is in focus, is still (or has low blur), and/or has correct exposure. The bounding box may represent excerpt coordinates for each of the four corners of the box. In some implementations, the bounding box may not be box-shaped and may instead be triangular, circular, elliptical, or another shape.
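
The gating of bounding-box output on a threshold condition can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the quality scores (`focus`, `blur`, `exposure`), their 0-to-1 ranges, the default threshold values, and all class and function names are hypothetical, and the sketch requires all three conditions although the description allows any or all of them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Detection:
    """A detected object/ROI in a low-resolution frame (hypothetical fields)."""
    left: int
    top: int
    right: int
    bottom: int
    focus: float     # 0..1, higher means sharper focus
    blur: float      # 0..1, lower means less motion blur
    exposure: float  # 0..1, mean normalized luminance of the region


@dataclass(frozen=True)
class Thresholds:
    """Illustrative threshold values; real values would be tuned per device."""
    min_focus: float = 0.6
    max_blur: float = 0.2
    exposure_range: Tuple[float, float] = (0.25, 0.75)


def meets_threshold(d: Detection, t: Thresholds) -> bool:
    """True when the detection is in focus, has low blur, and is well exposed."""
    lo, hi = t.exposure_range
    return d.focus >= t.min_focus and d.blur <= t.max_blur and lo <= d.exposure <= hi


def excerpt_coordinates(d: Detection, t: Thresholds) -> Optional[List[Tuple[int, int]]]:
    """Return the four box corners (clockwise from top-left), or None if gated out."""
    if not meets_threshold(d, t):
        return None
    return [(d.left, d.top), (d.right, d.top), (d.right, d.bottom), (d.left, d.bottom)]


# Usage: a sharp, still, well-exposed detection passes; a blurry one does not.
sharp = Detection(10, 20, 50, 60, focus=0.9, blur=0.05, exposure=0.5)
blurry = Detection(10, 20, 50, 60, focus=0.9, blur=0.6, exposure=0.5)
corners = excerpt_coordinates(sharp, Thresholds())
rejected = excerpt_coordinates(blurry, Thresholds())
```

Returning `None` for a rejected detection keeps the high-power path idle until a frame of sufficient quality arrives.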

FIG. 4B is a flow diagram 400B illustrating use of image sensor 402 installed in a wearable computing device and configured to capture or receive streams of image data. Similar to FIG. 4A, sensor 402 may capture and output two streams of image content, namely a plurality of high-resolution image frames and a plurality of low-resolution image frames. The plurality of high-resolution image frames may be provided to buffer 404.

The low-resolution image frames may be provided to low-power image signal processor 408. Processor 408 processes the low-resolution image frames to generate an image stream 410 (e.g., one or more images) of debayered, color-corrected, and shade-corrected images (e.g., a YUV-format image stream).

Low-power image signal processor 408 may perform low-power compute 412 to analyze the generated image stream 410 and detect an object and/or region of interest 414 within image(s) 410. When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition, a bounding box of the object and/or region is identified. The threshold condition may represent a particular high level of quality for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to any or all of determining that the object or region is in focus, is still (or has low blur), and/or has correct exposure. The bounding box may represent excerpt coordinates that delimit the region or object of interest 414. Once the object or region of interest 414 is determined and the threshold condition(s) are met, high-power image signal processor 406 retrieves (or receives) a high-resolution image frame from buffer 404 (e.g., a raw frame whose capture time matches that of the low-resolution image frame in which the object or region of interest was identified).
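
Retrieval of the high-resolution frame whose capture time matches the analyzed low-resolution frame can be sketched with a small bounded buffer. The description does not state how buffer 404 is organized, so the fixed-capacity ring buffer, the microsecond timestamp field, and the names `RawFrame` and `HighResBuffer` are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, NamedTuple, Optional


class RawFrame(NamedTuple):
    """A buffered high-resolution frame (hypothetical layout)."""
    capture_time_us: int
    pixels: bytes


class HighResBuffer:
    """Fixed-capacity buffer that evicts the oldest frame when full."""

    def __init__(self, capacity: int) -> None:
        self._frames: Deque[RawFrame] = deque(maxlen=capacity)

    def push(self, frame: RawFrame) -> None:
        self._frames.append(frame)

    def find(self, capture_time_us: int) -> Optional[RawFrame]:
        """Return the frame captured at the given time, or None if evicted."""
        # Search newest-first, since the match is usually a recent frame.
        for frame in reversed(self._frames):
            if frame.capture_time_us == capture_time_us:
                return frame
        return None


# Usage: four frames pushed into a three-slot buffer; the oldest is evicted.
buffer = HighResBuffer(capacity=3)
for t in (1000, 2000, 3000, 4000):
    buffer.push(RawFrame(capture_time_us=t, pixels=b"raw"))
match = buffer.find(3000)    # frame still buffered
evicted = buffer.find(1000)  # frame already overwritten
```

The capacity bounds how long low-power analysis may take before the matching high-resolution frame is lost, which is one reason a sketch like this returns `None` rather than raising.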

For example, the excerpt coordinates and the retrieved high-resolution image frame may be provided as input to high-power image signal processor 406 (e.g., arrow 416). High-power image signal processor 406 may output a processed (e.g., debayered, color-corrected, shade-corrected) image 418 that is excerpted to the object and/or region of interest determined from the low-resolution image frame. This excerpted, full-resolution image 418 may be provided to additional onboard or offboard devices for further processing.

In some implementations, instead of a plurality of image frames, a single high-resolution image may be processed by the high-power image signal processor to save compute and/or device power. Additional compute and/or device power savings may also arise from further downstream processing, because such downstream processing can operate on excerpted image 418 rather than on the entire high-resolution image.

In general, low-power image signal processor 408 may not continue processing the stream of low-resolution image frames after the object and/or region of interest has been identified. In some implementations, low-power image signal processor 408 may be placed in a standby mode to await additional quality metrics for previously provided/identified image frames. In some implementations, sensor 402 may continue to trigger performance of perception functions (i.e., ML processing) on the low-resolution image stream, which may further affect the output provided to downstream functions.

FIG. 4C is a flow diagram 400C illustrating an example of additional downstream processing. As shown, excerpted image 418 represents the region of interest. Excerpted image 418 is a high-resolution image that may be sent to an encoder (e.g., compression) block without buffering image 418 (or images) in memory. The compressed output of the encoder is buffered in buffer 422 and encrypted by encryption block 424. The encrypted image 418 may be buffered in buffer 426 and transmitted, for example via Wi-Fi (e.g., network 206), to companion mobile computing device 202. Mobile computing device 202 may be configured with compute power, memory, and battery capacity to support, for example, more complex processing than is supported on wearable computing device 100. Because the further downstream processing utilizes an excerpted version of the image content rather than the entire image, the downstream processing also benefits from savings in compute, memory, and battery power. For example, more power is used to send the additional packets over Wi-Fi for a larger image than is used for excerpted image 418.
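
The relationship between image size and the number of packets sent over the wireless link can be illustrated with simple arithmetic. The frame dimensions, the 1400-byte payload size, and the function names below are hypothetical values chosen only for illustration; the sketch also ignores the compression and encryption stages of FIG. 4C, which would change the absolute numbers for both images. The 12 bits per pixel figure corresponds to YUV 4:2:0 sampling.

```python
def frame_bytes(width: int, height: int, bits_per_pixel: int = 12) -> int:
    """Size of one uncompressed frame in bytes (12 bpp matches YUV 4:2:0)."""
    return (width * height * bits_per_pixel + 7) // 8


def packets_needed(num_bytes: int, payload_bytes: int = 1400) -> int:
    """Number of fixed-payload packets needed to carry num_bytes (ceiling division)."""
    return -(-num_bytes // payload_bytes)


# Illustrative sizes only: a 4000x3000 full frame versus an 800x600 excerpt.
full_packets = packets_needed(frame_bytes(4000, 3000))
excerpt_packets = packets_needed(frame_bytes(800, 600))
ratio = full_packets / excerpt_packets
```

Under these assumed sizes the full frame needs 12858 packets and the excerpt needs 515, so the radio is active for roughly one twenty-fifth of the packets when only the excerpt is sent.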

FIGS. 5A and 5B illustrate example flow diagrams 500A and 500B for performing image processing tasks on a wearable computing device according to implementations described throughout this disclosure. For example, flow diagrams 500A and 500B may be performed by wearable computing device 100 using one or more image sensors 216. In the examples of flow diagrams 500A and 500B, sensor 402 may function in a dual-stream mode in which sensor 402 continuously and simultaneously processes both an incoming (or captured) low-resolution image stream and an incoming (or captured) high-resolution image stream.

FIG. 5A is a flow diagram 500A illustrating use of image sensor 402 installed, for example, in wearable computing device 100 and configured to capture or receive streams of image data. Sensor 402 may capture and output two streams of image content. For example, sensor 402 may obtain a plurality of high-resolution image frames captured with a full field of view (e.g., at the full resolution of camera/sensor 402). In this example, the plurality of high-resolution image frames may be processed by high-power image signal processor 406. The processed high-resolution image frames may be encoded (e.g., compressed) by encoder 502 and stored in buffer 404. This provides the advantage of saving the buffer memory and memory bandwidth that would otherwise be used to store the high-resolution image frames before compression: in this example, a compressed version of each high-resolution image frame is stored instead of the original uncompressed image frame.

In addition, sensor 402 may (simultaneously) obtain a plurality of low-resolution image frames captured with the full field of view. The low-resolution image frames may be provided to low-power image signal processor 408. Processor 408 processes the low-resolution image frames to generate an image stream 410 (e.g., one or more images) of debayered, color-corrected, and shade-corrected images (e.g., a YUV-format image stream).

Low-power image signal processor 408 may perform low-power compute 412 to analyze the generated image stream 410 and detect an object and/or region of interest 414 within image(s) 410. The detection may include hand detection, object detection, paragraph detection, and the like. When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition, a bounding box of the object and/or region is identified. The threshold condition may represent a particular high level of quality for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to any or all of determining that the object or region is in focus, is still (or has low blur), and/or has correct exposure. The bounding box may represent, for example, excerpt coordinates for each of the four corners of the box.

FIG. 5B is a flow diagram 500B illustrating a process of generating partial image content from, for example, one or more image frames 410. Similar to FIG. 5A, sensor 402 may capture and output two streams of image content, including at least one high-resolution image frame and at least one low-resolution image frame. In this example, the at least one high-resolution image frame may be processed by high-power image signal processor 406. The processed high-resolution image frame may be encoded (e.g., compressed) by encoder 502 and stored in buffer 404.

In addition, sensor 402 may (simultaneously) obtain at least one low-resolution image frame, which may be provided to low-power image signal processor 408. Processor 408 processes the at least one low-resolution image frame to generate at least one image 410 that is a debayered, color-corrected, and shade-corrected image (e.g., a YUV-format image).

Low-power image signal processor 408 may perform low-power compute 412 to analyze the at least one image 410 and, for example, detect an object and/or region of interest 414 within the at least one image 410. The detection may include hand detection, object detection, paragraph detection, and the like. When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition, a bounding box of the object and/or region is identified. The threshold condition may represent a particular high level of quality for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to any or all of determining that the object or region is in focus, is still (or has low blur), and/or has correct exposure. The bounding box may represent, for example, excerpt coordinates for each of the four corners of the box.

Once the ROI or object of interest is determined and the threshold condition is met, the excerpt coordinates (e.g., excerpt coordinates 238) may be sent to, or obtained by, an extractor (e.g., extractor 236). Flow diagram 500B may then retrieve at least one high-resolution image from buffer 404. The retrieved high-resolution image is then excerpted (504) to the region or object of interest according to the excerpt coordinates determined for the low-resolution image. The high-resolution image generally has the same capture time as the low-resolution image in which the region or object of interest was identified. The output of the excerpting operation may include an excerpted portion 506 of the high-resolution image. Additional downstream processing may be performed onboard or offboard wearable computing device 100.

Unlike conventional image analysis, which takes place on devices with greater compute and power resources, the systems described herein can avoid continuous processing of high-resolution images. In addition, because no scaling is performed by wearable computing device 100, the systems described herein can avoid spending compute resources on scaling. Furthermore, typical image analysis systems that utilize both a low-resolution image stream and a high-resolution image stream use separate buffered storage for each stream. The systems described herein use a buffer for the high-resolution images and avoid buffering the low-resolution images, which provides the advantage of reduced memory traffic and/or memory use.

FIGS. 6A-6C show example flow diagrams 600A, 600B, and 600C for performing image processing tasks on wearable computing device 100 according to implementations described throughout this disclosure. In the examples of flow diagrams 600A-600C, sensor 402 may function in a single-stream mode in which sensor 402 processes only one stream, either an incoming (or captured) low-resolution image stream or an incoming (or captured) high-resolution image stream. Sensor 402 may then be triggered to function in a dual-stream mode to continuously and simultaneously process both the incoming (or captured) low-resolution image stream and the incoming (or captured) high-resolution image stream.

FIG. 6A is a flow diagram 600A illustrating use of image sensor 402 installed, for example, in wearable computing device 100. Sensor 402 may be configured to switch between dual-mode processing and single-mode processing. For example, sensor 402 may first function in a mode that streams/captures only a low-resolution image stream (e.g., a plurality of sequential image frames). Such a mode can save compute power and battery power for wearable computing device 100 because high-resolution image analysis is avoided during the initial capture and/or image retrieval phase. Buffer 404 and high-power image signal processor 406 may be available for use, but are disabled in this example.

In operation, the low-resolution image frames may be provided to low-power image signal processor 408. Processor 408 processes the low-resolution image frames to generate an image stream 410 (e.g., one or more images) of debayered, color-corrected, and shade-corrected images (e.g., a YUV-format image stream). Low-power image signal processor 408 may perform low-power compute 412 to analyze the generated image stream 410 and detect an object and/or region of interest 414 within image(s) 410. The detection may include hand detection, object detection, paragraph detection, and the like.

When an object of interest and/or ROI is detected and the detection of that region or object meets a threshold condition, a bounding box of the object and/or region is identified. The threshold condition may represent a particular high level of quality for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to any or all of determining that the object or region is in focus, is still (or has low blur), and/or has correct exposure. The bounding box may represent, for example, excerpt coordinates for each of the four corners of the box. The identification of the region or object of interest and of the excerpt coordinates may be the trigger that causes sensor 402 to begin operating in dual-stream mode.

In some implementations, the trigger to begin dual stream mode may occur when several, but not all, of the threshold conditions are met. For example, the wearable computing device 100 may recognize an object of interest that meets a threshold condition for camera focus and a threshold condition for camera exposure. However, because the object may be moving (e.g., motion blur may be present), the wearable computing device 100 may trigger the use of dual stream mode in preparation for capturing a high-resolution image using the processor 118 once the final condition is met. This can be useful when switching to dual stream mode incurs a delay of several frames. That is, the period in which dual stream mode runs may be a transition period that is entered when nearly all threshold conditions are met and the final threshold condition is predicted to be detected in an upcoming image frame.
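One way to read this early trigger is as a small decision rule. The sketch below assumes that "nearly all" means exactly one unmet condition and that some predictor (not shown, and not described in the source) reports whether that condition is expected to be met in an upcoming frame. The function name and dictionary keys are hypothetical.

```python
from typing import Dict


def should_enter_dual_stream(
    conditions: Dict[str, bool],
    predicted_soon: Dict[str, bool],
) -> bool:
    """Decide whether to start dual stream mode.

    conditions:     current state of each threshold condition.
    predicted_soon: whether a condition is predicted to be met in an
                    upcoming frame (assumed to come from elsewhere).
    """
    unmet = [name for name, ok in conditions.items() if not ok]
    if not unmet:
        # Every condition is already met.
        return True
    if len(unmet) == 1:
        # Enter early to hide the mode-switch delay, but only if the last
        # outstanding condition is expected to be met shortly.
        return predicted_soon.get(unmet[0], False)
    return False
```

For example, with focus and exposure met and motion blur predicted to settle, the function returns `True`, so the high-resolution stream can be configured before the final condition is actually met.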

FIG. 6B is a flow diagram 600B illustrating the use of the image sensor 402 installed in the wearable computing device 100, in which the sensor 402 is switched to dual stream mode. The trigger for switching to dual stream mode may include detection of one or more objects or regions of interest. Dual stream mode triggers the sensor 402 to begin capturing (or acquiring) high-resolution image frames in addition to the low-resolution image frames that are already streaming. In some implementations, several image frames of the high-resolution image stream may be used to configure automatic gain measurements, exposure settings, and focus settings. Once configuration of the high-resolution image stream is complete, the streams can be synchronized, and the low-resolution image frames can continue through the same flow as in FIG. 6A, as shown by arrow 602. However, the sensor 402 may begin buffering the high-resolution image frames (using buffer 404) using the processor. The buffer 404 may be smaller than the buffer that would be used if the sensor operated in continuous dual stream mode from the start of image streaming.

FIG. 6C is a flow diagram 600C illustrating the use of the image sensor 402 installed in the wearable computing device 100 and configured to capture or receive streams of image data after switching from single-mode streaming to dual-mode streaming. In operation, the sensor 402 may obtain a plurality of high-resolution image frames captured with the full field of view (e.g., at the full resolution of the camera/sensor 402). The plurality of high-resolution image frames may be stored in the buffer 404.

Low-resolution image frames may continue to be captured and/or acquired by the sensor 402 and provided to the low-power image signal processor 408. The processor 408 processes the low-resolution image frames to generate an image stream 410 (e.g., one or more images) that is debayered, color-corrected, and shade-corrected (e.g., a YUV-format image stream). The low-power image signal processor 408 may perform low-power computations 412 to analyze the generated image stream 410 and detect objects and/or regions of interest 414 within the image(s) 410. If an object of interest and/or ROI is detected and the detection meets a threshold condition, a bounding box for the object and/or region is identified.

The threshold condition may represent a particular high quality level for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to some or all of determining whether the object or region is in focus, is still with low blur, and/or has the correct exposure. The bounding box may represent cropping coordinates that describe the region of interest or object 414. Once the object or region of interest 414 is determined and the threshold condition(s) are met, the high-power image signal processor 406 retrieves (or receives) a high-resolution image frame from the buffer 404 (e.g., a raw frame whose capture time matches that of the low-resolution image frame in which the object or region of interest was identified).
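Retrieving the buffered raw frame that shares a capture time with the analyzed low-resolution frame can be sketched with a small bounded buffer. This is an illustrative sketch only: the `FrameBuffer` class, microsecond timestamps, and the tolerance parameter are assumptions, not details given for buffer 404.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class FrameBuffer:
    """Bounded buffer of (timestamp, raw frame) pairs; oldest frames are evicted first."""

    def __init__(self, capacity: int) -> None:
        self._frames: Deque[Tuple[int, bytes]] = deque(maxlen=capacity)

    def push(self, timestamp_us: int, raw_frame: bytes) -> None:
        """Store a high-resolution raw frame with its capture time."""
        self._frames.append((timestamp_us, raw_frame))

    def closest(self, timestamp_us: int, tolerance_us: int = 0) -> Optional[bytes]:
        """Return the frame whose capture time is nearest to timestamp_us,
        provided it lies within tolerance_us; otherwise return None."""
        best: Optional[bytes] = None
        best_delta: Optional[int] = None
        for ts, frame in self._frames:
            delta = abs(ts - timestamp_us)
            if delta <= tolerance_us and (best_delta is None or delta < best_delta):
                best, best_delta = frame, delta
        return best
```

A small capacity reflects the point that only a few recent high-resolution frames need to be held while the low-resolution frame is being analyzed.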

The cropping coordinates and the retrieved high-resolution image frame may be provided as inputs to the high-power image signal processor 406 (e.g., arrow 604). The high-power image signal processor 406 may output a processed (e.g., converted, color-corrected, shade-corrected) image 606 that is cropped to the object and/or region of interest determined from the low-resolution image frame. This cropped, full-resolution image 606 may be provided to additional onboard or offboard devices for further processing.

FIGS. 7A-7C illustrate example flow diagrams 700A, 700B, and 700C for performing image processing tasks on the wearable computing device 100 according to implementations described throughout this disclosure. In the examples of flow diagrams 700A-700C, the sensor 402 may function in a first mode and may then be triggered to function in a second mode. For example, the sensor 402 may be triggered to begin streaming and/or capturing content in a low-resolution streaming mode in which low-resolution images are captured and utilized. Upon detecting one or more objects and/or regions of interest, the sensor 402 may switch to a high-resolution streaming mode in which high-resolution images are utilized.

FIG. 7A is a flow diagram 700A illustrating the use of the image sensor 402 installed, for example, in the wearable computing device 100. In this example, the sensor 402 may first function in a low-resolution streaming mode in which it streams and/or captures (or acquires) a low-resolution image stream (e.g., a plurality of consecutive image frames).

If an object of interest and/or ROI is detected and the detection meets a threshold condition, a bounding box for the object and/or region is identified. The threshold condition may represent a particular high quality level for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to some or all of determining whether the object or region is in focus, is still with low blur, and/or has the correct exposure. The bounding box may represent, for example, cropping coordinates for each of the four corners of the box. Identification of the region of interest or object, together with the cropping coordinates, may be a trigger that allows the sensor 402 to switch from the low-resolution streaming mode to the high-resolution streaming mode.

FIG. 7B is a flow diagram 700B illustrating the use of the image sensor 402 installed in the wearable computing device 100, in which the sensor 402 has been triggered to switch from the low-resolution streaming mode to the high-resolution streaming mode. The trigger for switching between these modes may include detection of one or more objects or regions of interest. Switching from the low-resolution streaming mode to the high-resolution streaming mode triggers the sensor 402 to begin capturing (or acquiring) high-resolution image frames while ceasing to capture low-resolution image frames.

In this example, at least one high-resolution image frame may be processed by the high-power image signal processor 406. One or more of the processed high-resolution image frames may be provided to a scaler 702 to be scaled (e.g., reduced in resolution) to produce a low-resolution image 704 (or image stream). The high-power image signal processor 406 may perform low-power computations 412 on the image 704 (or image stream). The low-power computations may include analyzing the at least one image 704 to detect, for example, objects and/or regions of interest 706 within the at least one image 704. Detection may include hand detection, object detection, paragraph detection, and the like. If an object of interest and/or ROI is detected and the detection meets a threshold condition, a bounding box for the object and/or region is identified. The threshold condition may represent a particular high quality level for the object or region in an image frame of the image stream. For example, the threshold condition may pertain to some or all of determining whether the object or region is in focus, is still with low blur, and/or has the correct exposure. The bounding box may represent, for example, cropping coordinates for each of the four corners of the box. The high-power image signal processor 406 may also buffer the full high-resolution image 708 in a buffer 710. The image 708 may be used for operations that can benefit from a high-resolution image.
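The resolution reduction performed by a scaler such as scaler 702 can be illustrated with simple block averaging. This is only a sketch under assumptions: the description does not say which scaling method is used, and the single-channel list-of-lists image and integer `factor` are chosen here for brevity.

```python
from typing import List

Image = List[List[int]]  # single-channel image as rows of pixel values


def downscale(image: Image, factor: int) -> Image:
    """Reduce resolution by averaging each factor x factor block of pixels.

    Rows and columns that do not fill a complete block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    rows = len(image) // factor
    cols = len(image[0]) // factor if image else 0
    out: Image = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) // len(block))  # integer mean of the block
        out.append(row)
    return out
```

Detection then runs on the smaller output, while the original full-resolution frame stays in the buffer for later cropping.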

FIG. 7C is a flow diagram 700C illustrating the use of the image sensor 402 installed in the wearable computing device 100. In this example, the sensor 402 switches to the high-resolution streaming mode in response to determining the cropping coordinates of the ROI 706. For example, once the ROI 706 (or object of interest) is determined and the threshold condition is met, the cropping coordinates (e.g., cropping coordinates 238) may be sent to, or obtained by, a cropper (e.g., cropper 236). The high-resolution image 708 may be retrieved from the buffer 710. The retrieved high-resolution image 708 is then cropped (712) to the region or object of interest according to the cropping coordinates (e.g., of ROI 706) determined from the low-resolution image 704. The high-resolution image 708 generally has the same capture time as the low-resolution image 704 in which the region or object of interest 706 was identified. The output of the cropping may include a cropped portion 714 of the high-resolution image. Additional downstream processing may be performed onboard or offboard the wearable computing device 100.

In some implementations, the sensor 402 may determine when to perform a fast dynamic switch between using the low-resolution streaming mode and the high-resolution streaming mode. For example, the sensor 402 may determine when to provide sensor data (e.g., image data, pixels, optical data, etc.) to the high-power image signal processor 406 or to the low-power image signal processor 408. This determination may be based on detected events occurring in connection with the image stream, for example, with the aim of reducing and/or minimizing power usage and system latency. Examples of detected events include detecting one or more of an object, an ROI, movement or a stop in movement, a lighting change, a new user, another user, another object, and the like.

In this example, the wearable computing device 100 may be triggered to operate on low-resolution content and may switch to high-resolution content. Because both the low-resolution and high-resolution versions of the image stream are available, the wearable computing device 100 can select between the two streams without retrieving content from buffered data. This can save the computational cost of switching, the memory cost, and the cost of retrieving content from memory.

FIGS. 8A and 8B illustrate example flow diagrams 800A and 800B for emulating a dual-resolution image sensor with a single image sensor to perform image processing tasks according to implementations described throughout this disclosure. In this example, a dual-stream output sensor (e.g., sensor 402) may be configured to output a first image stream with a low resolution and a wide field of view to represent a scene camera 802. Likewise, the sensor may be configured to output a second stream with a high resolution and a narrow field of view to represent a detail camera 804.

FIG. 8A is a flow diagram 800A illustrating the use of image sensors (e.g., scene camera 802 and detail camera 804) installed in a wearable computing device and configured to capture or receive streams of image data. The detail camera 804 may acquire (e.g., capture) and output a high-resolution image stream (e.g., a plurality of image frames). The plurality of high-resolution image frames may be stored in a buffer 806. In general, a high-power image signal processor 808 is optionally available to perform analysis of the high-resolution image frames.

The scene camera 802 may (simultaneously) acquire (e.g., capture) and output a plurality of low-resolution image frames. The low-resolution image frames may be provided to a low-power image signal processor 810. The processor 810 processes the low-resolution image frames to generate an image stream 812 (e.g., one or more images) that is debayered, color-corrected, and shade-corrected (e.g., a YUV-format image stream).

The low-power image signal processor 810 may perform low-power computations 814 to analyze the generated image stream 812 and detect objects and/or regions of interest 816 within the image(s) 812. Detection may include hand detection, object detection, paragraph detection, and the like. If an object of interest and/or ROI is detected and the detection meets a threshold condition, a bounding box for the object and/or region is identified. The bounding box may represent cropping coordinates for each of the four corners of the box. The threshold condition may represent, for example, a trigger event in the scene camera view that causes a switch to the detail camera for a single capture (or a brief burst of captures) based on the detected trigger event. The threshold condition (e.g., trigger event) may occur, for example, when a condition and/or item is recognized in the view of an image.

FIG. 8B is a flow diagram 800B illustrating the use of image sensors (e.g., scene camera 802 and detail camera 804) installed in a wearable computing device and configured to capture or receive streams of image data. As in FIG. 8A, the scene camera 802 may capture and output one or more low-resolution images (e.g., image frames), which may be processed by the low-power image signal processor 810 and/or subjected to the low-power computations 814 to generate a low-resolution image 812 and to determine an object and/or region of interest 816.

Once the object and/or region of interest 816 is identified and the threshold condition is determined to be met, a high-resolution image may be retrieved from the buffer 806. For example, the high-power image signal processor 808 may use the low-resolution image 812 and the cropping coordinates of the ROI 816 to identify the same ROI 816 within the high-resolution, narrower field-of-view image that is stored in the buffer 806 and was captured at the same time as the image 812.
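Locating the same ROI in the narrower field-of-view image requires mapping coordinates between the two views. The sketch below makes a strong simplifying assumption that is not stated in the source: the detail view is treated as a centered sub-window of the scene view that covers a known fraction (`coverage`) of its width and height, with no lens distortion. The function and parameter names are hypothetical.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def map_roi_to_detail(
    roi: Box,
    scene_size: Tuple[int, int],
    detail_size: Tuple[int, int],
    coverage: float,
) -> Optional[Box]:
    """Map an ROI from scene-image pixels to detail-image pixels.

    coverage is the assumed fraction (0 < coverage <= 1) of the scene image,
    per axis, that the centered detail view sees. Returns None when the ROI
    lies entirely outside the detail view.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    sw, sh = scene_size
    dw, dh = detail_size
    # Sub-window of the scene image that the detail view covers.
    win_w, win_h = sw * coverage, sh * coverage
    win_x, win_y = (sw - win_w) / 2.0, (sh - win_h) / 2.0
    # Detail pixels per scene pixel along each axis.
    sx, sy = dw / win_w, dh / win_h

    def clamp(value: float, upper: float) -> float:
        return min(max(value, 0.0), upper)

    x0 = clamp((roi[0] - win_x) * sx, dw)
    y0 = clamp((roi[1] - win_y) * sy, dh)
    x1 = clamp((roi[2] - win_x) * sx, dw)
    y1 = clamp((roi[3] - win_y) * sy, dh)
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)
```

Real devices would typically need a calibrated mapping between the two views; the centered-window model is used here only to show the shape of the computation.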

For example, the cropping coordinates (e.g., arrow 818) and the retrieved high-resolution image frame may be provided as inputs to the high-power image signal processor 808. The high-power image signal processor 808 may output a processed (e.g., converted, color-corrected, shade-corrected) image 820 that is cropped to the region of interest and/or object determined from the low-resolution image frame. This cropped, full-resolution image 820 may be provided to additional onboard or offboard devices for further processing.

In some implementations, a single high-resolution image, rather than a plurality of image frames, may be processed through the high-power image signal processor to save computation and/or device power. Further savings in computation and/or device power may also arise in additional downstream processing, because such downstream processing can operate on the cropped image 820 rather than on the full high-resolution image.

FIGS. 4A through 8B describe flow diagrams that utilize a single sensor to provide the functionality of dual sensors (e.g., a scene camera and a detail camera). In some implementations, the output of the single sensor may be configured to operate in dual mode without a separate sensor, because the sensor can output at least two streams at different resolutions and/or different fields of view. For example, a first stream may be configured to output a wide field of view as a low-resolution stream of images to represent the scene camera. Similarly, a second stream may be configured to output a narrow field of view as a high-resolution stream of images to represent the detail camera.

FIG. 9 is a flow diagram illustrating an example of a process 900 for performing image processing tasks on a computing device according to implementations described throughout this disclosure. In some implementations, the computing device is a battery-powered wearable computing device. In some implementations, the computing device is a battery-powered non-wearable computing device.

The process 900 may utilize an image processing system on a computing device having at least one processing device, a speaker, an optional display capability, and memory storing instructions that, when executed, cause the processing device to perform the plurality of operations and computer-implemented steps described in the claims. In general, the wearable computing device 100 and the systems 200 and/or 1000 may be used in the description and execution of the process 900. The combination of the wearable computing device 100 and the systems 200 and/or 1000 may, in some implementations, represent a single system. In general, the process 900 utilizes the systems and algorithms described herein to detect image data and to identify, in real time (e.g., at capture time), one or more regions of interest that can be cropped in order to reduce the amount of data used to perform image processing on the wearable computing device 100.

At block 902, the process 900 may include waiting to receive image content (or a request to identify image content). For example, the image sensor 216 on the wearable computing device 100 may receive, capture, or detect image content. In response to receiving the content, or a request to cause the sensor 216 to identify image content associated with optical data captured by the sensor, the sensor 216 and/or the processors 208, 222, and/or 224 may, at block 904, detect a first sensor data stream having a first image resolution, the first sensor data stream being based on the optical data. For example, the image sensor 216 may detect a low-resolution data stream based on optical data received at the image sensor 216 and provide that low-resolution data stream to the low-resolution image signal processor 222. In addition, at block 906, the sensor 216 and/or the processors 208, 222, and/or 224 may detect a second sensor data stream having a second image resolution, the second sensor data stream also being based on the optical data. For example, the image sensor 216 may detect a high-resolution data stream based on optical data received at the image sensor 216 and provide that high-resolution data stream to the high-resolution image signal processor 224.

In operation, the first sensor data stream and the second sensor data stream may be captured and/or received simultaneously and/or with correlated capture times that can be cross-referenced, so that a low-resolution image from the first sensor data stream can be obtained that matches the timestamp of a high-resolution image from the second sensor data stream.

At block 908, the process 900 includes identifying, by processing circuitry of the wearable computing device 100, at least one region of interest in the first sensor data stream. For example, one or more ROIs 232 may be identified within an image frame 226 acquired by the image sensor 216. The image frame 226 may have been analyzed by the low-resolution image signal processor 222 and/or the ROI detector 230. An ROI may be defined by pixels, edge detection coordinates, object shape or detection, or other region or object identification metrics. In some implementations, the ROI may be an object or other image content in the image stream.

In some implementations, the processing circuitry may include a first image processor (e.g., low-resolution image signal processor 222) configured to perform image signal processing on the first sensor data stream (e.g., the low-resolution stream) and a second image processor (e.g., high-resolution image signal processor 224) configured to perform image signal processing on the second sensor data stream (e.g., the high-resolution stream). In general, the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.

At block 910, the process 900 includes determining, by the processing circuitry, cropping coordinates 238 that define a first plurality of pixels in the at least one region of interest in the first sensor data stream. For example, the low-resolution image signal processor 222 may utilize the image frame 226, the analyzed resolution 228, and the ROI detector 230 to identify cropping coordinates that define an object 234 or a region of interest (ROI) 232.

At block 912, the process 900 includes generating, by the processing circuitry, a cropped image representing the at least one region of interest. Generating the cropped image may include the cropper 236, at block 914, triggering identification of a second plurality of pixels in the second sensor data stream using the cropping coordinates 238 that define the first plurality of pixels in the at least one region of interest in the first sensor data stream, and, at block 916, cropping the second sensor data stream to the second plurality of pixels. The generated cropped image may be a high-resolution version of the originally identified region of interest or object.
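Blocks 914 and 916 can be illustrated by scaling the cropping coordinates from the low-resolution frame to the high-resolution frame and then slicing out the corresponding pixels. The sketch assumes both frames cover the same field of view and differ only in resolution; `scale_box` and `crop` are hypothetical helper names, and the image is a plain list of rows.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def scale_box(box: Box, low_size: Tuple[int, int], high_size: Tuple[int, int]) -> Box:
    """Map a box from low-resolution pixels to high-resolution pixels.

    Start coordinates are rounded down and end coordinates rounded up so the
    high-resolution crop fully covers the low-resolution region.
    """
    lw, lh = low_size
    hw, hh = high_size
    x0, y0, x1, y1 = box
    return (
        (x0 * hw) // lw,
        (y0 * hh) // lh,
        -((-x1 * hw) // lw),  # ceiling division
        -((-y1 * hh) // lh),  # ceiling division
    )


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]
```

For instance, a box of (1, 1, 3, 2) found in a 4x4 low-resolution frame maps to (2, 2, 6, 4) in an 8x8 high-resolution frame, and only those pixels are passed downstream.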

In some implementations, generating the cropped image representing the at least one region of interest (or object) is performed in response to detecting that the at least one region of interest meets a threshold condition 256. In general, the threshold condition 256 may represent a particular high quality level for an object or region in an image frame of the image stream. In some implementations, the threshold condition 256 may include detecting that the second plurality of pixels in the second sensor data stream has low blur. In some implementations, the threshold condition 256 may include detecting that the second plurality of pixels has a particular image exposure measurement. In some implementations, the threshold condition 256 may include detecting that the second plurality of pixels is in focus.
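The description does not say how low blur is measured. As one commonly used stand-in, the sketch below computes the variance of a 4-neighbor Laplacian over a grayscale region, where higher variance suggests sharper content. The technique, the function names, and the `min_variance` cutoff are assumptions for illustration, not the disclosed method.

```python
from typing import List


def laplacian_variance(gray: List[List[float]]) -> float:
    """Variance of the 4-neighbor Laplacian over the interior pixels."""
    height = len(gray)
    width = len(gray[0]) if height else 0
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def has_low_blur(gray: List[List[float]], min_variance: float = 100.0) -> bool:
    """Treat a region as sharp enough when its Laplacian variance is high.

    min_variance is a hypothetical cutoff that would need tuning per sensor.
    """
    return laplacian_variance(gray) >= min_variance
```

A uniform patch yields a variance of zero and fails the check, while a patch with strong edges passes. Exposure and focus checks could be gated in a similar way with their own metrics.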

For example, the threshold condition may relate to some or all of determining whether an object or region is in focus, remains low in blur, and/or is correctly exposed. A bounding box may represent the excerpt coordinates that indicate the region of interest or object 414. Once the object or region of interest (e.g., the plurality of second pixels) is determined and the threshold condition(s) are satisfied, the high-power image signal processor may retrieve (or receive) a high-resolution image frame (e.g., a raw frame matching the capture time of the low-resolution image frame in which the object or region of interest was identified).

The excerpted high-resolution image may be used for further image analysis. For example, processing circuitry 224 may perform optical character recognition (OCR) on the excerpted image to generate a machine-readable version of the excerpted image (e.g., output 132). Processing circuitry 224 may then perform a search query using the machine-readable version of the excerpted image (e.g., image 418) to generate a plurality of search results (e.g., indicated in output 132), which may be triggered for display on display 244 of wearable computing device 100. In some implementations, instead of performing a search query, OCR may be performed on the excerpted image to generate output 132, and output 132 may be displayed to the user via wearable computing device 100, read aloud to the user, and/or otherwise provided as output for the user to consume. For example, audio output of the performed OCR may be provided from a speaker of wearable computing device 100.

In some implementations, the second sensor data stream is stored in memory of wearable computing device 100. For example, the high-resolution image stream may be stored in buffer 404 for later access to the image frames and metadata associated with the high-resolution image stream. In response to identifying the at least one region of interest (or object) in the first sensor data stream, processing circuitry 224 may retrieve the corresponding at least one region of interest (or object) in the second sensor data stored in memory. For example, the buffered data may be accessed to obtain a high-resolution version of the image content identified in the region of interest (or object) of the low-resolution image stream. In addition, wearable computing device 100 may trigger a restriction of access to the first sensor data stream (e.g., the low-resolution image stream) while continuing to detect and access the second sensor data stream (e.g., the high-resolution image stream). That is, once the object or region of interest is determined, wearable computing device 100 may no longer wish to spend power or resources on the low-resolution image stream, and may restrict access to that stream to avoid overuse of resources and/or power in streaming, storing, and/or accessing it.
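The buffering described above can be pictured with a small ring buffer keyed by capture time. This is a sketch under stated assumptions, not the structure of buffer 404: the class names, the millisecond timestamp field, and the tolerance parameter are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Optional


@dataclass(frozen=True)
class Frame:
    capture_time_ms: int  # hypothetical capture timestamp
    pixels: Any           # raw high-resolution frame data


class FrameBuffer:
    """Fixed-capacity buffer that keeps only the most recent high-resolution frames."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently discards the oldest frame when full.
        self._frames: Deque[Frame] = deque(maxlen=capacity)

    def push(self, frame: Frame) -> None:
        self._frames.append(frame)

    def find(self, capture_time_ms: int, tolerance_ms: int = 0) -> Optional[Frame]:
        """Return the buffered frame closest to the requested capture time, if any
        lies within the tolerance; otherwise return None."""
        best: Optional[Frame] = None
        for frame in self._frames:
            delta = abs(frame.capture_time_ms - capture_time_ms)
            if delta > tolerance_ms:
                continue
            if best is None or delta < abs(best.capture_time_ms - capture_time_ms):
                best = frame
        return best
```

When a region of interest is found in a low-resolution frame, its capture time would be passed to `find` to fetch the matching high-resolution frame. A bounded buffer keeps memory use fixed, at the cost of a miss if the frame has already been evicted.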

In some implementations, process 900 may also include, in response to generating the excerpted image representing the at least one region of interest, transmitting the generated excerpted image to a mobile device in communication with the computing device. For example, upon identifying and excerpting the high-resolution image on wearable computing device 100, wearable computing device 100 may transmit the excerpted image to mobile device 202 for further processing. For example, mobile device 202 may generate data or information about the region of interest or object. Because mobile device 202 may have more power and/or computational resources than wearable computing device 100, the generated data and/or information may be retrieved and/or computed on mobile device 202 and transmitted back to wearable computing device 100. Wearable computing device 100 may receive the information about the at least one region of interest from the mobile device and may cause that information to be displayed on display 244 of wearable computing device 100.

In some implementations, the first image resolution is a low image resolution and the second image resolution is a high image resolution. Identifying the at least one region of interest or object of interest may further include using a machine learning algorithm (e.g., via NN 254) executing on wearable computing device 100. The machine learning algorithm may use the first sensor data as input to identify text or at least one object represented in the first sensor data. For example, even though the first sensor data stream has a low resolution, the machine learning algorithm can determine whether text or an object is represented in the image stream.

For example, wearable computing device 100 may use a machine learning algorithm to determine whether particular image content (e.g., an object or ROI) is present (e.g., depicted or represented) in a low-resolution image. The extractor 236 may excerpt (crop) the determined unknown image content to the excerpt coordinates 238 and provide the excerpted image to a trained machine learning algorithm to determine whether the particular image content represents object(s) of interest and/or region(s) of interest. For example, the machine learning algorithm that determines the image content to be excerpted may identify one or more details of the captured image. The identified details may be analyzed to determine whether they include a recognizable object, symbol, object part, or the like.

For example, if a detail is identified as a recognizable object, wearable computing device 100 may attempt to recognize particular features of the object. For example, if the object is a soda can, wearable computing device 100 may evaluate which brand of soda it is. The machine learning algorithm may run a lightweight detector architecture on the low-resolution image, on the device, to identify the ROI (or object of interest) coordinates of a particular pattern. For example, the machine learning algorithm may determine, as output, that the brand of the soda can is a clustered text region identifiable by a pattern, and that the pattern can be excerpted and used for further analysis. For example, the pattern may be excerpted based on the machine learning algorithm output indicating the excerpt coordinates. These coordinates may then be used to select a portion of a high-resolution image of the same scene (e.g., the clustered text region). That portion may be further analyzed to determine the brand. Because only a portion of the high-resolution image is used, based on input retrieved from the low-resolution image, the machine learning algorithm may optimize energy use and latency on wearable computing device 100, since analysis of the entire high-resolution image is avoided.
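A rough feel for why avoiding full-frame analysis helps can be had by counting pixels. The sketch below is an illustrative assumption only: pixel count is used as a crude stand-in for compute, the function name is hypothetical, and the frame sizes are made-up numbers rather than values from the disclosure.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def analysed_pixel_fraction(low_size: Size, high_size: Size, crop_size: Size) -> float:
    """Pixels touched by the two-stage approach (whole low-resolution frame plus the
    high-resolution excerpt) as a fraction of the full high-resolution frame."""
    low_pixels = low_size[0] * low_size[1]
    crop_pixels = crop_size[0] * crop_size[1]
    high_pixels = high_size[0] * high_size[1]
    return (low_pixels + crop_pixels) / high_pixels


# Hypothetical sizes: a 320x240 detection frame, a 4000x3000 capture frame,
# and a 500x200 excerpt around a clustered text region.
fraction = analysed_pixel_fraction((320, 240), (4000, 3000), (500, 200))
```

With these assumed sizes the two-stage approach touches 176,800 pixels instead of 12,000,000, roughly 1.5% of the full frame. Actual energy and latency savings would depend on the models and hardware involved.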

Examples described throughout this disclosure may refer to computers and/or computing systems. As used herein, a computer (and/or computing) system includes, but is not limited to, any suitable combination of one or more devices configured with hardware, firmware, and software to perform one or more of the computerized techniques described herein. A computer (and/or computing) system as used herein may be a single computing device or multiple computing devices operating collectively, with the storage of data and the execution of functions distributed among the various computing devices.

Examples described throughout this disclosure may refer to augmented reality (AR). As used herein, AR refers to a user experience in which a computer system facilitates a sensory perception that includes at least one virtual aspect and at least one aspect of reality. An AR experience may be provided by any of several types of computer systems, including, but not limited to, a battery-powered wearable computing device or a battery-powered non-wearable computing device. In some implementations, the wearable computing device may include an AR headset, which may include, but is not limited to, AR glasses, another wearable AR device, a tablet, a watch, or a laptop computer.

In some types of AR experiences, the user can perceive the aspect of reality directly with their senses, without intermediation by the computer system. For example, some AR glasses, such as wearable computing device 100, are designed to transmit an image (e.g., the virtual aspect to be perceived) to the user's retina while also allowing the eye to register other light that was not generated by the AR glasses. As another example, an in-lens micro display may be embedded in a see-through lens, or a projected display may be overlaid on a see-through lens. In other types of AR experiences, the computer system may enhance, supplement, alter, and/or enable the user's impression of reality (e.g., the real aspect to be perceived) in one or more ways. In some implementations, the AR experience is perceived on a screen of a display device of the computer system. For example, some AR headsets and/or AR glasses are designed with camera feedthrough to present a camera image of the user's surrounding environment on a display device positioned in front of the user's eyes.

FIG. 10 illustrates an example of a computer device 1000 and a mobile computer device 1050 that may be used with the techniques described herein (e.g., to implement wearable computing device 100, such as a client computing device, server computing device 204, and/or mobile device 202). Computing device 1000 includes a processor 1002, memory 1004, a storage device 1006, a high-speed interface 1008 connecting to memory 1004 and high-speed expansion ports 1010, and a low-speed interface 1012 connecting to low-speed bus 1014 and storage device 1006. Each of the components 1002, 1004, 1006, 1008, 1010, and 1012 is interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. Processor 1002 can process instructions for execution within computing device 1000, including instructions stored in memory 1004 or on storage device 1006, to display graphical information for a GUI on an external input/output device, such as display 1016 coupled to high-speed interface 1008. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. In addition, multiple computing devices 1000 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

Memory 1004 stores information within computing device 1000. In one implementation, memory 1004 is a volatile memory unit or units. In another implementation, memory 1004 is a non-volatile memory unit or units. Memory 1004 may also be another form of computer-readable medium, such as a magnetic or optical disk.

Storage device 1006 is capable of providing mass storage for computing device 1000. In one implementation, storage device 1006 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations. A computer program product can be tangibly embodied in an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as memory 1004, storage device 1006, or memory on processor 1002.

High-speed controller 1008 manages bandwidth-intensive operations for computing device 1000, while low-speed controller 1012 manages lower bandwidth-intensive operations. Such an allocation of functions is only an example. In one implementation, high-speed controller 1008 is coupled to memory 1004, to display 1016 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 1010, which may accept various expansion cards (not shown). In the implementation, low-speed controller 1012 is coupled to storage device 1006 and low-speed expansion port 1014. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, or a scanner, or to a networking device such as a switch or router, e.g., through a network adapter.

Computing device 1000 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 1020, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 1024. In addition, it may be implemented in a personal computer such as a laptop computer 1022. Alternatively, components from computing device 1000 may be combined with other components in a mobile device (not shown), such as device 1050. Each of such devices may contain one or more of computing devices 1000, 1050, and an entire system may be made up of multiple computing devices 1000, 1050 communicating with each other.

Computing device 1050 includes a processor 1052, memory 1064, an input/output device such as a display 1054, a communication interface 1066, and a transceiver 1068, among other components. Device 1050 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 1050, 1052, 1064, 1054, 1066, and 1068 is interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.

Processor 1052 can execute instructions within computing device 1050, including instructions stored in memory 1064. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. The processor may provide, for example, for coordination of the other components of device 1050, such as control of user interfaces, applications run by device 1050, and wireless communication by device 1050.

Processor 1052 may communicate with a user through control interface 1058 and display interface 1056 coupled to display 1054. Display 1054 may be, for example, a TFT LCD (thin-film-transistor liquid crystal display), an LED (light-emitting diode) or OLED (organic light-emitting diode) display, or other appropriate display technology. Display interface 1056 may include appropriate circuitry for driving display 1054 to present graphical and other information to a user. Control interface 1058 may receive commands from a user and convert them for submission to processor 1052. In addition, an external interface 1062 may be provided in communication with processor 1052, so as to enable near-area communication of device 1050 with other devices. External interface 1062 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.

Memory 1064 stores information within computing device 1050. Memory 1064 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. Expansion memory 1074 may also be provided and connected to device 1050 through expansion interface 1072, which may include, for example, a SIMM (Single In-Line Memory Module) card interface. Such expansion memory 1074 may provide extra storage space for device 1050, or may also store applications or other information for device 1050. Specifically, expansion memory 1074 may include instructions to carry out or supplement the processes described above, and may also include secure information. Thus, for example, expansion memory 1074 may be provided as a security module for device 1050, and may be programmed with instructions that permit secure use of device 1050. In addition, secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.

The memory may include, for example, flash memory and/or NVRAM memory, as discussed below. In one implementation, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as memory 1064, expansion memory 1074, or memory on processor 1052, that may be received, for example, over transceiver 1068 or external interface 1062.

Device 1050 may communicate wirelessly through communication interface 1066, which may include digital signal processing circuitry where necessary. Communication interface 1066 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 1068. In addition, short-range communication may occur, such as by using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, GPS (Global Positioning System) receiver module 1070 may provide additional navigation-related and location-related wireless data to device 1050, which may be used as appropriate by applications running on device 1050.

Device 1050 may also communicate audibly using audio codec 1060, which may receive spoken information from a user and convert it to usable digital information. Audio codec 1060 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of device 1050. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.), and may also include sound generated by applications operating on device 1050.

Computing device 1050 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 1080. It may also be implemented as part of a smartphone 1082, a personal digital assistant (PDA), or other similar mobile device.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (an LED, OLED, or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well. For example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described herein can be implemented in a computing system that includes a back-end component (e.g., a data server), that includes a middleware component (e.g., an application server), that includes a front-end component (e.g., a client computer having a graphical user interface, a web browser, or an app through which a user can interact with an implementation of the subject matter described herein), or that includes any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, such as a communication network. Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the Internet.

A computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

In some implementations, the computing devices shown in the figures may include sensors that interface with an AR headset/HMD device 1090 to generate an augmented environment for viewing inserted content within the physical space. For example, one or more sensors included on computing device 1050, or on another computing device shown in the figures, may provide input to AR headset 1090 or, in general, provide input to an AR space. The sensors may include, but are not limited to, a touchscreen, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors. Computing device 1050 may use the sensors to determine an absolute position and/or a detected rotation of the computing device in the AR space, which can then be used as input to the AR space. For example, computing device 1050 may be incorporated into the AR space as a virtual object, such as a controller, a laser pointer, a keyboard, a weapon, etc. Positioning of the computing device/virtual object by the user when incorporated into the AR space may allow the user to position the computing device so as to view the virtual object in a particular manner in the AR space.

In some implementations, one or more input devices included on, or connected to, computing device 1050 may be used as input to the AR space. The input devices may include, but are not limited to, a touchscreen, a keyboard, one or more buttons, a trackpad, a touchpad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, earphones or buds with input functionality, a gaming controller, or other connectable input devices. A user interacting with an input device included on computing device 1050 when the computing device is incorporated into the AR space can cause a particular action to occur in the AR space.

In some implementations, a touchscreen of computing device 1050 may be rendered as a touchpad in the AR space. A user may interact with the touchscreen of computing device 1050. The interactions are rendered, for example in AR headset 1090, as movements on the rendered touchpad in the AR space. The rendered movements can control virtual objects in the AR space.

In some implementations, one or more output devices included on computing device 1050 may provide output and/or feedback to a user of AR headset 1090 in the AR space. The output and feedback may be visual, tactile, or audio. The output and/or feedback may include, but is not limited to, vibrations, turning one or more lights or strobes on and off or blinking and/or flashing them, sounding an alarm, playing a chime, playing a song, and playing an audio file. The output devices may include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light emitting diodes (LEDs), strobes, and speakers.

In some implementations, computing device 1050 may appear as another object in a computer-generated 3D environment. Interactions by the user with computing device 1050 (e.g., rotating, shaking, touching a touchscreen, swiping a finger across a touchscreen) may be interpreted as interactions with the object in the AR space. In the example of the laser pointer in an AR space, computing device 1050 appears as a virtual laser pointer in the computer-generated 3D environment. As the user manipulates computing device 1050, the user in the AR space sees movement of the laser pointer. The user receives feedback from interactions with computing device 1050 in the AR environment on computing device 1050 or on AR headset 1090. The user's interactions with the computing device may be translated into interactions with a user interface generated in the AR environment for a controllable device.

In some implementations, computing device 1050 may include a touchscreen. For example, a user may interact with the touchscreen to interact with a user interface for a controllable device. For example, the touchscreen may include user interface elements, such as sliders, that can control properties of the controllable device.

Computing device 1000 is intended to represent various forms of digital computers and devices, including, but not limited to, laptops, desktops, workstations, PDAs, servers, blade servers, mainframes, and other appropriate computers. Computing device 1050 is intended to represent various forms of mobile devices, such as PDAs, cellular telephones, smartphones, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of this specification.

In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other embodiments are within the scope of the following claims.

Further to the descriptions above, a user may be provided with controls allowing the user to make an election as to both whether systems, programs, or features described herein may enable collection of user information (e.g., information about the user's social network, social actions or activities, profession, preferences, or current location) and whether the user is sent content or communications from a server. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, a user's identity may be treated so that no personally identifiable information can be determined for the user, or a user's geographic location may be generalized where location information is obtained (e.g., to a city, ZIP code, or state level), so that a particular location of the user cannot be determined. Thus, the user may have control over what information is collected about the user, how that information is used, and what information is provided to the user.
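The location generalization mentioned above can be illustrated with a minimal sketch. The `Location` record, its field names, and the `generalize` helper are hypothetical and are not taken from this document; the sketch assumes that generalizing to a given level means discarding every field finer than that level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    """Hypothetical location record; all field names are assumptions."""
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    city: Optional[str] = None
    postal_code: Optional[str] = None
    state: Optional[str] = None


# Fields that survive at each generalization level (assumed policy).
_KEPT_FIELDS = {
    "city": ("city", "state"),
    "postal_code": ("postal_code", "state"),
    "state": ("state",),
}


def generalize(loc: Location, level: str) -> Location:
    """Return a copy of loc that keeps only the fields allowed at `level`."""
    if level not in _KEPT_FIELDS:
        raise ValueError(f"unknown generalization level: {level!r}")
    # Every field not listed for this level falls back to its default, None.
    return Location(**{name: getattr(loc, name) for name in _KEPT_FIELDS[level]})
```

Calling `generalize(loc, "state")` on a record that holds precise coordinates returns a record in which only the state remains, so the stored value no longer identifies a particular location.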

While certain features of the described implementations have been illustrated as described herein, many modifications, substitutions, changes, and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the implementations. It should be understood that they have been presented by way of example only, not limitation, and that various changes in form and details may be made. Any portion of the devices and/or methods described herein may be combined in any combination, except mutually exclusive combinations. The implementations described herein may include various combinations and/or sub-combinations of the functions, components, and/or features of the different implementations described.

Claims

An image processing method, comprising:
in response to receiving a request to cause a sensor of a computing device to identify image content associated with optical data captured by the sensor:
detecting a first sensor data stream having a first image resolution, the first sensor data stream being based on the optical data; and
detecting a second sensor data stream having a second image resolution, the second sensor data stream being based on the optical data;
identifying, by processing circuitry of the computing device, at least one region of interest in the first sensor data stream;
determining, by the processing circuitry, excerpt coordinates defining a first plurality of pixels in the at least one region of interest in the first sensor data stream; and
generating, by the processing circuitry, an excerpt image representing the at least one region of interest, wherein the generating includes:
identifying a second plurality of pixels in the second sensor data stream using the excerpt coordinates that define the first plurality of pixels in the at least one region of interest in the first sensor data stream, and
cropping the second sensor data stream to the second plurality of pixels.
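The last two steps of the method claim above (reusing coordinates found in the lower-resolution stream to select and cut out pixels of the higher-resolution stream) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: it assumes both frames cover the same field of view, that a region is an axis-aligned box, and that a frame is a list of pixel rows. The names `Box`, `scale_box`, and `crop` are hypothetical.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def scale_box(box: Box, low_size: Tuple[int, int], high_size: Tuple[int, int]) -> Box:
    """Map a box from low-resolution to high-resolution pixel coordinates.

    Sizes are (width, height). Assumes both frames show the same scene.
    """
    low_w, low_h = low_size
    high_w, high_h = high_size
    left, top, right, bottom = box
    # Floor the top-left corner and ceil the bottom-right corner so the
    # scaled box fully covers the region found in the low-resolution frame.
    return (
        left * high_w // low_w,
        top * high_h // low_h,
        -(-right * high_w // low_w),
        -(-bottom * high_h // low_h),
    )


def crop(frame: List[List[int]], box: Box) -> List[List[int]]:
    """Return the pixels of `frame` (a list of rows) that fall inside `box`."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


# Usage: a region found in a 4x3 frame is cut out of the matching 8x6 frame.
high_frame = [[y * 8 + x for x in range(8)] for y in range(6)]
high_box = scale_box((1, 1, 3, 2), low_size=(4, 3), high_size=(8, 6))
excerpt = crop(high_frame, high_box)
```

Only the small low-resolution frame needs to be analyzed to find the region; the high-resolution frame is touched only for the final cut, which is consistent with the low-power aim stated in the title.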

The image processing method of claim 1, further comprising:
performing, by the processing circuitry, optical character resolution on the excerpt image to generate a machine-readable version of the excerpt image;
performing, by the processing circuitry, a search query using the machine-readable version of the excerpt image to generate a plurality of search results; and
displaying the search results on a display of the computing device.

The image processing method of claim 1 or 2, wherein:
the second sensor data stream is stored in a memory of the computing device, and
the image processing method further comprises:
in response to identifying the at least one region of interest in the first sensor data stream, retrieving a corresponding at least one region of interest in the second sensor data stream stored in the memory; and
restricting access to the first sensor data stream while continuing to detect and access the second sensor data stream.

The image processing method of any one of the preceding claims, further comprising:
in response to generating the excerpt image representing the at least one region of interest, transmitting the generated excerpt image to a mobile device in communication with the computing device;
receiving, from the mobile device, information about the at least one region of interest; and
displaying the information on a display of the computing device.

The image processing method of any one of the preceding claims, wherein:
the computing device is a battery-powered computing device;
the first image resolution is a low image resolution and the second image resolution is a high image resolution; and
identifying the at least one region of interest further includes:
identifying, by a machine learning algorithm executing on the computing device and using the first sensor data as input, text or at least one object represented in the first sensor data.

The image processing method of any one of the preceding claims, wherein the processing circuitry includes at least:
a first image processor configured to perform image signal processing on the first sensor data stream; and
a second image processor configured to perform image signal processing on the second sensor data stream, wherein the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.

The image processing method of any one of the preceding claims, wherein generating the excerpt image representing the at least one region of interest is performed in response to detecting that the at least one region of interest meets a threshold condition, the threshold condition including detecting that the second plurality of pixels in the second sensor data stream has low blur.
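The claim above does not say how low blur is measured. One common heuristic, used here purely as an assumption, is the variance of a discrete Laplacian: sharp regions contain strong edges and give a high variance, while blurred regions give a low one. The function names and the default `min_sharpness` value are hypothetical.

```python
from typing import List


def laplacian_variance(gray: List[List[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over the interior pixels."""
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0  # region too small to have interior pixels
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def has_low_blur(gray: List[List[float]], min_sharpness: float = 100.0) -> bool:
    """Assumed threshold condition: sharp enough to be worth cropping."""
    return laplacian_variance(gray) >= min_sharpness
```

A gate of this kind would be applied to the pixels selected in the higher-resolution stream before the excerpt image is generated, so that a blurred region is skipped rather than passed on to later processing.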

A wearable computing device, comprising:
at least one processing device;
at least one image sensor configured to capture optical data; and
a memory storing instructions that, when executed, cause the wearable computing device to perform operations, the operations comprising:
in response to receiving a request to cause the at least one image sensor to identify image content associated with the optical data:
detecting a first sensor data stream having a first image resolution, the first sensor data stream being based on the optical data; and
detecting a second sensor data stream having a second image resolution, the second sensor data stream being based on the optical data;
identifying at least one region of interest in the first sensor data stream;
determining excerpt coordinates defining a first plurality of pixels in the at least one region of interest in the first sensor data stream; and
generating an excerpt image representing the at least one region of interest, wherein the generating includes: identifying a second plurality of pixels in the second sensor data stream using the excerpt coordinates that define the first plurality of pixels in the at least one region of interest in the first sensor data stream; and cropping the second sensor data stream to the second plurality of pixels.

The wearable computing device of claim 8, wherein the at least one image sensor is a dual stream image sensor configured to operate in a low image resolution mode until triggered to switch to operating in a high image resolution mode.
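The mode switching described in the claim above can be pictured as a two-state machine. The sketch below is only illustrative: the claim does not name the trigger, so detection of a region of interest is assumed here, and the `Mode` and `DualStreamSensor` names are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    LOW = "low"
    HIGH = "high"


class DualStreamSensor:
    """Toy model of a sensor that idles in a low-resolution mode."""

    def __init__(self) -> None:
        self.mode = Mode.LOW  # start in the low-power, low-resolution mode

    def on_frame(self, roi_found: bool) -> Mode:
        """Process one frame; switch to HIGH when the assumed trigger fires."""
        if self.mode is Mode.LOW and roi_found:
            self.mode = Mode.HIGH
        return self.mode

    def reset(self) -> None:
        """Return to the low-resolution mode, e.g. after the excerpt is taken."""
        self.mode = Mode.LOW
```

Staying in the low-resolution mode until something of interest appears keeps the high-resolution path idle for most frames.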

The wearable computing device of claim 8 or 9, wherein the operations further comprise:
performing optical character resolution on the excerpt image to generate a machine-readable version of the excerpt image;
performing a search query using the machine-readable version of the excerpt image to generate a plurality of search results; and
causing audio output of the performed optical character resolution from a speaker of the wearable computing device.

The wearable computing device of any one of the preceding claims, wherein the second sensor data stream is stored in a memory of the wearable computing device, and the operations further comprise:
in response to identifying the at least one region of interest in the first sensor data stream, retrieving a corresponding at least one region of interest in the second sensor data stream stored in the memory; and
restricting access to the first sensor data stream while continuing to detect and access the second sensor data stream.

The wearable computing device of any one of the preceding claims, wherein the operations further comprise:
in response to generating the excerpt image representing the at least one region of interest, transmitting the generated excerpt image to a mobile device in communication with the wearable computing device;
receiving, from the mobile device, information about the at least one region of interest; and
outputting the information from the wearable computing device.

The wearable computing device of any one of the preceding claims, wherein:
the first image resolution is a low image resolution and the second image resolution is a high image resolution; and
identifying the at least one region of interest further includes:
identifying, by a machine learning algorithm executing on the wearable computing device and using the first sensor data as input, text or at least one object represented in the first sensor data.

The wearable computing device of any one of the preceding claims, wherein the at least one processing device includes:
a first image processor configured to perform image signal processing on the first sensor data stream; and
a second image processor configured to perform image signal processing on the second sensor data stream, wherein the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.

A non-transitory computer-readable medium having instructions stored thereon that, when executed by processing circuitry, cause a wearable computing device to:
detect a first sensor data stream having a first image resolution, the first sensor data stream being based on optical data captured by a sensor of the wearable computing device;
detect a second sensor data stream having a second image resolution, the second sensor data stream being based on the optical data;
identify at least one region of interest in the first sensor data stream;
determine excerpt coordinates that define a first plurality of pixels in the at least one region of interest in the first sensor data stream; and
generate an excerpt image representing the at least one region of interest,
wherein the generating includes:
identifying a second plurality of pixels in the second sensor data stream using the excerpt coordinates that define the first plurality of pixels in the at least one region of interest in the first sensor data stream, and
cropping the second sensor data stream to the second plurality of pixels.

The non-transitory computer-readable medium of claim 15, wherein the processing circuitry is further configured to cause the wearable computing device to:
perform optical character resolution on the excerpt image to generate a machine-readable version of the excerpt image;
perform a search query using the machine-readable version of the excerpt image to generate a plurality of search results; and
display the search results on a display of the wearable computing device.

The non-transitory computer-readable medium of claim 15 or 16, wherein:
the second sensor data stream is stored in a memory of the wearable computing device, and
the processing circuitry is further configured to cause the wearable computing device to:
in response to identifying the at least one region of interest in the first sensor data stream, retrieve a corresponding at least one region of interest in the second sensor data stream stored in the memory; and
restrict access to the first sensor data stream while continuing to detect and access the second sensor data stream.

The non-transitory computer-readable medium of any one of the preceding claims, wherein the processing circuitry is further configured to cause the wearable computing device to:
in response to generating the excerpt image representing the at least one region of interest, transmit the generated excerpt image to a mobile device in communication with the wearable computing device;
receive, from the mobile device, information about the at least one region of interest; and
display the information on a display of the wearable computing device.

The non-transitory computer-readable medium of any one of the preceding claims, wherein:
the first image resolution is a low image resolution and the second image resolution is a high image resolution; and
identifying the at least one region of interest further includes:
identifying, by a machine learning algorithm executing on the wearable computing device and using the first sensor data as input, text or at least one object represented in the first sensor data.

The non-transitory computer-readable medium of any one of the preceding claims, wherein the processing circuitry includes at least:
a first image processor configured to perform image signal processing on the first sensor data stream; and
a second image processor configured to perform image signal processing on the second sensor data stream, wherein the first image resolution of the first sensor data stream is lower than the second image resolution of the second sensor data stream.