TW202420241A - Image processing method and electronic device - Google Patents
Image processing method and electronic device
- Publication number
- TW202420241A (application TW111142674A)
- Authority
- TW
- Taiwan
- Prior art keywords
- width
- virtual camera
- image
- virtual
- processing method
- Prior art date
Description
The present disclosure relates to an image processing method, and more particularly to an image processing method for a virtual scene and to an electronic device.
In today's environment, the proportion of people asked to work remotely has risen sharply, which has made many practitioners aware of the benefits of remote collaboration. Collaborative settings are popular for many reasons; one is that collaborators no longer need to spend hours or days traveling to meet in person, and can instead communicate and collaborate reliably in a virtual world.
In some cases, however, collaborators must own specific equipment to collaborate effectively with others in the virtual world, for example a virtual-reality head-mounted display that tracks head movement. Such head-mounted displays are typically paired with a set of controllers equipped with tracking sensors.
Furthermore, virtual applications usually require fairly powerful computing hardware to ensure a comfortable experience. Because of these issues, a large gap remains between collaborators joining from the real world and those in the virtual world.
Therefore, how to reduce the need for high-end equipment to analyze head pose, while ensuring that collaborators can work together in a relaxed and immersive collaboration system, is an important issue in this field.
The present disclosure provides an image processing method including the following steps: analyzing a plurality of facial feature points of a facial frame; calculating a feature width from the facial feature points and analyzing a head pose from the facial feature points; updating the feature width according to the head pose to produce an updated width; calculating a scale ratio of the updated width relative to an initial width; controlling a shooting distance of a virtual camera in a virtual scene according to the scale ratio; and sampling a two-dimensional image from the virtual scene according to the shooting distance of the virtual camera.
The present disclosure also provides an electronic device including an image sensor, a processor and a display. The image sensor captures images. The processor is electrically coupled to the image sensor and the display, and performs the following steps: analyzing a plurality of facial feature points of a facial frame in an image; calculating a feature width from the facial feature points and analyzing a head pose from the facial feature points; updating the feature width according to the head pose to produce an updated width; calculating a scale ratio of the updated width relative to an initial width; controlling a shooting distance of a virtual camera in a virtual scene according to the scale ratio; and sampling a two-dimensional image from the virtual scene according to the shooting distance. The display displays the two-dimensional image.
In summary, the image processing method and electronic device of the present disclosure control the shooting distance of a virtual camera by analyzing the facial position in an image, and convert the three-dimensional scene into a two-dimensional image according to the virtual camera's field of view, thereby providing users with an immersive experience in video conferencing, self-portrait imaging, scenery presentation and other interactive image processing applications.
The following embodiments are described in detail with reference to the accompanying drawings, but the embodiments provided are not intended to limit the scope of the disclosure, and the description of structural operation is not intended to limit the order of execution. Any device produced by recombining components that yields equivalent functionality falls within the scope of the disclosure. In addition, the drawings are for illustration only and are not drawn to scale. For ease of understanding, identical or similar elements are labeled with the same reference symbols in the following description.
Unless otherwise noted, terms used throughout the specification and claims have their ordinary meaning in the art, within the context of this disclosure, and in the specific context in which each term is used.
In addition, the terms "comprise", "include", "have", "contain" and the like used herein are open-ended terms meaning "including but not limited to". Furthermore, "and/or" as used herein includes any one of the listed items and all combinations of one or more of them.
Herein, when an element is referred to as being "coupled" or "connected", it may mean "electrically coupled" or "electrically connected". "Coupled" or "connected" may also indicate that two or more elements cooperate or interact with each other. In addition, although the terms "first", "second" and so on are used herein to describe different elements, these terms serve only to distinguish elements or operations described with the same technical terms.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of the electronic device 100 capturing an image IMG according to some embodiments of the present disclosure, and FIG. 2 is a schematic diagram of the electronic device 100 according to some embodiments of the present disclosure. In some embodiments, the electronic device 100 may be implemented as a personal computer, tablet, smartphone or other electronic device with image sensing and computing capability. In some embodiments, the electronic device 100 includes an image sensor 110, a processor 120, a memory device 130 and a display 140.
In some embodiments, the present disclosure uses widely deployed webcams to obtain video data from remote collaborators, and enhances the immersive experience by relating the local user's head movement to the virtual world through the electronic device 100.
To achieve this, a facial-feature-detection neural network model is used to estimate the head pose from webcam images of the local user. The estimated head pose is linked to the motion of the virtual camera to reduce the gap between events occurring in real time in the real world and in the virtual world, yielding a more immersive user experience without additionally configured capture or image sensing devices.
In other words, a laptop, mobile phone, tablet, desktop computer or other electronic device with computing, display and image sensing capability that renders the virtual world of a three-dimensional scene for collaboration can serve as a portal or window into the virtual world.
In some embodiments, using the electronic device 100 of the present disclosure as the interface for interacting with the virtual world reduces the need to wear head-mounted equipment that senses head pose, so that a three-dimensional viewing experience can be achieved on the display of a general-purpose electronic device (for example, an ordinary laptop, mobile phone or desktop computer) without a head-mounted device with Virtual Reality or Augmented Reality capability (for example, a head-mounted display).
In some embodiments, the electronic device 100 of the present disclosure may also be used together with Virtual Reality or Augmented Reality technology; the disclosure is not limited in this regard.
The processor 120 may be implemented as a central processing unit, a microprocessor, a graphics processor, a Field-Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), or another hardware device suitable for fetching or executing instructions stored in the memory device 130. The memory device 130 may be implemented as an electrical, magnetic or optical memory device, or another storage device that stores instructions or data.
The memory device 130 may be implemented as volatile or non-volatile memory. In some embodiments, the memory device 130 may be implemented as Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Magnetoresistive Random Access Memory (MRAM), Phase-Change Random Access Memory (PCRAM), or another storage device. The memory device 130 stores data and instructions for the processor 120 to access and operate on.
The image sensor 110 may be implemented as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor, a CCD (Charge-Coupled Device) image sensor, or another light-sensing component or light-sensing device. The image sensor 110 is electrically coupled to the processor 120.
The image sensor 110 of the electronic device 100 captures the image IMG.
When the user enters the field of view FOV of the image sensor 110, the electronic device 100 uses a neural network to detect the facial frame Ff in the image IMG captured by the image sensor 110, so that the electronic device 100 can analyze the depth variation of the user's face along the axis Ay and its displacement in the vertical plane perpendicular to the axis Ay, and adjust the image shown on the display 140 according to this depth variation and planar displacement. How the processor 120 analyzes the depth variation and displacement of the user's face and adjusts the displayed image is described in detail in the following embodiments.
Please refer to FIG. 1 through FIG. 10B together. FIG. 3 is a flowchart of the image processing method 200 according to some embodiments of the present disclosure. FIG. 4 is a schematic diagram of the convolutional neural network CNN according to some embodiments of the present disclosure. FIG. 5 is a schematic diagram of the specification of the convolutional neural network CNN of FIG. 4. FIG. 6 is a schematic diagram of the user's head pose HP. FIG. 7 is a schematic diagram of the facial feature points FPs of FIG. 4. FIG. 8 is a schematic diagram of the center position f_c and the feature width f_w of the user's face. FIG. 9A and FIG. 9B are schematic diagrams of the feature width f_w of the user's face and the updated width f_w'. FIG. 10A and FIG. 10B are schematic diagrams of the virtual camera in the virtual scene according to some embodiments of the present disclosure.
As shown in FIG. 3, the image processing method 200 includes steps S210 to S270. Step S210 is performed by the image sensor 110, step S270 by the display 140, and steps S220 to S260 by the processor 120.
In step S210, a two-dimensional image is captured by the image sensor 110: the image sensor 110 of the electronic device 100 captures the two-dimensional image IMG, as shown in FIG. 1. When the user enters the field of view of the image sensor 110, the electronic device 100 uses a neural network to detect the facial frame Ff in the captured image IMG.
In step S220, the facial feature points FPs of the facial frame Ff in the image IMG are analyzed. In some embodiments, the processor 120 analyzes the facial feature points FPs of the facial frame Ff in the image IMG using a facial-feature neural network model CNN. In some embodiments, the model CNN is implemented as a convolutional neural network architecture comprising eight convolutional layers cov_1 to cov_8, two fully connected layers dense_1 and dense_2, and four pooling layers pool_1 to pool_4. The input of this architecture is the image IMG, and for one facial frame Ff it outputs the (x, z) coordinates of 68 facial feature points FPs, as shown in FIG. 4, FIG. 5 and FIG. 7.
In step S230, the feature width f_w, the center position f_c and the head pose HP of the facial frame Ff are analyzed from the facial feature points FPs. In some embodiments, the center position f_c is the midpoint between the two eyes and the feature width f_w is the interpupillary distance. In other embodiments, the feature width f_w may instead be computed from other facial feature points FPs as the inner eye distance, the outer eye distance, the palpebral fissure width, the face width, the mandible width, the mouth width or the nose width of the facial frame Ff; the disclosure is not limited in this regard.
For example, if the center position f_c of the face is the midpoint between the two eyes and the feature width f_w is the interpupillary distance, the processor 120 extracts from the facial feature points FPs the inner and outer corner positions p_43 and p_46 of the left eye and the inner and outer corner positions p_37 and p_40 of the right eye.

The processor 120 averages the left-eye corner positions p_43 and p_46 to obtain the left-eye position p_l, and averages the right-eye corner positions p_37 and p_40 to obtain the right-eye position p_r, as expressed by the following formulas:

p_l = (p_43 + p_46) / 2
p_r = (p_37 + p_40) / 2
The processor 120 averages the left-eye position p_l and the right-eye position p_r to obtain the center position f_c of the face, and takes the difference between the left-eye position p_l and the right-eye position p_r as the feature width f_w, as expressed by the following formulas:

f_c = (p_l + p_r) / 2
f_w = ||p_l - p_r||
The processor 120 then computes the displacement d_f of the center position f_c in the two-dimensional plane and the scale ratio r_f of the feature width f_w, as expressed by the following formulas:

d_f = f_c - f_c,0
r_f = f_w / f_w,0
In the above formulas, f_c,0 denotes the initial value of the center position and f_w,0 denotes the initial value of the feature width.
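The landmark geometry above can be sketched as follows. This is a minimal NumPy sketch, assuming the common 1-based 68-point landmark convention (p_37 to p_46 as eye corners, converted here to 0-based array indices) and the Euclidean norm for f_w; the index numbering and the norm choice are assumptions, not taken verbatim from the disclosure.

```python
import numpy as np

def face_metrics(landmarks, f_c0, f_w0):
    """Compute center position, feature width, displacement and scale ratio.

    landmarks: (68, 2) array of (x, z) facial feature points; the indices
    below follow the common 1-based 68-point convention (an assumption),
    converted to 0-based array indices.
    """
    p_r = (landmarks[36] + landmarks[39]) / 2.0   # right eye: corners p_37 and p_40
    p_l = (landmarks[42] + landmarks[45]) / 2.0   # left eye: corners p_43 and p_46
    f_c = (p_l + p_r) / 2.0                       # center position between the eyes
    f_w = float(np.linalg.norm(p_l - p_r))        # interpupillary feature width
    d_f = f_c - np.asarray(f_c0, dtype=float)     # displacement from initial center
    r_f = f_w / f_w0                              # scale ratio vs. initial width
    return f_c, f_w, d_f, r_f
```

A frame where the eyes sit twice as far apart as at setup yields r_f = 2, signaling that the face has moved closer to the sensor.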
Thus, if the user approaches the electronic device 100, the user's face becomes larger in the image IMG captured by the image sensor 110; conversely, if the user moves away from the electronic device 100, the face becomes smaller in the image IMG. The scale ratio r_f of the feature width f_w in the facial frame Ff therefore allows the change in depth/distance between the user and the electronic device 100 to be analyzed and determined.
In some cases, the user does not move toward the electronic device 100 but the head pose HP merely yaws about the vertical axis (for example, the Az axis). Because the image sensor 110 then captures the user's face in profile, the feature width f_w computed from the distance between the two eyes shrinks.
Therefore, in step S240, the feature width f_w is updated according to the head pose HP to produce the updated width f_w', and the updated width f_w' is used to determine the change in depth/distance between the user and the electronic device 100 more accurately.
Specifically, the processor 120 uses a neural network to analyze the facial direction FD of the user's head pose HP from the vector changes between the facial feature points FPs. The yaw of the head pose HP about the vertical axis is converted into the angle θ of the facial direction FD in the horizontal plane relative to the axis Ay, where θ ranges from 90 degrees to -90 degrees.
The processor 120 corrects/updates the feature width f_w of the facial frame Ff according to the yaw angle θ of the user's head pose HP about the vertical axis (for example, the Az axis) to produce the updated width f_w', which can be expressed by the following formula:

f_w' = f_w / cos(θ)
The updated width f_w' is then used in subsequent calculations to determine the distance from the electronic device 100 to the user along the axis Ay more accurately. The processor 120 computes the scale ratio of the updated width f_w' relative to the initial width f_w,0, denoted r_f', as expressed by the following formula:

r_f' = f_w' / f_w,0
In step S243, the shooting distance of the virtual camera VCA in the virtual scene is controlled according to the scale ratio r_f'. If r_f' increases (or is greater than 1), the user's face is relatively close to the electronic device 100, and the processor 120 moves the virtual camera VCA toward the fixed point VP, narrowing the camera's field of view to focus on a local region of the virtual scene VIR, as shown in FIG. 10A. Conversely, if r_f' decreases (or is less than 1), the user's face is relatively far from the electronic device 100, and the processor 120 moves the virtual camera VCA away from the fixed point VP, widening the camera's field of view, as shown in FIG. 10B.
Notably, the virtual scene VIR may be implemented as a three-dimensional virtual scene; in other embodiments, it is implemented as a two-dimensional virtual scene. The disclosure is not limited in this regard.
In some embodiments, the processor 120 may apply a difference calculation with smoothing to the scale ratio r_f', and control the shooting distance of the virtual camera VCA in the virtual scene VIR according to the result of that calculation.
The lateral position and rotational pose of the virtual camera VCA are determined from the center position f_c in the facial frame Ff, as detailed in steps S252 and S253.
For a better understanding, please refer to FIG. 1 through FIG. 3, FIG. 8 and FIG. 11A through FIG. 11C together. FIG. 11A through FIG. 11C are schematic diagrams showing how the displacement d_f of the center position f_c of the user's face relative to the electronic device 100 is mapped to the pose and displacement of the virtual camera VCA in the virtual scene VIR, according to some embodiments of the present disclosure.
In step S252, the displacement d_f of the center position f_c relative to the initial position f_c,0 is computed. In some embodiments, the initial position f_c,0 is set during a setup phase. For example, after entering the setup phase, the electronic device 100 prompts the user to prepare for setting the initial position f_c,0; once the user has adjusted the distance between the face and the electronic device 100 to the most suitable value, the electronic device 100 captures the current frame and computes the center position of the face in it as the initial position f_c,0, as shown in FIG. 11A.
In step S253, the viewing angle of the virtual camera VCA in the virtual scene VIR is controlled according to the displacement d_f. The processor 120 moves the virtual camera VCA along a curved surface built around the fixed point VP according to the displacement, thereby adjusting the camera's viewing angle in the virtual scene VIR.
For example, if the center position f_c in the facial frame Ff moves along the direction Dxp, the virtual camera VCA moves along the direction Dxp and rotates its view counterclockwise about the fixed point VP, adjusting its viewing angle in the virtual scene VIR, as shown in FIG. 11B.
On the other hand, if the center position f_c in the facial frame Ff moves along the direction Dxn, the virtual camera VCA moves along the direction Dxn and rotates its view clockwise about the fixed point VP, adjusting its viewing angle in the virtual scene VIR, as shown in FIG. 11C.
Notably, the processor 120 may apply a difference calculation with smoothing to the displacement d_f of the center position f_c, and control the viewing angle of the virtual camera VCA in the virtual scene VIR according to the result of that calculation.
In the embodiments of FIG. 11B and FIG. 11C, the center position f_c of the face moves along the axis Ax. In some embodiments, the center position f_c may move simultaneously in the vertical plane formed by the axes Ax and Az. The relationship between the displacement of f_c along the axis Az and the shooting angle of the virtual camera VCA is analogous to the relationship between the displacement d_f along the axis Ax and the shooting angle, and is therefore not repeated.
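The arc motion of FIG. 11B/11C can be sketched as follows: the face-center displacement along Ax is mapped to an orbit angle, and the camera is placed on a circle of fixed radius around the fixed point VP while its view stays aimed at VP. The gain, radius and angle clamp are assumed tuning values; the disclosure specifies only the direction of the coupling (Dxp swings the view counterclockwise, Dxn clockwise).

```python
import math

def orbit_camera(d_f_x, vp=(0.0, 0.0), radius=5.0, gain=0.1, max_angle=60.0):
    """Place the virtual camera on an arc about the fixed point VP.

    d_f_x: face-center displacement along Ax. Returns the camera
    position on the arc and its view rotation in degrees; the rotation
    keeps the camera pointed at VP throughout the orbit.
    """
    angle = max(-max_angle, min(max_angle, gain * d_f_x))   # clamp orbit angle
    a = math.radians(angle)
    cam_x = vp[0] + radius * math.sin(a)
    cam_y = vp[1] - radius * math.cos(a)   # camera sits in front of VP at angle 0
    return (cam_x, cam_y), angle           # position and view rotation (deg)
```

At zero displacement the camera sits directly in front of VP; large displacements saturate at the clamp so the camera never orbits behind the scene.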
In step S260, a two-dimensional image is sampled from the virtual scene VIR with the virtual camera VCA. The processor 120 samples the two-dimensional image from the virtual scene VIR according to the aforementioned shooting distance and viewing angle of the virtual camera VCA, using a 3D-to-2D rendering engine that converts the three-dimensional scene into a two-dimensional image.
In step S270, the display 140 shows the two-dimensional image sampled from the virtual scene VIR. In some embodiments, when the field of view of the virtual camera VCA narrows, the image it captures may be enlarged to the size of the screen of the display 140, letting the user focus more clearly on the local region. When the field of view widens, the captured image may be shrunk to the screen size, presenting the global region.
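The rescale-to-screen step can be sketched with a nearest-neighbor resize. This is a minimal stand-in for the rendering-side rescale described above (a real engine would apply filtered resampling); whatever the virtual camera captured is stretched or shrunk to the display resolution.

```python
import numpy as np

def fit_to_display(frame, disp_h, disp_w):
    """Nearest-neighbor rescale of the sampled 2D image to the display size.

    frame: array of shape (h, w) or (h, w, channels). Each display pixel
    is mapped back to the nearest source pixel by integer index scaling.
    """
    h, w = frame.shape[:2]
    rows = np.arange(disp_h) * h // disp_h   # source row for each display row
    cols = np.arange(disp_w) * w // disp_w   # source column for each display column
    return frame[rows][:, cols]
```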
Please refer to FIG. 12A through FIG. 12C, which are schematic diagrams of the user's movement in the image IMG captured by the image sensor 110 and of the image IMG_DIS on the display 140, according to some embodiments of the present disclosure.
As shown in FIG. 12A, when the user moves to the right in the real environment (the user moves to the left in the image IMG because the front camera mirrors its captures), the virtual camera VCA moves to the right and rotates counterclockwise to capture the view in the left part of the space.
As shown in FIG. 12B, when the user has no displacement relative to the initial position f_c,0 in the real environment, the virtual camera VCA captures the view of the central space.
As shown in FIG. 12C, when the user moves to the left in the real environment (the user moves to the right in the image IMG because the front camera mirrors its captures), the virtual camera VCA moves to the left and rotates clockwise to capture the view in the right part of the space.
In summary, the electronic device 100 and the image processing method 200 of the present disclosure control the shooting distance and viewing angle of the virtual camera VCA by analyzing the facial position in a two-dimensional image, and convert the three-dimensional scene into a two-dimensional image according to the field of view of the virtual camera VCA, thereby providing users with an immersive experience in video conferencing, self-portrait imaging, scenery presentation and other interactive image processing applications.
To make the above and other objects, features, advantages and embodiments of the present disclosure more comprehensible, the reference symbols are described as follows:
100: electronic device
110: image sensor
120: processor
130: memory device
140: display
200: image processing method
IMG, IMG_DIS: image
Ff: facial frame
Ax, Ay, Az: axes
FOV: field of view
S210, S220, S230, S241, S242, S243, S252, S253, S260, S270: steps
CNN: facial-feature neural network model
FPs: facial feature points
HP: head pose
p_37: outer corner position of the right eye
p_40: inner corner position of the right eye
p_43: inner corner position of the left eye
p_46: outer corner position of the left eye
p_r: right-eye position
p_l: left-eye position
f_c: center position
f_w: feature width
d_f: displacement
f_w': updated width
FD: facial direction
VP: fixed point
VIR: virtual scene
VCA: virtual camera
Dxp, Dxn: directions
To make the above and other objects, features, advantages and embodiments of the present disclosure more comprehensible, the accompanying drawings are described as follows:
FIG. 1 is a schematic diagram of an electronic device capturing an image according to some embodiments of the present disclosure.
FIG. 2 is a schematic diagram of an electronic device according to some embodiments of the present disclosure.
FIG. 3 is a flowchart of an image processing method according to some embodiments of the present disclosure.
FIG. 4 is a schematic diagram of a convolutional neural network according to some embodiments of the present disclosure.
FIG. 5 is a schematic diagram of the specification of the convolutional neural network of FIG. 4 according to some embodiments of the present disclosure.
FIG. 6 is a schematic diagram of a user's head pose according to some embodiments of the present disclosure.
FIG. 7 is a schematic diagram of the facial feature points of FIG. 4 according to some embodiments of the present disclosure.
FIG. 8 is a schematic diagram of the center position and feature width of a user's face according to some embodiments of the present disclosure.
FIG. 9A and FIG. 9B are schematic diagrams of the feature width of a user's face and the updated width according to some embodiments of the present disclosure.
FIG. 10A and FIG. 10B are schematic diagrams of a virtual camera in a virtual scene according to some embodiments of the present disclosure.
FIG. 11A through FIG. 11C are schematic diagrams of mapping the displacement of the center position of a user's face relative to the electronic device to the pose and displacement of the virtual camera in the virtual scene, according to some embodiments of the present disclosure.
FIG. 12A through FIG. 12C are schematic diagrams of the user's movement in the image captured by the image sensor and of the image on the display, according to some embodiments of the present disclosure.
Domestic deposit information (noted in order of depository institution, date, number): None
Foreign deposit information (noted in order of deposit country, institution, date, number): None
100: electronic device
110: image sensor
120: processor
130: memory device
140: display
IMG: image
Ff: facial frame
Claims (10)
Publications (1)
Publication Number | Publication Date |
---|---|
TW202420241A (en) | 2024-05-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10250800B2 (en) | Computing device having an interactive method for sharing events | |
JP3926837B2 (en) | Display control method and apparatus, program, and portable device | |
EP3057066B1 (en) | Generation of three-dimensional imagery from a two-dimensional image using a depth map | |
US9704299B2 (en) | Interactive three dimensional displays on handheld devices | |
US8937592B2 (en) | Rendition of 3D content on a handheld device | |
US11776142B2 (en) | Structuring visual data | |
US8768043B2 (en) | Image display apparatus, image display method, and program | |
WO2016086492A1 (en) | Immersive video presentation method for intelligent mobile terminal | |
US20120054690A1 (en) | Apparatus and method for displaying three-dimensional (3d) object | |
US9813693B1 (en) | Accounting for perspective effects in images | |
JP2013521544A (en) | Augmented reality pointing device | |
US11044398B2 (en) | Panoramic light field capture, processing, and display | |
JP2022523478A (en) | Damage detection from multi-view visual data | |
JP6294054B2 (en) | Video display device, video presentation method, and program | |
US11490032B2 (en) | Method and apparatus for creating and displaying visual media on a device | |
US20210227195A1 (en) | Creating cinematic video from multi-view capture data | |
KR20190011492A (en) | Device for providing content and method of operating the same | |
US20230117311A1 (en) | Mobile multi-camera multi-view capture | |
KR20120008191A (en) | A method and device for display of mobile device, and mobile device using the same | |
JP6621565B2 (en) | Display control apparatus, display control method, and program | |
TW202420241A (en) | Image processing method and electronic device | |
US20240144718A1 (en) | Image processing method and electronic device | |
KR20210112390A (en) | Filming method, apparatus, electronic device and storage medium | |
EP2421272A2 (en) | Apparatus and method for displaying three-dimensional (3D) object | |
TW201913292A (en) | Mobile device and method for blending display content with environment scene |