WO2022001407A1 - Camera control method and display device

Info

Publication number
WO2022001407A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
angle
area
portrait
image
Application number
PCT/CN2021/093589
Other languages
French (fr)
Chinese (zh)
Inventor
杨鲁明 (Yang Luming)
王依林 (Wang Yilin)
朱铄 (Zhu Shuo)
孙永江 (Sun Yongjiang)
Original Assignee
海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Application filed by 海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Publication of WO2022001407A1


Classifications

    • H: Electricity; H04: Electric communication technique; H04N: Pictorial communication, e.g. television
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/61: Control of cameras or camera modules based on recognised objects
    • H04N 23/611: Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N 23/62: Control of parameters via user interfaces
    • H04N 23/67: Focus control based on electronic image sensor signals
    • H04N 23/675: Focus control based on electronic image sensor signals comprising setting of focusing regions
    • H04N 23/69: Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming

Definitions

  • the present application relates to the technical field of television software, and in particular, to a control method of a camera and a display device.
  • the display device can implement functions such as network search, IP TV, BBTV, video on demand (VOD), digital music, network news, and network video telephony.
  • a camera needs to be installed on the display device to collect user images.
  • An embodiment of the present application provides a display device, including:
  • the camera is configured to capture a portrait and to rotate within a preset angle range
  • the shooting angle of the camera is adjusted so that the portrait of the person is located in the center area of the designated image captured by the camera.
  • FIG. 1 exemplarily shows a schematic diagram of an operation scene between a display device and a control apparatus according to some embodiments
  • FIG. 2 exemplarily shows a hardware configuration block diagram of a display device 200 according to some embodiments
  • FIG. 3 exemplarily shows a hardware configuration block diagram of the control device 100 according to some embodiments
  • FIG. 4 exemplarily shows a schematic diagram of software configuration in the display device 200 according to some embodiments
  • FIG. 5 exemplarily shows a schematic diagram of displaying an icon control interface of an application in the display device 200 according to some embodiments
  • FIG. 6 exemplarily shows a structural block diagram of a display device according to some embodiments.
  • FIG. 7 exemplarily shows a schematic diagram of implementing a preset angle range for camera rotation according to some embodiments
  • FIG. 8 exemplarily shows a scene diagram of camera rotation within a preset angle range according to some embodiments
  • FIG. 9 exemplarily shows a schematic diagram of a sound source angle range according to some embodiments.
  • FIG. 10 exemplarily shows a flowchart of a method for adjusting the shooting angle of a camera according to some embodiments
  • FIG. 11 exemplarily shows a flowchart of a wake-up text comparison method according to some embodiments
  • FIG. 12 exemplarily shows a flowchart of a method for performing sound source identification on character sound source information according to some embodiments
  • FIG. 13 exemplarily shows a flowchart of a method for determining a target rotation direction and a target rotation angle of a camera according to some embodiments
  • FIG. 14 exemplarily shows a scene diagram of adjusting the shooting angle of the camera according to some embodiments
  • FIG. 15 exemplarily shows another scene diagram of adjusting the shooting angle of the camera according to some embodiments.
  • FIG. 16 exemplarily shows a scene diagram of the position of a character when speaking according to some embodiments
  • FIG. 17 exemplarily shows another scene diagram in which the camera rotates within a preset angle range according to some embodiments
  • FIG. 18 exemplarily shows a flowchart of a method for controlling a camera according to some embodiments
  • FIG. 19 exemplarily shows an overall data flow diagram of a camera control method according to some embodiments.
  • FIG. 20 exemplarily shows a flowchart of a method for calculating an azimuth distance according to some embodiments
  • FIG. 21 exemplarily shows a schematic diagram of calculating azimuth distance according to some embodiments.
  • FIG. 22 exemplarily shows a schematic diagram of the horizontal viewing angle of a camera according to some embodiments.
  • Figure 23 exemplarily shows a schematic diagram of calculating a target horizontal adjustment angle according to some embodiments
  • FIG. 24 exemplarily shows a schematic diagram of a vertical viewing angle of a camera according to some embodiments.
  • FIG. 25 exemplarily shows a schematic diagram of calculating a vertical adjustment angle of a target according to some embodiments
  • FIG. 26 exemplarily shows a flowchart of a method for focusing and zooming in on a portrait display according to some embodiments
  • FIG. 27 exemplarily shows a schematic diagram of zoomed-in portrait display according to some embodiments.
  • module refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic or combination of hardware or/and software code capable of performing the function associated with that element.
  • remote control refers to a component of an electronic device, such as the display device disclosed in this application, that can wirelessly control the electronic device, usually over a short distance.
  • infrared and/or radio frequency (RF) signals and/or Bluetooth are used to connect with electronic devices, and functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors may also be included.
  • a hand-held touch remote control replaces most of the physical built-in hard keys in a general remote control device with a user interface in a touch screen.
  • gesture used in this application refers to a user's behavior that is used by a user to express an expected thought, action, purpose and/or result through an action such as a change of hand shape or a hand movement.
  • FIG. 1 is a schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment. As shown in FIG. 1 , a user can operate the display device 200 through the smart device 300 or the control device 100 .
  • the control apparatus 100 may be a remote controller, and the communication between the remote controller and the display device includes infrared protocol communication or Bluetooth protocol communication, and other short-distance communication methods, and the display device 200 is controlled wirelessly or wiredly.
  • the user can control the display device 200 by inputting user instructions through keys on the remote control, voice input, control panel input, and the like.
  • a smart device 300 (e.g., a mobile terminal, a tablet computer, a computer, a notebook computer, etc.) may also be used to control the display device 200, for example, by using an application running on the smart device.
  • the display device 200 can also be controlled in a manner other than the control apparatus 100 and the smart device 300.
  • the module for acquiring voice commands configured inside the display device 200 can directly receive the user's voice command for control.
  • the user's voice command control can also be received through a voice control device provided outside the display device 200.
  • the display device 200 is also in data communication with the server 400 .
  • the display device 200 may be allowed to communicate via local area network (LAN), wireless local area network (WLAN), and other networks.
  • the server 400 may provide various contents and interactions to the display device 200 .
  • FIG. 3 exemplarily shows a configuration block diagram of the control apparatus 100 according to an exemplary embodiment.
  • the control device 100 includes a controller 110 , a communication interface 130 , a user input/output interface 140 , a memory 190 , and a power supply 180 .
  • the control device 100 can receive the user's input operation instruction, and convert the operation instruction into an instruction that the display device 200 can recognize and respond to, and play an intermediary role between the user and the display device 200 .
  • FIG. 2 is a block diagram showing a hardware configuration of a display device 200 according to an exemplary embodiment.
  • the display device 200 includes at least one of a tuner and demodulator 210 , a communicator 220 , a detector 230 , an external device interface 240 , a controller 250 , a display 275 , an audio output interface 285 , a memory 260 , a power supply 290 , and a user interface 265 .
  • the display 275 includes a display screen component for presenting pictures and a driving component for driving image display; it receives image signals output from the controller and displays video content, image content, menu manipulation interfaces, and user manipulation UI interfaces.
  • the display 275 may be a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
  • the communicator 220 is a component for communicating with external devices or servers according to various communication protocol types.
  • the communicator may include at least one of a Wifi module, a Bluetooth module, a wired Ethernet module and other network communication protocol chips or near field communication protocol chips, and an infrared receiver.
  • the display device 200 may establish transmission and reception of control signals and data signals with the external control device 100 or the server 400 through the communicator 220 .
  • the user interface can be used to receive control signals from the control device 100 (eg, an infrared remote control, etc.).
  • the detector 230 is used to collect external environment or external interaction signals.
  • the detector 230 includes a light receiver, a sensor for collecting ambient light intensity; alternatively, the detector 230 includes an image collector, such as a camera, which can be used to collect external environmental scenes, user attributes, or user interaction gestures; alternatively, the detector 230 includes a sound collector, such as a microphone, for receiving external sound.
  • the external device interface 240 may include, but is not limited to, the following: any one of high-definition multimedia interface (HDMI), analog or data high-definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, etc. or multiple interfaces. It may also be a composite input/output interface formed by a plurality of the above-mentioned interfaces.
  • the controller 250 and the tuner 210 may be located in different separate devices, that is, the tuner 210 may also be located in an external device of the main device where the controller 250 is located, such as an external set-top box.
  • the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in the memory 260 .
  • the controller 250 controls the overall operation of the display apparatus 200 . For example, in response to receiving a user command for selecting a UI object to be displayed on the display 275, the controller 250 may perform an operation related to the object selected by the user command.
  • Objects can be any of the optional objects, such as hyperlinks, icons, or other actionable controls.
  • the operations related to the selected object include: displaying operations connected to hyperlinked pages, documents, images, etc., or executing operations of programs corresponding to the icons.
  • the user may input user commands on a graphical user interface (GUI) displayed on the display 275, and the user input interface receives the user input commands through the graphical user interface (GUI).
  • the user may input a user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through a sensor to receive the user input command.
  • a control can include visual interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.
  • the system is divided into four layers; from top to bottom, they are the application layer (referred to as the "application layer"), the application framework layer (referred to as the "framework layer"), the Android runtime and system library layer (referred to as the "system runtime layer"), and the kernel layer.
  • At least one application program runs in the application layer. These applications may be a window program, a system settings program, a clock program, a camera application, etc. built into the operating system, or they may be applications developed by third-party developers, such as the Hijian program, the karaoke (K song) program, the magic mirror program, etc.
  • the application package in the application layer is not limited to the above examples, and may actually include other application packages, which is not limited in this embodiment of the present application.
  • the framework layer provides an application programming interface (API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer is equivalent to a processing center, which determines the actions to be taken by the applications in the application layer.
  • the application program can access the resources in the system and obtain the services of the system during execution through the API interface.
  • the application framework layer in the embodiment of the present application includes managers (Managers), content providers (Content Providers), etc., wherein the managers include at least one of the following modules: an Activity Manager, used to interact with all activities running in the system; a Location Manager, used to provide system services or applications with access to system location services; a Package Manager, used to retrieve various information related to the application packages currently installed on the device; a Notification Manager, used to control the display and clearing of notification messages; and a Window Manager, used to manage icons, windows, toolbars, wallpapers, and desktop widgets on the user interface.
  • the activity manager is used to manage the life cycle of each application and the usual navigation and back functions, such as controlling application exit (including switching the user interface currently displayed in the display window to the system desktop), opening an application, and going back (including switching the user interface currently displayed in the display window to its upper-level user interface), and the like.
  • the window manager is used to manage all window programs, such as obtaining the size of the display screen, judging whether there is a status bar, locking the screen, taking screenshots, and controlling changes of the display window (for example, shrinking the display window, shaking the display, distorting the display, etc.).
  • the system runtime layer provides support for the upper layer, that is, the framework layer.
  • the Android operating system will run the C/C++ library included in the system runtime layer to implement the functions to be implemented by the framework layer.
  • the kernel layer is the layer between hardware and software. As shown in Figure 4, the kernel layer at least includes at least one of the following drivers: audio driver, display driver, Bluetooth driver, camera driver, WIFI driver, USB driver, HDMI driver, sensor driver (such as fingerprint sensor, temperature sensor, touch sensors, pressure sensors, etc.), etc.
  • the kernel layer further includes a power driver module for power management.
  • software programs and/or modules corresponding to the software architecture in FIG. 4 are stored in the first memory or the second memory shown in FIG. 2 or FIG. 3 .
  • when the remote control receiving device receives an input operation from the remote control, a corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes the input operation into the original input event (including the value of the input operation, the timestamp of the input operation and other information).
  • Raw input events are stored at the kernel layer.
  • the application framework layer obtains the original input event from the kernel layer, identifies the control corresponding to the input event according to the current position of the focus, and regards the input operation as a confirmation operation, and the control corresponding to the confirmation operation is the control of the magic mirror application icon.
  • the magic mirror application calls the interface of the application framework layer to start the magic mirror application, and then calls the kernel layer to start the camera driver so as to capture still images or videos through the camera.
  • the display device receives an input operation (such as a split-screen operation) performed by the user on the display screen, and the kernel layer can generate a corresponding input event according to the input operation and report the event to the application framework layer.
  • the window mode (such as multi-window mode) and window position and size corresponding to the input operation are set by the activity manager of the application framework layer.
  • the window manager of the application framework layer draws the window according to the settings of the activity manager, and then sends the drawn window data to the display driver of the kernel layer, and the display driver displays the corresponding application interface in different display areas of the display screen.
  • the application layer contains at least one application that can display corresponding icon controls in the display, such as: live TV application icon control, video on demand application icon control, media center application Program icon controls, application center icon controls, game application icon controls, etc.
  • the live TV application may provide live TV from different sources.
  • a live TV application may provide a TV signal using input from cable, over-the-air, satellite services, or other types of live TV services.
  • the live TV application may display the video of the live TV signal on the display device 200 .
  • a video-on-demand application may provide video from various storage sources. Unlike live TV applications, video-on-demand provides a display of video from certain storage sources. For example, video-on-demand can come from the server side of cloud storage, from local hard disk storage containing existing video programs.
  • the media center application may provide various multimedia content playback applications.
  • a media center may provide services other than live TV or video-on-demand, where users can access various images or audio through a media center application.
  • the application center may provide storage of various applications.
  • An application may be a game, an application program, or some other application that is related to a computer system or other device but can run on a smart TV.
  • the application center can obtain these applications from various sources, store them in local storage, and then run them on the display device 200 .
  • the application programs that need to use the camera in the display device include "Hey See", "Look in the Mirror", "Youxuemao", "Fitness", etc., which can realize functions such as "video chat", "watch while chatting", and "fitness".
  • "Hey See" is a video chat application that can realize one-click chat between a mobile phone and a TV, and between TV and TV.
  • "Looking in the Mirror” is an application that provides users with mirror services. By turning on the camera through the mirroring application, users can use the smart TV as a mirror.
  • “Youxuemao” is an application that provides learning functions.
  • the "fitness” function can simultaneously display the fitness instruction video and the image of the user following the fitness instruction video to perform corresponding actions on the display of the display device, so that users can check whether their actions are standard in real time.
  • the camera is fixedly installed on the display device, the center line of the camera's viewing angle is perpendicular to the display, and the viewing angle of the camera is limited, usually between 60° and 75°; that is, the shooting area of the camera spreads symmetrically to the left and right of the center line of the viewing angle to form an area corresponding to an angle of 60° to 75°.
  • when the user is outside the shooting area, the camera cannot capture an image containing the user's portrait, so the portrait cannot be displayed on the display. In a video chat scenario, the peer user in the video call with the local user will not be able to see the local user; in a fitness scenario, the display will not be able to show the image of the user performing the fitness actions, and the user will not be able to see their own fitness movements or judge whether they are standard, which affects the user experience.
  • FIG. 6 exemplarily shows a structural block diagram of a display device according to some embodiments.
  • the camera is used to capture portraits.
  • the camera is no longer fixedly installed, but is rotatably installed on the display device.
  • the camera 232 is installed on the top of the display in a rotating form, and the camera 232 can rotate along the top of the display.
  • FIG. 7 exemplarily shows a schematic diagram of implementing a preset angle range of camera rotation according to some embodiments
  • FIG. 8 exemplarily shows a scene diagram of camera rotation within the preset angle range according to some embodiments.
  • the camera 232 is preset to be rotatable within a preset angle range in the horizontal direction.
  • the preset angle range is 0° to 120°, that is, at a position facing the display, the left side of the user is 0° and the right side of the user is 120°.
  • the camera can be rotated 60° to the left from the initial state, and 60° to the right from the initial state; in the initial state, the center line of the camera's viewing angle is perpendicular to the display, and this position is the 60° position of the camera.
  • the display device provided by the embodiment of the present application realizes the use of sound source information to trigger the rotation of the camera, can automatically identify the real-time location of the user and adjust the shooting angle of the camera, so that the camera can always capture images including portraits.
  • the display device implements the collection of the sound source information of the person by setting the sound collector 231 .
  • multiple sets of sound collectors can be set in the display device.
  • four sets of sound collectors 231 are set in the display device, and the four sets of sound collectors 231 can be arranged in a linear positional relationship.
  • the sound collector may be a microphone, and four groups of microphones are linearly arranged to form a microphone array. During sound collection, the four groups of sound collectors 231 receive sound information generated when the same user interacts with the display device through voice.
  • FIG. 9 A schematic diagram of a sound source angle range according to some embodiments is exemplarily shown in FIG. 9 .
  • the angle of the sound source generated by the user ranges from 0° to 180°.
  • when the user is behind the display device, the sound source angle generated by the user also ranges from 0° to 180°.
  • when the user is located at the leftmost side of the sound collector, the horizontal angle is 0°, and when the user is located at the rightmost side of the sound collector, the horizontal angle is 180°.
  • the 30° angular position of the sound source is equal to the 0° angular position of the camera
  • the 90° angular position of the sound source is equal to the 60° angular position of the camera
  • the 150° angular position of the sound source is equal to the 120° angular position of the camera corner position.
  • the controller 250 is connected with the camera 232 and the sound collector 231 respectively, and the controller is used to receive the character sound source information collected by the sound collector, perform sound source identification on it, determine the azimuth angle of the character's position, and then determine the angle that the camera needs to rotate.
  • the controller adjusts the shooting angle of the camera according to the determined angle that the camera needs to rotate, so that the shooting area of the camera is facing the position of the voice of the character, and adjusts the shooting angle of the camera according to the position of the character to capture the image containing the character.
  • FIG. 10 exemplarily shows a flowchart of a method for adjusting the shooting angle of a camera according to some embodiments.
  • when adjusting the shooting angle of the camera according to the position of the character, the controller is configured to execute the method for adjusting the shooting angle of the camera shown in FIG. 10, including:
  • before the controller in the display device drives the camera to rotate to adjust the shooting angle of the camera, it needs to obtain the character sound source information generated when the character performs voice interaction with the display device at the character's location.
  • the character sound source information refers to the sound information generated when the character interacts with the display device through voice.
  • the sound source information of the person can determine the azimuth and angle of the person's position when speaking, and in order to accurately determine the angle that the camera needs to adjust, it is necessary to first obtain the current state of the camera, that is, the current shooting angle.
  • the current shooting angle of the camera needs to be acquired when the camera is in a stopped state, so as to ensure the accuracy of the current shooting angle of the camera, and thus to ensure the accuracy of determining the angle that the camera needs to adjust.
  • the controller before executing the acquisition of the current shooting angle of the camera, the controller is further configured to execute the following steps:
  • Step 111 query the current operating state of the camera.
  • Step 112 if the current operating state of the camera is in the rotating state, wait for the camera to rotate completely.
  • Step 113 If the current operating state of the camera is in the non-rotation state, obtain the current shooting angle of the camera.
  • a motor control service is configured in the controller, and the motor control service is used to drive the camera to rotate, obtain the running status of the camera and the orientation angle of the camera.
  • the motor control service monitors the running status of the camera in real time.
  • the controller queries the current running status of the camera by calling the motor control service.
  • the current running status of the camera can represent the current orientation angle of the camera and whether the camera is in a rotating state.
  • the current shooting angle of the camera cannot be obtained at this time, otherwise the exact value cannot be determined. Therefore, when the camera is in the rotating state, it is necessary to wait for the camera to execute the previous instruction to complete the rotation, and then perform the step of obtaining the current shooting angle of the camera in the stopped state.
  • the steps of obtaining the current shooting angle of the camera can be performed.
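  • As an illustration of steps 111 to 113, the following minimal sketch polls the camera state before reading its angle; the motor_control object and its is_rotating()/get_current_angle() methods are hypothetical names standing in for the motor control service described above, not an actual API of the display device.

```python
import time

def wait_and_get_shooting_angle(motor_control, poll_interval_s=0.05):
    """Return the camera's current shooting angle only once it has stopped.

    Sketch of steps 111-113: query the running state, wait for any ongoing
    rotation to finish, then read the shooting angle so the value is exact.
    """
    while motor_control.is_rotating():        # step 111/112: a previous rotation is still executing
        time.sleep(poll_interval_s)
    return motor_control.get_current_angle()  # step 113: camera is stationary, angle is reliable
```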
  • S12 Perform sound source identification on the person's sound source information, and determine the sound source angle information, where the sound source angle information is used to represent the azimuth angle of the person's position when speaking.
  • After obtaining the character sound source information generated by the interaction between the character and the display device, the controller needs to perform sound source identification on the character sound source information to determine the position of the character when speaking, specifically the azimuth angle, that is, whether the character is located to the left of, to the right of, or directly facing the sound collector, so that the shooting angle of the camera can be adjusted according to the character's position.
  • the character's voice may be part of a dialogue with the peer user while the character is still within the shooting area of the camera; if the controller executes the step of adjusting the shooting angle of the camera in this case, an invalid operation occurs.
  • the wake-up text for triggering the adjustment of the shooting angle of the camera may be stored in the controller in advance, for example, customizing "Hisense Xiaoju" as the wake-up text for sound source recognition.
  • the character uses the voice "Hisense Xiaoju” as the identification sound source to trigger the process of adjusting the camera's shooting angle.
  • the wake-up text can also be customized as other words, which are not specifically limited in this embodiment.
  • FIG. 11 exemplarily shows a flowchart of a wake-up text comparison method according to some embodiments. Specifically, referring to Figure 11, the controller is further configured to perform the following steps before performing sound source identification on the character sound source information and determining the sound source angle information:
  • the preset wake-up text refers to the text used to trigger the sound source identification process.
  • After acquiring the character sound source information, the controller first performs text extraction to obtain the voice interaction text generated when the character interacts with the display device through voice, and compares the extracted voice interaction text with the preset wake-up text. If the comparison is inconsistent, for example, the character's voice is not "Hisense Xiaoju" but other interactive content, it means that the current character's voice is not a voice that triggers the adjustment of the camera's shooting angle, and the controller does not need to perform the relevant steps for adjusting the camera's shooting angle.
  • If the comparison is consistent, the controller can continue to perform the subsequent steps of adjusting the camera's shooting angle.
  • When judging that the character sound source information is a wake-up voice, that is, a trigger voice for adjusting the shooting angle of the camera, the controller needs to perform the subsequent sound source recognition process.
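  • A minimal sketch of this comparison is shown below; the speech_to_text() helper is a hypothetical stand-in for whatever text-extraction module the display device uses, and "Hisense Xiaoju" is the example wake-up word from the description.

```python
WAKE_UP_TEXT = "Hisense Xiaoju"   # example wake-up text from the description

def is_wake_up_voice(sound_source_info, speech_to_text) -> bool:
    """Return True when the captured speech matches the preset wake-up text.

    Sketch of the comparison in FIG. 11: only a match triggers the subsequent
    sound source identification and camera angle adjustment; any other
    interactive content is ignored by this flow.
    """
    voice_interaction_text = speech_to_text(sound_source_info)  # hypothetical ASR helper
    return voice_interaction_text.strip() == WAKE_UP_TEXT
```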
  • Since each of the multiple groups of sound collectors can collect character sound source information when the same character speaks, when the controller obtains the character sound source information collected by the sound collectors, it obtains the character sound source information collected by each sound collector when the character speaks; that is, the controller acquires multiple sets of character sound source information.
  • FIG. 12 exemplarily shows a flow chart of a method for sound source identification for character sound source information according to some embodiments.
  • the controller is further configured to perform the following steps when performing sound source identification on the character sound source information and determining the sound source angle information:
  • S122 based on the time difference of speech, calculate the sound source angle information of the position where the character is at the time of speech.
  • The frequency response of each sound collector is the same, and their sampling clocks are synchronized. However, because the distance between each sound collector and the character is not the same, the time at which each sound collector receives the speech is not the same, and there will be a difference in acquisition time between the multiple groups of sound collectors.
  • the angle and distance of the sound source from the array can be calculated by the sound collector array, so as to realize the tracking of the sound source at the position of the character when speaking.
  • the time difference between the arrival of the signal at every two microphones is estimated so as to obtain a set of equations for the sound source position coordinates, and the exact coordinates of the sound source, that is, the sound source angle information, can then be obtained by solving the equation set.
  • In step S121, when performing sound source identification on each piece of character sound source information and calculating the speech time difference generated when the multiple groups of sound collectors collect the corresponding character sound source information, the controller is further configured to perform the following steps:
  • Step 1211 Extract the ambient noise, the sound source signal of the person's voice, and the propagation time of the person's voice to each sound collector from the person's sound source information.
  • Step 1212 Determine the received signal of each sound collector according to the environmental noise, the sound source signal and the propagation time.
  • Step 1213 using the cross-correlation time delay estimation algorithm to process the received signal of each sound collector to obtain the speech time difference generated when every two sound collectors collect the corresponding character sound source information.
  • the sound collector array can be used to perform direction-of-arrival (DOA) estimation based on the arrival time difference.
  • the target signal received by each element of the sound collector array comes from the same sound source. Therefore, there is a strong correlation between the signals of each channel.
  • the time delay between the signals observed by every two sound collectors, that is, the speech time difference, can be determined.
  • The character sound source information generated when the character speaks includes the ambient noise and the sound source signal of the character's voice; the propagation time of the character's voice to each sound collector can also be obtained from the character sound source information by identification and extraction, and the received signal of each sound collector can then be calculated.
  • The received signal of the i-th sound collector can be expressed as x_i(t) = α_i·s(t - τ_i) + n_i(t), where x_i(t) is the received signal of the i-th sound collector, s(t) is the sound source signal when the character speaks, τ_i is the propagation time of the character's voice to the i-th sound collector, n_i(t) is the environmental noise, and α_i is a correction coefficient.
  • The cross-correlation time delay estimation algorithm is used to process the received signals of every two sound collectors: the cross-correlation function R_{i,i+1}(τ) = E[x_i(t)·x_{i+1}(t - τ)] is computed, and the delay τ at which it reaches its maximum is the time delay between the i-th sound collector and the (i+1)-th sound collector, that is, the speech time difference.
  • According to prior knowledge of the signal and noise, the cross-power spectrum can be weighted in the frequency domain so as to suppress noise and reverberation interference.
  • PHAT weighting is used to make the cross-power spectrum between the signals smoother, and the final speech time difference generated when every two sound collectors collect the corresponding character sound source information is obtained.
  • the cross-power spectrum weighted by PHAT is similar to the expression of a unit impulse response, which highlights the peak at the true delay; this can effectively suppress reverberation noise and improve the accuracy of the delay (speech time difference) estimation.
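  • As a concrete illustration of the cross-correlation time delay estimation with PHAT weighting described above, the following minimal NumPy sketch estimates the delay between the signals of two sound collectors; the function name and sampling-rate parameter are illustrative, and the sign convention is noted in the docstring.

```python
import numpy as np

def gcc_phat(x_i, x_j, fs):
    """Estimate the delay of x_i relative to x_j with GCC-PHAT.

    Minimal sketch of the cross-correlation time delay estimation described
    above; x_i and x_j are the received signals of two sound collectors
    sampled at fs Hz. A positive result means the sound reached collector j
    before collector i.
    """
    n = len(x_i) + len(x_j)                            # zero-pad to avoid circular wrap-around
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    cross_spectrum = X_i * np.conj(X_j)                # cross-power spectrum
    cross_spectrum /= np.abs(cross_spectrum) + 1e-12   # PHAT weighting (phase transform)
    cc = np.fft.irfft(cross_spectrum, n=n)             # generalized cross-correlation
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    delay_samples = np.argmax(np.abs(cc)) - max_shift  # peak location = speech time difference
    return delay_samples / fs
```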
  • In step S122, the controller is further configured to perform the following steps when calculating the sound source angle information of the character's position when speaking based on the speech time difference:
  • Step 1221 Acquire the speed of sound in the current environmental state, the coordinates of each sound collector, and the set number of sound collectors.
  • Step 1222 Determine the number of combined pairs of sound collectors according to the set number of sound collectors, where the number of combined pairs refers to the number of combinations obtained by combining two sound collectors.
  • Step 1223 according to the speech time difference, the sound speed and the coordinates of each sound collector corresponding to each two sound collectors, establish a vector relational equation set, the number of which is the same as the number of combination pairs.
  • Step 1224 Solve the vector relation equation system to obtain the vector value of the unit plane wave propagation vector of the sound source at the position of the person's speech.
  • Step 1225 Calculate, according to the vector value, the sound source angle information of the position where the character is speaking.
  • the sound source angle information of the position of the character when speaking can be calculated according to each voice time difference.
  • the number of equations can be set to be the same as the number of combinations obtained by combining the sound collectors in pairs. To this end, the set number N of the sound collectors is obtained, and there are N(N-1)/2 pairs of combinations between all the sound collectors.
  • the sound source angle information can be determined by solving the vector value of the sound source unit plane wave propagation vector at the character's voice position.
  • In this way, the sound source angle information representing the azimuth angle of the character's position when speaking is obtained.
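  • The following sketch illustrates steps 1221 to 1225 for the linear sound collector array under a far-field plane-wave assumption; the function and parameter names are illustrative, and the actual equation set used by the application may differ in form.

```python
import numpy as np

def sound_source_angle(mic_positions, pair_delays, speed_of_sound=343.0):
    """Estimate the speaker azimuth from pairwise speech time differences.

    Minimal far-field sketch of steps 1221-1225 for a linear microphone array.
    For every pair (i, j) the plane-wave model gives
    (p_i - p_j) * cos(theta) = c * tau_ij, one equation per pair; theta is the
    angle between the propagation direction and the array axis, i.e. the
    0-180 degree sound source angle. All names here are illustrative.

    mic_positions: positions of the sound collectors along the array axis (m)
    pair_delays:   dict {(i, j): tau_ij} of delays in seconds, e.g. from gcc_phat
    """
    baselines, path_diffs = [], []
    for (i, j), tau in pair_delays.items():
        baselines.append(mic_positions[i] - mic_positions[j])  # pair baseline (m)
        path_diffs.append(speed_of_sound * tau)                # path-length difference (m)
    A = np.asarray(baselines).reshape(-1, 1)
    b = np.asarray(path_diffs)
    # Least-squares solution over the N(N-1)/2 pair equations.
    cos_theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    cos_theta = float(np.clip(cos_theta[0], -1.0, 1.0))
    return float(np.degrees(np.arccos(cos_theta)))             # 0 deg = left, 180 deg = right
```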
  • the controller determines the sound source angle information used to represent the azimuth angle of the person's position when speaking by performing sound source identification on the sound source information of the person.
  • the sound source angle information can identify the current position of the character, and the current shooting angle of the camera can identify the current orientation of the camera; the target rotation angle that the camera needs to rotate in the horizontal direction, and the target rotation direction when the camera rotates, can be determined according to the angle difference between the two positions.
  • FIG. 13 exemplarily shows a flowchart of a method for determining a target rotation direction and a target rotation angle of a camera according to some embodiments.
  • the controller is further configured to perform the following steps when determining the target rotation direction and target rotation angle of the camera based on the current shooting angle and sound source angle information of the camera:
  • Since the sound source angle information represents the azimuth angle of the character, the sound source angle information of the character can be converted into the coordinate angle of the camera; that is, the coordinate angle of the camera is used to replace the sound source angle information of the character.
  • when converting the sound source angle information into the coordinate angle of the camera, the controller is further configured to perform the following steps:
  • Step 1311 Acquire the sound source angle range when the character is speaking and the preset angle range when the camera rotates.
  • Step 1312 Calculate the angle difference between the sound source angle range and the preset angle range, and use the half value of the angle difference as the conversion angle.
  • Step 1313 Calculate the angle difference between the angle corresponding to the sound source angle information and the conversion angle, and use the angle difference as the coordinate angle of the camera.
  • the preset angle range is 0° to 120°
  • the sound source angle range is 0° to 180°
  • the coordinate angle of the camera cannot directly replace the sound source angle information. Therefore, first calculate the angle difference between the sound source angle range and the preset angle range, then calculate the half value of the angle difference, and use the half value as the conversion angle when the sound source angle information is converted into the coordinate angle of the camera.
  • the angle difference between the sound source angle range and the preset angle range is 60°, the half value of the angle difference is 30°, and 30° is used as the conversion angle. Finally, the angle difference between the angle corresponding to the sound source angle information and the conversion angle is calculated, which is the coordinate angle of the camera converted from the sound source angle information.
  • the angle corresponding to the sound source angle information, determined by the controller from the character sound source information collected by multiple sound collectors, is 50°, and the conversion angle is 30°. Therefore, the calculated angle difference is 20°; that is, the 50° corresponding to the sound source angle information is represented by the camera's coordinate angle of 20°.
  • the angle corresponding to the sound source angle information determined by the controller by acquiring the character sound source information collected by multiple sound collectors is 130°, and the conversion angle is 30°. Therefore, the calculated angle The difference is 100°, that is, the 130° corresponding to the sound source angle information is replaced by the camera's coordinate angle of 100° to represent it.
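  • The conversion described in steps 1311 to 1313 can be written compactly as follows; this is a minimal sketch using the ranges given above (0° to 180° for the sound source, 0° to 120° for the camera), with an illustrative function name.

```python
def sound_angle_to_camera_angle(sound_angle_deg,
                                sound_range=(0.0, 180.0),
                                camera_range=(0.0, 120.0)):
    """Convert a sound source angle into the camera coordinate angle.

    Sketch of steps 1311-1313: half of the difference between the two spans
    (30 degrees here) is used as the conversion angle and subtracted from the
    sound source angle.
    """
    sound_span = sound_range[1] - sound_range[0]
    camera_span = camera_range[1] - camera_range[0]
    conversion_angle = (sound_span - camera_span) / 2.0   # 30 deg for 180 vs 120
    return sound_angle_deg - conversion_angle

# Examples from the description: 50 deg maps to 20 deg, 130 deg maps to 100 deg.
assert sound_angle_to_camera_angle(50.0) == 20.0
assert sound_angle_to_camera_angle(130.0) == 100.0
```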
  • S132 Calculate the angle difference between the coordinate angle of the camera and the current shooting angle of the camera, and use the angle difference as the target rotation angle of the camera.
  • the coordinate angle of the camera is used to identify the angle of the person's position within the camera coordinates. Therefore, according to the angle difference between the current shooting angle of the camera and the coordinate angle of the camera, the target rotation angle that the camera needs to rotate can be determined.
  • if the current shooting angle of the camera is 100° and the coordinate angle of the camera is 20°, it means that the current shooting area of the camera is not aimed at the position of the person, and the difference between the two is 80°. Therefore, after the camera is rotated by 80°, the shooting area of the camera can be aimed at the position of the person; that is, the target rotation angle of the camera is 80°.
  • S133 Determine the target rotation direction of the camera according to the angle difference.
  • the left side is taken as the 0° position of the camera
  • the right side is taken as the 120° position of the camera. Therefore, after the angle difference is determined according to the coordinate angle of the camera and the current shooting angle of the camera, if the current shooting angle is greater than the coordinate angle, it means that the camera's shooting angle is to the right of the character's position, and the angle difference is a negative value; if the current shooting angle is less than the coordinate angle, it means that the camera's shooting angle is to the left of the character's position, and the angle difference is a positive value.
  • the target rotation direction of the camera may be determined according to the sign of the angle difference. If the angle difference is a positive value, it means that the shooting angle of the camera is to the left of the character's position; in order for the camera to capture an image of the character, the shooting angle of the camera needs to be adjusted to the right, so the target rotation direction of the camera is determined to be rightward.
  • If the angle difference is a negative value, it means that the shooting angle of the camera is to the right of the character's position; in order for the camera to capture an image of the character, the shooting angle of the camera needs to be adjusted to the left, so the target rotation direction of the camera is determined to be leftward.
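  • Steps S132 and S133 amount to a signed angle difference; the sketch below reproduces the two worked examples that follow (illustrative function name, sign convention as described above).

```python
def camera_rotation_command(current_angle_deg, target_coordinate_angle_deg):
    """Determine the target rotation direction and angle of the camera.

    Sketch of S132/S133: the difference between the speaker's camera
    coordinate angle and the current shooting angle gives the rotation
    amount; its sign gives the direction (positive -> right, negative -> left).
    """
    difference = target_coordinate_angle_deg - current_angle_deg
    direction = "right" if difference > 0 else "left" if difference < 0 else "none"
    return direction, abs(difference)

# Coordinate angle 20 deg, current angle 100 deg -> rotate 80 deg to the left.
assert camera_rotation_command(100.0, 20.0) == ("left", 80.0)
# Coordinate angle 90 deg, current angle 40 deg -> rotate 50 deg to the right.
assert camera_rotation_command(40.0, 90.0) == ("right", 50.0)
```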
  • FIG. 14 exemplarily shows a scene diagram of adjusting the shooting angle of the camera according to some embodiments.
  • the angle corresponding to the sound source angle information corresponding to the character is 50°
  • the converted coordinate angle of the camera is 20°
  • the current shooting angle of the camera is 100°, that is, the center line of the camera's viewing angle is to the right of the character's position.
  • the calculated angle difference is -80°.
  • the visible angle difference is a negative value.
  • the camera needs to be adjusted to rotate 80° to the left.
  • FIG. 15 exemplarily shows another scene diagram for adjusting the shooting angle of the camera according to some embodiments.
  • the angle corresponding to the sound source angle information corresponding to the character is 120°
  • the converted coordinate angle of the camera is 90°
  • the current shooting angle of the camera is 40°, that is, the center line of the camera's viewing angle is to the left of the character's position.
  • the calculated angle difference is 50°.
  • the visible angle difference is a positive value.
  • the camera needs to be adjusted to rotate 50° to the right.
  • After the controller determines the target rotation direction and target rotation angle required to adjust the shooting angle, it can adjust the shooting angle of the camera according to the target rotation direction and target rotation angle so that the shooting area of the camera faces the character's position and the camera can capture images including the character; in this way, the shooting angle of the camera is adjusted according to the position of the character.
  • FIG. 16 exemplarily shows a scene diagram of the character's position when speaking according to some embodiments. Since the preset angle range of the camera is different from the sound source angle range of the human voice, as reflected in the angle diagram of Figure 16, there is a 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range; similarly, there is also a 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range.
  • When the controller converts the sound source angle information into the coordinate angle of the camera in the aforementioned step S131, the coordinate angle of the camera converted from the character's sound source angle information may be negative, or larger than the maximum value of the camera's preset angle range; that is, the converted coordinate angle of the camera is not within the preset angle range of the camera.
  • For the position of person (a), the calculated coordinate angle of the camera is -10°. If the sound source angle information corresponding to the position of person (b) is 170° and the conversion angle is 30°, the calculated coordinate angle of the camera is 140°. It can be seen that the coordinate angles of the camera converted respectively according to the position of person (a) and the position of person (b) are beyond the preset angle range of the camera.
  • Since the viewing angle range of the camera is between 60° and 75°, when the camera is rotated to the 0° position or the 120° position, the viewing angle of the camera can cover the region between the 0° position of the preset angle range and the 0° position of the sound source angle range, or between the 120° position of the preset angle range and the 180° position of the sound source angle range.
  • If the character's position is within the 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range, or within the 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range, then, in order to capture images including the character, the camera's shooting angle is adjusted according to the position corresponding to the minimum or maximum value of the camera's preset angle range.
  • the controller is further configured to perform the following step: when the coordinate angle of the camera converted from the character's sound source angle information is beyond the preset angle range of the camera, determine the target rotation direction and target rotation angle of the camera according to the angle difference between the current shooting angle of the camera and the minimum or maximum value of the preset angle range.
  • person (a) is located within the 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range; that is, the coordinate angle converted from the sound source angle information of person (a) is smaller than the minimum value (0°) of the camera's preset angle range.
  • the current shooting angle of the camera is 50°.
  • the angle difference is -50°
  • the target rotation direction of the camera is determined to be leftward
  • the target rotation angle is 50° .
  • the center line (a) of the viewing angle of the camera coincides with the 0° line of the camera.
  • the sound source angle corresponding to the sound source angle information of the person (b) is 170°
  • the current shooting angle of the camera is 50°.
  • the angle difference is 70°
  • the target rotation direction of the camera is determined to be rightward
  • the target rotation angle is 70°.
  • the center line (b) of the viewing angle of the camera coincides with the 120° line of the camera.
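  • The fallback for out-of-range coordinate angles can be sketched as a simple clamp to the preset range, after which the rotation command of S132/S133 is applied; the function name is illustrative, and the two examples above (person (a) and person (b)) are reproduced in the comments.

```python
def clamp_to_preset_range(coordinate_angle_deg, preset_range=(0.0, 120.0)):
    """Clamp an out-of-range camera coordinate angle to the preset range.

    Sketch of the fallback described above: when the converted coordinate
    angle falls outside the camera's preset angle range, the camera is rotated
    to the minimum or maximum of the range instead, and its viewing angle
    coverage is relied on to capture the speaker.
    """
    low, high = preset_range
    return min(max(coordinate_angle_deg, low), high)

# Person (a): coordinate angle -10 deg -> camera 0 deg; with a current shooting
# angle of 50 deg this is a 50 deg rotation to the left.
assert clamp_to_preset_range(-10.0) == 0.0
# Person (b): coordinate angle 140 deg -> camera 120 deg; with a current shooting
# angle of 50 deg this is a 70 deg rotation to the right.
assert clamp_to_preset_range(140.0) == 120.0
```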
  • Even in this case, the display device provided by the embodiment of the present application can still rotate the camera, according to the position of the character, to the position of the minimum or maximum value of the preset angle range and, relying on the viewing angle coverage of the camera, capture an image containing the character.
  • It can be seen that, in the display device provided by the embodiment of the present application, the camera can be rotated within a preset angle range; the controller is configured to obtain the character sound source information collected by the sound collector and perform sound source identification on it to determine the sound source angle information representing the azimuth angle of the character's position; to determine, based on the current shooting angle of the camera and the sound source angle information, the target rotation direction and target rotation angle of the camera; and to adjust, according to the target rotation direction and target rotation angle, the shooting angle of the camera so that the shooting area of the camera faces the position where the character is speaking.
  • the display device provided by the present application can trigger the rotation of the camera by using the sound source information of the person, and can automatically identify the real-time position of the user and adjust the shooting angle of the camera, so that the camera can always capture images containing the portrait.
  • When adjusting the shooting angle of the camera, the display device provided by the foregoing embodiment performs the adjustment in the horizontal direction based on the sound source information generated when the character interacts with the display device, so that the character's portrait can appear in the shooting area of the camera and images including the portrait can be captured.
  • After the display device adjusts the shooting angle of the camera, when shooting a portrait, the center line of the camera's viewing angle may still not be aligned with the person, so the portrait in the image captured by the camera is not located in the center of the image and deviates to one side, affecting the visual effect. Therefore, after adjusting the shooting angle of the camera to capture the portrait, the display device can also locate the position of the portrait through automatic focusing, so as to display the portrait in the central area of the image.
  • Since the character may be in a standing posture or a sitting posture when interacting with the display device, there are different height gaps between the character's face and the camera. Therefore, after the shooting angle of the camera is adjusted using the character's sound source information, the shooting area of the camera may be located above or below the character's head, which will cause the camera to fail to completely capture the character's portrait.
  • FIG. 17 exemplarily shows another scene diagram in which the camera rotates within a preset angle range according to some embodiments.
  • the camera can be rotated horizontally as well as vertically. Therefore, the preset angle range of the camera includes 0 to 120° in the horizontal direction and 0 to 105° in the vertical direction.
  • FIG. 17 exemplarily shows the rotation angles of the camera in the vertical direction: pitch 0°, pitch 90°, and pitch 105°; and the rotation angles of the camera in the horizontal direction: horizontal 0°, horizontal 60°, and horizontal 120°.
  • after adjusting the shooting angle of the camera based on the sound source information so that the portrait is included, the display device provided by the embodiment of the present application also accurately recognizes the position information of the person by detecting the camera image, calculates the deviation between the person's portrait and the image center of the camera, and fine-tunes the shooting angle of the camera again in the horizontal and vertical directions, so that the portrait is in the center of the image captured by the camera and the person in the displayed image is centered.
  • FIG. 18 exemplarily shows a flowchart of a camera control method according to some embodiments
  • FIG. 19 exemplarily shows an overall data flow diagram of a camera control method according to some embodiments.
  • to this end, an image detection method is used: the portrait of the person being photographed is identified in the image, automatic focusing and positioning are performed, and the shooting angle of the camera is adjusted so that the portrait is displayed in the central location of the image.
  • when the controller performs fine-tuning of the camera, it acquires the specified image captured by the camera in real time; the specified image includes the portrait of the person in the shooting area of the camera.
  • specifically, the controller needs to obtain the shooting parameters of the camera after the shooting angle has been adjusted, as well as the captured specified image in which the person is located in the shooting area of the camera.
  • the shooting parameters of the camera include the horizontal viewing angle of the camera, the horizontal width of the image, the vertical viewing angle of the camera, and the vertical height of the image.
  • the horizontal viewing angle of the camera means that the preset angle range of the camera is 0 to 120° in the horizontal direction
  • the vertical viewing angle of the camera means that the preset angle range of the camera is 0 to 105° in the vertical direction.
  • the image horizontal width and image vertical height are related to the resolution of the camera. If the camera supports 1080P image preview, the image horizontal width is 1920 pixels and the image vertical height is 1080 pixels.
  • in order to perform positioning and focusing display based on the portrait, the controller recognizes the specified image collected by the camera, identifies the portrait in the image, and obtains the position of the head area as the portrait area position, so that the portrait can subsequently be displayed accurately in the central area of the specified image.
  • the specified image captured by the camera can be simultaneously displayed on the monitor for preview, and the portrait area position can be marked in the specified image in the form of a face frame; the face frame is then also shown in the specified image displayed on the monitor.
  • a face frame is a rectangular or square frame that encloses the head and/or a small portion of the body of a portrait.
  • the specified image may include portraits of multiple people, so when determining the position of the portrait area, the portraits of multiple people need to be considered at the same time.
  • the controller is further configured to:
  • Step 221 Perform identification processing on the designated image to obtain the position information of the head region corresponding to at least one person.
  • Step 222 Calculate the total area information of the head area position information corresponding to the at least one person, and use the position corresponding to the total area information as the portrait area position; the portrait area position refers to the total area including the head images of the at least one person.
  • the position information of the head area refers to the position information of the area framed by the face frame, which can exist in the form of coordinates. There is a corresponding face frame on the portrait of each person, and the portrait and the face frame are in a one-to-one relationship.
  • the total face frame refers to the smallest rectangular area that encloses the total area framed by the multiple face frames.
  • the position of the portrait region corresponding to the total face frame includes multiple head images of the people.
  • when determining the portrait area position, the position of the person's head located at the top of the specified image can be used as the top boundary point of the total face frame, and the position of the person's head located at the bottom of the specified image can be used as the bottom boundary point of the total face frame;
  • the position of the person's head located at the far left of the specified image is used as the left boundary point of the total face frame, and the position of the person's head located at the far right of the specified image is used as the right boundary point of the total face frame;
  • through the four boundary points, lines parallel to the corresponding sides of the display are drawn; the four lines are perpendicular to each other in pairs, and their intersections form a rectangular total face frame.
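  • As an illustrative sketch only (the class name and the use of java.awt.Rectangle are assumptions, not the patent's implementation), the total face frame described above can be obtained by taking the smallest rectangle that encloses all detected face frames:
        import java.awt.Rectangle;
        import java.util.List;

        // Illustrative sketch: merge the face frames of several detected persons into one
        // "total face frame", i.e. the smallest rectangle containing all head areas.
        public final class TotalFaceFrame {

            static Rectangle merge(List<Rectangle> faceFrames) {
                int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
                int right = Integer.MIN_VALUE, bottom = Integer.MIN_VALUE;
                for (Rectangle f : faceFrames) {
                    left = Math.min(left, f.x);                // leftmost head position
                    top = Math.min(top, f.y);                  // topmost head position
                    right = Math.max(right, f.x + f.width);    // rightmost head position
                    bottom = Math.max(bottom, f.y + f.height); // bottommost head position
                }
                return new Rectangle(left, top, right - left, bottom - top);
            }
        }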
  • the portrait may be offset in the horizontal direction and may also be offset in the vertical direction. Therefore, in order to fine-tune the camera accurately, the adjustment can be made in both the horizontal and vertical directions, and the azimuth distance includes a horizontal distance and a vertical distance.
  • FIG. 20 exemplarily shows a flowchart of a method for calculating azimuth distance according to some embodiments
  • FIG. 21 exemplarily shows a schematic diagram of calculating azimuth distance according to some embodiments.
  • the controller is further configured to:
  • when calculating the azimuth distance, the calculation may be performed according to the coordinate position of the portrait area position and the coordinate position of the specified image.
  • after the controller recognizes and detects the specified image, it can obtain the coordinate information of each vertex of the portrait area position, that is, the pixel coordinate values of the upper-left, upper-right, lower-left, and lower-right vertices.
  • the image center P 0 of the specified image is the center point of the image captured by the camera, that is, the center point of the display. Since the specified image is captured by the camera, the size of the specified image is the same as the resolution of the camera, that is, if the resolution of the camera is constant, the width and height of the image captured by the camera are also constant. Determine the image center coordinate information of the specified image according to the resolution of the camera.
  • the horizontal width of the image is 1920 pixels
  • the vertical height of the image is 1080 pixels.
  • the horizontal coordinate of the image center of the specified image is 960 pixels
  • the vertical coordinate is 540 pixels, that is, the image center P0 coordinate information (x0, y0) of the specified image is (960, 540).
  • the coordinate position information of the portrait area position can be determined from the pixel coordinate values of its four vertices; based on this, the horizontal and vertical coordinates of the area center P1 of the portrait area position can be calculated.
  • if the coordinate information of the portrait area position is: upper-left vertex A (200, 100), upper-right vertex B (500, 100), lower-left vertex C (200, 400), and lower-right vertex D (500, 400), then the calculated area center P1 of the portrait area position has coordinates (x1, y1) = (350, 250).
  • S233 Calculate the difference between the horizontal coordinate of the area center of the portrait area position and the image horizontal coordinate of the designated image, and obtain the horizontal distance between the area center of the portrait area position and the image center of the designated image.
  • the horizontal and vertical distances need to be calculated separately. When calculating the horizontal distance D, it is determined by the difference between the horizontal coordinate x1 of the area center of the portrait area position and the image horizontal coordinate x0 of the specified image; when calculating the vertical distance H, it is determined by the difference between the vertical coordinate y1 of the area center of the portrait area position and the image vertical coordinate y0 of the specified image.
  • both the horizontal distance and the vertical distance are expressed in pixel coordinate values.
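  • The following minimal Java sketch (an illustration under the coordinate conventions above, not the patent's code) computes the area center P1 from the four vertices and the horizontal and vertical azimuth distances relative to the image center P0:
        // Minimal sketch of the azimuth-distance calculation using the example values above.
        public final class AzimuthDistance {

            public static void main(String[] args) {
                int imageWidth = 1920, imageHeight = 1080;     // 1080P preview
                int x0 = imageWidth / 2, y0 = imageHeight / 2; // image center P0 = (960, 540)

                // Vertices of the portrait area position from the example: A(200,100) ... D(500,400).
                int left = 200, top = 100, right = 500, bottom = 400;
                int x1 = (left + right) / 2;                   // area center horizontal coordinate: 350
                int y1 = (top + bottom) / 2;                   // area center vertical coordinate: 250

                int horizontalDistance = x1 - x0;              // D in pixels (negative: portrait left of center)
                int verticalDistance = y1 - y0;                // H in pixels (negative: portrait above center)
                System.out.println("D = " + horizontalDistance + ", H = " + verticalDistance);
            }
        }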
  • an azimuth setting threshold can be preset to determine whether there is a distance difference between the area center of the portrait area position and the image center of the specified image, and thereby determine whether the portrait is located in the central area of the specified image.
  • if the azimuth distance exceeds the threshold, it means that the portrait is not located in the central area of the specified image, the camera is not shooting in a focused manner, and the display position of the portrait in the specified image is deviated; therefore, the shooting angle of the camera needs to be controlled and adjusted to bring the portrait into the central area of the specified image.
  • in order to place the portrait captured by the camera in the central area of the specified image, the target adjustment angle of the camera needs to be determined first. Since the camera can be rotated in both the horizontal and vertical directions, the target adjustment angle of the camera includes a target horizontal adjustment angle and a target vertical adjustment angle.
  • in some cases, the image horizontal coordinate of the specified image is the same as the horizontal coordinate of the area center of the portrait area position, while the image vertical coordinate of the specified image differs from the vertical coordinate of the area center, that is, the shooting angle of the camera faces the person in the horizontal direction but deviates in the vertical direction; at this time, there is no need to adjust the shooting angle in the horizontal direction, and only the shooting angle in the vertical direction needs to be adjusted.
  • in other cases, the image vertical coordinate of the specified image is the same as the vertical coordinate of the area center of the portrait area position, while the image horizontal coordinate differs from the horizontal coordinate of the area center, that is, the shooting angle of the camera faces the person in the vertical direction but deviates in the horizontal direction; at this time, there is no need to adjust the shooting angle in the vertical direction, and only the shooting angle in the horizontal direction needs to be adjusted.
  • therefore, the azimuth setting threshold used for the determination includes a horizontal setting threshold and a vertical setting threshold.
  • the shooting parameters of the camera include the viewing angle of the camera and the width of the image.
  • the shooting parameters of the camera include the horizontal viewing angle of the camera, the horizontal width of the image, the vertical viewing angle of the camera, and the vertical height of the image.
  • the horizontal viewing angle of the camera ranges from 0 to 120°
  • the vertical viewing angle of the camera ranges from 0 to 105°. If the camera supports 1080P image preview, the horizontal width of the image is 1920 pixels, and the vertical height of the image is 1080 pixels.
  • the azimuth setting threshold is the horizontal setting threshold
  • the azimuth distance between the area center of the portrait area position and the image center of the designated image is the horizontal direction distance
  • the shooting parameters of the camera include the horizontal viewing angle of the camera and the horizontal width of the image.
  • the controller calculates the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera, and is further configured to: if the horizontal distance is greater than the horizontal setting threshold, calculate the target horizontal adjustment angle of the camera according to the horizontal distance, the horizontal viewing angle of the camera, and the horizontal width of the image.
  • if the horizontal distance between the area center of the portrait area position and the image center of the specified image is greater than or equal to the horizontal setting threshold, it means that the portrait captured by the camera deviates from the center position of the specified image in the horizontal direction, so there is a certain distance between the area center of the portrait area position and the image center of the specified image. Therefore, in order to place the portrait in the center of the specified image, the shooting angle of the camera needs to be adjusted: according to the horizontal distance D between the area center of the portrait area position and the image center of the specified image, the horizontal viewing angle α of the camera, and the horizontal width IW of the image, the target horizontal adjustment angle θ1 of the camera is calculated.
  • FIG. 22 exemplarily shows a schematic diagram of a horizontal viewing angle of a camera according to some embodiments
  • FIG. 23 exemplarily shows a schematic diagram of calculating a target horizontal adjustment angle according to some embodiments.
  • as shown in FIG. 22 and FIG. 23, the image horizontal coordinate x0 of the specified image is 960, the horizontal coordinate of the area center of the portrait area position is x1, and the horizontal viewing angle of the camera is α.
  • the target horizontal adjustment angle is calculated as θ1 = atan(2*Math.abs(x0-x1)*tan(α/2)/IW).
  • if the person is located on the left side facing the display, the area center of the portrait area position is located to the left of the image center of the specified image, i.e. x0 > x1 (the state shown in FIG. 21); if the person is located on the right side facing the display, the area center of the portrait area position is located to the right of the image center of the specified image, i.e. x0 < x1 (the state shown in FIG. 23). It can be seen that a negative value may occur when calculating the horizontal distance D between the area center of the portrait area position and the image center of the specified image. Therefore, in order to obtain the target horizontal adjustment angle of the camera accurately, the absolute value of the difference x0 - x1 is used in the calculation.
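  • As a rough numerical illustration of the formula (using the example coordinates assumed earlier, not values stated in the patent): with Math.abs(x0 - x1) = |960 - 350| = 610 pixels, α = 120° and IW = 1920 pixels, θ1 = atan(2 × 610 × tan(60°) / 1920) ≈ atan(1.10) ≈ 47.7°, so the camera would be turned by roughly 48° toward the portrait.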
  • the azimuth setting threshold is the vertical setting threshold
  • the azimuth distance between the area center of the portrait area position and the image center of the designated image is the vertical direction distance
  • the shooting parameters of the camera include the vertical viewing angle of the camera and the vertical height of the image.
  • when the controller determines that the azimuth distance exceeds the azimuth setting threshold, it calculates the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera, and is further configured to: if the vertical distance is greater than the vertical setting threshold, calculate the target vertical adjustment angle of the camera according to the vertical distance, the vertical viewing angle of the camera, and the vertical height of the image.
  • if the vertical distance between the area center of the portrait area position and the image center of the specified image is greater than or equal to the vertical setting threshold, it means that the portrait captured by the camera deviates from the center position of the specified image in the vertical direction, so there is a certain distance between the area center of the portrait area position and the image center of the specified image. Therefore, in order to place the portrait in the center of the specified image, the shooting angle of the camera needs to be adjusted: according to the vertical distance H between the area center of the portrait area position and the image center of the specified image, the vertical viewing angle β of the camera, and the vertical height IH of the image, the target vertical adjustment angle θ2 of the camera is calculated.
  • FIG. 24 exemplarily shows a schematic diagram of a vertical viewing angle of a camera according to some embodiments
  • FIG. 25 exemplarily shows a schematic diagram of calculating a vertical adjustment angle of a target according to some embodiments.
  • the image vertical coordinate y0 of the specified image is 540;
  • the vertical coordinate of the area center of the portrait area position is y1;
  • the vertical viewing angle of the camera is β;
  • the target vertical adjustment angle is calculated as θ2 = atan(2*Math.abs(y0-y1)*tan(β/2)/IH).
  • in some cases, the area center of the portrait area position is located above the image center of the specified image, that is, y0 > y1 (the state shown in FIG. 21).
  • in other cases, the area center of the portrait area position is located below the image center of the specified image, that is, y0 < y1 (the state shown in FIG. 25). It can be seen that a negative value may occur when calculating the vertical distance H between the area center of the portrait area position and the image center of the specified image. Therefore, in order to obtain the target vertical adjustment angle of the camera accurately, the absolute value of the difference y0 - y1 is used in the calculation.
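  • The two formulas can be put together in a short Java sketch (an illustration only; the constants and class name are assumptions based on the parameters given above, and the patent itself only states the formulas):
        // Illustrative sketch: target horizontal and vertical adjustment angles from the
        // azimuth distances and the shooting parameters described above.
        public final class TargetAdjustmentAngle {

            static final double HORIZONTAL_FOV = 120.0; // horizontal viewing angle of the camera, in degrees
            static final double VERTICAL_FOV = 105.0;   // vertical viewing angle of the camera, in degrees
            static final int IMAGE_WIDTH = 1920;        // IW, in pixels
            static final int IMAGE_HEIGHT = 1080;       // IH, in pixels

            /** theta1 = atan(2 * |x0 - x1| * tan(alpha / 2) / IW), returned in degrees. */
            static double horizontalAngle(int x0, int x1) {
                double alpha = Math.toRadians(HORIZONTAL_FOV);
                return Math.toDegrees(Math.atan(2.0 * Math.abs(x0 - x1) * Math.tan(alpha / 2) / IMAGE_WIDTH));
            }

            /** theta2 = atan(2 * |y0 - y1| * tan(beta / 2) / IH), returned in degrees. */
            static double verticalAngle(int y0, int y1) {
                double beta = Math.toRadians(VERTICAL_FOV);
                return Math.toDegrees(Math.atan(2.0 * Math.abs(y0 - y1) * Math.tan(beta / 2) / IMAGE_HEIGHT));
            }

            public static void main(String[] args) {
                System.out.println(horizontalAngle(960, 350)); // ~47.7 deg for the earlier example
                System.out.println(verticalAngle(540, 250));   // ~35.0 deg for the earlier example
            }
        }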
  • when the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the specified image exceeds the horizontal setting threshold, and/or the azimuth distance in the vertical direction exceeds the vertical setting threshold, the target horizontal adjustment angle and/or the target vertical adjustment angle of the camera are calculated according to the azimuth distance and the shooting parameters of the camera.
  • the controller controls the shooting angle of the camera to adjust according to the target horizontal adjustment angle and/or the target vertical adjustment angle, which can ensure that the portrait captured by the camera is located in the center area of the designated image.
  • the controller can send a control command to the motor control service, and the motor control service responds to the control command to control the camera to adjust the shooting angle.
  • the portrait can be placed in the center area of the specified image captured by the camera.
  • in the direction facing the monitor, if the area center of the portrait area position is horizontally to the left of the image center of the specified image, the camera is rotated to the right according to the target horizontal adjustment angle; otherwise, it is rotated to the left. If the area center of the portrait area position is above the image center of the specified image, the camera is rotated downward according to the target vertical adjustment angle; otherwise, it is rotated upward.
  • when the controller controls the camera to adjust the shooting angle, if the rotation speed is too fast, the image will shake, and the camera will not stop steadily when it rotates to the specified angle. Therefore, in order to obtain a stable image, the rotation direction and rotation speed of the camera need to be determined accurately when adjusting the shooting angle.
  • the controller is further configured to:
  • Step 251 Determine the target rotation speed and target adjustment direction of the camera according to the target adjustment angle of the camera.
  • the rotational speed of the camera is associated with the target adjustment angle.
  • the logic value of the maximum speed and the logic value of the minimum speed are set, so that the camera rotates within the speed range corresponding to the logic value of the maximum speed and the logic value of the minimum speed.
  • the default maximum speed logic value is 100, which corresponds to 100°/s;
  • the minimum speed logic value is 10, which corresponds to 10°/s.
  • the maximum rotational speed logic value is used as the target rotational speed of the camera. If the target adjustment angle of the camera is greater than or equal to the maximum rotational speed logic value of 100, the target rotation speed of the camera is set to 100°/s.
  • the minimum rotational speed logic value is used as the target rotational speed of the camera. If the target adjustment angle of the camera is less than or equal to the minimum rotation speed logic value of 10, the target rotation speed of the camera is set to 10°/s.
  • the value of the target adjustment angle is used as the target rotation speed of the camera. If the target adjustment angle of the camera is between 100 and 10, the actual target adjustment angle is set as the target rotation speed of the camera. For example, if the target adjustment angle of the camera is 30, set the target rotation speed of the camera to 30°/s.
  • the corresponding camera rotation speed is set before the rotation, and then the rotation is performed. Therefore, when the adjustment angle is small, the rotation speed is relatively gentle; when the adjustment angle is large, the rotation is performed at a faster speed, so that the camera can adjust the shooting angle in a timely and stable manner and the portrait is located in the central area of the specified image.
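  • A compact sketch of the speed rule described above (illustrative only; the class and method names are hypothetical): the target rotation speed equals the target adjustment angle clamped between the minimum and maximum speed logic values.
        // Sketch of the rotation-speed rule: clamp the target adjustment angle to [10, 100] deg/s.
        public final class RotationSpeed {

            static int targetRotationSpeed(double targetAdjustmentAngle) {
                final int MAX_SPEED = 100; // maximum speed logic value, 100 deg/s
                final int MIN_SPEED = 10;  // minimum speed logic value, 10 deg/s
                int angle = (int) Math.round(Math.abs(targetAdjustmentAngle));
                return Math.max(MIN_SPEED, Math.min(MAX_SPEED, angle)); // e.g. 30 deg -> 30 deg/s
            }
        }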
  • the target adjustment direction of the camera can be determined according to whether the azimuth distance between the area center of the portrait area position and the image center of the specified image is positive or negative.
  • if, in order for the portrait to be located in the central area of the specified image captured by the camera, the shooting angle of the camera needs to be adjusted to the right, the target adjustment direction of the camera is determined to be rightward rotation.
  • similarly, if the shooting angle of the camera needs to be adjusted downward, the target adjustment direction of the camera is determined to be downward rotation.
  • Step 252 Adjust the shooting angle of the camera according to the target adjustment angle, the target adjustment direction and the target rotation speed.
  • in this way, the camera can be controlled to perform the corresponding rotation to adjust the shooting angle and realize focus positioning on the person's position, so that the portrait captured by the camera is located in the center of the specified image and displayed in the central area of the display.
  • when controlling the camera, based on the solution provided in the foregoing embodiment of roughly adjusting the shooting angle of the camera through the sound source information of the person, the image captured by the camera can be recognized and detected again to adjust the shooting angle of the camera more accurately and effectively locate the specific position of the person, so that the images captured by the camera have high portrait detection accuracy.
  • the display device provided by this embodiment comprehensively utilizes sound source localization and camera image analysis. Taking advantage of the strong spatial perception ability of sound source localization, the approximate position of the person is confirmed first and the camera is driven toward the sound source; at the same time, taking advantage of the high accuracy of camera image analysis, person detection is performed on the captured image to determine the specific position and the camera is driven to perform fine-tuning, so as to achieve precise positioning. In this way, the person captured by the camera can be displayed in the central area of the specified image, and a focused display is realized on the display.
  • the display device provided in this embodiment is suitable for scenes such as video calls and fitness; when the person's standing position is not within the default shooting area of the camera, it can quickly and accurately locate and focus on the person.
  • the portrait is displayed in the center area of the display by fine-adjusting the shooting angle of the camera again.
  • if the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the specified image does not exceed the azimuth setting threshold, it means that the display of the portrait in the specified image does not deviate; reflected on the display, the portrait can be displayed in the center of the display. In this case, there is no need to fine-tune the shooting angle of the camera.
  • the display device provided by the embodiment of the present application can perform a portrait focus and magnification display on the position of the portrait area.
  • FIG. 26 exemplarily shows a flowchart of a method for focusing and zooming in on a portrait display according to some embodiments. Specifically, referring to FIG. 26, based on the display device provided by the foregoing embodiment, the controller is further configured to:
  • the controller obtains the specified images for a preset number of frames.
  • the preset number of frames may be 20 frames.
  • the controller recognizes the images of the preset number of frames; if it determines that the portrait area position is significantly smaller than the entire specified image area, it automatically focuses on and enlarges the area where the person's head is located, to adapt to the needs arising from the distance between the person and the display device.
  • the preset ratio can be set to one third. If the size of the portrait area position is less than or equal to one third of the specified image, it means that the portrait area position is displayed too small and needs to be focused and enlarged.
  • the ratio occupied by the portrait area position can be calculated by pixel area (number of pixels).
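  • A minimal sketch of this "one third" check, assuming the pixel-area comparison described above (the class and method names are hypothetical):
        // Hypothetical check: does the portrait area position occupy no more than one third of the image?
        public final class FocusZoomTrigger {

            static boolean needsFocusZoom(int portraitWidth, int portraitHeight,
                                          int imageWidth, int imageHeight) {
                long portraitPixels = (long) portraitWidth * portraitHeight; // pixel area of the portrait area position
                long imagePixels = (long) imageWidth * imageHeight;          // pixel area of the specified image
                return portraitPixels * 3 <= imagePixels;                    // at most one third of the image
            }
        }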
  • the zoom is performed by comparing the position of the portrait area with the aspect ratio of the monitor. Specifically, if the size of the position of the portrait area is smaller than or equal to the preset ratio of the specified image, the controller is further configured to:
  • Step 281 If the size of the portrait area position is smaller than or equal to the preset ratio of the specified image, calculate the aspect ratio value of the display and the aspect ratio value of the portrait area position.
  • FIG. 27 exemplarily shows a schematic diagram of zoomed-in portrait display according to some embodiments.
  • as shown in FIG. 27, when focusing on and zooming in on the portrait area position in a specified image, the operation may be determined by comparing the aspect ratio of the portrait area position with the aspect ratio of the display. Therefore, the aspect ratio of the display and the aspect ratio of the portrait area position need to be calculated separately.
  • the aspect ratio can be calculated according to the pixel coordinate value.
  • the aspect ratio of the display is the ratio of the width and height of the display, and the width and height of the display are the same as the resolution of the camera, that is, if the camera supports 1080P image preview, Then the horizontal width of the image is 1920 pixels, and the vertical height of the image is 1080 pixels, then the width value of the display is 1920 pixels, and the height value is 1080 pixels.
  • the aspect ratio value of the portrait area position refers to the ratio of the width value and the height value of the portrait area position.
  • the portrait area position may include only the position of the person's head, or the position of the person's head and a small portion of the body.
  • the width value and height value of the portrait area position can be determined according to the pixel coordinate values; for the specific method, refer to the method for determining the coordinate information of the portrait area position in the foregoing embodiment, which will not be repeated here.
  • Step 282 If the aspect ratio value of the display is inconsistent with the aspect ratio value of the portrait area position, adjust the aspect ratio value of the portrait area position, and the adjusted aspect ratio value of the portrait area position is the same as the aspect ratio value of the display.
  • the specified image captured by the camera may include the portraits of multiple persons, and the portrait area position enclosing those portraits may be a rectangle or a square.
  • the aspect ratio of the portrait area needs to be the same as the aspect ratio of the display.
  • if the aspect ratio value of the display is inconsistent with the aspect ratio value of the portrait area position, as shown in (a) in FIG. 27, the aspect ratio value of the portrait area position is adjusted so that, after adjustment, the aspect ratio value of the portrait area position is the same as that of the display, as shown in (b) in FIG. 27.
  • there are two cases in which the aspect ratio of the display and the aspect ratio of the portrait area position are inconsistent: one is that the aspect ratio of the portrait area position is greater than the aspect ratio of the display, and the other is that the aspect ratio of the portrait area position is smaller than the aspect ratio of the display.
  • if the aspect ratio value of the portrait area position is greater than the aspect ratio value of the display, the height value of the portrait area position is adjusted so that the aspect ratio formed by the original width value of the portrait area position and the adjusted height value is the same as the aspect ratio value of the display.
  • when the aspect ratio of the portrait area position is greater than the aspect ratio of the display, in order to keep the size of the portrait area position in proportion to the display, the upper and lower sides should be expanded around the area center point of the portrait area position, increasing the height value of the portrait area position.
  • the expansion size of the upper and lower sides is (IH*pW/IW - pH)/2, where IW is the width value of the display, IH is the height value of the display, pW is the width value of the portrait area position, and pH is the height value of the portrait area position.
  • if the aspect ratio value of the portrait area position is smaller than the aspect ratio value of the display, the width value of the portrait area position is adjusted so that the aspect ratio formed by the adjusted width value and the original height value of the portrait area position is the same as the aspect ratio value of the display.
  • in this case, the left and right sides should be expanded around the area center point of the portrait area position, increasing the width value of the portrait area position.
  • the expansion size of the left and right sides is (pH*IW/IH - pW)/2, where IW is the width value of the display, IH is the height value of the display, pW is the width value of the portrait area position, and pH is the height value of the portrait area position.
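  • The two expansion cases can be sketched as follows (an illustration under the formulas above; java.awt.Rectangle and the class name are assumptions, and rounding is added for integer pixel coordinates):
        import java.awt.Rectangle;

        // Sketch of the aspect-ratio adjustment: expand the portrait area position around its center
        // so that its aspect ratio matches the display (IW x IH), using the expansion sizes above.
        public final class AspectRatioAdjust {

            static Rectangle adjust(Rectangle portrait, int IW, int IH) {
                double pW = portrait.width, pH = portrait.height;
                if (pW / pH > (double) IW / IH) {
                    // Portrait area is wider than the display ratio: expand the upper and lower sides.
                    double expand = (IH * pW / IW - pH) / 2.0;
                    return new Rectangle(portrait.x, (int) Math.round(portrait.y - expand),
                                         portrait.width, (int) Math.round(pH + 2 * expand));
                } else if (pW / pH < (double) IW / IH) {
                    // Portrait area is taller than the display ratio: expand the left and right sides.
                    double expand = (pH * IW / IH - pW) / 2.0;
                    return new Rectangle((int) Math.round(portrait.x - expand), portrait.y,
                                         (int) Math.round(pW + 2 * expand), portrait.height);
                }
                return portrait; // ratios already match
            }
        }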
  • Step 283 Determine the target enlarged area for the position of the portrait area according to the position of the portrait area adjusted by the aspect ratio.
  • since the portrait area position includes only the position of the person's head, or the head and a small portion of the body, directly enlarging it to full-screen display would cause distortion. Therefore, in order to prevent the image from being severely distorted by an excessive enlargement ratio when a small portrait area position is enlarged to full-screen display, a target enlargement area needs to be determined.
  • the target enlargement area is the area to be displayed in the display, and the target enlargement area includes the position of the portrait area and the surrounding area. In some embodiments, the target enlargement area is about 1.5 times the position of the portrait area.
  • the portrait area position adjusted based on the aspect ratio is enlarged by 1.5 times to obtain the target enlargement area, such as the dotted rectangular area shown in (c) in FIG. 27; the image corresponding to the target enlargement area can then be enlarged to full-screen display without causing image distortion.
  • Step 284 focus and enlarge the portrait corresponding to the target enlargement area, and display it on the display in full screen.
  • since the target enlargement area is obtained by enlarging the aspect-ratio-adjusted portrait area position proportionally around its area center point, part of the target enlargement area may exceed the image boundary when the portrait is close to the edge of the image, and the image cannot then be displayed on the monitor as it is. In this case, the target enlargement area is adjusted according to the part beyond the boundary, so that this part coincides with the corresponding edge of the specified image.
  • the controller is further configured to perform the following steps when performing focusing and zooming in on the portrait corresponding to the target zoom area and displaying it on the display in full screen:
  • Step 2841 Obtain the coordinates of the center point of the target zoom-in area.
  • the target enlargement area is obtained by enlarging, according to the magnification ratio, an area centered on the area center of the portrait area position; therefore, the center of the target enlargement area is the same as the area center of the portrait area position, and the coordinates of the center point of the target enlargement area are the area center coordinates of the portrait area position.
  • Step 2842 Calculate the first distance between the center point coordinates and each border of the target enlargement area, and the second distance between the center point coordinates and each border of the display, where each border of the target enlargement area corresponds in position to a border of the display.
  • a certain side of the obtained target enlarged area may exceed a certain boundary of the display.
  • for example, the left boundary of the target enlargement area may exceed the left boundary of the display.
  • calculate the first distance L11 between the center point coordinates of the target enlargement area and the left border of the target enlargement area, the first distance L12 between the center point coordinates and the upper border of the target enlargement area, the first distance L13 between the center point coordinates and the right border of the target enlargement area, and the first distance L14 between the center point coordinates and the lower border of the target enlargement area;
  • calculate the second distance L21 between the center point coordinates of the target enlargement area and the left border of the display, the second distance L22 between the center point coordinates and the upper border of the display, the second distance L23 between the center point coordinates and the right border of the display, and the second distance L24 between the center point coordinates and the lower border of the display.
  • Step 2843 If the distance difference between the second distance and the first distance is less than zero, adjust the position of the target enlarged area according to the distance difference.
  • the judgment is made border by border, based on the difference between the second distance corresponding to a border of the display and the first distance corresponding to the border of the target enlargement area on the same side.
  • the entire target enlarged area is moved in the opposite direction of the side of the target enlarged area beyond the display, so that the side of the target enlarged area beyond the display coincides with the side of the display. For example, as shown in FIG. 27(d), if the left border of the target enlargement area exceeds the left border of the display, the entire target enlargement area is moved to the right so that the left border of the target enlargement area coincides with the left border of the display.
  • similarly, if the right border of the target enlargement area exceeds the right border of the display, the entire target enlargement area is shifted to the left so that the right border coincides with the right border of the display; if the upper boundary of the target enlargement area exceeds the upper boundary of the display, the entire target enlargement area is shifted downward so that the upper boundaries coincide; and if the lower boundary of the target enlargement area exceeds the lower boundary of the display, the entire target enlargement area is shifted upward so that the lower boundary coincides with the lower boundary of the display.
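  • The boundary correction can be sketched as a simple shift that keeps the target enlargement area inside the image (illustrative only; it assumes a top-left coordinate origin and that the target enlargement area is no larger than the specified image/display):
        import java.awt.Rectangle;

        // Sketch of the boundary correction: if the target enlargement area extends beyond an edge,
        // shift the whole area so that the overflowing side coincides with that edge.
        public final class BoundaryCorrection {

            static Rectangle shiftInside(Rectangle target, int displayWidth, int displayHeight) {
                int x = target.x, y = target.y;
                if (x < 0) x = 0;                               // left border exceeded: move right
                if (y < 0) y = 0;                               // upper border exceeded: move down
                if (x + target.width > displayWidth)            // right border exceeded: move left
                    x = displayWidth - target.width;
                if (y + target.height > displayHeight)          // lower border exceeded: move up
                    y = displayHeight - target.height;
                return new Rectangle(x, y, target.width, target.height);
            }
        }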
  • Step 2844 focus and zoom in on the portrait corresponding to the target zoom-in area whose position has been adjusted, and display it on the display in full screen.
  • when the camera captures the portrait of the person, if the portrait is located in the central area of the display and the person does not change position, the display device does not need to control the camera to adjust the shooting angle and continues to shoot the portrait at the current shooting angle. After the specified images of the preset number of frames have been accumulated with no change in the person's position, if the proportion of the portrait area position in the specified image is small, the portrait area is displayed with portrait focus and magnification, so that the image corresponding to the portrait area position is displayed full screen on the monitor.
  • if the person changes position, the display device needs to re-determine the area center of the portrait area position and then control the camera to adjust the shooting angle, to ensure that the portrait is always in the center of the specified image and displayed in the central area of the monitor.
  • at this time, the display may be showing the portrait area position in focused and magnified form. Therefore, in order to ensure the accuracy of the image-detection-and-recognition-based judgment used to adjust the shooting angle of the camera, the portrait area position currently displayed in focused magnification on the monitor needs to be restored to its original state before the subsequent steps of calculating the target adjustment angle of the camera are performed.
  • before performing the calculation of the target adjustment angle of the camera, the controller is further configured to perform the following steps:
  • Step 0241 Determine whether the specified image has been subjected to a portrait focus and zoom-in display operation.
  • Step 0242 If the specified image has not been subjected to the portrait focus and zoom-in display operation, execute the step of calculating the target adjustment angle of the camera.
  • Step 0243 If the specified image has been subjected to the portrait focus and zoom-in display operation, restore the display of the specified image, and then execute the step of calculating the target adjustment angle of the camera.
  • after the controller performs the portrait focus and magnification display operation on the portrait area position in the specified image, a magnification mark is generated on the current specified image. If the controller detects a magnification mark on the current specified image, it can determine that the specified image has been subjected to the portrait focus and zoom-in display operation; if no magnification mark is detected, it determines that the specified image has not been subjected to that operation.
  • if the controller determines that the specified image has not been subjected to the portrait focus and magnification display operation, it can directly perform image detection and analysis on the specified image, and continue with the subsequent steps of calculating the target adjustment angle of the camera.
  • if the specified image has been zoomed in, this will affect the accuracy of image detection and analysis. Therefore, the specified image needs to be restored to its original state and the portrait focus and zoom-in display operation cancelled; at this time, the specified image in its original state is displayed on the display, and the subsequent steps of calculating the target adjustment angle of the camera are then continued.
  • the controller performs recognition processing on the specified image collected by the camera, obtains the portrait area position, and calculates the azimuth distance between the area center of the portrait area position and the image center of the specified image; if the azimuth distance exceeds the azimuth setting threshold, it calculates the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera, and adjusts the shooting angle of the camera based on the target adjustment angle, so that the portrait of the person is located in the central area of the specified image captured by the camera.
  • the display device provided by the embodiment of the present application accurately recognizes the position information of the person through detection of the camera image, and automatically focuses on and locates the position of the portrait, so as to finely adjust the shooting angle of the camera in the horizontal and vertical directions, placing the portrait in the center of the image captured by the camera and thus ensuring that the person in the displayed image is centered.
  • FIG. 18 exemplarily shows a flowchart of a control method of a camera according to some embodiments.
  • the present application also provides a method for controlling a camera, the method comprising: performing recognition processing on a specified image collected by the camera to obtain a portrait area position, and calculating the azimuth distance between the area center of the portrait area position and the image center of the specified image; if the azimuth distance exceeds an azimuth setting threshold, calculating a target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera; and adjusting the shooting angle of the camera based on the target adjustment angle of the camera, so that the portrait of the person is located in the central area of the specified image collected by the camera.
  • the present application further provides a computer storage medium, wherein the computer storage medium can store a program, and when the program is executed, it can include some or all of the steps in each embodiment of the camera control method provided by the present application.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (English: read-only memory, abbreviated as: ROM) or a random access memory (English: random access memory, abbreviated as: RAM) and the like.
  • the technology in the embodiments of the present application can be implemented by means of software plus a necessary general hardware platform.
  • the technical solutions in the embodiments of the present application can be embodied in the form of software products in essence or in the parts that make contributions to related technologies, and the computer software products can be stored in storage media, such as ROM/RAM, A magnetic disk, an optical disk, etc., includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in various embodiments or some parts of the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed in the present application are a camera control method and a display device. The method comprises: a controller performing recognition processing on a designated image acquired by a camera to obtain a portrait area position, and calculating the azimuth distance between the area center of the portrait area position and the image center of the designated image; if the azimuth distance exceeds an azimuth set threshold, calculating, according to the azimuth distance and shooting parameters of the camera, a target adjustment angle of the camera; and on the basis of the target adjustment angle of the camera, adjusting the shooting angle of the camera, so that the portrait of a person is located in the center area of the designated image acquired by the camera.

Description

A camera control method and display device
This application claims the priority of the application titled "A Camera Control Method and Display Device" filed with the China Patent Office on July 1, 2020, with the application number of 202010628749.6; the entire contents of which are incorporated into this application by reference.
technical field
The present application relates to the technical field of television software, and in particular, to a control method of a camera and a display device.
Background technique
With the rapid development of display devices, the functions of the display devices will become more and more abundant, and the performance of the display devices will become more and more powerful. For example, the display device can implement functions such as network search, IP TV, BBTV, video on demand (VOD), digital music, network news, and network video telephony. When using a display device to implement the network video call function, a camera needs to be installed on the display device to collect user images.
SUMMARY OF THE INVENTION
An embodiment of the present application provides a display device, including:
a camera, the camera is configured to capture a portrait and realize rotation within a preset angle range;
a controller connected to the camera, the controller being configured to:
acquire the shooting parameters of the camera and the captured specified image of the person located in the shooting area of the camera;
perform identification processing on the specified image to obtain a portrait area position corresponding to the person, where the portrait area position refers to an area including the head image of the person;
calculate the azimuth distance between the area center of the portrait area position and the image center of the specified image, the azimuth distance being used to identify the horizontal direction distance and the vertical direction distance;
if the azimuth distance exceeds the azimuth setting threshold, calculate the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera;
based on the target adjustment angle of the camera, adjust the shooting angle of the camera so that the portrait of the person is located in the central area of the specified image captured by the camera.
Description of drawings
In order to illustrate the technical solutions of the present application more clearly, the accompanying drawings that need to be used in the embodiments will be briefly introduced below. Other drawings can also be obtained from these drawings.
FIG. 1 exemplarily shows a schematic diagram of an operation scene between a display device and a control apparatus according to some embodiments;
FIG. 2 exemplarily shows a hardware configuration block diagram of a display device 200 according to some embodiments;
FIG. 3 exemplarily shows a hardware configuration block diagram of the control device 100 according to some embodiments;
FIG. 4 exemplarily shows a schematic diagram of software configuration in the display device 200 according to some embodiments;
FIG. 5 exemplarily shows a schematic diagram of displaying an icon control interface of an application in the display device 200 according to some embodiments;
FIG. 6 exemplarily shows a structural block diagram of a display device according to some embodiments;
FIG. 7 exemplarily shows a schematic diagram of implementing a preset angle range for camera rotation according to some embodiments;
FIG. 8 exemplarily shows a scene graph of camera rotation within a preset angle range according to some embodiments;
FIG. 9 exemplarily shows a schematic diagram of a sound source angle range according to some embodiments;
FIG. 10 exemplarily shows a flowchart of a method for adjusting the shooting angle of a camera according to some embodiments;
FIG. 11 exemplarily shows a flowchart of a wake-up text comparison method according to some embodiments;
FIG. 12 exemplarily shows a flowchart of a method for performing sound source identification on character sound source information according to some embodiments;
FIG. 13 exemplarily shows a flowchart of a method for determining a target rotation direction and a target rotation angle of a camera according to some embodiments;
FIG. 14 exemplarily shows a scene diagram of adjusting the shooting angle of the camera according to some embodiments;
FIG. 15 exemplarily shows another scene diagram of adjusting the shooting angle of the camera according to some embodiments;
FIG. 16 exemplarily shows a scene diagram of the position of a character when speaking according to some embodiments;
FIG. 17 exemplarily shows another scene diagram in which the camera rotates within a preset angle range according to some embodiments;
FIG. 18 exemplarily shows a flowchart of a method for controlling a camera according to some embodiments;
FIG. 19 exemplarily shows an overall data flow diagram of a camera control method according to some embodiments;
FIG. 20 exemplarily shows a flowchart of a method for calculating an azimuth distance according to some embodiments;
FIG. 21 exemplarily shows a schematic diagram of calculating azimuth distance according to some embodiments;
FIG. 22 exemplarily shows a schematic diagram of the horizontal viewing angle of a camera according to some embodiments;
FIG. 23 exemplarily shows a schematic diagram of calculating a target horizontal adjustment angle according to some embodiments;
FIG. 24 exemplarily shows a schematic diagram of a vertical viewing angle of a camera according to some embodiments;
FIG. 25 exemplarily shows a schematic diagram of calculating a vertical adjustment angle of a target according to some embodiments;
FIG. 26 exemplarily shows a flowchart of a method for focusing and zooming in on a portrait display according to some embodiments;
FIG. 27 exemplarily shows a schematic diagram of zoomed-in portrait display according to some embodiments.
detailed description
In order to make the objectives, implementations and advantages of the present application clearer, the exemplary embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the exemplary embodiments of the present application. Obviously, the described exemplary embodiments are only a part of the embodiments of the present application, not all of them.
Based on the exemplary embodiments described in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the appended claims of this application. Furthermore, although the disclosure in this application is presented in terms of one or more exemplary examples, it should be understood that each aspect of the disclosure may also separately constitute a complete embodiment.
It should be noted that the brief description of terms in this application is only for the convenience of understanding the embodiments described below, and is not intended to limit the embodiments of this application. Unless otherwise specified, these terms are to be understood according to their ordinary and usual meanings.
The terms "first", "second", "third", etc. in the description and claims of this application and the above drawings are used to distinguish similar or analogous objects or entities, and do not necessarily imply a specific order or precedence unless otherwise indicated. It should be understood that the terms so used are interchangeable under appropriate circumstances, for example, so that the embodiments of the present application can be implemented in an order other than those illustrated or described here.
Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a product or device incorporating a series of components is not necessarily limited to those components explicitly listed, but may include other components not expressly listed or inherent to such products or devices.
The term "module" as used in this application refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of performing the function associated with that element.
The term "remote control" as used in this application refers to a component of an electronic device (such as the display device disclosed in this application) that can wirelessly control the electronic device, usually over a short distance. It is generally connected to the electronic device using infrared and/or radio frequency (RF) signals and/or Bluetooth, and may also include functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors. For example, a hand-held touch remote control replaces most of the physical built-in hard keys of a typical remote control device with a user interface on a touch screen.
The term "gesture" as used in this application refers to a user behavior in which the user expresses an intended idea, action, purpose, or result through a change of hand shape or a hand movement.
FIG. 1 is a schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment. As shown in FIG. 1, a user can operate the display device 200 through the smart device 300 or the control apparatus 100.
The control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes infrared protocol communication, Bluetooth protocol communication, or other short-range communication methods, and the display device 200 is controlled wirelessly or by wire. The user can control the display device 200 by inputting user instructions through keys on the remote controller, voice input, control panel input, and the like.
In some embodiments, a smart device 300 (such as a mobile terminal, a tablet computer, a computer, or a notebook computer) may also be used to control the display device 200, for example by using an application running on the smart device.
In some embodiments, the display device 200 may also be controlled in ways other than through the control apparatus 100 and the smart device 300. For example, the user's voice instructions may be received directly through a module for acquiring voice instructions configured inside the display device 200, or through a voice control device provided outside the display device 200.
In some embodiments, the display device 200 is also in data communication with a server 400. The display device 200 may be allowed to communicate through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various content and interactions to the display device 200.
FIG. 3 exemplarily shows a block diagram of the configuration of the control apparatus 100 according to an exemplary embodiment. As shown in FIG. 3, the control apparatus 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180. The control apparatus 100 can receive input operation instructions from the user and convert the operation instructions into instructions that the display device 200 can recognize and respond to, acting as an intermediary for interaction between the user and the display device 200.
FIG. 2 shows a block diagram of the hardware configuration of the display device 200 according to an exemplary embodiment.
The display device 200 includes at least one of a tuner-demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 275, an audio output interface 285, a memory 260, a power supply 290, and a user interface 265.
The display 275 includes a display screen component for presenting pictures and a driving component for driving image display. It receives image signals output from the controller and displays video content, image content, menu manipulation interfaces, and user manipulation UI interfaces.
The display 275 may be a liquid crystal display, an OLED display, or a projection display, and may also be a projection device with a projection screen.
The communicator 220 is a component for communicating with external devices or servers according to various types of communication protocols. For example, the communicator may include at least one of a WiFi module, a Bluetooth module, a wired Ethernet module or another network communication protocol chip or near-field communication protocol chip, and an infrared receiver. Through the communicator 220, the display device 200 may establish transmission and reception of control signals and data signals with the external control apparatus 100 or the server 400.
The user interface may be used to receive control signals from the control apparatus 100 (for example, an infrared remote controller).
The detector 230 is used to collect signals from the external environment or signals of interaction with the outside. For example, the detector 230 includes a light receiver, a sensor for collecting ambient light intensity; or the detector 230 includes an image collector, such as a camera, which may be used to collect external environment scenes, user attributes, or user interaction gestures; or the detector 230 includes a sound collector, such as a microphone, for receiving external sound.
The external device interface 240 may include, but is not limited to, any one or more of the following: a high-definition multimedia interface (HDMI), an analog or data high-definition component input interface (Component), a composite video input interface (CVBS), a USB input interface (USB), an RGB port, and the like. It may also be a composite input/output interface formed by a plurality of the above interfaces.
The controller 250 and the tuner-demodulator 210 may be located in different separate devices; that is, the tuner-demodulator 210 may also be located in a device external to the main device where the controller 250 is located, such as an external set-top box.
The controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in the memory 260. The controller 250 controls the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 275, the controller 250 may perform an operation related to the object selected by the user command.
The object may be any one of the selectable objects, such as a hyperlink, an icon, or another operable control. Operations related to the selected object include displaying an operation of connecting to a hyperlinked page, document, or image, or executing the program corresponding to the icon.
In some embodiments, the user may input a user command on a graphical user interface (GUI) displayed on the display 275, and the user input interface receives the user input command through the GUI. Alternatively, the user may input a user command by making a specific sound or gesture, and the user input interface recognizes the sound or gesture through a sensor to receive the user input command.
"User interface" may refer to a medium interface for interaction and information exchange between an application or operating system and a user; it converts between the internal form of information and a form acceptable to the user. A commonly used form of user interface is the graphical user interface (GUI), which refers to a graphically displayed user interface related to computer operations. It may be an interface element such as an icon, a window, or a control displayed on the display screen of an electronic device, where a control may include visual interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.
Referring to FIG. 4, in some embodiments the system is divided into four layers, which from top to bottom are the Applications layer ("application layer"), the Application Framework layer ("framework layer"), the Android runtime and system library layer ("system runtime layer"), and the kernel layer.
In some embodiments, at least one application runs in the application layer. These applications may be the Window program, system setting program, clock program, camera application, and the like that come with the operating system; they may also be applications developed by third-party developers, such as a "嗨见" (Hi See) video chat program, a karaoke program, or a magic mirror program. In specific implementations, the application packages in the application layer are not limited to the above examples and may include other application packages; this is not limited in the embodiments of this application.
The framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions. The application framework layer acts as a processing center that decides what actions the applications in the application layer take. Through the API interface, an application can access the resources in the system and obtain system services during execution.
As shown in FIG. 4, the application framework layer in the embodiments of this application includes managers, content providers, and the like, where the managers include at least one of the following modules: an Activity Manager, used to interact with all activities running in the system; a Location Manager, used to provide system services or applications with access to the system location service; a Package Manager, used to retrieve various information related to the application packages currently installed on the device; a Notification Manager, used to control the display and clearing of notification messages; and a Window Manager, used to manage the icons, windows, toolbars, wallpapers, and desktop widgets on the user interface.
In some embodiments, the Activity Manager is used to manage the life cycle of each application and the usual navigation and back functions, such as controlling the exit of an application (including switching the user interface currently displayed in the display window to the system desktop), opening an application, and going back (including switching the user interface currently displayed in the display window to the user interface one level above it).
In some embodiments, the Window Manager is used to manage all window programs, for example obtaining the display screen size, determining whether there is a status bar, locking the screen, capturing the screen, and controlling changes of the display window (for example, shrinking the display window, shaking the display, or distorting the display).
In some embodiments, the system runtime layer provides support for the layer above it, namely the framework layer. When the framework layer is used, the Android operating system runs the C/C++ libraries contained in the system runtime layer to implement the functions that the framework layer needs to provide.
In some embodiments, the kernel layer is the layer between hardware and software. As shown in FIG. 4, the kernel layer contains at least one of the following drivers: an audio driver, a display driver, a Bluetooth driver, a camera driver, a WiFi driver, a USB driver, an HDMI driver, and sensor drivers (such as a fingerprint sensor, a temperature sensor, a touch sensor, and a pressure sensor).
In some embodiments, the kernel layer further includes a power driver module for power management.
In some embodiments, the software programs and/or modules corresponding to the software architecture in FIG. 4 are stored in the first memory or the second memory shown in FIG. 2 or FIG. 3.
In some embodiments, taking the magic mirror application (a photographing application) as an example: when the remote-control receiving device receives an input operation from the remote controller, the corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the input operation into a raw input event (including information such as the value of the input operation and its timestamp). Raw input events are stored at the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event according to the current position of the focus. Taking the case where the input operation is a confirmation operation and the control corresponding to the confirmation operation is the icon control of the magic mirror application, the magic mirror application calls the interface of the application framework layer to start the magic mirror application, which in turn starts the camera driver by calling the kernel layer, so that still images or videos are captured through the camera.
In some embodiments, for a display device with a touch function, taking a split-screen operation as an example: the display device receives an input operation (such as a split-screen operation) performed by the user on the display screen, and the kernel layer generates a corresponding input event according to the input operation and reports the event to the application framework layer. The activity manager of the application framework layer sets the window mode (such as a multi-window mode) and the window position and size corresponding to the input operation. The window management of the application framework layer draws the windows according to the settings of the activity manager, and then sends the drawn window data to the display driver of the kernel layer, which displays the corresponding application interfaces in different display areas of the display screen.
In some embodiments, as shown in FIG. 5, the application layer contains at least one application whose corresponding icon control can be displayed on the display, for example a live TV application icon control, a video-on-demand application icon control, a media center application icon control, an application center icon control, a game application icon control, and so on.
In some embodiments, the live TV application may provide live television from different signal sources. For example, the live TV application may provide a television signal using input from cable television, over-the-air broadcasting, satellite services, or other types of live TV services, and may display the video of the live TV signal on the display device 200.
In some embodiments, the video-on-demand application may provide videos from different storage sources. Unlike the live TV application, video on demand provides video display from certain storage sources, for example from a cloud-storage server side or from local hard disk storage containing stored video programs.
In some embodiments, the media center application may provide playback of various multimedia content. For example, the media center, being different from live TV or video on demand, may allow the user to access various image or audio services through the media center application.
In some embodiments, the application center may store various applications. An application may be a game, an application program, or some other application that is related to a computer system or another device but can run on a smart TV. The application center may obtain these applications from different sources, store them in local storage, and then run them on the display device 200.
In some embodiments, the applications on the display device that need to use the camera include "嗨见" (Hi See), "照镜子" (Mirror), "优学猫" (Youxuemao), "健身" (Fitness), and the like, which can implement functions such as "video chat", "watch while chatting", and "fitness". "Hi See" is a video chat application that enables one-click chat between a mobile phone and a TV and between TVs. "Mirror" is an application that provides a mirror service for users; by turning on the camera through the Mirror application, the user can use the smart TV as a mirror. "Youxuemao" is an application that provides learning functions. When the "watch while chatting" function is implemented, the user watches a video program while the "Hi See" application is running a video call. The "Fitness" function can simultaneously display, on the display of the display device, a fitness instruction video and images captured by the camera of the user performing the corresponding actions while following the instruction video, so that the user can check in real time whether his or her movements are standard.
When using the display device for "video chat", "watch while chatting", or "fitness", the user may not stay fixed in one position and may perform these functions while walking around. However, in existing display devices the camera is fixedly installed on the display device, the center line of the camera's viewing angle is perpendicular to the display, and the camera's viewing angle is limited, usually between 60° and 75°; that is, the shooting area of the camera is the region corresponding to an angle of 60° to 75° formed by spreading symmetrically to the left and right of the center line of the camera's viewing angle.
If the user walks out of the shooting area of the camera, the camera cannot capture an image containing the user's portrait, so the portrait cannot be displayed on the display. In a video chat scenario, the peer user who is in a video chat with the local user cannot see the local user; in a fitness scenario, the display cannot show the image of the user performing the fitness actions, so the user cannot see his or her own movements and cannot judge whether they are standard, which degrades the user experience.
FIG. 6 exemplarily shows a structural block diagram of a display device according to some embodiments. So that the camera can still capture the user's image when the user walks out of the camera's original shooting area, referring to FIG. 6, an embodiment of this application provides a display device including a camera 232, a sound collector 231, and a controller 250. The camera is used to capture portraits; it is no longer fixedly installed but is rotatably mounted on the display device. Specifically, the camera 232 is rotatably mounted on the top of the display and can rotate along the top of the display.
FIG. 7 exemplarily shows a schematic diagram of the preset angle range within which the camera rotates according to some embodiments; FIG. 8 exemplarily shows a scene diagram of camera rotation within the preset angle range according to some embodiments. Referring to FIG. 7 and FIG. 8, the camera 232 is preset to rotate within a preset angle range, in the horizontal direction. In some embodiments, the preset angle range is 0° to 120°; that is, facing the display, the user's left side is 0° and the user's right side is 120°. Taking the state in which the center line of the viewing angle of the camera 232 is perpendicular to the display as the initial state, the camera can rotate 60° to the left from the initial state and 60° to the right from the initial state; the position at which the center line of the camera's viewing angle is perpendicular to the display is the camera's 60° position.
The display device provided by the embodiments of this application uses sound source information to trigger the rotation of the camera. It can automatically identify the user's real-time location and adjust the shooting angle of the camera, so that the camera can always capture an image containing the portrait. To this end, in some embodiments, the display device collects the person's sound source information by means of the sound collector 231.
To ensure the accuracy of sound source collection, multiple groups of sound collectors may be provided in the display device. In some embodiments, four groups of sound collectors 231 are provided in the display device and are arranged in a linear positional relationship. In some embodiments, the sound collector may be a microphone, and the four groups of microphones are linearly arranged to form a microphone array. During sound collection, the four groups of sound collectors 231 receive the sound information generated when the same user interacts with the display device through voice.
FIG. 9 exemplarily shows a schematic diagram of the sound source angle range according to some embodiments. When the user speaks, the sound is emitted in all directions (360°). Therefore, when the user is in front of the display device, the angle range of the sound source generated by the user is 0° to 180°; likewise, when the user is behind the display device, the angle range of the sound source is also 0° to 180°. Referring to FIG. 9, taking the position of a user facing the display device as an example, a user located to the left of the sound collector is at horizontal 0°, and a user located to the right of the sound collector is at horizontal 180°.
Referring again to FIG. 7 and FIG. 9, the 30° position of the sound source corresponds to the 0° position of the camera, the 90° position of the sound source corresponds to the 60° position of the camera, and the 150° position of the sound source corresponds to the 120° position of the camera.
The controller 250 is connected to the camera 232 and the sound collector 231, respectively. The controller is used to receive the person's sound source information collected by the sound collector, identify the sound source information, determine the azimuth angle of the person's position, and then determine the angle by which the camera needs to rotate. The controller adjusts the shooting angle of the camera according to the determined rotation angle, so that the shooting area of the camera faces the position where the person was speaking; in this way, the shooting angle of the camera is adjusted according to the person's position so as to capture an image containing the person.
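To make the overall control flow concrete before the individual steps are described, the following is a minimal sketch of how such a controller loop could be organized. The names (recognize_text, estimate_source_angle, the camera object and its methods) are hypothetical stand-ins for the services described in this application, not an actual API of the display device; the individual stages are sketched in more detail after the corresponding steps below.

def handle_utterance(frames, camera, recognize_text, estimate_source_angle,
                     wake_text="海信小聚"):
    """Sketch: wake-up text -> sound source angle -> camera rotation.

    frames: the multi-microphone audio; camera: an object exposing
    is_rotating() / current_angle() / rotate(); the two callables are injected
    stand-ins for the speech-recognition and sound-source services.
    """
    if recognize_text(frames) != wake_text:
        return None                               # ordinary speech: no adjustment
    source_angle = estimate_source_angle(frames)  # sound source angle, 0..180 degrees
    while camera.is_rotating():                   # read the angle only when idle
        pass
    current = camera.current_angle()              # camera coordinate, 0..120 degrees
    target = source_angle - 30.0                  # conversion angle (180 - 120) / 2
    delta = target - current                      # sign encodes the direction
    camera.rotate(abs(delta), "right" if delta > 0 else "left")
    return delta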
FIG. 10 exemplarily shows a flowchart of a method for adjusting the shooting angle of the camera according to some embodiments. In the display device provided by an embodiment of this application, when adjusting the shooting angle of the camera according to the position of the person, the controller is configured to execute the method for adjusting the shooting angle of the camera shown in FIG. 10, which includes the following steps.
S11: Acquire the person's sound source information collected by the sound collector and the current shooting angle of the camera.
In some embodiments, when the controller in the display device drives the camera to rotate in order to adjust its shooting angle, it determines the rotation from the person's sound source information generated when the person, at his or her current location, interacts with the display device by voice. The person's sound source information refers to the sound information generated when the person interacts with the display device through voice.
The person's sound source information determines the azimuth angle of the person's position when speaking. To accurately determine the angle by which the camera needs to be adjusted, the current state of the camera, namely its current shooting angle, must first be obtained. The current shooting angle of the camera can only be acquired when the camera is stopped, so as to ensure the accuracy of the current shooting angle and, in turn, the accuracy of the angle by which the camera needs to be adjusted.
Therefore, before acquiring the current shooting angle of the camera, the controller is further configured to execute the following steps:
Step 111: Query the current operating state of the camera.
Step 112: If the current operating state of the camera is the rotating state, wait for the camera to finish rotating.
Step 113: If the current operating state of the camera is the non-rotating state, acquire the current shooting angle of the camera.
A motor control service is configured in the controller. The motor control service is used to drive the camera to rotate and to obtain the operating state and orientation angle of the camera.
The motor control service monitors the operating state of the camera in real time. The controller queries the current operating state of the camera by calling the motor control service; the current operating state indicates the current orientation angle of the camera and whether the camera is rotating.
If the camera is rotating, its current shooting angle cannot be obtained at that moment, because an accurate value cannot be determined. Therefore, when the camera is rotating, it is necessary to wait for the camera to finish executing the previous instruction, and only in the stopped state is the step of acquiring the current shooting angle of the camera performed. A small sketch of this check is given after the next paragraph.
If the camera is not rotating, that is, the camera is stopped, the step of acquiring the current shooting angle of the camera can be performed.
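The sketch below illustrates steps 111 to 113. It assumes a hypothetical motor-control-service client with get_state() and get_angle() methods; the polling interval and timeout are illustrative choices, not values specified by this application.

import time

def get_current_shooting_angle(motor_service, poll_s=0.05, timeout_s=5.0):
    """Return the camera's current shooting angle, waiting out any ongoing rotation.

    motor_service is a hypothetical client of the motor control service with
    get_state() -> "ROTATING" | "IDLE" and get_angle() -> float (0..120 degrees).
    """
    deadline = time.monotonic() + timeout_s
    while motor_service.get_state() == "ROTATING":   # step 112: wait for rotation to end
        if time.monotonic() > deadline:
            raise TimeoutError("camera did not finish rotating in time")
        time.sleep(poll_s)
    return motor_service.get_angle()                 # step 113: read the angle when idle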
S12: Perform sound source identification on the person's sound source information and determine sound source angle information, where the sound source angle information represents the azimuth angle of the person's position when speaking.
After obtaining the person's sound source information generated by the interaction between the person and the display device, the controller needs to perform sound source identification on it in order to determine the person's position when speaking, specifically the azimuth angle, that is, whether the person is located to the left of the sound collector, to the right of it, or directly facing it, and then adjust the shooting angle of the camera according to the person's position.
When the person interacts with the display device, for example in a video call scenario, the person's speech may be part of a conversation with the peer user while the person is still within the shooting area of the camera. If the controller executed the step of adjusting the camera's shooting angle at this time, an unnecessary operation would occur.
Therefore, to accurately determine from the person's sound source information whether the camera's shooting angle needs to be adjusted, the sound source information generated by the person must first be analyzed to determine whether it is information that triggers the camera adjustment.
In some embodiments, a wake-up text used to trigger adjustment of the camera's shooting angle may be stored in the controller in advance; for example, "海信小聚" (Hisense Xiaoju) is customized as the wake-up text for sound source identification. The person speaks "海信小聚" as the identification sound source to trigger the process of adjusting the camera's shooting angle. The wake-up text may also be customized as other words, which is not specifically limited in this embodiment.
FIG. 11 exemplarily shows a flowchart of a method for comparing the wake-up text according to some embodiments. Specifically, referring to FIG. 11, before performing sound source identification on the person's sound source information and determining the sound source angle information, the controller is further configured to execute the following steps:
S1021: Perform text extraction on the person's sound source information to obtain a voice interaction text.
S1022: Compare the voice interaction text with a preset wake-up text, where the preset wake-up text refers to the text used to trigger the sound source identification process.
S1023: If the voice interaction text is consistent with the preset wake-up text, perform the step of sound source identification on the person's sound source information.
In some embodiments, after acquiring the person's sound source information, the controller first performs text extraction to obtain the voice interaction text of the person's voice interaction with the display device. The extracted voice interaction text is compared with the preset wake-up text. If they are inconsistent, for example the person's speech is not "Hisense Xiaoju" but other interaction content, the current speech is not the speech that triggers adjustment of the camera's shooting angle, and the controller does not need to execute the steps related to adjusting the camera's shooting angle.
If they are consistent, the current speech is the speech that triggers adjustment of the camera's shooting angle, for example the preset "Hisense Xiaoju". In this case, the controller continues to execute the subsequent steps of adjusting the camera's shooting angle.
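Steps S1021 to S1023 amount to a simple gate on the recognized text. The sketch below assumes a hypothetical speech-to-text function is available as an argument; only the comparison logic is shown.

PRESET_WAKE_TEXT = "海信小聚"   # customizable wake-up text

def should_trigger_adjustment(sound_info, speech_to_text):
    """Return True only if the utterance matches the preset wake-up text.

    speech_to_text is an injected, hypothetical recognizer that turns the
    person's sound source information into the voice interaction text.
    """
    interaction_text = speech_to_text(sound_info)         # S1021: text extraction
    return interaction_text.strip() == PRESET_WAKE_TEXT   # S1022/S1023: comparison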
When it is determined that the person's sound source information is the wake-up speech, that is, the trigger speech for adjusting the camera's shooting angle, the controller executes the subsequent sound source identification process.
Since multiple groups of sound collectors are provided in the display device, they collect multiple groups of sound source information for the same person's speech. When acquiring the sound source information collected by the sound collectors, the controller acquires the sound source information generated by the person's speech as collected by each sound collector; that is, the controller acquires multiple groups of sound source information.
FIG. 12 exemplarily shows a flowchart of a method for performing sound source identification on the person's sound source information according to some embodiments. When the multiple groups of sound collectors collect the same wake-up text, the distance between each sound collector and the person is not the same; therefore, each group of sound source information can be identified to determine the azimuth angle of the person when speaking, namely the sound source angle information. Specifically, referring to FIG. 12, when performing sound source identification on the person's sound source information and determining the sound source angle information, the controller is further configured to execute the following steps:
S121: Perform sound source identification on each group of the person's sound source information, and calculate the speech time differences generated when the multiple groups of sound collectors collect the corresponding sound source information.
S122: Based on the speech time differences, calculate the sound source angle information of the person's position when speaking.
The frequency responses of the sound collectors are identical and their sampling clocks are synchronized; however, because the distance between each sound collector and the person differs, the moments at which the sound collectors pick up the speech also differ, and there is a collection time difference between the groups of sound collectors.
In some embodiments, the angle and distance of the sound source relative to the array can be calculated by means of the sound collector array, so as to track the sound source at the person's position when speaking. Sound source localization based on TDOA (Time Difference Of Arrival) estimates the time differences with which the signal arrives at each pair of microphones, thereby obtaining a system of equations for the sound source position coordinates; solving this system of equations yields the precise azimuth coordinates of the sound source, namely the sound source angle information.
In some embodiments, in step S121, when performing sound source identification on each group of the person's sound source information and calculating the speech time differences generated when the multiple groups of sound collectors collect the corresponding sound source information, the controller is further configured to execute the following steps:
Step 1211: From the person's sound source information, extract the ambient noise, the sound source signal of the person's speech, and the propagation time of the person's speech to each sound collector.
Step 1212: Determine the received signal of each sound collector according to the ambient noise, the sound source signal, and the propagation time.
Step 1213: Process the received signal of each sound collector using a cross-correlation time delay estimation algorithm to obtain the speech time difference generated when every two sound collectors collect the corresponding sound source information.
When calculating the speech time difference between every two sound collectors, the sound collector array can be used for direction-of-arrival (DOA) estimation of the sound source, and the DOA estimation algorithm calculates the time differences with which the sound arrives at the different elements of the sound collector array.
In a sound source localization system, the target signal received by each element of the sound collector array comes from the same sound source. Therefore, there is a strong correlation between the channel signals. By calculating the correlation function between every two channel signals, the time delay between the signals observed by every two sound collectors, namely the speech time difference, can be determined.
The person's sound source information generated when the person speaks includes the ambient noise and the sound source signal of the person's speech; the propagation time of the person's speech to each sound collector can also be extracted from the sound source information by identification, and the received signal of each sound collector can then be expressed as:
x_i(t) = α_i · s(t − τ_i) + n_i(t)
where x_i(t) is the received signal of the i-th sound collector, s(t) is the sound source signal of the person's speech, τ_i is the propagation time of the person's speech to the i-th sound collector, n_i(t) is the ambient noise, and α_i is a correction coefficient.
The received signal of each sound collector is processed with the cross-correlation time delay estimation algorithm, and the delay estimate is obtained from the cross-correlation function:
R_{x_i x_{i+1}}(τ) = E[ x_i(t) · x_{i+1}(t − τ) ]
where the value of τ that maximizes R_{x_i x_{i+1}}(τ), denoted τ̂_{i,i+1}, is the time delay between the i-th sound collector and the (i+1)-th sound collector, that is, the speech time difference.
Substituting the received signal model of each sound collector gives:
R_{x_i x_{i+1}}(τ) = E[ (α_i s(t − τ_i) + n_i(t)) · (α_{i+1} s(t − τ − τ_{i+1}) + n_{i+1}(t − τ)) ]
Since s(t) and n_i(t) are mutually uncorrelated, the above expression can be simplified to:
R_{x_i x_{i+1}}(τ) = α_i α_{i+1} E[ s(t − τ_i) · s(t − τ − τ_{i+1}) ] + E[ n_i(t) · n_{i+1}(t − τ) ]
where τ_{i,i+1} = τ_i − τ_{i+1}, and n_i and n_{i+1} are mutually uncorrelated white Gaussian noise, so the expression further simplifies to:
R_{x_i x_{i+1}}(τ) = α_i α_{i+1} R_s(τ − τ_{i,i+1})
From the properties of the cross-correlation time delay estimation algorithm, R_{x_i x_{i+1}}(τ) takes its maximum value when τ = τ_{i,i+1}; this maximizing lag is the time delay between the two sound collectors, that is, the speech time difference.
In a practical sound collector array signal-processing model, reverberation and noise make the peak of R_{x_i x_{i+1}}(τ) indistinct, which reduces the accuracy of the delay estimate. To sharpen the peak of R_{x_i x_{i+1}}(τ), the cross-power spectrum can be weighted in the frequency domain according to prior knowledge of the signal and noise, so as to suppress noise and reverberation interference. Finally, an inverse Fourier transform is performed to obtain the generalized cross-correlation function:
R^g_{x_i x_{i+1}}(τ) = ∫ ψ_{i,i+1}(ω) · Φ_{x_i x_{i+1}}(ω) · e^{jωτ} dω
where ψ_{i,i+1}(ω) denotes the frequency-domain weighting function and Φ_{x_i x_{i+1}}(ω) is the cross-power spectrum of the two received signals.
Finally, PHAT weighting is adopted, which makes the cross-power spectrum between the signals smoother and yields the final speech time difference τ̂_{i,i+1} generated when every two sound collectors collect the corresponding sound source information. The PHAT-weighted cross-power spectrum approximates the expression of a unit impulse response, which highlights the delay peak, effectively suppresses reverberation noise, and improves the precision and accuracy of the delay (speech time difference) estimation.
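As an illustration of the pairwise delay estimation described above, the following sketch computes a GCC-PHAT delay estimate for one pair of microphone signals using NumPy. It is a textbook-style sketch under the stated assumptions (equal sampling rates, synchronized clocks), not the exact implementation used in this application.

import numpy as np

def gcc_phat_delay(x_i, x_j, fs, max_tau=None):
    """Estimate the arrival-time difference tau_i - tau_j (seconds) with GCC-PHAT.

    x_i, x_j: 1-D arrays of two sound collectors' received signals at sample rate fs.
    The returned value is positive when the sound reaches collector i later than j.
    """
    n = len(x_i) + len(x_j)                      # zero-pad to avoid circular wrap-around
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    cross_spec = X_i * np.conj(X_j)              # cross-power spectrum
    phat = cross_spec / (np.abs(cross_spec) + 1e-12)   # PHAT weighting: divide by magnitude
    cc = np.fft.irfft(phat, n=n)                 # generalized cross-correlation
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # lags -max..+max
    shift = np.argmax(np.abs(cc)) - max_shift    # lag (in samples) of the correlation peak
    return shift / float(fs)                     # speech time difference in seconds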
In some embodiments, in step S122, when calculating, based on the speech time differences, the sound source angle information of the person's position when speaking, the controller is further configured to execute the following steps:
Step 1221: Acquire the speed of sound in the current environment, the coordinates of each sound collector, and the number of sound collectors provided.
Step 1222: Determine the number of combination pairs of the sound collectors according to the number of sound collectors provided, where the number of combination pairs refers to the number of pairwise combinations of the sound collectors.
Step 1223: Establish a set of vector relation equations according to the speech time difference corresponding to every two sound collectors, the speed of sound, and the coordinates of each sound collector, where the number of vector relation equations is the same as the number of combination pairs.
Step 1224: Solve the set of vector relation equations to obtain the value of the unit plane-wave propagation vector of the sound source at the person's position when speaking.
Step 1225: Calculate, according to the vector value, the sound source angle information of the person's position when speaking.
After the speech time difference between every two sound collectors is calculated according to the method provided in the foregoing embodiments, the sound source angle information of the person's position when speaking can be calculated from the speech time differences.
When calculating the sound source angle information, multiple vector relation equations need to be established. To ensure the accuracy of the calculation result, the number of equations can be set to be the same as the number of pairwise combinations of the sound collectors. To this end, the number N of sound collectors provided is acquired; all pairwise combinations of the sound collectors give N(N−1)/2 combination pairs.
When establishing the set of vector relation equations, the speed of sound c in the current environment and the coordinates of each sound collector are acquired; the coordinates of the k-th sound collector are denoted (x_k, y_k, z_k). At the same time, the unit plane-wave propagation vector of the sound source at the person's position when speaking is set as u = (u, v, w); once the value of this vector is solved, the sound source angle information can be determined.
According to the speech time difference τ̂_{i,j} corresponding to every two sound collectors, the speed of sound c, the coordinates (x_k, y_k, z_k) of each sound collector, and the unit plane-wave propagation vector (u, v, w) of the sound source at the person's position, N(N−1)/2 vector relation equations are established:
(x_i − x_j)·u + (y_i − y_j)·v + (z_i − z_j)·w = c · τ̂_{i,j}
This equation represents the vector relation established between the i-th sound collector and the j-th sound collector.
Taking N = 3 as an example, the following equations can be established:
(x_1 − x_2)·u + (y_1 − y_2)·v + (z_1 − z_2)·w = c · τ̂_{1,2}  (the vector relation established between the first sound collector and the second sound collector);
(x_1 − x_3)·u + (y_1 − y_3)·v + (z_1 − z_3)·w = c · τ̂_{1,3}  (the vector relation established between the first sound collector and the third sound collector);
(x_3 − x_2)·u + (y_3 − y_2)·v + (z_3 − z_2)·w = c · τ̂_{3,2}  (the vector relation established between the third sound collector and the second sound collector).
The above three vector relation equations are written in matrix form:
[ x_1 − x_2   y_1 − y_2   z_1 − z_2 ]   [ u ]        [ τ̂_{1,2} ]
[ x_1 − x_3   y_1 − y_3   z_1 − z_3 ] · [ v ]  = c · [ τ̂_{1,3} ]
[ x_3 − x_2   y_3 − y_2   z_3 − z_2 ]   [ w ]        [ τ̂_{3,2} ]
The vector u = (u, v, w) is solved from the above matrix equation, and the angle value is then obtained using the sine-cosine relations (for example, a horizontal azimuth of the form θ = arctan(v/u)), which gives the sound source angle information of the azimuth angle of the person's position when speaking.
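The sketch below assembles the pairwise relations into a (possibly overdetermined) linear system, solves it by least squares with NumPy, and converts the propagation vector into a horizontal angle. The microphone layout, the speed-of-sound value, and the angle convention (0° at the +x end of the array) are illustrative assumptions consistent with the description above, not values fixed by this application.

import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at roughly room temperature (assumed value)

def source_angle_deg(mic_coords, pair_delays, c=SPEED_OF_SOUND):
    """Estimate the horizontal sound source angle (0..180 degrees) from pairwise delays.

    mic_coords: array of shape (N, 3); here the collectors are assumed to lie along
    the x-axis, with 0 degrees meaning a source toward the +x end of the array.
    pair_delays: dict mapping (i, j) -> estimated delay tau_i - tau_j in seconds.
    Solves (r_i - r_j) . u = c * tau_ij in the least-squares sense for the propagation
    vector u; for a collinear array only the component of u along the array axis is
    observable, which is all the horizontal angle requires.
    """
    rows, rhs = [], []
    for (i, j), tau in pair_delays.items():
        rows.append(mic_coords[i] - mic_coords[j])   # (x_i - x_j, y_i - y_j, z_i - z_j)
        rhs.append(c * tau)
    u_vec, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    # u points from the source toward the array, so the source direction is -u;
    # its x-component equals cos(theta) of the source angle.
    cos_theta = np.clip(-u_vec[0], -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

# Hypothetical wiring with a linear 4-microphone array spaced 4 cm apart, using the
# gcc_phat_delay() sketch above for every pair of collectors:
# mics = np.array([[k * 0.04, 0.0, 0.0] for k in range(4)])
# delays = {(i, j): gcc_phat_delay(sig[i], sig[j], fs)
#           for i in range(4) for j in range(i + 1, 4)}
# print(source_angle_deg(mics, delays))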
S13: Determine the target rotation direction and the target rotation angle of the camera based on the current shooting angle of the camera and the sound source angle information.
By performing sound source identification on the person's sound source information, the controller determines the sound source angle information representing the azimuth angle of the person's position when speaking. The sound source angle information identifies the person's current position, and the current shooting angle of the camera identifies the camera's current orientation; from the angular difference between the two, the target rotation angle by which the camera needs to rotate in the horizontal direction, as well as the target rotation direction, can be determined.
FIG. 13 exemplarily shows a flowchart of a method for determining the target rotation direction and the target rotation angle of the camera according to some embodiments. Specifically, referring to FIG. 13, when determining the target rotation direction and the target rotation angle of the camera based on the current shooting angle of the camera and the sound source angle information, the controller is further configured to execute the following steps:
S131: Convert the sound source angle information into the coordinate angle of the camera.
Since the sound source angle information represents the azimuth angle of the person, to accurately calculate the azimuth angle by which the camera needs to be adjusted based on the sound source angle information and the current shooting angle of the camera, the person's sound source angle information can be converted into the camera's coordinate angle; that is, the camera's coordinate angle is used in place of the person's sound source angle information.
Specifically, when converting the sound source angle information into the coordinate angle of the camera, the controller is further configured to execute the following steps:
Step 1311: Acquire the sound source angle range of the person when speaking and the preset angle range within which the camera rotates.
Step 1312: Calculate the angle difference between the sound source angle range and the preset angle range, and take half of this difference as the conversion angle.
Step 1313: Calculate the angle difference between the angle corresponding to the sound source angle information and the conversion angle, and take this difference as the coordinate angle of the camera.
Since the sound source angle range and the preset angle range of the camera are not the same (the preset angle range is 0° to 120° and the sound source angle range is 0° to 180°), the sound source angle information cannot be replaced directly by the camera's coordinate angle. Therefore, the angle difference between the sound source angle range and the preset angle range is calculated first, and then half of this difference is taken as the conversion angle used when converting the sound source angle information into the camera's coordinate angle.
The angle difference between the sound source angle range and the preset angle range is 60°, and half of this difference is 30°, so 30° is used as the conversion angle. Finally, the angle difference between the angle corresponding to the sound source angle information and the conversion angle is calculated, which is the camera coordinate angle converted from the sound source angle information.
For example, if the person is located to the left of the sound collector and the angle corresponding to the sound source angle information determined by the controller from the sound source information collected by the multiple sound collectors is 50°, then with a conversion angle of 30° the calculated angle difference is 20°; that is, the 50° corresponding to the sound source angle information is represented by the camera coordinate angle of 20°.
If the person is located to the right of the sound collector and the angle corresponding to the sound source angle information determined by the controller from the sound source information collected by the multiple sound collectors is 130°, then with a conversion angle of 30° the calculated angle difference is 100°; that is, the 130° corresponding to the sound source angle information is represented by the camera coordinate angle of 100°.
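A worked version of steps 1311 to 1313, reproducing the two examples above (50° to 20° and 130° to 100°), might look as follows; clamping the result to the camera's preset range is an added safeguard, not a step stated in this application.

SOURCE_RANGE = 180.0   # sound source angle range, degrees
CAMERA_RANGE = 120.0   # camera preset angle range, degrees

def source_to_camera_angle(source_angle_deg):
    """Convert a sound source angle (0..180 degrees) to a camera coordinate angle (0..120)."""
    conversion = (SOURCE_RANGE - CAMERA_RANGE) / 2.0      # (180 - 120) / 2 = 30 degrees
    coord = source_angle_deg - conversion                 # step 1313
    return min(max(coord, 0.0), CAMERA_RANGE)             # clamp to the preset range

assert source_to_camera_angle(50.0) == 20.0    # person to the left of the sound collector
assert source_to_camera_angle(130.0) == 100.0  # person to the right of the sound collector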
S132: Calculate the angle difference between the camera's coordinate angle and the camera's current shooting angle, and use this difference as the camera's target rotation angle.
The camera's coordinate angle identifies the angle of the person's position within the camera's coordinate system. Therefore, the target rotation angle that the camera needs to turn can be determined from the difference between the camera's current shooting angle and the camera's coordinate angle.
For example, if the camera's current shooting angle is 100° and the camera's coordinate angle is 20°, the camera's current shooting area is not aimed at the person's position and the two differ by 80°. The camera therefore needs to be rotated by 80° before its shooting area is aimed at the person's position; that is, the camera's target rotation angle is 80°.
S133: Determine the camera's target rotation direction according to the angle difference.
Since, when facing the display device, the left side is taken as the camera's 0° position and the right side as the camera's 120° position, after the angle difference is determined from the camera's coordinate angle and the camera's current shooting angle: if the current shooting angle is greater than the coordinate angle, the camera's shooting angle lies to the right of the person's position and the angle difference is negative; if the current shooting angle is less than the coordinate angle, the camera's shooting angle lies to the left of the person's position and the angle difference is positive.
In some embodiments, the camera's target rotation direction may be determined by the sign of the angle difference. If the angle difference is positive, the camera's shooting angle lies to the left of the person's position; in this case, to capture an image of the person, the shooting angle must be adjusted to the right, so the camera's target rotation direction is determined to be rightward.
If the angle difference is negative, the camera's shooting angle lies to the right of the person's position; in this case, to capture an image of the person, the shooting angle must be adjusted to the left, so the camera's target rotation direction is determined to be leftward.
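The following sketch illustrates how the target rotation angle and direction could be derived from the coordinate angle and the current shooting angle, following the sign convention above (coordinate angle minus current shooting angle; positive means rotate right). The names are illustrative assumptions, not the original implementation.

```java
/** Holds the rotation the camera should perform. */
public final class PanCommand {
    public enum Direction { LEFT, RIGHT, NONE }

    public final Direction direction;
    public final double angle; // magnitude of the rotation, in degrees

    private PanCommand(Direction direction, double angle) {
        this.direction = direction;
        this.angle = angle;
    }

    /** Difference = coordinate angle - current shooting angle; positive -> right, negative -> left. */
    public static PanCommand fromAngles(double coordinateAngle, double currentShootingAngle) {
        double difference = coordinateAngle - currentShootingAngle;
        if (difference > 0) {
            return new PanCommand(Direction.RIGHT, difference);
        } else if (difference < 0) {
            return new PanCommand(Direction.LEFT, -difference);
        }
        return new PanCommand(Direction.NONE, 0);
    }
}
```

Under these assumptions, fromAngles(20, 100) yields LEFT by 80° and fromAngles(90, 40) yields RIGHT by 50°, matching the scenes of FIG. 14 and FIG. 15 described below.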
For example, FIG. 14 exemplarily shows a scene diagram of adjusting the camera's shooting angle according to some embodiments. Referring to FIG. 14, if the angle corresponding to the person's sound source angle information is 50°, the converted camera coordinate angle is 20°; the camera's current shooting angle is 100°, that is, the center line of the camera's field of view lies to the right of the person's position, and the calculated angle difference is -80°. The angle difference is negative, so the camera needs to be rotated 80° to the left.
FIG. 15 exemplarily shows another scene diagram of adjusting the camera's shooting angle according to some embodiments. Referring to FIG. 15, if the angle corresponding to the person's sound source angle information is 120°, the converted camera coordinate angle is 90°; the camera's current shooting angle is 40°, that is, the center line of the camera's field of view lies to the left of the person's position, and the calculated angle difference is 50°. The angle difference is positive, so the camera needs to be rotated 50° to the right.
S14: Adjust the camera's shooting angle according to the target rotation direction and the target rotation angle, so that the camera's shooting area faces the position where the person is speaking.
After determining the target rotation direction and target rotation angle required to adjust the camera's shooting angle, the controller can adjust the shooting angle according to the target rotation direction and target rotation angle so that the camera's shooting area directly faces the person's position. The camera can then capture an image that includes the person, realizing adjustment of the camera's shooting angle according to the person's position.
FIG. 16 exemplarily shows a scene diagram of the position where the person is speaking according to some embodiments. Since the camera's preset angle range differs from the sound source angle range of the person's voice, when represented in the angle diagram (see FIG. 16) there is a 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range, and likewise a 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range.
Then, if the person interacting with the display device happens to be located within one of these 30° included-angle regions, such as the position of person (a) or person (b) shown in FIG. 16, when the controller converts the sound source angle information into the camera's coordinate angle in the foregoing step S131, the resulting camera coordinate angle will be negative or greater than the maximum of the camera's preset angle range; that is, the converted camera coordinate angle does not lie within the camera's preset angle range.
For example, if the sound source angle information corresponding to person (a)'s position is 20° and the conversion angle is 30°, the calculated camera coordinate angle is -10°. If the sound source angle information corresponding to person (b)'s position is 170° and the conversion angle is 30°, the calculated camera coordinate angle is 140°. It can be seen that the camera coordinate angles converted from the positions of person (a) and person (b) both fall outside the camera's preset angle range.
If the camera's coordinate angle falls outside the camera's preset angle range, the camera cannot be rotated to the position corresponding to that coordinate angle (the position where the person is speaking). However, since the camera's viewing angle range is between 60° and 75°, when the camera is rotated to the 0° position or the 120° position, its viewing angle can cover the 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range, as well as the 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range.
Therefore, if the person's position lies within the 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range, or within the 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range, then, in order to capture an image containing the person, the camera's shooting angle is adjusted to the position corresponding to the minimum or maximum of the camera's preset angle range.
In some embodiments, the controller is further configured to perform the following step: when the camera coordinate angle converted from the person's sound source angle information falls outside the camera's preset angle range, determine the camera's target rotation direction and target rotation angle according to the angle difference between the camera's current shooting angle and the minimum or maximum of the preset angle range.
For example, if person (a) lies within the 30° angle difference between the 0° position of the preset angle range and the 0° position of the sound source angle range, i.e., the sound source angle corresponding to person (a)'s sound source angle information is 20°, and the camera's current shooting angle is 50°, the angle difference is calculated from the minimum of the camera's preset angle range (0°) and the current shooting angle (50°). The difference is -50°, so the camera's target rotation direction is determined to be leftward and the target rotation angle is 50°. At this point, the camera's field-of-view center line (a) coincides with the camera's 0° line.
If person (b) lies within the 30° angle difference between the 120° position of the preset angle range and the 180° position of the sound source angle range, i.e., the sound source angle corresponding to person (b)'s sound source angle information is 170°, and the camera's current shooting angle is 50°, the angle difference is calculated from the maximum of the camera's preset angle range (120°) and the current shooting angle (50°). The difference is 70°, so the camera's target rotation direction is determined to be rightward and the target rotation angle is 70°. At this point, the camera's field-of-view center line (b) coincides with the camera's 120° line.
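A sketch of the out-of-range handling described above, assuming the 0°–120° preset range: when the converted coordinate angle falls outside the range, it is clamped to the nearest limit before the rotation is computed. The names are hypothetical.

```java
/** Clamps a converted coordinate angle to the camera's preset pan range (0°-120°). */
public final class PresetRangeClamp {
    private static final double MIN_ANGLE = 0.0;
    private static final double MAX_ANGLE = 120.0;

    /** Returns the angle the camera should actually be driven toward. */
    public static double clamp(double coordinateAngle) {
        if (coordinateAngle < MIN_ANGLE) {
            return MIN_ANGLE;  // e.g. -10° for person (a) -> drive to the 0° limit
        }
        if (coordinateAngle > MAX_ANGLE) {
            return MAX_ANGLE;  // e.g. 140° for person (b) -> drive to the 120° limit
        }
        return coordinateAngle;
    }
}
```

Combined with the PanCommand sketch above, clamp(-10) with a current shooting angle of 50° gives a leftward rotation of 50°, and clamp(140) gives a rightward rotation of 70°, matching the two examples.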
Therefore, even if the sound source angle corresponding to the person's position exceeds the camera's preset rotation angle range, the display device provided by the embodiments of the present application can still rotate the camera to the position of the minimum or maximum of the preset angle range according to the person's position and, relying on the coverage of the camera's viewing angle, capture an image containing the person.
It can be seen that in the display device provided by the embodiments of the present application, the camera can rotate within a preset angle range, and the controller is configured to: acquire the person's sound source information collected by the sound collector and perform sound source recognition to determine sound source angle information identifying the azimuth angle of the person's position; determine the camera's target rotation direction and target rotation angle based on the camera's current shooting angle and the sound source angle information; and adjust the camera's shooting angle according to the target rotation direction and target rotation angle so that the camera's shooting area faces the position where the person is speaking. Thus, the display device provided by the present application can trigger the rotation of the camera using the person's sound source information, automatically identify the user's real-time position, and adjust the camera's shooting angle, so that the camera can always capture images containing the portrait.
When adjusting the camera's shooting angle, the display device provided by the foregoing embodiments performs the adjustment in the horizontal direction based on the sound source information generated when the person interacts with the display device by voice, so that the person's portrait can appear within the camera's shooting area and an image including the portrait can be captured.
After the shooting angle has been adjusted, the camera's field-of-view center line may still not be aimed at the person when capturing the portrait. As a result, the portrait will not be located at the center of the captured image; the portrait will appear offset, which affects the visual effect. Therefore, after adjusting the camera's shooting angle to capture the portrait, the display device can also locate the portrait position through automatic focusing, so as to display the portrait in the central area of the image.
Since the person may be standing or sitting while interacting with the display device, there are different height differences between the person's face and the camera. Therefore, after the camera's shooting angle is adjusted using the person's sound source information, the camera's shooting area may lie above or below the person's head, which prevents the camera from capturing the person's portrait completely.
Therefore, when the camera's shooting area is above the person's head, the camera needs to be adjusted downward in the vertical direction; when the camera's shooting area is below the person's head, the camera needs to be adjusted upward in the vertical direction; when the camera's shooting area is to the left of the person's head, the camera needs to be adjusted rightward in the horizontal direction; and when the camera's shooting area is to the right of the person's head, the camera needs to be adjusted leftward in the horizontal direction.
FIG. 17 exemplarily shows another scene diagram of the camera rotating within the preset angle range according to some embodiments. The camera can rotate in both the horizontal and vertical directions. Therefore, the camera's preset angle range covers 0 to 120° horizontally and 0 to 105° vertically. FIG. 17 exemplarily shows the camera's rotation angles in the vertical direction (pitch 0°, pitch 90°, pitch 105°) and in the horizontal direction (horizontal 0°, horizontal 60°, horizontal 120°).
To this end, the display device provided by the embodiments of the present application, after adjusting the camera's shooting angle based on the sound source information as in the foregoing embodiments so that the person's portrait is included, further identifies the person's position accurately through camera image detection and calculates the offset between the person's portrait and the image center of the camera, so as to fine-tune the camera's shooting angle again in the horizontal and vertical directions. The person's portrait is thereby placed at the center of the image captured by the camera, ensuring that the person is centered in the displayed image.
FIG. 18 exemplarily shows a flowchart of a camera control method according to some embodiments; FIG. 19 exemplarily shows an overall data flow diagram of a camera control method according to some embodiments. Referring to FIG. 18 and FIG. 19, in a display device provided by an embodiment of the present application, when fine-tuning the camera, the controller is configured to perform the following steps:
S21: Acquire the camera's shooting parameters and the captured designated image in which the person is located within the camera's shooting area.
When fine-tuning the camera's shooting angle, in some embodiments, image detection is used: by identifying the portrait of the person being photographed in the image, automatic focusing and positioning are performed, and the camera's shooting angle is adjusted so that the portrait is displayed at the center of the image.
To this end, when fine-tuning the camera, the controller acquires in real time the designated image captured by the camera; the designated image includes the portrait formed by the person within the camera's shooting area.
In some embodiments, if the camera has already gone through the process of adjusting its shooting angle based on the person's sound source information, the camera that captures the designated image is the camera after the shooting-angle adjustment. The controller then needs to acquire the shooting parameters of the adjusted camera and the designated image, captured by it, in which the person is located within the camera's shooting area.
The camera's shooting parameters include the camera's horizontal viewing angle, the image's horizontal width, the camera's vertical viewing angle, and the image's vertical height. The horizontal viewing angle means that the camera's preset angle range in the horizontal direction is 0 to 120°; the vertical viewing angle means that the camera's preset angle range in the vertical direction is 0 to 105°. The image's horizontal width and vertical height are determined by the camera's resolution: if the camera supports 1080P image preview, the image's horizontal width is 1920 pixels and its vertical height is 1080 pixels.
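Purely as an illustration, the shooting parameters listed above could be grouped into a simple value object; the field names below are assumptions, not the original implementation.

```java
/** Camera shooting parameters used by the fine-tuning step. */
public final class ShootingParameters {
    public final double horizontalViewAngle; // camera horizontal viewing angle, e.g. up to 120°
    public final double verticalViewAngle;   // camera vertical viewing angle, e.g. up to 105°
    public final int imageWidth;             // image horizontal width, e.g. 1920 px for 1080P preview
    public final int imageHeight;            // image vertical height, e.g. 1080 px for 1080P preview

    public ShootingParameters(double horizontalViewAngle, double verticalViewAngle,
                              int imageWidth, int imageHeight) {
        this.horizontalViewAngle = horizontalViewAngle;
        this.verticalViewAngle = verticalViewAngle;
        this.imageWidth = imageWidth;
        this.imageHeight = imageHeight;
    }
}
```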
S22: Perform recognition processing on the designated image to obtain the portrait area position corresponding to the person, where the portrait area position refers to the area that includes the image of the person's head.
To perform positioning and focused display based on the portrait, the controller performs recognition processing on the designated image captured by the camera, identifies the portrait in the image, and obtains the position of the head area as the portrait area position, so that the portrait can be accurately displayed in the central area of the designated image.
The designated image captured by the camera can be displayed on the display synchronously for preview, and the portrait area position can be shown in the designated image in the form of a face frame, so the designated image shown on the display also contains the face frame. A face frame is a rectangular or square frame that encloses the head and/or a small part of the body of the portrait.
Since multiple people may interact with the display device, the designated image captured by the camera may include the portraits of multiple people; when determining the portrait area position, the portraits of all of these people must be considered together.
Specifically, when performing recognition processing on the designated image to obtain the portrait area position corresponding to the person, the controller is further configured to:
Step 221: Perform recognition processing on the designated image to obtain head area position information corresponding to at least one person.
Step 222: Calculate the total area information of the head area position information corresponding to the at least one person, and use the position corresponding to the total area information as the portrait area position corresponding to the person, where the portrait area position refers to the total area including the head images of the at least one person.
The number of people in the designated image is identified; if there are multiple portraits in the designated image, head area position information corresponding to the multiple people is obtained. The head area position information refers to the position information of the area enclosed by a face frame and may exist in the form of coordinates; each person's portrait has a corresponding face frame, with a one-to-one relationship between portraits and face frames.
Calculating the total area information of the head area position information corresponding to the at least one person means combining the face frames corresponding to the individual people to obtain a total face frame, which is the smallest rectangular area formed by the total area enclosed by the multiple face frames.
When there are multiple portraits in the designated image, the portrait area position corresponding to the total face frame contains the head images of the multiple people. The portrait area position can also be formed by taking the head position of the topmost person in the designated image as the top boundary point of the total face frame, the head position of the bottommost person as the bottom boundary point, the head position of the leftmost person as the left boundary point, and the head position of the rightmost person as the right boundary point. Lines parallel to the corresponding sides of the display are then drawn through the four boundary points; these four lines are pairwise perpendicular, and their intersections yield the rectangular total face frame.
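A minimal sketch of how the total face frame could be computed as the smallest rectangle enclosing all detected head areas is shown below; the FaceFrame type and method names are assumptions used only for illustration.

```java
import java.util.List;

/** Axis-aligned face frame given in pixel coordinates (left, top, right, bottom). */
final class FaceFrame {
    final int left, top, right, bottom;
    FaceFrame(int left, int top, int right, int bottom) {
        this.left = left; this.top = top; this.right = right; this.bottom = bottom;
    }
}

final class TotalFaceFrame {
    /** Returns the smallest rectangle enclosing every detected head area. */
    static FaceFrame enclose(List<FaceFrame> frames) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE, bottom = Integer.MIN_VALUE;
        for (FaceFrame f : frames) {
            left = Math.min(left, f.left);       // leftmost head position -> left boundary
            top = Math.min(top, f.top);          // topmost head position -> top boundary
            right = Math.max(right, f.right);    // rightmost head position -> right boundary
            bottom = Math.max(bottom, f.bottom); // bottommost head position -> bottom boundary
        }
        return new FaceFrame(left, top, right, bottom);
    }
}
```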
S23: Calculate the azimuth distance between the area center of the portrait area position and the image center of the designated image, where the azimuth distance identifies a horizontal distance and a vertical distance.
To accurately determine the shooting angle by which the camera needs to be fine-tuned so that the portrait lies at the image center of the designated image, the azimuth distance between the area center of the portrait area position and the image center of the designated image is calculated first; this azimuth distance is the basis for controlling the camera to adjust its shooting angle.
Since the portrait may be offset from the designated image both horizontally and vertically, the camera is fine-tuned by adjusting it in both the horizontal and vertical directions; the azimuth distance therefore includes a horizontal distance and a vertical distance.
FIG. 20 exemplarily shows a flowchart of a method for calculating the azimuth distance according to some embodiments; FIG. 21 exemplarily shows a schematic diagram of calculating the azimuth distance according to some embodiments. Referring to FIG. 20 and FIG. 21, when calculating the azimuth distance between the area center of the portrait area position and the image center of the designated image, the controller is further configured to:
S231: Acquire the coordinate information of the portrait area position and the image center coordinate information of the designated image, where the image center coordinate information includes the image horizontal coordinate and the image vertical coordinate.
In some embodiments, the azimuth distance can be calculated from the coordinate position of the portrait area and the coordinate position of the designated image. When recognizing and detecting the designated image, the controller can obtain the coordinate information of each vertex of the portrait area position, that is, the pixel coordinate values of the top-left, top-right, bottom-left, and bottom-right vertices.
The image center P0 of the designated image is the center point of the picture captured by the camera, that is, the center point of the display. Since the designated image is captured by the camera, its size matches the camera's resolution: for a given resolution, the pixel width and height of the captured image are fixed, so the image center coordinate information of the designated image can be determined from the camera's resolution.
For example, if the camera supports 1080P image preview, the image's horizontal width is 1920 pixels and its vertical height is 1080 pixels. Taking the upper-left corner of the display as the coordinate origin, with the X axis positive from left to right along the display surface and the Y axis positive from top to bottom, the horizontal coordinate of the image center of the designated image is 960 pixels and the vertical coordinate is 540 pixels; that is, the image center coordinates P0(x0, y0) of the designated image are (960, 540).
S232: Calculate the area center coordinates of the portrait area position based on the coordinate information of the portrait area position, where the area center coordinates include the area center horizontal coordinate and the area center vertical coordinate.
The coordinate information of the portrait area position determines the pixel coordinates of its four vertices, from which the area center horizontal coordinate and area center vertical coordinate of the area center P1 of the portrait area position can be calculated.
For example, if the coordinates of the portrait area position are: top-left vertex A(200, 100), top-right vertex B(500, 100), bottom-left vertex C(200, 400), and bottom-right vertex D(500, 400), the calculated area center coordinates P1(x1, y1) of the portrait area position are (350, 250).
S233: Calculate the difference between the area center horizontal coordinate of the portrait area position and the image horizontal coordinate of the designated image to obtain the horizontal distance between the area center of the portrait area position and the image center of the designated image.
S234: Calculate the difference between the area center vertical coordinate of the portrait area position and the image vertical coordinate of the designated image to obtain the vertical distance between the area center of the portrait area position and the image center of the designated image.
When determining the azimuth distance between the area center of the portrait area position and the image center of the designated image, the horizontal distance and the vertical distance are calculated separately. The horizontal distance D is determined by the difference between the area center horizontal coordinate x1 of the portrait area position and the image horizontal coordinate x0 of the designated image; the vertical distance H is determined by the difference between the area center vertical coordinate y1 of the portrait area position and the image vertical coordinate y0 of the designated image.
For example, the horizontal distance D = x0 - x1 = 960 - 350 = 610, and the vertical distance H = y0 - y1 = 540 - 250 = 290. In some embodiments, both the horizontal distance and the vertical distance are expressed in pixel coordinate values.
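Continuing the worked example, the azimuth distance could be computed as sketched below; the coordinates come from the example above, but the helper itself and its names are only assumptions.

```java
/** Horizontal and vertical offsets between the portrait area center and the image center. */
final class AzimuthDistance {
    final double horizontal; // D = x0 - x1, in pixels
    final double vertical;   // H = y0 - y1, in pixels

    AzimuthDistance(double horizontal, double vertical) {
        this.horizontal = horizontal;
        this.vertical = vertical;
    }

    static AzimuthDistance between(double imageCenterX, double imageCenterY,
                                   double regionCenterX, double regionCenterY) {
        return new AzimuthDistance(imageCenterX - regionCenterX, imageCenterY - regionCenterY);
    }
}

// Example from the text: image center (960, 540), portrait area center (350, 250)
// AzimuthDistance d = AzimuthDistance.between(960, 540, 350, 250); // d.horizontal = 610, d.vertical = 290
```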
S24: If the azimuth distance exceeds the azimuth setting threshold, calculate the camera's target adjustment angle according to the azimuth distance and the camera's shooting parameters.
When the camera captures a portrait, if the portrait is not located in the central area of the captured image, there will be a certain distance between the area center of the portrait area position and the image center of the designated image. Therefore, a preset azimuth setting threshold can be used to determine whether there is a distance difference between the area center of the portrait area position and the image center of the designated image, and thereby whether the portrait lies in the central area of the designated image.
The azimuth distance between the area center of the portrait area position determined in the foregoing embodiments and the image center of the designated image is compared with the azimuth setting threshold. If this azimuth distance is greater than or equal to the azimuth setting threshold, the portrait is not located in the central area of the designated image, the camera has not focused its shot, and the portrait's display position in the designated image is off-center. Therefore, the camera's shooting angle needs to be controlled and adjusted until the portrait lies in the central area of the designated image.
To place the portrait captured by the camera in the central area of the designated image, the camera's target adjustment angle must first be determined. Since the camera can rotate both horizontally and vertically, the camera's target adjustment angle includes a target horizontal adjustment angle and a target vertical adjustment angle.
The image horizontal coordinate of the designated image may equal the area center horizontal coordinate of the portrait area position while the image vertical coordinate differs from the area center vertical coordinate; that is, the camera's shooting angle may face the person in the horizontal direction but deviate in the vertical direction. In this case, there is no need to adjust the camera's shooting angle horizontally; only the vertical shooting angle needs to be adjusted. Likewise, the image vertical coordinate of the designated image may equal the area center vertical coordinate of the portrait area position while the image horizontal coordinate differs from the area center horizontal coordinate; that is, the camera's shooting angle may face the person in the vertical direction but deviate in the horizontal direction. In this case, there is no need to adjust the camera's shooting angle vertically; only the horizontal shooting angle needs to be adjusted.
Therefore, to accurately determine whether the camera needs to adjust its shooting angle horizontally, vertically, or in both directions at once, the azimuth setting threshold used for the determination includes a horizontal setting threshold and a vertical setting threshold.
The camera's target adjustment angle is calculated from the azimuth distance and the camera's shooting parameters. The shooting parameters include the camera's viewing angle and the image size; specifically, they include the camera's horizontal viewing angle, the image's horizontal width, the camera's vertical viewing angle, and the image's vertical height. In some embodiments, the camera's horizontal viewing angle ranges from 0 to 120° and its vertical viewing angle ranges from 0 to 105°. If the camera supports 1080P image preview, the image's horizontal width is 1920 pixels and its vertical height is 1080 pixels.
In some embodiments, when determining the shooting angle the camera needs to adjust in the horizontal direction, the azimuth setting threshold is the horizontal setting threshold, the azimuth distance between the area center of the portrait area position and the image center of the designated image is the horizontal distance, and the camera's shooting parameters include the camera's horizontal viewing angle and the image's horizontal width.
In this case, when performing the step of calculating the camera's target adjustment angle according to the azimuth distance and the camera's shooting parameters if the azimuth distance exceeds the azimuth setting threshold, the controller is further configured to: if the horizontal distance is greater than the horizontal setting threshold, calculate the camera's target horizontal adjustment angle from the horizontal distance, the camera's horizontal viewing angle, and the image's horizontal width.
If the horizontal distance between the area center of the portrait area position and the image center of the designated image is greater than or equal to the horizontal setting threshold, the portrait's horizontal position deviates from the center position of the designated image, so there is a certain distance between the area center of the portrait area position and the image center of the designated image. Therefore, in order to place the portrait at the center of the designated image, the camera's shooting angle must be adjusted: the camera's target horizontal adjustment angle θ1 is calculated from the horizontal distance D between the area center of the portrait area position and the image center of the designated image, the camera's horizontal viewing angle α, and the image's horizontal width IW.
FIG. 22 exemplarily shows a schematic diagram of the camera's horizontal viewing angle according to some embodiments; FIG. 23 exemplarily shows a schematic diagram of calculating the target horizontal adjustment angle according to some embodiments. When calculating the camera's target horizontal adjustment angle, referring to FIG. 22 and FIG. 23: the image horizontal width IW = 1920, and the image horizontal coordinate of the designated image x0 = 960; the area center horizontal coordinate of the portrait area position is x1, and the camera's horizontal viewing angle is α.
Calculate the horizontal distance between the area center of the portrait area position and the image center of the designated image: D = x0 - x1.
Then calculate the camera's target horizontal adjustment angle: θ1 = atan(2 * |x0 - x1| * tan(α/2) / IW).
If the person is located to the left when facing the display, the area center of the portrait area position lies to the left of the image center of the designated image, i.e., x0 > x1 (the state shown in FIG. 21). If the person is located to the right when facing the display, the area center of the portrait area position lies to the right of the image center of the designated image, i.e., x0 < x1 (the state shown in FIG. 23). Thus, when calculating the horizontal distance D between the area center of the portrait area position and the image center of the designated image, a negative value may occur; therefore, to obtain the camera's target horizontal adjustment angle accurately, the absolute value of the difference x0 - x1 is taken when calculating the horizontal distance D.
In some embodiments, when determining the shooting angle the camera needs to adjust in the vertical direction, the azimuth setting threshold is the vertical setting threshold, the azimuth distance between the area center of the portrait area position and the image center of the designated image is the vertical distance, and the camera's shooting parameters include the camera's vertical viewing angle and the image's vertical height.
In this case, when performing the step of calculating the camera's target adjustment angle according to the distance and the camera's shooting parameters if the azimuth distance exceeds the azimuth setting threshold, the controller is further configured to: if the vertical distance is greater than the vertical setting threshold, calculate the camera's target vertical adjustment angle from the vertical distance, the camera's vertical viewing angle, and the image's vertical height.
If the vertical distance between the area center of the portrait area position and the image center of the designated image is greater than or equal to the vertical setting threshold, the portrait's vertical position deviates from the center position of the designated image, so there is a certain distance between the area center of the portrait area position and the image center of the designated image. Therefore, in order to place the portrait at the center of the designated image, the camera's shooting angle must be adjusted: the camera's target vertical adjustment angle θ2 is calculated from the vertical distance H between the area center of the portrait area position and the image center of the designated image, the camera's vertical viewing angle β, and the image's vertical height IH.
FIG. 24 exemplarily shows a schematic diagram of the camera's vertical viewing angle according to some embodiments; FIG. 25 exemplarily shows a schematic diagram of calculating the target vertical adjustment angle according to some embodiments. When calculating the camera's target vertical adjustment angle, referring to FIG. 24 and FIG. 25: the image vertical height IH = 1080, and the image vertical coordinate of the designated image y0 = 540; the area center vertical coordinate of the portrait area position is y1, and the camera's vertical viewing angle is β.
Calculate the vertical distance between the area center of the portrait area position and the image center of the designated image: H = y0 - y1.
Then calculate the camera's target vertical adjustment angle: θ2 = atan(2 * |y0 - y1| * tan(β/2) / IH).
If the person is located toward the top when facing the display, the area center of the portrait area position lies above the image center of the designated image, i.e., y0 > y1 (the state shown in FIG. 21). If the person is located toward the bottom when facing the display, the area center of the portrait area position lies below the image center of the designated image, i.e., y0 < y1 (the state shown in FIG. 25). Thus, when calculating the vertical distance H between the area center of the portrait area position and the image center of the designated image, a negative value may occur; therefore, to obtain the camera's target vertical adjustment angle accurately, the absolute value of the difference y0 - y1 is taken when calculating the vertical distance H.
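The two formulas above can be combined into a single helper, sketched below under the same assumptions; α and β are taken in degrees and the pixel distances in pixels, and Math.toRadians/Math.toDegrees are used because the original formulas do not state their angle units.

```java
/** Computes the fine-tuning pan/tilt angles from the pixel offsets, following
 *  theta1 = atan(2*|x0-x1|*tan(a/2)/IW) and theta2 = atan(2*|y0-y1|*tan(b/2)/IH). */
final class FineTuneAngles {
    /** Target horizontal adjustment angle theta1, in degrees. */
    static double horizontal(double horizontalDistance, double horizontalViewAngleDeg, int imageWidth) {
        double halfFov = Math.toRadians(horizontalViewAngleDeg / 2.0);
        return Math.toDegrees(Math.atan(2.0 * Math.abs(horizontalDistance) * Math.tan(halfFov) / imageWidth));
    }

    /** Target vertical adjustment angle theta2, in degrees. */
    static double vertical(double verticalDistance, double verticalViewAngleDeg, int imageHeight) {
        double halfFov = Math.toRadians(verticalViewAngleDeg / 2.0);
        return Math.toDegrees(Math.atan(2.0 * Math.abs(verticalDistance) * Math.tan(halfFov) / imageHeight));
    }
}
```

Taking the absolute value of the pixel offset mirrors the note above: the sign of D and H is used only to pick the rotation direction, while the magnitude drives the adjustment angle.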
It can be seen that when the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the designated image exceeds the horizontal setting threshold, and/or the azimuth distance in the vertical direction exceeds the vertical setting threshold, the camera's target horizontal adjustment angle and/or target vertical adjustment angle is calculated from the azimuth distance and the camera's shooting parameters. The controller adjusts the camera's shooting angle according to the target horizontal adjustment angle and/or target vertical adjustment angle, which ensures that the portrait captured by the camera lies in the central area of the designated image.
S25: Adjust the camera's shooting angle based on the camera's target adjustment angle, so that the person's portrait lies in the central area of the designated image captured by the camera.
After determining the target adjustment angle by which the camera needs to be adjusted, the controller can send a control instruction to the motor control service, which responds to the control instruction by controlling the camera to adjust its shooting angle. When the adjusted camera captures a portrait, the portrait lies in the central area of the designated image captured by the camera.
In the direction facing the display, if the area center of the portrait area position lies horizontally to the left of the image center of the designated image, the camera is rotated to the right by the target horizontal adjustment angle; otherwise it is rotated to the left. If the area center of the portrait area position lies above the image center of the designated image, the camera is rotated downward by the target vertical adjustment angle; otherwise it is rotated upward.
When the controller controls the camera to adjust its shooting angle, if the rotation speed is too fast, the image jitters and the camera stops unsteadily when it reaches the specified angle. Therefore, in order to obtain a stable image, the rotation direction and rotation speed of the camera when adjusting the shooting angle need to be determined accurately. Specifically, when adjusting the camera's shooting angle based on the camera's target adjustment angle, the controller is further configured to:
Step 251: Determine the camera's target rotation speed and target adjustment direction according to the camera's target adjustment angle.
Since the camera's default rotation speed is 90°/s, rotating the camera at the default maximum speed would make the camera rotate too fast, causing image jitter and an unsteady stop when the specified angle is reached. Therefore, in some embodiments, the camera's rotation speed is associated with the target adjustment angle.
A maximum speed logical value and a minimum speed logical value are set so that the camera rotates within the speed range corresponding to these two values, allowing the camera to be adjusted at a target rotation speed that corresponds to the particular target adjustment angle. For example, the default maximum speed logical value is 100, i.e., 100°/s, and the minimum speed logical value is 10, i.e., 10°/s.
In some embodiments, if the camera's target adjustment angle is greater than or equal to the maximum speed logical value, the maximum speed logical value is used as the camera's target rotation speed. For example, if the camera's target adjustment angle is greater than or equal to the maximum speed logical value of 100, the camera's target rotation speed is set to 100°/s.
In some embodiments, if the camera's target adjustment angle is less than or equal to the minimum speed logical value, the minimum speed logical value is used as the camera's target rotation speed. For example, if the camera's target adjustment angle is less than or equal to the minimum speed logical value of 10, the camera's target rotation speed is set to 10°/s.
In some embodiments, if the camera's target adjustment angle lies between the maximum speed logical value and the minimum speed logical value, the value of the target adjustment angle is used as the camera's target rotation speed. If the camera's target adjustment angle lies between 10 and 100, the actual target adjustment angle is set as the camera's target rotation speed. For example, if the camera's target adjustment angle is 30, the camera's target rotation speed is set to 30°/s.
It can be seen that, based on the calculated camera target adjustment angle, the corresponding camera rotation speed is set before the rotation starts, and the camera rotation is then performed. As a result, a small adjustment angle yields a relatively gentle rotation speed, while a large adjustment angle yields a faster rotation, so that the camera can adjust its shooting angle promptly and stably and the portrait lies in the central area of the designated image.
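A sketch of this speed-selection rule, assuming the logical values 10 and 100 given above and a one-to-one mapping between the logical value and degrees per second; the names are illustrative only.

```java
/** Maps the target adjustment angle to a rotation speed, clamped to [10, 100] degrees/s. */
final class RotationSpeed {
    private static final double MIN_SPEED = 10.0;  // minimum speed logical value
    private static final double MAX_SPEED = 100.0; // maximum speed logical value

    static double forAdjustmentAngle(double targetAdjustmentAngle) {
        if (targetAdjustmentAngle >= MAX_SPEED) {
            return MAX_SPEED;            // large adjustment -> fastest rotation
        }
        if (targetAdjustmentAngle <= MIN_SPEED) {
            return MIN_SPEED;            // small adjustment -> gentle rotation
        }
        return targetAdjustmentAngle;    // otherwise use the angle value itself, e.g. 30 -> 30 degrees/s
    }
}
```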
When determining the camera's target adjustment direction, the determination can be made from the sign of the azimuth distance between the area center of the portrait area position and the image center of the designated image. In the horizontal direction, if the horizontal distance (D = x0 - x1) is negative, the image center of the designated image captured by the camera lies to the left of the area center of the portrait area position; in this case, to place the portrait in the central area of the designated image, the camera's shooting angle must be adjusted to the right, so the camera's target adjustment direction is determined to be rightward. Conversely, if the horizontal distance (D = x0 - x1) is positive, the camera's target adjustment direction is determined to be leftward.
In the vertical direction, if the vertical distance (H = y0 - y1) is negative, the image center of the designated image captured by the camera lies above the area center of the portrait area position; in this case, to place the portrait in the central area of the designated image, the camera's shooting angle must be adjusted downward, so the camera's target adjustment direction is determined to be downward. Conversely, if the vertical distance (H = y0 - y1) is positive, the camera's target adjustment direction is determined to be upward.
Step 252: Adjust the camera's shooting angle according to the target adjustment angle, the target adjustment direction, and the target rotation speed.
Once the camera's target adjustment angle, target adjustment direction, and target rotation speed have been determined, the camera can be controlled to perform the corresponding rotation to adjust its shooting angle and achieve focused positioning on the person's position, so that the portrait captured by the camera lies at the center of the designated image and is displayed in the central area of the display.
It can be seen that, when controlling the camera, the display device provided by the embodiments of the present application builds on the scheme of the foregoing embodiments, in which the camera's shooting angle is roughly adjusted using the person's sound source information, by additionally recognizing and detecting the image captured by the camera, so as to adjust the shooting angle more precisely and effectively locate the person's specific position, with high portrait detection accuracy in the captured image. The display device provided by this embodiment makes combined use of sound source localization and camera image analysis: taking advantage of the strong spatial perception of sound source localization, it first determines the person's approximate position and drives the camera toward the sound source direction; at the same time, taking advantage of the high accuracy of camera image analysis, it performs person detection on the captured image to determine the specific position and drives the camera to fine-tune, thereby achieving precise positioning, so that the person captured by the camera can be displayed in the central area of the designated image and shown in focus on the display. The display device provided by this embodiment is suitable for scenarios such as video calls and fitness; when the person's standing position is not within the default camera shooting area, it is very effective for quickly and accurately locating and focusing on the person.
前述实施例提供的显示设备，基于人像区域位置的区域中心与指定图像的图像中心在水平方向上的方位距离超过方位设定阈值，通过再次微调整摄像头的拍摄角度实现人像显示在显示器的中心区域。而在其他实施例中，如果人像区域位置的区域中心与指定图像的图像中心在水平方向上的方位距离未超过方位设定阈值，说明人像在指定图像中的显示并未出现偏离，体现在显示器中时，人像可显示在显示器的中心。此时，则无需微调整摄像头的拍摄角度。In the display device provided by the foregoing embodiment, when the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the specified image exceeds the azimuth setting threshold, the portrait is displayed in the center area of the display by finely adjusting the shooting angle of the camera again. In other embodiments, if the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the specified image does not exceed the azimuth setting threshold, the display of the portrait in the specified image is not offset, and when presented on the display, the portrait can be shown at the center of the display. In this case, there is no need to finely adjust the shooting angle of the camera.
但是,如果人物站立在距离显示设备上的摄像头较远的位置,那么摄像头采集到的指定图像中,人像显示的区域较小,导致人物无法在较远距离观看到显示器中显示的自己的人像。因此,为使人物在距离较远的情况下,仍然能够看清自己的人像,本申请实施例提供的显示设备,可对人像区域位置进行人像聚焦放大显示。However, if the person is standing far away from the camera on the display device, in the designated image captured by the camera, the area displayed for the person's portrait is small, so that the person cannot view the person's portrait displayed on the display from a long distance. Therefore, in order for the person to still be able to clearly see his own portrait even when the distance is relatively far, the display device provided by the embodiment of the present application can perform a portrait focus and magnification display on the position of the portrait area.
图26中示例性示出了根据一些实施例的人像聚焦放大显示的方法流程图。具体地,参见图26,基于前述实施例提供的显示设备的基础上,控制器被进一步配置为:FIG. 26 exemplarily shows a flowchart of a method for focusing and zooming in on a portrait display according to some embodiments. Specifically, referring to FIG. 26 , based on the display device provided by the foregoing embodiment, the controller is further configured as:
S26、如果方位距离未超过方位设定阈值,则获取预设数量帧的指定图像。S26. If the azimuth distance does not exceed the azimuth setting threshold, acquire a specified image of a preset number of frames.
S27、如果预设数量帧的指定图像中人像区域位置不变,则识别指定图像中的人像区域位置的尺寸。S27. If the position of the portrait area in the designated image of the preset number of frames does not change, identify the size of the position of the portrait area in the designated image.
S28、如果人像区域位置的尺寸小于或等于指定图像的预设比例,则将指定图像中的人像区域位置在显示器中进行人像聚焦放大显示。S28. If the size of the portrait area position is smaller than or equal to the preset ratio of the specified image, perform a portrait focus and magnification display on the display at the portrait area position in the specified image.
如果人像区域位置的区域中心与指定图像的图像中心在水平方向上的方位距离未超过方位设定阈值，说明摄像头当前采集到的指定图像中人像正处于中心位置，在此场景下，控制器获取预设数量帧的指定图像。在一些实施例中，预设数量帧可为20帧。If the azimuth distance in the horizontal direction between the area center of the portrait area position and the image center of the specified image does not exceed the azimuth setting threshold, the portrait in the specified image currently captured by the camera is at the center position. In this scenario, the controller acquires specified images of a preset number of frames. In some embodiments, the preset number of frames may be 20 frames.
若预设数量帧的指定图像中人像区域位置不变，说明人物当前保持相对区域不动。此时，由控制器对预设数量帧的图像识别，以判定人像区域位置相比整个指定图像区域占比较小时，会自动将人头部所在区域进行聚焦放大，以适应人物与显示设备远距离需求。If the position of the portrait area does not change across the specified images of the preset number of frames, the person is currently holding a relatively fixed position. In this case, the controller performs image recognition on the preset number of frames, and when it determines that the portrait area position occupies a small proportion of the entire specified image area, it automatically focuses on and enlarges the area where the person's head is located, so as to accommodate a large distance between the person and the display device.
在一些实施例中，预设比例可设为三分之一，如果人像区域位置的尺寸小于或等于指定图像的三分之一，则说明人像区域位置显示过小，需进行聚焦放大显示。人像区域位置的比例计算方式可以像素面积（像素点数量）来计算。In some embodiments, the preset ratio may be set to one third. If the size of the portrait area position is smaller than or equal to one third of the specified image, the portrait area position is displayed too small and needs to be focused on and enlarged. The proportion of the portrait area position can be calculated in terms of pixel area (number of pixels).
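A minimal sketch of this check, assuming the 20-frame window and the one-third threshold mentioned above and an (x, y, w, h) bounding-box format (the function and parameter names are illustrative only):

```python
def should_zoom_on_portrait(portrait_boxes, image_size,
                            preset_frames=20, preset_ratio=1 / 3):
    """Decide whether to focus on and enlarge the portrait area.

    portrait_boxes: list of (x, y, w, h) portrait-area boxes, newest last,
                    one per recent frame of the specified image
    image_size:     (width, height) of the specified image in pixels
    """
    if len(portrait_boxes) < preset_frames:
        return False                       # not enough frames accumulated yet

    recent = portrait_boxes[-preset_frames:]
    if any(box != recent[0] for box in recent):
        return False                       # portrait area moved within the window

    # Compare pixel areas: portrait area vs. the whole specified image.
    _, _, w, h = recent[0]
    img_w, img_h = image_size
    return w * h <= preset_ratio * img_w * img_h
```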
在对指定图像中人像区域位置进行聚焦放大时，采用人像区域位置与显示器宽高比值的对比方式进行放大。具体地，控制器在执行如果人像区域位置的尺寸小于或等于指定图像的预设比例，则将指定图像中的人像区域位置在显示器中进行人像聚焦放大显示，被进一步配置为：When focusing on and enlarging the portrait area position in the specified image, the enlargement is performed by comparing the aspect ratio of the portrait area position with that of the display. Specifically, when executing the step of displaying the portrait area position in the specified image on the display with portrait focus and magnification if the size of the portrait area position is smaller than or equal to the preset ratio of the specified image, the controller is further configured to:
步骤281、如果人像区域位置的尺寸小于或等于指定图像的预设比例,则计算显示器的宽高比值和人像区域位置的宽高比值。Step 281: If the size of the portrait area position is smaller than or equal to the preset ratio of the specified image, calculate the aspect ratio value of the display and the aspect ratio value of the portrait area position.
图27中示例性示出了根据一些实施例的人像聚焦放大显示的示意图。参见图27，在一些实施例中，对指定图像中人像区域位置进行聚焦放大时，可依据人像区域位置与显示器的宽高比值来确定。因此，需分别计算显示器的宽高比值和人像区域位置的宽高比值。FIG. 27 exemplarily shows a schematic diagram of a focused and enlarged portrait display according to some embodiments. Referring to FIG. 27, in some embodiments, when focusing on and enlarging the portrait area position in the specified image, the enlargement may be determined according to the aspect ratio of the portrait area position and the aspect ratio of the display. Therefore, the aspect ratio of the display and the aspect ratio of the portrait area position need to be calculated separately.
宽高比值可根据像素坐标值来计算，显示器的宽高比值是显示器的宽度值与高度值的比值，而显示器的宽度值和高度值与摄像头的分辨率相同，即如果摄像头支持1080P图像预览，则图像水平宽度为1920像素，图像垂直高度为1080像素，那么显示器的宽度值为1920像素，高度值为1080像素。The aspect ratio can be calculated from pixel coordinate values. The aspect ratio of the display is the ratio of the width value to the height value of the display, and the width and height values of the display are the same as the resolution of the camera. That is, if the camera supports 1080P image preview, the horizontal width of the image is 1920 pixels and the vertical height of the image is 1080 pixels, so the width value of the display is 1920 pixels and the height value is 1080 pixels.
人像区域位置的宽高比值是指人像区域位置的宽度值和高度值的比值。人像区域位置可仅包括人物头部位置，或者，包括人物头部位置和少许肢体部分。人像区域位置的宽度值与高度值可依据坐标值进行确定，具体的方法可参照前述实施例中确定人像区域位置的坐标信息的方法，此处不再赘述。The aspect ratio of the portrait area position refers to the ratio of the width value to the height value of the portrait area position. The portrait area position may include only the position of the person's head, or the position of the person's head together with a small part of the body. The width value and height value of the portrait area position can be determined from the coordinate values; for the specific method, reference may be made to the method for determining the coordinate information of the portrait area position in the foregoing embodiments, which will not be repeated here.
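As a small illustration of the two ratios, assuming a 1080P camera preview as in the example above and an (x, y, w, h) bounding box for the portrait area position (the function names are illustrative only):

```python
def display_aspect_ratio(camera_resolution=(1920, 1080)):
    """Aspect ratio of the display, equal to the camera preview resolution."""
    i_w, i_h = camera_resolution
    return i_w / i_h            # e.g. 1920 / 1080 ≈ 1.78 for a 1080P preview


def portrait_aspect_ratio(portrait_box):
    """Aspect ratio of the portrait area from its bounding box (x, y, w, h)."""
    _, _, p_w, p_h = portrait_box
    return p_w / p_h
```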
步骤282、如果显示器的宽高比值和人像区域位置的宽高比值不一致时,则调整人像区域位置的宽高比值,人像区域位置的调整后的宽高比值与显示器的宽高比值相同。Step 282: If the aspect ratio value of the display is inconsistent with the aspect ratio value of the portrait area position, adjust the aspect ratio value of the portrait area position, and the adjusted aspect ratio value of the portrait area position is the same as the aspect ratio value of the display.
由于同时与显示设备进行交互的人物可为多个，那么摄像头采集到的指定图像中会包括多个人物的人像，那么多个人物的人像所围成的人像区域位置可为长方形或矩形。为了对人像区域位置进行放大时，不会造成人像的变形，需要将人像区域位置的宽高比值与显示器的宽高比值相同。Since multiple persons may interact with the display device at the same time, the specified image captured by the camera may include the portraits of multiple persons, and the portrait area position enclosed by those portraits may be a rectangular region of arbitrary proportions. In order to enlarge the portrait area position without deforming the portraits, the aspect ratio of the portrait area position needs to be made the same as the aspect ratio of the display.
如果显示器的宽高比值和人像区域位置的宽高比值不一致时，如图27中(a)所示的状态，则调整人像区域位置的宽高比值，以使得调整宽高比值后的人像区域位置的宽高比值与显示器的宽高比值相同，如图27中(b)所示的状态。显示器的宽高比值和人像区域位置的宽高比值不一致的情况包括两种情形，一是人像区域位置的宽高比值大于显示器的宽高比值的情形，二是人像区域位置的宽高比值小于显示器的宽高比值的情形。If the aspect ratio of the display and the aspect ratio of the portrait area position are inconsistent, as in the state shown in (a) of FIG. 27, the aspect ratio of the portrait area position is adjusted so that the aspect ratio of the portrait area position after adjustment is the same as the aspect ratio of the display, as in the state shown in (b) of FIG. 27. The inconsistency between the aspect ratio of the display and the aspect ratio of the portrait area position covers two cases: one in which the aspect ratio of the portrait area position is greater than the aspect ratio of the display, and one in which the aspect ratio of the portrait area position is smaller than the aspect ratio of the display.
在一些实施例中，如果人像区域位置的宽高比值大于显示器的宽高比值，则调整人像区域位置的高度值，人像区域位置的原宽度值与调整后的高度值的宽高比值与显示器的宽高比值相同。In some embodiments, if the aspect ratio of the portrait area position is greater than the aspect ratio of the display, the height value of the portrait area position is adjusted, so that the ratio of the original width value of the portrait area position to the adjusted height value is the same as the aspect ratio of the display.
如果人像区域位置的宽高比值大于显示器的宽高比值，则为了保持人像区域位置的尺寸同显示器的比例，应以人像区域位置的区域中心点为中心，向上下两边进行扩充，将人像区域位置的高度值增大。If the aspect ratio of the portrait area position is greater than the aspect ratio of the display, then in order to keep the size of the portrait area position in proportion to the display, the area should be expanded on the upper and lower sides about the area center point of the portrait area position, increasing the height value of the portrait area position.
那么,为了避免改变人像区域位置的区域中心点位置,需同时调整人像区域位置的高度值对应的上下两边,则上下两边各扩充大小为(IH*pW/IW-pH)/2,式中,IW为显示器的宽度值,IH为显示器的高度值,pW为人像区域位置的宽度值,pH为人像区域位置的高度值。Then, in order to avoid changing the position of the area center point of the portrait area position, it is necessary to adjust the upper and lower sides corresponding to the height value of the portrait area position at the same time, then the expansion size of the upper and lower sides is (IH*pW/IW-pH)/2, where, IW is the width value of the display, IH is the height value of the display, pW is the width value of the portrait area position, and pH is the height value of the portrait area position.
在一些实施例中，如果人像区域位置的宽高比值小于显示器的宽高比值，则调整人像区域位置的宽度值，人像区域位置的调整后的宽度值与原高度值的宽高比值与显示器的宽高比值相同。In some embodiments, if the aspect ratio of the portrait area position is smaller than the aspect ratio of the display, the width value of the portrait area position is adjusted, so that the ratio of the adjusted width value of the portrait area position to the original height value is the same as the aspect ratio of the display.
如果人像区域位置的宽高比值小于显示器的宽高比值，则为了保持人像区域位置的尺寸同显示器的比例，应以人像区域位置的区域中心点为中心，向左右两边进行扩充，将人像区域位置的宽度值增大。If the aspect ratio of the portrait area position is smaller than the aspect ratio of the display, then in order to keep the size of the portrait area position in proportion to the display, the area should be expanded on the left and right sides about the area center point of the portrait area position, increasing the width value of the portrait area position.
那么，为了避免改变人像区域位置的区域中心点位置，需同时调整人像区域位置的宽度值对应的左右两边，则左右两边各扩充大小为(pH*IW/IH-pW)/2，式中，IW为显示器的宽度值，IH为显示器的高度值，pW为人像区域位置的宽度值，pH为人像区域位置的高度值。Then, in order to avoid changing the position of the area center point of the portrait area position, the left and right sides corresponding to the width value of the portrait area position need to be adjusted at the same time, and each of the left and right sides is expanded by (pH*IW/IH-pW)/2, where IW is the width value of the display, IH is the height value of the display, pW is the width value of the portrait area position, and pH is the height value of the portrait area position.
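The two expansion rules above can be sketched as follows; the function name, the (x, y, w, h) box format with a top-left origin, and the tuple return value are assumptions of this sketch, not of the embodiment.

```python
def match_display_aspect(portrait_box, display_size):
    """Expand the portrait box about its center so its aspect ratio matches the display.

    portrait_box: (x, y, pW, pH), with (x, y) the top-left corner in pixels
    display_size: (IW, IH), display width and height in pixels
    Returns the adjusted (x, y, w, h) box.
    """
    x, y, p_w, p_h = portrait_box
    i_w, i_h = display_size

    if p_w / p_h > i_w / i_h:
        # Box is too wide: expand the top and bottom edges equally,
        # each by (IH * pW / IW - pH) / 2, keeping the area center fixed.
        pad = (i_h * p_w / i_w - p_h) / 2
        return (x, y - pad, p_w, p_h + 2 * pad)
    if p_w / p_h < i_w / i_h:
        # Box is too tall: expand the left and right edges equally,
        # each by (pH * IW / IH - pW) / 2, keeping the area center fixed.
        pad = (p_h * i_w / i_h - p_w) / 2
        return (x - pad, y, p_w + 2 * pad, p_h)
    return portrait_box  # ratios already match
```

For example, with a 1920×1080 display, a 180×135 box is widened by 30 pixels on each side to 240×135, which has the same 16:9 ratio as the display.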
步骤283、按照宽高比值调整后的人像区域位置,确定人像区域位置的目标放大区域。Step 283: Determine the target enlarged area for the position of the portrait area according to the position of the portrait area adjusted by the aspect ratio.
由于人像区域位置中仅包括人物头部位置，或者，包括人物头部位置和少许肢体部分，若将人像区域位置直接放大显示在显示器中，则会出现失真现象。因此，为了防止较小的人像区域位置放大到全屏显示时，因放大比例过大造成图像失真严重，需确定目标放大区域。Since the portrait area position includes only the position of the person's head, or the position of the person's head together with a small part of the body, directly enlarging the portrait area position onto the display would cause distortion. Therefore, in order to prevent severe image distortion caused by an excessive enlargement ratio when a small portrait area position is enlarged to full-screen display, a target enlargement area needs to be determined.
目标放大区域是显示器中将要显示的区域，目标放大区域包括人像区域位置以及周围区域位置，在一些实施例中，目标放大区域为人像区域位置的1.5倍左右。基于宽高比值调整后的人像区域位置按照1.5倍进行区域扩大，即可得到目标放大区域，如图27中(c)所示的虚线矩形区域。将目标放大区域对应的图像放大到全屏显示，不会造成图像失真现象。The target enlargement area is the area that will be shown on the display, and it includes the portrait area position and its surrounding area. In some embodiments, the target enlargement area is about 1.5 times the portrait area position. The target enlargement area is obtained by expanding the aspect-ratio-adjusted portrait area position by a factor of 1.5, as in the dotted rectangular area shown in (c) of FIG. 27. Enlarging the image corresponding to the target enlargement area to full-screen display does not cause image distortion.
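A minimal sketch of this step, assuming the same (x, y, w, h) box format as above and the approximately 1.5× factor mentioned in the text:

```python
def target_zoom_area(adjusted_box, scale=1.5):
    """Scale the aspect-ratio-adjusted portrait box about its area center.

    adjusted_box: (x, y, w, h); scale is about 1.5 per the embodiment above.
    Returns the target enlargement area as (x, y, w, h).
    """
    x, y, w, h = adjusted_box
    cx, cy = x + w / 2, y + h / 2          # the area center stays the center point
    new_w, new_h = w * scale, h * scale
    return (cx - new_w / 2, cy - new_h / 2, new_w, new_h)
```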
步骤284、将目标放大区域对应的人像进行聚焦放大,全屏显示在显示器中。Step 284 , focus and enlarge the portrait corresponding to the target enlargement area, and display it on the display in full screen.
将目标放大区域对应的图像放大到全屏显示,在实现人像聚焦放大显示的同时,不会造成图像失真现象。Enlarging the image corresponding to the target zoom area to a full-screen display will not cause image distortion while realizing the focus zoom display of the portrait.
由于目标放大区域是以显示器同比例调整后的人像区域位置以区域中心为中心点同比例放大得到的区域，可能会因为靠近图像边缘而有部分区域超出边界，因此，为避免放大显示时部分区域的图像无法显示在显示器中，则将目标放大区域根据超出边界部分进行调整，使超出边界部分与指定图像对应的边缘重合。Since the target enlargement area is obtained by proportionally enlarging, about its area center, the portrait area position that has been adjusted to the same aspect ratio as the display, part of the area may extend beyond the boundary when it is close to the edge of the image. Therefore, to avoid part of the area being unable to be shown on the display during enlarged display, the target enlargement area is adjusted according to the portion that exceeds the boundary, so that the portion exceeding the boundary coincides with the corresponding edge of the specified image.
具体地,控制器在执行将目标放大区域对应的人像进行聚焦放大,全屏显示在显示器中,被进一步配置为执行下述步骤:Specifically, the controller is further configured to perform the following steps when performing focusing and zooming in on the portrait corresponding to the target zoom area and displaying it on the display in full screen:
步骤2841、获取目标放大区域的中心点坐标。Step 2841: Obtain the coordinates of the center point of the target zoom-in area.
由于目标放大区域是以人像区域位置的区域中心为中心点，按照放大比例倍数进行放大后得到的区域，因此，目标放大区域的中心与人像区域位置的区域中心相同，目标放大区域的中心点坐标即为人像区域位置的区域中心坐标。Since the target enlargement area is obtained by enlarging the portrait area position about its area center by the enlargement ratio, the center of the target enlargement area is the same as the area center of the portrait area position, and the center point coordinates of the target enlargement area are the area center coordinates of the portrait area position.
步骤2842、计算中心点坐标与目标放大区域的任一条边界的第一距离，以及，中心点坐标与显示器的任一条边界的第二距离，目标放大区域的任一条边界与显示器的任一条边界位置相对应。Step 2842: calculate a first distance between the center point coordinates and any boundary of the target enlargement area, and a second distance between the center point coordinates and the corresponding boundary of the display, where each boundary of the target enlargement area corresponds in position to a boundary of the display.
人像区域位置被放大后,得到的目标放大区域的某一条边有可能超出显示器的某一条边界,例如,如图27中(c)所示,目标放大区域的左边界超出显示器的左边界。After the position of the portrait area is enlarged, a certain side of the obtained target enlarged area may exceed a certain boundary of the display. For example, as shown in (c) in Figure 27, the left boundary of the target enlarged area exceeds the left boundary of the display.
为了在目标放大区域的某一条边超过显示器的边界时对目标放大区域的位置进行调整，计算目标放大区域的中心点坐标与目标放大区域的任一条边界的第一距离L1，计算目标放大区域的中心点坐标与显示器的任一条边界的第二距离L2。In order to adjust the position of the target enlargement area when one of its edges exceeds a boundary of the display, a first distance L1 between the center point coordinates of the target enlargement area and any boundary of the target enlargement area is calculated, and a second distance L2 between the center point coordinates of the target enlargement area and the corresponding boundary of the display is calculated.
例如，计算目标放大区域的中心点坐标与目标放大区域的左边界的第一距离L11，计算目标放大区域的中心点坐标与目标放大区域的上边界的第一距离L12，计算目标放大区域的中心点坐标与目标放大区域的右边界的第一距离L13，计算目标放大区域的中心点坐标与目标放大区域的下边界的第一距离L14。For example, a first distance L11 between the center point coordinates of the target enlargement area and its left boundary is calculated, a first distance L12 between the center point coordinates and its upper boundary is calculated, a first distance L13 between the center point coordinates and its right boundary is calculated, and a first distance L14 between the center point coordinates and its lower boundary is calculated.
计算目标放大区域的中心点坐标与显示器的左边界的第二距离L21，计算目标放大区域的中心点坐标与显示器的上边界的第二距离L22，计算目标放大区域的中心点坐标与显示器的右边界的第二距离L23，计算目标放大区域的中心点坐标与显示器的下边界的第二距离L24。Similarly, a second distance L21 between the center point coordinates of the target enlargement area and the left boundary of the display is calculated, a second distance L22 between the center point coordinates and the upper boundary of the display is calculated, a second distance L23 between the center point coordinates and the right boundary of the display is calculated, and a second distance L24 between the center point coordinates and the lower boundary of the display is calculated.
步骤2843、如果第二距离与第一距离的距离差小于零,则按照距离差调整目标放大区域的位置。Step 2843: If the distance difference between the second distance and the first distance is less than zero, adjust the position of the target enlarged area according to the distance difference.
在判断目标放大区域是否超出显示器的某一条边界时,以位于同一侧的目标放大区域的那条边对应的第一距离与显示器的那条边对应的第二距离做差来判断。When judging whether the target enlargement area exceeds a certain boundary of the display, the judgment is based on the difference between the first distance corresponding to the side of the target enlargement area located on the same side and the second distance corresponding to the side of the display.
计算第二距离L2与第一距离L1的距离差，如果小于零，说明第一距离对应的那条边超出第二距离对应的那条边。例如，如图27中(c)所示，显示器的左边界对应的第二距离L21与目标放大区域的左边界对应的第一距离L11的距离差小于零，说明目标放大区域的左边界超出显示器的左边界。The distance difference between the second distance L2 and the first distance L1 is calculated; if it is less than zero, the edge corresponding to the first distance extends beyond the edge corresponding to the second distance. For example, as shown in (c) of FIG. 27, the distance difference between the second distance L21 corresponding to the left boundary of the display and the first distance L11 corresponding to the left boundary of the target enlargement area is less than zero, which means that the left boundary of the target enlargement area extends beyond the left boundary of the display.
在距离差小于零时,将目标放大区域整体沿目标放大区域超出显示器的那条边的相反方向移动位置,使得目标放大区域中超出显示器的那条边与显示器的那条边重合。例如,如图27中(d)所示,目标放大区域的左边界超出显示器的左边界,则将目标放大区域整体向右移动,使得目标放大区域的左边界与显示器的左边界重合。When the distance difference is less than zero, the entire target enlarged area is moved in the opposite direction of the side of the target enlarged area beyond the display, so that the side of the target enlarged area beyond the display coincides with the side of the display. For example, as shown in FIG. 27(d), if the left border of the target enlargement area exceeds the left border of the display, the entire target enlargement area is moved to the right so that the left border of the target enlargement area coincides with the left border of the display.
如果目标放大区域的右边界超出，则目标放大区域整体向左平移，使右边界与显示器右边界重合；如果目标放大区域的上边界超出，则目标放大区域整体向下平移，使上边界与显示器上边界重合；如果目标放大区域的下边界超出，则目标放大区域整体向上平移，使下边界与显示器下边界重合。If the right boundary of the target enlargement area extends beyond the display, the target enlargement area is translated to the left as a whole so that its right boundary coincides with the right boundary of the display; if the upper boundary extends beyond the display, the target enlargement area is translated downward as a whole so that its upper boundary coincides with the upper boundary of the display; if the lower boundary extends beyond the display, the target enlargement area is translated upward as a whole so that its lower boundary coincides with the lower boundary of the display.
目标放大区域移动位置时的移动程度以距离差来决定，即按照距离差的值来调整目标放大区域的位置。例如，如果距离差为L0=|L21-L11|，则将目标放大区域整体向右移动L0的距离，以使目标放大区域的左边界与显示器的左边界重合，使得目标放大区域内的所有图像都显示在显示器中。The extent to which the target enlargement area is moved is determined by the distance difference, that is, the position of the target enlargement area is adjusted according to the value of the distance difference. For example, if the distance difference is L0=|L21-L11|, the target enlargement area as a whole is moved to the right by the distance L0, so that the left boundary of the target enlargement area coincides with the left boundary of the display and all of the image within the target enlargement area is shown on the display.
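Steps 2842 and 2843 can be sketched as follows, assuming a display coordinate system with the origin at the top-left corner and the same (x, y, w, h) box format as above; the per-side difference tests reduce to simple comparisons against the display borders.

```python
def clamp_to_display(area, display_size):
    """Shift the target enlargement area so that no edge exceeds the display.

    area:         (x, y, w, h) target enlargement area in pixels
    display_size: (IW, IH) display width and height in pixels
    """
    x, y, w, h = area
    i_w, i_h = display_size
    cx, cy = x + w / 2, y + h / 2            # center point of the target area

    # For each side: second distance (center -> display border) minus
    # first distance (center -> area border). A negative difference means
    # that edge sticks out, so the whole area is shifted by |difference|
    # in the opposite direction until the two edges coincide.
    diff_left = cx - (cx - x)                # = x
    diff_right = (i_w - cx) - (x + w - cx)   # = IW - (x + w)
    diff_top = cy - (cy - y)                 # = y
    diff_bottom = (i_h - cy) - (y + h - cy)  # = IH - (y + h)

    if diff_left < 0:
        x -= diff_left                       # move right
    elif diff_right < 0:
        x += diff_right                      # move left
    if diff_top < 0:
        y -= diff_top                        # move down
    elif diff_bottom < 0:
        y += diff_bottom                     # move up

    return (x, y, w, h)
```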
步骤2844、将位置调整后的目标放大区域对应的人像进行聚焦放大,全屏显示在显示器中。Step 2844 , focus and zoom in on the portrait corresponding to the target zoom-in area whose position has been adjusted, and display it on the display in full screen.
位置调整后的目标放大区域内的所有图像都显示在显示器中，进而可对目标放大区域对应的人像进行聚焦放大，即目标放大区域整体所包含的图像全屏显示在显示器中，聚焦放大显示效果如图27中(e)所示。All of the image within the position-adjusted target enlargement area is shown on the display, and the portrait corresponding to the target enlargement area can then be focused on and enlarged, that is, the image contained in the entire target enlargement area is displayed in full screen on the display. The effect of the focused and enlarged display is shown in (e) of FIG. 27.
在一些实施例中，如果摄像头采集人物的人像时，人像位于显示器的中心区域，且人物始终未改变自身位置，此时，显示设备无需控制摄像头调整拍摄角度，以当前拍摄角度持续拍摄人物的人像。累积预设数量帧的指定图像，对人物均未发生位置变化，则在指定图像中人像区域位置占指定图像的比例较小时，将人像区域位置进行人像聚焦放大显示，以将人像区域位置对应的图像全屏显示在显示器中。In some embodiments, when the camera captures the portrait of a person, if the portrait is located in the center area of the display and the person never changes position, the display device does not need to control the camera to adjust the shooting angle and keeps capturing the portrait of the person at the current shooting angle. When a preset number of frames of specified images has been accumulated without any change in the position of the person, and the proportion of the portrait area position within the specified image is small, the portrait area position is displayed with portrait focus and magnification, so that the image corresponding to the portrait area position is displayed in full screen on the display.
但是，如果人像区域位置以聚焦放大形式显示在显示器中时，人物出现位置改变，显示设备需重新确定人像区域位置的区域中心，若人像区域位置的区域中心与指定图像的图像中心存在方位距离，则需控制摄像头调整拍摄角度，以保证人像始终位于指定图像的中心，并显示在显示器的中心区域。However, if the person changes position while the portrait area position is being displayed on the display in focused and enlarged form, the display device needs to re-determine the area center of the portrait area position. If there is an azimuth distance between the area center of the portrait area position and the image center of the specified image, the camera needs to be controlled to adjust the shooting angle, so as to ensure that the portrait is always located at the center of the specified image and displayed in the center area of the display.
由于人物由未改变位置的状态变为改变位置的状态时，显示器中显示的是聚焦放大显示的人像区域位置，因此，为保证基于图像检测识别方法判断摄像头调整拍摄角度的准确性，需将显示器中正处于聚焦放大显示的人像区域位置恢复至原始状态，再进行后续计算摄像头的目标调整角度的步骤。When the person changes from an unchanged position to a changed position, what is shown on the display is the focused and enlarged portrait area position. Therefore, in order to ensure the accuracy of the image-detection-based determination of how the camera should adjust its shooting angle, the portrait area position that is currently displayed on the display in focused and enlarged form needs to be restored to its original state before the subsequent step of calculating the target adjustment angle of the camera is performed.
具体地,控制器在执行计算摄像头的目标调整角度之前,被进一步配置为执行下述步骤:Specifically, before the controller performs the calculation of the target adjustment angle of the camera, it is further configured to perform the following steps:
步骤0241、判断指定图像是否进行人像聚焦放大显示操作。Step 0241: Determine whether the specified image is subjected to a portrait focus zoom display operation.
步骤0242、如果指定图像未进行人像聚焦放大显示操作,则执行计算摄像头的目标调整角度的步骤。Step 0242: If the specified image has not been subjected to the portrait focus and zoom-in display operation, execute the step of calculating the target adjustment angle of the camera.
步骤0243、如果指定图像已进行人像聚焦放大显示操作,则恢复指定图像的显示,以及,执行计算摄像头的目标调整角度的步骤。Step 0243: If the specified image has been displayed by focusing and zooming in on the portrait, restore the display of the specified image, and perform the step of calculating the target adjustment angle of the camera.
如果控制器对指定图像中的人像区域位置进行人像聚焦放大显示操作,会在当前指定图像上生成放大标记。控制器若检测到当前指定图像上存在放大标记,则可判定指定图像进行人像聚焦放大显示操作;若未检测到放大标记,则判定指定图像未进行人像聚焦放大显示操作。If the controller performs the portrait focus magnification display operation on the portrait area position in the specified image, a magnification mark will be generated on the current specified image. If the controller detects that there is a magnifying mark on the currently designated image, it can determine that the designated image is subjected to the portrait focus zoom display operation; if no zoom mark is detected, it is determined that the designated image has not been subjected to the portrait focus zoom display operation.
在控制器判断指定图像未进行人像聚焦放大显示操作时,可直接对指定图像进行图像检测分析,继续执行后续的计算摄像头的目标调整角度的步骤。When the controller determines that the specified image has not been subjected to the portrait focus and magnification display operation, it can directly perform image detection and analysis on the specified image, and continue to perform the subsequent steps of calculating the target adjustment angle of the camera.
在控制器判断指定图像已进行人像聚焦放大显示操作时，由于放大显示的指定图像会对图像检测分析造成准确率影响，因此，需先恢复指定图像至原始状态，取消人像聚焦放大显示操作，此时，显示器中显示的是处于原始状态的指定图像，而后继续执行后续的计算摄像头的目标调整角度的步骤。When the controller determines that the specified image has undergone the portrait focus and magnification display operation, since the enlarged specified image would affect the accuracy of the image detection and analysis, the specified image first needs to be restored to its original state and the portrait focus and magnification display operation cancelled. At this time, the specified image in its original state is shown on the display, after which the subsequent step of calculating the target adjustment angle of the camera is performed.
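A minimal sketch of steps 0241 to 0243, in which the magnification mark is modeled as a simple boolean flag (the class and method names are assumptions of this sketch):

```python
from dataclasses import dataclass


@dataclass
class SpecifiedImageState:
    zoomed_in: bool = False        # set when the magnification mark is present

    def restore_original_display(self):
        # Placeholder: on a real device this would cancel the portrait
        # focus-and-zoom display and show the specified image as captured.
        self.zoomed_in = False


def before_angle_calculation(state: SpecifiedImageState) -> None:
    """Make sure the specified image is un-zoomed before the target angle is computed."""
    if state.zoomed_in:                    # step 0243: zoom was applied
        state.restore_original_display()   # restore the original display first
    # step 0242 / continuation: proceed to calculate the target adjustment angle
```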
由以上技术方案可知，本申请实施例提供的一种显示设备，控制器对摄像头采集的指定图像进行识别处理，得到人像区域位置，计算人像区域位置的区域中心与指定图像的图像中心的方位距离；如果方位距离超过方位设定阈值，则根据方位距离和摄像头的拍摄参数，计算摄像头的目标调整角度；基于摄像头的目标调整角度，调整摄像头的拍摄角度，以使人物的人像位于摄像头采集的指定图像的中心区域。可见，本申请实施例提供的显示设备，通过摄像头图像检测精准识别人物位置信息，自动聚焦定位人像位置，以对摄像头的拍摄角度从水平方向和垂直方向进行微调整，使得人物的人像处于摄像头拍摄图像的中心，从而保证显示器图像人物居中。It can be seen from the above technical solutions that, in the display device provided by the embodiments of the present application, the controller performs recognition processing on the specified image captured by the camera to obtain the portrait area position, and calculates the azimuth distance between the area center of the portrait area position and the image center of the specified image; if the azimuth distance exceeds the azimuth setting threshold, the target adjustment angle of the camera is calculated according to the azimuth distance and the shooting parameters of the camera; and the shooting angle of the camera is adjusted based on the target adjustment angle, so that the portrait of the person is located in the center area of the specified image captured by the camera. It can be seen that the display device provided by the embodiments of the present application accurately recognizes the position information of the person through camera image detection and automatically focuses on and locates the portrait position, so as to finely adjust the shooting angle of the camera in the horizontal and vertical directions, placing the portrait of the person at the center of the image captured by the camera and thus keeping the person centered in the image on the display.
图18中示例性示出了根据一些实施例的摄像头的控制方法的流程图。参见图18,本申请还提供了一种摄像头的控制方法,所述方法包括:FIG. 18 exemplarily shows a flowchart of a control method of a camera according to some embodiments. Referring to FIG. 18 , the present application also provides a method for controlling a camera, the method comprising:
S21、获取所述摄像头的拍摄参数和采集的人物位于摄像头拍摄区域内的指定图像;S21, acquiring the shooting parameters of the camera and a designated image of the collected character located in the camera shooting area;
S22、对所述指定图像进行识别处理,得到所述人物对应的人像区域位置,所述人像区域位置是指包括人物头部图像的区域;S22, performing identification processing on the designated image to obtain a portrait area position corresponding to the person, where the portrait area position refers to an area including a head image of a person;
S23、计算所述人像区域位置的区域中心与所述指定图像的图像中心的方位距离,所述方位距离用于标识水平方向距离和垂直方向距离;S23, calculate the azimuth distance between the area center of the portrait area position and the image center of the designated image, and the azimuth distance is used to identify the horizontal direction distance and the vertical direction distance;
S24、如果所述方位距离超过方位设定阈值,则根据所述方位距离和摄像头的拍摄参数,计算摄像头的目标调整角度;S24, if the azimuth distance exceeds the azimuth setting threshold, calculate the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera;
S25、基于所述摄像头的目标调整角度,调整所述摄像头的拍摄角度,以使所述人物的人像位于摄像头采集的指定图像的中心区域。S25. Adjust the shooting angle of the camera based on the target adjustment angle of the camera, so that the portrait of the person is located in the center area of the designated image captured by the camera.
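A minimal end-to-end sketch of steps S23 to S25 follows; the pixel thresholds and the proportional offset-to-angle mapping are assumptions of this sketch only, since the exact angle formula used by the embodiment is given elsewhere in the specification and is not reproduced here.

```python
def plan_adjustment(region_center, image_center, image_size,
                    hfov_deg, vfov_deg, threshold_px=(50, 50)):
    """One pass of steps S23-S25: azimuth distances -> angles and directions.

    The 50-pixel thresholds and the proportional offset-to-angle mapping
    are illustrative assumptions; the embodiment computes the target
    adjustment angle from the azimuth distance and the shooting parameters.
    """
    x0, y0 = region_center             # area center of the portrait area position
    x1, y1 = image_center              # image center of the specified image
    img_w, img_h = image_size

    d = x0 - x1                        # horizontal azimuth distance
    h = y0 - y1                        # vertical azimuth distance

    pan = tilt = None                  # None -> no adjustment on that axis
    if abs(d) > threshold_px[0]:
        angle = abs(d) / img_w * hfov_deg        # assumed proportional mapping
        pan = (angle, "right" if d < 0 else "left")
    if abs(h) > threshold_px[1]:
        angle = abs(h) / img_h * vfov_deg
        tilt = (angle, "down" if h < 0 else "up")
    return pan, tilt
```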
具体实现中,本申请还提供一种计算机存储介质,其中,该计算机存储介质可存储有程序,该程序执行时可包括本申请提供的摄像头的控制方法的各实施例中的部分或全部步骤。所述的存储介质可为磁碟、光盘、只读存储记忆体(英文:read-only memory,简称:ROM)或随机存储记忆体(英文:random access memory,简称:RAM)等。In a specific implementation, the present application further provides a computer storage medium, wherein the computer storage medium can store a program, and when the program is executed, it can include some or all of the steps in each embodiment of the camera control method provided by the present application. The storage medium may be a magnetic disk, an optical disk, a read-only memory (English: read-only memory, abbreviated as: ROM) or a random access memory (English: random access memory, abbreviated as: RAM) and the like.
本领域的技术人员可以清楚地了解到本申请实施例中的技术可借助软件加必需的通用硬件平台的方式来实现。基于这样的理解,本申请实施例中的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品可以存储在存储介质中,如ROM/RAM、磁碟、光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例或者实施例的某些部分所述的方法。Those skilled in the art can clearly understand that the technology in the embodiments of the present application can be implemented by means of software plus a necessary general hardware platform. Based on this understanding, the technical solutions in the embodiments of the present application can be embodied in the form of software products in essence or in the parts that make contributions to related technologies, and the computer software products can be stored in storage media, such as ROM/RAM, A magnetic disk, an optical disk, etc., includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in various embodiments or some parts of the embodiments of the present application.
最后应说明的是:以上各实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述各实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分或者全部技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, but not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that: The technical solutions described in the foregoing embodiments can still be modified, or some or all of the technical features thereof can be equivalently replaced; and these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the technical solutions of the embodiments of the present application. Scope.
为了方便解释,已经结合具体的实施方式进行了上述说明。但是,上述示例性的讨论不是意图穷尽或者将实施方式限定到上述公开的具体形式。根据上述的教导,可以得到多种修改和变形。上述实施方式的选择和描述是为了更好的解释原理以及实际的应用,从而使得本领域技术人员更好的使用所述实施方式以及适于具体使用考虑的各种不同的变形的实施方式。For the convenience of explanation, the above description has been made in conjunction with specific embodiments. However, the above exemplary discussions are not intended to be exhaustive or to limit implementations to the specific forms disclosed above. Numerous modifications and variations are possible in light of the above teachings. The above embodiments are chosen and described to better explain the principles and practical applications, so as to enable those skilled in the art to better utilize the described embodiments and various modified embodiments suitable for specific use considerations.

Claims (17)

  1. 一种显示设备,其特征在于,包括:A display device, comprising:
    摄像头,所述摄像头被配置为采集人像以及实现在预设角度范围内的转动;a camera, the camera is configured to capture a portrait and realize rotation within a preset angle range;
    与所述摄像头连接的控制器,所述控制器被配置为:A controller connected to the camera, the controller being configured to:
    获取所述摄像头的拍摄参数和采集的人物位于摄像头拍摄区域内的指定图像;Acquiring the shooting parameters of the camera and the designated images of the collected characters located in the shooting area of the camera;
    对所述指定图像进行识别处理,得到所述人物对应的人像区域位置,所述人像区域位置是指包括人物头部图像的区域;Performing identification processing on the designated image to obtain a portrait area position corresponding to the person, where the portrait area position refers to an area including the head image of the person;
    计算所述人像区域位置的区域中心与所述指定图像的图像中心的方位距离,所述方位距离用于标识水平方向距离和垂直方向距离;Calculate the azimuth distance between the area center of the portrait area position and the image center of the designated image, and the azimuth distance is used to identify the horizontal direction distance and the vertical direction distance;
    如果所述方位距离超过方位设定阈值,则根据所述方位距离和摄像头的拍摄参数,计算摄像头的目标调整角度;If the azimuth distance exceeds the azimuth setting threshold, calculate the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera;
    基于所述摄像头的目标调整角度,调整所述摄像头的拍摄角度,以使所述人物的人像位于摄像头采集的指定图像的中心区域。Based on the target adjustment angle of the camera, the shooting angle of the camera is adjusted so that the portrait of the person is located in the center area of the designated image captured by the camera.
  2. 根据权利要求1所述的显示设备,其特征在于,所述控制器在执行所述对指定图像进行识别处理,得到所述人物对应的人像区域位置,被进一步配置为:The display device according to claim 1, wherein the controller is further configured to perform the recognizing process on the designated image to obtain the position of the portrait region corresponding to the person:
    对所述指定图像进行识别处理,得到至少一个人物对应的头部区域位置信息;Performing identification processing on the designated image to obtain the position information of the head region corresponding to at least one character;
    计算至少一个人物对应的头部区域位置信息的总区域信息,将所述总区域信息对应的位置作为人物对应的人像区域位置,所述人像区域位置是指包括至少一个人物头部图像的总区域。Calculate the total area information of the head area position information corresponding to at least one character, take the position corresponding to the total area information as the portrait area position corresponding to the character, and the portrait area position refers to the total area including the head image of at least one character .
  3. 根据权利要求1所述的显示设备，其特征在于，所述方位距离包括水平方向距离和垂直方向距离；以及，所述控制器在执行所述计算人像区域位置的区域中心与所述指定图像的图像中心的方位距离，被进一步配置为：The display device according to claim 1, wherein the azimuth distance comprises a horizontal direction distance and a vertical direction distance; and wherein, when executing the calculating of the azimuth distance between the area center of the portrait area position and the image center of the specified image, the controller is further configured to:
    获取所述人像区域位置的坐标信息和所述指定图像的图像中心坐标信息,所述图像中心坐标信息包括图像水平坐标和图像垂直坐标;Obtain the coordinate information of the position of the portrait area and the image center coordinate information of the designated image, and the image center coordinate information includes the image horizontal coordinate and the image vertical coordinate;
    基于所述人像区域位置的坐标信息,计算所述人像区域位置的区域中心坐标,所述区域中心坐标包括区域中心水平坐标和区域中心垂直坐标;Based on the coordinate information of the position of the portrait area, calculate the area center coordinates of the portrait area position, and the area center coordinates include the horizontal coordinates of the area center and the vertical coordinates of the area center;
    计算所述人像区域位置的区域中心水平坐标和指定图像的图像水平坐标的差值,得到所述人像区域位置的区域中心与所述指定图像的图像中心的水平方向距离;Calculate the difference between the horizontal coordinate of the area center of the portrait area position and the image horizontal coordinate of the designated image, and obtain the horizontal distance between the area center of the portrait area position and the image center of the designated image;
    计算所述人像区域位置的区域中心垂直坐标和指定图像的图像垂直坐标的差值,得到所述人像区域位置的区域中心与所述指定图像的图像中心的垂直方向距离。Calculate the difference between the vertical coordinate of the area center of the portrait area position and the image vertical coordinate of the designated image, and obtain the vertical distance between the area center of the portrait area position and the image center of the designated image.
  4. 根据权利要求1所述的显示设备,其特征在于,所述方位设定阈值包括水平设定阈值,所述方位距离包括水平方向距离,所述摄像头的拍摄参数包括摄像头水平视角角度和图像水平宽度;The display device according to claim 1, wherein the azimuth setting threshold comprises a horizontal setting threshold, the azimuth distance comprises a horizontal distance, and the shooting parameters of the camera comprise a horizontal viewing angle of the camera and a horizontal width of the image ;
    所述控制器在执行所述如果方位距离超过方位设定阈值,则根据所述方位距离和摄像头的拍摄参数,计算摄像头的目标调整角度,被进一步配置为:The controller calculates the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera, and is further configured as:
    如果所述水平方向距离大于所述水平设定阈值,则根据所述水平方向距离、摄像头水平视角角度和图像水平宽度,计算摄像头的目标水平调整角度。If the horizontal distance is greater than the horizontal set threshold, calculate the target horizontal adjustment angle of the camera according to the horizontal distance, the horizontal viewing angle of the camera, and the horizontal width of the image.
  5. 根据权利要求1所述的显示设备,其特征在于,所述方位设定阈值包括垂直设定阈值,所述方位距离包括垂直方向距离,所述摄像头的拍摄参数包括摄像头垂直视 角角度和图像垂直高度;The display device according to claim 1, wherein the azimuth setting threshold comprises a vertical setting threshold, the azimuth distance comprises a vertical distance, and the shooting parameters of the camera comprise a vertical viewing angle of the camera and a vertical height of the image ;
    所述控制器在执行所述如果方位距离超过方位设定阈值,则根据所述方位距离和摄像头的拍摄参数,计算摄像头的目标调整角度,被进一步配置为:The controller calculates the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera, and is further configured as:
    如果所述垂直方向距离大于所述垂直设定阈值,则根据所述垂直方向距离、摄像头垂直视角角度和图像垂直高度,计算摄像头的目标垂直调整角度。If the vertical distance is greater than the vertical set threshold, calculate the target vertical adjustment angle of the camera according to the vertical distance, the vertical viewing angle of the camera, and the vertical height of the image.
  6. 根据权利要求1所述的显示设备,其特征在于,所述控制器被进一步配置为:The display device of claim 1, wherein the controller is further configured to:
    如果所述方位距离未超过方位设定阈值,则获取预设数量帧的指定图像;If the azimuth distance does not exceed the azimuth setting threshold, obtain a specified image of a preset number of frames;
    如果所述预设数量帧的指定图像中人像区域位置不变,则识别所述指定图像中的人像区域位置的尺寸;If the position of the portrait area in the designated image of the preset number of frames does not change, then identifying the size of the position of the portrait area in the designated image;
    如果所述人像区域位置的尺寸小于或等于所述指定图像的预设比例,则将所述指定图像中的人像区域位置在显示器中进行人像聚焦放大显示。If the size of the position of the portrait area is smaller than or equal to the preset ratio of the designated image, the position of the portrait area in the designated image is displayed on the display in a focus on portrait zoom.
  7. 根据权利要求6所述的显示设备，其特征在于，所述控制器在执行所述如果人像区域位置的尺寸小于或等于所述指定图像的预设比例，则将所述指定图像中的人像区域位置在显示器中进行人像聚焦放大显示，被进一步配置为：The display device according to claim 6, wherein, when executing the step of displaying the portrait area position in the specified image on the display with portrait focus and magnification if the size of the portrait area position is smaller than or equal to the preset ratio of the specified image, the controller is further configured to:
    如果所述人像区域位置的尺寸小于或等于所述指定图像的预设比例,则计算所述显示器的宽高比值和人像区域位置的宽高比值;If the size of the portrait area position is less than or equal to the preset ratio of the specified image, calculate the aspect ratio value of the display and the aspect ratio value of the portrait area position;
如果所述显示器的宽高比值和人像区域位置的宽高比值不一致时，则调整所述人像区域位置的宽高比值，所述人像区域位置的调整后的宽高比值与所述显示器的宽高比值相同；if the aspect ratio value of the display and the aspect ratio value of the portrait area position are inconsistent, adjusting the aspect ratio value of the portrait area position, wherein the adjusted aspect ratio value of the portrait area position is the same as the aspect ratio value of the display;
    按照所述宽高比值调整后的人像区域位置,确定人像区域位置的目标放大区域;Determine the target enlarged area of the position of the portrait area according to the position of the portrait area adjusted by the aspect ratio;
    将所述目标放大区域对应的人像进行聚焦放大,全屏显示在所述显示器中。Focusing and enlarging the portrait corresponding to the target enlargement area is performed on the display in full screen.
  8. 根据权利要求7所述的显示设备，其特征在于，所述控制器在执行所述如果显示器的宽高比值和人像区域位置的宽高比值不一致时，则调整所述人像区域位置的宽高比值，被进一步配置为：The display device according to claim 7, wherein, when executing the adjusting of the aspect ratio value of the portrait area position if the aspect ratio value of the display and the aspect ratio value of the portrait area position are inconsistent, the controller is further configured to:
如果所述人像区域位置的宽高比值大于所述显示器的宽高比值，则调整所述人像区域位置的高度值，所述人像区域位置的原宽度值与调整后的高度值的宽高比值与所述显示器的宽高比值相同；if the aspect ratio value of the portrait area position is greater than the aspect ratio value of the display, adjusting the height value of the portrait area position, wherein the aspect ratio value of the original width value of the portrait area position to the adjusted height value is the same as the aspect ratio value of the display;
如果所述人像区域位置的宽高比值小于所述显示器的宽高比值，则调整所述人像区域位置的宽度值，所述人像区域位置的调整后的宽度值与原高度值的宽高比值与所述显示器的宽高比值相同。if the aspect ratio value of the portrait area position is smaller than the aspect ratio value of the display, adjusting the width value of the portrait area position, wherein the aspect ratio value of the adjusted width value of the portrait area position to the original height value is the same as the aspect ratio value of the display.
  9. 根据权利要求7所述的显示设备，其特征在于，所述控制器在执行所述将目标放大区域对应的人像进行聚焦放大，全屏显示在所述显示器中，被进一步配置为：The display device according to claim 7, wherein, when executing the focusing on and enlarging of the portrait corresponding to the target enlargement area and displaying it in full screen on the display, the controller is further configured to:
    获取所述目标放大区域的中心点坐标;obtaining the coordinates of the center point of the target zoom area;
    计算所述中心点坐标与所述目标放大区域的任一条边界的第一距离,以及,所述中心点坐标与所述显示器的任一条边界的第二距离,所述目标放大区域的任一条边界与所述显示器的任一条边界位置相对应;Calculate the first distance between the coordinates of the center point and any border of the target zoom area, and the second distance between the center point coordinates and any border of the display, and any border of the target zoom area corresponding to any border position of the display;
    如果所述第二距离与所述第一距离的距离差小于零,则按照所述距离差调整所述目标放大区域的位置;If the distance difference between the second distance and the first distance is less than zero, adjusting the position of the target enlarged area according to the distance difference;
    将位置调整后的目标放大区域对应的人像进行聚焦放大,全屏显示在所述显示器中。Focusing and enlarging the portrait corresponding to the target enlargement area after the position adjustment is performed, and displaying it on the display in full screen.
  10. 根据权利要求1所述的显示设备,其特征在于,所述控制器在执行所述计算摄像头的目标调整角度之前,被进一步配置为:The display device according to claim 1, wherein before the controller performs the calculation of the target adjustment angle of the camera, it is further configured to:
    判断所述指定图像是否进行人像聚焦放大显示操作;judging whether the specified image is subjected to a portrait focus zoom display operation;
    如果所述指定图像未进行人像聚焦放大显示操作,则执行所述计算摄像头的目标调整角度的步骤;If the designated image has not been subjected to a portrait focus zoom-in display operation, the step of calculating the target adjustment angle of the camera is performed;
    如果所述指定图像已进行人像聚焦放大显示操作,则恢复所述指定图像的显示,以及,执行所述计算摄像头的目标调整角度的步骤。If the designated image has been subjected to an operation of focusing and zooming in on the portrait, the display of the designated image is resumed, and the step of calculating the target adjustment angle of the camera is performed.
  11. 根据权利要求1所述的显示设备,其特征在于,所述控制器在执行所述基于摄像头的目标调整角度,调整所述摄像头的拍摄角度,被进一步配置为:The display device according to claim 1, wherein, when the controller performs the camera-based target angle adjustment to adjust the shooting angle of the camera, the controller is further configured to:
    根据所述摄像头的目标调整角度,确定摄像头的目标转动速度和目标调整方向;Determine the target rotation speed and target adjustment direction of the camera according to the target adjustment angle of the camera;
    按照所述目标调整角度、目标调整方向和目标转动速度,调整所述摄像头的拍摄角度。The shooting angle of the camera is adjusted according to the target adjustment angle, the target adjustment direction and the target rotation speed.
  12. 根据权利要求11所述的显示设备，其特征在于，所述控制器在执行所述根据摄像头的目标调整角度，确定摄像头的目标转动速度，被进一步配置为：The display device according to claim 11, wherein, when executing the determining of the target rotation speed of the camera according to the target adjustment angle of the camera, the controller is further configured to:
    如果所述摄像头的目标调整角度大于或等于最大转速逻辑值,则将所述最大转速逻辑值作为摄像头的目标转动速度;If the target adjustment angle of the camera is greater than or equal to the maximum rotational speed logic value, the maximum rotational speed logic value is used as the target rotational speed of the camera;
    如果所述摄像头的目标调整角度小于或等于最小转速逻辑值,则将所述最小转速逻辑值作为摄像头的目标转动速度;If the target adjustment angle of the camera is less than or equal to the minimum rotational speed logic value, the minimum rotational speed logic value is used as the target rotational speed of the camera;
    如果所述摄像头的目标调整角度位于所述最大转速逻辑值和最小转速逻辑值之间,则将所述目标调整角度的数值作为摄像头的目标转动速度。If the target adjustment angle of the camera is located between the maximum rotation speed logic value and the minimum rotation speed logic value, the value of the target adjustment angle is used as the target rotation speed of the camera.
  13. 根据权利要求1所述的显示设备,其特征在于,还包括声音采集器,所述声音采集器被配置为采集人物声源信息,所述人物声源信息是指人物通过语音与显示设备交互时产生的声音信息;The display device according to claim 1, further comprising a sound collector, wherein the sound collector is configured to collect sound source information of a person, and the sound source information of the person refers to when the person interacts with the display device through voice generated sound information;
所述控制器在执行所述获取摄像头的拍摄参数和采集的人物位于摄像头拍摄区域内的指定图像，被进一步配置为：wherein, when executing the acquiring of the shooting parameters of the camera and the captured specified image of the person located in the shooting area of the camera, the controller is further configured to:
    获取所述声音采集器采集的人物声源信息和所述摄像头的当前拍摄角度;Obtain the character sound source information collected by the sound collector and the current shooting angle of the camera;
    对所述人物声源信息进行声源识别,确定声源角度信息,所述声源角度信息用于表征人物在语音时所处位置的方位角度;performing sound source identification on the character sound source information, and determining sound source angle information, where the sound source angle information is used to represent the azimuth angle of the character's position when speaking;
    基于所述摄像头的当前拍摄角度和声源角度信息,确定摄像头的目标转动方向和目标转动角度;Determine the target rotation direction and target rotation angle of the camera based on the current shooting angle and sound source angle information of the camera;
    按照所述目标转动方向和目标转动角度,调整所述摄像头的拍摄角度,所述调整拍摄角度后的摄像头的拍摄区域正对人物语音时的所处位置;According to the target rotation direction and the target rotation angle, adjust the shooting angle of the camera, and the position where the shooting area of the camera after the adjustment of the shooting angle is facing the voice of the character;
    获取调整拍摄角度后的摄像头的拍摄参数和采集的人物位于摄像头拍摄区域内的指定图像。Obtain the shooting parameters of the camera after adjusting the shooting angle and the designated image of the captured person located in the shooting area of the camera.
  14. 根据权利要求13所述的显示设备，其特征在于，所述控制器在执行所述基于摄像头的当前拍摄角度和声源角度信息，确定摄像头的目标转动方向和目标转动角度，被进一步配置为：The display device according to claim 13, wherein, when executing the determining of the target rotation direction and the target rotation angle of the camera based on the current shooting angle of the camera and the sound source angle information, the controller is further configured to:
    将所述声源角度信息转换为摄像头的坐标角度;Convert the sound source angle information into the coordinate angle of the camera;
    计算所述摄像头的坐标角度和摄像头的当前拍摄角度的角度差值,将所述角度差值作为所述摄像头的目标转动角度;Calculate the angle difference between the coordinate angle of the camera and the current shooting angle of the camera, and use the angle difference as the target rotation angle of the camera;
    根据所述角度差值,确定摄像头的目标转动方向。According to the angle difference, the target rotation direction of the camera is determined.
  15. 根据权利要求14所述的显示设备，其特征在于，所述控制器在执行所述将声源角度信息转换为摄像头的坐标角度，被进一步配置为：The display device according to claim 14, wherein, when executing the converting of the sound source angle information into the coordinate angle of the camera, the controller is further configured to:
    获取所述人物在语音时的声源角度范围和摄像头转动时的预设角度范围;Obtain the sound source angle range of the character when speaking and the preset angle range when the camera is rotated;
    计算所述声源角度范围与所述预设角度范围之间的角度差值,将所述角度差值的半值作为转换角度;Calculate the angle difference between the sound source angle range and the preset angle range, and use the half value of the angle difference as the conversion angle;
    计算所述声源角度信息对应的角度与所述转换角度的角度差,将所述角度差作为摄像头的坐标角度。Calculate the angle difference between the angle corresponding to the sound source angle information and the conversion angle, and use the angle difference as the coordinate angle of the camera.
  16. 根据权利要求14所述的显示设备，其特征在于，所述控制器在执行所述根据角度差值，确定摄像头的目标转动方向，被进一步配置为：The display device according to claim 14, wherein, when executing the determining of the target rotation direction of the camera according to the angle difference, the controller is further configured to:
    如果所述角度差值为正值,则确定摄像头的目标转动方向为向右转动;If the angle difference is a positive value, it is determined that the target rotation direction of the camera is to rotate to the right;
    如果所述角度差值为负值,则确定摄像头的目标转动方向为向左转动。If the angle difference is a negative value, it is determined that the target rotation direction of the camera is to rotate to the left.
  17. 一种摄像头的控制方法,其特征在于,所述方法包括:A method for controlling a camera, wherein the method comprises:
    获取所述摄像头的拍摄参数和采集的人物位于摄像头拍摄区域内的指定图像;Acquiring the shooting parameters of the camera and the designated images of the collected characters located in the shooting area of the camera;
    对所述指定图像进行识别处理,得到所述人物对应的人像区域位置,所述人像区域位置是指包括人物头部图像的区域;Performing identification processing on the designated image to obtain the position of the portrait region corresponding to the person, where the position of the portrait region refers to the region including the head image of the person;
    计算所述人像区域位置的区域中心与所述指定图像的图像中心的方位距离,所述方位距离用于标识水平方向距离和垂直方向距离;Calculate the azimuth distance between the area center of the portrait area position and the image center of the designated image, and the azimuth distance is used to identify the horizontal direction distance and the vertical direction distance;
    如果所述方位距离超过方位设定阈值,则根据所述方位距离和摄像头的拍摄参数,计算摄像头的目标调整角度;If the azimuth distance exceeds the azimuth setting threshold, calculate the target adjustment angle of the camera according to the azimuth distance and the shooting parameters of the camera;
    基于所述摄像头的目标调整角度,调整所述摄像头的拍摄角度,以使所述人物的人像位于摄像头采集的指定图像的中心区域。Based on the target adjustment angle of the camera, the shooting angle of the camera is adjusted so that the portrait of the person is located in the center area of the designated image captured by the camera.
PCT/CN2021/093589 2020-07-01 2021-05-13 Camera control method and display device WO2022001407A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010628749.6 2020-07-01
CN202010628749.6A CN111669508A (en) 2020-07-01 2020-07-01 Camera control method and display device

Publications (1)

Publication Number Publication Date
WO2022001407A1 true WO2022001407A1 (en) 2022-01-06

Family

ID=72391139

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/093589 WO2022001407A1 (en) 2020-07-01 2021-05-13 Camera control method and display device

Country Status (2)

Country Link
CN (1) CN111669508A (en)
WO (1) WO2022001407A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022661A (en) * 2022-06-02 2022-09-06 壹加艺术(武汉)文化有限公司 Video live broadcast environment monitoring, analyzing, regulating and controlling method and device and computer storage medium
CN116866720A (en) * 2023-09-04 2023-10-10 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111669508A (en) * 2020-07-01 2020-09-15 海信视像科技股份有限公司 Camera control method and display device
CN112333391A (en) * 2020-11-03 2021-02-05 深圳创维-Rgb电子有限公司 Method and device for automatically tracking portrait based on sound, intelligent terminal and medium
CN112700568B (en) * 2020-12-28 2023-04-18 科大讯飞股份有限公司 Identity authentication method, equipment and computer readable storage medium
CN114845037B (en) * 2021-02-01 2024-02-13 浙江宇视科技有限公司 PTZ camera calibration method, device, electronic equipment and storage medium
CN113301367A (en) * 2021-03-23 2021-08-24 阿里巴巴新加坡控股有限公司 Audio and video processing method, device and system and storage medium
CN113099308B (en) * 2021-03-31 2023-10-27 聚好看科技股份有限公司 Content display method, display equipment and image collector
CN113141518B (en) * 2021-04-20 2022-09-06 北京安博盛赢教育科技有限责任公司 Control method and control device for video frame images in live classroom
CN113382222B (en) * 2021-05-27 2023-03-31 深圳市瑞立视多媒体科技有限公司 Display method based on holographic sand table in user moving process
CN113573021A (en) * 2021-07-26 2021-10-29 嘉应学院 Method for monitoring surrounding conditions of orchard transport vehicle
CN114040109A (en) * 2021-11-23 2022-02-11 慧之安信息技术股份有限公司 Intelligent instrument transformation equipment based on image recognition
CN114815257A (en) * 2022-04-25 2022-07-29 歌尔股份有限公司 XR glasses and camera adjusting method, system, equipment and medium
CN115883970B (en) * 2022-12-02 2024-04-05 浙江省广播电视工程公司 Unmanned management system of broadcast television shooting and recording equipment
CN116980744B (en) * 2023-09-25 2024-01-30 深圳市美高电子设备有限公司 Feature-based camera tracking method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040100563A1 (en) * 2002-11-27 2004-05-27 Sezai Sablak Video tracking system and method
US20110019066A1 (en) * 2009-07-22 2011-01-27 Yoshijiro Takano Af frame auto-tracking system
CN105049709A (en) * 2015-06-30 2015-11-11 广东欧珀移动通信有限公司 Large-view angle camera control method and user terminal
CN109977770A (en) * 2019-02-21 2019-07-05 安克创新科技股份有限公司 A kind of auto-tracking shooting method, apparatus, system and storage medium
CN111669508A (en) * 2020-07-01 2020-09-15 海信视像科技股份有限公司 Camera control method and display device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101877764B (en) * 2009-04-29 2012-05-30 鸿富锦精密工业(深圳)有限公司 Camera system and method for carrying out assisted composition by utilizing same
CN110086992A (en) * 2019-04-29 2019-08-02 努比亚技术有限公司 Filming control method, mobile terminal and the computer storage medium of mobile terminal

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040100563A1 (en) * 2002-11-27 2004-05-27 Sezai Sablak Video tracking system and method
US20110019066A1 (en) * 2009-07-22 2011-01-27 Yoshijiro Takano Af frame auto-tracking system
CN105049709A (en) * 2015-06-30 2015-11-11 广东欧珀移动通信有限公司 Large-view angle camera control method and user terminal
CN109977770A (en) * 2019-02-21 2019-07-05 安克创新科技股份有限公司 A kind of auto-tracking shooting method, apparatus, system and storage medium
CN111669508A (en) * 2020-07-01 2020-09-15 海信视像科技股份有限公司 Camera control method and display device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022661A (en) * 2022-06-02 2022-09-06 壹加艺术(武汉)文化有限公司 Video live broadcast environment monitoring, analyzing, regulating and controlling method and device and computer storage medium
CN116866720A (en) * 2023-09-04 2023-10-10 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization
CN116866720B (en) * 2023-09-04 2023-11-28 国网山东省电力公司东营供电公司 Camera angle self-adaptive regulation and control method, system and terminal based on sound source localization

Also Published As

Publication number Publication date
CN111669508A (en) 2020-09-15

Similar Documents

Publication Publication Date Title
WO2022001407A1 (en) Camera control method and display device
WO2022001406A1 (en) Display method and display device
AU2013276984B2 (en) Display apparatus and method for video calling thereof
US7990421B2 (en) Arrangement and method relating to an image recording device
US11301051B2 (en) Using natural movements of a hand-held device to manipulate digital content
WO2014034556A1 (en) Image processing apparatus and image display apparatus
CN112866772B (en) Display device and sound image character positioning and tracking method
JP2017525024A (en) Architecture for managing input data
WO2022037535A1 (en) Display device and camera tracking method
CN112672062B (en) Display device and portrait positioning method
CN111708383A (en) Method for adjusting shooting angle of camera and display device
WO2022100262A1 (en) Display device, human body posture detection method, and application
KR20220005087A (en) Filming method and terminal
CN117918057A (en) Display device and device control method
CN113473024A (en) Display device, holder camera and camera control method
US10764535B1 (en) Facial tracking during video calls using remote control input
US11087435B1 (en) Adaptive dewarping of wide angle video frames
WO2022037215A1 (en) Camera, display device and camera control method
WO2021218473A1 (en) Display method and display device
US11232796B2 (en) Voice activity detection using audio and visual analysis
WO2020238913A1 (en) Video recording method and terminal
WO2022037229A1 (en) Human image positioning methods and display devices
US11720245B1 (en) User interface information enhancement based on user distance
CN113587812B (en) Display equipment, measuring method and device
CN114647983A (en) Display device and distance detection method based on portrait

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21834311

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21834311

Country of ref document: EP

Kind code of ref document: A1