
Intelligent sound box and display method thereof

Info

Publication number
CN116156386A
Authority
CN
China
Prior art keywords
face
picture information
sound box
intelligent sound
orientation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111383428.5A
Other languages
Chinese (zh)
Inventor
杜兆臣
高雪松
田羽慧
孟卫明
王月岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Group Holding Co Ltd
Original Assignee
Hisense Group Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Group Holding Co Ltd filed Critical Hisense Group Holding Co Ltd
Priority to CN202111383428.5A priority Critical patent/CN116156386A/en
Publication of CN116156386A publication Critical patent/CN116156386A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics

Abstract

An embodiment of the invention provides an intelligent sound box and a display method thereof. A display screen inside the intelligent sound box displays the content being played. An optical component projects the display content of the display screen outside the intelligent sound box as a virtual screen. An image collector collects picture information of the external environment. A processor determines the orientation of the user's face from the picture information and generates a rotation instruction according to changes in that orientation. A rotating component rotates under control of the rotation instruction so that the virtual screen faces the user. Because the face orientation is determined from the picture information and the rotation instruction is generated from its changes, the virtual screen turns whenever the face orientation changes and thus always faces the user.

Description

Intelligent sound box and display method thereof
Technical Field
The invention relates to electronic devices, and in particular to an intelligent sound box and a display method for the intelligent sound box.
Background
With the development of science and technology, people take more and more interest in intelligent devices, and sound boxes have evolved from ordinary speakers into intelligent ones. An existing intelligent sound box can not only play music but also, when equipped with a display screen, play video; it can connect to a network, and users can even control it by voice to perform some simple operations. However, the display screen of such an intelligent sound box is integrated with the box, so the display screen cannot rotate to follow the orientation of the user's face.
In summary, how to enable the display screen of an intelligent sound box to rotate with the orientation of the face is a technical problem that currently needs to be solved.
Disclosure of Invention
The embodiments of the invention provide an intelligent sound box and a display method for the intelligent sound box, which are used to solve the problem of how to rotate the display screen of an intelligent sound box according to the orientation of a human face.
In a first aspect, an intelligent sound box provided by an embodiment of the invention includes: a display screen located inside the intelligent sound box and configured to display the content played by the intelligent sound box; an optical component configured to project the display content of the display screen outside the intelligent sound box to form a virtual screen; an image collector configured to collect picture information of the external environment; a processor located inside the intelligent sound box and configured to determine the orientation of the user's face from the picture information and to generate a rotation instruction according to changes in the face orientation; and a rotating component located inside the intelligent sound box and configured to rotate under control of the rotation instruction so that the virtual screen faces the user.
In the above technical solution, on the one hand, the display screen is arranged inside the sound box, which, compared with an external display screen, effectively reduces the volume of the sound box; on the other hand, the processor determines the orientation of the user's face from the picture information and generates a rotation instruction from changes in that orientation to control the rotating component, which in turn rotates the display screen, so that when the face orientation changes, the orientation of the virtual screen changes with it and the virtual screen always faces the user.
Optionally, the processor is specifically configured to generate a horizontal rotation instruction and/or a vertical rotation instruction according to a change of the face direction. The rotating component is used for rotating under the control of the horizontal rotating instruction and/or the vertical rotating instruction so as to enable the virtual screen to face the user.
In this technical solution, a rotation instruction is generated according to the change of the face orientation and the display screen is controlled to rotate horizontally and/or vertically, achieving multi-angle rotation, so that the virtual screen follows the change in face orientation, i.e. it always faces the face.
Optionally, the rotating component includes a carrier and a position controller. The bearing piece is used for bearing the display screen and the image collector. The position controller is used for moving under the control of the rotation instruction.
In this technical solution, the display screen and the image collector are placed on the carrier, and the position controller drives the carrier to rotate under control of the rotation instruction, so that the display screen and the image collector rotate as the face turns and always point in the same direction, which helps generate more accurate subsequent rotation instructions. Meanwhile, only the display screen and image collector inside the intelligent sound box are rotated, rather than the whole box, so from the standpoint of rotation control the control is simpler and less effortful.
Optionally, the optical component is horizontally arranged on the top of the intelligent sound box. The included angle between the display screen and the optical component is 30-45 degrees. The rotation angle of the rotating component ranges from 0 to 175 degrees horizontally and from 0 to 15 degrees vertically.
In this technical solution, controlling the included angle between the display screen and the optical component, together with the angles through which the rotating component rotates, gives the virtual screen a good imaging effect.
Optionally, the processor is further configured to control the rotation of the image collector according to the rotation instruction; the orientation of the image collector is consistent with the orientation of the virtual screen.
In this technical solution, since the virtual screen always faces the face and the orientation of the image collector is consistent with that of the virtual screen, the image collector can always collect picture information containing the face, which helps generate more accurate subsequent rotation instructions.
In a second aspect, an embodiment of the invention further provides a display method for an intelligent sound box, the method comprising: performing face recognition and determining the face orientation based on picture information acquired by the image collector of the intelligent sound box; determining the change in face orientation from a plurality of pieces of collected picture information; and determining, according to the change in face orientation, a rotation instruction for controlling the rotating component inside the intelligent sound box, the rotation instruction being used to make the virtual screen, projected by the optical component from the display screen inside the intelligent sound box, face the user.
Optionally, performing face recognition and determining the face orientation based on the picture information acquired by the image collector includes: performing face recognition based on the picture information acquired by the image collector to obtain face information; and determining a hue-saturation-value (HSV) color probability model and the centroid coordinates of the face according to the face information, the centroid coordinates being used to indicate the face orientation.
In this technical solution, the orientation of the face can be accurately determined from the HSV color probability model and the centroid coordinates of the face.
Optionally, determining the change in face orientation from the collected pieces of picture information includes: determining the change in face orientation based on the first centroid coordinates of the first face identified in the first picture information and the second centroid coordinates of the second face identified in the second picture information, where the first picture information and the second picture information are picture information acquired by the image collector at different moments within a set time length.
In this technical solution, when the orientation of the face changes, the centroid coordinates of the face change with it; therefore, the more accurately the centroid coordinates of the changed face are obtained, the more accurate the determined post-change face orientation, which supplies the content of the rotation instruction by which the rotating component subsequently turns the display screen.
Optionally, the time at which the second picture information is collected is after the time at which the first picture information is collected, and the second face is obtained as follows: moving the detection window in which the first face is located; determining whether the centroid coordinates in the moved detection window are the same as the first centroid coordinates, and continuing to move if not; and taking the picture in the detection window whose centroid coordinates are the same as the first centroid coordinates as the second face.
In this technical solution, the HSV color probability model can only give an approximate position of the changed face; the specific position of the changed face is obtained accurately by checking whether the centroid coordinates are the same.
Optionally, the first picture information is the first collected picture information containing a face, and the first face is obtained as follows: performing face recognition on the picture information inside a set detection window within the first picture information; and, if no face is recognized, expanding the detection window or moving it in a set manner until a face is recognized or the whole of the first picture information has been detected.
In this technical solution, the initial position of the face can be found by moving the detection window or changing its size, providing a reference value for locating the face after its orientation subsequently changes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it will be apparent that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a possible application scenario provided in an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an intelligent sound box according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the positions of the components of an intelligent sound box according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a position controller according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a processor according to an embodiment of the present invention;
fig. 6a is a schematic diagram of a virtual screen in an intelligent sound box according to an embodiment of the present invention;
FIG. 6b is a schematic diagram illustrating a position of a virtual screen rotation according to an embodiment of the present invention;
fig. 7 is a flow chart of a display method of an intelligent sound box according to an embodiment of the present invention;
fig. 8 is a schematic flow chart of a method for training face recognition according to an embodiment of the present invention;
fig. 9 is a flowchart of a method for determining a face orientation according to an embodiment of the present invention;
fig. 10 is a schematic flow chart of determining the virtual screen orientation according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a schematic diagram of a possible application scenario provided by an embodiment of the invention. The application scenario may include at least one intelligent sound box and at least one user; Fig. 1 takes the intelligent sound box 101 and the user 102 as an example. In one case, when the user 102 plays music with the intelligent sound box 101, the user 102 can use it from anywhere the sound can be heard, and there is no strict positional constraint between the user 102 and the intelligent sound box 101.
In another case, when the user 102 watches video with the intelligent sound box 101, to ensure viewing comfort the user 102 adjusts his or her position according to the position of the intelligent sound box 101, typically choosing a position facing it so that the whole screen can be seen. When the user 102 wants to move and continue watching, the screen of the intelligent sound box 101 is integrated with the box and cannot be rotated, so the user 102 would have to move the intelligent sound box 101 to adjust its position; but an intelligent sound box is generally heavy and inconvenient to move. As a result, the user 102 can only watch from a fixed position, and it is inconvenient to change the position of the intelligent sound box as the position of the user 102 changes.
In view of this, the embodiment of the invention provides an intelligent sound box. The intelligent sound box can rotate the display screen of the intelligent sound box according to the direction of the face.
For convenience of explanation, the optical component is exemplified below by a negative refractive lens. In practice the optical component may be a negative refractive lens, a plastic lens, or another optical component, which is not limited herein.
Fig. 2 is a schematic structural diagram of an intelligent sound box according to an embodiment of the invention. The intelligent sound box 200 includes an image collector 201, a negative refractive lens 202, a processor 203 located inside the intelligent sound box 200, a display screen 204, and a rotating component 205. The negative refractive lens 202 projects the display content of the display screen 204 outside the intelligent sound box 200 as a virtual screen. The image collector 201 collects picture information of the external environment. The processor 203 determines the orientation of the user's face from the picture information and generates a rotation instruction according to changes in the face orientation. The rotating component 205 rotates under control of the rotation instruction so that the virtual screen faces the user.
With the above components, when the intelligent sound box rotates the display screen, the rotation instruction is determined from the collected picture information and the display screen is rotated accordingly, so that the virtual screen corresponding to the display screen faces the user; when the user's orientation changes, the virtual screen changes with it and always stays oriented toward the user.
Fig. 3 is a schematic diagram of the positions of the components of an intelligent sound box according to an embodiment of the invention. The image collector 303 in the intelligent sound box 300 collects picture information of the external environment in real time; the collected picture information may or may not contain face information. The image collector 303 can be placed at various positions in the intelligent sound box, of which Fig. 3 shows only one example.
The image collector 303 then sends the collected picture information to the processor 307 in real time. The processor 307 generates a horizontal rotation instruction and/or a vertical rotation instruction according to the change of the face orientation. Specifically, the processor 307 finds face information, including the face orientation, in the picture information acquired by the image collector, then generates a rotation instruction according to the change in face orientation and sends it to the rotating component 308.
The rotating component 308 then rotates under control of the horizontal and/or vertical rotation instruction to orient the virtual screen toward the user. Specifically, the rotating component 308 comprises a carrier 305 and a position controller 306: the carrier 305 carries the display screen 304 and the image collector 303, and the position controller 306 moves under control of the rotation instruction. Because the carrier 305 carries the display screen 304 and the image collector 303 together, their orientations stay synchronized, which makes it easier to generate accurate rotation instructions. Of course, if the display screen 304 and the image collector 303 are to be oriented differently, another possible implementation is for the carrier 305 to carry only the display screen 304; since the image collector 303 is normally under the control of the processor 307, its rotation can be controlled by the processor 307 according to the rotation instruction.
When the rotating component 308 rotates the display screen 304 according to a rotation instruction, the rotatable range is 0 to 175 degrees horizontally and 0 to 15 degrees vertically. The position controller 306 rotates the display screen 304 according to the rotation instruction, and the display screen 304 then presents its content on the virtual screen 301 outside the intelligent sound box through the negative refractive lens 302; for the virtual screen 301 to have a good imaging effect, the included angle between the display screen 304 and the negative refractive lens 302 needs to be kept at 30-45 degrees.
Fig. 4 is a schematic diagram of a position controller according to an embodiment of the invention. The position controller may be a steering engine (servo) or another controller, which is not limited here. The working principle of a steering engine is described below taking the steering engine HS311 as an example. The HS311 consists internally of a motor, gears, a circuit-board chip, a swing arm, and a position detector. When the HS311 receives a signal, its internal circuit generates a reference signal, from which a difference between the internal and external voltages is obtained; the sign of this difference determines the rotation direction, and its absolute value determines the rotation angle. Specifically, a positive difference makes the HS311 rotate counterclockwise and a negative difference makes it rotate clockwise. The motor inside the HS311 drives the gears, which drive the rotation of the swing arm; the internal position detector judges from the swing arm's rotation angle whether the HS311 has reached the designated position. The horizontal rotation angle of the HS311 reaches 175 degrees and the vertical rotation angle 150 degrees; the working voltage is 4.8 V to 6.0 V, with a rotation speed of 60°/0.19 s at 4.8 V and 60°/0.15 s at 6.0 V.
Fig. 5 is a schematic diagram of a processor according to an embodiment of the invention. The processor may be a single-chip microcomputer or another chip, which is not limited here. Take an Arduino microcontroller as an example. The Arduino is a common hardware-interface expansion platform that can perform simple I/O operations and generate electrical signals to drive the steering engine. In this design, information for up-down rotation is transmitted to the steering engine through general-purpose input/output (GPIO) port 10, and information for left-right rotation through GPIO port 9; after receiving a signal, the steering engine rotates by the corresponding angle in the corresponding direction. That is, port 9 controls left-right (i.e. horizontal) rotation and port 10 controls up-down (i.e. vertical) rotation.
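As a minimal sketch of the control chain just described, the Python fragment below models the rule that the sign of the signal difference selects the rotation direction and its magnitude the rotation angle, and dispatches the command on the two channels named in the text (9 = horizontal, 10 = vertical). The send_pulse transport and the function names are illustrative assumptions, not anything defined by the patent.

```python
# Sketch only: models the steering-engine rule described above.
# send_pulse() is a hypothetical transport (e.g. a serial or PWM driver).
HORIZONTAL_PORT = 9   # left-right rotation, per the text
VERTICAL_PORT = 10    # up-down rotation, per the text

def rotation_command(reference: float, feedback: float):
    """Sign of the difference -> direction; magnitude -> angle."""
    diff = reference - feedback
    direction = "counterclockwise" if diff > 0 else "clockwise"
    return direction, abs(diff)

def dispatch(axis: str, reference: float, feedback: float, send_pulse):
    """Send a rotation command on the port for the requested axis."""
    direction, angle = rotation_command(reference, feedback)
    port = HORIZONTAL_PORT if axis == "horizontal" else VERTICAL_PORT
    send_pulse(port, direction, angle)  # assumed signature
```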
Through the cooperation of the components of the intelligent sound box described above, the virtual screen can change with the change of the face orientation and always face the face. The positional relationship between the virtual screen and the face is introduced below.
Fig. 6a is a schematic diagram of the position of the virtual screen in an intelligent sound box according to an embodiment of the invention. When the user's initial position is position B, the virtual screen in the sound box faces position B. When the user moves from position B to position A, as shown in Fig. 6b (a schematic diagram of the rotated position of the virtual screen provided by an embodiment of the invention), the virtual screen in the intelligent sound box rotates with the change in the user's position, so that it finally faces position A where the user is located.
Fig. 7 is a schematic flow chart of a display method of an intelligent sound box according to an embodiment of the present invention. The method comprises the following steps:
and step 701, performing face recognition and determining face orientation based on the picture information acquired by the image acquisition device.
In the embodiment of the invention, face recognition is performed through a face recognition model according to the picture information acquired by the image acquisition device, and the orientation of the face is determined.
Step 702, determining the change of the face orientation from a plurality of pieces of collected picture information.
In the embodiment of the invention, when the orientation of the face changes, the centroid coordinates of the face and a color probability model can be determined from the collected pieces of picture information, and the centroid coordinates of the changed face can then be determined from the color probability model and the previous centroid coordinates; the centroid coordinates are used to indicate the face orientation.
In step 703, a rotation command for controlling the rotation member is determined according to the change of the face orientation.
In the embodiment of the invention, once the change in face orientation is determined, the processor sends a rotation instruction to the rotating component so that the rotating component rotates in accordance with the change in face orientation.
Step 704, orienting the virtual screen projected by the display screen through the negative refractive lens toward the user according to the rotation instruction.
In the embodiment of the invention, the rotating component rotates the display screen according to the rotation instruction, so that the virtual screen projected by the display screen through the negative refractive lens faces the user.
As can be seen from steps 701 to 704, by acquiring the centroid coordinates of the face, the direction in which the face turns can be obtained from the change in those coordinates, and the orientation of the virtual screen is then determined from that direction, so that the virtual screen is kept oriented toward the face.
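The loop below is a high-level Python sketch of steps 701 to 704 under stated assumptions: detect_face, face_centroid, angles_from_centroid_shift and the camera/rotator objects are placeholders for the components described in this document (sketches of face_centroid and angles_from_centroid_shift appear later in this description), not APIs defined by the patent.

```python
def display_loop(camera, rotator):
    """Sketch of steps 701-704: capture, detect, track, rotate."""
    prev_centroid = None
    while True:
        frame = camera.capture()                # step 701: collect picture info
        face = detect_face(frame)               # classifier trained as in Fig. 8
        if face is None:
            continue
        centroid = face_centroid(face)          # indicates the face orientation
        if prev_centroid is not None:           # step 702: orientation change
            d_h, d_v = angles_from_centroid_shift(prev_centroid, centroid)
            rotator.rotate(horizontal=d_h, vertical=d_v)  # steps 703-704
        prev_centroid = centroid
```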
According to the invention, the orientation of the virtual screen in the intelligent sound box is determined from the orientation of the human face, and the display screen is rotated so that the virtual screen always faces the face. The orientation of the face in the collected picture information must be determined first; the display screen is then rotated according to changes in that orientation so that the virtual screen faces the face more accurately.
Fig. 8 is a schematic flow chart of a method for training face recognition according to an embodiment of the present invention. The method comprises the following steps:
Step 801, obtaining sample data from a face database (face information of frontal and side faces) and non-face pictures downloaded from the Internet.
In the embodiment of the invention, the sample data are generally face samples selected from the AR and ORL face databases together with arbitrary non-face samples downloaded from the Internet, the selected face images covering different poses, accessories (e.g. glasses), expressions, and so on. To recognize faces more accurately, face data from multiple angles are needed and must be collected through additional channels; the collection may be done manually or in other ways, which is not limited here. The collected data are divided into frontal faces and side faces, where the angle range of a side face is [-90°, 90°]; for a better result, 500 frontal-face and 500 side-face samples are collected.
Step 802, preprocessing a picture.
In the embodiment of the invention, each sample face image is preprocessed: the picture is denoised, scale-normalized to 20 x 20, and then subjected to operations such as color conversion and brightness conversion.
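A minimal sketch of this preprocessing, assuming OpenCV as the toolchain (the patent does not name a library); grayscale conversion and histogram equalization stand in for the color and brightness conversions mentioned above.

```python
import cv2

def preprocess(sample_bgr):
    """Denoise, normalize to 20x20, then convert color and brightness."""
    denoised = cv2.GaussianBlur(sample_bgr, (3, 3), 0)   # noise reduction
    small = cv2.resize(denoised, (20, 20))               # scale normalization
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)       # color conversion
    return cv2.equalizeHist(gray)                        # brightness conversion
```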
Step 803, identifying face information in the picture according to classifier A, classifier B and classifier C.
In the embodiment of the invention, several classifiers are trained with the AdaBoost algorithm, each of which can classify faces and non-faces. A single classifier has weak classification ability and may misclassify: for example, if AdaBoost trains only classifier A, a picture M that contains a face may be judged a non-face picture because classifier A fails to recognize the face information. If, however, AdaBoost trains classifier A and classifier B and they are connected in series, then even when classifier A misses the face information in picture M, classifier B may still recognize it, and the serial result is that picture M is a face picture. Connecting several classifiers in series therefore yields a stronger classifier and reduces the failure rate. Steps 801 to 803 above constitute the process of training the classifiers to recognize faces from the sample data.
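As described here, the serial combination behaves like an OR over the individual decisions: the window counts as a face as soon as any classifier in the chain recognizes it. A small sketch, with the trained classifiers passed in as callables (an interface assumed for illustration):

```python
def serial_classify(window, classifiers):
    """Return True if any serially connected classifier recognizes a face,
    as in the classifier A / classifier B example above."""
    return any(is_face(window) for is_face in classifiers)
```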
From steps 801 to 803 it can be seen that connecting several classifiers in series yields a stronger classifier, so faces and non-faces in a picture can be distinguished accurately, which allows the face orientation to be determined accurately later.
According to the above-mentioned methods for training face recognition in steps 801 to 803, the obtained classifier can be applied to the face recognition of the acquired picture information.
Fig. 9 is a schematic flow chart of a method for determining a face orientation according to an embodiment of the present invention. The method comprises the following steps:
Step 901, determining a detection window.
Because the collected picture information is usually large, it must be recognized region by region. In the embodiment of the invention, the recognition area is determined by a detection window, and the detection window is changed in a preset manner so that face detection covers the whole picture.
Step 902, judging whether face information is detected according to the classifier, if so, executing step 903, and if not, executing step 904.
In the embodiment of the invention, the classifier obtained from the training above determines whether face information exists in the picture inside the detection window. If it does, the position of the detection window is the position of the face; otherwise, there is no face information in the detection window and the window must be moved to detect face information again.
Step 903, calculate HSV color probability and determine the centroid coordinates of the face.
From the face recognition stage, face information is obtained, including the position and size of the window in which the face lies; an HSV color probability model is built from this face information, and the centroid coordinates of the face are determined by the planar-centroid calculation. The HSV color probability can quickly find the approximate position of the face after its orientation changes, and the initialized centroid coordinates of the face then pin the approximate position given by the HSV color probability down to the exact position. To construct the HSV color probability model, the RGB information in the face information is first mapped into HSV space (RGB refers to the three color components R-red, G-green and B-blue; HSV to the three components H-hue, S-saturation and V-value/brightness). Taking the detected face region as the initial value, the HSV color probability is calculated according to Formula 1.
P (w) =c (w)/C (h) formula 1
Where w is a certain pixel value, h is a total pixel value in the region, P (w) is a probability value of a corresponding color of the detection region, and C represents the number of pixels.
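A sketch of Formula 1 and the planar-centroid computation, again assuming OpenCV (the 180-bin hue histogram is an illustrative choice, not specified by the patent); face_centroid here is the helper used in the display_loop sketch earlier.

```python
import cv2

def hsv_color_probability(face_bgr):
    """Formula 1: P(w) = C(w) / C(h), computed as a normalized hue
    histogram of the detected face region."""
    hsv = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [180], [0, 180])  # C(w) per hue bin
    return hist / hist.sum()                                # divide by C(h)

def face_centroid(face_gray):
    """Planar centroid of the face region from image moments."""
    m = cv2.moments(face_gray)
    return m["m10"] / m["m00"], m["m01"] / m["m00"]
```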
Step 904, determining whether the detection window has detected the whole picture, if so, executing step 906, and if not, executing step 905.
In the embodiment of the invention, when the detection window detects no face information, it must first be judged whether the detection window has covered the whole picture; if not, the detection window is moved in the set movement manner. If the whole picture has been covered and still no face information has been detected, the size of the detection window must be changed and the whole picture detected again.
In step 905, the detection window is moved according to the set movement mode.
In the embodiment of the invention, suppose for example that picture N is 900 x 900. To determine the position of the face in picture N, the whole picture must be scanned. Assuming the initialized detection window is 30 x 30, the window moves by the set step length in the set movement manner until the picture has been traversed; for example, the set movement manner is to move from left to right, then down by one step length, then from left to right again, traversing the whole picture in this way.
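A sketch of this raster-style traversal; the square window and the generator interface are assumptions made for illustration.

```python
def traverse(picture_h, picture_w, win, step):
    """Yield top-left corners of the detection window: left to right,
    then one step down, until the whole picture is covered."""
    y = 0
    while y + win <= picture_h:
        x = 0
        while x + win <= picture_w:
            yield x, y
            x += step
        y += step
```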
Step 906, determining whether the detection window is the maximum value, if so, executing step 907, and if not, executing step 908.
In the embodiment of the invention, when the detection window has covered the whole picture without detecting face information and is already at its maximum value, there is no face information in the picture and detection of this picture ends; when the detection window has covered the whole picture without detecting face information but has not reached its maximum value, the size of the detection window must be changed and the picture detected again.
Step 907, end.
In the embodiment of the invention, when the detection window is at its maximum value and no face information has been detected after the whole picture has been covered, detection of this picture ends.
Step 908, the size of the detection window is changed to re-detect the picture.
In the embodiment of the invention, because faces occupy different proportions of an image, the window size must be adjusted continually. First, the detection window size w x h, the transformation coefficient k, the scanning step length s, and the maximum window value p are initialized; the image is then traversed with the set step length s and window size, and if a face is detected, the position and size of its window are recorded; if not, the detection window is expanded (the window becomes kw x kh) until the window reaches its maximum value.
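Putting steps 901 to 908 together, a sketch of the scan-then-expand loop, reusing the traverse generator above; is_face stands for the trained classifier, and the default values of k, s and p are illustrative assumptions (only the 30 x 30 initial window comes from the example above). The picture is assumed to be a NumPy-style array.

```python
def detect_face_multiscale(picture, is_face, win=30, k=1.25, s=10, p=300):
    """Scan with a win x win window; if no face is found anywhere,
    expand the window by factor k and rescan, until a face is found
    or the window reaches the maximum value p (steps 904-908)."""
    H, W = picture.shape[:2]
    while win <= p:
        for x, y in traverse(H, W, win, s):      # step 905: move the window
            region = picture[y:y + win, x:x + win]
            if is_face(region):
                return x, y, win                 # record window position, size
        win = int(win * k)                       # step 908: expand the window
    return None                                  # step 907: no face found
```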
From steps 901 to 908 it can be seen that a color probability model is built from the image information of the face and the centroid coordinates of the face are determined by the planar-centroid calculation, so that after the face orientation subsequently changes, the approximate position of the window containing the face can be found quickly from the color probability model, and the centroid coordinates then pin that approximate position down to the exact window position, allowing the new face information to be found quickly and accurately.
Steps 901 to 908 above determine the orientation of the face; how the orientation of the virtual screen is determined from changes in the face orientation is described next.
Fig. 10 is a schematic flow chart of determining the virtual screen orientation according to an embodiment of the present invention. The method comprises the following steps:
Step 1001, obtaining the first centroid coordinates of the face at the first moment.
In the embodiment of the present invention, the first centroid coordinate of the face at the first moment is determined according to the step 903.
Step 1002, a detection window is moved.
In the embodiment of the invention, when the orientation of the face changes at the second moment, the detection window must be moved, and the second centroid coordinates of the moved face are found according to the color probability model.
Step 1003, determining whether the second centroid coordinates of the moved face overlap with the first centroid coordinates, if so, executing step 1004, and if not, executing step 1005.
In the embodiment of the invention, the second centroid coordinates are compared with the first centroid coordinates; if they coincide, the centroid coordinates of the face at the second moment are the second centroid coordinates. If they do not coincide, the second centroid coordinates are not the centroid coordinates of the face at the second moment.
For example, in the picture acquired at time T0, the first centroid coordinates of the face are (x1, y1). At time T1 the orientation of the face changes, so in the picture acquired at time T1 the centroid coordinates of the face change as well. To determine the centroid coordinates of the face at time T1, the detection window is moved according to the color probability model of the face at time T0 and second centroid coordinates are found; when the first centroid coordinates coincide in position with the second centroid coordinates inside the detection window, the second centroid coordinates are the centroid coordinates of the face at time T1.
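This window-moving search is in the spirit of a mean-shift step over the back-projected color probability; the sketch below makes that assumption explicit. The convergence test on coinciding centroids follows the text, while prob_map, the iteration cap and the border handling are illustrative.

```python
import numpy as np

def track_face(prob_map, window, max_iter=20):
    """Steps 1002-1005: move the detection window over the color-probability
    map until the centroid computed inside it coincides with the window
    center (the first/second centroid comparison described above)."""
    x, y, w, h = window
    for _ in range(max_iter):
        roi = prob_map[y:y + h, x:x + w]
        total = roi.sum()
        if total == 0:
            break
        ys, xs = np.mgrid[0:h, 0:w]
        cx = (xs * roi).sum() / total            # centroid inside the window
        cy = (ys * roi).sum() / total
        dx, dy = int(round(cx - w / 2)), int(round(cy - h / 2))
        if dx == 0 and dy == 0:                  # centroids coincide: done
            break
        x, y = x + dx, y + dy                    # step 1005: move window again
    return x, y, w, h
```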
Step 1004, determining that the picture in the detection window is the face at the second moment.
In the embodiment of the invention, once the centroid coordinates of the face at the second moment have been determined, the processor can generate a rotation instruction from the second centroid coordinates and send it to the rotating component, which rotates accordingly. The carrier in the rotating component carries the display screen, the negative refractive lens and the image collector, and all three rotate with the rotating component; thus, after the centroid coordinates change, the positions of the display screen and the negative refractive lens change correspondingly, and the virtual screen projected by the negative refractive lens faces the centroid coordinates, i.e. faces the user. So that the image collector can collect face information more accurately, it rotates with the rotating component and always faces the user.
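The patent does not give the mapping from a centroid displacement to rotation angles, so the helper below is purely an assumed linear gain, clamped to the rotation ranges stated earlier (0-175 degrees horizontal, 0-15 degrees vertical); it completes the display_loop sketch above.

```python
def angles_from_centroid_shift(prev, curr,
                               deg_per_px_h=0.1, deg_per_px_v=0.05):
    """Assumed linear pixel-to-degree mapping; only the clamping
    ranges come from the text."""
    d_h = (curr[0] - prev[0]) * deg_per_px_h
    d_v = (curr[1] - prev[1]) * deg_per_px_v
    d_h = max(-175.0, min(175.0, d_h))   # horizontal range 0-175 degrees
    d_v = max(-15.0, min(15.0, d_v))     # vertical range 0-15 degrees
    return d_h, d_v
```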
Step 1005, the detection window is moved again.
In the embodiment of the invention, when the second centroid coordinates do not coincide with the first centroid coordinates, the face information found by the detection window is incorrect, and the detection window must be moved and detection repeated until the second centroid coordinates found coincide with the first centroid coordinates.
By the above method, when the orientation of the user's face changes, the new orientation can be found quickly and a rotation instruction generated from it, so that the virtual screen follows the orientation of the user's face and always stays oriented toward the user.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. An intelligent sound box, which is characterized by comprising:
the display screen is used for displaying the content played by the intelligent sound box;
the optical component is used for projecting the display content of the display screen to the outside of the intelligent sound box to form a virtual screen;
the image collector is used for collecting picture information of an external environment;
the processor is used for determining the face orientation of the user from the picture information and generating a rotation instruction according to the change of the face orientation;
and the rotating component is used for rotating under the control of the rotating instruction so as to enable the virtual screen to face the user.
2. The intelligent sound box according to claim 1, wherein the processor is specifically configured to generate a horizontal rotation instruction and/or a vertical rotation instruction according to a change of a face orientation;
the rotating component is used for rotating under the control of the horizontal rotating instruction and/or the vertical rotating instruction so as to enable the virtual screen to face the user.
3. The intelligent sound box according to claim 1, wherein the rotating component comprises a bearing and a position controller;
the bearing piece is used for bearing the display screen and the image collector;
the position controller is used for moving under the control of the rotation instruction.
4. The intelligent sound box according to claim 1, wherein the optical component is horizontally disposed on top of the intelligent sound box;
the included angle between the display screen and the optical component is 30-45 degrees;
the rotation angle of the rotating part is in the range of 0-175 degrees of horizontal rotation angle and in the range of 0-15 degrees of vertical rotation angle.
5. The intelligent sound box according to any one of claims 1 to 4, wherein the processor is further configured to control the rotation of the image collector according to the rotation instruction; the orientation of the image collector is consistent with the orientation of the virtual screen.
6. The display method of the intelligent sound box is characterized by comprising the following steps:
performing face recognition and determining face orientation based on the picture information acquired by the image acquisition device of the intelligent sound box;
determining the change of the face orientation from a plurality of pieces of collected picture information;
and determining a rotation instruction for controlling the rotating component of the intelligent sound box according to the change of the face orientation, wherein the rotation instruction is used for making the virtual screen, projected by the display screen of the intelligent sound box through the optical component of the intelligent sound box, face a user.
7. The method of claim 6, wherein recognizing the face and determining the face orientation based on the picture information acquired by the image acquirer, comprises:
performing face recognition based on the picture information acquired by the image acquisition device to obtain face information;
determining a hue-saturation-value (HSV) color probability model and centroid coordinates of the face according to the face information; the centroid coordinates are used for indicating the face orientation.
8. The method of claim 7, wherein determining the change in the orientation of the face from the acquired plurality of pieces of picture information comprises:
determining the change of the face orientation based on the first centroid coordinates of the first face identified by the first picture information and the second centroid coordinates of the second face identified by the second picture information; the first picture information and the second picture information are picture information acquired by the image acquirer at different moments within a set time length.
9. The method of claim 8, wherein the time of acquisition of the second picture information is located after the time of acquisition of the first picture information;
the second face is obtained by the following steps:
moving by using a detection window where the first face is positioned;
determining whether the centroid coordinates in the moved detection window are the same as the first centroid coordinates, and if not, continuing to move;
and determining the picture in the detection window which is the same as the first centroid coordinate as a second face.
10. The method of claim 9, wherein the first picture information is acquired first picture information having a face, the first face being obtained by:
performing face recognition on the picture information in the detection window in the first picture information according to the set detection window;
if the face is not recognized, expanding the detection window or moving the detection window according to a set mode until a face is recognized or the whole of the first picture information has been detected.
CN202111383428.5A 2021-11-22 2021-11-22 Intelligent sound box and display method thereof Pending CN116156386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111383428.5A CN116156386A (en) 2021-11-22 2021-11-22 Intelligent sound box and display method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111383428.5A CN116156386A (en) 2021-11-22 2021-11-22 Intelligent sound box and display method thereof

Publications (1)

Publication Number Publication Date
CN116156386A 2023-05-23

Family

ID=86354877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111383428.5A Pending CN116156386A (en) 2021-11-22 2021-11-22 Intelligent sound box and display method thereof

Country Status (1)

Country Link
CN (1) CN116156386A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination