WO2024114170A1 - Scene perception method, device and storage medium (场景感知方法、设备及存储介质) - Google Patents

Scene perception method, device and storage medium

Info

Publication number
WO2024114170A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
camera
scene
mode
image
Prior art date
Application number
PCT/CN2023/125982
Other languages
English (en)
French (fr)
Other versions
WO2024114170A9 (zh)
Inventor
李经纬
Original Assignee
荣耀终端有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 荣耀终端有限公司
Publication of WO2024114170A1 publication Critical patent/WO2024114170A1/zh
Publication of WO2024114170A9 publication Critical patent/WO2024114170A9/zh


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/61Control of cameras or camera modules based on recognised objects
    • H04N23/611Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/667Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes

Definitions

  • the present application relates to the field of terminal technology, and in particular to a scene perception method, device and storage medium.
  • users can use smart terminals anytime and anywhere, and usually have different requirements in different usage scenarios. For example, users in conference rooms and classrooms usually need to mute the device or lower the volume, while users in bus stations and subway stations usually need to increase the volume or vibration intensity.
  • the embodiments of the present application provide a scene perception method, device and storage medium to achieve intelligent scene perception and scene control, thereby improving the user experience.
  • an embodiment of the present application proposes a scene perception method, which is applied to an electronic device, in which a camera of the electronic device operates in a first mode; the electronic device obtains a first image captured by the camera in the first mode, and detects whether there is a person's face in the first image; if the electronic device detects a person's face in the first image, the camera is controlled to switch from the first mode to the second mode; the electronic device obtains detection data, and the detection data includes a second image captured by the camera in the second mode; the electronic device recognizes the second image, determines the scene category in which the person uses the electronic device based on the second image, and controls the execution of a preset operation corresponding to the scene category.
  • the camera captures the first image in the first mode. If a person's face is detected in the image, it switches to the second mode, captures the second image in the second mode, and determines the current scene category of the electronic device by identifying the second image, and then performs operations corresponding to the scene category, thereby realizing intelligent scene perception and scene control, and improving the user's experience of using the device.
  • the power consumption of the device can be reduced to a certain extent.
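  • As a rough, non-authoritative illustration of this first-mode/second-mode flow, the following Python sketch mirrors the described steps; all names (the capture call, detect_face, classify_scene, the mode parameters and the preset table) are placeholders invented for the example, not part of the application:

      # Minimal sketch of the two-stage flow; every name below is a stand-in
      # invented for the example, not part of the application.
      FIRST_MODE = {"fps": 5, "resolution": (320, 240)}     # low frame rate / low resolution
      SECOND_MODE = {"fps": 30, "resolution": (1280, 720)}  # higher frame rate / resolution

      def detect_face(image) -> bool:
          """Placeholder face detector run on the low-power first image."""
          return bool(image.get("face"))

      def classify_scene(image) -> str:
          """Placeholder scene classifier run on the higher-fidelity second image."""
          return image.get("scene", "unknown")

      PRESET_OPERATIONS = {"meeting": "mute", "subway": "raise volume", "bedroom": "dim screen"}

      def scene_perception_step(camera) -> None:
          first_image = camera.capture(FIRST_MODE)      # camera works in the first mode
          if not detect_face(first_image):
              return                                    # no face: stay in the low-power first mode
          second_image = camera.capture(SECOND_MODE)    # switch the camera to the second mode
          scene = classify_scene(second_image)
          print("preset operation:", PRESET_OPERATIONS.get(scene, "none"))

      class FakeCamera:
          """Stand-in camera that returns dictionaries instead of real frames."""
          def capture(self, mode):
              return {"mode": mode, "face": True, "scene": "meeting"}

      scene_perception_step(FakeCamera())   # -> preset operation: mute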
  • a resolution of the first image is lower than a resolution of the second image, and/or a frame rate of the first image is lower than a frame rate of the second image.
  • before the camera of the electronic device operates in the first mode, the method further includes: the electronic device responds to a first operation of turning on a scene perception function.
  • the conditions for turning on the camera of the device are defined to trigger the device to execute the intelligent scene category perception solution.
  • before the camera of the electronic device operates in the first mode, the method further includes: detecting that the state of the electronic device satisfies a first condition; the first condition includes at least one of the following: the screen state of the electronic device is a bright screen state; the electronic device is unlocked; the time difference between a light signal emitted by a proximity light sensor of the electronic device and a reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • the conditions for turning on the camera of the device are further limited.
  • the first condition is added to prevent the camera from continuously collecting images when it is not necessary, thereby reducing the power consumption of the device.
  • it can be understood as any scene in which the camera cannot collect the face of a person.
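  • The first condition can be pictured with a small sketch. The claim only requires at least one of the listed sub-conditions; the example below checks the stricter case in which all of them hold (as in the later embodiment that combines them), and every threshold value and field name is an assumption made for illustration:

      from dataclasses import dataclass

      # Illustrative thresholds; the first/second/third thresholds are not given
      # numerically in the application.
      TIME_DIFF_THRESHOLD = 2_000          # "first threshold" (signal round-trip time)
      REFLECTION_STRENGTH_THRESHOLD = 10   # "second threshold"
      AMBIENT_LIGHT_THRESHOLD_LUX = 5      # "third threshold"

      @dataclass
      class DeviceState:
          screen_on: bool                  # bright screen state
          unlocked: bool
          proximity_time_diff: float       # emitted light signal -> reflected signal
          reflection_strength: float
          reflection_received: bool
          ambient_light_lux: float
          screen_facing_preset_direction: bool
          moving: bool

      def first_condition_met(s: DeviceState) -> bool:
          # The proximity light sensor is considered "not blocked" if any of these hold.
          proximity_clear = (
              s.proximity_time_diff > TIME_DIFF_THRESHOLD
              or s.reflection_strength < REFLECTION_STRENGTH_THRESHOLD
              or not s.reflection_received
          )
          return (
              s.screen_on
              and s.unlocked
              and proximity_clear
              and s.ambient_light_lux > AMBIENT_LIGHT_THRESHOLD_LUX
              and s.screen_facing_preset_direction
              and s.moving
          )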
  • the electronic device is a foldable device, the foldable device includes an inner screen and an outer screen, the inner screen is correspondingly provided with a first camera, and the outer screen is correspondingly provided with a second camera; the camera of the electronic device operates in a first mode, including: detecting that the outer screen of the electronic device is in a bright screen state and the electronic device is in a folded state, and controlling the second camera to operate in the first mode; or, detecting that the inner screen of the electronic device is in a bright screen state and the electronic device is in an unfolded state, and controlling the first camera to operate in the first mode.
  • before controlling the first camera to operate in the first mode, or controlling the second camera to operate in the first mode, the method also includes: detecting that the state of the electronic device satisfies a second condition; the second condition includes at least one of the following: the electronic device is unlocked; the time difference between a light signal emitted by a proximity light sensor of the electronic device and a reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the inner screen or outer screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • the conditions for turning on the camera of a foldable device are further limited.
  • a second condition is added to avoid the camera from continuously collecting the first image when it is not necessary, thereby reducing the power consumption of the device.
  • the method also includes: when the second camera is operating in the first mode, if it is detected that the electronic device changes from a folded state to an unfolded state, the electronic device controls the first camera to operate in the first mode and turns off the second camera; or, the electronic device controls the first camera to operate in the first mode; when the first camera is operating in the first mode, if it is detected that the electronic device changes from an unfolded state to a folded state, the electronic device controls the first camera to turn off and controls the second camera to operate in the first mode.
  • the first image can be continuously captured by switching the camera, so that the device can still realize the function of intelligent scene perception in the new physical state.
  • the detection data further includes time data
  • the method further includes: if it is determined that the time data is within a preset time period, using a preset scene category corresponding to the preset time period as the scene category of the electronic device.
  • the possible scene categories of the device in the current time period can be obtained to assist the device in perceiving the scene.
  • the detection data further includes location data
  • the method further includes: if it is determined that the location data is within a preset location range, using a preset scene category corresponding to the preset location range as a scene category of the electronic device.
  • the possible scene categories of the device at the current location can be obtained to assist the device in perceiving the scene.
  • the possible scene category of the device in the current environment can be obtained to assist the device in perceiving the scene.
  • the detection data also includes data of a first sensor of the electronic device, and the first sensor includes a gyroscope sensor and an acceleration sensor; the method also includes: the electronic device determines the scene category of the electronic device based on the second image in the detection data and the data of the first sensor.
  • the electronic device determines the scene category of the electronic device based on the second image in the detection data and the data of the first sensor, including: if it is determined based on the data of the first sensor that the user is in motion, and it is determined based on the second image that the user continues to look at the screen of the electronic device, the scene category of the electronic device is determined to be a third scene; the motion state includes walking or riding.
  • the third scene can be a scene of walking or riding and looking at the screen, which is a scene of unsafe use of the electronic device.
  • the user's movement state and the user's eye state are detected respectively, so as to determine whether the user is in a scenario where it is unsafe to use the electronic device, and realize the ability to perceive the scenario.
  • the detection data further includes data of a second sensor of the electronic device, and the second sensor includes an ambient light sensor; the method further includes: if it is determined that the data of the second sensor is less than a fourth threshold, determining that the scene category of the electronic device is a fourth scene.
  • the fourth scene can be a dark environment scene, such as a bedroom or sleeping scene.
  • this solution by detecting the ambient light data of the electronic device's environment, it can be determined whether the user is in a dark environment to assist the device in perceiving the scene.
  • This solution can also be combined with the electronic device's clock information, location information, etc. to improve the accuracy of the device's perception of the scene.
  • the preset operations include at least one of the following: adjusting the volume; adjusting the screen brightness; adjusting the screen blue light; adjusting the vibration intensity; sending a first message, where the first message is used to remind the user to stop using the electronic device; sending a second message, where the second message is used to recommend content corresponding to the scene category; turning on the rear camera to detect obstacles.
  • the method also includes: the electronic device obtains a third image captured by the rear camera in the second mode; if an obstacle is identified in the third image, a third message is sent, and the third message is used to remind the user to avoid the obstacle.
  • This solution is mainly aimed at the third scenario mentioned above.
  • the user can be reminded to avoid obstacles in time, thereby improving the user's experience.
  • the camera of the electronic device operates in a first mode, including: a perception module of the electronic device sends a first indication to a second processing module of the electronic device, the first indication being used to instruct the second processing module to detect whether there is a human face within the camera range; the second processing module sends a first shooting instruction to the camera; and the camera operates in the first mode in response to the first shooting instruction.
  • the electronic device obtains a first image captured by the camera in the first mode, and detects whether there is a human face in the first image, including: a second processing module of the electronic device obtains the first image captured by the camera in the first mode, and detects whether there is a human face in the first image.
  • the camera is controlled to switch from the first mode to the second mode, including: if the second processing module of the electronic device detects that there is a human face in the first image, the second processing module sends a first message to the perception module of the electronic device, and the first message is used to notify the perception module that there is a human face within the camera range; the perception module sends a second indication to the first processing module of the electronic device, and the second indication is used to instruct the first processing module to identify the category of the scene within the camera range; the first processing module sends a second shooting instruction to the camera in response to the second indication, and the second shooting instruction is used to instruct the camera to operate in the second mode.
  • the electronic device recognizes a second image, determines a scene category in which a person uses the electronic device based on the second image, and controls execution of a preset operation corresponding to the scene category, including: a first processing module of the electronic device recognizes the second image, determines the scene category of the electronic device based on the second image, and sends a second message to a perception module of the electronic device, the second message is used to indicate the scene category of the electronic device; the perception module sends a third indication to a target application of the electronic device, the third indication is used to indicate the scene category of the electronic device; the target application controls execution of a preset operation corresponding to the scene category.
  • the second processing module of the electronic device detects a state of the electronic device.
  • the above-mentioned optional embodiments illustrate the interaction process between the underlying modules of the electronic device to achieve intelligent scene perception and scene control of the device.
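  • The module interaction above can be sketched as a simple message flow. The module names follow the text, but the method names, message objects and the toy camera are invented for this illustration:

      class Camera:
          def shoot(self, mode):                          # handles the first/second shooting instruction
              return {"mode": mode, "face": True, "scene": "meeting"}

      class SecondProcessingModule:                       # lower power: face detection only
          def __init__(self, camera, perception):
              self.camera, self.perception = camera, perception
          def on_first_indication(self):
              first_image = self.camera.shoot("first mode")
              if first_image["face"]:
                  self.perception.on_first_message()      # first message: a face is within camera range

      class FirstProcessingModule:                        # higher power: scene recognition
          def __init__(self, camera, perception):
              self.camera, self.perception = camera, perception
          def on_second_indication(self):
              second_image = self.camera.shoot("second mode")
              self.perception.on_second_message(second_image["scene"])

      class TargetApp:
          def on_third_indication(self, scene):
              print("executing preset operation for scene:", scene)

      class PerceptionModule:
          def __init__(self, camera):
              self.second = SecondProcessingModule(camera, self)
              self.first = FirstProcessingModule(camera, self)
              self.app = TargetApp()
          def start(self):
              self.second.on_first_indication()           # first indication: watch for a face
          def on_first_message(self):
              self.first.on_second_indication()           # second indication: identify the scene
          def on_second_message(self, scene):
              self.app.on_third_indication(scene)         # third indication: notify the target application

      PerceptionModule(Camera()).start()                  # -> executing preset operation for scene: meeting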
  • an embodiment of the present application provides an electronic device, comprising: a camera, a memory and a processor, wherein the camera is used to capture images with different frame rates and/or resolutions, and the processor is used to call a computer program in the memory to execute a method as described in any one of the first aspects.
  • the processor includes a first processing module and a second processing module, the power consumption of the first processing module is higher than that of the second processing module; the second processing module is used to detect whether there is a human face in a first image captured by the camera in a first mode; the first processing module is used to identify a second image captured by the camera in the second mode, and determine the scene category in which the user uses the electronic device.
  • a second processing module with lower power consumption is used to detect whether there is a human face within the camera range, and a first processing module with higher power consumption is used to identify the current scene category of the electronic device, thereby optimizing the processing performance of the electronic device.
  • an embodiment of the present application provides an electronic device, the electronic device comprising a unit, a module or a circuit for executing any method as described in the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, which stores computer instructions.
  • when the computer instructions are executed on an electronic device, the electronic device executes the method described in any one of the first aspects.
  • an embodiment of the present application provides a chip, the chip includes a processor, and the processor is used to call a computer program in a memory to execute the method described in any one of the first aspects.
  • a computer program product comprises a computer program, and when the computer program is executed, the computer executes the method as described in any one of the first aspect.
  • FIG1 is a schematic diagram of a scenario provided in an embodiment of the present application.
  • FIG2 is a schematic diagram of an interface provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of a scenario provided in an embodiment of the present application.
  • FIG4 is a schematic diagram of a scenario provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG7 is a schematic diagram of the structure of a SoC provided in an embodiment of the present application.
  • FIG8 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG9 is a schematic diagram of a flow chart of a scene perception method provided in an embodiment of the present application.
  • FIG10 is a schematic diagram of a flow chart of a scene perception method provided in an embodiment of the present application.
  • FIG11 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG12 is a schematic diagram of the structure of a chip provided in an embodiment of the present application.
  • FIG13 is a schematic diagram of the structure of a folding screen mobile phone provided in an embodiment of the present application.
  • FIG14 is a flow chart of a scene perception method provided in an embodiment of the present application.
  • words such as “first” and “second” are used to distinguish between identical or similar items with substantially identical functions and effects.
  • the first image and the second image are only used to distinguish between images with different frame rates and/or resolutions, and their order is not limited.
  • the first indication and the second indication are only used to distinguish between different indications.
  • At least one refers to one or more
  • plural refers to two or more.
  • “And/or” describes the association relationship of associated objects, indicating that three relationships may exist.
  • A and/or B can represent: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
  • the character “/” generally indicates that the previous and next associated objects are in an “or” relationship.
  • “At least one of the following (kind/piece)” or similar expressions refers to any combination of these items, including any combination of single items (kind/piece) or plural items (kind/piece).
  • at least one of a, b or c (kind/piece) can represent: a, b, c, ab, ac, bc, or abc, where a, b, c can be single or multiple.
  • Frame rate refers to the number of images captured or transmitted by a camera in one second, usually expressed in fps (frames per second).
  • the camera uses a first frame rate to capture/transmit images in the first mode and a second frame rate to capture/transmit images in the second mode, and the first frame rate is less than the second frame rate.
  • the resolution of the image captured in the first mode is less than the resolution of the image captured in the second mode.
  • Resolution, also known as image resolution, refers to the amount of information stored in an image, which is the number of pixels per inch of the image.
  • the units of resolution include: dpi (dots per inch), ppi (pixels per inch), etc.
  • Geo-fencing is an application of location-based services (LBS), which uses a virtual fence to enclose a virtual geographic boundary.
  • LBS location-based services
  • the electronic device can receive automatic notifications, warnings, and other information prompts.
  • the electronic device can also automatically set system-related parameters, such as volume, vibration intensity, etc.
  • Geo-fencing can have different names based on different scenarios. For example, a geo-fence near a subway station can be called a subway fence, a geo-fence near an office building can be called an office fence, and a geo-fence near a teaching building can be called a classroom fence.
  • the geo-fence can be a general fence in a public area, and the electronic device needs to obtain authorization from the user of the electronic device to detect whether to enter the geo-fence and perform operations related to the geo-fence.
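  • As an illustration of how a geo-fence check might look, the sketch below models each fence as a circle around a centre point; the coordinates, radii and preset names are purely illustrative, and the application does not prescribe any particular fence geometry:

      import math

      # Each fence is a circle around a centre point; coordinates, radii and
      # presets are illustrative only.
      GEO_FENCES = [
          {"name": "subway fence", "lat": 39.907, "lon": 116.397, "radius_m": 200, "preset": "raise volume"},
          {"name": "office fence", "lat": 39.915, "lon": 116.404, "radius_m": 150, "preset": "mute"},
      ]

      def haversine_m(lat1, lon1, lat2, lon2):
          """Great-circle distance in metres between two WGS-84 points."""
          r = 6_371_000.0
          p1, p2 = math.radians(lat1), math.radians(lat2)
          dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
          a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
          return 2 * r * math.asin(math.sqrt(a))

      def fences_containing(lat, lon):
          """Return the geo-fences whose virtual boundary currently contains the device."""
          return [f for f in GEO_FENCES
                  if haversine_m(lat, lon, f["lat"], f["lon"]) <= f["radius_m"]]

      for fence in fences_containing(39.9075, 116.3972):
          print(fence["name"], "->", fence["preset"])    # e.g. subway fence -> raise volume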
  • a lightweight neural network is a smaller model whose performance is no worse than that of a heavier model, thereby realizing a hardware-friendly neural network.
  • the weight here usually refers to the scale or number of parameters of the model.
  • Commonly used lightweight neural network technologies include: distillation, pruning, quantization, weight sharing, low-rank decomposition, lightweight attention module, dynamic network architecture/training method, lighter network architecture design, etc., which are not limited to this embodiment of the present application.
  • users use smart terminals in a variety of scenarios, and the user's usage needs in different usage scenarios are somewhat different. Users usually need to manually adjust the relevant parameters of the smart terminal to adapt to the current usage scenario. For example, in quieter scenes, such as conference rooms, classrooms, hospitals, etc., it is often necessary to set the mobile phone to silent or lower the volume. In noisier scenes, such as stations and subway stations, it is often necessary to increase the volume or vibration intensity. Based on this, how to improve the scene perception ability of smart terminals is an urgent problem to be solved.
  • the embodiments of the present application provide a scene perception method, an electronic device, and a storage medium.
  • the electronic device provided by the embodiments of the present application, when the conditions for turning on the camera are met, instructs the camera to work in the first mode, obtains the first image captured by the camera in the first mode, and when it is detected that the first image contains a human face, instructs the camera to switch from the first mode to the second mode, and obtains the second image captured by the camera in the second mode, wherein the resolution and/or frame rate of the second image is greater than the first image.
  • the scene category in which the user uses the electronic device is determined, and the preset operation corresponding to the scene category is executed, thereby realizing automatic detection and identification of the use scene of the electronic device, so as to execute the preset operation corresponding to the use scene, such as adjusting the volume, vibration intensity, pushing information, etc., to improve the user experience.
  • FIG1 is a schematic diagram of a scenario provided by an embodiment of the present application.
  • the front camera of the mobile phone is triggered to capture a first image in the first mode, and when a face of a person is detected in the first image, the front camera and/or rear camera of the mobile phone is triggered to capture a second image in the second mode, and the scene category in which the user is currently using the mobile phone is determined by performing image analysis on the second image.
  • a preset operation corresponding to the scene category is performed, such as adjusting the volume, brightness, vibration intensity, sending a prompt message, etc.
  • the power consumption of the camera in the first mode is lower than that in the second mode.
  • the resolution of the first image is lower than that of the second image, and the frame rate of the first image is lower than that of the second image.
  • the first image captured by the front camera may be one or more images, and the second image captured by the front camera and/or the rear camera may be one or more images, and this embodiment does not impose any restrictions on this.
  • the condition for turning on the camera includes: the scene perception function has been turned on.
  • the camera of the mobile phone runs in the first mode (background operation, for example, the mobile phone currently opens a third-party application and the camera runs in the background) to capture the above-mentioned first image.
  • the first operation can be a click operation of the user in the system setting interface, and the first operation can also be a voice operation, which is not limited in the embodiments of the present application.
  • Figure 2 is a schematic diagram of an interface provided in an embodiment of the present application.
  • the user can choose to turn on or off the scene perception function in the setting interface of the system application.
  • the front camera of the mobile phone is triggered to capture the above-mentioned first image.
  • the user can also choose to turn on or off the scene perception function in the settings interface of the third-party application.
  • the front camera of the mobile phone is triggered to capture the above-mentioned first image.
  • an opened third-party application includes a third-party application that the user is currently clicking on, or a third-party application that the user has opened and that has entered the background running state.
  • the condition for turning on the camera further includes a first condition
  • the first condition includes at least one of the following:
  • the screen status of the mobile phone is on; the mobile phone is unlocked; the time difference between the light signal emitted by the proximity light sensor of the mobile phone and the reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the receiving light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the mobile phone is greater than a third threshold; the screen of the mobile phone is facing a preset direction; the mobile phone is in a moving state.
  • the camera of the mobile phone in response to a first operation of turning on a scene perception function and satisfying a first condition, operates in a first mode to capture the above-mentioned first image.
  • the camera of the mobile phone is triggered to continuously collect images.
  • the mobile phone camera is prevented from continuously collecting images when it is not necessary, thereby further reducing the power consumption of the device.
  • the front camera of the mobile phone is triggered to capture the first image.
  • the screen state of the mobile phone is detected, and if the screen state of the mobile phone is a bright screen state, the front camera of the mobile phone is triggered to capture the first image.
  • the interface displayed in the bright screen state of the mobile phone includes, for example, a lock screen interface, a main interface, and a third-party application interface.
  • if it is detected that the user has turned on the scene perception function, it is detected whether the mobile phone is unlocked. If the mobile phone is unlocked, the front camera of the mobile phone is triggered to capture the first image.
  • if it is detected that the user has turned on the scene perception function, the proximity light sensor of the mobile phone is detected. If it is determined that the light signal emitted by the proximity light sensor is not blocked, the front camera of the mobile phone is triggered to capture the first image.
  • the time difference between a light signal emitted by a proximity light sensor of a mobile phone and a reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal, it can be determined that the light signal emitted by the proximity light sensor is not blocked.
  • the camera can stop continuously capturing images to reduce device power consumption.
  • the detection data mainly refers to the ambient light brightness. It should be understood that if the detection data of the ambient light sensor of the mobile phone is greater than the third threshold, it means that the electronic device is not in a dark environment, such as inside a pocket or a night-time setting.
  • the screen orientation of the mobile phone is detected, and if the screen of the mobile phone is facing a preset direction, the front camera of the mobile phone is triggered to capture the first image.
  • the preset direction can be understood as the direction in which the user uses the mobile phone, usually the direction in which the mobile phone screen faces the user, and the direction can be determined by detecting the posture data of the mobile phone, where the posture data includes the pitch angle, yaw angle, and roll angle.
  • if it is detected that the user has turned on the scene perception function, it is detected whether the mobile phone is in a moving state, and if it is determined that the mobile phone is in a moving state, the front camera of the mobile phone is triggered to capture the first image.
  • the mobile phone is in a moving state, for example, the user carries (including holds) the mobile phone while walking or riding, the user carries (including holds) the mobile phone while riding in a vehicle, etc.
  • the screen of the mobile phone is in a bright screen state, the mobile phone is unlocked, the light signal emitted by the proximity light sensor of the mobile phone is not blocked, the screen of the mobile phone is facing a preset direction, and the mobile phone is in a moving state, then the front camera of the mobile phone is triggered to capture the first image.
  • the above embodiment shows a scene perception method. If the conditions for turning on the camera are met, facial detection is triggered. The conditions for turning on the camera at least include that the scene perception function has been turned on. Facial detection is performed by acquiring a first image with a lower resolution. If a person's face is detected in the first image, a second image with a higher resolution is acquired. The scene category in which the user is currently using the mobile phone is determined by analyzing the second image, and then the preset operation corresponding to the scene category is executed.
  • the above method realizes the function of intelligent scene perception of the mobile phone. The user does not need to manually set system parameters such as volume and vibration intensity according to scene changes, which improves the user experience.
  • if it is detected that the user has turned on the scene perception function, it is detected whether the mobile phone has entered a preset geo-fence. If the mobile phone enters the preset geo-fence, the mobile phone can learn the scene corresponding to the geo-fence, and then perform the preset operation corresponding to the scene.
  • the preset geo-fence may include, for example, a subway fence, an office fence, a classroom fence, etc. Exemplarily, when it is detected that the mobile phone enters the subway fence, the mobile phone can learn that the user is about to enter the subway car, and the volume of the mobile phone can be turned up, or the vibration intensity can be increased.
  • when it is detected that the mobile phone enters the office fence, the mobile phone can learn that the user is about to enter the office, and the mobile phone can be set to silent, or the vibration intensity can be increased.
  • the facial detection of the above embodiment may not be performed, and only the current location of the mobile phone is used to determine whether it enters the preset geo-fence, so as to set the corresponding preset operation.
  • FIG3 is a schematic diagram of a scenario provided by an embodiment of the present application.
  • when the condition for turning on the camera is met, the front camera of the mobile phone is triggered to capture the first image, and when it is detected that the first image contains a face of a person, the detection data is obtained.
  • the detection data includes at least one of image data, time data, location data, and voice data.
  • the scene category in which the user is currently using the mobile phone is determined. For example, in FIG3, the current scene is identified as a meeting or classroom scene, and a preset operation corresponding to the current scene can be executed, such as lowering the volume, or setting it to mute, or increasing the vibration intensity, etc.
  • the detection data includes a second image captured by the front camera and/or the rear camera, and the scenario category in which the user is currently using the mobile phone is determined based on the second image in the detection data, that is, the scenario category in which the user is currently using the mobile phone is determined by performing image analysis on the second image.
  • the second image is input into the scene detection model to obtain a first detection result output by the scene detection model, and the first detection result is used to indicate the scene category in which the user uses the mobile phone.
  • the scene detection model can be trained using a lightweight neural network model.
  • the training process of a scene detection model includes:
  • Step a: construct a training set and a test set for the scene detection model.
  • the training set or the test set includes sample images and scene categories (sample annotations) corresponding to the sample images.
  • the sample images in the training set and the test set are different.
  • Step b: train the scene detection model based on the initial scene detection model and the training set. Specifically, the sample images of the training set are used as the input of the initial scene detection model, and the scene categories corresponding to the sample images of the training set are used as the output of the initial scene detection model to train the scene detection model.
  • Step c: based on the scene detection model trained in step b and the test set, the prediction results of the scene detection model are verified. When the model loss function converges, the training of the scene detection model is stopped.
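  • A minimal training-loop sketch along the lines of steps a to c is shown below. PyTorch is assumed purely for illustration (the application does not name a framework), the tiny network stands in for a lightweight backbone, and random tensors stand in for real sample images and scene labels:

      import torch
      from torch import nn

      NUM_SCENES = 4                       # e.g. meeting, station, walking, bedroom
      model = nn.Sequential(               # stand-in for a lightweight backbone
          nn.Flatten(),
          nn.Linear(3 * 32 * 32, 64), nn.ReLU(),
          nn.Linear(64, NUM_SCENES),
      )

      # Step a: training and test sets of (sample image, scene category) pairs.
      train_x, train_y = torch.randn(256, 3, 32, 32), torch.randint(0, NUM_SCENES, (256,))
      test_x, test_y = torch.randn(64, 3, 32, 32), torch.randint(0, NUM_SCENES, (64,))

      loss_fn = nn.CrossEntropyLoss()
      opt = torch.optim.Adam(model.parameters(), lr=1e-3)

      # Step b: fit the model on the training set (full-batch updates for brevity).
      for epoch in range(10):
          opt.zero_grad()
          loss = loss_fn(model(train_x), train_y)
          loss.backward()
          opt.step()

      # Step c: verify the predictions on the test set; in practice training stops
      # once the loss function converges.
      with torch.no_grad():
          test_loss = loss_fn(model(test_x), test_y).item()
          accuracy = (model(test_x).argmax(dim=1) == test_y).float().mean().item()
      print(f"test loss {test_loss:.3f}, accuracy {accuracy:.2%}")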
  • the detection data includes time data
  • the scene category in which the user currently uses the mobile phone is determined based on the time data in the detection data. As an example, if it is determined that the time data is within a preset time period, the preset scene category corresponding to the preset time period is used as the scene category in which the user uses the mobile phone.
  • for example, by obtaining a meeting schedule, the mobile phone can learn the meeting time and location. If the current moment is within the meeting time, the current scene can be determined to be a meeting scene. For example, by obtaining an electronic class schedule, the mobile phone can learn the class time and location. If the current moment is within the class time, the current scene can be determined to be a classroom scene.
  • the detection data includes location data
  • the scene category in which the user currently uses the mobile phone is determined based on the location data in the detection data.
  • the preset scene category corresponding to the preset location range is used as the scene category in which the user uses the mobile phone.
  • the detection data includes voice data
  • the scenario category in which the user is currently using the mobile phone is determined based on the voice data in the detection data.
  • the voice data contains one sound source or less than N sound sources
  • the scenario category in which the user uses the mobile phone is determined to be the first scenario, such as the meeting or classroom scenes in Figure 3.
  • the voice data contains more than M sound sources
  • the scenario category in which the user uses the mobile phone is determined to be the second scenario, such as the scene of a station or subway station.
  • N is a positive integer greater than or equal to 2
  • M is a positive integer greater than N.
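  • The N/M rule can be summarised in a short sketch; the threshold values and the way sound sources are counted are assumptions for illustration, since real counting would need an audio separation or diarisation method:

      # Thresholds and the sound-source counter are assumptions for the sketch.
      N = 3    # fewer than N sources: quiet "first scenario" (meeting, classroom)
      M = 8    # more than M sources: noisy "second scenario" (station, subway)

      def scene_from_sound_sources(num_sources: int):
          if num_sources < N:
              return "first scenario (quiet, e.g. meeting or classroom)"
          if num_sources > M:
              return "second scenario (noisy, e.g. station or subway)"
          return None   # in-between counts: leave the decision to other detection data

      print(scene_from_sound_sources(1))    # first scenario
      print(scene_from_sound_sources(12))   # second scenario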
  • the detection data includes image data (i.e., the second image), time data, location data, and voice data, and the above detection data are comprehensively analyzed to determine the scene category in which the user is currently using the mobile phone. Compared with the above various implementations, the scene category determined by this implementation is more accurate.
  • the above embodiment shows a scene perception method. If the conditions for turning on the camera are met, face detection is triggered. The conditions for turning on the camera at least include that the scene perception function has been turned on. If a face is detected, the detection data is obtained.
  • the method can realize the function of intelligent scene perception of mobile phones, and users do not need to manually set parameters such as system volume and vibration intensity according to scene changes, which improves the user experience.
  • FIG4 is a schematic diagram of a scenario provided by an embodiment of the present application.
  • the front camera of the mobile phone is triggered to capture a first image, and when it is detected that the first image contains a face of a person, detection data is obtained.
  • the detection data includes image data and data of a first sensor
  • the first sensor includes a gyroscope sensor and an acceleration sensor. Based on the image data in the detection data and the data of the first sensor, the scenario category in which the user uses the mobile phone is determined.
  • the current scene is a scene of walking and staring at the screen
  • a preset operation corresponding to the current scene can be executed, such as sending a first message to remind or suggest the user not to use the mobile phone.
  • the first message is sent by a pop-up window or voice.
  • the image data in the detection data includes a plurality of continuous second images captured by the front camera. If it is determined based on the data of the first sensor that the user is walking, and it is determined based on the plurality of continuous second images that the user is continuously staring at the mobile phone screen, the scenario category in which the user uses the mobile phone is determined to be the third scenario.
  • the third scenario is a scenario of walking and staring at the screen.
  • the image data in the detection data includes a plurality of continuous second images captured by the front camera. If it is determined based on the data of the first sensor that the user is in a riding state, and it is determined based on the plurality of continuous second images that the user is continuously staring at the mobile phone screen, the scenario category in which the user uses the mobile phone is determined to be the third scenario.
  • the third scenario is a scenario of riding and staring at the screen.
  • determining the motion state of the user based on the data of the first sensor includes at least one of the following:
  • obtain the mobile phone posture data detected by the gyroscope sensor and determine the user's motion state based on the posture data, where the posture data includes the pitch angle, yaw angle and roll angle of the mobile phone; obtain the mobile phone acceleration detected by the acceleration sensor, and determine the user's motion state based on the mobile phone acceleration.
  • determining that a user continues to gaze at a mobile phone screen based on multiple consecutive second images includes: inputting the multiple consecutive second images into a gaze detection model in sequence, obtaining a second detection result output by the gaze detection model, and the second detection result is used to indicate whether the user continues to gaze at the screen of the electronic device.
  • the gaze detection model can be trained based on a deep learning method using a lightweight neural network model.
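  • A rough sketch of combining the motion check with the gaze check is shown below; the helpers stand in for the first-sensor processing and the gaze detection model, and the thresholds are assumptions rather than values from the application:

      from statistics import mean

      MOTION_SPREAD_THRESHOLD = 0.5   # m/s^2 variation suggesting walking or riding (assumed)
      GAZE_SCORE_THRESHOLD = 0.8      # assumed output range of the gaze detection model

      def user_in_motion(accel_magnitudes: list[float]) -> bool:
          """Very rough motion check: the acceleration magnitude varies while moving."""
          return max(accel_magnitudes) - min(accel_magnitudes) > MOTION_SPREAD_THRESHOLD

      def user_gazing(gaze_scores: list[float]) -> bool:
          """Stand-in for the gaze detection model applied to consecutive second images."""
          return mean(gaze_scores) > GAZE_SCORE_THRESHOLD

      def is_third_scene(accel_magnitudes: list[float], gaze_scores: list[float]) -> bool:
          # walking or riding while continuously looking at the screen -> unsafe use
          return user_in_motion(accel_magnitudes) and user_gazing(gaze_scores)

      print(is_third_scene([9.5, 10.4, 9.2, 10.8], [0.9, 0.85, 0.95]))   # -> True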
  • the following operation can also be performed: turn on the rear camera to detect obstacles.
  • the mobile phone turns on the rear camera, the camera works in the second mode, and obtains a third image captured by the rear camera in the second mode.
  • obstacles are identified in the third image, such as steps, telephone poles, motor vehicles, potholes, etc.
  • a third message is sent, and the third message is used to remind the user to avoid obstacles.
  • the resolution of the third image is greater than the resolution of the first image
  • the frame rate of the third image is greater than the frame rate of the first image.
  • a target detection model can be used to determine whether there are obstacles in the third image.
  • the target detection model can be based on a deep learning method and trained using a lightweight neural network model.
  • the training process of the target detection model includes: Step a: construct a training set and a test set of the target detection model, wherein the training set or the test set includes sample images and annotation information of the sample images, the annotation information is used to indicate whether there are obstacles in the sample images, and the sample images in the training set and the test set are different.
  • Step b: train the target detection model based on the initial target detection model and the training set. Specifically, the sample images of the training set are used as the input of the initial target detection model, and the annotation information of the sample images of the training set is used as the output of the initial target detection model to train the target detection model.
  • Step c: based on the target detection model trained in step b and the test set, the prediction results of the target detection model are verified. When the model loss function converges, the training of the target detection model is stopped.
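  • The rear-camera obstacle check can be sketched as follows; the detector interface, score threshold and label set are assumptions standing in for the trained target detection model:

      # Rough sketch of running the target detection model on a rear-camera image
      # captured in the second mode; all names are illustrative.
      OBSTACLE_LABELS = {"step", "telephone pole", "motor vehicle", "pothole"}

      def check_obstacles(third_image, detector, score_threshold: float = 0.5):
          """Return a third-message reminder if the detector finds an obstacle, else None."""
          detections = detector(third_image)   # e.g. [{"label": "step", "score": 0.91}, ...]
          hits = [d["label"] for d in detections
                  if d["label"] in OBSTACLE_LABELS and d["score"] > score_threshold]
          if hits:
              return "third message: please watch out for " + ", ".join(hits) + " ahead"
          return None

      fake_detector = lambda image: [{"label": "step", "score": 0.91}]
      print(check_obstacles("third image placeholder", fake_detector))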
  • the above embodiment shows a scene perception method. If the conditions for turning on the camera are met, facial detection is triggered. The conditions for turning on the camera at least include that the scene perception function has been turned on. If a person's face is detected, various detection data are obtained, including image data, posture data, speed acceleration data, etc., to sense whether the user is using the mobile phone in an unsafe scene, such as walking or riding while looking at the screen. If it is sensed that the user is using the mobile phone in an unsafe scene, the user can be reminded not to use the mobile phone or pay attention to road safety, thereby improving the user's experience.
  • an unsafe scene such as walking or riding while looking at the screen.
  • the detection data includes data from a second sensor, and the second sensor includes an ambient light sensor. If it is determined that the data from the second sensor is less than a fourth threshold, it is determined that the scenario category in which the user uses the mobile phone is the fourth scenario.
  • the data from the second sensor is used to indicate the ambient light data of the mobile phone in the current scene, such as light intensity. If the data from the ambient light sensor is less than the fourth threshold, it means that the mobile phone is currently in a dark environment, that is, the fourth scenario (e.g., bedroom/sleep scenario). At this time, the mobile phone can automatically adjust the screen brightness of the mobile phone, or start the low blue light mode.
  • a comprehensive judgment can also be made in combination with the clock, the device usage habits, the geographic fence, etc.
  • the operations after entering the bedroom/sleep scene include, for example, reducing the screen brightness, reducing blue light, recommending sleep content (i.e., content corresponding to the scene category bedroom), etc.
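  • An illustrative check for the fourth (dark environment) scene, optionally assisted by the clock, might look like the sketch below; the lux threshold and the night hours are assumed values, not figures from the application:

      FOURTH_THRESHOLD_LUX = 10.0   # assumed "fourth threshold"

      def is_fourth_scene(ambient_lux: float, hour=None) -> bool:
          dark = ambient_lux < FOURTH_THRESHOLD_LUX
          night = hour is None or hour >= 22 or hour < 6   # optional clock-based assist
          return dark and night

      def enter_fourth_scene_preset():
          # e.g. lower screen brightness, enable the low blue light mode, recommend sleep content
          print("applying bedroom/sleep preset")

      if is_fourth_scene(ambient_lux=3.0, hour=23):
          enter_fourth_scene_preset()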
  • the mobile phone can identify the scene category in which the user uses the mobile phone by combining image data, clock (time) data, location data, voice data and at least one of various sensor data, and then perform a preset operation corresponding to the scene category.
  • the preset operation includes at least one of the following: adjusting the volume; adjusting the screen brightness; adjusting the screen blue light; adjusting the vibration intensity; sending a first message, the first message is used to remind the user to stop using the electronic device; sending a second message, the second message is used to recommend content corresponding to the scene category; turning on the rear camera to detect obstacles.
  • the scene perception method provided in the embodiment of the present application can be applied not only to straight-screen mobile phones, but also to foldable screen mobile phones.
  • the folding screen mobile phone includes an inner screen and an outer screen, the inner screen is correspondingly provided with a first camera, and the outer screen is correspondingly provided with a second camera.
  • the first camera is camera 3 in Figure 13
  • the second camera is camera 1 in Figure 13.
  • the second camera is controlled to operate in the first mode.
  • before controlling the second camera to operate in the first mode, the method also includes: detecting that the state of the mobile phone satisfies at least one of the following: the mobile phone is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the outer screen) of the mobile phone and the reflected signal of the light signal is greater than the first threshold, and/or the signal strength of the reflected signal is less than the second threshold, and/or the proximity light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the mobile phone is greater than the third threshold; the outer screen of the mobile phone is facing a preset direction; the mobile phone is in a moving state.
  • the first camera is controlled to operate in the first mode. Before controlling the first camera to operate in the first mode, the method also includes: detecting that the state of the mobile phone satisfies at least one of the following: the mobile phone is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the inner screen) of the mobile phone and the reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the mobile phone is greater than a third threshold; the inner screen of the mobile phone is facing a preset direction; the mobile phone is in a moving state.
  • before controlling the first camera to operate in the first mode, or controlling the second camera to operate in the first mode, the method also includes: detecting that the state of the electronic device satisfies the second condition; the second condition includes at least one of the following: the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor of the electronic device and the reflected signal of the light signal is greater than the first threshold, and/or the signal strength of the reflected signal is less than the second threshold, and/or the proximity light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than the third threshold; the inner screen or outer screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • the electronic device controls the first camera to operate in the first mode and turns off the second camera.
  • the electronic device controls the first camera to operate in the first mode, the second camera continues to operate in the first mode, and the cameras on the inner and outer screens are turned on at the same time to expand the range in which the device can detect scenes.
  • the electronic device controls the first camera to turn off, and controls the second camera to operate in the first mode.
  • the electronic device controls the first camera to turn off.
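  • The camera selection and hand-over on a foldable device can be sketched as follows, with camera names as in FIG13; the function names and the hand-over logic are an illustrative reading of the description rather than the application's implementation:

      def select_first_mode_camera(folded: bool, outer_screen_on: bool, inner_screen_on: bool):
          if folded and outer_screen_on:
              return "camera 1"        # second camera, on the outer screen
          if not folded and inner_screen_on:
              return "camera 3"        # first camera, on the inner screen
          return None                  # no camera needs to run in the first mode

      def on_fold_state_change(folded: bool, active_camera):
          # unfolding: hand over from the outer-screen camera to the inner-screen camera
          if not folded and active_camera == "camera 1":
              return "camera 3"
          # folding: hand over from the inner-screen camera to the outer-screen camera
          if folded and active_camera == "camera 3":
              return "camera 1"
          return active_camera

      print(select_first_mode_camera(folded=True, outer_screen_on=True, inner_screen_on=False))  # camera 1
      print(on_fold_state_change(folded=False, active_camera="camera 1"))                        # camera 3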
  • FIG13 is a schematic diagram of the structure of a folding screen mobile phone provided in an embodiment of the present application.
  • the screen of the folding screen mobile phone includes a first screen, a second screen and a third screen
  • the first screen is the outer screen of the folding screen mobile phone
  • the second screen and the third screen are the inner screens of the folding screen mobile phone
  • the folding screen includes the second screen and the third screen
  • the folding screen is folded according to the folding edge shown in FIG13 (4) to form the second screen and the third screen.
  • the virtual axis where the folding screen is located is a common axis.
  • the inner screen refers to the screen located inside when the folding screen is in a folded state
  • the outer screen refers to the screen located outside when the folding screen is in a folded state.
  • the angle between the second screen and the third screen is the hinge angle of the folding screen mobile phone. Determining the hinge angle can determine the physical state of the folding screen.
  • the physical state includes a folded state as shown in FIG13 (3), an unfolded state as shown in FIG13 (4), or a bracket state as shown in FIG13 (2).
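  • As an illustration, the hinge angle can be mapped to the physical state roughly as follows; the angle ranges are assumptions, since the application does not specify numeric boundaries:

      def physical_state(hinge_angle_deg: float) -> str:
          if hinge_angle_deg <= 15:
              return "folded"      # as in FIG13 (3)
          if hinge_angle_deg >= 165:
              return "unfolded"    # as in FIG13 (4)
          return "bracket"         # partially open, as in FIG13 (2)

      print(physical_state(5), physical_state(90), physical_state(180))   # folded bracket unfolded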
  • the folding screen mobile phone shown in FIG13 includes three groups of cameras, which are respectively recorded as camera 1, camera 2 and camera 3.
  • camera 1 is set in the middle of the upper part of the first screen
  • camera 2 is set on the back panel
  • camera 3 is set in the middle of the upper part of the third screen.
  • the camera 1 of the foldable screen mobile phone is triggered to continuously capture the first image to detect whether there is a human face in the camera range.
  • the camera 3 of the foldable screen mobile phone is triggered to continuously capture the first image to detect whether there is a human face in the camera range.
  • if the foldable screen mobile phone meets the conditions for turning on the camera, the mobile phone is currently in the folded state, and camera 1 is turned on.
  • when the mobile phone changes to the unfolded state, camera 1 can be turned off and camera 3 can be turned on to detect whether there is a human face within the range of camera 3.
  • if the foldable screen mobile phone meets the conditions for turning on the camera, the mobile phone is currently in the unfolded state, and camera 3 is turned on.
  • when the mobile phone changes to the folded state, camera 3 can be turned off and camera 1 can be turned on to detect whether there is a human face within the range of camera 1.
  • foldable screen mobile phones of other structures can execute the scene perception method with reference to the foldable screen mobile phone shown in FIG13, and their implementation principles and technical effects are similar.
  • the embodiments of the present application do not impose any restrictions on the structural style of the foldable screen mobile phone.
  • FIG5 is a schematic diagram of the structure of an electronic device provided in the embodiment of the present application.
  • the electronic device 100 may include: a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
  • SIM subscriber identification module
  • the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently.
  • the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units, for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processor (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, a display processing unit (DPU), and/or a neural-network processing unit (NPU), etc.
  • Different processing units may be independent devices or integrated in one or more processors.
  • the electronic device 100 may also include one or more processors 110 .
  • processor 110 may include one or more interfaces.
  • the interface connection relationship between the modules illustrated in the embodiment of the present invention is only a schematic illustration and does not constitute a structural limitation on the electronic device 100.
  • the electronic device 100 may also adopt interface connection methods different from those in the above embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is used to receive charging input from the charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and provides power to the processor 110, the internal memory 121, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle number, and battery health status.
  • the power management module 141 can also be set in the processor 110.
  • the power management module 141 and the charging management module 140 can also be set in the same device.
  • the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • the mobile communication module 150 can provide a solution for wireless communication including 2G/3G/4G/5G etc. applied to the electronic device 100.
  • the wireless communication module 160 can provide wireless communication solutions for application in the electronic device 100, including wireless local area networks (WLAN), Bluetooth, global navigation satellite system (GNSS), frequency modulation (FM), NFC, infrared technology (IR), etc.
  • the electronic device 100 can realize the display function through the GPU, the display screen 194, and the application processor.
  • the GPU is a microprocessor for image processing, which connects the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 110 may include one or more GPUs, which execute instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, etc.
  • the display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a mini LED, a micro LED, a micro OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the electronic device 100 can implement a shooting function through an ISP, one or more cameras 193, a video codec, a GPU, one or more display screens 194, and an application processor.
  • NPU is a neural network (NN) computing processor.
  • through the NPU, applications such as intelligent cognition of the electronic device 100 can be realized, for example image recognition, face recognition, voice recognition, and text understanding.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, data files such as music, photos, and videos are stored in the external memory card.
  • the internal memory 121 may be used to store one or more computer programs, which include instructions.
  • the processor 110 may execute the instructions stored in the internal memory 121, thereby enabling the electronic device 100 to perform various functional applications and data processing.
  • the sensor 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
  • the gyro sensor 180B can be used to determine the motion posture of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 around three axes (i.e., x, y, and z axes) can be determined by the gyro sensor 180B. The gyro sensor 180B can be used for shooting anti-shake. The gyro sensor 180B can also be used for navigation, somatosensory game scenes, etc.
  • the magnetic sensor 180D is used to detect the magnetic field strength of the magnet, obtain magnetic data, and detect the physical state of the folding screen of the electronic device 100 through the magnetic data.
  • the magnet is used to generate a magnetic field.
  • the magnetic sensor 180D can be set in a body corresponding to the back plate shown in (1) of Figure 13, for example, and the magnet can be set in a body corresponding to the first screen shown in (1) of Figure 13, for example.
  • the magnet allows the magnetic sensor 180D to detect magnetic data. As the opening and closing state of the folding screen changes, the distance between the magnetic sensor 180D and the magnet changes accordingly, and the magnetic field strength of the magnet detected by the magnetic sensor 180D also changes.
  • the smart sensor hub can judge the physical state of the folding screen according to the magnetic data obtained by the magnetic sensor 180D under the magnetic field of the magnet.
  • the physical state includes, for example, an unfolded state, a bracket state, or a folded state (closed state).
  • the sensor 180 may also include a Hall sensor, which can also be used to detect the magnetic field strength of the magnet, output a high/low level, and determine the physical state of the folding screen of the electronic device 100 through the high/low level.
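  • as a minimal illustration of the fold-state detection described above, the following sketch maps a magnetic-sensor reading to the unfolded, bracket or folded state; the threshold values, names and units are assumptions for illustration only and are not specified by this embodiment.

```python
# Illustrative sketch: classify the folding-screen state from magnetic data.
# Thresholds are hypothetical; a stronger field means the magnet (first-screen
# body) is closer to the magnetic sensor (back-plate body).
from enum import Enum

class FoldState(Enum):
    UNFOLDED = "unfolded"
    BRACKET = "bracket"
    FOLDED = "folded"

BRACKET_THRESHOLD_UT = 150.0   # hypothetical threshold, in microtesla
FOLDED_THRESHOLD_UT = 600.0    # hypothetical threshold, in microtesla

def classify_fold_state(field_strength_ut: float) -> FoldState:
    """Map a magnetic field reading to the physical state of the folding screen."""
    if field_strength_ut >= FOLDED_THRESHOLD_UT:
        return FoldState.FOLDED
    if field_strength_ut >= BRACKET_THRESHOLD_UT:
        return FoldState.BRACKET
    return FoldState.UNFOLDED
```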
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in all directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device and is applied to applications such as horizontal and vertical screen switching and pedometers.
  • the distance sensor 180F is used to measure the distance.
  • the electronic device 100 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 can use the distance sensor 180F to measure the distance to achieve fast focusing.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light outward through the light emitting diode.
  • the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object near the electronic device 100.
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 180L is used to sense the brightness of the ambient light.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touches.
  • the key 190 includes a power key, a volume key, etc.
  • the key 190 may be a mechanical key or a touch key.
  • the electronic device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 100.
  • the electronic device mentioned above may also be referred to as a terminal device (terminal), a user equipment (UE), a mobile station (MS), a mobile terminal (MT), etc.
  • the electronic device may be a mobile phone with a touch screen, a wearable device, a tablet computer (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control (industrial control), a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in smart grid (smart grid), a wireless terminal in transportation safety (transportation safety), a wireless terminal in a smart city (smart city), a wireless terminal in a smart home (smart home), etc.
  • the embodiments of the present application do not limit the specific technology and specific device form adopted by the electronic device.
  • Fig. 6 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the electronic device may include an improved camera 601 and an improved processor 602.
  • the improved camera 601 refers to adding a control circuit and a working circuit corresponding to the newly added shooting mode in the existing camera module to achieve a low-power configuration.
  • for example, the shooting mode of the existing camera module is mode 1, and the camera module includes a working circuit corresponding to mode 1; if mode 2 is added as a shooting mode, switching between mode 1 and mode 2 is correspondingly involved.
  • the improved camera module includes not only the working circuit corresponding to mode 1, but also the working circuit corresponding to the newly added mode 2, and the control circuit corresponding to the switching between the two modes. It should be understood that more than two shooting modes can be set according to actual application requirements, and the embodiments of the present application do not impose any restrictions on this.
  • the improved camera 601 includes two working modes: a first mode and a second mode, the first mode can be called a low-power shooting mode, and the second mode can be called a normal shooting mode.
  • the resolution of the image captured by the camera 601 in the first mode is smaller than the resolution of the image captured in the second mode, and the frame rate of the image captured by the camera 601 in the first mode is smaller than the frame rate of the image captured in the second mode.
  • the camera 601 can switch between the two modes.
  • the camera 601 works in the first mode and can perform resident (always-on) scanning.
  • the camera 601 continuously captures a first image with a first resolution at a first frame rate to detect whether there is a person within the range of the camera 601, such as detecting whether the first image contains a person's face; if a person is detected, such as detecting that the first image contains a person's face, the camera 601 switches from the first mode to the second mode, and continuously captures a second image with a second resolution at a second frame rate to detect the category of the current scene, or to detect whether the person is looking at the screen, etc.
  • the first frame rate is less than the second frame rate
  • the first resolution is less than the second resolution.
  • the camera 601 can dynamically adjust the frame rate and resolution of the captured image to adapt to different needs. For example, the camera 601 captures images at a lower frame rate and resolution to detect whether there is a person's face within the range of the camera 601, and the camera 601 captures images at a higher frame rate and resolution to identify the scene category in which the person uses the electronic device.
  • the minimum frame rate of the camera 601 can be 1fps
  • the maximum frame rate can be 30fps.
  • the minimum resolution of the camera 601 can be 120×180, and the maximum resolution can be 480×640.
  • the maximum frame rate can also be 240fps, and the maximum resolution can also be 2736×3648.
  • the embodiments of the present application do not specifically limit the frame rate range and resolution range of the camera to collect images, that is, the minimum frame rate and maximum frame rate of the camera to collect images are not limited, nor are the minimum resolution and maximum resolution of the camera to collect images.
  • the frame rate range and resolution range can be reasonably set according to requirements.
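  • as an illustration of the two working modes of the camera 601, the following sketch records the example frame rates and resolutions given in this embodiment; the values are configurable examples, not fixed requirements.

```python
# Illustrative sketch of the two working modes of camera 601.
# The example values mirror this embodiment (5 fps / 120x180 for the first mode,
# 30 fps / 1920x1080 for the second mode) and can be set differently as needed.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class CameraMode:
    name: str
    frame_rate_fps: int
    resolution: Tuple[int, int]  # (width, height)

FIRST_MODE = CameraMode("low_power_shooting", frame_rate_fps=5, resolution=(120, 180))
SECOND_MODE = CameraMode("normal_shooting", frame_rate_fps=30, resolution=(1920, 1080))

# The first mode uses a lower frame rate and a lower resolution than the second mode.
assert FIRST_MODE.frame_rate_fps < SECOND_MODE.frame_rate_fps
assert (FIRST_MODE.resolution[0] * FIRST_MODE.resolution[1]
        < SECOND_MODE.resolution[0] * SECOND_MODE.resolution[1])
```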
  • the camera 601 may be a front camera of the electronic device or a rear camera of the electronic device, which is not limited in this embodiment.
  • the improved processor 602 may be a system on chip (SoC).
  • SoC system on chip
  • the camera 601 may send the image to the SoC for image analysis to detect whether there is a person's face in the image, whether the person is looking at the screen, the category of the current scene, etc.
  • the SoC can support a low-power AON ISP (Always On ISP). Referring to FIG. 7, the camera 601 transmits the image to the AON ISP. The AON ISP does not process any image effects except for format conversion. Then, the format-converted image is stored in on-chip static random access memory (on-chip SRAM). The SoC can also support extremely low-power cores; the computing and algorithm operation circuits and the image storage all work in low-power mode. In addition, the SoC can also support a low-power embedded neural network processor eNPU (embedded NPU).
  • FIG7 is a schematic diagram of the structure of a SoC provided in an embodiment of the present application.
  • the SoC includes a first processing unit and a second processing unit.
  • the first processing unit includes an image signal processor (ISP), a neural network processor (NPU) and a central processing unit (CPU)
  • the second processing unit includes an I2C bus interface, an AON ISP, an on-chip SRAM, a digital signal processor DSP and an eNPU.
  • the power consumption of the second processing unit is lower than that of the first processing unit.
  • the power consumption of the eNPU in the second processing unit is lower than that of the NPU in the first processing unit
  • the power consumption of the AON ISP in the second processing unit is lower than that of the ISP in the first processing unit.
  • the first processing unit can be used to process the second image of the second resolution captured by the camera 601.
  • the camera 601 captures the second image of the second resolution
  • the NPU detects the processed second image of the second resolution, such as detecting the category of the current scene, or detecting whether the person is looking at the screen.
  • when the first processing unit sends data (such as image data) to the memory, it can perform security processing (such as encryption processing) and store the data after security processing in a security buffer of the memory.
  • Security processing is used to protect the user's privacy data.
  • the second processing unit may be used to process the first image of the first resolution captured by the camera 601.
  • the camera 601 captures the first image of the first resolution
  • the AON ISP obtains the first image of the first resolution through the I2C bus interface.
  • the eNPU detects the processed first image of the first resolution, for example, detecting whether the first image contains a person's face.
  • the on-chip SRAM in the second processing unit may be used to store the processed first image of the first resolution
  • the DSP may be used to notify the eNPU to perform image detection, receive the detection results reported by the eNPU, and report the detection results to the upper-layer application.
  • the second processing unit adopts a low-power configuration to reduce the power consumption of the electronic device.
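  • the split described above can be illustrated with the following sketch, in which low-resolution first images are routed to the low-power second processing unit (AON ISP plus eNPU) for face-presence detection and high-resolution second images are routed to the first processing unit (ISP plus NPU) for scene recognition; all hardware blocks are represented by placeholder stubs, and the names and return values are assumptions for illustration only.

```python
# Illustrative routing sketch for the two processing units of the SoC.
# Every hardware block below is a stub standing in for the real component.

def aon_isp_format_convert(image):          # AON ISP: format conversion only
    return image

def enpu_detect_face(image) -> bool:        # eNPU: lightweight face-presence model
    return bool(image)                      # stub result

def isp_process(image):                     # full ISP pipeline in the first unit
    return image

def npu_classify_scene(image) -> str:       # NPU: scene-classification model
    return "unknown"                        # stub result

def route_frame(image, low_power_frame: bool) -> dict:
    """First images go to the second (low-power) unit; second images go to the first unit."""
    if low_power_frame:
        converted = aon_isp_format_convert(image)   # stored in on-chip SRAM in practice
        return {"face_present": enpu_detect_face(converted)}
    processed = isp_process(image)
    return {"scene": npu_classify_scene(processed)}
```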
  • the embodiment of the present application does not limit the format of the image data transmitted between the first processing unit or the second processing unit and the camera.
  • the image data can be mobile industry processor interface (MIPI) camera serial interface (CSI) data.
  • the software system of the electronic device can adopt a layered architecture, an event-driven architecture, a micro-core architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present application takes the Android system as an example of a software system with a layered architecture to illustrate the software structure of the electronic device.
  • Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • the layered architecture divides the software system of the electronic device into several layers, and each layer has a clear role and division of labor.
  • the layers communicate with each other through software interfaces.
  • the electronic device of the embodiment of the present application includes an application layer (Applications), an application framework layer (Application Framework), a hardware abstraction layer (Hardware Abstraction Layer, HAL), a kernel layer (Kernel), a sensor control center (Sensorhub) and a hardware layer.
  • the application layer can include a series of applications, and the application layer runs the applications by calling the application programming interface (API) provided by the application framework layer.
  • the application layer may include a scene perception application and a perception module.
  • the scene perception application is connected to the perception module, and the scene perception application is registered in the perception module.
  • the perception module performs state management and data transmission. For example, when the perception module learns from the second processing module in the Sensorhub that there is a human face within the camera range, the perception module notifies the first processing module in the HAL so that the first processing module can recognize the human face based on the image collected by the camera.
  • the scene category of the electronic device is finally reported by the perception module to the scene perception application.
  • the application layer also includes other applications (not shown in FIG. 8 ), such as a gaze-on-screen application and a gaze-on-always-on-display (AOD) application.
  • multiple applications correspond to the same algorithm, for example, the gaze-on-screen application and the gaze-on-AOD application correspond to the gaze detection algorithm, and the perception module can be used to uniformly schedule and manage the gaze detection algorithm.
  • different applications correspond to different algorithms, for example, the scene perception application corresponds to the scene recognition algorithm (the scene perception application also corresponds to the face (presence) detection algorithm), and the gaze-on-screen application corresponds to the gaze detection algorithm.
  • Both algorithms involve obtaining image data from the underlying camera, and the perception module can be used to schedule and manage the priorities among multiple algorithms.
  • the scene recognition algorithm and the gaze detection algorithm have the same priority, and the perception module can notify the underlying camera to report the image data to the scene perception application and the gaze-on-screen application at the same time.
  • the scene recognition algorithm can be deployed in the first processing module, and the face detection algorithm can be deployed in the second processing module.
  • the algorithms shown in this embodiment are only examples.
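  • a minimal sketch of how the perception module might schedule image data among registered applications by algorithm priority is given below; the class, the priorities and the dispatch policy are assumptions for illustration and do not correspond to an actual interface of the electronic device.

```python
# Illustrative sketch: the perception module registers applications against
# algorithms and reports image data to the applications whose algorithms hold
# the highest priority; equal priorities are served at the same time.
from collections import defaultdict

class PerceptionModule:
    def __init__(self):
        # algorithm name -> (priority, list of subscribing applications)
        self.algorithms = {}

    def register(self, app: str, algorithm: str, priority: int) -> None:
        existing_priority, apps = self.algorithms.get(algorithm, (priority, []))
        apps.append(app)
        self.algorithms[algorithm] = (existing_priority, apps)

    def dispatch(self) -> dict:
        """Return which applications receive the next image data, per algorithm."""
        if not self.algorithms:
            return {}
        top = max(priority for priority, _ in self.algorithms.values())
        served = defaultdict(list)
        for algo, (priority, apps) in self.algorithms.items():
            if priority == top:
                for app in apps:
                    served[app].append(algo)
        return dict(served)

pm = PerceptionModule()
pm.register("scene_perception_app", "scene_recognition", priority=1)
pm.register("gaze_on_screen_app", "gaze_detection", priority=1)
# Equal priorities: image data is reported to both applications at the same time.
print(pm.dispatch())
```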
  • the application layer further includes a third processing module, and the third processing module is used to obtain the physical state of the folding screen reported by the second processing module, and the state of the internal and external screen cameras (on or off).
  • the third processing module is also used to notify the first processing module of the state of the internal and external screen cameras.
  • the application may also include camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message and other applications, which may be system applications or third-party applications, and the embodiments of the present application do not limit this.
  • the application framework layer provides API and programming framework for the applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include a camera service (CameraService), which is used for priority scheduling and management of all applications that need to use the camera.
  • the application framework layer may also include, for example, a window manager, a content provider, a resource manager, a notification manager, a view system, etc., but the embodiments of the present application do not limit this.
  • the hardware abstraction layer may include an AO (always on) service and a first processing module.
  • the AO service may be used to control the on or off of a scene recognition algorithm in the first processing module, and to control the on or off of a face detection algorithm in the second processing module, as well as upper and lower layer data transmission.
  • the first processing module may be used to process images with higher resolution and/or higher frame rate, such as the second image described above, to detect the second image and identify the scene category in which the user uses the device.
  • the first processing module is also used for camera mode switching, for example, the first processing module receives a second instruction from the perception module and controls the camera to switch from the first mode to the second mode.
  • the kernel layer is a layer between hardware and software.
  • the kernel layer is used to drive the hardware to make the hardware work.
  • the kernel layer may include a camera driver, which is used to drive the camera in the electronic device to work in the first mode or the second mode to collect images with different frame rates and/or resolutions.
  • the kernel layer may also include a display driver, an audio driver, a sensor driver, a motor driver, etc., which are not limited in the embodiments of the present application.
  • the sensor driver may drive, for example, a proximity light sensor to emit a light signal to detect whether the user is currently holding the electronic device close to the ear to make a call, etc.
  • the sensor driver may also drive, for example, a gyroscope sensor to detect the posture data of the electronic device; the sensor driver may also drive, for example, an ambient light sensor to detect the ambient light brightness to detect whether the electronic device is in a dark environment, and a dark environment includes, for example, a mobile phone in a pocket, etc.
  • Sensorhub is used to realize centralized control of sensors to reduce CPU load.
  • Sensorhub is equivalent to a microcontroller unit (MCU), which can run programs used to drive multiple sensors.
  • MCU microprogrammed control unit
  • Sensorhub can support the ability to mount multiple sensors, and it can be placed as an independent chip between the CPU and various sensors, or it can be integrated into the application processor (AP) in the CPU.
  • the Sensorhub may include a second processing module, which may be used to process images with lower resolution and/or lower frame rate, such as the first image mentioned above, to detect whether there is a human face in the first image.
  • the second processing module is a low-power processing module, and the second processing module is resident or runs in a low-power form.
  • the second processing module is also used to obtain data reported by various sensors, and determine various states of the electronic device based on various sensor data, such as screen state, unlock state, use state, etc.
  • the second processing module may send a first shooting instruction to the camera, instructing the camera to perform resident scanning in a low-power shooting mode (such as the first mode) to detect whether there is a human face within the camera range.
  • the second processing module may determine whether to send a first shooting instruction to the camera (camera on the inner screen or outer screen) by detecting whether the physical state of the folding screen, the screen state, and the device state meet the second condition.
  • the second processing module when the second processing module detects that the physical state of the screen of the foldable screen mobile phone has changed, such as from a folded state to an unfolded state, or from an unfolded state to a folded state, the second processing module can control the camera of the foldable screen mobile phone (such as the camera on the outer screen of the mobile phone, and/or the camera on the inner screen of the mobile phone) to turn on or off.
  • the hardware layer may include, for example, cameras, various sensors, and AON ISP.
  • the layers in the hierarchical structure shown in FIG8 and the modules or components contained in each layer do not constitute a specific limitation on the electronic device.
  • the electronic device may include more or fewer layers than shown, and each layer may include more or fewer components, which is not limited in this application.
  • the modules included in each layer shown in FIG8 are modules involved in the embodiments of the present application, and the modules included in each layer do not constitute a limitation on the structure of the electronic device and the level (example description) of module deployment.
  • the modules shown in FIG8 can be deployed separately, or several modules can be deployed together, and the division of modules in FIG8 is an example.
  • the names of the modules shown in FIG8 are examples.
  • FIG9 is a flow chart of a scene perception method provided in an embodiment of the present application. As shown in FIG9 , the scene perception method provided in this embodiment includes:
  • Step 901 The target application registers the scene perception function in the perception module.
  • the target application is a scene perception application at the application layer.
  • the target application obtains scene perception function information from a server (or cloud). After obtaining the scene perception function information, the target application can register the scene perception function in the perception module so that the perception module can run resident and execute matters related to scene perception.
  • the scene perception function information includes preset operations corresponding to different scene categories, etc.
  • Step 902 The perception module determines whether to start the scene perception function.
  • if the perception module determines to start the scene perception function, step 903 is executed.
  • the system application or the third-party application sends a notification to the perception module to inform the perception module that the user has turned on the scene perception function.
  • Step 903 The perception module sends a first instruction to the second processing module, where the first instruction is used to instruct the second processing module to detect whether there is a human face within the camera range of the electronic device.
  • the perception module sends the first instruction to control the startup of the second processing module, including module power-on, work scenario delivery, resource preparation, etc.
  • Step 904 In response to the first instruction, the second processing module sends a first shooting instruction to the camera, where the first shooting instruction is used to instruct the camera to work in the first mode.
  • the second processing module sends the first shooting instruction to control the startup of the camera, including powering on the camera, mode switching, image output resolution and frame rate settings, etc.
  • the camera continuously captures images at a lower frame rate and/or resolution in the first mode, namely, the first image.
  • Step 905 The camera sends the first image to the second processing module.
  • the camera captures a first image of a first resolution at a first frame rate in a first mode.
  • the first frame rate may be set to 5 fps
  • the first resolution may be set to 120×180 or 640×480.
  • Step 906 The second processing module identifies whether the first image contains a human face.
  • if the second processing module recognizes a human face in the first image, step 907 is executed; otherwise, the second processing module continues to execute step 906 unless the camera is controlled to be turned off.
  • the second processing module is pre-installed with a facial recognition model, which may be trained using a lightweight neural network model and is used to identify whether an image contains a human face.
  • the facial recognition model can be deployed on the eNPU of the second processing module, with good real-time performance.
  • Step 907 The second processing module sends a first message to the perception module, where the first message is used to notify the perception module that a human face is recognized within the range of the electronic device camera.
  • Step 908 The perception module sends a second instruction to the first processing module, and the second instruction is used to instruct the first processing module to identify the category of the scene within the camera range.
  • the perception module sends the second instruction to control the startup of the first processing module, including module power-on, work scene delivery, resource preparation, etc.
  • Step 909 The first processing module sends a second shooting instruction to the camera in response to the second instruction, where the second shooting instruction is used to instruct the camera to work in the second mode.
  • the first processing module sends a second shooting instruction to control the camera to switch modes, that is, switch from the first mode to the second mode.
  • the camera continuously captures images, that is, second images, at a higher frame rate and/or resolution in the second mode.
  • Step 910 The camera sends a second image to the first processing module.
  • in response to the second shooting instruction, the camera captures a second image of a second resolution at a second frame rate in the second mode, and sends the second image to the first processing module.
  • the second frame rate can be set to 30fps
  • the second resolution can be set to 1920×1080.
  • Step 911 The first processing module identifies a scenario category in which a user uses the electronic device based on the second image.
  • the first processing module is preset with a scene detection model
  • the scene detection model can be trained using a lightweight neural network model to identify the scene category corresponding to the second image.
  • the scene detection model can be deployed on the NPU of the first processing module and has good real-time performance.
  • Step 912 The first processing module sends a second message to the perception module.
  • the second message is used to indicate the scenario category in which the user uses the electronic device.
  • the second message includes an identifier of the scenario category.
  • Step 913 The perception module sends a third indication to the target application, where the third indication is used to indicate the category of the scenario in which the user uses the electronic device.
  • the third indication includes an identifier of the scenario category.
  • Step 914 In response to the third indication, the target application controls execution of a preset operation corresponding to the scenario category.
  • the target application pre-stores information about the scene perception function, including preset operations corresponding to different scene categories.
  • the preset operations include at least one of the following: adjusting the volume; adjusting the screen brightness; adjusting the screen blue light; adjusting the vibration intensity; sending a first message, the first message is used to remind the user to stop using the electronic device; sending a second message, the second message is used to recommend content corresponding to the scene category; turning on the rear camera to detect obstacles.
  • the target application obtains a third indication from the perception module, determines a preset operation corresponding to the scene category based on the scene category identifier in the third indication and pre-stored scene perception function information, and then controls the execution of the preset operation corresponding to the scene category.
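  • a small sketch of the category-to-operation lookup performed by the target application is shown below; the scene category identifiers and the operations listed are illustrative examples consistent with this embodiment, not an exhaustive or mandated mapping.

```python
# Illustrative sketch: the target application's mapping from scene category
# identifiers to preset operations (step 914). Category names and operations
# are example placeholders.
PRESET_OPERATIONS = {
    "conference_room": [("set_volume", "mute")],
    "classroom": [("set_volume", "low")],
    "subway_station": [("set_volume", "high"), ("set_vibration", "strong")],
    "walking_outdoors": [("enable_rear_camera_obstacle_detection", True)],
}

def handle_third_indication(scene_category_id: str) -> list:
    """Look up the preset operations for the scene category reported by the perception module."""
    return PRESET_OPERATIONS.get(scene_category_id, [])
```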
  • the first processing module may correspond to the first processing unit shown in FIG. 7
  • the second processing module may correspond to the second processing unit shown in FIG. 7 .
  • the perception module can send a first shooting instruction to the camera through the second processing module (low power processing module), so that the camera captures the first image at a lower frame rate and/or resolution.
  • the second processing module recognizes that the first image contains a person's face, it can notify the first processing module so that the first processing module sends a second shooting instruction to the camera, so that the camera captures the second image at a higher frame rate and/or resolution.
  • the first processing module identifies the scene category corresponding to the second image and informs the application of the scene category so that the application performs the preset operation corresponding to the scene category.
  • the electronic device after the user turns on the scene perception function, the electronic device performs face detection with low power consumption, performs scene detection when it is determined that the user is using the electronic device, and automatically sets system parameters or pushes notifications based on the scene detection results, thereby realizing the function of intelligent scene perception of the electronic device. Since both the camera and the second processing module are configured with low power consumption, the power consumption of executing this solution is extremely low.
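  • the overall message flow of steps 901 to 914 can be summarized with the following sketch, in which each component is a plain object and all method names, messages and return values are illustrative placeholders rather than actual interfaces of the electronic device.

```python
# Condensed, illustrative sketch of the flow in FIG. 9 (steps 901-914).

class Camera:
    def __init__(self):
        self.mode = None
    def set_mode(self, mode: str):          # first / second shooting instruction
        self.mode = mode
    def capture(self) -> dict:
        return {"mode": self.mode}          # dummy frame tagged with the active mode

class SecondProcessingModule:               # low power: face-presence detection
    def __init__(self, camera: Camera):
        self.camera = camera
    def handle_first_indication(self) -> bool:
        self.camera.set_mode("first")       # step 904: first shooting instruction
        frame = self.camera.capture()       # step 905: first image
        return self.detect_face(frame)      # step 906: face detection
    def detect_face(self, frame) -> bool:
        return True                         # stub: a face is found

class FirstProcessingModule:                # higher power: scene recognition
    def __init__(self, camera: Camera):
        self.camera = camera
    def handle_second_indication(self) -> str:
        self.camera.set_mode("second")      # step 909: second shooting instruction
        frame = self.camera.capture()       # step 910: second image
        return self.classify_scene(frame)   # step 911: scene recognition
    def classify_scene(self, frame) -> str:
        return "conference_room"            # stub scene category

def scene_perception_flow():
    camera = Camera()
    second = SecondProcessingModule(camera)
    first = FirstProcessingModule(camera)
    if second.handle_first_indication():            # steps 903-907
        return first.handle_second_indication()     # steps 908-912, reported in 913-914
    return None

print(scene_perception_flow())
```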
  • FIG10 is a flow chart of a scene perception method provided in an embodiment of the present application. Based on the embodiment shown in FIG9 , as shown in FIG10 , the scene perception method provided in this embodiment includes:
  • Step 1010 The target application registers the scene perception function in the perception module.
  • Step 1011 The perception module determines whether to start the scene perception function.
  • if the perception module determines to start the scene perception function, step 1012 is executed.
  • Step 1012 The perception module sends a first indication to the second processing module, where the first indication is used to instruct the second processing module to detect whether there is a human face within the camera range of the electronic device.
  • Step 1013 The second processing module determines whether the first condition is met.
  • if the first condition is met, step 1014 is executed; otherwise, the second processing module continues the low-power detection to determine whether the first condition is met.
  • the first condition includes at least one of the following:
  • the screen status of the electronic device is on; the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor of the electronic device and the reflected signal of the light signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the receiving light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • if the first condition is met, the camera of the electronic device is triggered to continuously capture the first image.
  • in this way, the electronic device is prevented from continuously capturing the first image when it is not necessary, thereby further reducing the power consumption of the device.
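  • a minimal sketch of the first-condition check in step 1013 is given below; the field names and the threshold are assumptions for illustration, and while this embodiment only requires at least one of the listed items to hold, the sketch checks one possible combination of all of them.

```python
# Illustrative sketch of the first-condition check (step 1013).
# The embodiment requires at least one of these items; this sketch shows one
# possible configuration that checks them all. Names and threshold are assumed.
from dataclasses import dataclass

@dataclass
class DeviceState:
    screen_on: bool
    unlocked: bool
    proximity_covered: bool      # derived from the proximity light sensor
    ambient_light: float         # ambient light sensor reading
    screen_facing_preset: bool   # screen facing the preset direction
    moving: bool

AMBIENT_LIGHT_THRESHOLD = 10.0   # hypothetical third threshold

def first_condition_met(state: DeviceState) -> bool:
    """Return True only when continuous low-power capture is worthwhile."""
    return (state.screen_on
            and state.unlocked
            and not state.proximity_covered
            and state.ambient_light > AMBIENT_LIGHT_THRESHOLD
            and state.screen_facing_preset
            and state.moving)
```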
  • Step 1014 The second processing module sends a first shooting instruction to the camera, where the first shooting instruction is used to instruct the camera to work in the first mode.
  • the foldable device includes an inner screen and an outer screen, the inner screen is correspondingly provided with a first camera, and the outer screen is correspondingly provided with a second camera.
  • step 1013 can be replaced by: the second processing module detects that the inner screen of the electronic device is in a bright screen state, and the electronic device is in an unfolded state.
  • step 1014 can be: the second processing module sends a first shooting instruction to the first camera.
  • before the second processing module sends the first shooting instruction to the first camera, the method also includes: detecting that the state of the electronic device satisfies the second condition, where the second condition includes at least one of the following: the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the inner screen) of the electronic device and the reflected signal of the light signal is greater than the first threshold, and/or the signal strength of the reflected signal is less than the second threshold, and/or the receiving light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than the third threshold; the inner screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • step 1013 can be replaced by: the second processing module detects that the external screen of the electronic device is in a bright screen state, and the electronic device is in a folded state.
  • step 1014 can be: the second processing module sends a first shooting instruction to the second camera.
  • before the second processing module sends the first shooting instruction to the second camera, the method also includes: detecting that the state of the electronic device satisfies the second condition, where the second condition includes at least one of the following: the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the external screen) of the electronic device and the reflected signal of the light signal is greater than the first threshold, and/or the signal strength of the reflected signal is less than the second threshold, and/or the receiving light sensor does not receive the reflected signal; the detection data of the ambient light sensor of the electronic device is greater than the third threshold; the external screen of the electronic device is facing a preset direction; the electronic device is in a moving state.
  • Step 1015 The camera sends the first image to the second processing module.
  • Step 1016 The second processing module identifies whether the first image contains a person's face.
  • if the second processing module recognizes the person's face in the first image, step 1017 is executed; if the second processing module does not recognize the person's face in the first image, the process jumps back to step 1013.
  • Step 1017 The second processing module sends a first message to the perception module, where the first message is used to notify the perception module that a human face is recognized within the range of the electronic device camera.
  • Step 1018 The perception module sends a second indication to the first processing module, where the second indication is used to instruct the first processing module to identify the category of the scene within the camera range.
  • Step 1019 The first processing module sends a second shooting instruction to the camera in response to the second instruction, where the second shooting instruction is used to instruct the camera to work in the second mode.
  • Step 1020 The camera sends the second image to the first processing module.
  • Step 1021 The first processing module recognizes a scenario category in which a user uses the electronic device based on the second image.
  • Step 1022 The first processing module sends a second message to the perception module, where the second message is used to indicate a scenario category in which the user uses the electronic device.
  • Step 1023 The perception module sends a third indication to the target application, where the third indication is used to indicate a scenario category in which the user uses the electronic device.
  • Step 1024 The target application controls execution of a preset operation corresponding to the scene category.
  • the perception module can send a first shooting instruction to the camera through the second processing module, so that the camera captures the first image at a lower frame rate and/or resolution. If the second processing module recognizes that the first image contains a human face, it can notify the first processing module, and the first processing module sends a second shooting instruction to the camera, so that the camera captures the second image at a higher frame rate and/or resolution.
  • the first processing module identifies the second image, determines the scene category corresponding to the second image, and informs the application of the scene category so that the application can execute the preset operation corresponding to the scene category, such as automatically setting system parameters or pushing notifications, to realize the function of intelligently sensing the scene of the device.
  • the power consumption of executing the above scheme is extremely low.
  • adding the first condition can prevent the electronic device from performing unnecessary face detection, thereby further reducing the power consumption of the device.
  • the embodiment of the present application also provides a scene perception method, which is applied to an electronic device with a flexible screen.
  • the following uses a folding screen mobile phone as an example for explanation.
  • a group of cameras are respectively set on the inner screen and the outer screen of the folding screen mobile phone to collect image data.
  • the scene perception method of this embodiment involves the processing logic of the underlying module of the device when the folding screen state changes, which is explained below in conjunction with Figure 14.
  • FIG14 is a flow chart of a scene perception method provided in an embodiment of the present application.
  • the scene perception method of this embodiment may include the following steps:
  • Step 1401 The second processing module obtains sensor data to determine the change in the physical state of the folding screen.
  • the sensor data includes, for example, data from a magnetic sensor, a Hall sensor, etc., and is obtained to determine whether the physical state of the folding screen has changed.
  • the change in the physical state of the folding screen includes from a folded state to an unfolded state, or from an unfolded state to a folded state.
  • Step 1402a The second processing module controls turning on or off the camera on the inner screen based on the change in the physical state of the folding screen.
  • Step 1402b The second processing module controls turning on or off the camera on the external screen based on the change in the physical state of the folding screen.
  • the second processing module may control the turning on of the camera on the inner screen and/or the camera on the outer screen.
  • for example, if the camera on the outer screen is turned on and the physical state of the folding screen changes from the folded state to the unfolded state, the second processing module can control the camera on the outer screen to be turned off, and at the same time control the camera on the inner screen to be turned on.
  • if the camera on the inner screen is turned on and the physical state of the folding screen changes from the unfolded state to the folded state, the second processing module can control the camera on the inner screen to be turned off and the camera on the outer screen to be turned on.
  • the camera on the outer screen may be camera 1 shown in FIG. 13
  • the camera on the inner screen may be camera 3 shown in FIG. 13 .
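  • the camera toggling in steps 1402a and 1402b can be sketched as follows; the class and method names are illustrative placeholders, not interfaces of the electronic device.

```python
# Illustrative sketch of steps 1402a/1402b: when the fold state changes, the
# camera on the screen that is no longer in use is turned off and the camera on
# the newly active screen is turned on.

class SwitchableCamera:
    def __init__(self, name: str):
        self.name = name
        self.on = False
    def turn_on(self):
        self.on = True
    def turn_off(self):
        self.on = False

def on_fold_state_change(old_state: str, new_state: str, cameras: dict) -> None:
    """cameras maps 'inner'/'outer' to SwitchableCamera objects."""
    if old_state == "folded" and new_state == "unfolded":
        cameras["outer"].turn_off()     # e.g. camera 1 in FIG. 13
        cameras["inner"].turn_on()      # e.g. camera 3 in FIG. 13
    elif old_state == "unfolded" and new_state == "folded":
        cameras["inner"].turn_off()
        cameras["outer"].turn_on()
    # The new physical state and camera states would then be reported to the
    # third processing module (step 1402c), which notifies the first processing module.

cams = {"inner": SwitchableCamera("camera 3"), "outer": SwitchableCamera("camera 1")}
cams["outer"].turn_on()
on_fold_state_change("folded", "unfolded", cams)
```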
  • Step 1402c The second processing module reports the physical state of the folding screen and the state of the internal and external screen cameras to the third processing module.
  • Step 1403 The third processing module sends a notification to the first processing module to notify the status of the internal and external screen cameras.
  • This embodiment shows the interaction process between various modules inside the foldable screen mobile phone when the physical state of the screen of the mobile phone changes.
  • precise control of the low-power continuous scanning function of the camera is achieved, so that when the user uses the foldable screen mobile phone, the mobile phone can intelligently identify the current scene of the mobile phone, such as classroom, conference room, subway station, etc., and then automatically set the mobile phone system parameters (such as volume, vibration intensity, etc.), thereby improving the user experience.
  • Figure 11 is a structural diagram of an electronic device provided in an embodiment of the present application. As shown in Figure 11, the electronic device includes a camera 1106, a processor 1101, a communication line 1104 and at least one communication interface (communication interface 1103 is used as an example in Figure 11).
  • the camera 1106 may be used to capture images with different frame rates and/or resolutions, and the processor 1101 may be used to detect whether there is a human face in the image and to recognize the scene.
  • the processor 1101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the present application.
  • the processor 1101 includes a first processing module and a second processing module, and the power consumption of the first processing module is higher than that of the second processing module; the second processing module may be used to detect whether there is a human face in the first image captured by the camera in the first mode, and the first processing module may be used to identify the second image captured by the camera in the second mode, and determine the scene category in which the user uses the electronic device.
  • the communication line 1104 may include circuitry to transmit information between the above-described components.
  • the communication interface 1103 uses any transceiver-like device for communicating with other devices or communication networks, such as Ethernet, wireless local area networks (WLAN), etc.
  • the electronic device may further include a memory 1102 .
  • the memory 1102 may be a read-only memory (ROM) or other types of static storage devices that can store static information and instructions, a random access memory (RAM) or other types of dynamic storage devices that can store information and instructions, or an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, optical disc storage (including compressed optical disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • the memory may be independent and connected to the processor via a communication line 1104. The memory may also be integrated with the processor.
  • the memory 1102 is used to store computer-executable instructions for executing the solution of the present application, and the execution is controlled by the processor 1101.
  • the processor 1101 is used to execute the computer-executable instructions stored in the memory 1102, thereby realizing the scene perception method provided in the embodiment of the present application.
  • the electronic device further includes a display screen 1207 , and the display screen 1207 may be a folding screen.
  • the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which is not specifically limited in the embodiments of the present application.
  • the processor 1101 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 11 .
  • an electronic device may include multiple processors, such as processor 1101 and processor 1105 in FIG. 11.
  • processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the processor here may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
  • FIG12 is a schematic diagram of the structure of a chip provided in an embodiment of the present application.
  • the chip 120 includes one or more (including two) processors 1220 and a communication interface 1230 .
  • the memory 1240 stores the following elements: executable modules or data structures, or a subset of executable modules or data structures, or an extended set of executable modules or data structures.
  • the memory 1240 may include a read-only memory and a random access memory, and provide instructions and data to the processor 1220.
  • a portion of the memory 1240 may also include a non-volatile random access memory (NVRAM).
  • the processor 1220, the communication interface 1230 and the memory 1240 are connected via the bus system 1210.
  • the bus system 1210 may include a power bus, a control bus, a status signal bus, etc. in addition to the data bus.
  • various buses are labeled as the bus system 1210 in FIG. 12 .
  • the method described in the above embodiment of the present application can be applied to the processor 1220, or implemented by the processor 1220.
  • the processor 1220 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 1220 or an instruction in the form of software.
  • the above processor 1220 can be a general-purpose processor (for example, a microprocessor or a conventional processor), a digital signal processor (digital signal processing, DSP), an application specific integrated circuit (application specific integrated circuit, ASIC), a field-programmable gate array (field-programmable gate array, FPGA) or other programmable logic devices, discrete gates, transistor logic devices or discrete hardware components.
  • the processor 1220 can implement or execute the disclosed methods, steps and logic block diagrams in the embodiments of the present invention.
  • the instructions stored in the memory for execution by the processor may be implemented in the form of a computer program product.
  • the computer program product may be pre-written in the memory, or may be downloaded and installed in the memory in the form of software.
  • the present application also provides a computer program product, which includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on an electronic device, the electronic device executes the technical solution in the above embodiment, and its implementation principle and technical effect are similar to those of the above related embodiments, which will not be repeated here.
  • Computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means.
  • An embodiment of the present application also provides a computer-readable storage medium, which stores computer instructions.
  • when the computer instructions are executed on an electronic device, the electronic device executes the technical solution in the above embodiment.
  • the implementation principle and technical effect are similar to those of the above-mentioned related embodiments and will not be repeated here.
  • Computer-readable storage media may include computer storage media and communication media, and may also include any medium that can transfer a computer program from one place to another.
  • Computer-readable storage media may include: compact disc read-only memory (CD-ROM), RAM, ROM, EEPROM or other optical disk storage; computer-readable storage media may also include magnetic disk storage or other magnetic storage devices.
  • any connecting line may also be appropriately referred to as a computer-readable storage medium. For example, if the software is transmitted from a website, server or other remote source using coaxial cable, fiber optic cable, twisted pair, DSL or wireless technology (such as infrared, radio and microwave), coaxial cable, fiber optic cable, twisted pair, DSL or wireless technology such as infrared, radio and microwave are included in the definition of medium.
  • Disks and optical discs as used herein include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while optical discs reproduce data optically using lasers.
  • The user information involved in this application (including but not limited to user device information, user personal information, user facial information, etc.) and data (including but not limited to data used for analysis, stored data, displayed data, etc.) are information and data authorized by the user or fully authorized by all parties.
  • The collection, use and processing of the relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions, and corresponding operation entrances are provided for users to choose to authorize or refuse.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Studio Devices (AREA)

Abstract

本申请提供一种场景感知方法、设备及存储介质,涉及终端技术领域以及人工智能领域,还涉及智能感知、智能控制、智能推荐等技术领域。该方法包括:电子设备在开启智能感知功能后,摄像头以第一模式运行。电子设备检测到摄像头范围内有人物面部时,控制摄像头以第二模式运行。电子设备获取检测数据,检测数据至少包括摄像头在第二模式下采集的图像数据,基于检测数据中的图像数据确定人物使用电子设备的场景类别,并执行与该场景类别对应的预设操作,实现智能感知场景和场景控制,提升用户的用机体验。

Description

场景感知方法、设备及存储介质
本申请要求于2022年11月30日提交中国国家知识产权局、申请号为202211521455.9、申请名称为“场景感知方法、设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及终端技术领域,尤其涉及场景感知方法、设备及存储介质。
背景技术
随着智能终端的普及,用户可以随时随地使用智能终端。用户对不同的使用场景,通常具有不同的使用需求,例如用户在会议室、教室等场景,通常需要设置静音或调低音量,又例如用户在车站、地铁站等场景,通常需要调高音量或震动强度。
相关技术中,在不同的使用场景下,用户需要手动进行系统设置,例如打开系统设置界面,在该界面选择并调整相关参数,用户体验有待提升。
发明内容
本申请实施例提供一种场景感知方法、设备及存储介质,实现智能感知场景和场景控制,提升用户的用机体验。
第一方面,本申请实施例提出一种场景感知方法,应用于电子设备,该方法中,电子设备的摄像头以第一模式运行;电子设备获取摄像头在第一模式下采集的第一图像,检测第一图像中是否有人物面部;若电子设备在第一图像中检测到人物面部,控制摄像头由第一模式切换至第二模式;电子设备获取检测数据,检测数据包括摄像头在第二模式下采集的第二图像;电子设备识别第二图像,基于第二图像确定人物使用电子设备的场景类别,控制执行与场景类别对应的预设操作。
该方案中,摄像头在第一模式采集第一图像,若检测到图像中有人物面部,则切换至第二模式,在第二模式采集第二图像,并通过识别第二图像确定电子设备当前的场景类别,进而执行与场景类别对应的操作,实现智能感知场景和场景控制,提升用户的用机体验。此外,通过获取第一模式下摄像头采集的第一图像,检测人物面部有无,再经摄像头模式的切换,获取第二模式下摄像头采集的第二图像,识别场景类别,可在一定程度上降低设备功耗。
作为一种示例,第一图像的分辨率小于第二图像的分辨率,和/或,第一图像的帧率小于第二图像的帧率。
在第一方面的一个可选实施例中,电子设备的摄像头以第一模式运行之前,该方法还包括:电子设备响应于开启场景感知功能的第一操作。
该方案中,限定了设备开启摄像头的条件,以触发设备执行智能感知场景类别的方案。
在第一方面的一个可选实施例中,电子设备的摄像头以第一模式运行之前,该方法还 包括:检测到电子设备的状态满足第一条件;第一条件包括以下至少一项:电子设备的屏幕状态为亮屏状态;电子设备已解锁;电子设备的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的屏幕朝向预设方向;电子设备处于移动状态。
该方案中,进一步限定了设备开启摄像头的条件,除了用户手动开启场景感知功能外,通过增设第一条件,以避免摄像头在非必要时持续采集图像,降低设备功耗。非必要时可以理解为摄像头不可能采集到人物面部的任意一种场景。
在第一方面的一个可选实施例中,电子设备为可折叠设备,可折叠设备包括内屏和外屏,内屏对应设置有第一摄像头,外屏对应设置有第二摄像头;电子设备的摄像头以第一模式运行,包括:检测到电子设备的外屏为亮屏状态,且电子设备处于折叠状态,控制第二摄像头以第一模式运行;或者,检测到电子设备的内屏为亮屏状态,且电子设备处于展开状态,控制第一摄像头以第一模式运行。
该方案可应用于可折叠设备,若设备在折叠状态或展开状态下,有相应的屏幕(内屏或外屏)处于亮屏状态,可控制开启处于亮屏状态的屏幕上的摄像头,以检测摄像头范围内是否有人物面部,进而可触发识别电子设备场景。
在第一方面的一个可选实施例中,在控制第一摄像头以第一模式运行,或者控制第二摄像头以第一模式运行之前,该方法还包括:检测到电子设备的状态满足第二条件;第二条件包括以下至少一项:电子设备已解锁;电子设备的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的内屏或外屏朝向预设方向;电子设备处于移动状态。
该方案中,进一步限定了可折叠设备开启摄像头的条件,除了用户开启场景感知功能、电子设备屏幕处于折叠状态或展开状态时,有相应屏幕点亮外,通过增设第二条件,以避免摄像头在非必要时持续采集第一图像,降低设备功耗。
在第一方面的一个可选实施例中,该方法还包括:当第二摄像头以第一模式运行时,若检测到电子设备由折叠状态至展开状态时,电子设备控制第一摄像头以第一模式运行,并关闭第二摄像头;或者,电子设备控制第一摄像头以第一模式运行;当第一摄像头以第一模式运行时,若检测到电子设备由展开状态至折叠状态时,电子设备控制第一摄像头关闭,并控制第二摄像头以第一模式运行。
该方案中,在其他条件不变时,若用户改变设备屏幕的物理状态,如由折叠状态至展开状态,或者由展开状态至折叠状态,可通过切换摄像头,持续采集第一图像,以便设备屏幕在新的物理状态下,还能够实现智能感知场景的功能。
在第一方面的一个可选实施例中,检测数据还包括时间数据,该方法还包括:若确定时间数据在预设时间段内,将预设时间段对应的预设场景类别作为电子设备的场景类别。
该方案中,基于电子设备的时钟信息,可获取设备在当前时段可能的场景类别,以辅助设备感知场景。
在第一方面的一个可选实施例中,检测数据还包括位置数据,该方法还包括:若确定位置数据在预设位置范围内,将预设位置范围对应的预设场景类别作为电子设备的场景类 别。
该方案中,基于电子设备的位置信息,可获取设备在当前位置可能的场景类别,以辅助设备感知场景。
在第一方面的一个可选实施例中,检测数据还包括语音数据,该方法还包括:若识别到语音数据中包含一个声源或少于N个声源,确定电子设备的场景类别为第一场景,N为大于或等于2的正整数;若识别到语音数据中包含大于M个声源,确定电子设备的场景类别为第二场景,M为大于N的正整数。本实施例中,第一场景可以是较为安静的场景,如会议室或教室的场景。第二场景可以是较为嘈杂的场景,如地铁站、车站的场景。
该方案中,基于电子设备所处环境的语音信息,可获知设备在当前环境可能的场景类别,以辅助设备感知场景。
在第一方面的一个可选实施例中,检测数据还包括电子设备的第一传感器的数据,第一传感器包括陀螺仪传感器和加速度传感器;该方法还包括:电子设备基于检测数据中第二图像以及第一传感器的数据,确定电子设备的场景类别。
该方案中,基于摄像头采集的第二图像,结合电子设备中传感器数据,可获知设备当前可能的场景类别,可提升设备感知场景的准确度。
在第一方面的一个可选实施例中,电子设备基于检测数据中第二图像以及第一传感器的数据,确定电子设备的场景类别,包括:若基于第一传感器的数据确定用户处于运动状态,且基于第二图像确定用户持续注视电子设备的屏幕,确定电子设备的场景类别为第三场景;运动状态包括步行或骑行状态。本实施例中,第三场景可以是步行、骑行注视屏幕的场景,属于不安全使用电子设备的场景。
该方案中,基于电子设备的传感器数据和图像数据,分别检测用户的运动状态以及用户的眼部状态(如持续注视屏幕的状态),可确定用户是否处于不安全使用电子设备的场景,实现对该场景的感知能力。
在第一方面的一个可选实施例中,检测数据还包括电子设备的第二传感器的数据,第二传感器包括环境光传感器;该方法还包括:若确定第二传感器的数据小于第四阈值,确定电子设备的场景类别为第四场景。本实施例中,第四场景可以是暗环境的场景,例如卧室或睡眠场景。
该方案中,通过检测电子设备所处环境的环境光数据,可确定用户是否处于暗环境,以辅助设备感知场景。该方案还可以结合电子设备的时钟信息、位置信息等,以提升设备感知场景的准确度。
在第一方面的一个可选实施例中,预设操作包括以下至少一种:调节音量大小;调节屏幕亮度;调节屏幕蓝光;调节震动强度;发送第一信息,第一信息用于提醒用户停止使用电子设备;发送第二信息,第二信息用于推荐与场景类别对应的内容;开启后置摄像头,用于检测障碍物。
该方案中,不同的场景类别可以对应不同的操作,以实现设备在感知场景类别后的智能控制。
在第一方面的一个可选实施例中,若预设操作为开启后置摄像头,该方法还包括:电子设备获取后置摄像头在第二模式下采集的第三图像;若识别到第三图像存在障碍物,发送第三信息,第三信息用于提醒用户避开障碍物。
该方案主要针对上述的第三场景,通过开启后置摄像头,以检测设备周围环境是否存在障碍物,以便及时提醒用户避让障碍物,提升用户的用机体验。
在第一方面的一个可选实施例中,电子设备的摄像头以第一模式运行,包括:电子设备的感知模块向电子设备的第二处理模块发送第一指示,第一指示用于指示第二处理模块检测摄像头范围内是否有人物面部;第二处理模块向摄像头发送第一拍摄指令;摄像头响应于第一拍摄指令,以第一模式运行。
在第一方面的一个可选实施例中,电子设备获取摄像头在第一模式下采集的第一图像,检测第一图像中是否有人物面部,包括:电子设备的第二处理模块获取摄像头在第一模式下采集的第一图像,检测第一图像中是否有人物面部。
在第一方面的一个可选实施例中,若电子设备在第一图像中检测到人物面部,控制摄像头由第一模式切换至第二模式,包括:若电子设备的第二处理模块检测到第一图像中有人物面部,第二处理模块向电子设备的感知模块发送第一消息,第一消息用于通知感知模块摄像头范围内有人物面部;感知模块向电子设备的第一处理模块发送第二指示,第二指示用于指示第一处理模块识别摄像头范围内场景的类别;第一处理模块响应于第二指示,向摄像头发送第二拍摄指令,第二拍摄指令用于指示摄像头以第二模式运行。
在第一方面的一个可选实施例中,电子设备识别第二图像,基于第二图像确定人物使用电子设备的场景类别,控制执行与场景类别对应的预设操作,包括:电子设备的第一处理模块识别第二图像,基于第二图像确定电子设备的场景类别,向电子设备的感知模块发送第二消息,第二消息用于指示电子设备的场景类别;感知模块向电子设备的目标应用发送第三指示,第三指示用于指示电子设备的场景类别;目标应用控制执行与场景类别对应的预设操作。
在第一方面的一个可选实施例中,电子设备的第二处理模块检测电子设备的状态。
上述几个可选实施例示出了电子设备底层模块之间的交互过程,以实现设备智能感知场景和场景控制。
第二方面,本申请实施例提供了一种电子设备,电子设备包括:摄像头,存储器和处理器,所述摄像头用于采集不同帧率和/或分辨率的图像,所述处理器用于调用所述存储器中的计算机程序,以执行如第一方面任一项所述的方法。
在第二方面的一个可选实施例中,处理器包括第一处理模块和第二处理模块,第一处理模块的功耗高于第二处理模块的功耗;第二处理模块用于检测摄像头在第一模式下采集的第一图像中是否有人物面部;第一处理模块用于识别摄像头在第二模式下采集的第二图像,确定用户使用电子设备的场景类别。
该方案中,通过较低功耗的第二处理模块检测摄像头范围内是否存在人物面部,通过较高功耗的第一处理模块识别电子设备当前的场景类别,优化电子设备的处理性能。
第三方面,本申请实施例提供一种电子设备,电子设备包括用于执行如第一方面任一项所述的方法的单元、模块或电路。
第四方面,本申请实施例提供了一种计算机可读存储介质,计算机可读存储介质存储有计算机指令,当计算机指令在电子设备上运行时,使得电子设备执行如第一方面任一项所述的方法。
第五方面,本申请实施例提供了一种芯片,芯片包括处理器,处理器用于调用存储器 中的计算机程序,以执行如第一方面任一项所述的方法。
第六方面,一种计算机程序产品,包括计算机程序,当计算机程序被运行时,使得计算机执行如第一方面任一项所述的方法。
应当理解的是,本申请的第二方面至第六方面与本申请的第一方面的技术方案相对应,各方面及对应的可行实施方式所取得的有益效果相似,不再赘述。
附图说明
图1为本申请实施例提供的一种场景示意图;
图2为本申请实施例提供的一种界面示意图;
图3为本申请实施例提供的一种场景示意图;
图4为本申请实施例提供的一种场景示意图;
图5为本申请实施例提供的一种电子设备的结构示意图;
图6为本申请实施例提供的一种电子设备的结构示意图;
图7为本申请实施例提供的一种SoC的结构示意图;
图8为本申请实施例提供的一种电子设备的结构示意图;
图9为本申请实施例提供的一种场景感知方法的流程示意图;
图10为本申请实施例提供的一种场景感知方法的流程示意图;
图11为本申请实施例提供的一种电子设备的结构示意图;
图12为本申请实施例提供的一种芯片的结构示意图;
图13为本申请实施例提供的一种折叠屏手机的结构示意图;
图14为本申请实施例提供的一种场景感知方法的流程示意图。
具体实施方式
为了便于清楚描述本申请实施例的技术方案,在本申请的实施例中,采用了"第一"、"第二"等字样对功能和作用基本相同的相同项或相似项进行区分。例如,第一图像和第二图像仅仅是为了区分不同帧率和/或分辨率的图像,并不对其先后顺序进行限定。又例如,第一指示和第二指示仅仅是为了区分不同的指示。本领域技术人员可以理解"第一"、"第二"等字样并不对数量和执行次序进行限定,并且"第一"、"第二"等字样也并不限定一定不同。
需要说明的是,本申请实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其他实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。
本申请实施例中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(种/个)”或其类似表达,是指的这些项中的任意组合,包括单项(种/个)或复数项(种/个)的任意组合。例如,a,b或c中的至少一项(种/个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
下面对本申请中涉及的部分用语进行解释说明,以便于本领域技术人员理解。
帧率(frame rate),是指摄像头在一秒钟采集或传输图像的张数,通常用fps(即帧每秒)表示。本申请实施例中,摄像头在第一模式下采用第一帧率采集/传输图像,在第二模式下采用第二帧率采集/传输图像,第一帧率小于第二帧率。在一些实施例中,第一模式下采集的图像的分辨率小于第二模式下采集的图像的分辨率。
分辨率,即图像分辨率,是指图像中存储的信息量,是每英寸图像内有多少个像素点,分辨率的单位有:dpi(点每英寸)、ppi(像素每英寸)等。
地理围栏(geo-fencing),是基于位置的服务(location based services,LBS)的一种应用,就是用一个虚拟的栅栏围出一个虚拟地理边界。当电子设备进入、离开某个特定地理区域,或在该区域内活动时,电子设备可以接收自动通知和警告等信息提示,电子设备还可以自动设置系统相关参数,如音量、震动强度等。地理围栏基于不同场景可以有不同名称,例如地铁站附近的地理围栏可称为地铁围栏,又例如办公楼附近的地理围栏可称为办公围栏,再例如教学楼附近的地理围栏可称为教室围栏。本申请的各实施例中,地理围栏可以是公共区域的通用围栏,电子设备检测进入地理围栏是否执行与地理围栏相关的操作,均需要得到电子设备的用户的授权。
轻量级神经网络,是一个较轻的模型,其拥有不差于较重模型的性能,从而实现硬件友好型的神经网络。这里的轻重通常指模型的规模或参数量。常用的轻量级神经网络的技术有:蒸馏、剪枝、量化、权重共享、低秩分解、注意力模块轻量化、动态网络架构/训练方式、更轻的网络架构设计等等,对此本申请实施例不做限制。
目前,用户使用智能终端的场景多种多样,不同的使用场景下用户的使用需求具有一定差异性,用户通常需要手动调整智能终端的相关参数,以适配当前的使用场景。示例性的,在较为安静的场景,如会议室、教室、医院等,往往需要将手机设置为静音或调低音量。在较为嘈杂的场景,如车站、地铁站等,往往需要调高音量或震动强度。基于此,如何提升智能终端的场景感知能力是一个亟待解决的问题。
针对上述问题,本申请实施例提供一种场景感知方法、电子设备及存储介质,本申请实施例提供的电子设备,在满足开启摄像头的条件时,指示摄像头在第一模式下工作,获取摄像头在第一模式下采集的第一图像,在检测到第一图像中包含人物面部时,指示摄像头由第一模式切换至第二模式,获取摄像头在第二模式下采集的第二图像,其中第二图像的分辨率和/或帧率大于第一图像。通过对第二图像进行图像分析,确定用户使用电子设备的场景类别,执行与场景类别对应的预设操作,从而实现对电子设备使用场景的自动检测和识别,以执行使用场景对应的预设操作,如调节音量、震动强度、推送信息等,以改善用户的使用体验。
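示例性的,下述Python代码给出上述整体流程的一种示意性实现,其中camera、detect_face、classify_scene、do_preset_action等接口以及两种模式的帧率、分辨率取值均为假设,仅用于帮助理解,并非对具体实现方式的限定:
```python
# 示意性代码:双模式摄像头的场景感知主循环(接口与参数均为假设)
FIRST_MODE = {"fps": 5, "resolution": (120, 180)}    # 第一模式:低帧率、低分辨率
SECOND_MODE = {"fps": 30, "resolution": (480, 640)}  # 第二模式:较高帧率、较高分辨率

def scene_perception_loop(camera, detect_face, classify_scene, do_preset_action):
    camera.set_mode(**FIRST_MODE)                 # 摄像头先以第一模式常驻采图
    while camera.is_enabled():
        first_image = camera.capture()
        if not detect_face(first_image):          # 第一图像中无人物面部,继续低功耗扫描
            continue
        camera.set_mode(**SECOND_MODE)            # 检测到面部后切换至第二模式
        second_image = camera.capture()
        scene = classify_scene(second_image)      # 基于第二图像确定场景类别
        do_preset_action(scene)                   # 执行与场景类别对应的预设操作
        camera.set_mode(**FIRST_MODE)             # 处理完毕后回到低功耗扫描
```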
下面以具体地实施例对本申请的技术方案以及本申请的技术方案如何解决上述技术问题进行详细说明。下面这几个具体的实施例可以独立实现,也可以相互结合,对于相同或相似的概念或过程可能在某些实施例中不再赘述。
下面以电子设备为手机为例进行示例说明,该示例并不构成对本申请实施例的限定。
图1为本申请实施例提供的一种场景示意图。如图1所示,若满足开启摄像头的条件,触发手机的前置摄像头在第一模式下采集第一图像,在检测到第一图像中包含人物面部时, 触发手机的前置摄像头和/或后置摄像头在第二模式下采集第二图像,通过对第二图像进行图像分析,确定用户当前使用手机的场景类别。基于场景类别,执行与该场景类别对应的预设操作,例如调节音量、亮度、震动强度、发送提示信息等。
本实施例中,摄像头在第一模式下的功耗低于在第二模式下的功耗。第一图像的分辨率小于第二图像的分辨率,第一图像的帧率小于第二图像的帧率。前置摄像头采集的第一图像可以是一张或多张,前置摄像头和/或后置摄像头采集的第二图像可以是一张或多张,对此本实施例不作任何限制。
在一些实施例中,开启摄像头的条件包括:已开启场景感知功能。响应于开启场景感知功能的第一操作,手机的摄像头以第一模式运行(后台运行,例如手机当前开启某第三方应用,摄像头在后台运行),采集上述的第一图像。其中,第一操作可以是用户在系统设置界面的点击操作,第一操作还可以是语音操作,对此本申请实施例不作限定。
示例性的,图2为本申请实施例提供的一种界面示意图,如图2所示,用户可在系统应用的设置界面中选择开启或关闭场景感知功能,在检测到手机已开启该场景感知功能时,触发手机的前置摄像头采集上述第一图像。
在一些实施例中,用户还可以在第三方应用的设置界面中选择开启或关闭场景感知功能,在检测到手机已开启该第三方应用,且在第三方应用中开启该场景感知功能时,触发手机的前置摄像头采集上述第一图像。
需要说明的是,已开启第三方应用包括用户当前点开第三方应用,或者,用户在开启第三方应用后第三方应用进入后台运行状态。
在一些实施例中,开启摄像头的条件还包括第一条件,第一条件包括以下至少一种:
手机的屏幕状态为亮屏状态;手机已解锁;手机的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;手机的环境光传感器的检测数据大于第三阈值;手机的屏幕朝向预设方向;手机处于移动状态。
在一些实施例中,响应于开启场景感知功能的第一操作,且满足第一条件,手机的摄像头以第一模式运行,采集上述的第一图像。
本实施例中,若手机已开启场景感知功能,且满足第一条件,则触发手机的摄像头持续采集图像。通过增设第一条件,以避免手机摄像头在非必要时持续采集图像,进一步降低设备功耗。
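示例性的,第一条件的判断可以用下述Python代码示意,其中state中的字段名与t1、t2、t3等阈值均为假设:
```python
# 示意性代码:判断设备状态是否满足第一条件(字段名与阈值均为假设)
def meets_first_condition(state, t1=0.01, t2=5.0, t3=10.0):
    proximity_clear = (
        state.get("proximity_echo_delay", 0.0) > t1         # 发射与反射信号时间差大于第一阈值
        or state.get("proximity_echo_strength", 0.0) < t2   # 反射信号强度小于第二阈值
        or not state.get("proximity_echo_received", True)   # 未接收到反射信号
    )
    checks = [
        state.get("screen_on", False),           # 屏幕为亮屏状态
        state.get("unlocked", False),            # 设备已解锁
        proximity_clear,                         # 接近光传感器未被遮挡
        state.get("ambient_lux", 0.0) > t3,      # 环境光传感器检测数据大于第三阈值
        state.get("screen_facing_user", False),  # 屏幕朝向预设方向
        state.get("is_moving", False),           # 设备处于移动状态
    ]
    return any(checks)  # "以下至少一种";实际实现中也可要求同时满足多项
```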
基于前述实施例,下面对触发前置摄像头采集第一图像的各种可能的实施方式进行说明。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,触发手机的前置摄像头采集第一图像。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机的屏幕状态,若手机的屏幕状态为亮屏状态,触发手机的前置摄像头采集第一图像。手机亮屏状态显示的界面包括例如锁屏界面、主界面、第三方应用界面。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机是否已解锁,若手机已解锁,触发手机的前置摄像头采集第一图像。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机的接近光传 感器发射的光信号是否被遮挡,若确定接近光传感器发射的光信号未被遮挡,触发手机的前置摄像头采集第一图像。
作为一种示例,若手机的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接近光传感器未接收到反射信号,可确定接近光传感器发射的光信号未被遮挡。
可以理解的是,若用户通过听筒拨打或接听电话,或者,手机位于手提包或口袋时,手机接近光传感器发射的光信号会被遮挡,前置摄像头采集的图像无法检测到人物面部,此时可停止摄像头持续采图,以降低设备功耗。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机的环境光传感器的检测数据是否大于第三阈值,若确定环境光传感器的检测数据大于第三阈值,触发手机的前置摄像头采集第一图像。其中,检测数据主要是指环境光亮度。应理解,若手机的环境光传感器的检测数据大于第三阈值,说明电子设备并不处于暗环境,如手机在口袋里,或者当前是夜晚时段。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机的屏幕朝向,若手机的屏幕朝向预设方向,触发手机的前置摄像头采集第一图像。本实施方式中,预设方向可以理解为用户使用手机的方向,通常是手机屏幕朝向用户的方向,该方向可通过检测手机的姿态数据确定,其中姿态数据包括俯仰角、偏航角和滚动角。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,检测手机是否处于移动状态,若确定手机处于移动状态,触发手机的前置摄像头采集第一图像。在本申请实施例中,手机处于移动状态包括例如用户携带(包括握持)手机行走或骑行,用户携带(包括握持)手机乘坐车辆等。
在一种可能的实施方式中,若检测到用户已开启场景感知功能,且确定满足以下至少两项:手机的屏幕状态为亮屏状态,手机已解锁,手机的接近光传感器发射的光信号未被遮挡,手机的屏幕朝向预设方向,手机处于移动状态,则触发手机的前置摄像头采集第一图像。
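示例性的,上述"屏幕朝向预设方向"的判断可用下述Python代码示意,其中的角度范围仅为假设值,实际取值需结合具体机型标定:
```python
# 示意性代码:根据姿态数据粗略判断屏幕是否朝向用户(角度范围为假设)
def screen_faces_user(pitch_deg, yaw_deg, roll_deg):
    # 假设:屏幕略向上仰且未大幅侧翻时视为朝向用户;偏航角在此不作限制
    return 15.0 <= pitch_deg <= 80.0 and abs(roll_deg) <= 30.0
```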
上述实施例示出了一种场景感知方法,若满足开启摄像头的条件,则触发面部检测,该开启摄像头的条件至少包括场景感知功能已开启。通过获取较低分辨率的第一图像进行面部检测,若第一图像中检测到人物面部,获取较高分辨率的第二图像,再通过分析第二图像确定用户当前使用手机的场景类别,进而执行与场景类别对应的预设操作。上述方法实现手机智能感知场景的功能,用户无需根据场景变化手动设置如音量、震动强度等系统参数,提升了用户的使用体验。
在一些实施例中,若检测到用户已开启场景感知功能,检测手机是否进入预设的地理围栏,若手机进入预设的地理围栏,手机可获知该地理围栏对应的场景,进而执行与场景对应的预设操作。其中,预设的地理围栏可以包括例如地铁围栏、办公围栏、教室围栏等。示例性的,当检测到手机进入地铁围栏,手机可获知用户即将进入地铁车厢,可将手机音量调高,或提高震动强度。当检测到手机进入办公围栏,手机可获知用户即将进入办公室,可将手机设置为静音,或提高震动强度等。本实施例中,可以不执行上述实施例的面部检测,仅根据手机的当前位置确定是否进入预设地理围栏,从而设置对应的预设操作。
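示例性的,判断手机是否进入某个圆形地理围栏可用下述Python代码示意,其中围栏列表、半径与对应操作均为假设数据:
```python
from math import radians, sin, cos, asin, sqrt

# 示意性代码:圆形地理围栏的进入判断(围栏数据为假设)
GEO_FENCES = [
    {"name": "地铁围栏", "lat": 39.9075, "lon": 116.3972, "radius_m": 300, "action": "调高音量"},
    {"name": "办公围栏", "lat": 39.9150, "lon": 116.4040, "radius_m": 200, "action": "设置静音"},
]

def haversine_m(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))  # 地球半径取6371km,返回米

def match_geofence(lat, lon):
    for fence in GEO_FENCES:
        if haversine_m(lat, lon, fence["lat"], fence["lon"]) <= fence["radius_m"]:
            return fence  # 返回命中的围栏,由调用方执行fence["action"]对应的预设操作
    return None
```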
图3为本申请实施例提供的一种场景示意图。如图3所示,若满足上述实施例所述的 开启摄像头的条件,触发手机的前置摄像头采集第一图像,在检测到第一图像中包含人物面部时,获取检测数据。本实施例中,检测数据包括图像数据、时间数据、位置数据、语音数据的至少一种。基于检测数据确定用户当前使用手机的场景类别,例如图3中识别到当前场景为会议或课堂场景,可执行与当前场景对应的预设操作,例如降低音量,或设置为静音,或提高震动强度等。
在一种可能的实施方式中,检测数据包括前置摄像头和/或后置摄像头采集的第二图像,基于检测数据中的第二图像确定用户当前使用手机的场景类别,即通过对第二图像进行图像分析,确定用户当前使用手机的场景类别。
作为一种示例,将第二图像输入场景检测模型,获取场景检测模型输出的第一检测结果,第一检测结果用于指示用户使用手机的场景类别。本示例中,场景检测模型可以是采用轻量级神经网络模型训练得到的。
作为一种示例,场景检测模型的训练过程包括:
步骤a、构建场景检测模型的训练集和测试集,训练集或测试集中均包括样本图像和样本图像对应的场景类别(样本标注),训练集和测试集中的样本图像不同。
步骤b、基于初始的场景检测模型和训练集,对场景检测模型进行训练。具体的,将训练集的样本图像作为初始的场景检测模型的输入,将训练集的样本图像对应的场景类别作为初始的场景检测模型的输出,对场景检测模型进行训练。
步骤c、基于步骤b训练的场景检测模型和测试集,对场景检测模型的预测结果进行验证,当模型损失函数收敛时,停止对场景检测模型的训练。
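示例性的,上述步骤a至步骤c可用下述基于PyTorch的Python代码示意,其中轻量级网络选用MobileNetV3仅为一种可能的选择,数据加载、超参数等细节均为假设:
```python
import torch
from torch import nn
from torchvision import models

# 示意性代码:训练轻量级场景检测模型(对应步骤a~c,细节为假设)
def train_scene_model(train_loader, test_loader, num_classes, epochs=10, device="cpu"):
    model = models.mobilenet_v3_small(num_classes=num_classes).to(device)  # 轻量级神经网络
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(epochs):
        model.train()
        for images, labels in train_loader:       # 步骤b:样本图像为输入,场景类别为输出
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        model.eval()
        total_loss, count = 0.0, 0
        with torch.no_grad():                     # 步骤c:在测试集上验证预测结果
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                total_loss += criterion(model(images), labels).item() * images.size(0)
                count += images.size(0)
        print("test loss:", total_loss / max(count, 1))  # 实际实现中依据损失是否收敛决定停止训练
    return model
```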
在一种可能的实施方式中,检测数据包括时间数据,基于检测数据中的时间数据确定用户当前使用手机的场景类别。作为一种示例,若确定时间数据在预设时间段内,将预设时间段对应的预设场景类别作为用户使用手机的场景类别。
示例性的,通过获取会议通知,手机可以获知会议时间段和会议地点,若当前时刻在会议时间段内,可确定当前场景为会议场景。示例性的,通过获取电子课表,手机可以获知上课时间段和上课地点,若当前时刻在上课时间段内,可确定当前场景为课堂场景。
在一种可能的实施方式中,检测数据包括位置数据,基于检测数据中的位置数据确定用户当前使用手机的场景类别。作为一种示例,若确定位置数据在预设位置范围内,将预设位置范围对应的预设场景类别作为用户使用手机的场景类别。
在一种可能的实施方式中,检测数据包括语音数据,基于检测数据中的语音数据确定用户当前使用手机的场景类别。作为一种示例,若识别到语音数据中包含一个声源或少于N个声源,确定用户使用手机的场景类别为第一场景,例如图3中的会议或课堂等场景。作为另一种示例,若识别到语音数据中包含大于M个声源,确定用户使用手机的场景类别为第二场景,例如车站或地铁站等场景。本示例中,N为大于或等于2的正整数,M为大于N的正整数。
在一种可能的实施方式中,检测数据包括图像数据(即第二图像),时间数据,位置数据以及语音数据,综合分析上述检测数据,确定用户当前使用手机的场景类别。与上述各种实施方式相比,通过本实施方式确定的场景类别的准确率更高。
上述实施例示出了一种场景感知方法,若满足开启摄像头的条件,则触发面部检测。该开启摄像头的条件至少包括场景感知功能已开启。若检测到人物面部,获取各项检测数 据,包括图像数据、时间数据、位置数据、语音数据等,综合各项检测数据,以确定用户当前使用手机的场景类别,进而执行与场景类别对应的预设操作。上述方法实现手机智能感知场景的功能,用户无需根据场景变化手动设置如系统音量、震动强度等参数,提升了用户的使用体验。
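示例性的,其中基于语音数据的声源数量判断场景,可用下述Python代码示意,count_sources为假设的声源计数接口,N、M的取值也仅作演示:
```python
# 示意性代码:根据语音数据中的声源数量粗略判断场景(接口与阈值均为假设)
def scene_from_audio(audio_clip, count_sources, n=2, m=5):
    k = count_sources(audio_clip)    # 返回估计的声源个数(假设接口)
    if k < n:
        return "第一场景"            # 较为安静,如会议室、教室
    if k > m:
        return "第二场景"            # 较为嘈杂,如车站、地铁站
    return None                       # 介于两者之间,交由图像、时间、位置等其他检测数据判断
```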
图4为本申请实施例提供的一种场景示意图。如图4所示,若满足上述实施例所述的开启摄像头的条件,触发手机的前置摄像头采集第一图像,在检测到第一图像中包含人物面部时,获取检测数据。本实施例中,检测数据包括图像数据和第一传感器的数据,第一传感器包括陀螺仪传感器和加速度传感器。基于检测数据中的图像数据以及第一传感器的数据,确定用户使用手机的场景类别,例如图4中识别到当前场景为步行注视屏幕的场景,可执行与当前场景对应的预设操作,例如发送第一信息,用于提醒或建议用户请勿使用手机。在一些实施例中,通过弹窗或语音的方式发送该第一信息。
在一种可能的实施方式中,检测数据中的图像数据包括前置摄像头采集的多张连续的第二图像,若基于第一传感器的数据确定用户处于步行状态,且基于多张连续的第二图像确定用户持续注视手机屏幕,确定用户使用手机的场景类别为第三场景。本实施方式中,第三场景即步行注视屏幕的场景。
在一种可能的实施方式中,检测数据中的图像数据包括前置摄像头采集的多张连续的第二图像,若基于第一传感器的数据确定用户处于骑行状态,且基于多张连续的第二图像确定用户持续注视手机屏幕,确定用户使用手机的场景类别为第三场景。本实施方式中,第三场景即骑行注视屏幕的场景。
作为一种示例,基于第一传感器的数据确定用户的运动状态,包括以下至少一种:
获取陀螺仪传感器检测的手机姿态数据,基于手机姿态数据确定用户重心的偏移度,基于用户重心的偏移度确定用户的运动状态,手机姿态数据包括手机的俯仰角、偏航角和滚动角;获取加速度传感器检测的手机加速度,基于手机加速度确定用户的运动状态。
在一些实施例中,还可以结合手机指南针辅助确定用户的运动状态。
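示例性的,基于加速度数据粗略判断运动状态可用下述Python代码示意,其中的阈值以及"骑行抖动小于步行"的经验规则均为假设,实际实现通常还会结合陀螺仪、指南针等数据:
```python
import statistics

# 示意性代码:基于加速度幅值的波动判断运动状态(阈值与规则均为假设)
def estimate_motion_state(accel_samples):
    """accel_samples: [(ax, ay, az), ...],单位m/s^2,约数秒内的滑动窗口"""
    if not accel_samples:
        return "未知"
    magnitudes = [(ax * ax + ay * ay + az * az) ** 0.5 for ax, ay, az in accel_samples]
    std = statistics.pstdev(magnitudes)
    if std < 0.3:
        return "静止"
    if std < 2.0:
        return "骑行"   # 假设:骑行时幅值波动相对平缓
    return "步行"       # 假设:步行伴随明显的周期性冲击
```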
作为一种示例,基于多张连续的第二图像确定用户持续注视手机屏幕,包括:将多张连续的第二图像依次输入注视检测模型,获取注视检测模型输出的第二检测结果,第二检测结果用于指示用户是否持续注视电子设备的屏幕。本示例中,注视检测模型可以是基于深度学习方法,采用轻量级神经网络模型训练得到的。
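示例性的,基于连续多帧的注视检测结果判断"持续注视"可用下述Python代码示意,其中gaze_model为假设的单帧注视检测接口,min_ratio的取值也仅作演示:
```python
# 示意性代码:根据连续多帧的注视结果判断是否持续注视屏幕(接口与比例阈值均为假设)
def is_sustained_gaze(frames, gaze_model, min_ratio=0.8):
    if not frames:
        return False
    hits = sum(1 for frame in frames if gaze_model(frame))  # 逐帧判断是否注视屏幕
    return hits / len(frames) >= min_ratio                   # 多数帧注视则认为持续注视
```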
在一些实施例中,若确定用户当前使用手机的场景为步行注视屏幕的场景,还可以执行如下操作:开启后置摄像头,用于检测障碍物。具体的,手机开启后置摄像头,摄像头工作在第二模式下,获取后置摄像头在第二模式下采集的第三图像,若识别到第三图像中存在障碍物,例如台阶、电线杆、机动车、坑洼等,发送第三信息,第三信息用于提醒用户避开障碍物。其中,第三图像的分辨率大于第一图像的分辨率,和/或,第三图像的帧率大于第一图像的帧率。本实施例中,可采用目标检测模型确定第三图像中是否存在障碍物,该目标检测模型可以是基于深度学习方法,采用轻量级神经网络模型训练得到的。
作为一种示例,目标检测模型的训练过程包括:步骤a、构建目标检测模型的训练集和测试集,训练集或测试集中均包括样本图像和样本图像的标注信息,标注信息用于指示样本图像中是否存在障碍物,训练集和测试集中的样本图像不同。步骤b、基于初始的目标检测模型和训练集,对目标检测模型进行训练。具体的,将训练集的样本图像作为初始 的目标检测模型的输入,将训练集的样本图像的标注信息作为初始的目标检测模型的输出,对目标检测模型进行训练。步骤c、基于步骤b训练的目标检测模型和测试集,对目标检测模型的预测结果进行验证,当模型损失函数收敛时,停止对目标检测模型的训练。
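示例性的,利用训练好的目标检测模型检查第三图像并发送第三信息,可用下述Python代码示意,其中detector、notify为假设的接口,障碍物类别与置信度阈值仅作演示:
```python
# 示意性代码:检测后置摄像头图像中的障碍物并提醒用户(接口与阈值均为假设)
OBSTACLE_CLASSES = {"台阶", "电线杆", "机动车", "坑洼"}

def check_obstacles(third_image, detector, notify):
    detections = detector(third_image)   # 假设返回[(类别, 置信度), ...]
    obstacles = [label for label, score in detections
                 if label in OBSTACLE_CLASSES and score > 0.5]
    if obstacles:
        notify("前方存在" + "、".join(obstacles) + ",请注意避让")  # 发送第三信息
```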
上述实施例示出了一种场景感知方法,若满足开启摄像头的条件,则触发面部检测,该开启摄像头的条件至少包括场景感知功能已开启。若检测到人物面部,获取各项检测数据,包括图像数据、姿态数据、速度加速度数据等,以感知用户是否在不安全场景下使用手机,如步行或骑行注视屏幕,若感知到用户在不安全场景下使用手机,可提醒用户请勿使用手机或注意道路安全,提升了用户的使用体验。
在一些实施例中,检测数据包括第二传感器的数据,第二传感器包括环境光传感器。若确定第二传感器的数据小于第四阈值,确定用户使用手机的场景类别为第四场景。本实施例中,第二传感器的数据用于指示手机在当前场景的环境光数据,如光照度。若环境光传感器的数据小于第四阈值,表示手机当前处于暗环境,即第四场景(例如卧室/睡眠场景),此时,手机可自动调节手机屏幕亮度,或者启动低蓝光模式。
在一些实施例中,在识别卧室/睡眠场景时,除了分析上述的第二传感器的数据之外,还可以结合时钟、用机习惯、地理围栏等进行综合判断。进入卧室/睡眠场景后的操作,包括例如降低屏幕亮度、降低蓝光、推荐入睡内容(即与场景类别卧室对应的内容)等。
基于上述几个实施例,手机可结合图像数据、时钟(时间)数据、位置数据、语音数据以及各类传感器数据的至少一种,识别用户使用手机的场景类别,进而执行与该场景类别对应的预设操作。其中,预设操作包括以下至少一种:调节音量大小;调节屏幕亮度;调节屏幕蓝光;调节震动强度;发送第一信息,第一信息用于提醒用户停止使用所述电子设备;发送第二信息,第二信息用于推荐与所述场景类别对应的内容;开启后置摄像头,用于检测障碍物。
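示例性的,场景类别与预设操作之间的映射可用下述Python代码示意,其中场景名称、具体参数以及device的各接口均为假设:
```python
# 示意性代码:场景类别到预设操作的映射与执行(名称、参数与接口均为假设)
PRESET_ACTIONS = {
    "会议/课堂": lambda dev: (dev.set_volume(0), dev.set_vibration(80)),            # 静音并提高震动强度
    "车站/地铁站": lambda dev: (dev.set_volume(90), dev.set_vibration(100)),         # 调高音量与震动强度
    "步行/骑行注视屏幕": lambda dev: dev.notify("请勿在行进中使用手机"),               # 发送第一信息
    "卧室/睡眠": lambda dev: (dev.set_brightness(10), dev.enable_low_blue_light()),  # 降低亮度与蓝光
}

def do_preset_action(scene, device):
    action = PRESET_ACTIONS.get(scene)
    if action is not None:
        action(device)
```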
本申请实施例提供的场景感知方案,实现对电子设备场景的智能检测,以及基于场景检测自动设置系统参数的功能,如在会议室、课堂等设置静音或提高震动强度,卧室降低使用音量,公共场合防偷窥,走路使用手机时防撞或防踩坑等,改善用户的用机体验。
本申请实施例提供的场景感知方法,除了应用于直板手机外,还可应用于折叠屏手机。
在一些实施例中,若手机为折叠屏手机,折叠屏手机包括内屏和外屏,内屏对应设置有第一摄像头,外屏对应设置有第二摄像头,例如,第一摄像头为图13中的摄像头3,第二摄像头为图13中的摄像头1。
作为一种示例,若检测到手机的外屏为亮屏状态,且手机处于折叠状态,控制第二摄像头以第一模式运行。作为一种示例,若检测到手机的外屏为亮屏状态,且手机处于展开状态,控制第二摄像头以第一模式运行。基于该两种示例,在一些实施例中,在控制第二摄像头以第一模式运行之前,还包括:检测到手机的状态满足以下至少一项:手机已解锁;手机的接近光传感器(外屏上)发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;手机的环境光传感器的检测数据大于第三阈值;手机的外屏朝向预设方向;手机处于移动状态。
作为一种示例,若检测到手机的内屏为亮屏状态,且手机处于展开状态,控制第一摄像头以第一模式运行。在控制第一摄像头以第一模式运行之前,还包括:检测到手机的状 态满足以下至少一项:手机已解锁;手机的接近光传感器(内屏上)发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;手机的环境光传感器的检测数据大于第三阈值;手机的内屏朝向预设方向;手机处于移动状态。
基于上述几个示例,在控制第一摄像头以第一模式运行,或者控制第二摄像头以第一模式运行之前,还包括:检测到电子设备的状态满足第二条件;第二条件包括以下至少一项:电子设备已解锁;电子设备的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的内屏或外屏朝向预设方向;电子设备处于移动状态。
作为一种示例,当第二摄像头以第一模式运行时,若检测到电子设备由折叠状态至展开状态时,电子设备控制第一摄像头以第一模式运行,并关闭第二摄像头。
作为一种示例,当第二摄像头以第一模式运行时,若检测到电子设备由折叠状态至展开状态时,电子设备控制第一摄像头以第一模式运行,第二摄像头保持以第一模式运行,内外屏上摄像头同时开启,以扩大检测设备场景的范围。
作为一种示例,当第一摄像头以第一模式运行时,若检测到电子设备由展开状态至折叠状态时,电子设备控制第一摄像头关闭,并控制第二摄像头以第一模式运行。
作为一种示例,当第一摄像头以第一模式运行时,若检测到电子设备由展开状态至折叠状态时,电子设备控制第一摄像头关闭。
下面结合附图13对折叠屏手机执行场景感知方法进行详细说明。
示例性的,图13为本申请实施例提供的一种折叠屏手机的结构示意图。如图13所示,折叠屏手机的屏幕包括第一屏、第二屏和第三屏,第一屏为折叠屏手机的外屏,第二屏和第三屏为折叠屏手机的内屏,折叠屏包括第二屏和第三屏,折叠屏按照图13中(4)所示的折叠边折叠,形成第二屏和第三屏。折叠屏所在的虚拟轴线为公共轴。其中,内屏指的是折叠屏处于折叠状态时位于内部的屏,外屏指的是折叠屏处于闭合状态时位于外部的屏。第二屏和第三屏之间的夹角β为折叠屏手机的合页角度,确定合页角度即可确定折叠屏的物理状态。物理状态包括如图13中(3)所示的折叠状态、图13中(4)所示的展开状态、或图13中(2)所示的支架状态。图13所示的折叠屏手机包括3组摄像头,分别记为摄像头1、摄像头2和摄像头3。图13中(1)所示,摄像头1设置在第一屏上部的中间位置,摄像头2设置在背板上,图13中(4)所示,摄像头3设置在第三屏上部的中间位置。折叠屏手机在折叠状态下,摄像头1可看作是前置摄像头,摄像头2可看作是后置摄像头,如图13中(3)所示。折叠屏手机在展开状态下,摄像头3可看作是前置摄像头,摄像头1和2可看作是后置摄像头。
下面以图13所示的折叠屏手机为例,对折叠屏手机在何种情况下开启摄像头以及开启哪些摄像头进行举例说明。
在一种可能的实施方式中,若折叠屏手机已开启场景感知功能,当检测到手机为折叠状态,且外屏(如图13中的第一屏)为亮屏状态,触发折叠屏手机的摄像头1持续采集第一图像,以检测摄像头范围是否有人物面部。
在一种可能的实施方式中,若折叠屏手机已开启场景感知功能,当检测到手机为折叠 状态,且外屏为亮屏状态,且满足上述第二条件的至少一项时,触发折叠屏手机的摄像头1持续采集第一图像,以检测摄像头范围是否有人物面部。
在一种可能的实施方式中,若折叠屏手机已开启场景感知功能,当检测到手机为展开状态,且内屏(如图13中的第二屏和第三屏)为亮屏状态,触发折叠屏手机的摄像头3,持续采集第一图像,以检测摄像头范围是否有人物面部。
在一种可能的实施方式中,若折叠屏手机已开启场景感知功能,当检测到手机为展开状态,且内屏为亮屏状态,且满足上述第二条件的至少一项时,触发折叠屏手机的摄像头3,持续采集第一图像,以检测摄像头范围是否有人物面部。
在一种可能的实施方式中,若折叠屏手机已满足开启摄像头的条件,当前手机为折叠状态且已开启摄像头1。当检测到手机由折叠状态至展开状态,且开启摄像头的其他条件不变时,可关闭摄像头1,并开启摄像头3,以检测摄像头3范围内是否有人物面部。
在一种可能的实施方式中,若折叠屏手机已满足开启摄像头的条件,当前手机为展开状态且已开启摄像头3。当检测到手机由展开状态切换至折叠状态,且开启摄像头的其他条件不变时,可关闭摄像头3,并开启摄像头1,以检测摄像头1范围内是否有人物面部。
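示例性的,折叠屏物理状态变化时内外屏摄像头的切换逻辑可用下述Python代码示意,其中cam_outer、cam_inner等接口均为假设:
```python
# 示意性代码:折叠状态变化时切换内外屏摄像头(接口均为假设)
def on_fold_state_changed(new_state, cam_outer, cam_inner):
    if new_state == "展开":               # 由折叠状态切换至展开状态
        cam_outer.stop()                  # 关闭外屏摄像头(如摄像头1)
        cam_inner.start(mode="第一模式")  # 内屏摄像头(如摄像头3)低功耗常驻扫图
    elif new_state == "折叠":             # 由展开状态切换至折叠状态
        cam_inner.stop()
        cam_outer.start(mode="第一模式")
```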
需要说明的是,其他形态的折叠屏手机执行场景感知方法,可参照图13所示的折叠屏手机,其实现原理和技术效果类似,本申请实施例对折叠屏手机的结构样式不作任何限制。
为了能够更好地理解本申请实施例,下面对本申请实施例的电子设备的结构进行介绍。示例性的,图5为本申请实施例提供的一种电子设备的结构示意图。如图5所示,电子设备100可以包括:处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。可以理解的是,本实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件,或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,显示处理单元(display process unit,DPU),和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
在一些实施例中,电子设备100也可以包括一个或多个处理器110。
在一些实施例中,处理器110可以包括一个或多个接口。
可以理解的是,本发明实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在本申请另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN),蓝牙,全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),NFC,红外技术(infrared,IR)等无线通信的解决方案。
电子设备100通过GPU,显示屏194,以及应用处理器等可以实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备100可以包括1个或N个显示屏194,N为大于1的正整数。
电子设备100可以通过ISP,一个或多个摄像头193,视频编解码器,GPU,一个或多个显示屏194以及应用处理器等实现拍摄功能。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐、照片、视频等数据文件保存在外部存储卡中。
内部存储器121可以用于存储一个或多个计算机程序,该一个或多个计算机程序包括指令。处理器110可以通过运行存储在内部存储器121的上述指令,从而使得电子设备100执行各种功能应用以及数据处理等。
传感器180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景等。
磁传感器180D用于检测磁铁的磁场强度,得到磁力数据,通过磁力数据检测电子设备100的折叠屏的物理状态。磁铁用于产生磁场。在本申请实施例中,磁传感器180D可以设置于例如图13中(1)所示的背板对应的机体中,磁铁可以设置于例如图13中(1)所示的第一屏对应的机体中,磁铁可以让磁传感器180D检测到磁力数据,随着折叠屏开合状态的变化,磁传感器180D与磁铁之间的距离相应发生变化,磁传感器180D检测到的磁铁的磁场强度也会发生变化,进而智能传感集线器可以根据磁传感器180D在磁铁的磁场作用下所获取到的磁力数据,判断折叠屏的物理状态,物理状态包括如展开状态、支架状态、或折叠状态(闭合状态)。在一些实施例中,传感器180还可以包括霍尔传感器,霍尔传感器同样可用于检测磁铁的磁场强度,输出高/低电平,通过高/低电平确定电子设备100的折叠屏的物理状态。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
按键190包括开机键,音量键等。按键190可以是机械按键,也可以是触摸式按键。电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。
上述电子设备也可以称为终端设备(terminal)、用户设备(user equipment,UE)、移动台(mobile station,MS)、移动终端(mobile terminal,MT)等。电子设备可以为拥有触摸屏的手机(mobile phone)、穿戴式设备、平板电脑(Pad)、带无线收发功能的电脑、虚拟现实(virtual reality,VR)终端设备、增强现实(augmented reality,AR)终端设备、工业控制(industrial control)中的无线终端、无人驾驶(self-driving)中的无线终端、远程手术(remote medical surgery)中的无线终端、智能电网(smart grid)中的无线终端、运输安全(transportation safety)中的无线终端、智慧城市(smart city)中的无线终端、智慧家庭(smart home)中的无线终端等。本申请实施例对电子设备所采用的具体技术和具体设备形态不做限定。
在本申请实施例中,为了实现电子设备智能感知场景的功能,需要对电子设备的摄像 头和处理器做硬件和软件上的改进。下面首先对电子设备的硬件改进进行说明。
示例性的,图6为本申请实施例提供的一种电子设备的结构示意图。如图6所示,电子设备可以包括改进的摄像头601和改进的处理器602。
改进的摄像头601是指在现有的摄像头模组中增设控制电路和新增拍摄模式所对应的工作电路,实现低功耗配置。例如,现有的摄像头模组的拍摄模式为模式1,摄像头模组中包括模式1对应的工作电路。若拍摄模式增加模式2,相应的,还涉及模式1和模式2之间的切换,对此,改进的摄像头模组除了包括模式1对应的工作电路,还包括新增的模式2对应的工作电路,以及两种模式切换对应的控制电路。应理解,根据实际应用需求,可以设置两个以上拍摄模式,对此本申请实施例不作任何限制。
作为一种示例,改进的摄像头601包括两种工作模式:第一模式和第二模式,第一模式可称为低功耗拍摄模式,第二模式可称为常规拍摄模式。摄像头601在第一模式下采集的图像的分辨率小于在第二模式下采集的图像的分辨率,摄像头601在第一模式下采集的图像的帧率小于在第二模式下采集的图像的帧率。摄像头601可在这两种模式间切换。
示例性的,若满足开启摄像头的条件,摄像头601工作在第一模式下,可常驻扫描,摄像头601以第一帧率持续采集第一分辨率的第一图像,以检测摄像头601范围内是否有人,如检测第一图像是否包含人物面部;若检测到有人出现后,如检测到第一图像中包含人物面部,摄像头601由第一模式切换至第二模式,以第二帧率持续采集第二分辨率的第二图像,以检测当前场景的类别,或者检测人物是否注视屏幕。其中,第一帧率小于第二帧率,第一分辨率小于第二分辨率。
基于上述示例,摄像头601可以动态调节采集图像的帧率和分辨率,以适应不同的需求,例如,摄像头601以较低的帧率和分辨率采集图像,以检测摄像头601范围内是否有人物面部,摄像头601以较高的帧率和分辨率采集图像,以识别人物使用电子设备的场景类别。示例性的,摄像头601的最低帧率可以为1fps,最高帧率可以为30fps。摄像头601的最低分辨率可以为120×180,最高分辨率可以为480×640。在一些示例中,最高帧率还可以为240fps,最高分辨率还可以为2736×3648。
需要说明的是,本申请实施例对摄像头采集图像的帧率范围、分辨率范围不作具体限定,即可以不限制摄像头采集图像的最低帧率和最高帧率,也不限制摄像头采集图像的最低分辨率和最高分辨率。在实际应用中,可以根据需求对帧率范围和分辨率范围作出合理设置。
需要说明的是,摄像头601可以是电子设备的前置摄像头,也可以是电子设备的后置摄像头,对此本实施例不作限制。
改进的处理器602可以是片上系统(System on Chip,SoC)。在本申请实施例中,当电子设备触发摄像头601常驻扫描,获取当前场景下的图像,摄像头601可将图像发送至SoC进行图像分析,以检测图像中是否存在人物面部、人物是否注视屏幕、当前场景的类别等。
为了实现低功耗的目的,SoC可以支持低功耗AON ISP(Always On ISP),参考附图7,摄像头601将图像传输至AON ISP,AON ISP除了对图像作格式转换外,不作任何图像效果的处理,随后将格式转换后的图像存储至片上静态随机存取存储器(on-chip Static Random-Access Memory,on-chip SRAM)。SoC还可以支持极低功耗核心,计算、算法运行、图像存储都工作在低功耗模式。并且,SoC还可以支持低功耗嵌入式神经网络处理器eNPU(embedded NPU)。
下面对于本申请实施例涉及的SoC进行详细说明。
示例性的,图7为本申请实施例提供的一种SoC的结构示意图。如图7所示,SoC包括第一处理单元和第二处理单元。其中,第一处理单元包括图像信号处理ISP、神经网络处理器NPU和中央处理器CPU,第二处理单元包括I2C总线接口、AON ISP、on-chip SRAM、数字信号处理器DSP和eNPU。在Soc中,第二处理单元的功耗低于第一处理单元的功耗,具体来说,第二处理单元中eNPU的功耗低于第一处理单元中NPU的功耗,第二处理单元中AON ISP的功耗低于第一处理单元中ISP的功耗。
作为一种示例,第一处理单元可用于处理摄像头601采集的上述的第二分辨率的第二图像。示例性的,在第二模式下,摄像头601采集第二分辨率的第二图像,经ISP处理后,由NPU对处理后的第二分辨率的第二图像进行检测,例如检测当前场景的类别、或者检测人物是否注视屏幕。第一处理单元将数据(如图像数据)发送至存储器前,可以进行安全处理(例如加密处理),将安全处理后的数据存储在存储器的安全缓冲器(buffer)中。安全处理用于保护用户的隐私数据。
作为一种示例,第二处理单元可用于处理摄像头601采集的上述的第一分辨率的第一图像。示例性的,在第一模式下,摄像头601采集第一分辨率的第一图像,AON ISP通过I2C总线接口获取第一分辨率的第一图像,经AON ISP处理后,由eNPU对处理后的第一分辨率的第一图像进行检测,例如检测第一图像中是否包含人物面部。第二处理单元中的on-chip SRAM可用于存储处理后的第一分辨率的第一图像,DSP可用于通知eNPU进行图像检测、接收eNPU上报的检测结果,并将检测结果上报至上层应用。第二处理单元采用低功耗配置,以降低电子设备的功耗。
需要说明的是,本申请实施例对于第一处理单元或第二处理单元与摄像头之间传输的图像数据的格式不作限制。示例性的,图像数据可以为摄像串行接口(Camera Serial Interface,CSI)移动产业处理器接口(Mobile Industry Processor Interface,MIPI)数据。
电子设备的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的软件系统为Android系统为例,示例性说明电子设备的软件结构。图8为本申请实施例提供的一种电子设备的结构示意图。分层架构将电子设备的软件系统分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。
参照图8,本申请实施例的电子设备包括应用程序层(Applications)、应用程序框架层(Application Framework)、硬件抽象层(Hardware Abstraction Layer,HAL),内核层(Kernel),传感器控制中心(Sensorhub)以及硬件层。
应用程序层可以包括一系列应用程序,应用程序层通过调用应用程序框架层所提供的应用程序接口(application programming interface,API)运行应用程序。
本申请实施例中,应用程序层可以包括场景感知应用和感知模块,场景感知应用与感知模块连接,场景感知应用在感知模块中注册,由感知模块作状态管理下发和数据传输,例如,感知模块从Sensorhub中的第二处理模块获知摄像头范围内有人物面部时,感知模块通知HAL中的第一处理模块,以便第一处理模块基于摄像头采集的图像识别人物使用 电子设备的场景类别,最终由感知模块向场景感知应用上报识别结果。
在一些实施例中,应用程序层还包括其他应用(图8未示出),例如注视不息屏应用,注视常亮显示(always on display,AOD)应用。一种可能的情况,多个应用对应同一算法,例如,注视不息屏应用和注视AOD应用对应注视检测算法,感知模块可用于对注视检测算法作统一的调度和管理。另一种可能的情况,不同应用对应不同的算法,例如,场景感知应用对应场景识别算法(场景感知应用还对应面部(有无)检测算法),注视不息屏应用对应注视检测算法,这两个算法均涉及从底层的摄像头获取图像数据,感知模块可用于对多个算法间的优先级的调度和管理。存在一种可能,场景识别算法和注视检测算法的优先级相同,感知模块可通知底层的摄像头将图像数据同时上报给场景感知应用和注视不息屏应用。本实施例中,场景识别算法可部署在第一处理模块中,面部检测算法可部署在第二处理模块中。本实施例示出的算法仅为示例。
在一些实施例中,若电子设备包括折叠屏,应用程序层还包括第三处理模块,第三处理模块用于获取第二处理模块上报的折叠屏物理状态,以及内外屏摄像头的状态(开启或关闭)。此外,第三处理模块还用于将内外屏摄像头的状态通知给第一处理模块。
在一些实施例中,应用程序还可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等应用程序,可以是系统应用,也可以是第三方应用,本申请实施例对此不做限制。
应用程序框架层为应用程序层的应用程序提供API和编程框架。应用程序框架层包括一些预先定义的函数。如图8所示,应用程序框架层可以包括摄像头服务(CameraService),摄像头服务用于对所有需要使用摄像头的应用的优先级调度和管理。
在一些实施例中,应用程序框架层还可以包括例如窗口管理器,内容提供器,资源管理器,通知管理器,视图系统等,本申请实施例对此不做限制。
硬件抽象层可以包括AO(always on)服务和第一处理模块。AO服务可用于控制第一处理模块中的场景识别算法的开启或关闭,以及控制第二处理模块中的面部检测算法的开启或关闭,以及上下层数据传输。第一处理模块可用于处理较高分辨率和/或较高帧率的图像,如上述的第二图像,以检测第二图像,识别用户使用设备的场景类别。第一处理模块还用于摄像头模式切换,例如第一处理模块接收来自感知模块的第二指示,控制摄像头由第一模式切换至第二模式。
内核层是硬件和软件之间的层。内核层用于驱动硬件,使得硬件工作。在本申请实施例中,内核层可以包括摄像头驱动,摄像头驱动用于驱动电子设备中摄像头工作在第一模式或第二模式下,以采集不同帧率和/或分辨率的图像。
此外,内核层还可以包括显示驱动,音频驱动,传感器驱动,马达驱动等,本申请实施例对此不做限制。其中,传感器驱动可以驱动例如接近光传感器发射光信号,以检测用户当前是否手持电子设备贴近耳朵通话等,传感器驱动还可以驱动例如陀螺仪传感器,以检测电子设备的姿态数据;传感器驱动还可以驱动例如环境光传感器检测环境光亮度,以检测电子设备是否处于暗环境,暗环境包括例如手机在口袋里等。
Sensorhub用于实现对传感器的集中控制,以减小CPU的负荷。Sensorhub相当于微程序控制器(Microprogrammed Control Unit,MCU),该MCU上可以运行用于驱动多个传感器工作的程序,也就是说Sensorhub中可以支持挂载多个传感器的能力,其可以作为一个 独立的芯片,放置在CPU与各类传感器之间,也可以集成在CPU中的应用处理器(application processor,AP)中。
在本申请实施例中,Sensorhub可以包括第二处理模块,第二处理模块可用于处理较低分辨率和/或较低帧率的图像,如上述的第一图像,以检测第一图像中是否有人物面部。相较于第一处理模块,第二处理模块为低功耗处理模块,第二处理模块常驻运行或以低功耗形式运行。作为一种示例,在场景感知功能开启时,第二处理模块还用于获取各类传感器上报的数据,基于各类传感器数据确定电子设备的各项状态,如屏幕状态、解锁状态、使用状态等,若设备状态满足第一条件,第二处理模块可向摄像头发送第一拍摄指令,指示摄像头在低功耗拍摄模式(如第一模式)下常驻扫图,以检测摄像头范围内是否有人物面部。作为一种示例,对于可折叠设备,第二处理模块可通过检测折叠屏物理状态、屏幕状态以及设备状态是否满足第二条件,确定是否向摄像头(内屏或外屏上的摄像头)发送第一拍摄指令。作为一种示例,第二处理模块在检测到折叠屏手机的屏幕的物理状态发生改变时,如由折叠状态至展开状态,或者由展开状态至折叠状态,第二处理模块可控制折叠屏手机的摄像头(如手机外屏的摄像头,和/或,手机内屏的摄像头)开启或关闭。
硬件层可以包括例如摄像头、各类传感器和AON ISP等。
可以理解的是,图8示出的层级结构中的层以及各层中包含的模块或部件,并不构成对电子设备的具体限定。在另一些实施例中,电子设备可以包括比图示更多或更少的层,以及每个层中可以包括更多或更少的部件,本申请不做限定。图8所示的各分层中包括的模块为本申请实施例中涉及到的模块,各分层中包括的模块并不构成对电子设备的结构和模块部署的层级(示例说明)的限定。在一些实施例中,图8中所示的模块可以单独部署,或者几个模块可以部署在一起,图8中对模块的划分为一种示例。在一些实施例中,图8中所示的模块的名称为示例说明。
在上述所示的电子设备的结构的基础上,下面结合具体的实施例对本申请实施例提供的场景感知方法进行说明。下面这几个实施例可以相互结合,对于相同或相似的概念或过程可能在某些实施例不再赘述。
图9为本申请实施例提供的一种场景感知方法的流程示意图。如图9所示,本实施例提供的场景感知方法,包括:
步骤901、目标应用在感知模块注册场景感知功能。
本实施例中,目标应用即应用程序层的场景感知应用。在一些实施例中,目标应用从服务器(或称为云端)获取场景感知功能的信息,在获取场景感知功能的信息后,目标应用可在感知模块注册场景感知功能,以便感知模块常驻运行和执行与场景感知相关的事项。其中,场景感知功能的信息包括不同场景类别对应的预设操作等。
步骤902、感知模块确定是否启动场景感知功能。
若感知模块确定启动场景感知功能,则执行步骤903。
示例性的,参照图2,若用户在系统应用或第三方应用的设置界面选择开启场景感知功能,系统应用或第三方应用向感知模块发送通知,以告知感知模块用户已开启场景感知功能。
步骤903、感知模块向第二处理模块发送第一指示,第一指示用于指示第二处理模块检测电子设备的摄像头范围内是否有人物面部。感知模块发送第一指示,可控制启动第二 处理模块,包括模块上电、工作场景下发、资源准备等。
步骤904、第二处理模块响应于第一指示,向摄像头发送第一拍摄指令,第一拍摄指令用于指示摄像头工作在第一模式。第二处理模块发送第一拍摄指令,以控制启动摄像头,包括摄像头上电、模式切换、出图分辨率和帧率设置等。摄像头在第一模式下以较低的帧率和/或分辨率持续采集图像,即第一图像。
步骤905、摄像头向第二处理模块发送第一图像。
摄像头响应于第一拍摄指令,在第一模式下以第一帧率采集第一分辨率的第一图像。示例性的,第一帧率可设置为5fps,第一分辨率可设置为120×180或者640×480。
步骤906、第二处理模块识别第一图像是否包含人物面部。
若第二处理模块在第一图像中识别到人物面部,则执行步骤907。否则,第二处理模块继续执行步骤906,除非摄像头被控制关闭。
在一些实施例中,第二处理模块预置有面部识别模型,面部识别模型可以是采用轻量级神经网络模型训练得到的,用于识别图像中是否包含人物面部。
在一些实施例中,面部识别模型可部署在第二处理模块的eNPU上,具有良好的实时性。
步骤907、第二处理模块向感知模块发送第一消息,第一消息用于通知感知模块电子设备摄像头范围内识别到人物面部。
步骤908、感知模块向第一处理模块发送第二指示,第二指示用于指示第一处理模块识别摄像头范围内场景的类别。第二处理模块发送第二指示,可控制启动第一处理模块,包括模块上电、工作场景下发、资源准备等。
步骤909、第一处理模块响应于第二指示,向摄像头发送第二拍摄指令,第二拍摄指令用于指示摄像头工作在第二模式。
第一处理模块发送第二拍摄指令,以控制摄像头进行模式切换,即由第一模式切换至第二模式。摄像头在第二模式下以较高的帧率和/或分辨率持续采集图像,即第二图像。
步骤910、摄像头向第一处理模块发送第二图像。
摄像头响应于第二拍摄指令,在第二模式下以第二帧率采集第二分辨率的第二图像,并向第一处理模块发送第二图像。示例性的,第二帧率可设置为30fps,第二分辨率可设置为1920×1080。
步骤911、第一处理模块基于第二图像识别用户使用电子设备的场景类别。
在一些实施例中,第一处理模块预设有场景检测模型,场景检测模型可以是采用轻量级神经网络模型训练得到的,用于识别第二图像对应的场景类别。
在一些实施例中,场景检测模型可部署在第一处理模块的NPU上,具有良好的实时性。
步骤912、第一处理模块向感知模块发送第二消息。
本实施例中,第二消息用于指示用户使用电子设备的场景类别。在一些实施例中,第二消息包括场景类别的标识。
步骤913、感知模块向目标应用发送第三指示,第三指示用于指示用户使用电子设备的场景类别。在一些实施例中,第三指示包括场景类别的标识。
步骤914、响应于第三指示,目标应用控制执行与场景类别对应的预设操作。
本实施例中,目标应用预存场景感知功能的信息,包括不同场景类别对应的预设操作。预设操作包括以下至少一种:调节音量大小;调节屏幕亮度;调节屏幕蓝光;调节震动强度;发送第一信息,第一信息用于提醒用户停止使用电子设备;发送第二信息,第二信息用于推荐与场景类别对应的内容;开启后置摄像头,用于检测障碍物。
在一些实施例中,目标应用从感知模块获取第三指示,基于第三指示中场景类别的标识以及预存的场景感知功能的信息,确定场景类别对应的预设操作,进而控制执行与场景类别对应的预设操作。
需要说明的是,在本实施例中,第一处理模块可对应图7所示的第一处理单元,第二处理模块可对应图7所示的第二处理单元。
上述实施例中,若电子设备已注册场景感知功能,且用户已开启该场景感知功能,感知模块可通过第二处理模块(低功耗处理模块)向摄像头发送第一拍摄指令,使得摄像头以较低的帧率和/或分辨率采集第一图像。若第二处理模块识别到第一图像中包含人物面部,可通知第一处理模块,以便第一处理模块向摄像头发送第二拍摄指令,使得摄像头以较高的帧率和/或分辨率采集第二图像。第一处理模块识别第二图像对应的场景类别,并将场景类别告知应用,以便应用执行场景类别对应的预设操作。
上述方案在用户开启场景感知功能后,电子设备以较低功耗进行人物面部检测,在确定用户正在使用电子设备时进行场景检测,并基于场景检测结果自动设置系统参数或推送通知,实现电子设备智能感知场景的功能。由于摄像头和第二处理模块均采用低功耗配置,执行该方案的功耗极低。
图10为本申请实施例提供的一种场景感知方法的流程示意图。在图9所示实施例的基础上,如图10所示,本实施例提供的场景感知方法,包括:
步骤1010、目标应用在感知模块注册场景感知功能。
步骤1011、感知模块确定是否启动场景感知功能。
若感知模块确定启动场景感知功能,则执行步骤1012。
步骤1012、感知模块向第二处理模块发送第一指示,第一指示用于指示第二处理模块检测电子设备的摄像头范围内是否有人物面部。
步骤1013、第二处理模块确定是否满足第一条件。
若第二处理模块确定满足第一条件,则执行步骤1014。否则,第二处理模块持续低功耗检测,以确定是否满足第一条件。
本实施例中,第一条件包括以下至少一种:
电子设备的屏幕状态为亮屏状态;电子设备已解锁;电子设备的接近光传感器发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的屏幕朝向预设方向;电子设备处于移动状态。
本实施例中,若电子设备已开启场景感知功能,且满足上述的第一条件,则触发电子设备的摄像头持续采集第一图像。通过增设第一条件,以避免电子设备在非必要时持续采集第一图像,进一步降低设备功耗。
步骤1014、第二处理模块向摄像头发送第一拍摄指令,第一拍摄指令用于指示摄像头工作在第一模式。
在一些实施例中,若电子设备为可折叠设备,可折叠设备包括内屏和外屏,内屏对应设置有第一摄像头,外屏对应设置有第二摄像头。
一种情况,步骤1013可以替换为:第二处理模块检测到电子设备的内屏为亮屏状态,且电子设备处于展开状态。相应的,步骤1014可以是:第二处理模块向第一摄像头发送第一拍摄指令。在一些实施例中,第二处理模块向第一摄像头发送第一拍摄指令之前,还包括:检测到电子设备的状态满足第二条件,第二条件包括以下至少一项:电子设备已解锁;电子设备的接近光传感器(内屏上)发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的内屏朝向预设方向;电子设备处于移动状态。
一种情况,步骤1013可以替换为:第二处理模块检测到电子设备的外屏为亮屏状态,且电子设备处于折叠状态。相应的,步骤1014可以是:第二处理模块向第二摄像头发送第一拍摄指令。在一些实施例中,第二处理模块向第二摄像头发送第一拍摄指令之前,还包括:检测到电子设备的状态满足第二条件,第二条件包括以下至少一项:电子设备已解锁;电子设备的接近光传感器(外屏上)发射的光信号与光信号的反射信号的时间差大于第一阈值,和/或,反射信号的信号强度小于第二阈值,和/或,接收光传感器未接收到反射信号;电子设备的环境光传感器的检测数据大于第三阈值;电子设备的外屏朝向预设方向;电子设备处于移动状态。
步骤1015、摄像头向第二处理模块发送第一图像。
步骤1016、第二处理模块识别第一图像是否包含人物面部。
若第二处理模块在第一图像中识别到人物面部,则执行步骤1017;
若第二处理模块在第一图像中未识别到人物面部,则跳转回步骤1013。
步骤1017、第二处理模块向感知模块发送第一消息,第一消息用于通知感知模块电子设备摄像头范围内识别到人物面部。
步骤1018、感知模块向第一处理模块发送第二指示,第二指示用于指示第一处理模块识别摄像头范围内场景的类别。
步骤1019、第一处理模块响应于第二指示,向摄像头发送第二拍摄指令,第二拍摄指令用于指示摄像头工作在第二模式。
步骤1020、摄像头向第一处理模块发送第二图像。
步骤1021、第一处理模块基于第二图像识别用户使用电子设备的场景类别。
步骤1022、第一处理模块向感知模块发送第二消息,第二消息用于指示用户使用电子设备的场景类别。
步骤1023、感知模块向目标应用发送第三指示,第三指示用于指示用户使用电子设备的场景类别。
步骤1024、目标应用控制执行与场景类别对应的预设操作。
上述实施例中,若用户已开启场景感知功能,且满足上述的第一条件,感知模块可通过第二处理模块向摄像头发送第一拍摄指令,使得摄像头以较低的帧率和/或分辨率采集第一图像。若第二处理模块识别到第一图像中包含人物面部,可通知第一处理模块,由第一处理模块向摄像头发送第二拍摄指令,使得摄像头以较高的帧率和/或分辨率采集第二图像。 第一处理模块识别第二图像,确定第二图像对应的场景类别,并将场景类别告知应用,以便应用执行场景类别对应的预设操作,如自动设置系统参数或推送通知,实现设备智能感知场景的功能。一方面,由于摄像头和第二处理模块均采用低功耗配置,执行上述方案的功耗极低。另一方面,增设第一条件可避免电子设备进行非必要的人物面部检测,从而进一步降低设备功耗。
本申请实施例还提供一种场景感知方法,该方法应用于具有柔性屏幕的电子设备,下面以折叠屏手机为例进行方案说明,折叠屏手机的内屏和外屏上分别设置一组摄像头,用于采集图像数据。本实施例的场景感知方法涉及折叠屏状态变化时,设备底层模块的处理逻辑,下面结合附图14进行说明。
示例性的,图14为本申请实施例提供的一种场景感知方法的流程示意图。如图14所示,本实施例的场景感知方法,可以包括以下步骤:
步骤1401、第二处理模块获取传感器数据,以确定折叠屏的物理状态变化。
本实施例中,传感器数据包括例如磁传感器、霍尔传感器等,通过获取传感器数据确定折叠屏的物理状态是否发生变化。折叠屏的物理状态发生变化包括由折叠状态至展开状态,或者由展开状态至折叠状态。
步骤1402a、第二处理模块基于折叠屏的物理状态变化,控制开启或关闭内屏上的摄像头。
步骤1402b、第二处理模块基于折叠屏的物理状态变化,控制开启或关闭外屏上的摄像头。
在一种可能的实施方式中,若折叠屏的物理状态变化为由折叠状态至展开状态,第二处理模块可控制开启内屏上的摄像头,和/或,外屏上的摄像头。
在一种可能的实施方式中,外屏上的摄像头已开启,若折叠屏的物理状态变化为由折叠状态至展开状态,第二处理模块可控制关闭外屏上的摄像头,同时控制开启内屏上的摄像头。
在一种可能的实施方式中,内屏上的摄像头已开启,若折叠屏的物理状态变化为由展开状态至折叠状态,第二处理模块可控制关闭内屏上的摄像头,同时开启外屏上的摄像头。
示例性的,外屏上的摄像头可以是图13所示的摄像头1,内屏上的摄像头可以是图13所示的摄像头3。
步骤1402c、第二处理模块将折叠屏的物理状态以及内外屏摄像头的状态上报给第三处理模块。
需要指出的是,本实施例对步骤1402a至1402c的执行顺序不做任何限定。
步骤1403、第三处理模块向第一处理模块发送通知,通知内外屏摄像头的状态。
本实施例示出了折叠屏手机的屏幕的物理状态发生变化时,手机内部各个模块之间的交互过程,通过上述交互以实现对摄像头低功耗持续扫图功能的精确控制,使得用户在使用折叠屏手机时,手机能够智能识别手机当前所处场景,如教室、会议室、地铁站等,进而自动设置手机系统参数(如音量、震动强度等),提升用户的用机体验。
图11为本申请实施例提供的一种电子设备的结构示意图,如图11所示,电子设备包括摄像头1106,处理器1101,通信线路1104以及至少一个通信接口(图11中示例性的以通信接口1103为例进行说明)。
摄像头1106可用于采集不同帧率和/或分辨率的图像,处理器1101可用于检测图像中是否有人物面部,以及识别场景。
处理器1101可以是一个通用中央处理器(central processing unit,CPU),微处理器,特定应用集成电路(application-specific integrated circuit,ASIC),或一个或多个用于控制本申请方案程序执行的集成电路。在一些实施例中,处理器1101包括第一处理模块和第二处理模块,第一处理模块的功耗高于第二处理模块的功耗;第二处理模块可用于检测摄像头在第一模式下采集的第一图像中是否有人物面部,第一处理模块用于识别摄像头在第二模式下采集的第二图像,确定用户使用电子设备的场景类别。
通信线路1104可包括在上述组件之间传送信息的电路。
通信接口1103,使用任何收发器一类的装置,用于与其他设备或通信网络通信,如以太网,无线局域网(wireless local area networks,WLAN)等。
在一些实施例中,电子设备还可以包括存储器1102。
存储器1102可以是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compactdisc read-only memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。存储器可以是独立存在,通过通信线路1104与处理器相连接。存储器也可以和处理器集成在一起。
其中,存储器1102用于存储执行本申请方案的计算机执行指令,并由处理器1101来控制执行。处理器1101用于执行存储器1102中存储的计算机执行指令,从而实现本申请实施例所提供的场景感知方法。
在一些实施例中,电子设备还包括显示屏1107,显示屏1107可以是折叠屏。
本申请实施例中的计算机执行指令也可以称之为应用程序代码,本申请实施例对此不作具体限定。
作为一种示例,处理器1101可以包括一个或多个CPU,例如图11中的CPU0和CPU1。
作为一种示例,电子设备可以包括多个处理器,例如图11中的处理器1101和处理器1105。这些处理器中的每一个可以是一个单核(single-CPU)处理器,也可以是一个多核(multi-CPU)处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的处理核。
图12为本申请实施例提供的一种芯片的结构示意图。如图12所示,芯片120包括一个或两个以上(包括两个)处理器1220和通信接口1230。
在一些实施方式中,存储器1240存储了如下的元素:可执行模块或者数据结构,或者,可执行模块或者数据结构的子集,或者,可执行模块或者数据结构的扩展集。
本申请实施例中,存储器1240可以包括只读存储器和随机存取存储器,并向处理器1220提供指令和数据。存储器1240的一部分还可以包括非易失性随机存取存储器(non-volatile random access memory,NVRAM)。
本申请实施例中,处理器1220、通信接口1230以及存储器1240通过总线系统1210耦合在一起。其中,总线系统1210除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。为了便于描述,在图12中将各种总线都标为总线系统1210。
上述本申请实施例描述的方法可以应用于处理器1220中,或者由处理器1220实现。处理器1220可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器1220中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1220可以是通用处理器(例如,微处理器或常规处理器)、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门、晶体管逻辑器件或分立硬件组件,处理器1220可以实现或者执行本发明实施例中公开的各方法、步骤及逻辑框图。
上述实施例中,存储器存储的供处理器执行的指令可以以计算机程序产品的形式实现。计算机程序产品可以是事先写入在存储器中,也可以是以软件形式下载并安装在存储器中。
本申请实施例还提供一种计算机程序产品,计算机程序产品包括一个或多个计算机指令。在电子设备上加载和执行计算机程序指令时,使得电子设备执行上述实施例中的技术方案,其实现原理和技术效果与上述相关实施例类似,此处不再赘述。
计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL)或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。
本申请实施例还提供了一种计算机可读存储介质,计算机可读存储介质存储有计算机指令,当所述计算机指令在电子设备上运行时,使得所述电子设备执行上述实施例中的技术方案,其实现原理和技术效果与上述相关实施例类似,此处不再赘述。
计算机可读存储介质可以包括计算机存储介质和通信介质,还可以包括任何可以将计算机程序从一个地方传送到另一个地方的介质。计算机可读存储介质可以包括:紧凑型光盘只读储存器CD-ROM、RAM、ROM、EEPROM或其它光盘存储器;计算机可读存储介质可以包括磁盘存储器或其它磁盘存储设备。而且,任何连接线也可以被适当地称为计算机可读存储介质。例如,如果使用同轴电缆,光纤电缆,双绞线,DSL或无线技术(如红外,无线电和微波)从网站,服务器或其它远程源传输软件,则同轴电缆,光纤电缆,双绞线,DSL或诸如红外,无线电和微波之类的无线技术包括在介质的定义中。如本文所使用的磁盘和光盘包括光盘(CD),激光盘,光盘,数字通用光盘(digital versatile disc,DVD),软盘和蓝光盘,其中磁盘通常以磁性方式再现数据,而光盘利用激光光学地再现数据。
此外,需要说明的是,本申请所涉及的用户信息(包括但不限于用户设备信息、用户个人信息、用户面部信息等)和数据(包括但不限于用于分析的数据、存储的数据、展示的数据等),均为经用户授权或者经过各方充分授权的信息和数据,并且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准,并提供有相应的操作入口,供用户选择授权或者拒绝。
以上,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以权利要求的保护范围为准。

Claims (23)

  1. 一种场景感知方法,其特征在于,应用于电子设备,所述方法包括:
    所述电子设备的摄像头以第一模式运行;
    所述电子设备获取所述摄像头在所述第一模式下采集的第一图像,检测所述第一图像中是否有人物面部;
    若所述电子设备在所述第一图像中检测到人物面部,控制所述摄像头由所述第一模式切换至第二模式;
    所述电子设备获取检测数据,所述检测数据包括所述摄像头在所述第二模式下采集的第二图像;
    所述电子设备识别所述第二图像,基于所述第二图像确定所述人物使用所述电子设备的场景类别,控制执行与所述场景类别对应的预设操作。
  2. 根据权利要求1所述的方法,其特征在于,
    所述电子设备的摄像头以第一模式运行之前,所述方法还包括:
    电子设备响应于开启场景感知功能的第一操作。
  3. 根据权利要求1或2所述的方法,其特征在于,
    所述电子设备的摄像头以第一模式运行之前,所述方法还包括:
    检测到所述电子设备的状态满足第一条件;所述第一条件包括以下至少一项:
    所述电子设备的屏幕状态为亮屏状态;
    所述电子设备已解锁;
    所述电子设备的接近光传感器发射的光信号与所述光信号的反射信号的时间差大于第一阈值,和/或,所述反射信号的信号强度小于第二阈值,和/或,所述接收光传感器未接收到所述反射信号;
    所述电子设备的环境光传感器的检测数据大于第三阈值;
    所述电子设备的屏幕朝向预设方向;
    所述电子设备处于移动状态。
  4. 根据权利要求1或2所述的方法,其特征在于,所述电子设备为可折叠设备,所述可折叠设备包括内屏和外屏,所述内屏对应设置有第一摄像头,所述外屏对应设置有第二摄像头;所述电子设备的摄像头以第一模式运行,包括:
    检测到所述电子设备的外屏为亮屏状态,且所述电子设备处于折叠状态,控制所述第二摄像头以第一模式运行;或者
    检测到所述电子设备的内屏为亮屏状态,且所述电子设备处于展开状态,控制所述第一摄像头以第一模式运行。
  5. 根据权利要求4所述的方法,其特征在于,在控制所述第一摄像头以第一模式运行,或者控制所述第二摄像头以第一模式运行之前,所述方法还包括:
    检测到所述电子设备的状态满足第二条件;所述第二条件包括以下至少一项:
    所述电子设备已解锁;
    所述电子设备的接近光传感器发射的光信号与所述光信号的反射信号的时间差大于第一阈值,和/或,所述反射信号的信号强度小于第二阈值,和/或,所述接收光传感器未 接收到所述反射信号;
    所述电子设备的环境光传感器的检测数据大于第三阈值;
    所述电子设备的内屏或外屏朝向预设方向;
    所述电子设备处于移动状态。
  6. 根据权利要求4或5所述的方法,其特征在于,所述方法还包括:
    当所述第二摄像头以所述第一模式运行时,若检测到所述电子设备由折叠状态至展开状态时,所述电子设备控制所述第一摄像头以所述第一模式运行,并关闭所述第二摄像头;或者,所述电子设备控制所述第一摄像头以所述第一模式运行;
    当所述第一摄像头以所述第一模式运行时,若检测到所述电子设备由展开状态至折叠状态时,所述电子设备控制所述第一摄像头关闭,并控制所述第二摄像头以所述第一模式运行。
  7. 根据权利要求1至6任一项所述的方法,其特征在于,所述检测数据还包括时间数据,所述方法还包括:若确定所述时间数据在预设时间段内,将所述预设时间段对应的预设场景类别作为所述电子设备的场景类别。
  8. 根据权利要求1至7任一项所述的方法,其特征在于,所述检测数据还包括位置数据,所述方法还包括:若确定所述位置数据在预设位置范围内,将所述预设位置范围对应的预设场景类别作为所述电子设备的场景类别。
  9. 根据权利要求1至8任一项所述的方法,其特征在于,所述检测数据还包括语音数据,所述方法还包括:若识别到所述语音数据中包含一个声源或少于N个声源,确定所述电子设备的场景类别为第一场景,N为大于或等于2的正整数;若识别到所述语音数据中包含大于M个声源,确定所述电子设备的场景类别为第二场景,M为大于N的正整数。
  10. 根据权利要求1至9任一项所述的方法,其特征在于,所述检测数据还包括所述电子设备的第一传感器的数据,所述第一传感器包括陀螺仪传感器和加速度传感器;所述方法还包括:
    所述电子设备基于所述检测数据中所述第二图像以及所述第一传感器的数据,确定所述电子设备的场景类别。
  11. 根据权利要求10所述的方法,其特征在于,所述电子设备基于所述检测数据中所述第二图像以及所述第一传感器的数据,确定所述电子设备的场景类别,包括:
    若基于所述第一传感器的数据确定用户处于运动状态,且基于所述第二图像确定用户持续注视所述电子设备的屏幕,确定所述电子设备的场景类别为第三场景;
    所述运动状态包括步行或骑行状态。
  12. 根据权利要求1至11任一项所述的方法,其特征在于,所述检测数据还包括所述电子设备的第二传感器的数据,所述第二传感器包括环境光传感器;所述方法还包括:
    若确定所述第二传感器的数据小于第四阈值,确定所述电子设备的场景类别为第四场景。
  13. 根据权利要求1至12任一项所述的方法,其特征在于,
    所述预设操作包括以下至少一种:
    调节音量大小;
    调节屏幕亮度;
    调节屏幕蓝光;
    调节震动强度;
    发送第一信息,第一信息用于提醒用户停止使用所述电子设备;
    发送第二信息,第二信息用于推荐与所述场景类别对应的内容;
    开启后置摄像头,用于检测障碍物。
  14. 根据权利要求13所述的方法,其特征在于,若所述预设操作为开启所述后置摄像头,所述方法还包括:
    所述电子设备获取所述后置摄像头在所述第二模式下采集的第三图像;
    若识别到所述第三图像存在障碍物,发送第三信息,所述第三信息用于提醒用户避开所述障碍物。
  15. 根据权利要求1至14任一项所述的方法,其特征在于,
    所述电子设备的摄像头以第一模式运行,包括:
    所述电子设备的感知模块向所述电子设备的第二处理模块发送第一指示,所述第一指示用于指示所述第二处理模块检测所述摄像头范围内是否有人物面部;
    所述第二处理模块向所述摄像头发送第一拍摄指令;
    所述摄像头响应于所述第一拍摄指令,以所述第一模式运行。
  16. 根据权利要求1至15任一项所述的方法,其特征在于,所述电子设备获取所述摄像头在所述第一模式下采集的第一图像,检测所述第一图像中是否有人物面部,包括:
    所述电子设备的第二处理模块获取所述摄像头在所述第一模式下采集的所述第一图像,检测所述第一图像中是否有人物面部。
  17. 根据权利要求1至16任一项所述的方法,其特征在于,若所述电子设备在所述第一图像中检测到人物面部,控制所述摄像头由所述第一模式切换至第二模式,包括:
    若所述电子设备的第二处理模块检测到所述第一图像中有人物面部,所述第二处理模块向所述电子设备的感知模块发送第一消息,所述第一消息用于通知所述感知模块所述摄像头范围内有人物面部;
    所述感知模块向所述电子设备的第一处理模块发送第二指示,所述第二指示用于指示所述第一处理模块识别所述摄像头范围内场景的类别;
    所述第一处理模块响应于所述第二指示,向所述摄像头发送第二拍摄指令,所述第二拍摄指令用于指示所述摄像头以所述第二模式运行。
  18. 根据权利要求1至17任一项所述的方法,其特征在于,所述电子设备识别所述第二图像,基于所述第二图像确定所述人物使用所述电子设备的场景类别,控制执行与所述场景类别对应的预设操作,包括:
    所述电子设备的第一处理模块识别所述第二图像,基于所述第二图像确定所述电子设备的场景类别,向所述电子设备的感知模块发送第二消息,所述第二消息用于指示所述电子设备的场景类别;
    所述感知模块向所述电子设备的目标应用发送第三指示,所述第三指示用于指示所述电子设备的场景类别;
    所述目标应用控制执行与所述场景类别对应的预设操作。
  19. 根据权利要求3至6任一项所述的方法,其特征在于,所述电子设备的第二处理 模块检测所述电子设备的状态。
  20. 一种电子设备,其特征在于,所述电子设备包括:摄像头,存储器和处理器;
    所述摄像头用于采集不同帧率和/或分辨率的图像;所述处理器用于调用所述存储器中的计算机程序,以执行如权利要求1至19任一项所述的场景感知方法。
  21. 根据权利要求20所述的电子设备,其特征在于,所述处理器包括第一处理模块和第二处理模块,所述第一处理模块的功耗高于所述第二处理模块的功耗;
    所述第二处理模块用于检测所述摄像头在第一模式下采集的第一图像中是否有人物面部;所述第一处理模块用于识别所述摄像头在第二模式下采集的第二图像,确定用户使用所述电子设备的场景类别。
  22. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机指令,当所述计算机指令在电子设备上运行时,使得所述电子设备执行如权利要求1至19任一项所述的场景感知方法。
  23. 一种芯片,其特征在于,所述芯片包括处理器,所述处理器用于调用存储器中的计算机程序,以执行如权利要求1至19任一项所述的场景感知方法。
PCT/CN2023/125982 2022-11-30 2023-10-23 场景感知方法、设备及存储介质 WO2024114170A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211521455.9A CN118118775A (zh) 2022-11-30 2022-11-30 场景感知方法、设备及存储介质
CN202211521455.9 2022-11-30

Publications (2)

Publication Number Publication Date
WO2024114170A1 true WO2024114170A1 (zh) 2024-06-06
WO2024114170A9 WO2024114170A9 (zh) 2024-08-15

Family

ID=91211109

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/125982 WO2024114170A1 (zh) 2022-11-30 2023-10-23 场景感知方法、设备及存储介质

Country Status (2)

Country Link
CN (1) CN118118775A (zh)
WO (1) WO2024114170A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104076898A (zh) * 2013-03-27 2014-10-01 腾讯科技(深圳)有限公司 一种控制移动终端屏幕亮度的方法和装置
CN106303088A (zh) * 2016-09-30 2017-01-04 努比亚技术有限公司 提示调节方法及移动终端
CN110310668A (zh) * 2019-05-21 2019-10-08 深圳壹账通智能科技有限公司 静音检测方法、系统、设备及计算机可读存储介质
CN111163650A (zh) * 2017-09-15 2020-05-15 深圳传音通讯有限公司 一种基于智能终端的提醒方法和提醒系统
US20200184968A1 (en) * 2017-04-24 2020-06-11 Lg Electronics Inc. Artificial intelligence device
CN114257670A (zh) * 2022-02-28 2022-03-29 荣耀终端有限公司 一种具有折叠屏的电子设备的显示方法


Also Published As

Publication number Publication date
CN118118775A (zh) 2024-05-31
WO2024114170A9 (zh) 2024-08-15

Similar Documents

Publication Publication Date Title
CN111919433B (zh) 用于操作移动相机以用于低功率使用的方法和装置
EP4030422B1 (en) Voice interaction method and device
WO2021104104A1 (zh) 一种高能效的显示处理方法及设备
WO2021147396A1 (zh) 图标管理方法及智能终端
KR102484738B1 (ko) 어플리케이션 권한을 관리하는 방법 및 전자 장치
US12020620B2 (en) Display method, electronic device, and computer storage medium
WO2023000772A1 (zh) 模式切换方法、装置、电子设备及芯片系统
WO2022037398A1 (zh) 一种音频控制方法、设备及系统
CN110401768A (zh) 调节电子设备的工作状态的方法和装置
EP4407453A1 (en) Application running method and related device
US9332580B2 (en) Methods and apparatus for forming ad-hoc networks among headset computers sharing an identifier
WO2023226719A1 (zh) 一种识别终端状态的方法和装置
WO2024114170A1 (zh) 场景感知方法、设备及存储介质
EP4273695A1 (en) Service anomaly warning method, electronic device and storage medium
WO2022152174A1 (zh) 一种投屏的方法和电子设备
WO2024114137A1 (zh) 人群识别方法、设备及存储介质
WO2024114115A9 (zh) 手势感知方法、设备及存储介质
CN118118776A (zh) 疲劳感知方法、设备及存储介质
CN118114695A (zh) 扫码方法、设备及存储介质
WO2023016347A1 (zh) 声纹认证应答方法、系统及电子设备
WO2024088073A1 (zh) 一种体感交互方法、电子设备、系统及可读存储介质
EP4266155A1 (en) Skin care check-in method and electronic device
EP4246308A1 (en) Always-on display control method, electronic device and storage medium
WO2024083031A1 (zh) 一种显示方法、电子设备和系统
WO2023207799A1 (zh) 消息处理方法和电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23896325

Country of ref document: EP

Kind code of ref document: A1