WO2024017467A1 - Apparatus and method for processing a visual content - Google Patents

Apparatus and method for processing a visual content

Info

Publication number
WO2024017467A1
WO2024017467A1 (PCT/EP2022/070314)
Authority
WO
WIPO (PCT)
Prior art keywords
information
visual content
reflective surface
target
user
Prior art date
Application number
PCT/EP2022/070314
Other languages
French (fr)
Inventor
Baiqiang XIA
Jian Song
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2022/070314
Publication of WO2024017467A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00: 2D [Two Dimensional] image generation
    • G06T 11/60: Editing figures and text; Combining figures or text

Definitions

  • the present disclosure relates generally to the field of data processing, and particularly to an apparatus and method for processing a visual content for the purpose of determining whether target information shown on one or more reflective surfaces present in the visual content is to be removed or not.
  • Reflections or, in other words, reflective surfaces commonly exist in our daily life environment. Some examples of the reflective surfaces include mirrors, glasses, windows, sunglasses, screens, polished metal, or ceramic surfaces, or even water marks on the floor.
  • the reflective surfaces can unexpectedly capture some visual information from surrounding objects, and result in unexpected information leakage.
  • the eyeglasses of a user participating in a video conference may reflect objects shown on his/her computer screen, which may contain business secrets or may be a personal ID/bank card or a business contract hardcopy, or may reflect the user's hands typing a password on the keyboard.
  • a mirror shown in a selfie photo may reflect faces of other persons or a product prototype which is not supposed to be exposed to the public, or a window presented in the field of view of a camera may reflect the structure and ongoing activities taking place in the office.
  • the reflective surfaces cause significant information leakage risks when some visual information is shared among users, which may in turn cause huge personal or business damages to the users (e.g., service providers may lose trust from the users and eventually lose their market share).
  • This problem becomes more severe when considering more general cases of reflection, which are not easily noticed by users and are harder to avoid. For example, reflections from the outer surface of a coffee cup or a shiny plastic bag may be able to tell if there is a person/object nearby. In extreme cases, even diffused reflections on a white wall can be used to reconstruct a light field.
  • an apparatus for processing a visual content comprises a memory and a processor coupled to the memory.
  • the memory stores processor-executable instructions which, when executed by the processor, cause the processor to operate as follows.
  • the processor receives the visual content.
  • the processor determines whether a reflective surface is present in the visual content.
  • the processor determines whether target information is shown on the reflective surface.
  • the processor provides prompt information to a user, which comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation.
  • the processor receives a user input from the user, which indicates whether the target information is to be removed by using the editing operation.
  • the apparatus thus configured may promptly inform the user of the presence of the target information shown on the reflective surface, so that the user could decide on retaining or removing the target information from the visual content.
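To make the flow above concrete, the following minimal Python sketch organizes the receive/detect/prompt/confirm sequence. It is not the claimed implementation: the detector, editor, and confirmation callables are hypothetical placeholders for the machinery described later in this disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Target:
    box: tuple   # (x, y, w, h) of the reflective surface in the image
    label: str   # e.g. "cat", "bank card"

def process_visual_content(image,
                           find_surfaces: Callable,   # step S204 detector
                           find_target: Callable,     # step S206 detector
                           edit: Callable,            # chosen editing operation
                           confirm: Callable):
    """Receive the content, detect surfaces and targets, prompt, then act."""
    for box in find_surfaces(image):                  # step S204
        target: Optional[Target] = find_target(image, box)  # step S206
        if target is None:
            continue
        prompt = (f"Target information ({target.label!r}) detected! "
                  "Remove the target information?")   # step S208
        if confirm(prompt):                           # user input, step S210
            image = edit(image, target)
    return image
```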
  • the processor is further configured to obtain a preview version of the visual content without the target information by using the editing operation. Then, the processor is configured to provide, together with the prompt information, the preview version of the visual content to the user. By using such a preview version of the visual content, the user may check the result of applying the editing operation to the visual content before making a final decision on the target information, which may be beneficial in some use scenarios.
  • the memory further stores a set of target-information types and a set of risk levels each associated with one information type of the set of target-information types.
  • in this embodiment, when the target information is shown on the reflective surface, the processor is further configured to operate as follows. The processor finds, in the set of target-information types, a target-information type which the target information belongs to. Then, the processor finds, in the set of risk levels, a risk level associated with the target-information type. Next, the processor provides, together with the prompt information, the target-information type and the risk level to the user. By knowing the type and risk level of the target information shown on the reflective surface, the user may properly decide whether the target information is to be removed or not.
  • the set of target-information types comprises personal data, user activity-related information, and a company-related unique identifying symbol.
  • the apparatus according to the first aspect may analyze various information shown on the reflective surface, which may make it more flexible in use.
  • the processor is further configured to magnify a piece of the visual content which contains the reflective surface and provide, together with the prompt information, the magnified piece of the visual content to the user.
  • the apparatus according to the first aspect may allow the user to examine the target information shown on the reflective surface in minute detail.
  • the processor is configured to determine whether the reflective surface is present in the visual content and whether the target information is shown on the reflective surface by using a machine learning algorithm. By using the machine learning algorithm, the apparatus according to the first aspect may operate more efficiently.
  • the editing operation comprises at least one of an operation of painting over the reflective surface in the visual content; an operation of blurring the reflective surface in the visual content; an operation of cutting off a piece of the visual content which contains the reflective surface; and an operation of replacing the target information with public information on the reflective surface in the visual content.
  • the cutting-off operation may be used when the reflective surface showing the target information in an image is at the boundary of the image.
  • the blurring and coloring operations may be used when there are no strict requirements for the quality of the visual content (e.g., the quality of an image).
  • when the quality of the visual content is of great importance, the target information (e.g., a new company logo which is reflected in a mirror shown in an image) may be replaced with public information (e.g., an old company logo already known to the public), as illustrated in the sketch below.
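As a concrete illustration only, this sketch implements the four editing operations with OpenCV and NumPy; the bounding box of the reflective surface and the public replacement patch are assumed inputs from the detection step, and none of the function names are part of the disclosure. A BGR color image is assumed.

```python
import cv2
import numpy as np

def paint_over(img, box, color=(128, 128, 128)):
    """Paint over the reflective surface with a solid color."""
    x, y, w, h = box
    out = img.copy()
    out[y:y + h, x:x + w] = color
    return out

def blur(img, box, ksize=51):
    """Blur the reflective surface (kernel size must be odd)."""
    x, y, w, h = box
    out = img.copy()
    out[y:y + h, x:x + w] = cv2.GaussianBlur(out[y:y + h, x:x + w],
                                             (ksize, ksize), 0)
    return out

def cut_off(img, box):
    """Crop away the image piece containing the reflective surface,
    keeping the larger remaining side (useful when the surface sits
    at the image boundary)."""
    x, y, w, h = box
    left_width, right_width = x, img.shape[1] - (x + w)
    return img[:, :x] if left_width > right_width else img[:, x + w:]

def replace_with_public(img, box, public_patch):
    """Replace the target information with public imagery."""
    x, y, w, h = box
    out = img.copy()
    out[y:y + h, x:x + w] = cv2.resize(public_patch, (w, h))
    return out
```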
  • a method for processing a visual content starts with the step of receiving the visual content. Then, the method proceeds to the step of determining whether a reflective surface is present in the visual content. When the reflective surface is present in the visual content, the method goes on to the step of determining whether target information is shown on the reflective surface. When the target information is shown on the reflective surface, the method proceeds to the step of providing prompt information to a user, which comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation. After that, the method goes on to the step of receiving a user input from the user, which indicates whether the target information is to be removed by using the editing operation.
  • the method further comprises the steps of obtaining a preview version of the visual content without the target information by using the editing operation and providing, together with the prompt information, the preview version of the visual content to the user.
  • the user may check the result of applying the editing operation to the visual content before making a final decision on the target information, which may be beneficial in some use scenarios.
  • the method further comprises, upon determining that the target information is shown on the reflective surface, the following steps of finding, in a pre-stored set of target-information types, a target-information type which the target information belongs to; finding, in a pre-stored set of risk levels, a risk level associated with the target-information type; and providing, together with the prompt information, the target-information type and the risk level to the user.
  • the set of target-information types comprises personal data, user activity-related information, and a company-related unique identifying symbol.
  • the method according to the second aspect may be used to analyze various information shown on the reflective surface, which may make it more flexible in use.
  • the method further comprises the steps of magnifying a piece of the visual content which contains the reflective surface and providing, together with the prompt information, the magnified piece of the visual content to the user. Said magnification may allow the user to examine the target information shown on the reflective surface in minute detail.
  • said determining whether the reflective surface is present in the visual content and said determining whether the target information is shown on the reflective surface are performed by using a machine learning algorithm.
  • the method according to the second aspect may be performed more efficiently.
  • the editing operation comprises at least one of: an operation of painting over the reflective surface in the visual content; an operation of blurring the reflective surface in the visual content; an operation of cutting off a piece of the visual content which contains the reflective surface; and an operation of replacing the target information with public information on the reflective surface in the visual content. This may make the method according to the second aspect more flexible in use.
  • the cutting-off operation may be used when the reflective surface showing the target information in an image is at the boundary of the image.
  • the blurring and coloring operations may be used when there are no strict requirements for the quality of the visual content (e.g., the quality of an image).
  • when the quality of the visual content is of great importance, the target information (e.g., a new company logo which is reflected in a mirror shown in an image) may be replaced with public information (e.g., an old company logo already known to the public).
  • a computer program product comprises a computer-readable storage medium that stores a computer code. Being executed by at least one processor, the computer code causes the at least one processor to perform the method according to the second aspect.
  • FIG. 1 shows a block diagram of an apparatus for processing a visual content in accordance with one example embodiment
  • FIG. 2 shows a flowchart of a method for operating the apparatus of FIG. 1 in accordance with one example embodiment
  • FIG. 3 schematically shows a visual content containing a reflection in accordance with one exemplary embodiment
  • FIG. 4 schematically shows prompt information which a user receives during the processing of the visual content of FIG. 3 in accordance with the method of FIG. 2;
  • FIG. 5 schematically shows the result of processing of the visual content of FIG. 3 in accordance with the method of FIG. 2;
  • FIG. 6 shows a block diagram of a wireless communication system in accordance with one exemplary embodiment.
  • a visual content may refer to a picture, photo, image, image frame, diagram (e.g., flowchart, block diagram, circuit diagram), chart (e.g., bar chart, line chart, map, etc.), infographic, video (e.g., video clip, online video), video frame, screenshot, meme, slide deck, or any combination thereof.
  • in the case of a video, it may be decomposed into video frames, each of which may be processed individually in accordance with the aspects of the present disclosure, as will be discussed below in detail.
  • a reflective surface may refer to a surface that is able to bounce light, thereby forming a reflection that may contain visual information.
  • the reflective surface may include, but is not limited to, a water surface, ocular surface, white-paper surface, colored-object surface, metal surface, mirror surface, glass surface, etc. It should be noted that the present disclosure is not limited to a certain type of reflections formed by the reflective surface. In other words, all types of reflections that may be used to recover, partly or fully, the visual information shown on the reflective surface (e.g., regular and irregular reflections) are intended to fall within the scope of our considerations. When a scene to be imaged/captured by a camera contains one or more reflective surfaces, the visual information shown on the reflective surface(s) will be included in a resulting visual content (e.g., a resulting image).
  • target information may refer to one or more certain types of visual information shown on reflective surfaces.
  • the target information may be defined by a user and may comprise, but not limited to, restricted information, secret information, and confidential information. Therefore, it is highly desirable to avoid inadvertently disclosing the target information in any visual content.
  • the exemplary embodiments disclosed herein provide a technical solution that allows mitigating or even eliminating the above-mentioned drawbacks peculiar to the prior art.
  • the technical solution involves providing prompt information to a user in response to detecting target information on one or more reflective surfaces present in a visual content.
  • the prompt information indicates that the target information is shown on the reflective surface(s), as well as requests the user to confirm whether the target information is to be removed by using an editing operation.
  • the user can be promptly informed of the presence of the target information in the visual content, so that he/she could decide on retaining or removing the target information from the visual content. By so doing, it is possible to reduce the risks of visual information leakage through the reflective surface(s) present in the visual content.
  • the apparatus 100 may be implemented as an individual device or may be part of a User Equipment (UE).
  • the UE may refer to an electronic computing device that is configured to perform wireless communications.
  • the UE may be implemented as a mobile station, a mobile terminal, a mobile subscriber unit, a mobile phone, a cellular phone, a smart phone, a cordless phone, a personal digital assistant (PDA), a wireless communication device, a desktop computer, a laptop computer, a tablet computer, a gaming device, a netbook, a smartbook, an ultrabook, a medical mobile device or equipment, a biometric sensor, a wearable device (e.g., a smart watch, smart glasses, a smart wrist band, etc.), an entertainment device (e.g., an audio player, a video player, etc.), a vehicular component or sensor (e.g., a driver-assistance system), a smart meter/sensor, an unmanned vehicle (e.g., an industrial robot, a quadcopter, etc.) and its component (e.g., a self-driving car computer), industrial manufacturing equipment, a global positioning system (GPS) device, an Internet-of-Things (IoT) device, an Industrial IoT (IIoT) device, a machine-type communication (MTC) device, a group of Massive IoT (MIoT) or Massive MTC (mMTC) devices/sensors, or any other suitable mobile device configured to support wireless communications.
  • the apparatus 100 comprises a processor 102 and a memory 104.
  • the memory 104 stores processor-executable instructions 106 which, when executed by the processor 102, cause the processor 102 to perform the aspects of the present disclosure, as will be described below in more detail.
  • the number, arrangement, and interconnection of the constructive elements constituting the apparatus 100, which are shown in FIG. 1, are not intended to be any limitation of the present disclosure, but merely used to provide a general idea of how the constructive elements may be implemented within the apparatus 100.
  • the processor 102 may be replaced with several processors, as well as the memory 104 may be replaced with several removable and/or fixed storage devices, depending on particular applications.
  • the apparatus 100 when the apparatus 100 is implemented as an individual device, it may further comprise a transceiver (not shown in FIG. 1) configured to perform wireless communications, for example, with a UE (e.g., in order to transmit the results of processing the visual content to the UE).
  • the transceiver itself may be implemented as two individual devices, with one for a receiving operation and another for a transmitting operation. Irrespective of its implementation, the transceiver is intended to be capable of performing different operations required to perform the data reception and transmission, such, for example, as signal modulation/demodulation, encoding/decoding, etc.
  • the processor 102 may be implemented as a CPU, general-purpose processor, single-purpose processor, microcontroller, microprocessor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), digital signal processor (DSP), complex programmable logic device, etc. It should be also noted that the processor 102 may be implemented as any combination of one or more of the aforesaid. As an example, the processor 102 may be a combination of two or more microprocessors.
  • the memory 104 may be implemented as a classical nonvolatile or volatile memory used in modern electronic computing machines.
  • the nonvolatile memory may include Read-Only Memory (ROM), ferroelectric Random-Access Memory (RAM), Programmable ROM (PROM), Electrically Erasable PROM (EEPROM), solid state drive (SSD), flash memory, magnetic disk storage (such as hard drives and magnetic tapes), optical disc storage (such as CD, DVD and Blu-ray discs), etc.
  • as for the volatile memory, examples thereof include Dynamic RAM, Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Static RAM, etc.
  • the processor-executable instructions 106 stored in the memory 104 may be configured as a computer-executable program code which causes the processor 102 to perform the aspects of the present disclosure.
  • the computer-executable program code for carrying out operations or steps for the aspects of the present disclosure may be written in any combination of one or more programming languages, such as Java, C++, or the like.
  • the computer-executable program code may be in the form of a high-level language or in a pre-compiled form and be generated by an interpreter (also pre-stored in the memory 104) on the fly.
  • FIG. 2 shows a flowchart of a method 200 for operating the apparatus 100 in accordance with one example embodiment.
  • when the apparatus 100 is part of the UE, the method 200 is performed locally in the UE.
  • when the apparatus 100 is implemented as an individual device (e.g., another UE or a remote server) configured to perform wired or wireless communications with the UE, the method 200 is performed remotely relative to the UE.
  • the method 200 starts with a step S202, in which the processor 102 receives a visual content.
  • the visual content may be an image just captured by a camera of a UE (e.g., smartphone) or an image downloaded from a database of images stored locally in the UE or remotely in another UE or a server.
  • other types of the visual content are also possible, such, for example, as video clips, diagrams, etc.
  • the method 200 proceeds to a step S204, in which the processor 102 determines whether a reflective surface is present in the visual content.
  • the processor 102 may perform the step S204 by using a machine learning algorithm, such as a neural network (e.g., Region-based Convolutional Neural Network (R-CNN), Fast R-CNN, Faster R-CNN, You Only Look Once (YOLO) framework, etc.).
  • standard computer vision pipelines for object detection, such as instance-level object segmentation based on the Mask R-CNN or U-Net frameworks, may also be used.
  • alternatively, the processor 102 may perform the step S204 by using handcrafted features and rules defined by human experts.
  • contextual information may be formulated as an auxiliary task to improve said determination in the step S204.
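By way of illustration, a detection step of this kind could be sketched with a torchvision Faster R-CNN. A stock COCO-pretrained model has no reflective-surface classes, so the sketch assumes a hypothetical checkpoint fine-tuned on labels such as mirror, window, and eyeglasses.

```python
import torch
import torchvision

# Assumption: a Faster R-CNN fine-tuned so that its classes cover
# reflective surfaces (background + "mirror", "window", "eyeglasses").
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=4)
# Hypothetical fine-tuned checkpoint, not an artifact of this disclosure.
model.load_state_dict(torch.load("reflective_surface_detector.pt"))
model.eval()

def detect_reflective_surfaces(image_tensor, score_threshold=0.5):
    """Return bounding boxes/labels of likely reflective surfaces (step S204).
    `image_tensor` is a CxHxW float tensor with values in [0, 1]."""
    with torch.no_grad():
        output = model([image_tensor])[0]
    keep = output["scores"] > score_threshold
    return output["boxes"][keep], output["labels"][keep]
```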
  • the method 200 goes on to a step S206, in which the processor 102 determines whether target information is shown on the reflective surface.
  • the target information may be restricted information, confidential information or any other information that should not be made public (at least temporarily).
  • the target information may be user-defined. Thus, a user may decide which object or concept shown on the reflective surface is to be removed. For example, if the user does not want others to know he/she has a cat, he/she may indicate the reflection of the cat as the target information.
  • the step S206 may also be performed by the processor 102 with the aid of a suitable machine learning algorithm, such as a properly trained neural network (e.g., a CNN).
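One plausible realization of the step S206, sketched below, crops each detected surface and runs a small image classifier over the crop; the classifier and its label set are assumptions standing in for the "properly trained neural network" mentioned above.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

# Assumption: `classifier` is a CNN trained on the user-defined target
# categories, with index 0 meaning "no target information".

def target_on_surface(image_tensor, box, classifier, labels):
    """Classify what is shown on one detected reflective surface (step S206).
    Returns None when no target information is recognized."""
    x1, y1, x2, y2 = [int(v) for v in box]
    crop = TF.resized_crop(image_tensor, y1, x1, y2 - y1, x2 - x1, [224, 224])
    logits = classifier(crop.unsqueeze(0))       # add a batch dimension
    probs = F.softmax(logits, dim=1)[0]
    idx = int(probs.argmax())
    return None if idx == 0 else (labels[idx], float(probs[idx]))
```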
  • the method 200 proceeds to a step S208, in which the processor 102 provides (e.g., outputs, displays or transmits) prompt information to the user.
  • the prompt information may be outputted to the user as a notification or message via a dedicated mobile application installed on a smartphone (the mobile application may be also used by the user to select the visual content to be processed and define the target information).
  • the prompt information comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation.
  • the editing operation may be an operation of painting over the reflective surface in the visual content, an operation of blurring the reflective surface in the visual content, an operation of cutting off a piece of the visual content which contains the reflective surface, an operation of replacing the target information with public information on the reflective surface in the visual content, or any combination of these operations.
  • These editing operations may be used depending on requirements for the naturalness or, in other words, quality of the visual content to be processed.
  • the prompt information may comprise the list of the above-indicated editing operations, if required.
  • different visual generation algorithms may be adopted, which are configured to use a visual content with one or more reflections as input data to obtain the visual content without the reflection(s) (i.e., with the reflection(s) removed/minimized) as output data. These algorithms may also be trained considering the naturalness of the visual content, e.g., via adversarial loss, to make the processed visual content perceptually realistic.
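The adversarial-naturalness idea can be illustrated with a standard reconstruction-plus-adversarial generator objective in PyTorch; the generator and discriminator architectures are left unspecified, and the whole block is an assumed sketch rather than the algorithm of this disclosure.

```python
import torch
import torch.nn.functional as F

def generator_loss(generator, discriminator, reflected, clean,
                   adv_weight=0.01):
    """L1 reconstruction plus adversarial term: the edited output should be
    close to the reflection-free target and also perceptually realistic."""
    fake = generator(reflected)
    recon = F.l1_loss(fake, clean)
    d_fake = discriminator(fake)
    # Non-saturating GAN loss: push the discriminator to score the
    # generator's output as real.
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return recon + adv_weight * adv
```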
  • the method 200 ends with a step S210, in which the processor 102 receives a user input from the user.
  • the user input indicates whether the target information is to be removed by using the editing operation. If the prompt information also indicates some or all of the above-indicated editing operations, the user input may further indicate which of them to use (of course, if the user has decided to remove the target information from the visual content).
  • once the visual content is processed by using the editing operation, it is ready for further or future usage (e.g., the visual content may be stored locally in a photo album in the UE or may be uploaded into a remote server or shared with other users via a wireless network).
  • the method 200 may comprise an additional step (before the step S208), in which the processor 102 obtains a preview version of the visual content without the target information by using the editing operation.
  • the processor 102 may be configured to select one of the above-described editing operations on its own based on predefined (e.g., by the user via the dedicated mobile application) requirements for the quality of the visual content (e.g., in the form of “high quality”, “medium quality”, or “low quality”).
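Such automatic selection could reduce to a simple lookup from the predefined quality requirement to a default editing operation, as in this sketch; the particular mapping is an illustrative assumption consistent with the usage guidance above.

```python
# Hypothetical mapping from the user's predefined quality requirement to a
# default editing operation: cheap edits when quality does not matter,
# content replacement when it does.
DEFAULT_OPERATION = {
    "low quality": "paint_over",
    "medium quality": "blur",
    "high quality": "replace_with_public",
}

def select_editing_operation(quality_requirement: str) -> str:
    """Pick an editing operation from the predefined quality requirement."""
    return DEFAULT_OPERATION.get(quality_requirement, "blur")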
  • the method 200 may comprise an additional step (before the step S208), in which the processor 102 magnifies a piece of the visual content which contains the reflective surface. Subsequently, the processor 102 may provide the magnified piece of the visual content, together with the prompt information, to the user in the step S208. Said magnification may be required when there are small objects shown on the reflective surface, such as jewels, small-font text, etc.
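Said magnification is essentially a crop-and-upscale; an illustrative sketch (not the claimed method) with OpenCV:

```python
import cv2

def magnify_piece(img, box, factor=4):
    """Crop the piece containing the reflective surface and enlarge it so
    that small reflected objects (jewels, small-font text) become legible."""
    x, y, w, h = box
    crop = img[y:y + h, x:x + w]
    return cv2.resize(crop, (w * factor, h * factor),
                      interpolation=cv2.INTER_CUBIC)
```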
  • the method 200 may comprise an additional step (before the step S208), in which the processor 102 determines the type and risk level of the target information shown on the reflective surface.
  • the memory 104 may further store a set of target-information types and a set of risk levels each associated with one information type of the set of target-information types. Both the sets may be defined by the user.
  • the target-information types comprise, but are not limited to, personal data, user activity-related information, and a company-related unique identifying symbol.
  • the personal data also referred to as personally identifiable information, may refer to any information related to an identifiable person, such as an identification number (e.g., social security number, bank card number, etc.), an identification card (e.g., bank card, visiting card, access badge, driving license, etc.), location data (e.g., home address, current location data, etc.), a name, an online identifier (e.g., IP address, nickname, etc.), health information, an economic, cultural, or social identity of the person, or any combination thereof.
  • the user activity-related information may refer to any scene and activity in which one or more users of interest are involved (e.g., a movie scene, closed training of athletes, etc.).
  • the company-related unique identifying symbol may comprise a company logo, brand, business identity code, etc.
  • as for the risk levels, they may be presented by using a certain pattern of color coding (e.g., “red”, “orange” and “green” colors may correspond to high, medium and low risk levels, respectively) and/or character coding (e.g., “A”, “B” and “C” letters may correspond to the high, medium and low risk levels, respectively).
  • the processor 102 finds, in the set of target-information types, a target-information type which the target information belongs to.
  • the processor 102 finds, in the set of risk levels, a risk level associated with the target-information type.
  • the processor 102 may provide, together with the prompt information, the target-information type and the risk level to the user in the step S208. It should be noted that if there are two or more target-information types shown on the reflective surface, the processor 102 may sum their respective risk levels to obtain a combined risk level, which may be then provided to the user (e.g., by using a certain pattern of color coding or character coding). If the combined risk level is above a threshold (e.g., risk level “A”), the user may immediately decide to remove the target information.
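The type/risk-level lookup, the summation into a combined risk level, and the threshold check could be sketched as follows; the numeric scores standing in for the "A"/"B"/"C" coding are illustrative assumptions.

```python
# Illustrative user-defined sets; the numeric scores stand in for the
# high/medium/low ("A"/"B"/"C", red/orange/green) coding described above.
RISK_LEVELS = {
    "personal data": 3,
    "user activity-related information": 2,
    "company-related unique identifying symbol": 1,
}

def combined_risk(detected_types, threshold=3):
    """Sum the risk levels of all detected target-information types and
    flag when the combined level reaches the threshold (e.g. level "A")."""
    total = sum(RISK_LEVELS.get(t, 0) for t in detected_types)
    return total, total >= threshold

# Example: a bank card and a company logo reflected in the same mirror
# would combine to level 4, above the default threshold.
level, remove_now = combined_risk(
    ["personal data", "company-related unique identifying symbol"])
```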
  • FIG. 3 schematically shows a visual content containing a reflection in accordance with one exemplary embodiment. More specifically, the visual content is displayed on a touchscreen of a smartphone 300 as an image 302, in which a user 304 is shown as looking at himself/herself in a mirror 306.
  • the image 302 is captured (e.g., by another user) such that the mirror 306 reflects not only the user 304 but also his/her cat. It is assumed that the user 304 wants to share the image 302 with his/her friends but does not want them to know about his/her cat. For this reason, the reflection of the cat may be considered as the target information.
  • the image 302 is opened in a dedicated mobile application that is used by the user 304 to select images to be processed and indicate (at least) the target information to be detected on the images.
  • the mobile application is configured to initiate the execution of the method 200 by requesting the user 304 to confirm the processing of the image 302. If the user 304 presses “Yes”, then the mobile application will cause the apparatus 100 to start the method 200.
  • the apparatus 100 may be integrated in the smartphone 300 (i.e., the image 302 may be processed locally in the smartphone 300), or may be a remote server (e.g., a cloud server) which is accessed by the smartphone 300 via a wireless network (i.e., the image 302 may be transmitted to the remote server for its processing in accordance with the method 200).
  • the software components (e.g., the buttons “Yes” and “No”) and their arrangement on the touchscreen of the smartphone 300 are for illustrative purposes only and should not be construed as any limitation of the present disclosure.
  • FIG. 4 schematically shows the prompt information which the user 304 receives during the processing of the visual content (i.e., the image 302) in accordance with the method 200. More specifically, it is assumed that the processor 102 of the apparatus 100 has performed the steps S202-S206 of the method 200 and found the reflection of the cat in the mirror 306. Subsequently, in the step S208 of the method 200, the processor 102 provides the prompt information to the user 304 (e.g., transmits to the smartphone 300), so that the prompt information is displayed on the touchscreen of the smartphone 300. As follows from FIG. 4, the prompt information comprises the text indication “Target information (‘cat’) detected!” and the text request “Remove the target information?”.
  • the prompt information may comprise graphic elements overlaid on the image 302 to highlight the target information (i.e., the reflection of the cat).
  • graphic elements may be represented by different straight or curved lines (e.g., solid, dashed, etc.) surrounding the target information (see the dashed curved line surrounding the cat in FIG. 4), different arrow pointers, etc.
  • the processor 102 of the apparatus 100 will receive the result of pressing the button “Yes” or “No” as the user input in the step S210 of the method 200.
  • the prompt information may additionally comprise the list of editing operations available for the image 302; this list may be formed based on the above-described editing operations and requirements for image quality which may be predefined by the user 304 in the mobile application.
  • FIG. 5 schematically shows the result of processing of the visual content (i.e., the image 302) in accordance with the method 200. It is assumed that the user 304 has pressed the button “Yes” in response to the prompt information shown in FIG. 4 and selected the operation of replacing the target information with public information on the reflective surface to make the image 302 more realistic.
  • the processor 102 of the apparatus 100 will replace the reflection of the cat with the reflection of any other object which can be made public.
  • a set of such objects may be pre-stored in the memory 104 of the apparatus 100.
  • the reflection of the cat is replaced with the reflection of a potted flower in FIG. 5. It should be noted that this result of processing the image 302 might have been reported to the user as the preview version of the image 302 in the step S208 of the method 200, so that the user could assess the efficiency of the editing operation.
  • FIG. 6 shows a block diagram of a wireless communication system 600 in accordance with one exemplary embodiment.
  • the system 600 comprises a smartphone 602 that is assumed to be implemented in the same or similar manner as the smartphone 300. In other words, the smartphone 602 has a similar dedicated mobile application installed thereon.
  • the system 600 further comprises a cloud server 604 which is assumed to be implemented as the apparatus 100.
  • the smartphone 602 accesses the cloud server 604 via a wireless network 606 (e.g., the Internet) when it is required to process a visual content in accordance with the method 200.
  • the smartphone 602 uploads the image into the cloud server 604, where the method 200 is initiated in respect of the image.
  • the cloud server 604 detects that the image contains one or more reflective surfaces showing the target information, it will report this to the smartphone 602 via the wireless network 606, so that a user of the smartphone 602 can decide whether to remove the target information from the image or not. If the user confirms that the target information is to be removed, the cloud server 604 will apply a suitable (based on requirements for image quality, if any) editing operation to the image. After that, the cloud server 604 will provide the processed or edited image back to the smartphone 602, so that the user can transmit it to different devices of his/her friends, such as a tabletop computer 608, smart watches 610 and another smartphone 612.
  • the smartphone 602 may communicate with the cloud server 604 via the wireless network 606 and with each of the tabletop computer 608, the smart watches 610 and the smartphone 612 via a different wireless or wired network (e.g., Wi-Fi network, Local Area Network (LAN), etc.).
  • the smartphone 602 is used by the user to share a video (e.g., live stream) with each of the tabletop computer 608, the smart watches 610 and the smartphone 612 via the wireless network 606, the mobile application installed on the smartphone 602 may be configured to decompose the video into a set of video frames (in real time) and direct each video frame of the set of video frames to the cloud server 604 for processing in accordance with the method 200.
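Frame decomposition of this kind could be sketched with OpenCV; the `upload` callable standing in for the transfer to the cloud server 604 is hypothetical.

```python
import cv2

def stream_frames_to_server(video_source, upload):
    """Decompose a video (file path, URL, or camera index for a live
    stream) into frames and hand each frame to `upload`, a hypothetical
    callable that sends it for processing per the method 200."""
    capture = cv2.VideoCapture(video_source)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:          # end of file or dropped stream
                break
            upload(frame)
    finally:
        capture.release()
```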
  • the tabletop computer 608 is used by a teacher to deliver a video lecture to students who are, for example, users of the smartphones 602 and 612.
  • a dedicated mobile application like the one used in the smartphone 300 is installed on the tabletop computer 608.
  • the mobile application may (continuously or periodically) take a screenshot and direct the screenshot to the cloud server 604 for processing in accordance with the method 200. For example, if one of the users of the smartphones 602 and 612 wears eyeglasses, there may be reflections on the surface of his/her eyeglasses.
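A periodic screenshot loop of this kind could be sketched with Pillow's ImageGrab (available on Windows and macOS); the `send_to_server` callable is a hypothetical stand-in for the upload to the cloud server 604.

```python
import time
from PIL import ImageGrab   # Pillow screen capture (Windows/macOS)

def monitor_lecture(send_to_server, period_s=10.0):
    """Periodically take a screenshot of the lecture view and direct it
    to the cloud server for processing in accordance with the method 200."""
    while True:
        send_to_server(ImageGrab.grab())   # full-screen screenshot
        time.sleep(period_s)
```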
  • when the cloud server 604 determines that the reflections present in the screenshot(s) are not related to the information of the video lecture (i.e., in this case, the target information is any information different from that delivered by the teacher), it may provide the teacher with the prompt information indicating that, for example, the user of the smartphone 602 is not focused on the video lecture. It should be noted that, in this case, the cloud server 604 cannot inform the teacher of the exact content of the reflection(s) to ensure the right to privacy.
  • each step or operation of the method 200, or any combinations of the steps or operations can be implemented by various means, such as hardware, firmware, and/or software.
  • one or more of the steps or operations described above can be embodied by processor-executable instructions, data structures, program modules, and other suitable data representations.
  • the processor-executable instructions which embody the steps or operations described above can be stored on a corresponding data carrier and executed by the processor 102.
  • This data carrier can be implemented as any computer-readable storage medium configured to be readable by said at least one processor to execute the processor-executable instructions.
  • Such computer-readable storage media can include both volatile and nonvolatile media, removable and non-removable media.
  • the computer-readable media comprise media implemented in any method or technology suitable for storing information.
  • the practical examples of the computer-readable media include, but are not limited to, information-delivery media, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD), holographic media or other optical disc storage, magnetic tape, magnetic cassettes, magnetic disk storage, and other magnetic storage devices.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure relates to an apparatus and method for processing a visual content for the purpose of determining whether target information shown on one or more reflective surfaces present in the visual content is to be removed or not. If it is determined that the target information is shown on the reflective surface(s), prompt information is provided to a user. The prompt information indicates that the target information is shown on the reflective surface(s), as well as requests the user to confirm whether the target information is to be removed by using an editing operation. Thus, the user can be promptly informed of the presence of the target information in the visual content, so that he/she could decide on retaining or removing the target information from the visual content. By so doing, it is possible to reduce the risks of visual information leakage through the reflective surface(s) present in the visual content.

Description

APPARATUS AND METHOD FOR PROCESSING A VISUAL CONTENT
TECHNICAL FIELD
The present disclosure relates generally to the field of data processing, and particularly to an apparatus and method for processing a visual content for the purpose of determining whether target information shown on one or more reflective surfaces present in the visual content is to be removed or not.
BACKGROUND
The advent of the Internet has greatly boosted the speed and efficiency of information sharing. In recent years, visual information sharing has become one of the most popular Internet services, such as sharing photos or videos via social network websites or mobile applications or sharing videos during a video call. With the pervasive usage of visual information sharing, information security issues have also become more and more severe. For example, video conferences have become the leading cause of data breaches worldwide in recent years.
A particular type of information leakage risk is caused by visual reflections. Reflections or, in other words, reflective surfaces commonly exist in our daily life environment. Some examples of the reflective surfaces include mirrors, glasses, windows, sunglasses, screens, polished metal, or ceramic surfaces, or even water marks on the floor. The reflective surfaces can unexpectedly capture some visual information from surrounding objects, and result in unexpected information leakage. For example, the eyeglasses of a user participating in a video conference may reflect objects shown on his/her computer screen, which may contain business secrets or may be a personal ID/bank card or a business contract hardcopy, or may reflect the user's hands typing a password on the keyboard. As another example, a mirror shown in a selfie photo may reflect faces of other persons or a product prototype which is not supposed to be exposed to the public, or a window presented in the field of view of a camera may reflect the structure and ongoing activities taking place in the office. In all of these and other cases, the reflective surfaces cause significant information leakage risks when some visual information is shared among users, which may in turn cause huge personal or business damages to the users (e.g., service providers may lose trust from the users and eventually lose their market share). This problem becomes more severe when considering more general cases of reflection, which are not easily noticed by users and are harder to avoid. For example, reflections from the outer surface of a coffee cup or a shiny plastic bag may be able to tell if there is a person/object nearby. In extreme cases, even diffused reflections on a white wall can be used to reconstruct a light field.
Therefore, there is a need for a technical solution that would allow one to reduce the risks of information leakage through visual reflections.
SUMMARY
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure.
It is an objective of the present disclosure to provide a technical solution that allows one to reduce the risks of information leakage through visual reflections present in a visual content.
The objective above is achieved by the features of the independent claims in the appended claims. Further embodiments and examples are apparent from the dependent claims, the detailed description and the accompanying drawings.
According to a first aspect, an apparatus for processing a visual content is provided. The apparatus comprises a memory and a processor coupled to the memory. The memory stores processor-executable instructions which, when executed by the processor, cause the processor to operate as follows. At first, the processor receives the visual content. Then, the processor determines whether a reflective surface is present in the visual content. When the reflective surface is present in the visual content, the processor determines whether target information is shown on the reflective surface. When the target information is shown on the reflective surface, the processor provides prompt information to a user, which comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation. After that, the processor receives a user input from the user, which indicates whether the target information is to be removed by using the editing operation. The apparatus thus configured may promptly inform the user of the presence of the target information shown on the reflective surface, so that the user could decide on retaining or removing the target information from the visual content.
In one embodiment of the first aspect, the processor is further configured to obtain a preview version of the visual content without the target information by using the editing operation. Then, the processor is configured to provide, together with the prompt information, the preview version of the visual content to the user. By using such a preview version of the visual content, the user may check the result of applying the editing operation to the visual content before making a final decision on the target information, which may be beneficial in some use scenarios.
In one embodiment of the first aspect, the memory further stores a set of target-information types and a set of risk levels each associated with one information type of the set of target-information types. In this embodiment, when the target information is shown on the reflective surface, the processor is further configured to operate as follows. The processor finds, in the set of target-information types, a target-information type which the target information belongs to. Then, the processor finds, in the set of risk levels, a risk level associated with the target-information type. Next, the processor provides, together with the prompt information, the target-information type and the risk level to the user. By knowing the type and risk level of the target information shown on the reflective surface, the user may properly decide whether the target information is to be removed or not.
In one embodiment of the first aspect, the set of target-information types comprises personal data, user activity-related information, and a company-related unique identifying symbol. Thus, the apparatus according to the first aspect may analyze various information shown on the reflective surface, which may make it more flexible in use.
In one embodiment of the first aspect, the processor is further configured to magnify a piece of the visual content which contains the reflective surface and provide, together with the prompt information, the magnified piece of the visual content to the user. By so doing, the apparatus according to the first aspect may allow the user to examine the target information shown on the reflective surface in minute detail.

In one embodiment of the first aspect, the processor is configured to determine whether the reflective surface is present in the visual content and whether the target information is shown on the reflective surface by using a machine learning algorithm. By using the machine learning algorithm, the apparatus according to the first aspect may operate more efficiently.
In one embodiment of the first aspect, the editing operation comprises at least one of an operation of painting over the reflective surface in the visual content; an operation of blurring the reflective surface in the visual content; an operation of cutting off a piece of the visual content which contains the reflective surface; and an operation of replacing the target information with public information on the reflective surface in the visual content. This may make the apparatus according to the first aspect more flexible in use. For example, the cutting-off operation may be used when the reflective surface showing the target information in an image is at the boundary of the image. The blurring and coloring operations may be used when there are no strict requirements for the quality of the visual content (e.g., the quality of an image). When the quality of the visual content is of great importance, the target information (e.g., a new company logo which is reflected in a mirror shown in an image) may be replaced with public information (e.g., an old company logo already known to the public). Thus, each of the editing operations listed above may be beneficial in a certain use scenario.
According to a second aspect, a method for processing a visual content is provided. The method starts with the step of receiving the visual content. Then, the method proceeds to the step of determining whether a reflective surface is present in the visual content. When the reflective surface is present in the visual content, the method goes on to the step of determining whether target information is shown on the reflective surface. When the target information is shown on the reflective surface, the method proceeds to the step of providing prompt information to a user, which comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation. After that, the method goes on to the step of receiving a user input from the user, which indicates whether the target information is to be removed by using the editing operation. By so doing, it is possible to promptly inform the user of the presence of the target information shown on the reflective surface, so that the user could decide on retaining or removing the target information from the visual content.

In one embodiment of the second aspect, the method further comprises the steps of obtaining a preview version of the visual content without the target information by using the editing operation and providing, together with the prompt information, the preview version of the visual content to the user. By using such a preview version of the visual content, the user may check the result of applying the editing operation to the visual content before making a final decision on the target information, which may be beneficial in some use scenarios.
In one embodiment of the second aspect, the method further comprises, upon determining that the target information is shown on the reflective surface, the following steps of finding, in a pre-stored set of target-information types, a target-information type which the target information belongs to; finding, in a pre-stored set of risk levels, a risk level associated with the target-information type; and providing, together with the prompt information, the target-information type and the risk level to the user. By knowing the type and risk level of the target information shown on the reflective surface, the user may properly decide whether the target information is to be removed or not.
In one embodiment of the second aspect, the set of target-information types comprises personal data, user activity-related information, and a company-related unique identifying symbol. Thus, the method according to the second aspect may be used to analyze various information shown on the reflective surface, which may make it more flexible in use.
In one embodiment of the second aspect, the method further comprises the steps of magnifying a piece of the visual content which contains the reflective surface and providing, together with the prompt information, the magnified piece of the visual content to the user. Said magnification may allow the user to examine the target information shown on the reflective surface in minute detail.
In one embodiment of the second aspect, said determining whether the reflective surface is present in the visual content and said determining whether the target information is shown on the reflective surface are performed by using a machine learning algorithm. By using the machine learning algorithm, the method according to the second aspect may be performed more efficiently.

In one embodiment of the second aspect, the editing operation comprises at least one of: an operation of painting over the reflective surface in the visual content; an operation of blurring the reflective surface in the visual content; an operation of cutting off a piece of the visual content which contains the reflective surface; and an operation of replacing the target information with public information on the reflective surface in the visual content. This may make the method according to the second aspect more flexible in use. For example, the cutting-off operation may be used when the reflective surface showing the target information in an image is at the boundary of the image. The blurring and coloring operations may be used when there are no strict requirements for the quality of the visual content (e.g., the quality of an image). When the quality of the visual content is of great importance, the target information (e.g., a new company logo which is reflected in a mirror shown in an image) may be replaced with public information (e.g., an old company logo already known to the public). Thus, each of the editing operations listed above may be beneficial in a certain use scenario.
According to a third aspect, a computer program product is provided. The computer program product comprises a computer-readable storage medium that stores a computer code. Being executed by at least one processor, the computer code causes the at least one processor to perform the method according to the second aspect. By using such a computer program product, it is possible to simplify the implementation of the method according to the second aspect in any data processing apparatus, like the apparatus according to the first aspect.
Other features and advantages of the present disclosure will be apparent upon reading the following detailed description and reviewing the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure is explained below with reference to the accompanying drawings in which:
FIG. 1 shows a block diagram of an apparatus for processing a visual content in accordance with one example embodiment;
FIG. 2 shows a flowchart of a method for operating the apparatus of FIG. 1 in accordance with one example embodiment;
FIG. 3 schematically shows a visual content containing a reflection in accordance with one exemplary embodiment; FIG. 4 schematically shows prompt information which a user receives during the processing of the visual content of FIG. 3 in accordance with the method of FIG. 2;
FIG. 5 schematically shows the result of processing of the visual content of FIG. 3 in accordance with the method of FIG. 2; and
FIG. 6 shows a block diagram of a wireless communication system in accordance with one exemplary embodiment.
DETAILED DESCRIPTION
Various embodiments of the present disclosure are further described in more detail with reference to the accompanying drawings. However, the present disclosure can be embodied in many other forms and should not be construed as limited to any certain structure or function discussed in the following description. In contrast, these embodiments are provided to make the description of the present disclosure detailed and complete.
According to the detailed description, it will be apparent to those skilled in the art that the scope of the present disclosure encompasses any embodiment thereof, which is disclosed herein, irrespective of whether this embodiment is implemented independently or in concert with any other embodiment of the present disclosure. For example, the apparatus and method disclosed herein can be implemented in practice by using any numbers of the embodiments provided herein. Furthermore, it should be understood that any embodiment of the present disclosure can be implemented using one or more of the elements presented in the appended claims.
Unless otherwise stated, any embodiment recited herein as “exemplary embodiment” should not be construed as preferable or having an advantage over other embodiments.
According to the exemplary embodiments disclosed herein, a visual content may refer to a picture, photo, image, image frame, diagram (e.g., flowchart, block diagram, circuit diagram), chart (e.g., bar chart, line chart, map, etc.), infographic, video (e.g., video clip, online video), video frame, screenshot, meme, slide deck, or any combination thereof. In the case of a video, it may be decomposed into video frames, each of which may be processed individually in accordance with the aspects of the present disclosure, as will be discussed below in detail.

As used in the exemplary embodiments disclosed herein, a reflective surface may refer to a surface that is able to bounce light, thereby forming a reflection that may contain visual information. Some examples of the reflective surface may include, but are not limited to, a water surface, ocular surface, white-paper surface, colored-object surface, metal surface, mirror surface, glass surface, etc. It should be noted that the present disclosure is not limited to a certain type of reflections formed by the reflective surface. In other words, all types of reflections that may be used to recover, partly or fully, the visual information shown on the reflective surface (e.g., regular and irregular reflections) are intended to fall within the scope of our considerations. When a scene to be imaged/captured by a camera contains one or more reflective surfaces, the visual information shown on the reflective surface(s) will be included in a resulting visual content (e.g., a resulting image).
According to the exemplary embodiments disclosed herein, target information may refer to one or more certain types of visual information shown on reflective surfaces. The target information may be defined by a user and may comprise, but is not limited to, restricted information, secret information, and confidential information. Therefore, it is highly desirable to avoid inadvertently disclosing the target information in any visual content. However, to the best of the authors’ knowledge, the information leakage issues caused by reflections/reflective surfaces shown in the visual content have not been addressed in research and industrial products concerning visual information sharing services.
Thus, the exemplary embodiments disclosed herein provide a technical solution that allows mitigating or even eliminating the above-mentioned drawbacks peculiar to the prior art. In particular, the technical solution involves providing prompt information to a user in response to detecting target information on one or more reflective surfaces present in a visual content. The prompt information indicates that the target information is shown on the reflective surface(s), as well as requests the user to confirm whether the target information is to be removed by using an editing operation. Thus, the user can be promptly informed of the presence of the target information in the visual content, so that he/she can decide whether to retain the target information in, or remove it from, the visual content. By so doing, it is possible to reduce the risks of visual information leakage through the reflective surface(s) present in the visual content.

FIG. 1 shows a block diagram of an apparatus 100 for processing a visual content in accordance with one example embodiment. The apparatus 100 may be implemented as an individual device or may be part of a User Equipment (UE). The UE may refer to an electronic computing device that is configured to perform wireless communications. The UE may be implemented as a mobile station, a mobile terminal, a mobile subscriber unit, a mobile phone, a cellular phone, a smart phone, a cordless phone, a personal digital assistant (PDA), a wireless communication device, a desktop computer, a laptop computer, a tablet computer, a gaming device, a netbook, a smartbook, an ultrabook, a medical mobile device or equipment, a biometric sensor, a wearable device (e.g., a smart watch, smart glasses, a smart wrist band, etc.), an entertainment device (e.g., an audio player, a video player, etc.), a vehicular component or sensor (e.g., a driver-assistance system), a smart meter/sensor, an unmanned vehicle (e.g., an industrial robot, a quadcopter, etc.) and its component (e.g., a self-driving car computer), industrial manufacturing equipment, a global positioning system (GPS) device, an Internet-of-Things (IoT) device, an Industrial IoT (IIoT) device, a machine-type communication (MTC) device, a group of Massive IoT (MIoT) or Massive MTC (mMTC) devices/sensors, or any other suitable mobile device configured to support wireless communications. In some embodiments, the UE may refer to at least two collocated and inter-connected UEs thus defined.
As shown in FIG. 1, the apparatus 100 comprises a processor 102 and a memory 104. The memory 104 stores processor-executable instructions 106 which, when executed by the processor 102, cause the processor 102 to perform the aspects of the present disclosure, as will be described below in more detail. It should be noted that the number, arrangement, and interconnection of the constructive elements constituting the apparatus 100, which are shown in FIG. 1, are not intended to be any limitation of the present disclosure, but are merely used to provide a general idea of how the constructive elements may be implemented within the apparatus 100. For example, the processor 102 may be replaced with several processors, and the memory 104 may be replaced with several removable and/or fixed storage devices, depending on particular applications. Furthermore, when the apparatus 100 is implemented as an individual device, it may further comprise a transceiver (not shown in FIG. 1) configured to perform wireless communications, for example, with a UE (e.g., in order to transmit the results of processing the visual content to the UE). The transceiver itself may be implemented as two individual devices, one for a receiving operation and one for a transmitting operation. Irrespective of its implementation, the transceiver is intended to be capable of performing the different operations required for data reception and transmission, such as, for example, signal modulation/demodulation, encoding/decoding, etc.
The processor 102 may be implemented as a CPU, general-purpose processor, single-purpose processor, microcontroller, microprocessor, application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), digital signal processor (DSP), complex programmable logic device, etc. It should also be noted that the processor 102 may be implemented as any combination of one or more of the aforesaid. As an example, the processor 102 may be a combination of two or more microprocessors.
The memory 104 may be implemented as a nonvolatile or volatile memory of the types used in modern electronic computing machines. Examples of the nonvolatile memory include Read-Only Memory (ROM), ferroelectric Random-Access Memory (RAM), Programmable ROM (PROM), Electrically Erasable PROM (EEPROM), a solid-state drive (SSD), flash memory, magnetic disk storage (such as hard drives and magnetic tapes), optical disc storage (such as CD, DVD and Blu-ray discs), etc. As for the volatile memory, examples thereof include Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Static RAM, etc.
The processor-executable instructions 106 stored in the memory 104 may be configured as a computer-executable program code which causes the processor 102 to perform the aspects of the present disclosure. The computer-executable program code for carrying out operations or steps for the aspects of the present disclosure may be written in any combination of one or more programming languages, such as Java, C++, or the like. In some examples, the computer-executable program code may be in the form of a high-level language or in a pre-compiled form and be interpreted on the fly by an interpreter (also pre-stored in the memory 104).
FIG. 2 shows a flowchart of a method 200 for operating the apparatus 100 in accordance with one example embodiment. If the apparatus 100 is part of a UE, the method 200 is performed locally in the UE. If the apparatus 100 is implemented as an individual device (e.g., another UE or a remote server) configured to perform wired or wireless communications with the UE, the method 200 is performed remotely relative to the UE. The method 200 starts with a step S202, in which the processor 102 receives a visual content. For example, the visual content may be an image just captured by a camera of a UE (e.g., a smartphone) or an image downloaded from a database of images stored locally in the UE or remotely in another UE or a server. As noted earlier, other types of the visual content are also possible, such as, for example, video clips, diagrams, etc.
Then, the method 200 proceeds to a step S204, in which the processor 102 determines whether a reflective surface is present in the visual content. The processor 102 may perform the step S204 by using a machine learning algorithm, such as a neural network (e.g., Region-based Convolutional Neural Network (R-CNN), Fast R-CNN, Faster R-CNN, the You Only Look Once (YOLO) framework, etc.). Standard computer vision pipelines for object detection, such as instance-level object segmentation, may also be used, e.g., those based on the Mask R-CNN or U-Net frameworks. Alternatively or additionally, the processor 102 may perform the step S204 by using handcrafted features and rules from human experts. Furthermore, contextual information may be formulated as an auxiliary task to improve said determination in the step S204 (e.g., the detection of a face is helpful for the detection of eyeglasses and, consequently, the reflections provided by the eyeglasses).
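By way of a non-limiting illustration, the step S204 could be sketched in Python as follows, assuming a Mask R-CNN checkpoint fine-tuned so that certain class indices denote reflective surfaces; the weights and the REFLECTIVE_CLASS_IDS mapping below are hypothetical placeholders, not part of the present disclosure:

```python
import torch
from PIL import Image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

# Hypothetical mapping: class indices of a fine-tuned checkpoint that
# correspond to reflective surfaces (e.g., mirror, eyeglasses, window).
REFLECTIVE_CLASS_IDS = {1, 2, 3}

model = maskrcnn_resnet50_fpn(weights="DEFAULT")  # placeholder weights only
model.eval()

def detect_reflective_surfaces(image: Image.Image, score_thr: float = 0.5):
    """Return (box, mask, score) triples for likely reflective surfaces."""
    with torch.no_grad():
        pred = model([to_tensor(image)])[0]
    hits = []
    for box, label, score, mask in zip(
            pred["boxes"], pred["labels"], pred["scores"], pred["masks"]):
        if float(score) >= score_thr and int(label) in REFLECTIVE_CLASS_IDS:
            hits.append((box.tolist(), mask[0] > 0.5, float(score)))
    return hits
```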
When the reflective surface is present in the visual content, the method 200 goes on to a step S206, in which the processor 102 determines whether target information is shown on the reflective surface. As noted earlier, the target information may be restricted information, confidential information, or any other information that should not be made public (at least temporarily). The target information may be user-defined. Thus, a user may decide which object or concept shown on the reflective surface is to be removed. For example, if the user does not want others to know he/she has a cat, he/she may indicate the reflection of the cat as the target information. The step S206 may also be performed by the processor 102 with the aid of a suitable machine learning algorithm, such as a properly trained neural network (e.g., a CNN).
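One way the step S206 could be realized is to crop each detected reflective region and run a classifier trained on the user-defined target categories over the crop; in the following sketch, `classifier` and its `predict` method are hypothetical stand-ins for any suitably trained CNN, not an API defined by the present disclosure:

```python
def find_target_information(image, surfaces, classifier, target_labels):
    """Yield (box, label, confidence) for reflections showing target info.

    `surfaces` is the output of detect_reflective_surfaces() above;
    `classifier` is a hypothetical CNN wrapper trained on the user-defined
    target categories (e.g., "cat", "bank_card").
    """
    for box, _mask, _score in surfaces:
        left, top, right, bottom = (int(v) for v in box)
        crop = image.crop((left, top, right, bottom))
        label, confidence = classifier.predict(crop)  # hypothetical API
        if label in target_labels:
            yield box, label, confidence
```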
When the target information is shown on the reflective surface, the method 200 proceeds to a step S208, in which the processor 102 provides (e.g., outputs, displays or transmits) prompt information to the user. For example, the prompt information may be outputted to the user as a notification or message via a dedicated mobile application installed on a smartphone (the mobile application may also be used by the user to select the visual content to be processed and define the target information). The prompt information comprises: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation. The editing operation may be an operation of painting over the reflective surface in the visual content, an operation of blurring the reflective surface in the visual content, an operation of cutting off a piece of the visual content which contains the reflective surface, an operation of replacing the target information with public information on the reflective surface in the visual content, or any combination of these operations. These editing operations may be used depending on requirements for the naturalness or, in other words, quality of the visual content to be processed. Thus, the prompt information may comprise the list of the above-indicated editing operations, if required. Moreover, different visual generation algorithms may be adopted, which are configured to use a visual content with one or more reflections as input data and to produce the visual content without the reflection(s) (i.e., with the reflection(s) removed/minimized) as output data. These algorithms may also be trained with the naturalness of the visual content taken into account, e.g., via an adversarial loss, to make the processed visual content perceptually realistic.
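As a minimal sketch of the first three editing operations named above, the following Pillow-based functions take the image and the bounding box of the reflective surface; the cut-off strategy shown is only one possible choice:

```python
from PIL import Image, ImageDraw, ImageFilter

def paint_over(image: Image.Image, box, color=(128, 128, 128)) -> Image.Image:
    """Paint over the reflective surface with a solid color."""
    out = image.copy()
    ImageDraw.Draw(out).rectangle([int(v) for v in box], fill=color)
    return out

def blur_region(image: Image.Image, box, radius: int = 12) -> Image.Image:
    """Blur only the reflective surface, keeping the rest of the image."""
    out = image.copy()
    left, top, right, bottom = (int(v) for v in box)
    region = out.crop((left, top, right, bottom))
    out.paste(region.filter(ImageFilter.GaussianBlur(radius)), (left, top))
    return out

def cut_off(image: Image.Image, box) -> Image.Image:
    """Cut off the piece of the image containing the reflective surface;
    here we keep only the part above the surface (one possible choice)."""
    return image.crop((0, 0, image.width, int(box[1])))
```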
The method 200 ends with a step S210, in which the processor 102 receives a user input from the user. The user input indicates whether the target information is to be removed by using the editing operation. If the prompt information also indicates some or all of the above-indicated editing operations, the user input may further indicate which of them to use (provided, of course, that the user has decided to remove the target information from the visual content). After the visual content is processed by using the editing operation, it is ready for further or future usage (e.g., the visual content may be stored locally in a photo album in the UE, uploaded to a remote server, or shared with other users via a wireless network).
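Tying the steps together, one possible orchestration of the method 200 could look like the sketch below; `prompt_user` is a hypothetical UI callback (returning the user decision and chosen operation) and `editors` maps operation names to editing functions such as those sketched above:

```python
def process_visual_content(image, detector, classifier, target_labels,
                           prompt_user, editors):
    """Orchestrate steps S202-S210 under the stated assumptions."""
    surfaces = detector(image)                                   # step S204
    if not surfaces:
        return image
    findings = list(find_target_information(                     # step S206
        image, surfaces, classifier, target_labels))
    for box, label, _confidence in findings:                     # step S208
        decision = prompt_user(
            indication=f"Target information ('{label}') detected!",
            request="Remove the target information?",
            operations=sorted(editors))
        if decision.remove:                                      # step S210
            image = editors[decision.operation](image, box)
    return image
```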
In one embodiment, the method 200 may comprise an additional step (before the step S208), in which the processor 102 obtains a preview version of the visual content without the target information by using the editing operation. In this regard, the processor 102 may be configured to select one of the above-described editing operations on its own based on predefined (e.g., by the user via the dedicated mobile application) requirements for the quality of the visual content (e.g., in the form of “high quality”, “medium quality”, or “low quality”). When such a preview version is obtained, it may be provided, together with the prompt information, to the user in the step S208. In an alternative or additional embodiment, the method 200 may comprise an additional step (before the step S208), in which the processor 102 magnifies a piece of the visual content which contains the reflective surface. Subsequently, the processor 102 may provide the magnified piece of the visual content, together with the prompt information, to the user in the step S208. Said magnification may be required when there are small objects shown on the reflective surface, such as jewels, small-font text, etc.
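Both optional pre-S208 steps could be sketched as follows; the quality-to-operation policy, the margin, and the zoom factor are illustrative assumptions only:

```python
def make_preview(image, box, editors, quality="medium"):
    """Preview version: pick an editing operation via a hypothetical
    quality policy (higher quality -> less destructive operation)."""
    operation = {"high": "replace", "medium": "blur", "low": "paint"}[quality]
    return editors[operation](image, box)

def magnify_region(image, box, zoom: int = 3, margin: int = 20):
    """Magnified piece of the visual content containing the surface."""
    left, top, right, bottom = (int(v) for v in box)
    crop = image.crop((max(0, left - margin), max(0, top - margin),
                       min(image.width, right + margin),
                       min(image.height, bottom + margin)))
    return crop.resize((crop.width * zoom, crop.height * zoom))
```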
In one embodiment, the method 200 may comprise an additional step (before the step S208), in which the processor 102 determines the type and risk level of the target information shown on the reflective surface. In this regard, the memory 104 may further store a set of target-information types and a set of risk levels, each risk level being associated with one information type of the set of target-information types. Both sets may be defined by the user. Some examples of target-information types include, but are not limited to, personal data, user activity-related information, and a company-related unique identifying symbol. The personal data, also referred to as personally identifiable information, may refer to any information related to an identifiable person, such as an identification number (e.g., social security number, bank card number, etc.), an identification card (e.g., bank card, visiting card, access badge, driving license, etc.), location data (e.g., home address, current location data, etc.), a name, an online identifier (e.g., IP address, nickname, etc.), health information, an economic, cultural, or social identity of the person, or any combination thereof. The user activity-related information may refer to any scene and activity in which one or more users of interest are involved (e.g., a movie scene, closed training of athletes, etc.). The company-related unique identifying symbol may comprise a company logo, brand, business identity code, etc. As for the risk levels, they may be presented by using a certain pattern of color coding (e.g., “red”, “orange” and “green” colors may correspond to high, medium and low risk levels, respectively) and/or character coding (e.g., “A”, “B” and “C” letters may correspond to the high, medium and low risk levels, respectively). At first, the processor 102 finds, in the set of target-information types, a target-information type which the target information belongs to. Then, the processor 102 finds, in the set of risk levels, a risk level associated with the target-information type. Subsequently, the processor 102 may provide, together with the prompt information, the target-information type and the risk level to the user in the step S208. It should be noted that if there are two or more target-information types shown on the reflective surface, the processor 102 may sum their respective risk levels to obtain a combined risk level, which may then be provided to the user (e.g., by using a certain pattern of color coding or character coding). If the combined risk level is above a threshold (e.g., risk level “A”), the user may immediately decide to remove the target information.
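The risk-level bookkeeping described above could be sketched as follows; the numeric scores, type names, and the summation of levels into a combined risk are illustrative assumptions mirroring the color/character coding in the text:

```python
# User-defined example tables; scores and codes are assumptions only.
RISK_LEVELS = {"personal_data": 3,     # high   -> "A" / red
               "user_activity": 2,     # medium -> "B" / orange
               "company_symbol": 1}    # low    -> "C" / green

RISK_CODES = {1: ("C", "green"), 2: ("B", "orange"), 3: ("A", "red")}

def assess_risk(detected_types, threshold=3):
    """Combine the risk levels of all detected target-information types."""
    combined = sum(RISK_LEVELS[t] for t in detected_types)
    code, color = RISK_CODES[min(combined, max(RISK_CODES))]
    return {"combined_score": combined, "code": code, "color": color,
            "above_threshold": combined >= threshold}

# Example: two types detected -> combined score 4, capped at code "A" (red),
# i.e., above the threshold, so immediate removal may be suggested.
print(assess_risk(["personal_data", "company_symbol"]))
```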
FIG. 3 schematically shows a visual content containing a reflection in accordance with one exemplary embodiment. More specifically, the visual content is displayed on a touchscreen of a smartphone 300 as an image 302, in which a user 304 is shown looking at himself/herself in a mirror 306. The image 302 is captured (e.g., by another user) such that the mirror 306 reflects not only the user 304 but also his/her cat. It is assumed that the user 304 wants to share the image 302 with his/her friends but does not want them to know about his/her cat. For this reason, the reflection of the cat may be considered as the target information. It is also assumed that the image 302 is opened in a dedicated mobile application that is used by the user 304 to select images to be processed and indicate (at least) the target information to be detected on the images. In turn, the mobile application is configured to initiate the execution of the method 200 by requesting the user 304 to confirm the processing of the image 302. If the user 304 presses “Yes”, then the mobile application will cause the apparatus 100 to start the method 200. The apparatus 100 may be integrated in the smartphone 300 (i.e., the image 302 may be processed locally in the smartphone 300), or may be a remote server (e.g., a cloud server) which is accessed by the smartphone 300 via a wireless network (i.e., the image 302 may be transmitted to the remote server for its processing in accordance with the method 200). Those skilled in the art would recognize that the software components (e.g., the buttons “Yes” and “No”) and their arrangement on the touchscreen of the smartphone 300 are for illustrative purposes only and should not be construed as any limitation of the present disclosure.
FIG. 4 schematically shows the prompt information which the user 304 receives during the processing of the visual content (i.e., the image 302) in accordance with the method 200. More specifically, it is assumed that the processor 102 of the apparatus 100 has performed the steps S202-S206 of the method 200 and found the reflection of the cat in the mirror 306. Subsequently, in the step S208 of the method 200, the processor 102 provides the prompt information to the user 304 (e.g., transmits it to the smartphone 300), so that the prompt information is displayed on the touchscreen of the smartphone 300. As follows from FIG. 4, the prompt information comprises the text indication “Target information (‘cat’) detected!” and the text request “Remove the target information?”. It should be noted that, instead of or in addition to the text indication, the prompt information may comprise graphic elements overlaid on the image 302 to highlight the target information (i.e., the reflection of the cat). Such graphic elements may be represented by different straight or curved lines (e.g., solid, dashed, etc.) surrounding the target information (see the dashed curved line surrounding the cat in FIG. 4), different arrow pointers, etc. If the user 304 decides to remove the reflection of the cat in the mirror 306, he/she should press the button “Yes” on the touchscreen of the smartphone 300; otherwise, he/she should press the button “No”. The processor 102 of the apparatus 100 will receive the result of pressing the button “Yes” or “No” as the user input in the step S210 of the method 200. It should also be noted that the prompt information may additionally comprise the list of editing operations available for the image 302; this list may be formed based on the above-described editing operations and the requirements for image quality which may be predefined by the user 304 in the mobile application.
FIG. 5 schematically shows the result of processing the visual content (i.e., the image 302) in accordance with the method 200. It is assumed that the user 304 has pressed the button “Yes” in response to the prompt information shown in FIG. 4 and selected the operation of replacing the target information with public information on the reflective surface to make the image 302 more realistic. In response to this user input, the processor 102 of the apparatus 100 will replace the reflection of the cat with the reflection of any other object which can be made public. A set of such objects may be pre-stored in the memory 104 of the apparatus 100. As an example, the reflection of the cat is replaced with the reflection of a potted flower in FIG. 5. It should be noted that this result of processing the image 302 might have been presented to the user as the preview version of the image 302 in the step S208 of the method 200, so that the user could assess the effect of the editing operation.
FIG. 6 shows a block diagram of a wireless communication system 600 in accordance with one exemplary embodiment. The system 600 comprises a smartphone 602 that is assumed to be implemented in the same or similar manner as the smartphone 300. In other words, the smartphone 602 has a similar dedicated mobile application installed thereon. The system 600 further comprises a cloud server 604 which is assumed to be implemented as the apparatus 100. The smartphone 602 accesses the cloud server 604 via a wireless network 606 (e.g., the Internet) when it is required to process a visual content in accordance with the method 200. Let us imagine that the smartphone 602 has an image like the image 302, which is to be checked for the presence of the target information. Then, the smartphone 602 uploads the image to the cloud server 604, where the method 200 is initiated in respect of the image. If the cloud server 604 detects that the image contains one or more reflective surfaces showing the target information, it will report this to the smartphone 602 via the wireless network 606, so that a user of the smartphone 602 can decide whether to remove the target information from the image or not. If the user confirms that the target information is to be removed, the cloud server 604 will apply a suitable (based on requirements for image quality, if any) editing operation to the image. After that, the cloud server 604 will provide the processed or edited image back to the smartphone 602, so that the user can transmit it to different devices of his/her friends, such as a tabletop computer 608, smart watches 610 and another smartphone 612. If required, the smartphone 602 may communicate with the cloud server 604 via the wireless network 606 and with each of the tabletop computer 608, the smart watches 610 and the smartphone 612 via a different wireless or wired network (e.g., a Wi-Fi network, Local Area Network (LAN), etc.).
It should also be noted that if the smartphone 602 is used by the user to share a video (e.g., a live stream) with each of the tabletop computer 608, the smart watches 610 and the smartphone 612 via the wireless network 606, the mobile application installed on the smartphone 602 may be configured to decompose the video into a set of video frames (in real time) and direct each video frame of the set of video frames to the cloud server 604 for processing in accordance with the method 200.
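A minimal sketch of such frame decomposition, using OpenCV, is given below; the `upload` callable and the frame-sampling rate are hypothetical, not an API defined by the present disclosure:

```python
import cv2

def stream_frames_for_processing(source, upload, every_nth: int = 5):
    """Decompose a video/live stream into frames and direct every n-th
    frame to a processing endpoint; `upload` is a hypothetical callable,
    e.g. an HTTP POST to a cloud server implementing the method 200."""
    capture = cv2.VideoCapture(source)  # file path or camera index
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_nth == 0:
            encoded, buffer = cv2.imencode(".jpg", frame)
            if encoded:
                upload(buffer.tobytes())
        index += 1
    capture.release()
```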
In another embodiment, it is assumed that the tabletop computer 608 is used by a teacher to deliver a video lecture to students who are, for example, users of the smartphones 602 and 612. In this case, a dedicated mobile application like the one used in the smartphone 300 is installed on the tabletop computer 608. Assuming that the video lecture is delivered via a communication platform that is configured to show all participants at the same time (e.g., Microsoft Teams, Skype, Zoom, etc.), the mobile application may (continuously or periodically) take a screenshot and direct the screenshot to the cloud server 604 for processing in accordance with the method 200. For example, if one of the users of the smartphones 602 and 612 wears eyeglasses, there may be reflections on the surface of his/her eyeglasses. If the cloud server 604 determines that the reflections present in the screenshot(s) are not related to the information of the video lecture (i.e., in this case, the target information is any information different from that delivered by the teacher), it may provide the teacher with the prompt information indicating that, for example, the user of the smartphone 602 is not focused on the video lecture. It should be noted that, in this case, the cloud server 604 does not inform the teacher of the exact content of the reflection(s), in order to ensure the right to privacy.

Those skilled in the art would recognize that each step or operation of the method 200, or any combination of the steps or operations, can be implemented by various means, such as hardware, firmware, and/or software. As an example, one or more of the steps or operations described above can be embodied by processor-executable instructions, data structures, program modules, and other suitable data representations. Furthermore, the processor-executable instructions which embody the steps or operations described above can be stored on a corresponding data carrier and executed by the processor 102. This data carrier can be implemented as any computer-readable storage medium configured to be readable by said at least one processor to execute the processor-executable instructions. Such computer-readable storage media can include both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, the computer-readable media comprise media implemented in any method or technology suitable for storing information. In more detail, practical examples of the computer-readable media include, but are not limited to, information-delivery media, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD), holographic media or other optical disc storage, magnetic tape, magnetic cassettes, magnetic disk storage, and other magnetic storage devices.
Although the example embodiments of the present disclosure are described herein, it should be noted that various changes and modifications could be made in the embodiments of the present disclosure, without departing from the scope of legal protection which is defined by the appended claims. In the appended claims, the word “comprising” does not exclude other elements or operations, and the indefinite article “a” or “an” does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims

1. An apparatus for processing a visual content, comprising:
a memory storing processor-executable instructions; and
a processor coupled to the memory and configured, when executing the processor-executable instructions, to:
receive the visual content;
determine whether a reflective surface is present in the visual content;
when the reflective surface is present in the visual content, determine whether target information is shown on the reflective surface;
when the target information is shown on the reflective surface, provide prompt information to a user, the prompt information comprising: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation; and
receive a user input from the user, the user input indicating whether the target information is to be removed by using the editing operation.

2. The apparatus of claim 1, wherein the processor is further configured to:
before providing the prompt information to the user, obtain a preview version of the visual content without the target information by using the editing operation; and
provide, together with the prompt information, the preview version of the visual content to the user.

3. The apparatus of claim 1 or 2, wherein the memory further stores a set of target-information types and a set of risk levels each associated with one information type of the set of target-information types, and wherein the processor is further configured, when the target information is shown on the reflective surface, to:
find, in the set of target-information types, a target-information type which the target information belongs to;
find, in the set of risk levels, a risk level associated with the target-information type; and
provide, together with the prompt information, the target-information type and the risk level to the user.

4. The apparatus of claim 3, wherein the set of target-information types comprises: personal data; user activity-related information; and a company-related unique identifying symbol.

5. The apparatus of any one of claims 1 to 4, wherein the processor is further configured to:
magnify a piece of the visual content which contains the reflective surface; and
provide, together with the prompt information, the magnified piece of the visual content to the user.

6. The apparatus of any one of claims 1 to 5, wherein the processor is configured to determine whether the reflective surface is present in the visual content and whether the target information is shown on the reflective surface by using a machine learning algorithm.

7. The apparatus of any one of claims 1 to 6, wherein the editing operation comprises at least one of:
an operation of painting over the reflective surface in the visual content;
an operation of blurring the reflective surface in the visual content;
an operation of cutting off a piece of the visual content which contains the reflective surface; and
an operation of replacing the target information with public information on the reflective surface in the visual content.
8. A method for processing a visual content, comprising:
receiving the visual content;
determining whether a reflective surface is present in the visual content;
when the reflective surface is present in the visual content, determining whether target information is shown on the reflective surface;
when the target information is shown on the reflective surface, providing prompt information to a user, the prompt information comprising: (i) an indication that the target information is shown on the reflective surface; and (ii) a request for removing the target information by using an editing operation; and
receiving a user input from the user, the user input indicating whether the target information is to be removed by using the editing operation.
9. The method of claim 8, further comprising: before providing the prompt information to the user, obtaining a preview version of the visual content without the target information by using the editing operation; and providing, together with the prompt information, the preview version of the visual content to the user.
10. The method of claim 8 or 9, further comprising, upon determining that the target information is shown on the reflective surface: finding, in a pre-stored set of target-information types, a target-information type which the target information belongs to; finding, in a pre-stored set of risk levels, a risk level associated with the target-information type; and providing, together with the prompt information, the target-information type and the risk level to the user.
11. The method of claim 10, wherein the set of target-information types comprises: personal data; user activity-related information; and a company-related unique identifying symbol.
12. The method of any one of claims 8 to 11, further comprising: magnifying a piece of the visual content which contains the reflective surface; and providing, together with the prompt information, the magnified piece of the visual content to the user.
13. The method of any one of claims 8 to 12, wherein said determining whether the reflective surface is present in the visual content and said determining whether the target information is shown on the reflective surface are performed by using a machine learning algorithm.
14. The method of any one of claims 8 to 13, wherein the editing operation comprises at least one of: an operation of painting over the reflective surface in the visual content; an operation of blurring the reflective surface in the visual content; an operation of cutting a piece of the visual content which contains the reflective surface; and an operation of replacing the target information with public information on the reflective surface in the visual content.

15. A computer program product comprising a computer-readable storage medium, wherein the computer-readable storage medium stores a computer code which, when executed by at least one processor, causes the at least one processor to perform the method according to any one of claims 8 to 14.