WO2021250564A1 - Method for accessing multimedia content - Google Patents

Method for accessing multimedia content

Info

Publication number
WO2021250564A1
Authority
WO
WIPO (PCT)
Prior art keywords
multimedia content
code
identification data
user
searching
Prior art date
Application number
PCT/IB2021/055029
Other languages
French (fr)
Inventor
Daniele LAZZARA
Enrico MINOTTI
Manolo MARTINI
Original Assignee
Pica Group S.P.A.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pica Group S.P.A.
Priority to EP21737494.1A (EP4162376A1)
Publication of WO2021250564A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24143 Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan

Definitions

  • The present invention relates to a method for accessing multimedia content, for example photos and/or videos.
  • The invention relates to an access method based on facial recognition that allows a subject to be recognised in a recognition image or video and grants access to multimedia content in which the same subject is present, while respecting privacy through the use of a code.
  • DESCRIPTION OF THE STATE OF THE ART
  • Participants in the event are generally interested in accessing multimedia content recorded during the event.
  • A few examples are a marathon runner who wishes to have his/her picture taken at the finishing line, a family that wishes to have a photo taken at an amusement park, or a group of friends at a concert who want a video to remember the event by.
  • This system suffers from a serious lack of privacy. To enable participants in the event to identify the multimedia content in which they are present, they must be able to access all the multimedia content in order to evaluate whether or not they appear in this content. In other words, any participant in the event can access all the multimedia content of the event, including content relating to other participants.
  • Any person with an Internet connection to the platform can access the content, even if extraneous to the event, resulting in an even greater infringement of the privacy of participants.
  • Facial recognition has been utilised to allow recognition of a participant in a given event.
  • To be able to access the multimedia content in which he/she appears, the user must know the context in which the multimedia content is present, for example the name of an event in which he/she participated, in order to find the platform that hosts the multimedia content and execute facial recognition there.
  • Said website needs to display a list of said events, so that the user can choose the event in which to search for the multimedia content in which he/she appears.
  • The service provider for consultation of the multimedia content, for example the administrator of said website, must explicitly indicate all the events among which a search can be performed. This means that some events which, due to their nature, should preferably be private, such as a wedding, a corporate event, or the like, are visible.
  • the solution devised allows access to multimedia content in an anonymous and privacy compliant manner generally utilising two paired criteria: a code and a recognition image.
  • the recognition image comprises a face, for example the photo of a person, and advantageously allows the recognition of multimedia content comprising this face with a facial recognition procedure.
  • the code allows access to the multimedia content to be limited based on said code.
  • An embodiment can relate to a method for accessing multimedia content comprising the steps of: generating a first code, acquiring multimedia content, acquiring the first code, acquiring a recognition image, extracting identification data of the recognition image, searching for multimedia content associated with the identification data, enabling access to the multimedia content resulting from the step of searching.
  • The steps of acquiring and extracting can be performed by a user device; the method can further comprise, after the step of extracting, a step of transmitting the identification data from the user device to a service provider device.
  • The step of searching can comprise a step of extracting identification data of at least one of the multimedia content, and a step of computing a difference between the identification data of the recognition image and the identification data of the at least one of the multimedia content. Due to this configuration, it is advantageously possible to search through the multimedia content in an efficient and reliable manner without making it visible to third parties.
  • the method can further comprise a step of associating the multimedia content, resulting from the step of searching, with a second code.
  • the second code can correspond to the first code.
  • the second code can be a unique code, different from the first code, preferably deriving from the first code.
  • the method can further comprise a step of removing at least part of the multimedia content resulting from the step of searching.
  • the method can further comprise a second step of extracting identification data from at least one multimedia content resulting from the step of searching, and a second step of searching for multimedia content associated with the identification data resulting from the second step of extracting.
  • An embodiment can also relate to a system for accessing multimedia content comprising at least: a user device, a photographer device, a service provider device, connectable to the user device and to the photographer device, where the system can be configured so as to execute the method according to any one of the previous embodiments.
  • An embodiment can also relate to a service provider device configured so as to execute the steps of: generating a first code, acquiring multimedia content, acquiring the first code, acquiring identification data of a recognition image, searching for multimedia content associated with the identification data, enabling access to the multimedia content resulting from the step of searching.
  • Fig. 1 schematically illustrates a method 1000 for accessing multimedia content;
  • Fig. 2 schematically illustrates a method 2000 for accessing multimedia content;
  • Fig. 3A schematically illustrates a plurality of multimedia content 3200 and a plurality of codes 3100;
  • Fig. 3B schematically illustrates a plurality of multimedia content 3200, some of which is associated with a code 3100;
  • Fig. 3C schematically illustrates a plurality of multimedia content 3200, some of which is associated with a code 3110;
  • Fig. 4 schematically illustrates devices of a system 4000;
  • Fig. 5 schematically illustrates a step of searching S5600 for multimedia content;
  • Fig. 6 schematically illustrates a method 6000 for accessing multimedia content;
  • Fig. 7 schematically illustrates a method 7000 for accessing multimedia content;
  • Fig. 8 schematically illustrates a method 8000 for accessing multimedia content;
  • Fig. 9 schematically illustrates a method 9000 for accessing multimedia content.
  • Fig. 1 schematically illustrates a method 1000 for accessing multimedia content.
  • Fig. 4 schematically illustrates a system 4000 that can implement the method 1000 of Fig. 1 and optionally other methods described below.
  • the method 1000 can be implemented by a system 4000 for accessing multimedia content comprising at least:
  • a user device 4300, for example operated by a user who participated in an event;
  • a photographer device 4500, for example operated by a photographer who captured multimedia content at the event;
  • a service provider device 4400 connectable to the user device 4300 and to the photographer device 4500, preferably so as to be able to exchange data therewith, for example via the Internet.
  • the system 4000 can be configured so as to execute the method 1000, or other methods described below, and advantageously allow the photographer to upload multimedia content relating to the event to the service provider device 4400, and the user to access multimedia content in which he/she is present, among those available on the service provider device 4400.
  • Some embodiments of the invention may also relate to a single one of the devices 4300, 4400, 4500, configured so as to execute one or more steps of the method 1000, or of other methods described below.
  • the invention allows this result to be obtained without necessarily requiring contact between the user and the photographer, thus improving the approach to the privacy of the data of both. Moreover, as will be evident below, the invention allows the user to prevent undesirable accesses to the multimedia content that portrays him/her.
  • the devices 4300, 4400, 4500 can generally comprise electronic devices operated by the operators identified above, i.e., the user, the service provider, and the photographer, for the purpose of implementing one or more of the steps of the method 1000. It is therefore evident that these devices can be implemented generically by any electronic device with hardware and software suitable for implementation of one or more of the steps described below, for example a PC, a smartphone, a tablet, a camera, etc.
  • The method 1000 comprises a step of generating S1100 a code 3100.
  • The code 3100 can generally be a code interpretable by a computer, for example an alphanumerical code and/or a graphic code, for example a barcode, an ArUco code, a QR code, etc.
  • The step S1100 can be implemented, for example, by the service provider device 4400.
  • the code 3100 can be distributed to one or more users, for example electronically by sending to the respective user device 4300, but also orally and/or on paper.
  • It will not be necessary for the service provider device 4400 to identify and/or store data concerning the one or more users to whom the code 3100 has been provided, making the distribution procedure of the code 3100 particularly simple and effective.
  • the code 3100 can be unique for a given event. In this case the code 3100 can be common to a plurality of users participating in the given event. Alternatively, or additionally, the code 3100 can be unique for a given user.
  • the code 3100 can be applied to both configurations. Any differences of implementation of the invention in the two cases shall be discussed below.
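  • As an illustration of the step of generating the code 3100, the following minimal sketch produces random alphanumerical codes, either one per event or one per user. The alphabet, the code length, and the function names are assumptions made for this example and are not taken from the description.

```python
import secrets
import string

# Assumed alphabet and length; the description only says the code can be alphanumerical.
ALPHABET = string.ascii_uppercase + string.digits


def generate_code(length: int = 8) -> str:
    """Return a random alphanumerical code of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_unique_code(issued: set[str], length: int = 8) -> str:
    """Generate a code that has not been issued before and record it."""
    while True:
        code = generate_code(length)
        if code not in issued:
            issued.add(code)
            return code


issued_codes: set[str] = set()
# Configuration 1: one code shared by all participants in an event.
event_code = generate_unique_code(issued_codes)
# Configuration 2: one code per user (three hypothetical users here).
user_codes = [generate_unique_code(issued_codes) for _ in range(3)]
```

  • Note that the sketch stores only the issued codes, not any data about the users who receive them, in line with the distribution procedure described above.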
  • the method 1000 further comprises a step of acquiring S1200 multimedia content 3200.
  • the multimedia content 3200 can, for example, be photographs, videos, or the like.
  • the step S1200 can be implemented, for example, by the photographer device 4500 and/or by the service provider device 4400. In some embodiments this can take place, for example, by uploading the multimedia content 3200 to an online platform, for example a website, an app, or the like managed by the service provider, more generally transmitting the multimedia content 3200, acquired by the photographer device 4500, to the service provider device 4400.
  • Although the step S1200 is represented schematically as subsequent to the step S1100, in some embodiments it will be possible to acquire the multimedia content 3200 before or in parallel to the step of generating S1100 the code 3100.
  • Fig. 3A schematically illustrates three codes 3100, which are intended as being different from one another as indicated schematically by the indices #1-3, and six multimedia content 3200, also intended as being different from one another as indicated schematically by the indices #1-6. Therefore, this figure can schematically represent the content of a data memory of the service provider device 4400 after execution of the steps S1100 and S1200.
  • the method 1000 further comprises a step of acquiring S1300 the code 3100.
  • The step of acquiring S1300 the code 3100 can preferably be implemented by the user device 4300. It will be clear that execution of the step S1300, illustrated in Fig. 1 as preceding the step S1500, is not limited to this positioning in time. In general, it will be sufficient for the step S1300 to be after the step S1100, where the code 3100 is generated, and before the step S1700, where, as will be described below, the code 3100 is utilised.
  • the code 3100 allows the service provider device 4400 to associate the user with the code 3100, or with a code deriving from the code 3100, so as to ensure that the user has control of the multimedia content that is recognised as comprising his/her face.
  • the method 1000 can therefore comprise a step of checking use of the code 3100 and a step of terminating the method, not illustrated, in the case in which, following the step of checking use, it is found that the code 3100 acquired is already present in the database of codes utilised.
  • the steps of checking use and terminating can, in some embodiments, be after the step S1300. In some embodiments, the steps of checking use and terminating can preferably be implemented by the user device 4300, or by the service provider device 4400.
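  • The steps of checking use and terminating can be sketched as follows. The in-memory set standing in for the database of codes utilised, and the choice to record a code the first time it is acquired, are assumptions made for this example.

```python
def check_code_use(code: str, used_codes: set[str]) -> bool:
    """Return True if the method may continue, False if it must terminate.

    `used_codes` is a hypothetical stand-in for the database of codes utilised.
    """
    if code in used_codes:
        # The acquired code was already utilised: terminate the method.
        return False
    # Assumption: the code is recorded as utilised on its first acquisition.
    used_codes.add(code)
    return True


used: set[str] = set()
first_attempt = check_code_use("A1B2C3D4", used)   # code not seen before
second_attempt = check_code_use("A1B2C3D4", used)  # same code presented again
```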
  • the method 1000 further comprises a step of acquiring S1400 a recognition image.
  • the step of acquiring the code 3100 can preferably be implemented by the user device 4300.
  • the recognition image can be acquired in the form of an image file, or extracted as frame of a video.
  • the acquisition operation can be carried out in a known manner, for example by taking a photo using a camera and/or smartphone, and/or via the user device 4300, and/or acquiring a file already saved via software, an app, or the like.
  • the recognition image preferably comprises the face of the user, so as to allow execution of facial recognition, as will be described below.
  • the step S1400 can be implemented by the user device 4300. In some embodiments, the step S1400 can further comprise taking a selfie and/or a step of checking that a photo of a real face has been taken.
  • This last step can be implemented in the form of a "liveness detection" step.
  • This term identifies a class of algorithms with which it is possible to determine that a selfie was taken from a real face and not from a photo of said face. Some examples of these algorithms are known, for example, from the documents:
  • The advantage of this approach is that it ensures that the recognition image derives from a real person, i.e., the user of the method, and not from another image of that person. In this way it is advantageously possible to ensure that third parties cannot exploit photographs or videos of a user to access multimedia content portraying the user; only the user, by taking a selfie verified with the methods described, is allowed to access it.
  • the method 1000 further comprises a step of extracting S1500 identification data of the recognition image.
  • It will be possible to analyse the recognition image so as to extract identification data of the recognition image, i.e., a series of data such that a unique correspondence exists between the recognition image and said identification data, preferably between the face present in the recognition image and said identification data.
  • the extraction function of the identification data can be configured so as to result in a unique correspondence of all said recognition images of the given person and a unique set of identification data, or of all said recognition images of the given person and a plurality of sets of identification data, with a difference between them below a predetermined value.
  • identification data can be interpreted as a digital code associated with the face of a given person.
  • This extraction of identification data can be obtained, for example, utilising a suitable algorithm for the extraction of identification data, such as the one provided by the open source library DLIB, available at the address http://dlib.net;
  • the step S1400 can be executed by the user device 4300 while the step S1500 can be executed by the service provider device 4400.
  • a further step, not illustrated, of transmission of the recognition image from the user device 4300 to the service provider device 4400 will also be provided.
  • the steps S1400 and S1500 can be executed by the user device 4300, which advantageously makes it possible to avoid transmission of the recognition images to the service provider device 4400.
  • The method 2000 can further comprise a step S2510 of transmitting/receiving the identification data of the recognition image to the service provider device 4400, preferably between the steps S1500 and S1600.
  • The step S2510 can advantageously be performed as transmission by the user device 4300 and as reception by the service provider device 4400.
  • This implementation is particularly advantageous as it reduces the amount of data to be transmitted from the user device 4300 to the service provider device 4400, and the amount of data that the latter is required to save. Moreover, the computational resources required for said extraction are distributed between the various user devices 4300, reducing the workload of the service provider device 4400. Finally, images portraying the user are neither transmitted nor stored by the service provider, reducing the risk of accidental dissemination of said images.
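  • A minimal sketch of the transmission described above is given below: the user device serialises only the code and the identification data, and no image is included in the message. The JSON format, the field names, and the decision to send the code in the same message are assumptions made for this example.

```python
import json


def build_search_request(code: str, identification_data: list[float]) -> str:
    """User device side: serialise the code and the identification data only.

    No image bytes are part of the message (field names are hypothetical).
    """
    return json.dumps({"code": code, "identification_data": identification_data})


def parse_search_request(payload: str) -> tuple[str, list[float]]:
    """Service provider side: recover the code and the identification data."""
    message = json.loads(payload)
    return message["code"], [float(x) for x in message["identification_data"]]


# Short made-up vector; a real face descriptor would be longer.
payload = build_search_request("EVENT001", [0.12, -0.05, 0.33])
code, vector = parse_search_request(payload)
```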
  • the method 1000 further comprises a step of searching S1600 for multimedia content associated with the identification data.
  • the step of searching S1600 can preferably be implemented by the service provider device 4400.
  • the search can generally be aimed at identifying one or more multimedia content comprising a face identified based on the identification data extracted from the recognition image.
  • The search can be limited to the multimedia content associated with a given event. In this case, the service provider needs to know which multimedia content is associated with a given event. In an embodiment, this will be possible by implementing in step S1200 the acquisition of an event code, provided to the photographer, or by asking the photographer to indicate an event from a list of available events.
  • The event code provided to the photographer can be the same code 3100.
  • If the code provided to the user in step S1100 is unique for the user, it will be possible to implement in step S1100 an association of the plurality of unique codes provided to the users with a single event code, provided to the photographers. Subsequently, during acquisition of the user code in step S1300, it will advantageously be possible to identify the event code associated therewith and consequently recognise the multimedia content deriving from this event.
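  • The association between unique user codes and a single event code can be sketched with two lookup tables, as below. The dictionaries, function names, and sample identifiers are hypothetical; a real service provider device would presumably use a database.

```python
# Hypothetical in-memory tables kept by the service provider device.
user_code_to_event: dict[str, str] = {}
event_to_content: dict[str, list[str]] = {}


def register_user_code(user_code: str, event_code: str) -> None:
    """Associate a unique user code with the event code (done when generating codes)."""
    user_code_to_event[user_code] = event_code


def register_content(content_id: str, event_code: str) -> None:
    """Record that uploaded content belongs to an event (done when acquiring content)."""
    event_to_content.setdefault(event_code, []).append(content_id)


def searchable_content(user_code: str) -> list[str]:
    """Return the content the search should be limited to for this user code."""
    event_code = user_code_to_event.get(user_code)
    if event_code is None:
        return []
    return list(event_to_content.get(event_code, []))


register_user_code("USER-AAA", "EVENT-1")
register_user_code("USER-BBB", "EVENT-1")
register_content("photo-1", "EVENT-1")
register_content("photo-2", "EVENT-1")
register_content("photo-9", "EVENT-2")
```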
  • In step S1600, different methods for recognition of multimedia content can be implemented.
  • Fig. 5 illustrates, by way of example, a possible implementation of the step S1600 in the form of a step of searching S5600 for multimedia content.
  • the step of searching S5600 can comprise a step of extracting S5610 identification data of at least one of the multimedia content 3200.
  • this step of extracting can be performed analogously to the step of extracting S1500 identification data of the recognition image, already described.
  • Known algorithms can be used for implementation of the step of extracting S1500 and/or for the step of extracting S5610, possibly different from one another.
  • the step of searching S5600 can further comprise a step of computing S5620 a difference between the identification data of the recognition image and the identification data of the at least one of the multimedia content 3200.
  • The training of the neural network used to extract the identification data of the recognition image can be the same as that of the neural network used to extract the identification data of the multimedia content.
  • In this embodiment, the two neural networks indicated above are thus advantageously aligned, since they share the same training dataset.
  • This can advantageously be implemented by sharing the training dataset between the two neural networks.
  • the reference training dataset is saved on the service provider device 4400 and shared with the user device 4300.
  • extraction of the vectors indicated above starting from a given image or multimedia content can be obtained through known identification data extraction algorithms, such as:
  • The neural network utilised can be a version reduced to 29 convolutional layers with a lower number of filters, for example half, with respect to ResNet-34 defined in the document:
  • Extraction of the identification data can be obtained with any algorithm for the extraction of facial features.
  • Algorithms based on the model 128-D defined above are preferable, as this allows a specific algorithm, optionally optimised, to be utilised for each particular platform.
  • on the server it may be more convenient to utilise DLIB rather than other libraries for integration with the CUDA interfaces of the Nvidia GPUs, while on mobile devices that are not equipped with Nvidia GPUs it may be more efficient to utilise other extraction algorithms for the purpose of improving performance.
  • The multimedia content for which the difference, or the Euclidean distance in the specific example above, is lower than a predefined threshold value can be considered as comprising a face identified based on the identification data extracted from the recognition image, and hence as belonging to the results of the step of searching S1600. Although the description above has been given, for clarity of disclosure, in relation to a single multimedia content, it will be evident that the invention can compare the identification data of the recognition image with a plurality of identification data corresponding to a plurality of multimedia content.
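  • The comparison against a threshold can be sketched as follows. The sketch assumes that identification data are numeric vectors and that each multimedia content may contain several faces; the threshold of 0.6, the short 3-element vectors, and all names are illustrative assumptions, not values taken from the description.

```python
import math


def search_content(
    recognition_vector: list[float],
    content_vectors: dict[str, list[list[float]]],
    threshold: float = 0.6,  # illustrative value only
) -> list[str]:
    """Return the ids of content with at least one face closer than `threshold`.

    `content_vectors` maps a content id to the vectors of the faces found in it.
    """
    results = []
    for content_id, face_vectors in content_vectors.items():
        if any(math.dist(recognition_vector, face) < threshold for face in face_vectors):
            results.append(content_id)
    return results


# Toy data with 3 components per vector to keep the example short.
query = [0.0, 0.0, 0.0]
library = {
    "photo-1": [[0.1, 0.2, 0.2]],                    # distance 0.3
    "photo-2": [[1.0, 1.0, 1.0]],                    # distance about 1.73
    "photo-3": [[2.0, 0.0, 0.0], [0.3, 0.0, 0.4]],   # second face at distance 0.5
}
matches = search_content(query, library)
```

  • With the toy data above, `photo-1` and `photo-3` fall below the threshold and `photo-2` does not.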
  • the method 1000 further comprises a step of enabling S1700 access to the multimedia content resulting from the step of searching S1600.
  • the step of enabling S1700 can preferably be implemented by the service provider device 4400.
  • the multimedia content resulting from the step of searching S1600 is multimedia content in which the face of the user, identified by the recognition image, is present.
  • the step of enabling can therefore be implemented so as to allow the user, preferably through the user device 4300, to access the multimedia content resulting from the step of searching S1600.
  • The specific implementation, as will be evident to the person skilled in the art, can generally be obtained through an association of the multimedia content resulting from the step of searching S1600 with permissions associated with the user to whom the identification data of the recognition image belong, for example by allowing the user in question to create an account with the service provider device 4400. Additionally, or alternatively, it will be possible to perform an association of the multimedia content resulting from the step of searching S1600 with permissions associated with the user device 4300.
  • the method 1000 can further comprise a step of associating S6800 multimedia content 3200 resulting from the step of searching S1600 with a second code 3110.
  • the step of associating S6800 can preferably be implemented by the service provider device 4400.
  • The step S6800, illustrated in Fig. 6 as subsequent to the step S1600, can be implemented before, after or simultaneously with the step S1700.
  • The step of associating S6800 allows the multimedia content 3200 resulting from the step of searching S1600, and hence deriving from the recognition image, to be associated with a second code 3110 associated with the user, i.e., with the person illustrated in the recognition image. In this way it will be possible to identify this multimedia content in subsequent searches, so as to prevent the multimedia content from being shown to other users. This function will be described in detail below.
  • the second code 3110 can have features similar to those already described for the first code.
  • Association can take place in a known way, for example creating a tag associated with the multimedia content and containing the second code 3110. Alternatively, or additionally, it will be possible to register an association between the second code 3110 and the one or more multimedia content in a database.
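  • Registering the association in a database can be sketched with an in-memory SQLite table, as below. The table name, column names, and sample identifiers are assumptions made for this example.

```python
import sqlite3

# Hypothetical schema: one row per (multimedia content, second code) association.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE association ("
    " content_id TEXT NOT NULL,"
    " second_code TEXT NOT NULL,"
    " PRIMARY KEY (content_id, second_code))"
)


def associate(db: sqlite3.Connection, content_ids: list[str], second_code: str) -> None:
    """Associate each content resulting from the search with the second code."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR IGNORE INTO association (content_id, second_code) VALUES (?, ?)",
            [(content_id, second_code) for content_id in content_ids],
        )


def content_for_code(db: sqlite3.Connection, second_code: str) -> list[str]:
    """Return all content associated with the given second code."""
    rows = db.execute(
        "SELECT content_id FROM association WHERE second_code = ? ORDER BY content_id",
        (second_code,),
    )
    return [row[0] for row in rows]


associate(conn, ["content-1", "content-2", "content-3"], "CODE-1")
```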
  • the second code 3110 can correspond to the first code 3100.
  • This embodiment is particularly advantageous, for example, in the case in which the first code 3100 is unique for each user.
  • Since the second code 3110 is unique for the user, its association with the multimedia content in which the user has been identified will allow this multimedia content to be identified unequivocally, in future searches, as belonging to the user in question.
  • This will also advantageously allow rapid identification of all the multimedia content associated with the user, with input by the latter of the first code 3100, which is known to him/her.
  • This embodiment is schematically illustrated in Fig. 3B.
  • It is assumed that the code #1 is the one acquired by the user at step S1300 and that it is a unique code associated with the user that executed the method 1000.
  • The multimedia content #1-3 comprises the face of the user, recognised by the recognition image acquired by the user at step S1400, which consequently led to identification of the multimedia content #1-3 at step S1600.
  • The result of the step of associating S6800, schematically illustrated by the dashed lines, thus allows association of the code #1, linked to the user, with the multimedia content #1-3, comprising the face of the user.
  • the second code 3110 can be a unique code, different from the first code 3100.
  • This embodiment is particularly advantageous, for example, in the case in which the first code 3100 is unique for each event but common to a plurality of users.
  • The step of associating S6800 can be preceded by a step of creating S7810 the second code 3110, as schematically illustrated in Fig. 7.
  • The step of creating S7810 can preferably be implemented by the service provider device 4400.
  • The step S7810, illustrated in Fig. 7 as preceding the step S6800, need not be implemented immediately before the step S6800 but, more generally, between the step S1400, which gives rise to a recognition procedure associated with a user, and the step S6800, where the use of the second code 3110 is necessary.
  • This embodiment is schematically illustrated in Fig. 3C.
  • It is assumed that the code #1 is the code acquired by the user at step S1300 and that it is a unique code associated with an event, but common to a plurality of users participating in that event.
  • The second code #1-1 is associated with the user who is executing the method and, as in the case of Fig. 3B, it is associated with the multimedia content #1-3, as indicated by the dashed lines.
  • the second code 3110 can preferably be derived from the first code 3100.
  • By the term "derived" it is meant that at least part of the first code 3100 is utilised for the generation of a plurality of second codes 3110.
  • At least part of the code #1 can be utilised for the generation of the codes #1-1 and #1-2.
  • The generation of the second codes will be such that, starting from the second code 3110, it is possible to trace the part of the first code 3100 from which the second code 3110 was generated.
  • the first code 3100 could be an alphanumerical code of X characters and the second code 3110 could be an alphanumerical code of X+Y characters, where the X characters of the first and of the second code correspond, and where the Y characters of the second code allow the creation of a plurality of second codes, unique for each user.
  • the generation of a second code 3110 in a manner derived from the first code 3100 advantageously allows the service provider device 4400 to identify not only the multimedia content associated with a single user, as described previously, but also that associated with a single event, in a simple and effective way.
  • the method 1000 can further comprise a step of removing S8900 at least part of the multimedia content 3200 resulting from the step of searching S1600.
  • the step of removing S8900 can preferably be implemented by the service provider device 4400.
  • step S8900 illustrated in Fig. 8 as subsequent to the step S1600 can be implemented before, after or simultaneously to any one of the steps subsequent to the step S1600.
  • the aim of the step of removing S8900 is to prevent a multimedia content 3200, resulting from the step of searching S1600, from also being the result of another search, with the exception of multimedia content with several faces, so as to prevent access to the multimedia content from being authorised also for users who are not portrayed.
  • the step of removing S8900 can be implemented, in some embodiments, by producing a copy of the multimedia content 3200 resulting from the step of searching S1600, and eliminating/modifying the pixels concerned by the area in which the face corresponding to the identification data of the recognition image was recognised. In subsequent executions of the method 1000, it will then be possible to execute the step of searching S1600 on the multimedia content thus modified, thereby making it impossible for the same face to be recognised more than once and hence inadvertently granting access to the content of a user to other users.
  • the step of removing S8900 can be implemented, in some embodiments, by identifying the area in which the face corresponding to the identification data of the recognition image was recognised and saving information relating to said area in association with the multimedia content.
  • in subsequent executions of the method 1000, the step of searching S1600 can then be executed on the multimedia content with the exception of said area, thus making it impossible for the same face to be recognised more than once, with the same advantages already described.
  • the term "removal" shall therefore in this case not imply removal of data from the multimedia content but only identification of areas to be excluded from subsequent searches. This approach has the advantage of reducing the amount of data that must be saved, as no copy of the multimedia content is required.
  • the step of removing S8900 in any case allows the different faces to be identified and the different users portrayed therein to access the multimedia content.
  • Fig. 9 represents a further embodiment of the invention.
  • the method 9000 differs from the method 1000 due to the additional presence of the steps S9410 and S9610.
  • the search for the multimedia content could be executed several times based on the extracted identification data of the recognition image. This can, in some cases, lead to failure to recognise multimedia content in the case in which the face of the user has changed between the time of the event and the time of execution of the method, for example due to the presence/absence of glasses, different haircut, beard, etc.
  • the method 9000 further comprises a second step of extracting S9410 identification data from at least one multimedia content resulting from the step of searching S1600.
  • these identification data can be more representative of the face of the user at the time of the event than those extracted from the recognition image. It is therefore advantageous to continue and/or repeat the search for the multimedia content utilising the identification data extracted at step S9410 instead of those extracted at step S1500.
  • the second step of searching S9610 can generally be implemented just as the step of searching S1600, S5600 already described, with the difference that the search is based on the identification data extracted at step S9410 instead of those extracted at step S1500.
  • the steps S9410 and S9610 can be implemented by the service provider device 4400.
  • 3100: first code; 3110: second code; 3200: multimedia content
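The derivation of second codes from a first code described above (X shared characters followed by Y user-specific characters) can be sketched as follows. This is a minimal illustration only: the alphabet, the suffix length, the example code value and the function names `derive_second_code` and `event_part` are assumptions and are not taken from the description.

```python
import secrets
import string

# Assumed alphabet for alphanumerical codes (not specified in the description).
ALPHABET = string.ascii_uppercase + string.digits


def derive_second_code(first_code, issued, suffix_len=6):
    """Return first_code plus suffix_len random characters, unique within issued."""
    while True:
        suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_len))
        candidate = first_code + suffix
        if candidate not in issued:  # retry on the (unlikely) collision
            issued.add(candidate)
            return candidate


def event_part(second_code, first_len):
    """Recover the X leading characters that the second code shares with the first code."""
    return second_code[:first_len]


first_code = "EVT2021A"  # hypothetical event-wide first code (X = 8)
issued_codes = set()
code_a = derive_second_code(first_code, issued_codes)
code_b = derive_second_code(first_code, issued_codes)
```

Because the first X characters are preserved, a service provider holding only a second code can still group content by event through `event_part`, while the Y-character suffix distinguishes users.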

Abstract

The invention relates in general to a method (1000) for accessing multimedia content comprising the steps of: generating (S1100) a first code (3100), acquiring (S1200) multimedia content (3200), acquiring (S1300) the first code (3100), acquiring (S1400) a recognition image, extracting (S1500) identification data of the recognition image, searching (S1600, S5600) for multimedia content associated with the identification data, and enabling (S1700) access to the multimedia content resulting from the step of searching (S1600).

Description

METHOD FOR ACCESSING MULTIMEDIA CONTENT
DESCRIPTION
The present invention relates to a method for accessing multimedia content, for example photos and/or videos. In particular, the invention relates to an access method based on facial recognition that allows a subject to be recognised in a recognition image or video and allows access to multimedia content in which the same subject is present, while maintaining respect for privacy through the use of a code.
DESCRIPTION OF THE STATE OF THE ART
During various events, such as sporting competitions, concerts, meetings and celebrations, a large number of the public and one or more photographers are present. During the event, the photographers can take one or more photos, or film one or more videos, which hereinafter shall both be defined as multimedia content.
Participants in the event are generally interested in accessing multimedia content recorded during the event. A few examples can be that of a marathon runner who wishes to have his/her picture taken at the finishing line, a family that wishes to have a photo taken at an amusement park, or a group of friends at a concert who want a video to remember the event by.
In these cases, it is useful to be able to allow the people appearing in the photo to access multimedia content. In the past, for this purpose, systems were used in which photographers uploaded multimedia content relating to the event to a website or other platform and participants at the event accessed the platform, searched manually for the multimedia content in which they appeared, and then downloaded it or, where required, purchased it.
This system suffers from a serious lack of privacy. To enable participants in the event to identify the multimedia content in which they are present, they must be able to access all the multimedia content in order to evaluate whether or not they appear in this content. In other words, any participant in the event can access all the multimedia content of the event, including content relating to other participants.
Moreover, in particular in the case in which access to the platform hosting the multimedia content is not limited to the participants in the event, any person with an Internet connection to the platform can access the content, even if extraneous to the event, resulting in an even greater infringement of the privacy of participants.
Recently, facial recognition has been utilised to allow recognition of a participant in a given event. To be able to access the multimedia content in which he/she appears, the user must know the context in which the multimedia content is present, for example the name of an event in which he/she participated, with the aim of finding the platform that hosts the multimedia content, so as to execute facial recognition.
This requires the user to choose the event, for example on a website that collects the multimedia content of different events. In this configuration, said website is required to provide a list of said events, so that the user can choose the event in which to search for the multimedia content in which he/she appears.
This solution has various problems. The service provider for consultation of the multimedia content, for example the administrator of said website, must explicitly indicate all the events among which a search can be performed. This means that some events, which due to their nature should preferably be private, such as for example a wedding, a corporate event, or the like, are visible.
It is also possible for anyone, choosing an event randomly and uploading a photo of a third person, for example obtained from Internet sources such as Facebook, LinkedIn, Instagram, WhatsApp or the like, to match it with the multimedia content of the event. In other words, anyone in possession of any photo of a given person can obtain access to multimedia content in which this person is present. This clearly presents a problem for privacy.
Moreover, it has been seen that a cross search for a face in all the possible events performed by an event container, for example said website, can easily generate false positives, due to the large amount of multimedia content that increases the number of similar faces.
Additionally, facial recognition on a large amount of multimedia content makes use of a large amount of computational resources and is thus much more costly for the administrator of the platform that offers the service. Therefore, there is the need to allow participants in an event to access multimedia content that portrays them in a manner that is simple for the participants, i.e., the service users, and that is safe, which prevents infringements of privacy, and which can be technically and commercially managed by the service provider for accessing multimedia content.
SUMMARY OF THE INVENTION
The solution devised allows access to multimedia content in an anonymous and privacy compliant manner generally utilising two paired criteria: a code and a recognition image.
In particular, the recognition image comprises a face, for example the photo of a person, and advantageously allows the recognition of multimedia content comprising this face with a facial recognition procedure. Additionally, the code allows access to the multimedia content to be limited based on said code.
In this way, it is not necessary to provide a specific list of events, nor to allow access to all multimedia content in order to allow each single user to carry out his/her own search.
An embodiment can relate to a method for accessing multimedia content comprising the steps of: generating a first code, acquiring multimedia content, acquiring the first code, acquiring a recognition image, extracting identification data of the recognition image, searching for multimedia content associated with the identification data, enabling access to the multimedia content resulting from the step of searching.
Due to this configuration, it is advantageously possible to enable access to multimedia content in a privacy compliant manner and with a limited use of computational resources.
In some embodiments, the steps of acquiring and extracting can be performed by a user device, and the method can further comprise, after the step of extracting, a step of transmitting the identification data from the user device to a service provider device.
Due to this configuration, it is advantageously possible to limit distribution of the recognition image outside the user device.
In some embodiments, the step of searching can comprise a step of extracting identification data of at least one of the multimedia content, a step of computing a difference between the identification data of the recognition image and the identification data of the at least one of the multimedia content. Due to this configuration, it is advantageously possible to implement the search in an efficient and reliable manner through the multimedia content without making it visible to third parties.
In some embodiments, the method can further comprise a step of associating the multimedia content, resulting from the step of searching, with a second code.
Due to this configuration, it is advantageously possible to allow subsequent recognition of the multimedia data resulting from the search, for example to speed up subsequent searches and/or catalogue the multimedia data utilising the second code as tag or key.
In some embodiments, the second code can correspond to the first code.
Due to this configuration, it is advantageously possible to directly associate the code provided by the user with the multimedia content that portrays him/her.
In some embodiments, the second code can be a unique code, different from the first code, preferably deriving from the first code.
Due to this configuration, it is advantageously possible to use the first code as code associated with a given event.
In some embodiments, the method can further comprise a step of removing at least part of the multimedia content resulting from the step of searching.
Due to this configuration, it is advantageously possible to prevent the content that portrays a first user and at least a second user from making the first user viewable by the second user, leaving the first user the choice of whether to remove the part of the multimedia content that portrays him/her.
In some embodiments, the method can further comprise a second step of extracting identification data from at least one multimedia content resulting from the step of searching, and a second step of searching for multimedia content associated with the identification data resulting from the second step of extracting.
Due to this configuration, it is advantageously possible to allow the correct identification of multimedia content in which the face of a user is morphologically at least partially different from the content available in the recognition image, for example following a different haircut, make-up, beard, etc.
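The second extraction and second search summarised above can be sketched as a two-pass search. This is an illustrative sketch under stated assumptions: identification data are modelled as plain numeric tuples, the comparison uses a Euclidean distance with a hypothetical threshold, and the names `search`, `two_pass_search` and `library` are invented for the example.

```python
import math


def search(query, library, threshold):
    """Return ids of content with at least one face vector within threshold of query."""
    return {
        content_id
        for content_id, faces in library.items()
        if any(math.dist(query, face) <= threshold for face in faces)
    }


def two_pass_search(query, library, threshold):
    """Search with the recognition-image vector, then search again with the
    closest face vector found in each first-pass result."""
    first_pass = search(query, library, threshold)
    results = set(first_pass)
    for content_id in first_pass:
        # Second extraction: the face in the matched content closest to the query.
        event_face = min(library[content_id], key=lambda face: math.dist(query, face))
        results |= search(event_face, library, threshold)
    return results


# Hypothetical 2-dimensional identification data, one face per content item.
library = {
    "photo1": [(0.8, 0.0)],
    "photo2": [(1.6, 0.0)],
    "photo3": [(5.0, 5.0)],
}
selfie = (0.0, 0.0)
first = search(selfie, library, threshold=1.0)
both = two_pass_search(selfie, library, threshold=1.0)
```

In this toy data, `photo2` is too far from the selfie vector to be found directly but is close to the face found in `photo1`, so only the second pass retrieves it. A second pass of this kind can also widen the set of false positives, so the threshold would need to be chosen with care.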
An embodiment can also relate to a system for accessing multimedia content comprising at least: a user device, a photographer device, a service provider device, connectable to the user device and to the photographer device, where the system can be configured so as to execute the method according to any one of the previous embodiments.
Due to this configuration, it is advantageously possible to physically implement the method described above with a plurality of independent devices, which can be operated by different subjects so as to ensure easy access to the multimedia content by the user without the risk of invading his/her privacy.
An embodiment can also relate to a service provider device configured so as to execute the steps of: generating a first code, acquiring multimedia content, acquiring the first code, acquiring identification data of a recognition image, searching for multimedia content associated with the identification data, enabling access to the multimedia content resulting from the step of searching.
Due to this configuration, it is advantageously possible to implement a service provider device so as to ensure easy access to the multimedia content by a plurality of users without the risk of invading the privacy of each of the users and with an efficient use of the computational resources of the service provider device.
BRIEF DESCRIPTION OF THE FIGURES
Further features and advantages of the method according to the present invention will be more apparent from the following description, set down with reference to the accompanying figures, which illustrate some non-limiting embodiment examples thereof, in which identical or corresponding parts of the device are identified by the same reference numbers. In particular:
Fig. 1 schematically illustrates a method 1000 for accessing multimedia content;
Fig. 2 schematically illustrates a method 2000 for accessing multimedia content;
Fig. 3A schematically illustrates a plurality of multimedia content 3200 and a plurality of codes 3100;
Fig. 3B schematically illustrates a plurality of multimedia content 3200, some of which is associated with a code 3100;
Fig. 3C schematically illustrates a plurality of multimedia content 3200, some of which is associated with a code 3110;
Fig. 4 schematically illustrates devices of a system 4000;
Fig. 5 schematically illustrates a step of searching S5600 for multimedia content;
Fig. 6 schematically illustrates a method 6000 for accessing multimedia content;
Fig. 7 schematically illustrates a method 7000 for accessing multimedia content;
Fig. 8 schematically illustrates a method 8000 for accessing multimedia content; and
Fig. 9 schematically illustrates a method 9000 for accessing multimedia content.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
Fig. 1 schematically illustrates a method 1000 for accessing multimedia content. Fig. 4 schematically illustrates a system 4000 that can implement the method 1000 of Fig. 1 and optionally other methods described below.
In particular, the method 1000 can be implemented by a system 4000 for accessing multimedia content comprising at least:
- a user device 4300, for example operated by a user who participated in an event;
- a photographer device 4500, for example operated by a photographer who captured multimedia content at the event;
- a service provider device 4400, connectable to the user device 4300 and to the photographer device 4500, preferably so as to be able to exchange data therewith, for example via the Internet.
As will be evident from the description below, the system 4000 can be configured so as to execute the method 1000, or other methods described below, and advantageously allow the photographer to upload multimedia content relating to the event to the service provider device 4400, and the user to access multimedia content in which he/she is present, among those available on the service provider device 4400. Some embodiments of the invention may also relate to the single device 4300, 4400, 4500, configured so as to execute one or more steps of the method 1000, or of other methods described below.
Advantageously, the invention allows this result to be obtained without necessarily requiring contact between the user and the photographer, thus improving the approach to the privacy of the data of both. Moreover, as will be evident below, the invention allows the user to prevent undesirable accesses to the multimedia content that portrays him/her.
It is clear that, although a single user device 4300 and a single photographer device 4500 have been illustrated, the invention can also be implemented in the case of a plurality of user devices 4300 and of a plurality of photographer devices 4500, as will be evident from the description below.
The devices 4300, 4400, 4500 can generally comprise electronic devices operated by the operators identified above, i.e., the user, the service provider, and the photographer, for the purpose of implementing one or more of the steps of the method 1000. It is therefore evident that these devices can be implemented generically by any electronic device with hardware and software suitable for implementation of one or more of the steps described below, for example a PC, a smartphone, a tablet, a camera, etc.
The method 1000 comprises a step of generating S1100 a code 3100. The code 3100 can generally be a code interpretable by a computer, for example an alphanumerical code and/or a graphic code, for example a barcode, an ArUco code, a QR code, etc.
In some embodiments, the step S1100 can be implemented, for example, by the service provider device 4400.
Moreover, the code 3100 can be distributed to one or more users, for example electronically by sending to the respective user device 4300, but also orally and/or on paper. As will be evident from the description below, it will not be necessary for the service provider device 4400 to identify and/or store data concerning the one or more users to whom the code 3100 has been provided, making the distribution procedure of the code 3100 particularly simple and effective.
The code 3100 can be unique for a given event. In this case the code 3100 can be common to a plurality of users participating in the given event. Alternatively, or additionally, the code 3100 can be unique for a given user. Hereinafter, unless otherwise specified, it will be assumed that the invention can be applied to both configurations. Any differences of implementation of the invention in the two cases shall be discussed below.
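A minimal sketch of the step of generating a code, covering both configurations (one code per event, or one code per user), is given below. The code length, the alphabet and the function name `generate_code` are assumptions made for illustration; the description does not prescribe a particular generation scheme.

```python
import secrets
import string

# Assumed alphabet for an alphanumerical code.
ALPHABET = string.ascii_uppercase + string.digits


def generate_code(existing, length=8):
    """Generate a random alphanumerical code that is not already in existing."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in existing:  # guarantees uniqueness among issued codes
            existing.add(code)
            return code


issued = set()
# Configuration 1: a single code shared by all participants in an event.
event_code = generate_code(issued)
# Configuration 2: one code per user (three hypothetical users).
user_codes = [generate_code(issued) for _ in range(3)]
```

Note that nothing about the recipients is stored: only the set of issued codes is kept, which is consistent with the code being distributed without identifying the users.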
The method 1000 further comprises a step of acquiring S1200 multimedia content 3200. The multimedia content 3200 can, for example, be photographs, videos, or the like.
In some embodiments, the step S1200 can be implemented, for example, by the photographer device 4500 and/or by the service provider device 4400. In some embodiments this can take place, for example, by uploading the multimedia content 3200 to an online platform, for example a website, an app, or the like managed by the service provider, more generally transmitting the multimedia content 3200, acquired by the photographer device 4500, to the service provider device 4400.
It will be clear that although the step S1200 is represented schematically as subsequent to the step S1100, in some embodiments it will be possible to acquire the multimedia content 3200 before or in parallel to the step of generating S1100 the code 3100.
Fig. 3A schematically illustrates three codes 3100, which are intended as being different from one another as indicated schematically by the indices #1-3, and six multimedia content 3200, also intended as being different from one another as indicated schematically by the indices #1-6. Therefore, this figure can schematically represent the content of a data memory of the service provider device 4400 after execution of the steps S1100 and S1200.
The method 1000 further comprises a step of acquiring S1300 the code 3100. In some embodiments, the step of acquiring S1300 the code 3100 can preferably be implemented by the user device 4300. It will be clear that execution of the step S1300, illustrated in Fig. 1 as before the step S1500, is not limited to this positioning in time. In general, it will be sufficient for the step S1300 to be after the step S1100, where the code 3100 is generated, and before the step S1700, where, as will be described below, the code 3100 is utilised.
Generally, as will be described in more detail below, the code 3100 allows the service provider device 4400 to associate the user with the code 3100, or with a code deriving from the code 3100, so as to ensure that the user has control of the multimedia content that is recognised as comprising his/her face.
In some embodiments, in particular those in which the code 3100 is unique for each user, it will be possible to implement a step of checking the presence of the code 3100 acquired in a database of codes utilised, and a subsequent step of storing the code 3100 acquired in the database of codes utilised. In this way, it is advantageously possible to ensure that the same code 3100 is not utilised more than once, thereby increasing the privacy of the user associated with the code 3100. In some embodiments, the method 1000 can therefore comprise a step of checking use of the code 3100 and a step of terminating the method, not illustrated, in the case in which, following the step of checking use, it is found that the code 3100 acquired is already present in the database of codes utilised. The steps of checking use and terminating can, in some embodiments, be after the step S1300. In some embodiments, the steps of checking use and terminating can preferably be implemented by the user device 4300, or by the service provider device 4400.
The method 1000 further comprises a step of acquiring S1400 a recognition image.
In some embodiments, the step of acquiring the code 3100 can preferably be implemented by the user device 4300. The recognition image can be acquired in the form of an image file, or extracted as frame of a video. The acquisition operation can be carried out in a known manner, for example by taking a photo using a camera and/or smartphone, and/or via the user device 4300, and/or acquiring a file already saved via software, an app, or the like. The recognition image preferably comprises the face of the user, so as to allow execution of facial recognition, as will be described below. The step S1400 can be implemented by the user device 4300. In some embodiments, the step S1400 can further comprise taking a selfie and/or a step of checking that a photo of a real face has been taken. This last step can be implemented in the form of a "liveness detection" step. This term identifies a class of algorithms with which it is possible to determine that a selfie was taken from a real face and not from a photo of said face. Some examples of these algorithms are known, for example, from the documents:
- "An overview of face liveness detection", Saptarshi Chakraborty, Dhrubajyoti Das, International Journal on Information Theory (IJIT), Vol.3, No.2, April 2014;
- "Face liveness detection based on perceptual image quality assessment features with multi-scale analysis", Chun-Hsiao Yeh, Herng-Hua Chang, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV);
- "Face liveness detection using variable focusing", Sooyeon Kim, Sunjin Yu, Kwangtaek Kim, Yuseok Ban, Sangyoun Lee, 2013 International Conference on Biometrics (ICB);
- "Liveness Detection Using Face Recognition", Prakash Rajagopal, Vishesh Tanksale, CS B657 Final Project Report, http://vision.soic.indiana.edu/b657/sp2016/projects/prakraja/paper.pdf;
- "Real-time face liveness detection with Python, Keras and OpenCV", Jordan Van Eetveldt, https://towardsdatascience.com/real-time-face-liveness-detection-with-python-keras-and-opencv-c35dc70dafd3;
- The library "libfaceid", available for example on https://github.com/keyurr2/libfaceid
The advantage of this approach consists in the possibility of ensuring that the recognition image derives from a real person, i.e., the user of the method, and not from another image thereof. In this way it is advantageously possible to ensure that third parties do not exploit photographs or videos of a user to access multimedia content portraying the user, which instead advantageously only the user is allowed to access, by taking the selfie ascertained with the methods described.
The method 1000 further comprises a step of extracting S1500 identification data of the recognition image.
In particular, in some embodiments, it will be possible to analyse the recognition image so as to extract identification data of the recognition image, i.e., a series of data such that a unique correspondence exists between the recognition image and said identification data, preferably between the face present in the recognition image and said identification data. In some embodiments, given a plurality of recognition images all comprising the face of the same person, the extraction function of the identification data can be configured so as to result in a unique correspondence of all said recognition images of the given person and a unique set of identification data, or of all said recognition images of the given person and a plurality of sets of identification data, with a difference between them below a predetermined value.
In other words, said identification data can be interpreted as a digital code associated with the face of a given person. This extraction of identification data can be obtained, for example, utilising a suitable algorithm for the extraction of identification data such as the one provided:
- by the open source library DLIB, available at the address http://dlib.net;
- by the library OpenFace, available at the address https://cmusatyalab.github.io/openface/;
- by the library OpenBiometrics, available at the address http://openbiometrics.org;
- by the library FaceNet, available at the address https://github.com/davidsandberg/facenet
In an embodiment, the step S1400 can be executed by the user device 4300 while the step S1500 can be executed by the service provider device 4400. In this case, a further step, not illustrated, of transmission of the recognition image from the user device 4300 to the service provider device 4400 will also be provided.
In a preferred embodiment, the steps S1400 and S1500 can be executed by the user device 4300, which advantageously makes it possible to avoid transmission of the recognition images to the service provider device 4400. In this case, as illustrated in Fig. 2, the method 2000 can further comprise a step of transmitting/receiving the identification data S2510 of the recognition image to the service provider device 4400, preferably comprised between the steps S1500 and S1600.
In some embodiments, the step S2510 can advantageously be performed as transmission from the user device 4300 and as reception from the service provider device 4400.
This implementation is particularly advantageous as it reduces the amount of data to be transmitted from the user device 4300 to the service provider device 4400, and the amount of data that the latter is required to save. Moreover, the computational resources required for said extraction are distributed between the various users 4300, reducing the workload of the service provider device 4400. Finally, images portraying the user are neither transmitted nor stored by the service provider, reducing the risk of accidental dissemination of said images.
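The transmission of identification data instead of the recognition image can be sketched as follows. This is only an illustration: the JSON encoding, the field names `code` and `identification_data`, and the function names are hypothetical, and the transport between the devices (for example an encrypted connection) is outside the scope of the sketch.

```python
import json


def build_request(code, identification_data):
    """User-device side: serialise the code and the identification data only.
    The recognition image itself is never placed in the message."""
    return json.dumps({"code": code, "identification_data": identification_data})


def parse_request(body):
    """Service-provider side: recover the code and the identification vector."""
    message = json.loads(body)
    return message["code"], [float(x) for x in message["identification_data"]]


# Hypothetical, very short identification vector; real vectors are typically longer.
body = build_request("EVT2021A", [0.12, -0.5, 0.33])
code, vector = parse_request(body)
```

A message of this kind contains a fixed, small number of values regardless of the resolution of the recognition image, which illustrates the reduction in transmitted and stored data mentioned above.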
The method 1000 further comprises a step of searching S1600 for multimedia content associated with the identification data.
In some embodiments, the step of searching S1600 can preferably be implemented by the service provider device 4400.
In particular, in some embodiments, it will be possible to search from a plurality of multimedia content, comprising the multimedia content acquired in step S1200 but not necessarily limited to this content. The search can generally be aimed at identifying one or more multimedia content comprising a face identified based on the identification data extracted from the recognition image. In some embodiments, the search can be limited to the multimedia content associated with a given event. In this case, it is necessary for the service provider to know which multimedia content is associated with a given event. In an embodiment, this will be possible by implementing in step S1200 the acquisition of an event code, provided to the photographer, or by asking the photographer to indicate an event from a list of available events. In the embodiments in which the code 3100 is unique for a given event, the event code provided to the photographer can be the same code 3100. Alternatively, or additionally, in particular in the case in which the code provided to the user in step S1100 is unique for the user, it will be possible to implement in step S1100 an association of the plurality of unique codes provided to the users with a single event code, provided to the photographers. Subsequently, during acquisition of the user code in step S1300 it will be advantageously possible to identify the event code associated therewith and consequently recognise the multimedia content deriving from this event.
In this way it will advantageously be possible to limit the search for multimedia content to that relating to an event indicated by the user through acquisition of the code at step S1300.
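The restriction of the search to a single event can be sketched with two lookup tables, one associating user codes with an event and one associating multimedia content with an event. The table names, the code values and the function name `candidate_content` are hypothetical and serve only to illustrate the association described above.

```python
# Association created when the codes are generated: unique user codes -> event.
user_code_to_event = {"U-001": "EVT-A", "U-002": "EVT-A", "U-003": "EVT-B"}

# Association created when the photographer uploads content: content -> event.
content_to_event = {"photo1": "EVT-A", "photo2": "EVT-B", "photo3": "EVT-A"}


def candidate_content(user_code, user_code_to_event, content_to_event):
    """Return the ids of the content belonging to the event of the given user code."""
    event = user_code_to_event.get(user_code)
    if event is None:
        return []  # unknown code: nothing is searched
    return sorted(cid for cid, ev in content_to_event.items() if ev == event)


to_search = candidate_content("U-001", user_code_to_event, content_to_event)
```

Facial recognition would then be run only on `to_search` rather than on all stored content, which is what keeps both the false-positive rate and the computational cost bounded by the size of one event.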
It will be clear that for implementation of the step S1600 different methods for recognition of multimedia content can be implemented.
Fig. 5 illustrates, by way of example, a possible implementation of the step S1600 in the form of a step of searching S5600 for multimedia content.
In particular, in some embodiments, the step of searching S5600 can comprise a step of extracting S5610 identification data of at least one of the multimedia content 3200.
In some cases, this step of extracting can be performed analogously to the step of extracting S1500 identification data of the recognition image, already described. Alternatively, or additionally, known algorithms can be used for implementation of the step of extracting S1500 and/or for the step of extracting S5610, possibly different to one another.
The step of searching S5600 can further comprise a step of computing S5620 a difference between the identification data of the recognition image and the identification data of the at least one of the multimedia content 3200.
By way of example, given a vectorial formulation of the identification data P for the recognition image, $P = (P_1, P_2, \dots, P_N)$, and given a vectorial formulation of the identification data Q for the at least one of the multimedia content 3200, $Q = (Q_1, Q_2, \dots, Q_N)$, it will be possible to implement the step of computing as a Euclidean distance D between the two vectors:

$$D = \left[ (Q_1 - P_1)^2 + (Q_2 - P_2)^2 + \dots + (Q_N - P_N)^2 \right]^{1/2} \qquad \text{(Eq. 1)}$$

In some embodiments, for the purpose of providing a better result, training of the neural network that led to the extraction of the identification data for the recognition image can be the same used for the neural network utilised for extraction of the identification data of the multimedia content. In this embodiment it is thus advantageous to allow the two neural networks indicated above to be aligned, sharing the same training dataset. This can advantageously be implemented by sharing the training dataset between the two neural networks. Preferably, the reference training dataset is saved on the service provider device 4400 and shared with the user device 4300.
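As a minimal sketch, the Euclidean distance of Eq. 1 can be computed as shown below. The function name `euclidean_distance` is chosen here for illustration only, and the vectors are assumed to be plain sequences of numbers of equal length N.

```python
import math


def euclidean_distance(p, q):
    """Distance D of Eq. 1 between identification vectors p and q."""
    if len(p) != len(q):
        raise ValueError("identification vectors must have the same length N")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

The length check reflects the requirement that both vectors come from compatible extraction algorithms producing the same number of components.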
In the embodiments in which extraction of the identification data of the recognition image is implemented by the service provider device 4400, such as in the method illustrated in Fig. 2, it will be sufficient to utilise the same neural network for extraction of the identification data performed at step S2510 and at step S1600.
In some embodiments, extraction of the vectors indicated above starting from a given image or multimedia content can be obtained through known identification data extraction algorithms, such as:
"FaceNet: A Unified Embedding for Face Recognition and Clustering", Florian Schroff, Dmitry Kalenichenko, James Philbin, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2015,
which advantageously makes it possible to ensure compatibility of the identification data extracted from the different images, regardless of the library chosen.
In some embodiments the neural network utilised can be a version reduced to 29 convolutional layers with a lower number of filters, for example half, with respect to ResNet-34 defined in the document:
"Deep Residual Learning for Image Recognition", Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
In general, extraction of the identification data can be obtained with any algorithm for the extraction of facial features. Again in general, algorithms based on the 128-D model defined above are preferable, as they allow a specific algorithm, optionally optimised, to be utilised for each particular platform. By way of example, on the server it may be more convenient to utilise DLIB rather than other libraries, for integration with the CUDA interfaces of the Nvidia GPUs, while on mobile devices that are not equipped with Nvidia GPUs it may be more efficient to utilise other extraction algorithms for the purpose of improving performance.
The multimedia content for which the difference, or the Euclidean distance in the specific example above, is lower than a predefined threshold value can be considered as comprising a face identified based on the identification data extracted from the recognition image and hence belonging to the results of the step of searching S1600. Although the description above has been made, for clarity of disclosure, in relation to a single multimedia content, it will be evident that the invention can compare the identification data of the recognition image with a plurality of identification data corresponding to a plurality of multimedia content.
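By way of a hedged illustration, the comparison against a plurality of multimedia content could be sketched as follows. The function `search_contents`, the dictionary layout (one list of face vectors per content) and the threshold value 0.5 are assumptions made for the example; the description does not prescribe a specific threshold.

```python
import math


def search_contents(probe, content_faces, threshold):
    """Return the ids of the content in which at least one face vector lies
    closer to the probe vector than the predefined threshold."""
    results = []
    for content_id, faces in content_faces.items():
        if any(math.dist(probe, face) < threshold for face in faces):
            results.append(content_id)
    return results


probe = [0.0, 0.0]  # identification data of the recognition image
content_faces = {
    "a": [[0.1, 0.0]],              # one face, close to the probe
    "b": [[1.0, 1.0]],              # one face, far from the probe
    "c": [[2.0, 2.0], [0.0, 0.3]],  # two faces, the second is close
}
matches = search_contents(probe, content_faces, threshold=0.5)
```

Real identification vectors would typically have many more components, for example 128; two components are used here only to keep the example readable.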
The method 1000 further comprises a step of enabling S1700 access to the multimedia content resulting from the step of searching S1600. The step of enabling S1700 can preferably be implemented by the service provider device 4400.
In particular, the multimedia content resulting from the step of searching S1600 is multimedia content in which the face of the user, identified by the recognition image, is present.
The step of enabling can therefore be implemented so as to allow the user, preferably through the user device 4300, to access the multimedia content resulting from the step of searching S1600. The specific implementation, as will be evident to the person skilled in the art, can generally be obtained through an association of the multimedia content resulting from the step of searching S1600 with permissions associated with the user associated with the identification data of the recognition image, for example allowing the user in question to create an account with the service provider device 4400. Additionally, or alternatively, it will be possible to perform an association of the multimedia content resulting from the step of searching S1600 with permissions associated with the user device 4300.
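One possible way of recording such permissions, given purely as a sketch, is a registry that maps a principal (a user account or a user device) to the set of content it may access. The class `AccessRegistry` and the identifiers used below are hypothetical.

```python
from collections import defaultdict


class AccessRegistry:
    """Maps a principal (user account or user device id) to permitted content ids."""

    def __init__(self):
        self._grants = defaultdict(set)

    def enable_access(self, principal, content_ids):
        # Corresponds to step S1700: associate the search results with the principal.
        self._grants[principal].update(content_ids)

    def can_access(self, principal, content_id):
        return content_id in self._grants.get(principal, set())


registry = AccessRegistry()
registry.enable_access("user-1", ["m1", "m2"])
```

The same registry can hold device identifiers instead of, or in addition to, user accounts, in line with the two alternatives described above.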
Moreover, in some embodiments, as schematically illustrated by the method 6000 in Fig. 6, the method 1000 can further comprise a step of associating S6800 multimedia content 3200 resulting from the step of searching S1600 with a second code 3110.
In some embodiments, the step of associating S6800 can preferably be implemented by the service provider device 4400.
It will be clear that execution of the step S6800, illustrated in Fig. 6 as subsequent to the step S1600, can be implemented before, after or simultaneously to the step S1700.
In general, the step of associating S6800 allows the multimedia content 3200 resulting from the step of searching S1600, and hence deriving from the recognition image, to be associated with a second code 3110 associated with the user, i.e., with the person illustrated in the recognition image. In this way it will be possible to identify this multimedia content in subsequent searches, so as to prevent the multimedia content from being shown to other users. This function will be described in detail below.
In general, the second code 3110 can have features similar to those already described for the first code.
Association can take place in a known way, for example creating a tag associated with the multimedia content and containing the second code 3110. Alternatively, or additionally, it will be possible to register an association between the second code 3110 and the one or more multimedia content in a database.
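The database variant of the association can be sketched, under stated assumptions, with an in-memory SQLite table. The table name `content_code`, its columns and the helper functions are hypothetical and serve only to illustrate the idea of registering pairs of content and second code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content_code ("
    " content_id TEXT NOT NULL,"
    " second_code TEXT NOT NULL,"
    " PRIMARY KEY (content_id, second_code))"
)


def associate(conn, second_code, content_ids):
    """Step S6800: register the second code for each content found by the search."""
    conn.executemany(
        "INSERT OR IGNORE INTO content_code (content_id, second_code) VALUES (?, ?)",
        [(content_id, second_code) for content_id in content_ids],
    )
    conn.commit()


def contents_for_code(conn, second_code):
    """Return all content ids registered for the given second code."""
    rows = conn.execute(
        "SELECT content_id FROM content_code WHERE second_code = ? ORDER BY content_id",
        (second_code,),
    )
    return [row[0] for row in rows]


associate(conn, "CODE-1", ["m1", "m2", "m3"])
```

The composite primary key allows one content to carry several second codes, which is relevant for multimedia content portraying several users.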
In some embodiments, the second code 3110 can correspond to the first code 3100.
This embodiment is particularly advantageous, for example, in the case in which the first code 3100 is unique for each user. In fact, in this way, as the second code 3110 is unique for the user, its association with the multimedia content in which the user has been identified will allow identification of the multimedia content in future searches unequivocally as belonging to the user in question. This will also advantageously allow rapid identification of all the multimedia content associated with the user, with input by the latter of the first code 3100, which is known to him/her.
This embodiment is schematically illustrated in Fig. 3B. In particular, in Fig. 3B, it is assumed, for example, that the code #1 is the one acquired by the user at step S1300 and that it is a unique code associated with the user that executed the method 1000. Moreover, it is assumed that the multimedia content #1-3 comprises the face of the user, recognised by the recognition image acquired by the user at step S1400, which consequently led to identification of the multimedia content #1-3 at step S1600. The result of the step of associating S6800, schematically illustrated by the dashed lines, thus allows association of the code #1, linked to the user, with the multimedia content #1-3, comprising the face of the user.
In some embodiments, the second code 3110 can be a unique code, different from the first code 3100.
This embodiment is particularly advantageous, for example, in the case in which the first code 3100 is unique for each event but common to a plurality of users. In order to allow identification of the multimedia content associated with a single user, with the advantages indicated above, it is thus preferable to create a second code 3110, unique for the user. In this case, the step of associating S6800 can be preceded by a step of creating S7810 the second code 3110, as schematically illustrated in Fig. 7.
In some embodiments, the step of creating S7810 can preferably be implemented by the service provider device 4400.
It will be clear that execution of the step S7810, illustrated in Fig. 7 as preceding step S6800 can be implemented not necessarily immediately before the step S6800 but, more in general, between the step S1400, which gives rise to a recognition procedure associated with a user, and the step S6800, where the use of the second code 3110 is necessary.
This embodiment is schematically illustrated in Fig. 3C. In particular, in Fig. 3C, it is assumed, for example, that the code #1 is the code acquired by the user at step S1300 and that it is a unique code associated with an event, but common to a plurality of users participating in that event. For example, let us consider two participants, resulting in the generation of two second codes 3110, identified by the respective indices #1-1 and #1-2. In the case illustrated, the second code #1-1 is associated with the user who is executing the method, and, as in the case of Fig. 3B, this is associated with the multimedia content #1-3, as indicated by the dashed lines.
In some embodiments, it will be possible to limit the number of second codes 3110 generated at step S7810 for a given first code 3100 to a predefined threshold value, preferably specified at the step of generating the first code 3100. In this way, it will advantageously be possible to introduce a further check against unidentified accesses to the multimedia content relating to the event associated with the first code 3100. For example, if the service provider is aware of the fact that X persons participated in the event associated with the first code 3100, it will be possible to set said predefined threshold value to X, so as to ensure that, once all the participants have accessed their multimedia content, it is not possible for third parties to then access the multimedia content, as the generation of second codes 3110 is no longer possible. Additionally, in some embodiments the second code 3110 can preferably be derived from the first code 3100.
With the term derived, in some embodiments, it will be meant that at least part of the first code 3100 is utilised for the generation of a plurality of second codes 3110. For example, with reference to Fig. 3C, at least part of the code #1 can be utilised for the generation of the codes #1-1 and #1-2. In some embodiments, the generation of the second codes will be such that, starting from the second code 3110 it is possible to trace the part of the first code 3100 from which the second code 3110 was generated.
It will be clear that this method of generation of the second code can be implemented in various ways. Merely by way of example, the first code 3100 could be an alphanumerical code of X characters and the second code 3110 could be an alphanumerical code of X+Y characters, where the X characters of the first and of the second code correspond, and where the Y characters of the second code allow the creation of a plurality of second codes, unique for each user.
The generation of a second code 3110 in a manner derived from the first code 3100 advantageously allows the service provider device 4400 to identify not only the multimedia content associated with a single user, as described previously, but also that associated with a single event, in a simple and effective way.
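Merely as a sketch of the example given above, a second code of X+Y characters can be built by appending a random suffix of Y characters to a first code of X characters, while limiting the number of second codes issued for that first code. The class `SecondCodeIssuer`, the alphabet and the suffix length of 4 are assumptions of this illustration.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


class SecondCodeIssuer:
    """Issues second codes of X+Y characters derived from a first code of X characters."""

    def __init__(self, first_code, max_codes, suffix_len=4):
        self.first_code = first_code
        self.max_codes = max_codes  # predefined threshold, e.g. number of participants
        self.suffix_len = suffix_len
        self.issued = set()

    def create(self):
        # Step S7810: refuse to create more codes than the predefined threshold.
        if len(self.issued) >= self.max_codes:
            raise RuntimeError("limit of second codes reached for this first code")
        while True:
            suffix = "".join(secrets.choice(ALPHABET) for _ in range(self.suffix_len))
            code = self.first_code + suffix
            if code not in self.issued:  # keep each second code unique
                self.issued.add(code)
                return code


def first_code_of(second_code, first_len):
    """Trace the first code back from a derived second code."""
    return second_code[:first_len]


issuer = SecondCodeIssuer("EVT123", max_codes=2)
code_a = issuer.create()
code_b = issuer.create()
```

Because the first X characters are shared, the event can be recovered from any second code with `first_code_of`, while the suffix distinguishes the individual users.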
Moreover, in some embodiments, as illustrated schematically by the method 8000 in Fig. 8, the method 1000 can further comprise a step of removing S8900 at least part of the multimedia content 3200 resulting from the step of searching S1600.
In some embodiments, the step of removing S8900 can preferably be implemented by the service provider device 4400.
It will be clear that execution of the step S8900, illustrated in Fig. 8 as subsequent to the step S1600 can be implemented before, after or simultaneously to any one of the steps subsequent to the step S1600.
In general, the aim of the step of removing S8900 is to prevent a multimedia content 3200, resulting from the step of searching S1600, from also being the result of another search, with the exception of multimedia content with several faces, so as to prevent access to the multimedia content from being authorised also for users who are not portrayed.
The step of removing S8900 can be implemented, in some embodiments, by producing a copy of the multimedia content 3200 resulting from the step of searching S1600, and eliminating/modifying the pixels concerned by the area in which the face corresponding to the identification data of the recognition image was recognised. In subsequent executions of the method 1000, it will then be possible to execute the step of searching S1600 on the multimedia content thus modified, thereby making it impossible for the same face to be recognised more than once and hence preventing access to the content of a user from being inadvertently granted to other users.
Additionally, or alternatively, the step of removing S8900 can be implemented, in some embodiments, by identifying the area in which the face corresponding to the identification data of the recognition image was recognised and saving information relating to said area in association with the multimedia content. In subsequent executions of the method 1000, it will then be possible to execute the step of searching S1600 on the multimedia content with the exception of said area, thus making it impossible for the same face to be recognised more than once with the same advantages already described. The term "removal" shall therefore in this case not imply removal of data from the multimedia content but only identification of areas to be removed in subsequent searches. This approach has the advantage of reducing the amount of data that must be saved, as no copy of the multimedia content is required.
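The second variant, in which only the area is saved, could be sketched as follows. The rectangle type `Box`, the overlap test and the function `searchable_faces` are hypothetical; the sketch assumes that face areas are axis-aligned rectangles in pixel coordinates and that any detected face overlapping a saved area is skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a, b):
    """True if the two rectangles share at least one pixel."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def searchable_faces(detected, excluded):
    """Keep only the detected face areas that do not fall in an excluded area."""
    return [box for box in detected if not any(overlaps(box, ex) for ex in excluded)]


# Areas saved by a previous execution of step S8900 for this content.
excluded = [Box(1, 1, 9, 9)]
# Face areas detected in the same content during a later search.
detected = [Box(0, 0, 10, 10), Box(20, 0, 30, 10)]
remaining = searchable_faces(detected, excluded)
```

In this example the first face has already been attributed to a user and is skipped, while the second face remains available, so that a content with several faces can still be found by the other users portrayed.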
As will be evident from the previous description, in the case of an image comprising more than one face, the step of removing S8900 in any case allows the different faces to be identified and the different users portrayed therein to access the multimedia content.
Fig. 9 represents a further embodiment of the invention. In this case, the method 9000 differs from the method 1000 due to the additional presence of the steps S9410 and S9610. In particular, in the method 1000, the search for the multimedia content could be executed several times based on the extracted identification data of the recognition image. This can, in some cases, lead to failure to recognise multimedia content in the case in which the face of the user has changed between the time of the event and the time of execution of the method, for example due to the presence/absence of glasses, different haircut, beard, etc. To solve this problem, the method 9000 further comprises a second step of extracting S9410 identification data from at least one multimedia content resulting from the step of searching S1600. Due to the differences in the face between the time of the event and the time of execution of the method, these identification data can be more representative of the face of the user at the time of the event than those extracted from the recognition image. It is therefore advantageous to continue and/or repeat the search for the multimedia content utilising the identification data extracted at step S9410 instead of those extracted at step S1500.
This is schematically represented by a second step of searching S9610 for multimedia content associated with the identification data resulting from the second step of extracting S9410. The second step of searching S9610 can generally be implemented just as the step of searching S1600, S5600 already described, with the difference that the search is based on the identification data extracted at step S9410 instead of those extracted at step S1500.
In some embodiments, the steps S9410 and S9610 can be implemented by the service provider device 4400.
By way of example, it is possible to imagine a configuration in which the face of the user is present in two multimedia contents M1 and M2, such that:
- the difference between the identification data of the recognition image and the identification data extracted from M1 is above the threshold level for the search;
- the difference between the identification data of the recognition image and the identification data extracted from M2 is below the threshold level for the search;
- the difference between the identification data extracted from M1 and the identification data extracted from M2 is below the threshold level for the search.
This embodiment will allow the identification of M2 during the step of searching S1600. The identification of M1, which would not be possible with the method 1000, will instead be possible during the step of searching S9610.
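The configuration with M1 and M2 can be reproduced in a small sketch. The vectors and the threshold 0.6 are invented for the example, each content is assumed to hold a single face vector, and the function names `search` and `two_stage_search` are hypothetical; the second search is run once for each result of the first search.

```python
import math


def search(probe, vectors, threshold):
    """Ids of the content whose vector lies closer to the probe than the threshold."""
    return {cid for cid, vec in vectors.items() if math.dist(probe, vec) < threshold}


def two_stage_search(probe, vectors, threshold):
    first = search(probe, vectors, threshold)  # step of searching S1600
    found = set(first)
    for cid in first:
        # S9410: reuse the identification data of a found content as new probe;
        # S9610: search again with these data.
        found |= search(vectors[cid], vectors, threshold)
    return first, found


probe = [0.0, 0.0]                           # recognition image
vectors = {"M1": [1.0, 0.0], "M2": [0.5, 0.0]}
first, found = two_stage_search(probe, vectors, threshold=0.6)
```

Here the distance from the recognition image to M1 is 1.0 (above the threshold), the distance to M2 is 0.5 (below it) and the distance between M1 and M2 is 0.5, so M2 is found by the first search and M1 only by the second.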
Due to the embodiments described above it is therefore possible to allow participants in an event to access the content that portrays them in a manner that is simple for the participants, i.e., the users of the service, and in a manner that is safe, which does not allow infringements of privacy, and which can be managed technically and commercially by the service provider for access to the multimedia content.
In the description above, different embodiments have been described, each comprising one or more technical features. It will be clear that the invention is not limited to the embodiments described in the form described. In particular, it will be possible to obtain embodiments that differ from the embodiments described due to the absence of one or more of the features described. It will also be possible to combine one or more features of one or more embodiments, without needing to incorporate all the remaining features, resulting in further embodiments.
REFERENCE NUMBERS
1000: method for accessing multimedia content
S1100: generating a code
S1200: acquiring multimedia content
S1300: acquiring a code
S1400: acquiring a recognition image
S1500: extracting identification data
S1600: searching for multimedia content
S1700: enabling access to multimedia content
2000: method for accessing multimedia content
S2510: transmitting identification data
3100: first code
3110: second code
3200: multimedia content
4000: system for accessing multimedia content
4300: user device
4400: service provider device
4500: photographer device
S5600: searching for multimedia content
S5610: extracting identification data
S5620: computing of distance
6000: method for accessing multimedia content
S6800: associating content-code
7000: method for accessing multimedia content
S7810: creating a second code
8000: method for accessing multimedia content
S8900: removing part of the multimedia content
9000: method for accessing multimedia content
S9410: second extraction of identification data
S9610: second search for multimedia content

Claims

1. Method (1000, 2000, 6000, 7000, 8000, 9000) for accessing multimedia content comprising the steps of: generating (S1100) a first code (3100), acquiring (S1200) multimedia content (3200), acquiring (S1300) the first code (3100), acquiring (S1400) a recognition image, extracting (S1500) identification data of the recognition image, searching (S1600, S5600) for multimedia content associated with the identification data, enabling (S1700) access to the multimedia content resulting from the step of searching (S1600).
2. Method (2000) according to claim 1, wherein the steps of acquiring (S1400) and extracting (S1500) are performed by a user device (4300), the method further comprising, after the step of extracting (S1500), a step of transmitting (S2510) the identification data from the user device (4300) to a service provider device (4400).
3. Method (1000, 2000, 6000, 7000, 8000, 9000) according to claim 1 or 2, wherein the step of searching (S5600) comprises a step of extracting (S5610) identification data of at least one of the multimedia content (3200), a step of computing (S5620) a difference between the identification data of the recognition image and the identification data of the at least one of the multimedia content (3200).
4. Method (6000, 7000) according to claim 1, further comprising a step of associating (S6800) the multimedia content (3200) resulting from the step of searching (S1600) with a second code (3110).
5. Method (6000) according to claim 4, wherein the second code (3110) corresponds to the first code (3100).
6. Method (7000) according to claim 4, wherein the second code (3110) is a unique code, different from the first code (3100), preferably derived from the first code (3100).
7. Method (8000) according to any previous claim further comprising a step of removing (S8900) at least part of the multimedia content (3200) resulting from the step of searching (S1600).
8. Method (9000) according to any previous claim further comprising a second step of extracting (S9410) identification data from at least one multimedia content resulting from the step of searching (S1600, S5600), and a second step of searching (S9610) for multimedia content associated with the identification data resulting from the second step of extracting (S9410).
9. System (4000) for accessing multimedia content comprising at least: a user device (4300), a photographer device (4500), at least one service provider device (4400), connectable to the user device (4300) and to the photographer device (4500), wherein the system (4000) is configured so as to execute the method (1000, 2000, 6000, 7000, 8000, 9000) according to any of the previous claims.
10. Service provider device (4400) configured so as to execute the steps of: generating (S1100) a first code (3100), acquiring (S1200) multimedia content (3200), acquiring (S1300) the first code (3100), acquiring (S1400-S1500, S2510) identification data of a recognition image, searching (S1600, S5600) for multimedia content associated with the identification data, enabling (S1700) access to the multimedia content resulting from the step of searching (S1600).
PCT/IB2021/055029 2020-06-08 2021-06-08 Method for accessing multimedia content WO2021250564A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21737494.1A EP4162376A1 (en) 2020-06-08 2021-06-08 Method for accessing multimedia content

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT102020000013630A IT202000013630A1 (en) 2020-06-08 2020-06-08 METHOD OF ACCESSING MULTIMEDIA CONTENT
IT102020000013630 2020-06-08

Publications (1)

Publication Number Publication Date
WO2021250564A1 true WO2021250564A1 (en) 2021-12-16

Family

ID=72356263

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2021/055029 WO2021250564A1 (en) 2020-06-08 2021-06-08 Method for accessing multimedia content

Country Status (3)

Country Link
EP (1) EP4162376A1 (en)
IT (1) IT202000013630A1 (en)
WO (1) WO2021250564A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114722970A (en) * 2022-05-12 2022-07-08 北京瑞莱智慧科技有限公司 Multimedia detection method, device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115809432B (en) * 2022-11-21 2024-02-13 中南大学 Crowd social relation extraction method, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040156535A1 (en) * 1996-09-04 2004-08-12 Goldberg David A. Obtaining person-specific images in a public venue
US20100158315A1 (en) * 2008-12-24 2010-06-24 Strands, Inc. Sporting event image capture, processing and publication
US20160191434A1 (en) * 2014-12-24 2016-06-30 Blue Yonder Labs Llc System and method for improved capture, storage, search, selection and delivery of images across a communications network
US20160261669A1 (en) * 2011-07-07 2016-09-08 Sony Interactive Entertainment America Llc Generating a Website to Share Aggregated Content
WO2018083560A1 (en) * 2016-11-04 2018-05-11 Origami Lab Srl Method to access a multimedia content

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040156535A1 (en) * 1996-09-04 2004-08-12 Goldberg David A. Obtaining person-specific images in a public venue
US20100158315A1 (en) * 2008-12-24 2010-06-24 Strands, Inc. Sporting event image capture, processing and publication
US20160261669A1 (en) * 2011-07-07 2016-09-08 Sony Interactive Entertainment America Llc Generating a Website to Share Aggregated Content
US20160191434A1 (en) * 2014-12-24 2016-06-30 Blue Yonder Labs Llc System and method for improved capture, storage, search, selection and delivery of images across a communications network
WO2018083560A1 (en) * 2016-11-04 2018-05-11 Origami Lab Srl Method to access a multimedia content

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114722970A (en) * 2022-05-12 2022-07-08 北京瑞莱智慧科技有限公司 Multimedia detection method, device and storage medium
CN114722970B (en) * 2022-05-12 2022-08-26 北京瑞莱智慧科技有限公司 Multimedia detection method, device and storage medium

Also Published As

Publication number Publication date
EP4162376A1 (en) 2023-04-12
IT202000013630A1 (en) 2021-12-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21737494

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021737494

Country of ref document: EP

Effective date: 20230109