CN112733117A - Authentication system and method - Google Patents


Info

Publication number
CN112733117A
CN112733117A (application CN202110099979.2A)
Authority
CN
China
Prior art keywords
background image
background
identification document
image
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110099979.2A
Other languages
Chinese (zh)
Inventor
李若愚
Current Assignee
Alipay Labs Singapore Pte Ltd
Original Assignee
Alipay Labs Singapore Pte Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Labs Singapore Pte Ltd
Publication of CN112733117A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G06F 21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/45 Structures or tools for the administration of authentication

Abstract

The application provides an authentication system and method. The authentication method comprises the following steps: extracting, using a background image extraction device, a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process; extracting, using the background image extraction device, a second background image from each of a plurality of identification document photos submitted by a user for the particular digital due diligence process; generating, using a processing device, a background vector for each of the extracted first background image and the extracted second background image using a trained image similarity model, wherein the trained image similarity model is trained using historical background images; detecting, using the processing device, the presence of a similar background from the generated background vector; and triggering an authentication alarm signal using the processing device if the presence of a similar background is detected.

Description

Authentication system and method
Technical Field
The present invention relates broadly, but not exclusively, to authentication systems and methods.
Background
Electronic Know Your Customer (eKYC) is a digital due diligence process performed by commercial entities or service providers to verify the identity of their customers and to assess potential risks of illegal intent in the business relationship (e.g., money laundering). Many eKYC processes involve potential customers submitting photographs of themselves (especially their faces) and photographs of their official identification (ID) documents (e.g., ID cards, passports, etc.).
Fraudsters are a serious problem for the eKYC process. There are currently a number of ways to detect them. However, these known methods focus on finding identical features in the submitted photographs of the potential customer and/or their official ID document, such as the same facial image, the same ID document, or the same name on the ID document.
However, another type of attack, known as the "human flesh fraudster attack," has been on the rise recently and is not detected by these known methods. In such an attack, fraudsters recruit and organize people with rewards such as goods; most of the recruits have little education and do not know or care about personal privacy. The fraudsters then have these unwitting people use their genuine official identification documents and their own faces to complete the eKYC process.
To attract new genuine customers, a business entity or service provider may offer benefits for completing the eKYC process. In these attacks, however, any such reward is taken by the fraudster, and the account created after the eKYC process is completed becomes a zombie account that contributes no revenue to the business entity or service provider.
Disclosure of Invention
Many eKYC processes involve potential customers submitting photographs of themselves (especially their faces) and photographs of their official identification (ID) documents (e.g., ID cards, passports, etc.). One type of attack, known as a "human flesh fraudster attack," has victims use their genuine official ID documents and faces to complete the eKYC process with little or no benefit to themselves.
According to one embodiment, in order to detect "human flesh fraudster attacks," an authentication method is proposed that detects groups of submissions with similar backgrounds, using similarities in the backgrounds of photographs of users (especially their faces) and/or photographs of their official identification (ID) documents. In most "human flesh fraudster attack" cases, victims are organized and gathered in the same place to complete the fraudulent eKYC process, so the backgrounds in their facial photographs and/or the photographs of their official ID documents are likely to be similar.
According to another embodiment, there is provided an authentication system including a background image extraction device configured to: extracting a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process; and extracting a second background image from each of a plurality of identification document photos submitted by a user for the particular digital due diligence process. The system also includes a processing device configured to: generating a background vector for each of the extracted first background image and the extracted second background image using a trained image similarity model, wherein the trained image similarity model is trained using historical background images; detecting the presence of a similar background from the generated background vector; and triggering an authentication alarm signal if the presence of a similar background is detected.
According to another embodiment, there is provided a computer-implemented authentication method, the method comprising the steps of: (a) extracting, using a background image extraction device, a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process; (b) extracting, using the background image extraction device, a second background image from each of a plurality of identification document photos submitted by a user for the particular digital due diligence process; (c) generating, using a processing device, a background vector for each of the extracted first background image and the extracted second background image using a trained image similarity model, wherein the trained image similarity model is trained using historical background images; (d) detecting, using the processing device, the presence of a similar background from the generated background vector; and (e) triggering an authentication alarm signal using the processing device if the presence of a similar background is detected.
Drawings
The embodiments are provided by way of example only and may be better understood and readily appreciated by those of ordinary skill in the art by reading the following written description in conjunction with the accompanying drawings, in which:
fig. 1A is a flowchart illustrating an authentication method according to an embodiment.
FIG. 1B shows an example of a "self-portrait" image previously submitted by a user.
Fig. 1C shows an example of a "self-portrait" image in which a detected face is removed so that only a background image remains, according to an embodiment.
FIG. 1D is an example photograph of an identification document.
FIG. 1E is an example photograph of an identification document with the actual identification document removed, according to an embodiment.
Fig. 2 is a schematic diagram of an authentication system according to an embodiment.
Fig. 3 is a flow diagram illustrating a computer-implemented authentication method according to an embodiment.
Fig. 4 shows a schematic diagram of a computer system suitable for performing at least some of the steps of the authentication method.
Detailed Description
Embodiments will now be described, by way of example only, with reference to the accompanying drawings. Like reference numbers and designations in the drawings indicate like elements or equivalents.
Some portions of the description that follows are presented explicitly or implicitly in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
Unless specifically stated otherwise, and as will be apparent from the following, it is appreciated that throughout the present specification, discussions utilizing terms such as "receiving," "scanning," "computing," "determining," "replacing," "generating," "initializing," "outputting," or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
The present specification also discloses an apparatus for performing the operations of the method. Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of more specialized apparatus to perform the required method steps may be appropriate. The structure of a computer adapted to perform the various methods/processes described herein will appear from the description below.
Further, the present specification implicitly discloses a computer program, as it is apparent to a person skilled in the art that each step of the method described herein may be implemented by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and code therefor may be used to implement the teachings described herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variations of computer programs that may use different control flows without departing from the spirit or scope of the present invention.
Furthermore, one or more steps of a computer program may be executed in parallel rather than sequentially. Such a computer program may be stored on any computer readable medium. The computer readable medium may include, for example, a magnetic or optical disk, a memory chip, or other storage device suitable for interfacing with a computer. The computer readable media may also include hardwired media such as those exemplified by the Internet, or wireless media such as those exemplified by the GSM mobile telephone system. When the computer program is loaded and executed on such a computer, it effectively creates means for implementing the steps of the preferred method.
Electronic Know Your Customer (eKYC) is a digital due diligence process performed by commercial entities or service providers to verify the identity of their customers and to assess potential risks of illegal intent in the business relationship (e.g., money laundering). Many eKYC processes involve potential customers submitting photographs of themselves (especially their faces) and photographs of their official identification (ID) documents (ID cards, passports, etc.).
One type of attack, known as a "human flesh fraudster attack," has victims use their genuine official ID documents and faces to complete the eKYC process with little or no benefit to themselves.
Authentication can be viewed as a form of fraud detection in which the user's legitimacy is verified and potential fraudsters are detected before fraud is committed. Effective authentication enhances the data security of the system, protecting digital data from unauthorized users. Authentication may be accomplished remotely using a remote authentication server, as embodied by the authentication system 200 described in detail below. Remote authentication allows fraud detection to be performed centrally (typically requiring fewer resources) and remotely from the user terminal over an insecure communication channel.
According to one embodiment, to detect "human flesh fraudster attacks," an authentication method involves detecting groups with similar backgrounds using similarities in the backgrounds of photographs of users (particularly their faces) and/or photographs of their official identification (ID) documents. In most "human flesh fraudster attack" cases, victims are organized and gathered in the same place to complete the fraudulent eKYC process, so the backgrounds in their facial photographs and/or the photographs of their official ID documents are likely to be similar. Similar background data can be generated automatically from historical data without manual input or manipulation. The face image and the ID document image are removed by applying face detection and ID alignment models, respectively, so that only the background is considered.
This is in sharp contrast to some known methods of detecting "human flesh fraudster attacks," which either (i) focus on the similarity of the entire user photograph (i.e., face and background) and the entire ID document (i.e., the facial image and other text, such as the ID number, name, and date of birth, printed on the ID document), or (ii) use geographic information to detect duplicates indicating the presence of fraud. Regarding (i), the similarity of the entire user photograph and the entire ID document is easily affected by the person and the image on the ID document, which undermines the reliability of fraud detection. Regarding (ii), false geographic information is relatively easy to provide, which also undermines the reliability of fraud detection.
The techniques described herein produce one or more technical effects. In particular, by focusing on the similarity of the backgrounds in (i) photographs of users' faces submitted for a particular digital due diligence process and/or (ii) photographs of identification documents submitted for the same process, the influence of the face and the actual identification document can be ignored. This makes fraud detection and authentication more reliable and robust. Furthermore, background similarity across (i) facial photographs and (ii) identification document photographs is more reliable than geographic information, especially for "human flesh fraudster attacks." The similarity of backgrounds can help detect victims being exploited by fraudsters, particularly when those victims use their authentic identification documents and actual photographs of themselves.
Fig. 1A is a flowchart 100 illustrating an authentication method according to an embodiment.
The method involves a phase of constructing a similar background data set for (i) photographs of faces submitted by a user for a particular digital due diligence process and (ii) photographs of identification documents submitted by a user for the same particular digital due diligence process. The stage for constructing a similar background data set comprises steps 102, 104, 106 and 108. Steps 106 and 108 may be performed before steps 102 and 104.
At step 102, a plurality of facial photographs (e.g., "self-portrait" images) previously submitted by a user for a particular digital due diligence process are collected. At step 104, a face is detected from each of a plurality of face photographs previously submitted by the user using a face detection model. Thereafter, the detected face is removed from each photograph, so that the background image remains.
FIG. 1B shows an example of a "self-portrait" image previously submitted by a user. A face detection model is used to detect a face and then remove the detected face from each photograph so that only the background image remains, as shown in fig. 1C.
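The background-isolation step of steps 102 and 104 can be sketched as follows. This is a minimal illustration under assumptions the patent does not specify: the face detector is assumed to return an axis-aligned bounding box `(top, left, bottom, right)`, and the masked region is filled with zeros.

```python
def remove_face(photo, face_box, fill=0):
    """Blank the detected face region of `photo` (a list of pixel rows)
    so that only the background image remains (cf. FIG. 1C)."""
    top, left, bottom, right = face_box
    return [
        [fill if top <= y < bottom and left <= x < right else px
         for x, px in enumerate(row)]
        for y, row in enumerate(photo)
    ]

# Toy 6x6 grayscale "photo": background pixels are 10, the "face" is 200.
photo = [[200 if 2 <= y < 5 and 2 <= x < 5 else 10 for x in range(6)]
         for y in range(6)]
background = remove_face(photo, (2, 2, 5, 5))
```

In practice the bounding box would come from a trained face detection model; only the masking logic is shown here.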
At step 106, a plurality of Identification (ID) document photographs previously submitted by a user for the same particular digital due diligence process are collected. Some ID documents are printed on the front and back sides. In this case, photographs of the front and back sides of the ID document are collected.
At step 108, the ID alignment model is used to detect the actual identification document in the digital image of the identification document. The ID alignment model provides a method of detecting and aligning identification documents in digital images. In most cases, the photograph of the identification document includes both the actual identification document and an image of the background (e.g., the table on which the identification document was placed when the photograph was taken, or the person holding the identification document, as shown in FIG. 1D). Once the actual identification document is detected in the photograph using the ID alignment model, its image is removed, leaving only the background of the identification document photograph, as shown in FIG. 1E.
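Unlike the face, the detected document is generally a perspective-distorted quadrilateral rather than an axis-aligned box. The removal step can therefore be sketched with a point-in-quadrilateral test; the four corner points are assumed to come from the ID alignment model, which the patent does not detail:

```python
def point_in_quad(x, y, quad):
    """Ray-casting test: is (x, y) inside the quadrilateral `quad`,
    given as four (x, y) corners in order?"""
    inside = False
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def remove_document(photo, quad, fill=0):
    """Return a copy of `photo` (a list of pixel rows) with pixels inside
    the detected document quadrilateral replaced by `fill`, so only the
    background remains (cf. FIG. 1E)."""
    return [
        [fill if point_in_quad(x, y, quad) else px
         for x, px in enumerate(row)]
        for y, row in enumerate(photo)
    ]

# Toy 6x6 photo of all 1s; the "document" occupies the central square.
photo = [[1] * 6 for _ in range(6)]
masked = remove_document(photo, [(1, 1), (4, 1), (4, 4), (1, 4)])
```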
After the stage of constructing a similar background data set ( steps 102, 104, 106, and 108) is completed, an image similarity model is trained separately at step 110 using the background images extracted from the face photographs and from the identification document photographs. In other words, the image similarity model is trained using background images extracted from each of the multiple face photographs; additionally, in a separate training process, the image similarity model is trained using background images extracted from each of the plurality of identification document photos.
The neural network may include a "trunk" portion and a "head" portion. The "trunk" portion generates the vector for a given image, while the "head" portion calculates the loss during training. Different model architectures may therefore be used to train the image similarity model: the trunk may be ResNet, IR_SE, etc., and the head may be softmax, triplet loss, ArcFace, etc.
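The patent leaves the trunk and head choices open. As one illustrative instance of a head, the triplet loss over trunk-produced embedding vectors can be computed as below; the toy vectors, the margin value, and the L2 normalization are assumptions for this sketch, not details from the patent:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Head-side loss for one training triplet: pull the anchor embedding
    toward the positive (a background from the same group) and push it
    away from the negative (a different background) by at least `margin`."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    d_pos = sum((x - y) ** 2 for x, y in zip(a, p))
    d_neg = sum((x - y) ** 2 for x, y in zip(a, n))
    return max(d_pos - d_neg + margin, 0.0)

# A well-separated triplet incurs no loss; a swapped one is penalized.
loss_good = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
loss_bad = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

At inference time only the trunk is kept: it maps each extracted background image to the background vector used in the clustering step.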
The method also involves an authentication phase comprising steps 112, 114 and 116.
At step 112, a new series of data associated with a particular digital due diligence process is received. Similar to step 104 for preparing training data, a face is detected from each of a plurality of face photographs newly submitted by the user using a face detection model. Thereafter, the detected face is removed from each photograph so that the background image remains. Simultaneously or sequentially, the ID alignment model is used to detect the actual identification document in the digital image of the user's newly submitted identification document, similar to step 108 for preparing the training data. The ID alignment model provides a method of detecting and aligning identification documents in digital images. Once the actual identification document is detected, the image of the actual identification document is removed, leaving the background of the identification document photo.
At step 114, a background vector is generated for each background extracted from the face photograph and the identification document photograph using the trained image similarity model (i.e., trained at step 110).
At step 116, a clustering algorithm (e.g., DBSCAN, K-means, or spectral clustering) is used to detect groups with similar backgrounds. The presence of a similar background group indicates that a "human flesh fraudster attack" has occurred.
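The patent names DBSCAN, K-means, and spectral clustering as options. A minimal, illustrative DBSCAN over background vectors can be sketched as below; the `eps` and `min_pts` values and the toy vectors are assumptions for the sketch:

```python
def dbscan(vectors, eps=0.5, min_pts=2):
    """Minimal DBSCAN: label each vector with a cluster id, or -1 for
    noise. A cluster of near-identical background vectors is the signal
    for a possible "human flesh fraudster attack"."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    labels = [None] * len(vectors)
    cluster = -1
    for i in range(len(vectors)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(vectors))
                     if dist(vectors[i], vectors[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # noise (may later become a border point)
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(len(vectors))
                           if dist(vectors[j], vectors[k]) <= eps]
            if len(j_neighbors) >= min_pts:  # j is a core point: expand
                queue.extend(k for k in j_neighbors if labels[k] is None)
    return labels

# Three near-identical background vectors and one outlier.
vectors = [(0.0, 0.0), (0.1, 0.0), (0.05, 0.05), (5.0, 5.0)]
labels = dbscan(vectors, eps=0.5, min_pts=2)
```

In a deployment one would typically use a library implementation (e.g., scikit-learn's `DBSCAN`) rather than this sketch; the point is that any cluster of sufficient size flags a group of submissions sharing a background.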
At step 118, (i) the background data newly extracted from the face photographs and the identification document photographs (from step 112) and (ii) the similarity results (obtained from step 116) are both added to the historical data to enrich the historical data set.
Fig. 2 is a schematic diagram of an authentication system 200 according to an embodiment. The system 200 includes a background image extraction device 202 and a processing device 204.
The background image extraction device 202 extracts a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process. For example, fig. 1C shows a photograph of a human face with only the background image retained. The corresponding background image data is extracted by the background image extraction means 202.
Simultaneously or sequentially, the background image extraction device 202 also extracts a second background image from each of a plurality of identification document photos submitted by the user for a particular digital due diligence process. For example, FIG. 1E shows a photograph of an identification document that retains only a background image. The corresponding background image data is extracted by the background image extraction means 202.
The processing device 204 generates a background vector for each of the extracted first background image and the extracted second background image using the trained image similarity model. The trained image similarity model is trained using historical background images.
The processing device 204 detects the presence of a similar background from the generated background vector. In an embodiment, the processing device 204 detects the presence of similar backgrounds from the generated background vector using a clustering algorithm. The clustering algorithm may be DBSCAN, K-means, or spectral clustering. If the presence of a similar background is detected, the processing device 204 triggers an authentication alarm signal.
The historical background images include a first historical background image and a second historical background image. The background image extraction device 202 may also extract a first historical background image from each of a plurality of facial photographs previously submitted by a user for a particular digital due diligence process. The background image extraction device 202 may also extract a second historical background image from each of a plurality of identification document photographs previously submitted by the user for a particular digital due diligence process. The first historical background image, the second historical background image, the extracted first background image, and/or the extracted second background image may be stored in the storage device 206.
The trained image similarity model may be trained using (i) the first historical background image and (ii) the second historical background image separately. The first historical background images and the second historical background images may have different distributions, so training the image similarity model separately on each can be expected to yield better-performing models.
The background image extraction device 202 may use a face detection model to detect a face in each of a plurality of facial photographs submitted by a user for a particular digital due diligence process. The background image extraction device 202 removes the detected face from each of the plurality of face photographs so that a background image is retained in each of them, as shown in FIG. 1C.
The background image extraction device 202 may use an identification document alignment model to detect an identification document in each of a plurality of identification document photos submitted by a user for a particular digital due diligence process. The background image extraction device 202 removes the detected identification document from each of the plurality of identification document photos so that a background image is retained in each of them, as shown in FIG. 1E.
The neural network may include a "trunk" portion and a "head" portion. Different model architectures may therefore be used to train the image similarity model: the trunk may be ResNet, IR_SE, etc., and the head may be softmax, triplet loss, ArcFace, etc.
The processing device may add the extracted first background image and the extracted second background image to the set of first historical background images and the set of second historical background images, respectively. As a result, the set of first historical background images and the set of second historical background images are enriched to increase the accuracy of fraud detection and authentication. Additionally or alternatively, the processing device may add the generated context vector to a set of historical context vectors. As a result, the set of historical background vectors is enriched to improve the accuracy of fraud detection and authentication.
Fig. 3 is a flow diagram illustrating a computer-implemented method 300 for authentication, according to an embodiment. Method 300 includes a step 302 of extracting a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process using a background image extraction device. The method 300 further includes a step 304 of extracting a second background image from each of a plurality of identification document photos submitted by a user for a particular digital due diligence process using a background image extraction device. The method 300 further includes a step 306 of generating, using the processing device, a background vector for each of the extracted first background image and the extracted second background image using the trained image similarity model. The trained image similarity model is trained using historical background images.
The method 300 further comprises a step 308 of detecting, using the processing device, the presence of a similar background from the generated background vector. The method 300 further includes the step 310 of triggering an authentication alarm signal using the processing device if the presence of a similar background is detected.
The method 300 may further include the steps of: (i) extracting, using the background image extraction device, a first historical background image from each of a plurality of facial photographs previously submitted by a user for the particular digital due diligence process; and (ii) extracting, using the background image extraction device, a second historical background image from each of a plurality of identification document photos previously submitted by the user for the particular digital due diligence process. The historical background images include the first historical background image and the second historical background image.
The method 300 may further include: a step of training an image similarity model using (i) the first historical background image and (ii) the second historical background image, respectively.
The method 300 may further include the steps of: (i) using a face detection model to detect a face in each of a plurality of face photos submitted by a user for a particular digital due diligence process; and (ii) removing the detected face from each of the plurality of face photographs such that a background image remains in each of the plurality of face photographs.
The method 300 may further include the steps of: (i) using an identification document alignment model to detect an identification document in each of a plurality of identification document photos submitted by a user for a particular digital due diligence process; and (ii) removing the detected identification document from each of the plurality of identification document photographs such that a background image remains in each of the plurality of identification document photographs.
Clustering algorithms (e.g., DBSCAN, K-means, or spectral clustering) may be used to detect the presence of similar backgrounds from the generated background vectors.
The image similarity model may be trained using one of the following artificial neural network (ANN) architectures: (a) a residual neural network (ResNet) or IR_SE for the trunk portion of the ANN; and (b) softmax, triplet loss, or ArcFace for the head portion of the ANN.
The method 300 may further include the steps of: (i) adding the extracted first background image and the extracted second background image to a set of first historical background images and a set of second historical background images respectively; and (ii) adding the generated background vector to the set of historical background vectors.
Fig. 4 shows a schematic diagram of a computer system suitable for performing at least some of the steps of the authentication method.
The following description of computer system/computing device 400 is provided by way of example only and is not intended to be limiting.
As shown in fig. 4, the exemplary computing device 400 includes a processor 404 for executing software routines. Although a single processor is shown for clarity, computing device 400 may also include a multi-processor system. The processor 404 is connected to a communication facility 406 for communicating with other components of the computing device 400. The communication facilities 406 may include, for example, a communication bus, cross-bar, or network.
Computing device 400 also includes a main memory 408, such as Random Access Memory (RAM), and a secondary memory 410. The secondary memory 410 may include, for example, a hard disk drive 412 and/or a removable storage drive 414, which may include a magnetic tape drive, an optical disk drive, etc. The removable storage drive 414 reads from and/or writes to a removable storage unit 418 in a well known manner. Removable storage unit 418 may comprise a magnetic tape, an optical disk, etc. which is read by and written to by removable storage drive 414. As will be appreciated by those skilled in the relevant art, the removable storage unit 418 includes a computer-readable storage medium having stored therein computer-executable program code instructions and/or data.
In alternative embodiments, secondary memory 410 may additionally or alternatively include other similar devices for allowing computer programs or other instructions to be loaded into computing device 400. Such devices may include, for example, a removable storage unit 422 and an interface 420. Examples of removable storage unit 422 and interface 420 include a removable memory chip (e.g., an EPROM, or PROM) and associated socket, and other removable storage units 422 and interfaces 420 that allow software and data to be transferred from removable storage unit 422 to computer system 400.
Computing device 400 also includes at least one communication interface 424. Communication interface 424 allows software and data to be transferred between computing device 400 and external devices via a communication path 426. In various embodiments, communication interface 424 allows data to be transferred between computing device 400 and a data communication network, such as a public or private data communication network. The communication interface 424 may be used to exchange data between different computing devices 400, which computing devices 400 form part of an interconnected computer network. Examples of communication interface 424 may include a modem, a network interface (e.g., an ethernet card), a communication port, an antenna with associated circuitry, and the like. The communication interface 424 may be wired or wireless. Software and data transferred via communications interface 424 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 424. These signals are provided to the communications interface via communications path 426.
Optionally, computing device 400 also includes a display interface 402 that performs operations for presenting images to an associated display 430 and an audio interface 432 that performs operations for playing audio content via associated speakers 434.
As used herein, the term "computer program product" may refer, in part, to removable storage unit 418, removable storage unit 422, a hard disk installed in hard disk drive 412, or a carrier wave that carries software to communication interface 424 through communication path 426 (wireless link or cable). Computer-readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to computing device 400 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tapes, CD-ROMs, DVDs, Blu-ray™ disks, hard drives, ROMs or integrated circuits, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card, etc., whether or not such devices are internal or external to the computing device 400. Examples of transitory or non-tangible computer-readable transmission media that may also participate in providing software, applications, instructions, and/or data to the computing device 400 include radio or infrared transmission channels and network connections to another computer or networked device, as well as the internet or intranet, including e-mail transmissions and information recorded on websites and the like.
Computer programs (also called computer program code) are stored in the main memory 408 and/or the secondary memory 410. Computer programs may also be received via communications interface 424. Such computer programs, when executed, enable computing device 400 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 404 to perform the features of the embodiments described above. Accordingly, such computer programs represent controllers of the computer system 400.
The software may be stored in a computer program product and loaded into computing device 400 using removable storage drive 414, hard drive 412, or interface 420. Alternatively, the computer program product may be downloaded to computer system 400 over communications path 426. The software, when executed by the processor 404, causes the computing device 400 to perform the functions of the embodiments described herein.
It should be understood that the embodiment of fig. 4 is presented by way of example only. Thus, in some embodiments, one or more features of computing device 400 may be omitted. Furthermore, in some embodiments, one or more features of computing device 400 may be combined together. Additionally, in some embodiments, one or more features of computing device 400 may be separated into one or more component parts.
The term "configured" is used herein in terms of systems, devices, and computer program components. For a system consisting of one or more computers configured to perform particular operations or actions, it is meant that the system has installed thereon software, firmware, hardware, or a combination thereof that in operation causes the system to perform the operations or actions. By one or more computer programs configured to perform certain operations or actions, it is meant that the one or more programs include instructions, which when executed by a data processing apparatus, cause the apparatus to perform the operations or actions. By dedicated logic circuitry configured to perform a particular operation or action is meant that the circuitry has electronic logic to perform the operation or action.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
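The background extraction described in the embodiments above (and recited in claims 4, 5, 13, and 14) amounts to detecting a face or an identification document in a photograph and removing it so that only the background remains. A minimal numpy sketch of that masking step follows; the bounding-box coordinates stand in for the output of a hypothetical detector and the fill value is an illustrative choice.

```python
import numpy as np

def remove_region(photo, box, fill=0):
    """Blank out a detected region (a face or an ID document) so that
    only the background image remains.
    box = (top, left, bottom, right) from a detector; fill is the
    placeholder value written over the removed region."""
    top, left, bottom, right = box
    background = photo.copy()          # keep the original photo intact
    background[top:bottom, left:right] = fill
    return background

photo = np.arange(36).reshape(6, 6)    # stand-in for a grayscale image
background = remove_region(photo, (1, 1, 4, 4))
```

In practice the detector would be the face detection model or the identification document alignment model the claims refer to, and the masked images would then be fed to the image similarity model.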

Claims (18)

1. An authentication system comprising:
a background image extraction device configured to:
extracting a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process; and
extracting a second background image from each of a plurality of identification document photos submitted by a user for the particular digital due diligence process; and
a processing device configured to:
generating a background vector for each of the extracted first background image and the extracted second background image using a trained image similarity model, wherein the trained image similarity model is trained using historical background images;
detecting the presence of a similar background from the generated background vector; and
triggering an authentication alarm signal if the presence of a similar background is detected.
2. The system of claim 1, wherein:
the background image extraction device is further configured to:
extracting a first historical background image from each of a plurality of facial photographs previously submitted by a user for the particular digital due diligence process; and
extracting a second historical background image from each of a plurality of identification document photos previously submitted by a user for the particular digital due diligence process,
wherein the history background image includes the first history background image and the second history background image.
3. The system of claim 2, wherein:
the trained image similarity model is trained separately using (i) the first historical background image and (ii) the second historical background image.
4. The system of claim 2, wherein the background image extraction device is further configured to:
using a face detection model, detecting a face in each of a plurality of face photos submitted by a user for the particular digital due diligence process; and
removing the detected face from each of the plurality of face photographs such that a background image remains in each of the plurality of face photographs.
5. The system of claim 2, wherein the background image extraction device is further configured to:
using an identification document alignment model to detect an identification document in each of a plurality of identification document photos submitted by a user for the particular digital due diligence process; and
removing the detected identification document from each of the plurality of identification document photographs such that a background image remains in each of the plurality of identification document photographs.
6. The system of claim 1, wherein the processing device is further configured to detect the presence of a similar background from the generated background vector using a clustering algorithm.
7. The system of claim 6, wherein the clustering algorithm comprises: DBSCAN, K-means, or spectral clustering.
8. The system of claim 1, wherein the trained image similarity model is trained using one of the following Artificial Neural Network (ANN) architectures:
a residual neural network (ResNet) or IR_SE for a backbone portion of the artificial neural network; and
softmax, triplet loss, or ArcFace for a head portion of the artificial neural network.
9. The system of claim 2, wherein the processing device is further configured to:
adding the extracted first background image and the extracted second background image to the set of first historical background images and the set of second historical background images, respectively; and
adding the generated background vector to a set of historical background vectors.
10. A computer-implemented authentication method, comprising:
extracting, using a background image extraction device, a first background image from each of a plurality of facial photographs submitted by a user for a particular digital due diligence process;
extracting, using the background image extraction device, a second background image from each of a plurality of identification document photos submitted by a user for the particular digital due diligence process;
generating, using a processing device, a background vector for each of the extracted first background image and the extracted second background image using a trained image similarity model, wherein the trained image similarity model is trained using historical background images;
detecting, using the processing device, the presence of a similar background from the generated background vector; and
triggering an authentication alarm signal using the processing device if the presence of a similar background is detected.
11. The method of claim 10, further comprising:
extracting, using the background image extraction device, a first historical background image from each of a plurality of facial photographs previously submitted by a user for the particular digital due diligence process; and
extracting, using the background image extraction device, a second historical background image from each of a plurality of identification document photos previously submitted by a user for the particular digital due diligence process,
wherein the history background image includes the first history background image and the second history background image.
12. The method of claim 11, further comprising:
training the image similarity model separately using (i) the first historical background image and (ii) the second historical background image.
13. The method of claim 11, further comprising:
using a face detection model, detecting a face in each of a plurality of face photos submitted by a user for the particular digital due diligence process; and
removing the detected face from each of the plurality of face photographs such that a background image remains in each of the plurality of face photographs.
14. The method of claim 11, further comprising:
using an identification document alignment model to detect an identification document in each of a plurality of identification document photos submitted by a user for the particular digital due diligence process; and
removing the detected identification document from each of the plurality of identification document photographs such that a background image remains in each of the plurality of identification document photographs.
15. The method of claim 10, further comprising:
detecting the presence of a similar background from the generated background vector using a clustering algorithm.
16. The method of claim 15, wherein the clustering algorithm comprises: DBSCAN, K-means, or spectral clustering.
17. The method of claim 10, further comprising:
training the image similarity model using one of the following Artificial Neural Network (ANN) architectures:
a residual neural network (ResNet) or IR_SE for a backbone portion of the artificial neural network; and
softmax, triplet loss, or ArcFace for the head portion of the artificial neural network.
18. The method of claim 11, further comprising:
adding the extracted first background image and the extracted second background image to the set of first historical background images and the set of second historical background images, respectively; and
adding the generated background vector to a set of historical background vectors.
CN202110099979.2A 2020-02-03 2021-01-25 Authentication system and method Pending CN112733117A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202000965YA SG10202000965YA (en) 2020-02-03 2020-02-03 Authentication System And Method
SG10202000965Y 2020-02-03

Publications (1)

Publication Number Publication Date
CN112733117A 2021-04-30

Family

ID=70615304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110099979.2A Pending CN112733117A (en) 2020-02-03 2021-01-25 Authentication system and method

Country Status (2)

Country Link
CN (1) CN112733117A (en)
SG (1) SG10202000965YA (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102013022A (en) * 2010-11-23 2011-04-13 北京大学 Selective feature background subtraction method aiming at thick crowd monitoring scene
CN104853060A (en) * 2015-04-14 2015-08-19 武汉基数星通信科技有限公司 High-definition video preprocessing method and system
CN109429519A (en) * 2017-06-30 2019-03-05 北京嘀嘀无限科技发展有限公司 System and method for verifying the authenticity of certificate photograph
CN110569878A (en) * 2019-08-08 2019-12-13 上海汇付数据服务有限公司 Photograph background similarity clustering method based on convolutional neural network and computer
WO2020015470A1 (en) * 2018-07-16 2020-01-23 Oppo广东移动通信有限公司 Image processing method and apparatus, mobile terminal, and computer-readable storage medium


Also Published As

Publication number Publication date
SG10202000965YA (en) 2020-03-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination