CN112597808A - Tamper detection method and system - Google Patents


Info

Publication number
CN112597808A
CN112597808A
Authority
CN
China
Prior art keywords
face
image
neural network
training samples
identification card
Prior art date
Legal status
Pending
Application number
CN202011388928.3A
Other languages
Chinese (zh)
Inventor
徐炎
Current Assignee
Alipay Labs Singapore Pte Ltd
Original Assignee
Alipay Labs Singapore Pte Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Labs Singapore Pte Ltd
Publication of CN112597808A

Classifications

    • G06V 40/161 — Human faces: detection; localisation; normalisation
    • G06N 3/045 — Neural networks: combinations of networks
    • G06N 3/08 — Neural networks: learning methods
    • G06Q 30/0185 — Certifying business or products: product, service or business identity fraud
    • G06V 10/267 — Image preprocessing: segmentation of patterns by performing operations on regions
    • G06V 40/172 — Human faces: classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)

Abstract

A tamper detection method and system, the method comprising: extracting image data of a face region of an identification card from an image of the identification card; applying a trained neural network to the extracted image data to calculate a confidence score; and comparing the confidence score to a threshold to determine whether the identification card has been tampered with. Training the neural network comprises: inputting each of a plurality of training samples into a binary classifier for classification based on the training sample's face photograph, and iteratively optimizing parameters of the neural network based on the classification results of the plurality of training samples.

Description

Tamper detection method and system
Technical Field
This document relates generally, but not exclusively, to identity fraud detection, including methods, devices and systems for detecting tampering with an identity card.
Background
Identity fraud is generally the unauthorized use of another person's personal information to commit a crime, or to deceive or defraud that person or a third party. One example is identification card fraud, where a forged or tampered identification card is used to commit identity fraud.
For example, "electronic know your customer (eKYC)" processes typically require submission of an image (e.g., a photograph or scan) of an identification card. From some of these images it can be observed that the identification card may carry a substitute face photograph, while details such as the name, identification card number, date of birth, and biometric data remain the original information. In the eKYC process, the replacement face photograph may match a scan of a live face, but the personal information clearly belongs to another person. Such fraud may pose significant financial risks to enterprises that rely on eKYC, as well as potential legal and security issues.
It may be desirable to provide methods, devices and systems that can detect such tampering to prevent identity card fraud.
Disclosure of Invention
An embodiment provides a tamper detection method, including: extracting image data of a face region of an identification card from an image of the identification card; applying a trained neural network to the extracted image data to calculate a confidence score; and comparing the confidence score to a threshold to determine whether the identification card has been tampered with. Training the neural network includes: inputting each of a plurality of training samples into a binary classifier for classification based on the training sample's face photograph, and iteratively optimizing parameters of the neural network based on the classification results of the plurality of training samples.
Another embodiment provides a tamper detection system comprising: at least one processor; and a computer readable memory coupled to the processor and having instructions stored thereon. The instructions are executable by the processor to: extract image data of a face region of an identification card from an image of the identification card; apply a trained neural network to the extracted image data to calculate a confidence score; and determine whether the identification card has been tampered with based on a comparison of the confidence score with a threshold. The trained neural network includes: a binary classifier trained to classify each of a plurality of training samples based on the training sample's face photograph; and neural network parameters that are iteratively optimized based on the classification results of the plurality of training samples.
Another embodiment provides an apparatus comprising: a receiving device for receiving an image of an identification card; and a processing device for extracting image data of the face region of the identification card from the received image. The processing device provides the extracted image data to a trained neural network to calculate a confidence score, and determines whether the identification card has been tampered with based on a comparison of the confidence score with a threshold. The trained neural network includes: a binary classifier trained to classify each of a plurality of training samples based on the training sample's face photograph; and neural network parameters that are iteratively optimized based on the classification results of the plurality of training samples.
Drawings
Embodiments of the present disclosure will be better understood and appreciated by those of ordinary skill in the art from the following written description, by way of example only, and with reference to the accompanying drawings.
Fig. 1 shows a flow diagram illustrating a method of detecting tampering with an identity card according to an embodiment.
Fig. 2 shows a flow chart illustrating a detailed embodiment of the method of fig. 1.
FIG. 3 shows a flow diagram illustrating a detailed implementation of an automatic image generation process for preparing negative training samples.
Fig. 4 shows a schematic diagram of an apparatus suitable for implementing the method of fig. 1.
FIG. 5 shows a schematic diagram illustrating a computer system suitable for implementing the method of FIG. 1.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures, block diagrams, or flowcharts may be exaggerated relative to other elements to help improve understanding of embodiments herein.
Detailed Description
The present disclosure provides a method for detecting tampering, in particular photo replacement, in identification card fraud. In the present disclosure, tamper detection is treated as a classification task with two classes: genuine (i.e., untampered) cards and counterfeit (i.e., tampered) cards with a replaced photograph. Furthermore, the classification model is trained using regions of interest (e.g., face regions) of the identification card rather than the entire captured image. In other words, the tamper detection method can achieve high accuracy and high speed by focusing on the area of the identification card where tampering is likely to occur.
Embodiments will now be described, by way of example only, with reference to the accompanying drawings. Like reference numbers and characters in the drawings indicate like elements or equivalents.
Some portions of the following description are presented in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
Unless specifically stated otherwise, and as will be apparent from the following, it is appreciated that throughout the present document, discussions utilizing terms such as "scanning," "computing," "determining," "applying," "extracting," "generating," "inputting," "outputting," "optimizing," or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
Also disclosed herein are apparatuses for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of a more specialized apparatus for carrying out the required method steps may be appropriate. The structure of a conventional computer will appear from the description below.
Further, a computer program is implicitly disclosed herein, since it is clear to a person skilled in the art that the individual steps of the methods described herein can be implemented by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and code therefor may be used to implement the teachings of the disclosure as contained herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variations of computer programs that may use different control flows without departing from the scope of this disclosure.
Furthermore, one or more steps of a computer program may be executed in parallel rather than sequentially. Such a computer program may be stored on any computer readable medium. The computer-readable medium may include a storage device such as a magnetic or optical disk, a memory chip, or other storage device suitable for interfacing with a computer. The computer readable media may also include hardwired media such as those exemplified in internet systems, or wireless media such as those exemplified in GSM, GPRS, 3G, 4G or 5G mobile phone systems, as well as other wireless systems such as Bluetooth, ZigBee, and Wi-Fi. Such a computer program, when loaded and executed on a computer, effectively creates an apparatus that implements the steps of the preferred method.
The present disclosure may also be implemented as hardware elements. More specifically, in a hardware sense, an element is a functional hardware unit designed for use with other components or elements. For example, one element may be implemented using discrete electronic components, or an element may form part of an overall electronic circuit, such as an Application Specific Integrated Circuit (ASIC) or a Field Programmable Gate Array (FPGA). Many other possibilities exist. Those skilled in the art will appreciate that the system may also be implemented as a combination of hardware and software elements.
According to various embodiments, "circuitry" may be understood as any kind of logic implementing entity, which may be a dedicated circuit or a processor executing software stored in a memory, firmware, or any combination thereof. Thus, in embodiments, a "circuit" may be a hardwired logic circuit or a programmable logic circuit, e.g., a programmable processor such as a microprocessor (e.g., a Complex Instruction Set Computer (CISC) processor or a Reduced Instruction Set Computer (RISC) processor). The "circuitry" may also be a processor executing software, e.g. any kind of computer program, such as a computer program using virtual machine code such as Java. According to alternative embodiments, any other kind of implementation of the respective functions, which may be described in more detail herein, may also be understood as a "circuit".
Fig. 1 shows a flow diagram illustrating a method of detecting tampering with an identification card according to an embodiment. Examples of identification cards include, but are not limited to, driver's licenses, social security cards, passport pages, and other officially issued documents that can be accepted as valid identification. At step 102, image data of the face region of the identification card is extracted from an image of the identification card. As described in further detail below, this may involve a first crop of the image so that it contains substantially only the identification card, and a second crop to retain the portion of the image where the face region is located (e.g., the lower left corner of the identification card).
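The two-stage cropping described above can be sketched with plain array slicing; this is a minimal illustration, and the box coordinates below are made up rather than taken from this disclosure:

```python
import numpy as np

def extract_face_region(aligned_card, face_box=(0, 300, 0, 200)):
    """Second-stage crop: keep only the sub-area of the aligned ID-card
    image where the face photograph is expected (coordinates illustrative)."""
    top, bottom, left, right = face_box
    return aligned_card[top:bottom, left:right]

# 400 x 300-pixel aligned card image (H x W = 300 x 400), as in the embodiment
card = np.zeros((300, 400, 3), dtype=np.uint8)
face_region = extract_face_region(card)   # sub-image holding the face photo
```

In practice the second crop would be followed by a resize to the classifier's input resolution.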
At step 104, the trained neural network is applied to the extracted image data to calculate a confidence score. The confidence score is then compared to a threshold value at step 106 to determine whether the identification card has been tampered with. For example, if the confidence score is greater than a threshold, the photograph on the identification card is determined to be a replacement photograph and the identification card is determined to be a counterfeit (i.e., tampered) identification card. On the other hand, if the confidence score is less than the threshold, the photograph on the identification card is determined to be the original photograph and the identification card is determined to be a genuine (i.e., untampered) identification card.
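The confidence calculation and threshold comparison can be illustrated with a normalized exponential function (softmax) over two-class logits; the logit values and the 0.8 threshold here are hypothetical, chosen only for the example:

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)        # stabilize before exponentiation
    e = np.exp(z)
    return e / e.sum()

# hypothetical two-class network outputs: [genuine, tampered]
conf = softmax(np.array([1.2, 3.4]))[1]   # confidence the photo was replaced
threshold = 0.8                            # illustrative threshold
tampered = conf > threshold                # greater than threshold -> tampered
```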
The neural network may be a Convolutional Neural Network (CNN) suitable for image classification tasks, such as the ResNet-18 CNN, and the confidence score may be calculated from the trained CNN using a normalized exponential function. The threshold may be determined based on a labeled validation dataset. More specifically, the validation dataset includes normal identification card images and counterfeit identification card images (i.e., with replaced photographs); a Receiver Operating Characteristic (ROC) curve may then be generated from the labels (i.e., 0 or 1) of the validation dataset and the classification/prediction results (i.e., 0 or 1) from the binary classifier. In one embodiment, based on the ROC curve, the threshold may be set at the point where the False Acceptance Rate (FAR) of the binary classifier equals 0.01. In other embodiments, the threshold may be set where the FAR equals 0.001. It will be appreciated that different FAR values may be used in alternative embodiments, with the threshold varied accordingly.
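Picking a threshold at a target FAR from labeled validation scores can be sketched in plain numpy; the score distribution below is synthetic, and the exact ROC procedure used in this disclosure may differ:

```python
import numpy as np

def threshold_at_far(genuine_scores, target_far=0.01):
    """Lowest threshold such that at most target_far of genuine (label 0)
    validation samples score above it (FAR: genuine cards wrongly accepted
    as tampered)."""
    s = np.sort(np.asarray(genuine_scores))
    k = int(np.ceil(len(s) * (1.0 - target_far))) - 1
    return s[min(max(k, 0), len(s) - 1)]

# synthetic validation scores: genuine cards score low on "tampered"
genuine = np.linspace(0.0, 0.5, 100)
thr = threshold_at_far(genuine, target_far=0.01)
far = np.mean(genuine > thr)               # realized false acceptance rate
```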
In the present approach, the neural network may be trained with a plurality of training samples (e.g., tens of thousands of samples). Training the neural network may include inputting each of a plurality of training samples into a binary classifier to classify based on a face photograph of the training sample, and iteratively optimizing parameters of the neural network based on classification results of the plurality of training samples. During the training phase, the number and quality of the training samples may affect the accuracy of the classifier.
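The iterative parameter optimization can be sketched with a toy numpy logistic-regression stand-in for the binary classifier (the disclosure uses a CNN; the data here is synthetic and the model is deliberately minimal):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-in for face-region crops: 200 flattened 4x4 "images"
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(float)              # label 0 = genuine, 1 = tampered

w, b, lr = np.zeros(16), 0.0, 0.1
for _ in range(300):                          # iterative parameter optimization
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # classifier confidence in [0, 1]
    w -= lr * (X.T @ (p - y)) / len(y)        # cross-entropy gradient step
    b -= lr * np.mean(p - y)

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = np.mean((p > 0.5) == y)            # training accuracy
```

The loop structure (classify each sample, then update parameters from the classification error) mirrors the training procedure described above, just with a far simpler model.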
FIG. 2 shows a flow diagram 200 of a detailed implementation of the method of FIG. 1, including a data processing phase 202, a training phase 204, and a testing phase 206.
In the data processing stage 202, a portion 208 of the identification card 210 containing the face photograph, hereinafter also referred to as the face region, is extracted. The input image I1 of the identification card 210 in the data processing stage 202 may be a photograph or a scan of the identification card 210. An image alignment step 212 is performed by detecting the four corners of the identification card 210 in the input image I1, performing a first crop, and aligning the cropped image I2 based on the four corners. In an exemplary embodiment, the resolution of the cropped image I2 is 400 × 300 pixels, and some background around the identification card 210 may be included in the image I2. A face cropping step (i.e., a second crop) is then performed to extract the face region 208, having a size of 200 × 300 pixels, from the cropped image I2; the face region 208 is then resized to a rectangle of 192 × 256 pixels.
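The four-corner alignment step can be sketched by solving the homography that maps the detected corners onto an upright rectangle; the corner coordinates below are illustrative, and a production system would likely use a library routine (e.g., a perspective-transform call) rather than this hand-rolled solve:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Direct linear transform: solve the 3x3 homography mapping four
    detected card corners (src) onto an upright rectangle (dst)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)               # null vector of the constraint system
    return h / h[2, 2]

# corners detected in the photo -> 400x300 aligned card (illustrative values)
src = [(50, 40), (460, 60), (470, 330), (40, 310)]
dst = [(0, 0), (400, 0), (400, 300), (0, 300)]
H = perspective_matrix(src, dst)
p = H @ np.array([460.0, 60.0, 1.0])
mapped = p[:2] / p[2]                      # second detected corner -> (400, 0)
```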
The data processing stage 202 may be used to prepare training samples for the training stage 204. In addition, the data processing stage may be used to extract the face region of the identification card during the testing stage 206 or actual use of the tamper detection method.
Referring to the training phase 204, the training samples may include a plurality of positive training samples 214 and a plurality of negative training samples 216. Using the data processing stage 202 as described above, the positive training samples 214 are obtained from real images of identification cards, each bearing its respective face photograph. In one embodiment, about 150,000 such positive training samples are used, although it will be appreciated that the number may vary depending on the training results (e.g., the accuracy achieved).
The negative training samples 216 could similarly be obtained from real images of identification cards in which the face photograph of each card has been replaced. However, it has been recognized that preparing such real images physically/manually requires repeatedly printing and cropping physical face photographs, superimposing them on the face area of an identification card, and photographing the result; this approach is therefore time-consuming and expensive when a large number of negative training samples (e.g., 150,000) is required. In a preferred embodiment, to efficiently prepare a large number of negative training samples, the negative training samples are obtained using the data processing stage 202 above from synthetic images, each generated by superimposing a face photograph on an identification card image via an automatic image generation process, as described in further detail below with reference to FIG. 3.
In one embodiment, the positive training samples 214 are labeled 0 and the negative training samples 216 are labeled 1. Each of the positive training samples 214 and the negative training samples 216 has a size of 192 × 256 pixels (i.e., a rectangle) instead of the 224 × 224 pixels typically used with CNNs such as the ResNet-18 CNN. The 192 × 256 pixel size better matches the face region of an identification card, which is generally rectangular rather than square. Each training sample is provided to a neural network based classifier, such as CNN classifier 218, for binary classification based on the training sample's face photograph. In other words, face region images, rather than full identification card images, are used to train the neural network based classifier. Thus, in use, the trained classifier can concentrate on the face region of the identification card image, ignoring unnecessary regions, which can significantly improve the accuracy of detecting a replaced photograph. The classification result may be the value 0 or the value 1. The training phase 204 also includes iteratively optimizing parameters of the neural network based on the classification results of the plurality of training samples.
In the testing stage 206 or in actual use, the submitted identification card image first passes through the data processing stage 202 as described above to obtain a cropped image 220 of the face region of the identification card. The cropped image 220 is then provided to the trained neural network based classifier 218 to obtain a confidence score. If the confidence score is greater than the defined threshold, the submitted image is considered to be an image of an identification card whose face photograph has been replaced, i.e., the identification card has been tampered with. On the other hand, if the confidence score is less than the defined threshold, the submitted image is considered to be an image of an original identification card, i.e., the identification card has not been tampered with.
The confidence score may be calculated using a normalized exponential function. In one example, the threshold is determined from the labeled validation dataset such that the false acceptance rate of the classifier 218 at that threshold equals 0.01. In another embodiment, the threshold is determined such that the false acceptance rate equals 0.001. In alternative embodiments, different false acceptance rates may be used.
FIG. 3 shows a flow diagram 300 illustrating a detailed embodiment of an automatic image generation process for preparing negative training samples. First, a face image library containing only face images taken from identification cards is created, and one face image (face_image) is randomly selected at step 302. An identification card image library containing identification card images is also created, and one identification card image (card_image) is randomly selected at step 304. Next, at step 306, face detection is performed on the selected face image to obtain a first rectangular box (face_roi). Similarly, at step 308, face detection is performed on the selected card image to obtain a second rectangular box (card_roi) corresponding to the face photograph. Then, at step 310, the first rectangular box (face_roi) is cropped to extract a face photograph (face_crop). In embodiments, the cropping may be performed by one of: (a) randomly moving the four corner points of face_roi within a small range to obtain a random quadrilateral and cropping that quadrilateral; (b) randomly generating n points around face_roi, connecting them pairwise to obtain a random n-sided polygon, and cropping that polygon; or (c) performing a face segmentation algorithm on face_roi to obtain the portrait's edge contour and cropping along that edge.
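Option (a), jittering the four corners of the detected face box, might be sketched as follows; the box coordinates and the shift range are illustrative assumptions, not values from this disclosure:

```python
import numpy as np

rng = np.random.default_rng(1)

def jitter_quad(face_roi, max_shift=5):
    """Option (a): randomly move the four corners of the detected face box
    (x0, y0, x1, y1) within a small range to get a random quadrilateral."""
    x0, y0, x1, y1 = face_roi
    corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=float)
    return corners + rng.integers(-max_shift, max_shift + 1, size=(4, 2))

quad = jitter_quad((20, 30, 120, 160))      # hypothetical face_roi box
base = np.array([[20, 30], [120, 30], [120, 160], [20, 160]], dtype=float)
```

The resulting quadrilateral would then be used as a crop mask for face_crop.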
The cropped face photograph (face_crop) is then further processed. For example, at step 312, the average luminance of the second rectangular box (card_roi) is calculated, and the luminance of face_crop is adjusted to match it. In addition, shading may be added to face_crop at an arbitrary angle. Then, at step 314, face_crop is scaled to match the size of card_roi and superimposed on card_roi to generate a composite image. Further, at step 316, Gaussian noise and Gaussian blur are randomly added to the composite image to obtain the final composite image (tamper_image).
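Steps 312 through 316 can be sketched with numpy stand-ins for brightness matching, resizing, overlay, and noise; all sizes and the noise level are illustrative, and Gaussian blur and shading are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(2)

def match_brightness(face_crop, card_roi):
    # shift the crop so its mean luminance matches the card's face area
    return np.clip(face_crop + (card_roi.mean() - face_crop.mean()), 0, 255)

def nearest_resize(img, h, w):
    # minimal nearest-neighbour resize (stand-in for a library resize call)
    ys = np.arange(h) * img.shape[0] // h
    xs = np.arange(w) * img.shape[1] // w
    return img[ys][:, xs]

card = rng.integers(0, 256, size=(300, 400)).astype(float)
face = rng.integers(0, 256, size=(80, 60)).astype(float)

roi = card[50:210, 40:160]                        # card_roi: photo area on card
patch = nearest_resize(match_brightness(face, roi), 160, 120)
card[50:210, 40:160] = patch                      # superimpose the face crop
tamper_image = np.clip(card + rng.normal(0, 3, card.shape), 0, 255)
```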
Steps 302 through 316 may be repeated in the automatic image generation process to quickly generate negative training samples with realistic tampering traces. The resulting high-quality training samples can improve the accuracy of the neural network based classifier.
The automatic image generation process also includes hard sample mining. The goal is to find the negative samples that the current neural network based classifier cannot correctly identify. This process collects "hard" negative samples, which are then used to further optimize the next generation/iteration of the neural network based classifier. Specifically, at step 318, the tamper_image is provided to the neural network based classifier to determine whether it is an image with a replaced photograph. If the classifier correctly determines that the tamper_image is indeed an image with a replaced photograph, which is the expected result, the sample is considered a simple/regular sample and is ignored (step 320). On the other hand, if the classifier incorrectly determines that the tamper_image contains the original (i.e., unreplaced) photograph, which is a false negative result, the sample is considered a "hard" sample and is saved (step 322). The collected hard samples may then be used to retrain the neural network based classifier, updating and optimizing the neural network parameters to substantially reduce or minimize false negatives.
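The hard sample mining step can be sketched as a simple filter over synthesized images; the classifier and the score values here are toy stand-ins for illustration only:

```python
def mine_hard_samples(classifier, tamper_images, threshold=0.5):
    """Collect synthesized tampered images that the current classifier
    wrongly judges as original (false negatives); save them for retraining."""
    hard = []
    for img in tamper_images:
        score = classifier(img)        # confidence that the photo was replaced
        if score <= threshold:         # missed forgery -> hard sample, save it
            hard.append(img)
    return hard                        # easy samples are simply ignored

# toy classifier whose "image" is just its own score
hard = mine_hard_samples(lambda s: s, [0.9, 0.2, 0.7, 0.4])
```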
Fig. 4 shows a schematic diagram of an apparatus 400 suitable for implementing the method of fig. 1. The apparatus 400 includes a receiving device 402 connected to a processing device 404. The receiving device 402 may receive an image 406 of an identification card. The processing device 404 may extract image data of the face region of the identification card from the received image 406, provide the extracted image data to a trained neural network 408 to calculate a confidence score 410, and determine whether the identification card has been tampered with based on a comparison of the confidence score 410 with a threshold 412. The trained neural network 408 includes: a binary classifier 414 trained to classify each of a plurality of training samples based on the training sample's face photograph; and neural network parameters 416 that are iteratively optimized based on the classification results of the plurality of training samples.
As described, the present methods, systems, and apparatus can detect identification card fraud by detecting tampering of the face photograph based on an image of the identification card. Tamper detection is performed by a classifier trained to focus on a region of interest that generally corresponds in shape, size, and location to the face photograph region of the identification card, thereby improving the speed and accuracy of classification. In some embodiments, the large number of samples used to train the classifier is generated with high quality by a fully automated process. In addition, the classifier can be retrained on selected hard samples that would otherwise yield false negatives, further improving classification accuracy.
Fig. 5 depicts an exemplary computing device 500, hereinafter interchangeably referred to as computer system 500; one or more such computing devices 500 may be used in the apparatus 400. The following description of computing device 500 is provided by way of example only and is not intended to be limiting.
As shown in fig. 5, exemplary computing device 500 includes a processor 504 for executing software routines. Although a single processor is shown for clarity, computing device 500 may also comprise a multi-processor system. The processor 504 is connected to a communication infrastructure 506 to communicate with other components of the computing device 500. The communication infrastructure 506 may include, for example, a communication bus, a crossbar, or a network.
Computing device 500 also includes a main memory 508, such as Random Access Memory (RAM), and a secondary memory 510. The secondary memory 510 may include, for example, a hard disk drive 512 and/or a removable storage drive 514, and the removable storage drive 514 may include a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. Removable storage drive 514 reads from and/or writes to a removable storage unit 518 in a well known manner. Removable storage unit 518 may comprise a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 514. As will be appreciated by one skilled in the relevant art, removable storage unit 518 includes a computer-readable storage medium having stored therein computer-executable program code instructions and/or data.
In alternative embodiments, secondary memory 510 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into computing device 500. Such means may include, for example, a removable storage unit 522 and an interface 520. Examples of removable storage unit 522 and interface 520 include a program cartridge and cartridge interface (e.g., such as that found in video game devices), a removable memory chip (e.g., an EPROM, or PROM) and associated socket, and other removable storage units 522 and interfaces 520 which allow software and data to be transferred from removable storage unit 522 to computer system 500.
Computing device 500 also includes at least one communication interface 524. Communications interface 524 allows software and data to be transferred between computing device 500 and external devices via communications path 526. In various embodiments, communication interface 524 allows data to be transferred between computing device 500 and a data communication network, such as a public or private data communication network. The communication interface 524 may be used to exchange data between different computing devices 500, which computing devices 500 form part of an interconnected computer network. Examples of communication interface 524 may include a modem, a network interface (such as an ethernet card), a communication port, an antenna with associated circuitry, and the like. Communication interface 524 may be wired or may be wireless. Software and data transferred via communications interface 524 are in the form of signals which may be electrical, electromagnetic, optical or other signals capable of being received by communications interface 524. These signals are provided to the communications interface via communications path 526.
As shown in fig. 5, computing device 500 further includes: a display interface 502 that performs operations for presenting images to an associated display 530; and an audio interface 532 that performs operations for playing audio content via an associated speaker 534.
As used herein, the term "computer program product" may refer, in part, to removable storage unit 518, removable storage unit 522, a hard disk installed in hard disk drive 512, or a carrier wave carrying software over communications path 526 (wireless link or cable) to communications interface 524. Computer-readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to computing device 500 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROMs, DVDs, Blu-ray™ discs, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer-readable card such as a PCMCIA card, whether or not such devices are internal or external to computing device 500. Examples of transitory or intangible computer-readable transmission media that may also participate in providing software, applications, instructions, and/or data to computing device 500 include radio or infrared transmission channels, network connections to another computer or networked device, and the Internet or Ethernet, etc., including e-mail transmissions and information recorded on websites and the like.
Computer programs (also called computer program code) are stored in main memory 508 and/or secondary memory 510. Computer programs may also be received via communications interface 524. Such computer programs, when executed, enable computing device 500 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable processor 504 to perform the features of the embodiments described above. Accordingly, such computer programs represent controllers of the computer system 500.
The software may be stored in a computer program product and loaded into computing device 500 using removable storage drive 514, hard drive 512, or interface 520. Alternatively, the computer program product may be downloaded to computer system 500 via communications path 526. The software, when executed by the processor 504, causes the computing device 500 to perform the functions of the embodiments described herein.
It should be understood that the embodiment of fig. 5 is given by way of example only. Thus, in some embodiments, one or more features of computing device 500 may be omitted. Also, in some embodiments, one or more features of computing device 500 may be combined together. Additionally, in some embodiments, one or more features of computing device 500 may be separated into one or more components.
It should be understood that the elements shown in fig. 5 are used to provide means to perform the various functions and operations of the server as described in the above embodiments.
In an embodiment, a server may generally be described as a physical device including at least one processor and at least one memory including computer program code. The at least one memory and the computer program code are configured to, with the at least one processor, cause the physical device to perform the necessary operations.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to what is shown in a particular embodiment of the disclosure without departing from the scope of the disclosure as broadly described. For example, the threshold may be adjusted based on actual needs. The described embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

Claims (20)

1. A tamper detection method, comprising:
extracting image data of a face region of an identification card from an image of the identification card;
applying a trained neural network to the extracted image data to compute a confidence score, wherein training the neural network comprises:
inputting each of a plurality of training samples into a binary classifier to classify the training sample based on a face photograph of the training sample; and
iteratively optimizing parameters of the neural network based on classification results of the plurality of training samples; and
comparing the confidence score to a threshold to determine whether the identification card has been tampered with.
2. The method of claim 1, wherein the confidence score is calculated using a normalized exponential function.
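For illustration, the "normalized exponential function" of claim 2 is the softmax function; a minimal two-class sketch (the variable names are assumptions) is:

```python
import math

def softmax_confidence(logit_genuine: float, logit_tampered: float) -> float:
    """Normalized exponential (softmax) over the two class logits;
    returns the confidence that the face photograph was tampered with."""
    m = max(logit_genuine, logit_tampered)   # subtract the max for numerical stability
    e_genuine = math.exp(logit_genuine - m)
    e_tampered = math.exp(logit_tampered - m)
    return e_tampered / (e_genuine + e_tampered)
```

The two outputs always sum to one, so the tampered-class probability alone suffices as the confidence score.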
3. The method of claim 1 or 2, wherein the threshold is determined from a labeled validation dataset such that the false acceptance rate of the binary classifier based on the threshold is equal to 0.01.
4. The method of claim 1 or 2, wherein the threshold is determined from a labeled validation dataset such that a false acceptance rate of the binary classifier based on the threshold is equal to 0.001.
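Claims 3 and 4 pick the threshold empirically from a labeled validation set. One way to sketch this (assuming higher scores mean "more likely tampered", so a false acceptance is a tampered sample scoring below the threshold):

```python
import numpy as np

def threshold_for_far(tampered_scores, target_far):
    """Return a threshold such that roughly `target_far` (e.g. 0.01 or
    0.001) of the tampered validation samples score below it and would
    therefore be falsely accepted. Assumes 0 <= target_far < 1."""
    scores = np.sort(np.asarray(tampered_scores, dtype=float))
    k = int(np.floor(target_far * len(scores)))
    return scores[k]
```

This is a sketch of one common calibration procedure, not necessarily the exact procedure of the disclosure.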
5. The method of any of the preceding claims, wherein the training samples comprise:
a plurality of positive training samples obtained from real images of a plurality of identification cards, each having a respective face photograph; and
a plurality of negative training samples obtained from a plurality of composite images, wherein each composite image is generated by superimposing a face photograph on an identification card photograph based on an automatic image generation process.
6. The method of claim 5, wherein superimposing the face photograph on the identification card photograph based on the automatic image generation process comprises:
randomly selecting an image having a face photograph and an image having an identification card photograph from respective image libraries;
extracting the face photograph and the identification card photograph from the selected images, wherein the identification card photograph comprises a region of interest in which the face photograph is located;
adjusting the size and brightness of the extracted face photograph based on the size and brightness of the region of interest of the identification card photograph; and
superimposing the size- and brightness-adjusted face photograph on the region of interest of the identification card photograph.
7. The method of claim 6, wherein adjusting the size and brightness of the extracted face photograph further comprises:
adding a shadow to the extracted face photograph at a random angle.
8. The method of claim 6 or 7, further comprising:
adding Gaussian noise and Gaussian blur to the composite image.
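The synthesis steps of claims 6 to 8 can be illustrated with a minimal sketch using nearest-neighbour resizing, a half-plane shadow split at a random angle, and additive Gaussian noise; Gaussian blur is omitted for brevity, and all tuning constants are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def resize_nearest(img, h, w):
    """Nearest-neighbour resize (a stand-in for a proper resampling filter)."""
    rows = (np.arange(h) * img.shape[0] / h).astype(int)
    cols = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[rows][:, cols]

def synthesize_negative(card, face, roi):
    """Paste `face` into the card's face-photo region of interest.
    roi = (top, left, height, width); the shadow factor 0.7 and noise
    sigma 2.0 are illustrative assumptions."""
    top, left, h, w = roi
    region = card[top:top + h, left:left + w]
    face = resize_nearest(face.astype(float), h, w)
    face *= region.mean() / max(face.mean(), 1e-6)        # match brightness
    yy, xx = np.mgrid[0:h, 0:w]                           # darken one half, split
    angle = rng.uniform(0.0, np.pi)                       # at a random angle
    shadow = np.cos(angle) * (xx - w / 2) + np.sin(angle) * (yy - h / 2) > 0
    face[shadow] *= 0.7
    out = card.astype(float).copy()
    out[top:top + h, left:left + w] = face
    out += rng.normal(0.0, 2.0, out.shape)                # additive Gaussian noise
    return np.clip(out, 0.0, 255.0)
```

Brightness matching and noise keep the pasted region statistically close to the card, so the classifier must learn subtler tampering cues than a simple brightness mismatch.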
9. The method of any one of the preceding claims, wherein:
each training sample comprises a rectangle; and
the face region of the identification card comprises a rectangle of the same size as the training sample.
10. The method of any of the preceding claims, wherein training the neural network further comprises:
identifying a set of negative training samples that provide false negative classification results;
adjusting a parameter of the neural network; and
retraining the neural network based on the identified set of negative training samples and the adjusted parameters.
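The hard-sample retraining of claim 10 can be sketched as follows: collect the negative (tampered) samples the current model misclassifies as genuine, then run further training rounds on them. The `predict`/`fit` interface and the toy model are assumptions for demonstration, not the disclosed implementation:

```python
def mine_hard_negatives(model, negatives, threshold=0.5):
    """Return negatives scored below the threshold (false negatives)."""
    return [x for x in negatives if model.predict(x) < threshold]

def retrain_on_hard_negatives(model, negatives, rounds=3):
    """Iteratively re-optimize the model on its own false negatives."""
    for _ in range(rounds):
        hard = mine_hard_negatives(model, negatives)
        if not hard:
            break                      # no false negatives remain
        model.fit(hard)                # re-optimize parameters on hard samples
    return model

class ToyModel:
    """Stand-in classifier: scores a sample high once it has been fit on it."""
    def __init__(self):
        self.seen = set()
    def predict(self, sample):
        return 0.9 if sample in self.seen else 0.1
    def fit(self, samples):
        self.seen.update(samples)
```

In practice the `fit` step would adjust the neural network's parameters (e.g. learning rate or sample weighting) before retraining, as the claim recites.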
11. A tamper detection system comprising:
at least one processor; and
a computer-readable memory coupled with the processor and having instructions stored thereon that are executable by the processor to:
extract image data of a face region of the identification card from an image of the identification card;
apply a trained neural network to the extracted image data to compute a confidence score, wherein the trained neural network comprises:
a binary classifier trained to classify each of a plurality of training samples based on a face photograph of the training sample; and
neural network parameters iteratively optimized based on classification results of the plurality of training samples; and
determine whether the identification card has been tampered with based on a comparison of the confidence score to a threshold.
12. The system of claim 11, wherein the instructions are executable by the processor to calculate the confidence score using a normalized exponential function.
13. The system of claim 11 or 12, wherein the memory further comprises instructions executable by the processor to determine the threshold from a labeled validation dataset such that a false acceptance rate of the binary classifier based on the threshold is equal to 0.01.
14. The system of claim 11 or 12, wherein the memory further comprises instructions executable by the processor to determine the threshold from a labeled validation dataset such that a false acceptance rate of the binary classifier based on the threshold is equal to 0.001.
15. The system of any of claims 11 to 14, wherein the training samples comprise:
a plurality of positive training samples obtained from real images of a plurality of identification cards, each having a respective face photograph; and
a plurality of negative training samples obtained from a plurality of composite images, wherein each composite image comprises a face photograph superimposed on an identification card photograph.
16. The system of claim 15, wherein the face photograph of each composite image has a size and brightness adjusted based on the size and brightness of the region of interest of the identification card photograph in which the face photograph is located.
17. The system of claim 16, wherein the face photograph of each composite image includes shadows added at random angles.
18. The system of claim 16 or 17, wherein each composite image comprises added gaussian noise and gaussian blur.
19. The system of any of claims 11 to 18, wherein each training sample comprises a rectangle and the face region of the identification card comprises a rectangle of the same size as the training sample.
20. An apparatus, comprising:
the receiving equipment is used for receiving the image of the identity card; and
a processing device to extract image data of a face region of the identification card from the received image, the processing device to provide the extracted image data to a trained neural network to calculate a confidence score and determine whether the identification card has been tampered with based on a comparison of the confidence score to a threshold, wherein the trained neural network comprises:
a binary classifier trained to classify each of a plurality of training samples based on a face photograph of the training sample; and
neural network parameters iteratively optimized based on classification results of the plurality of training samples.
CN202011388928.3A 2020-06-23 2020-12-01 Tamper detection method and system Pending CN112597808A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202005989Q 2020-06-23
SG10202005989Q 2020-06-23

Publications (1)

Publication Number Publication Date
CN112597808A true CN112597808A (en) 2021-04-02

Family

ID=75187595

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011388928.3A Pending CN112597808A (en) 2020-06-23 2020-12-01 Tamper detection method and system

Country Status (1)

Country Link
CN (1) CN112597808A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180107887A1 (en) * 2016-10-14 2018-04-19 ID Metrics Group Incorporated Tamper detection for identification documents
CN109754393A (en) * 2018-12-19 2019-05-14 众安信息技术服务有限公司 A kind of tampered image identification method and device based on deep learning
CN110414670A (en) * 2019-07-03 2019-11-05 南京信息工程大学 A kind of image mosaic tampering location method based on full convolutional neural networks
CN111191539A (en) * 2019-12-20 2020-05-22 江苏常熟农村商业银行股份有限公司 Certificate authenticity verification method and device, computer equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PENG ZHOU et al.: "Learning Rich Features for Image Manipulation Detection", IEEE Xplore, 31 December 2018 (2018-12-31), pages 1-9 *

Similar Documents

Publication Publication Date Title
WO2022161286A1 (en) Image detection method, model training method, device, medium, and program product
US9152860B2 (en) Methods and apparatus for capturing, processing, training, and detecting patterns using pattern recognition classifiers
US11023708B2 (en) Within document face verification
US10210415B2 (en) Method and system for recognizing information on a card
KR102370694B1 (en) Anti-counterfeiting detection method and device, electronic device, storage medium
KR102406432B1 (en) Identity authentication methods and devices, electronic devices and storage media
US20210019519A1 (en) Detection of fraudulently generated and photocopied credential documents
JP2019534526A (en) How to detect falsification of ID
US10403076B2 (en) Method for securing and verifying a document
WO2019061658A1 (en) Method and device for positioning eyeglass, and storage medium
US9489566B2 (en) Image recognition apparatus and image recognition method for identifying object
US10043071B1 (en) Automated document classification
CN110059607B (en) Living body multiplex detection method, living body multiplex detection device, computer equipment and storage medium
Prabhu et al. Recognition of Indian license plate number from live stream videos
CN115035533B (en) Data authentication processing method and device, computer equipment and storage medium
Gonzalez et al. Improving presentation attack detection for ID cards on remote verification systems
CN112597808A (en) Tamper detection method and system
CN112434727A (en) Identity document authentication method and system
CN111626244B (en) Image recognition method, device, electronic equipment and medium
CN114820476A (en) Identification card identification method based on compliance detection
CN112597810A (en) Identity document authentication method and system
CN112613345A (en) User authentication method and system
EP3671558A1 (en) Authentication system, apparatus, and method therefor
CN111723612A (en) Face recognition and face recognition network training method and device, and storage medium
CN111209863A (en) Living body model training and human face living body detection method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination