CN113158893A - Target identification method and system

Target identification method and system

Info

Publication number
CN113158893A
Authority
CN
China
Prior art keywords: image, shooting, photographing, authenticity, client
Legal status
Pending
Application number
CN202110424645.8A
Other languages
Chinese (zh)
Inventor
程博
张天明
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202110424645.8A
Publication of CN113158893A
Priority to PCT/CN2022/087915 (WO2022222957A1)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/166 - Detection; Localisation; Normalisation using acquisition arrangements


Abstract

Embodiments of this specification provide a target identification method and system. The method includes: acquiring at least one shooting parameter related to a shooting frame; sending a shooting instruction to a client, the shooting instruction instructing the client to display the shooting frame based on the at least one shooting parameter; receiving at least one image from the client; and determining the authenticity of the at least one image based on the at least one shooting parameter.

Description

Target identification method and system
Technical Field
The present disclosure relates to the field of image processing, and more particularly, to a method and system for object recognition.
Background
Target identification is a technique for identifying a target based on an image acquired by an image acquisition element. For example, face recognition technology, which takes a human face as the target, is widely applied in permission verification and identity verification scenarios. To improve the accuracy of target identification, it is necessary to determine the authenticity of the image.
It is therefore desirable to provide a method and system for object recognition that can determine the authenticity of an image.
Disclosure of Invention
One embodiment of the present disclosure provides a method for object recognition. The target identification method comprises the following steps: acquiring at least one shooting parameter related to a shooting frame; sending a shooting instruction to a client, wherein the shooting instruction instructs the client to display the shooting frame based on the at least one shooting parameter; receiving at least one image from the client; and determining the authenticity of the at least one image based on the at least one photographing parameter.
One embodiment of the present disclosure provides a method for object recognition. The target identification method comprises the following steps: receiving a photographing instruction from a server, the photographing instruction including at least one photographing parameter related to a photographing frame; displaying the shooting frame based on the at least one shooting parameter; acquiring at least one shot image based on the image acquisition element; and sending the at least one shot image to the server to judge the authenticity of the at least one shot image.
One embodiment of the present specification provides a system for object recognition. The system for target recognition comprises: the parameter acquisition module is used for acquiring at least one shooting parameter related to the shooting frame; the instruction sending module is used for sending a shooting instruction to a client, and the shooting instruction is used for instructing the client to display the shooting frame based on the at least one shooting parameter; an image receiving module for receiving at least one image from the client; and an authenticity determination module for determining authenticity of the at least one image based on the at least one photographing parameter.
One embodiment of the present specification provides a system for object recognition. The system for target recognition comprises: an instruction receiving module, configured to receive a shooting instruction from a server, where the shooting instruction includes at least one shooting parameter related to a shooting frame; the shooting frame display module is used for displaying the shooting frame based on the at least one shooting parameter; the image acquisition module is used for acquiring at least one shot image based on the image acquisition element; and the image sending module is used for sending the at least one shot image to the server so as to judge the authenticity of the at least one shot image.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a target recognition system in accordance with some embodiments of the present description;
FIG. 2 is an exemplary flow diagram of a target recognition method applied to a server, according to some embodiments described herein;
FIG. 3 is a flow diagram illustrating sending a shoot instruction to a client according to some embodiments of the present description;
FIG. 4 is a schematic diagram of a display capture frame according to some embodiments of the present description;
FIG. 5 is an exemplary flow diagram illustrating the determination of image authenticity according to some embodiments of the present description;
FIG. 6 is a schematic diagram of an image contrast model according to some embodiments of the present description;
FIG. 7 is an exemplary flow chart illustrating a method of object recognition applied to a client according to some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "apparatus", "unit" and/or "module" as used herein are terms used to distinguish different components, elements, parts, portions or assemblies at different levels. However, these terms may be replaced by other expressions that accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" may include plural referents unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the preceding or following operations are not necessarily performed exactly in the order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.
Target identification is a technique for identifying a target object based on an image acquisition element. In some embodiments, the target object may be a human face, a fingerprint, a palm print, a pupil, a non-living object, and the like. In some embodiments, target identification may be applied to permission verification, for example, access authorization authentication, account payment authorization authentication, and the like. In some embodiments, target identification may also be used for identity verification, for example, employee attendance authentication and identity verification at registration. For example only, target identification may match an image of the target object acquired by the image acquisition element in real time against a pre-acquired biometric feature, thereby verifying the identity of the target object.
However, the image acquisition element may be attacked or hijacked, and an attacker may upload false images for authentication. For example, attacker A may directly upload an image of user B after attacking or hijacking the image acquisition element. The target recognition system then performs face recognition based on the image of user B and the pre-acquired facial biometric features of user B, and the attacker passes verification as user B. Therefore, to ensure the security of target identification, the authenticity of the image needs to be determined, that is, it needs to be determined that the image was acquired in real time by the image acquisition element during the target identification process.
FIG. 1 is a schematic diagram of an application scenario of an object recognition system according to some embodiments of the present description.
As shown in FIG. 1, the object recognition system 100 may include a server 110, a network 120, a client 130, and a storage device 140.
The server 110 may be used to process data and/or information from at least one component of the target recognition system 100 or an external data source (e.g., a cloud data center). For example, the server 110 may acquire at least one photographing parameter related to the photographing frame, and determine authenticity or the like of at least one image transmitted by the client 130 based on the at least one photographing parameter. For another example, the server 110 may perform pre-processing (e.g., object detection, quality analysis, etc.) on the at least one image obtained from the client 130 to obtain a pre-processed at least one image. During processing, server 110 may retrieve data (e.g., instructions) from storage device 140 or save data (e.g., at least one image) to storage device 140, or may read data (e.g., shooting environment information) from other sources such as client 130 or output data (e.g., shooting instructions) to client 130 via network 120.
In some embodiments, the server 110 may be a single server or a group of servers. The server group can be centralized or distributed (e.g., the server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. In some embodiments, the server 110 may be implemented on a cloud platform or provided in a virtual manner. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof.
The network 120 may connect the components of the object recognition system 100 and/or connect the object recognition system 100 with external components. The network 120 enables communication between components of the object recognition system 100 and/or between the object recognition system 100 and external components to facilitate the exchange of data and/or information. In some embodiments, the network 120 may be any one or more of a wired network or a wireless network. For example, network 120 may include a cable network, a fiber optic network, a telecommunications network, the internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a bluetooth network, a ZigBee network (ZigBee), Near Field Communication (NFC), an in-device bus, an in-device line, a cable connection, and the like, or any combination thereof. In some embodiments, the network connections between the various parts of the system may be in one of the manners described above, or in multiple manners. In some embodiments, network 120 may be a point-to-point, shared, centralized, etc. variety of topologies or a combination of topologies. In some embodiments, network 120 may include one or more network access points. For example, the network 120 may include wired or wireless network access points, such as base stations and/or network switching points 120-1, 120-2, …, through which one or more components of the object identification system 100 may connect to the network 120 to exchange data and/or information.
The client 130 may enable interaction between the user and the target recognition system 100. In some embodiments, the client 130 may include an image acquisition element (e.g., a camera or video camera) for capturing image data (images and/or video). In some embodiments, the client 130 (e.g., a screen of the client 130) may display information that guides the user through shooting while the image acquisition element is capturing. For example, the client 130 may receive or determine one or more shooting parameters related to the shooting frame and display the shooting frame on its screen based on the one or more shooting parameters to guide the user to place a target object (e.g., a human face) within the shooting frame for shooting. In some embodiments, the client 130 may communicate with the server 110 through the network 120 and transmit the at least one captured image to the server 110. In some embodiments, the client 130 may be a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, other devices with input and/or output capabilities, the like, or any combination thereof. The above examples are intended only to illustrate the breadth of client 130 devices and are not intended to be limiting.
The storage device 140 may be used to store data (e.g., a standard image of the object, at least one reference image of a qualified object, etc.) and/or instructions. Storage device 140 may include one or more storage components, each of which may be a separate device or part of another device. In some embodiments, storage device 140 may include Random Access Memory (RAM), Read Only Memory (ROM), mass storage, removable storage, volatile read and write memory, and the like, or any combination thereof. Illustratively, mass storage may include magnetic disks, optical disks, solid state disks, and the like. In some embodiments, the storage device 140 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an internal cloud, a multi-tiered cloud, and the like, or any combination thereof. In some embodiments, the storage device 140 may be integrated or included in one or more other components of the target recognition system 100 (e.g., the server 110, the client 130, or possibly other components).
In some embodiments, the server 110 may include a parameter acquisition module, an instruction transmission module, an image reception module, an authenticity determination module, and a model acquisition module.
The parameter acquisition module may be configured to acquire at least one photographing parameter related to the photographing frame. In some embodiments, the parameter acquisition module may randomly generate the at least one photographing parameter. In some embodiments, the parameter obtaining module may determine the shooting difficulty coefficient based on the reference information; and determining the at least one shooting parameter based on the shooting difficulty coefficient.
The instruction sending module may be configured to send a shooting instruction to the client, where the shooting instruction is used to instruct the client to display a shooting frame based on at least one shooting parameter. In some embodiments, the instruction sending module may obtain a template image of the target object; and adjusting the template image based on the at least one photographing parameter to generate a contrast template image, wherein the photographing instruction further instructs the client to display the contrast template image within the photographing frame.
The image receiving module may be configured to receive at least one image from a client.
The authenticity determination module may be configured to determine authenticity of the at least one image based on the at least one photographing parameter. In some embodiments, the authenticity determination module may pre-process the at least one image to generate a pre-processed at least one image; and determining the authenticity of the at least one image based on the at least one photographing parameter and the preprocessed at least one image. In some embodiments, the pre-processing of the at least one image by the authenticity determination module comprises performing at least one of the following operations on each of the at least one image: carrying out target detection on the image, and determining whether the image contains a target object; performing quality analysis on the image to determine whether the image meets quality requirements; or carrying out image segmentation on the image to generate a segmented image corresponding to the shooting frame.
In some embodiments, the authenticity determination module may determine a first authenticity of the at least one image based on the at least one photographing parameter, the first authenticity reflecting whether the at least one image is an image that the client photographed based on a photographing instruction; the authenticity determination module may determine a second authenticity of the at least one image based on the at least one photographing parameter and at least one reference image of at least one qualified object, the second authenticity reflecting whether the at least one image is an image of one of the at least one qualified object.
The model acquisition module is used to acquire one or more machine learning models, such as an image contrast model, a difficulty coefficient determination model, a shooting parameter determination model, and the like. In some embodiments, the model acquisition module may acquire these models from other components of the target recognition system 100 (e.g., the storage device 140) or from external sources.
In some embodiments, the client 130 may include a photographing instruction receiving module, a photographing frame displaying module, an image acquiring module, and an image transmitting module. The instruction receiving module may be configured to receive a shooting instruction from the server, the shooting instruction including at least one shooting parameter related to the shooting frame. The photographing frame display module may be configured to display a photographing frame based on the at least one photographing parameter. In some embodiments, the photographing frame display module may display the contrast template image of the target object within the photographing frame. The image acquisition module may be configured to acquire at least one captured image based on the image capture element. The image sending module can be used for sending the at least one shot image to the server so as to judge the authenticity of the at least one shot image.
It should be noted that the above descriptions of the target recognition system and its modules are provided only for convenience of description and do not limit the present description to the scope of the illustrated embodiments. It will be appreciated that, given an understanding of the principle of the system, those skilled in the art may arbitrarily combine the modules or construct subsystems connected to other modules without departing from this principle. In some embodiments, the modules disclosed in FIG. 1 may be different modules in one system, or a single module may implement the functions of two or more of the modules described above. For example, the modules may share one storage module, or each module may have its own storage module.
FIG. 2 is an exemplary flow diagram of a method of object recognition shown in accordance with some embodiments of the present description. As shown in fig. 2, the process 200 includes the following steps.
At step 210, at least one shooting parameter associated with the shooting frame is acquired. In some embodiments, step 210 may be performed by a parameter acquisition module of a server (e.g., server 110).
The shooting frame refers to a specific area displayed on the screen of a client (e.g., the client 130), within which the user of the client can be guided to place the target object when shooting. In some embodiments, the shooting frame may be any shape, e.g., rectangular, circular, oval, etc. In some embodiments, to help the user identify the shooting frame, the client may mark the shooting frame on the screen. For example, the edge of the shooting frame may be marked with a specific color. For another example, the shooting frame may be filled with a color different from the rest of the screen display area. The user of the client refers to the user who uses the client for target identification.
The target object refers to an object to be subjected to target recognition. For example, the target object may be a face, a fingerprint, a palm print, a pupil, or the like of the user. As another example, the target object may be a non-biological object (e.g., an automobile). In some embodiments, the target object refers to a face of a user that requires authentication and/or authorization. For example, in a network appointment application scenario, the platform needs to verify whether the order taker is a registered driver user that the platform has approved, and the target object is the driver's face. For another example, in a face payment application scenario, the payment system needs to verify the payment authority of the payer, and the target object is the face of the payer.
The photographing parameters may include any parameters related to the shape, size, position, display manner, and the like of the photographing frame. Exemplary photographing parameters may include a photographing angle, a photographing distance, a photographing center point, display parameters, and the like.
The photographing angle is an angle of the photographing frame with respect to a reference direction (e.g., a length direction of the client screen). A change in the shooting angle may result in a change in the relative orientation of the shooting frame and the client screen. For example, assume that the photographing frame is rectangular. When the shooting angle is 0 degrees, the length direction of the shooting frame is parallel to the length direction of the screen; when the photographing angle is 30 °, an angle between a length direction of the photographing frame and a length direction of the screen is 30 °.
The shooting distance refers to the estimated distance between the target object and the image acquisition element of the client when the user places the target object in the shooting frame for shooting. A change in the shooting distance may cause the size ratio of the shooting frame to the screen to change. For example, when the shooting distance is 0.5 m, the ratio of the shooting frame to the screen is 0.8:1; when the shooting distance is 1 m, the ratio of the shooting frame to the screen is 0.6:1.
The shooting center point is a positioning point of the shooting frame. For example, the shooting center point may be a position point located at the center of the shooting frame, a position point located on the border of the shooting frame, or the like. A change in the position of the shooting center point on the screen causes a corresponding change in the position of the shooting frame on the screen.
The display parameter is a mode parameter related to the display mode of the photographing frame. In some embodiments, the display parameters may include the shape of the shot box, fill color, border color, whether to flash the display, and the like.
In some embodiments, the parameter acquisition module may randomly generate the at least one shooting parameter. For example, for a given shooting parameter, the parameter acquisition module may randomly determine its value within the preset value range of that shooting parameter in the target recognition system 100. Shooting parameters acquired in this way are highly random, which increases the difficulty of user cheating and thereby improves the accuracy of target identification.
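By way of illustration only, the following Python sketch shows one possible way to organize the shooting parameters described above and to generate them randomly within preset value ranges. The field names, default values, and ranges are assumptions made for this sketch and are not specified by the present disclosure.

```python
from dataclasses import dataclass
import random

@dataclass
class ShootingParameters:
    angle_deg: float              # shooting angle relative to the screen's length direction
    distance_m: float             # estimated target-to-camera distance implied by the frame size
    center_xy: tuple              # shooting center point as a fraction of screen width/height
    shape: str = "rectangle"      # display parameter: shape of the shooting frame
    border_color: str = "#00FF00" # display parameter: border color of the frame
    blink: bool = False           # display parameter: whether the frame flashes

def random_shooting_parameters() -> ShootingParameters:
    # Each value is drawn from a preset range, mirroring the random-generation
    # strategy described above; the ranges themselves are illustrative.
    return ShootingParameters(
        angle_deg=random.uniform(0, 45),
        distance_m=random.uniform(0.1, 0.5),
        center_xy=(random.uniform(0.3, 0.7), random.uniform(0.3, 0.7)),
        blink=random.random() < 0.5,
    )
```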
In some embodiments, the parameter acquisition module may determine the photographing parameters according to a default setting of the object recognition system 100. For example, the parameter acquisition module may acquire a shooting parameter corresponding to the object stored in advance from the storage device 140 according to the type of the object. In some embodiments, the parameter acquisition module may acquire the shooting parameters set empirically by the user from the terminal device. In some embodiments, the parameter acquisition module may determine the photographing parameters through data analysis. For example, the parameter acquisition module may determine the photographing parameters from device information received from the client.
In some embodiments, the parameter acquisition module may determine the shooting difficulty coefficient based on the reference information. The parameter obtaining module may further determine a shooting parameter based on the shooting difficulty coefficient. The reference information may reflect a likelihood and/or difficulty of a user of the client cheating in target recognition. For example, the reference information may include photographing environment information, historical behavior information of a historical user corresponding to the client, personal information of a historical user corresponding to the client, and the like, or any combination thereof.
The shooting environment information is information related to the shooting environment of the image acquisition element of the client. For example, the shooting environment information may include ambient lighting information, e.g., illumination intensity information and lighting type information. For another example, the shooting environment information may include environment background information, such as static and dynamic background information, background texture information, and the like. In some embodiments, the parameter acquisition module may receive the shooting environment information from the client. For example, the client may determine the shooting environment information based on image data captured by the image acquisition element. For another example, the client may include a sensor (e.g., a photosensor) for detecting the shooting environment and use it to detect the shooting environment information. Generally, the better the shooting environment (e.g., the better the ambient lighting), the less difficult it is for a user to cheat.
The historical users corresponding to the client can comprise users having binding relationship with the client, historical users who used the client, and the like. For example, the historical user corresponding to the client may be a driver registered with the client on the transportation service platform. The historical user corresponding to the client can be the same as or different from the user who uses the client for target identification at present.
The historical behavior information of the historical user relates to the historical behavior of the historical user, such as historical recognition behavior. For example, the historical behavior information may include the number of historical identification failures of the historical user, the reasons for those failures, and the like. The reasons for historical identification failure may include user cheating, user misoperation, and the like. In some embodiments, the parameter acquisition module may obtain a usage record of the client from the client or a storage device to determine the historical behavior information of the historical user. Generally, the greater the number of historical identification failures and/or historical cheating instances of the historical user, the higher the possibility that the user of the client will cheat in the current target identification.
The personal information of the historical user is information related to the historical user, such as a historical user identifier and historical user attributes. The historical user identifier is a symbol used to distinguish historical users, for example, the historical user's identity card ID, driver license ID, etc. The historical user attributes may include the historical user's age, educational background, gender, credit record, and the like. Illustratively, the better the credit record of the historical user, the lower the likelihood that the user of the client will cheat in the current target identification. In some embodiments, the parameter acquisition module may acquire the personal information of the historical user from the client, a storage device, or an external source. For example, the client may collect personal information when the historical user registers and store the personal information to the storage device.
The shooting difficulty coefficient characterizes how difficult it is for the user of the client to place the target object in the shooting frame for shooting. In some embodiments, the easier it is for the user to place the target object in the shooting frame for shooting, the greater the shooting difficulty coefficient that may be determined.
In some embodiments, the parameter acquisition module may determine the photographing difficulty coefficient based on the reference information. For example, the greater the light intensity, the easier it is for the user to place the object in the shooting frame. At this time, the parameter obtaining module may determine a greater shooting difficulty coefficient to prevent the user from cheating. For another example, in the historical behavior of the user, the more the number of times of the target identification failure caused by the "user fraud" is, the higher the fraud probability of the user in the current target identification process is. At this time, the parameter obtaining module may determine a greater shooting difficulty coefficient to prevent the user from cheating. In another example, the worse the credit record of the user is, the higher the possibility of cheating of the user in the target recognition process is. At this time, the parameter obtaining module may determine a greater shooting difficulty coefficient to prevent user cheating.
In some embodiments, the parameter acquisition module may determine the shooting difficulty coefficient according to a first rule. The first rule relates to the relationship between one or more kinds of reference information and the shooting difficulty coefficient. For example, the first rule may specify that when the illumination intensity is less than 30 lux, the shooting difficulty coefficient is 0.1; when the illumination intensity is between 30 lux and 100 lux, the shooting difficulty coefficient is 0.3; and when the illumination intensity is greater than 100 lux, the shooting difficulty coefficient is 0.6. For another example, the first rule may specify that when the number of historical identification failures is greater than 10, the shooting difficulty coefficient is 0.6; when the number is between 3 and 10, the shooting difficulty coefficient is 0.3; and when the number is less than 3, the shooting difficulty coefficient is 0.1.
In some embodiments, the parameter acquisition module may determine one shooting difficulty coefficient based on each of a plurality of kinds of reference information. The parameter acquisition module may further determine a final shooting difficulty coefficient based on the plurality of shooting difficulty coefficients, for example, by summing, weighted summing, or averaging them. For example, based on an illumination intensity of 40 lux, 7 historical recognition failures, and a good user credit record, the parameter acquisition module may determine the three difficulty coefficients to be 0.3, 0.3, and 0.3, and obtain a final shooting difficulty coefficient of (0.3 + 0.3 + 0.3)/3 = 0.3.
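By way of illustration only, the following Python sketch reproduces the first rule and the averaging step described above. The thresholds follow the examples in the text; the function names and the assumption that a good credit record maps to a coefficient of 0.3 are illustrative only.

```python
def difficulty_from_illumination(lux: float) -> float:
    # First-rule thresholds taken from the illumination example above.
    if lux < 30:
        return 0.1
    if lux < 100:
        return 0.3
    return 0.6

def difficulty_from_failures(num_failures: int) -> float:
    # First-rule thresholds for the number of historical identification failures.
    if num_failures > 10:
        return 0.6
    if num_failures > 3:
        return 0.3
    return 0.1

def final_difficulty(coefficients) -> float:
    # The text allows summing, weighted summing, or averaging; averaging is used here.
    return sum(coefficients) / len(coefficients)

# Example from the text: 40 lux, 7 historical failures, good credit record
# (a credit-based coefficient of 0.3 is assumed for the good record).
coefficients = [difficulty_from_illumination(40), difficulty_from_failures(7), 0.3]
print(final_difficulty(coefficients))  # -> 0.3
```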
In some embodiments, the parameter acquisition module may also determine the shooting difficulty coefficient through a difficulty coefficient determination model. Specifically, the input of the difficulty coefficient determination model is the reference information, and its output is the shooting difficulty coefficient. In some embodiments, the difficulty coefficient determination model may include, but is not limited to, a deep neural network model, a recurrent neural network model, and the like.
Further, the parameter obtaining module may determine at least one shooting parameter based on the shooting difficulty coefficient. For example, the larger the shooting difficulty coefficient is, the larger the shooting angle is, the farther the shooting distance is, and the farther the shooting center is from the screen center of the client. As described above, when the reference information shows that the client user has a high possibility of cheating and/or has a low difficulty of cheating, the shooting difficulty coefficient may be high. By setting the value of at least one shooting parameter, the difficulty of placing the target object in the shooting frame by the user can be improved, and a higher shooting difficulty coefficient is realized.
In some embodiments, the parameter acquisition module may determine the at least one shooting parameter based on a second rule. The second rule relates to the relationship between the shooting difficulty coefficient and at least one shooting parameter. For example, when the shooting difficulty coefficient is 0.1, the shooting parameters may include a shooting angle of 0°, a shooting distance of 0.1 m, and a shooting center point coinciding with the center point of the screen; when the shooting difficulty coefficient is 0.6, the shooting parameters may include a shooting angle of 40°, a shooting distance of 0.3 m, and a shooting center point 0.05 m above the center point of the screen.
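By way of illustration only, the following Python sketch shows one possible encoding of the second rule. The two entries mirror the examples above; the lookup structure and the handling of intermediate coefficients are assumptions of this sketch.

```python
# Center offsets are expressed in meters relative to the screen center point.
SECOND_RULE = {
    0.1: {"angle_deg": 0.0,  "distance_m": 0.1, "center_offset_m": (0.0, 0.0)},
    0.6: {"angle_deg": 40.0, "distance_m": 0.3, "center_offset_m": (0.0, 0.05)},
}

def shooting_parameters_for(difficulty: float) -> dict:
    # Pick the rule entry whose difficulty coefficient is closest to the input;
    # how intermediate coefficients are handled is not specified in the text.
    nearest = min(SECOND_RULE, key=lambda k: abs(k - difficulty))
    return SECOND_RULE[nearest]

print(shooting_parameters_for(0.6))
```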
In some embodiments, the parameter acquisition module may further determine at least one photographing parameter according to the photographing parameter determination model. Specifically, the input of the shooting parameter determination model is a shooting difficulty coefficient, and the output is at least one shooting parameter. In some embodiments, the shooting parameter determination model may include, but is not limited to, a deep neural network model, a recurrent neural network model, and the like. In some embodiments, the difficulty coefficient determination model and the shooting parameter determination model may be different layers of the same model.
In some embodiments, the at least one shooting parameter includes a plurality of shooting parameters, and the parameter acquisition module may acquire the plurality of shooting parameters in the same or in different manners. For example, the parameter acquisition module may randomly generate each of the plurality of shooting parameters. For another example, some of the parameters may be randomly generated while the others are determined based on the reference information.
Step 220, sending a shooting instruction to the client. In some embodiments, step 220 may be performed by an instruction sending module of the server.
The shooting instruction is an instruction instructing the client to display a shooting frame according to the shooting parameters. For example, the photographing instruction may include at least one photographing parameter and be transmitted to the client via the network by the instruction transmitting module. Further, the client may display the photographing frame based on the at least one photographing parameter. For the description of displaying the shooting frame by the client based on the at least one shooting parameter, refer to step 620, which is not described herein again.
In some embodiments, the capture instructions may also include a comparison template image for further instructing the client to display the comparison template image within the capture frame. The contrast template image is a reference image that can guide the user to adjust the position of the target object and place the target object in the photographing frame. For example, in face recognition, the contrast template image may be a real or virtual face image. The shooting instruction may instruct the client to display a face image in the shooting frame to guide the user to place the face in the shooting frame for shooting. For a detailed description of the comparison template image, reference may be made to other parts of the description, such as fig. 4 and its associated description.
At step 230, at least one image is received from the client. Step 230 may be performed by an image receiving module of the server.
In some embodiments, the image receiving module may receive at least one image from the client over the network. Alternatively, the client may send the at least one image to a storage device for storage, and the image receiving module may obtain the at least one image from the storage device. The at least one image may or may not contain the target object. The at least one image may be captured by the image acquisition element of the client, or may be determined based on data (e.g., video or images) uploaded by a user.
In some embodiments, the at least one image may be a real image taken by an image capture element of the client when the client is not hijacked. For example, the client may display a photographing frame and/or a comparison template image based on the photographing instruction, and guide the user to photograph the face. The user adjusts the face position under the guidance of the client shooting frame and/or the comparison template image, so that the face is positioned in the shooting frame, and presses the shooting key to shoot the face image.
When the client is hijacked, the hijacker may upload images or videos through the client device. The uploaded image or video may or may not contain the target object. The uploaded images or videos may be historical images or videos captured by this client or another client, or synthesized images or videos. The client or another computing device (e.g., the server 110) may determine the at least one image based on the uploaded image or video. For example, a hijacked client may extract at least one image from the uploaded image or video. In this case, the at least one image is a false image uploaded by the hijacker, rather than a real image shot while the client displays the shooting frame and/or the comparison template image. It will be appreciated that when the client is hijacked, the target object in the image will typically lie at least partially outside the shooting frame. For example, if the target object is the user's face, then when the client is hijacked or attacked, the user's face in the image received by the image receiving module is usually not completely located within the shooting frame.
In some embodiments, the image receiving module may pre-process at least one image. For example, the pre-processing may include one or more of target detection, quality analysis, image segmentation, image denoising, image translation, and the like. In some embodiments, the pre-processing may include at least one of target detection, quality analysis, and image segmentation.
Object detection is used to determine whether an image contains an object. For example, the target object is a user face, the target detection can identify the image, and if the user face is identified to exist in the image, the image contains the target object; if the user's face is not present in the image, the image does not include the target object. In some embodiments, the image receiving module may reject an image not containing the object from the at least one image based on a result of the object detection. For example, the user may shoot images without human faces due to misoperation, and the images are rejected, so that the calculation amount and the calculation time of subsequent authenticity analysis can be reduced, and the analysis efficiency is improved. In some embodiments, target detection may be performed based on a target detection algorithm.
In some embodiments, the target detection may be implemented based on a target detection model (e.g., a face detection model). The target detection model may include, but is not limited to, a Visual Geometry Group (VGG) network model, an Inception Net model, a Fully Convolutional Network (FCN) model, a segmentation network model, a Mask Region-based Convolutional Neural Network (Mask R-CNN) model, and the like. In some embodiments, the image receiving module may use a plurality of labeled images as training data when training the target detection model based on a machine learning algorithm (e.g., a gradient descent algorithm). Alternatively, the target detection model may be trained in another device or module.
Quality analysis is used to determine whether an image meets the quality requirements. For example, the quality of an image may be measured in terms of one or more image parameters such as noise ratio, brightness, resolution, contrast, and sharpness. The quality analysis may evaluate one or more image parameters of the image to determine whether the quality of the image meets the requirements. For example, if the resolution of the image is greater than 1024×768, the image meets the quality requirement; if the resolution of the image is less than 1024×768, the image does not meet the quality requirement. In some embodiments, the image receiving module may remove an image that does not meet the quality requirement from the at least one image based on the result of the quality analysis, so as to reduce the calculation amount of the subsequent authenticity analysis and improve the analysis efficiency.
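By way of illustration only, the following Python sketch shows a resolution-based quality check consistent with the example above; other image parameters (noise ratio, brightness, contrast, sharpness) would be checked in a similar way. The function name and the use of NumPy arrays are assumptions of this sketch.

```python
import numpy as np

def meets_quality_requirement(image: np.ndarray,
                              min_width: int = 1024,
                              min_height: int = 768) -> bool:
    # Only the resolution criterion from the example is checked here.
    height, width = image.shape[:2]
    return width >= min_width and height >= min_height
```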
In some embodiments, the quality analysis may be implemented based on a quality analysis model. For example, the quality analysis model may receive an input image and output a value characterizing the image quality, or a determination of whether the image quality meets the quality requirement. In some embodiments, the quality analysis model may be, but is not limited to, a combination of one or more of a convolutional neural network model, a recurrent neural network model, and a long short-term memory network model.
The image segmentation may be used to segment the area within the shooting frame from the image (referred to as the segmented image corresponding to the shooting frame). In some embodiments, the image receiving module may segment the segmented image from the image based on the at least one shooting parameter. On the one hand, image segmentation reduces the interference of content outside the shooting frame with the authenticity analysis and improves the accuracy of target identification. On the other hand, the segmented image corresponding to the shooting frame reduces the calculation amount of the subsequent authenticity analysis and improves the calculation efficiency.
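By way of illustration only, the following Python sketch shows the segmentation of the region inside a rectangular shooting frame. The mapping from shooting parameters to the pixel rectangle frame_rect is application-specific and is assumed here rather than specified by the disclosure.

```python
import numpy as np

def segment_capture_frame(image: np.ndarray, frame_rect: tuple) -> np.ndarray:
    # frame_rect = (x, y, w, h): the shooting frame's position in pixel coordinates,
    # derived from the shooting parameters and the client's screen-to-image mapping.
    x, y, w, h = frame_rect
    return image[y:y + h, x:x + w].copy()
```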
In some embodiments, any one or more of the above described target detection, quality analysis and image segmentation may be performed in any order or simultaneously. For example, the image receiving module may perform target detection on the image, screen out an image containing a target object, and then perform quality analysis on the image containing the target object; or the quality analysis can be carried out on the image firstly, and after the image meeting the quality requirement is screened out, the target detection can be carried out on the image meeting the quality requirement.
The authenticity of the at least one image is determined based on the at least one photographing parameter, step 240. Step 240 may be performed by the authenticity determination module.
In some embodiments, the authenticity of the at least one image comprises a first authenticity and/or a second authenticity of each image.
The first authenticity may reflect whether the image is an image shot by the client based on the shooting instruction. For example, when the terminal is not hijacked or attacked, the client displays the shooting frame, the user moves the target object based on the displayed shooting frame until it lies within the shooting frame, and the image is shot; in this case, the image has the first authenticity. For another example, when the terminal is hijacked or attacked, the image is obtained from an image or video uploaded by the attacker; in this case, the image does not have the first authenticity. In some embodiments, the first authenticity of the images may be used to determine whether the image acquisition element of the client is hijacked by an attacker. For example, if at least one of the plurality of images does not have the first authenticity, the image acquisition element of the client is hijacked. For another example, if more than a preset number of the plurality of images do not have the first authenticity, the image acquisition element of the client is hijacked.
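By way of illustration only, the following Python sketch shows one possible way to combine the per-image first-authenticity results into a hijacking decision, covering both the "at least one image" and the "more than a preset number" criteria mentioned above. The function name and parameterization are assumptions of this sketch.

```python
def camera_hijacked(first_authenticity_flags, max_fake_images: int = 0) -> bool:
    # first_authenticity_flags: one boolean per received image, True if that image
    # has the first authenticity. max_fake_images = 0 reproduces the "at least one
    # image lacks first authenticity" rule; a larger preset number reproduces the
    # "more than a preset number" rule.
    fakes = sum(1 for ok in first_authenticity_flags if not ok)
    return fakes > max_fake_images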
The second authenticity may reflect whether the image is an image of one of the at least one qualifying objects. For example, if the image is of a qualified target, the image has a second authenticity, otherwise the image does not have the second authenticity.
In some embodiments, the image authenticity determination module may determine authenticity of the at least one image based on the pre-processed at least one image. For example, the image authenticity determination module may determine the first and/or second authenticity of the preprocessed at least one image as the first and/or second authenticity of the at least one image. In some embodiments, image segmentation processing may be performed on each of the at least one image to generate a segmented image corresponding to the frame. The image authenticity determination module may determine authenticity of at least one image based on the at least one segmented image.
For a detailed description of determining the authenticity of an image, reference may be made to fig. 5, which is not described in detail herein.
In some embodiments of the present description, the user is guided to position the target object within a shooting frame displayed by the client, and the resulting image is acquired from the client. The authenticity of the image can then be judged based on the shooting parameters of the shooting frame, which makes it possible to effectively determine whether the client has been hijacked and thereby ensure the authenticity of the image. In addition, in some embodiments, the difficulty coefficient may be determined based on the reference information, and the shooting parameters of the shooting frame may then be determined based on the difficulty coefficient. For example, for a user with a large number of past cheating attempts, shooting parameters corresponding to a high difficulty coefficient can be set, increasing the difficulty of cheating. Setting different shooting parameters for different users improves the accuracy of the authenticity judgment as well as the applicability and flexibility of target identification.
Fig. 3 is a flow diagram illustrating sending a shooting instruction to a client according to some embodiments of the present description.
As described in step 220, the instruction sending module may send a shooting instruction including the comparison template image to the client, instructing the client to display the comparison template image in the shooting frame. In some embodiments, the shooting instruction may be generated and transmitted using the process 300 shown in fig. 3. The process 300 may include the following steps.
In step 310, a template image of the target object is obtained. In some embodiments, step 310 may be performed by an instruction sending module of the server.
The template image is an image of the target object generated based on standard shooting parameters. For example, if the target object is a human face, the template image may be a face template image. In some embodiments, the template image may be obtained by accessing a storage device, by external input, by invoking an associated interface, or in other ways. Alternatively, the template image may be generated by the instruction sending module. For example, the instruction sending module may determine the position information of at least one key point of the target object based on a standard image set of the target object. The standard image set of the target object is a set of a plurality of standard images containing the target object, where a standard image of the target object is an image that meets the standard conditions. Illustratively, the standard conditions may include that the target object faces the image acquisition element, the size of the target object in the image is 50 mm × 50 mm, and the distance between the target object and the image acquisition element is 0.4 m. For example, the standard image set may include a plurality of face images that meet these standard conditions.
The keypoints of the target object may comprise representative parts of the target object. For example, the target object is a human face, and the key points may be eyes, a nose, a mouth, and the like of the human face in the standard image. In some embodiments, the keypoints may include one or more of the center of the left eye, the center of the right eye, the center of the nose, the left mouth corner, the right mouth corner, the center of the mouth, etc., of the face in the standard image. In some embodiments, the keypoints may also be any location in the standard image. For example, the key point may be the center position of the standard image. The location information of the keypoints can characterize their locations in the plurality of standard images. For example, the position information of the keypoints may be the average position coordinates of the keypoints in the standard image. Taking the left-eye center as an example, the instruction sending module may determine coordinates of the left-eye center in each standard image, and determine an average coordinate of the left-eye center in the plurality of standard images, as the position information of the left-eye center.
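By way of illustration only, the following Python sketch computes a key point's position information as the average of its coordinates over the standard image set, as described above. The key point detector that produces the per-image coordinates is assumed and not shown.

```python
import numpy as np

def average_keypoint_position(keypoint_coords) -> tuple:
    # keypoint_coords: (x, y) coordinates of the same key point (e.g., the left-eye
    # center) in each image of the standard image set.
    coords = np.asarray(keypoint_coords, dtype=float)
    return tuple(coords.mean(axis=0))

# e.g., left-eye center found at these positions in three standard images:
print(average_keypoint_position([(120, 95), (122, 94), (121, 96)]))  # (121.0, 95.0)
```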
The standard shooting parameters are parameters for generating a template image of the target object. For example, the standard photographing parameters may include one or more of a standard photographing angle, a standard photographing distance, a standard photographing center point, and the like. The standard shooting angle refers to a standard value of the shooting angle. For example, the standard photographing angle may be a photographing angle of 0 °. The standard shooting distance refers to a standard value of the shooting distance. For example, the standard photographing distance may be a photographing distance of 0.1 m. The standard photographing center point is a standard position point of the photographing center point. For example, the standard photographing center point may be a position point of the center of the standard image or the like.
Further, the instruction transmitting module may generate a template image of the target object based on the at least one standard photographing parameter and the position information of the at least one key point. For example, the instruction sending module may generate a simulated face image meeting the standard shooting parameters based on the position information of the face key points, so as to serve as the face template image. For another example, the instruction sending module may adjust a standard face image according to the position information of the at least one key point and the standard shooting parameter, so as to generate the face template image. For example only, the key points such as the center of the left eye and the center of the right eye in the standard face image may be adjusted in position according to the corresponding position information; the face orientation, size and position in the standard face image can be adjusted based on the standard shooting angle, the standard shooting distance and the standard shooting center point respectively. In some embodiments, the instruction sending module may directly acquire a standard image set of the target object that meets the standard shooting parameters, and determine the template image based on the standard image set. For example, the instruction transmitting module may arbitrarily select one standard image from the standard image set as the template image. For another example, the instruction sending module may determine the location information of at least one keypoint based on the set of standard images and determine the template image based on the location information of the keypoint and the set of standard images (or a portion thereof).
And 320, adjusting the template image based on at least one shooting parameter to generate a contrast template image. In some embodiments, step 320 may be performed by an instruction sending module of the server.
By way of example only, as shown in fig. 4, assume that the template image is a face image that conforms to the standard shooting parameters. The standard shooting parameters comprise a standard shooting angle of 0°, a standard shooting distance of 0.1 m, and a standard shooting center point (namely, the center point of the client's screen). If the shooting parameters include a shooting angle of 30°, a shooting distance of 0.3 m, and a shooting center position at the upper left corner of the screen (x1, y1), the instruction sending module may first reduce the template image 410 based on the ratio of the standard shooting distance 0.1 m to the shooting distance 0.3 m to obtain a first comparison template image 420; then rotate the first comparison template image 420 by the shooting angle of 30° to obtain a second comparison template image 430; and then move the center of the second comparison template image 430 to the shooting center position (x1, y1) to obtain the comparison template image 440. It should be understood that the above rotation, scaling, and movement of the template image may be performed in any order or simultaneously, and are not limited thereto.
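The scaling, rotation, and translation above can be composed into a single affine transform. The sketch below, using OpenCV, is one possible implementation under assumed parameter names and units; it is not the patent's prescribed method:

```python
import cv2

def make_comparison_template(template, std_distance, shoot_distance,
                             shoot_angle_deg, shoot_center_xy, screen_hw):
    """Scale, rotate and translate a template image according to the shooting parameters."""
    h, w = template.shape[:2]
    scale = std_distance / shoot_distance                 # e.g. 0.1 m / 0.3 m -> shrink to 1/3
    # rotation by the shooting angle plus scaling, both around the template's own center
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), shoot_angle_deg, scale)
    # extra translation that moves the template center onto the shooting center point
    cx, cy = shoot_center_xy
    M[0, 2] += cx - w / 2.0
    M[1, 2] += cy - h / 2.0
    screen_h, screen_w = screen_hw
    return cv2.warpAffine(template, M, (screen_w, screen_h))  # rendered on a screen-sized canvas

# e.g. comparison = make_comparison_template(face_template, 0.1, 0.3, 30.0, (120, 160), (1280, 720))
```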
In some embodiments, the instruction sending module may generate the contrast template image based on any one or more of a photographing angle, a photographing distance, a photographing center point, and/or other photographing parameters. For example, the instruction sending module adjusts the template image based on the shooting center point, and then adjusts the adjusted template image based on the shooting angle again.
Step 330, sending a shooting instruction to the client, wherein the shooting instruction instructs the client to display the comparison template image in the shooting frame. In some embodiments, step 330 may be performed by an instruction sending module of the server.
In some embodiments, the instruction sending module may send the template image and the photographing parameters to the client. The client can adjust the template image according to the shooting parameters to generate a comparison template image. The client can further display the shooting frame and the contrast template image in the shooting frame.
FIG. 5 is an exemplary flow diagram illustrating the determination of image authenticity according to some embodiments of the present description. In some embodiments, the flow 500 illustrated in fig. 5 may be performed by an authenticity determination module of a server.
Step 510, determining a first authenticity of the at least one image based on the at least one photographing parameter.
As described previously, the first authenticity of an image reflects whether the image is an image photographed by the client based on the photographing instruction. In some embodiments, when the target object is included in the shooting frame of the image, the image may be considered as an image shot by the client after the user adjusted the target object into the shooting frame while the shooting frame was displayed, that is, the at least one image has the first authenticity.
In some embodiments, the authenticity determination module may determine, from the image, a segmented image corresponding to the shooting frame according to the at least one shooting parameter. The authenticity determination module may further detect whether at least a portion of the target object (e.g., a representative part or a contour) is included in the segmented image, thereby determining whether the target object is included in the shooting frame of the image. For example, if the target object is the face of a user and the segmented image is detected to contain the user's facial features or facial contour, the shooting frame of the at least one image contains the user's face, and the image can be determined to have the first authenticity.
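A simple way to perform this check is to crop the shooting-frame region and run an off-the-shelf face detector on it. The sketch below uses OpenCV's Haar cascade and assumes a rectangular frame given as (x, y, width, height); both choices are illustrative assumptions:

```python
import cv2

def frame_contains_face(image, frame_rect):
    """Return True if the region of the shooting frame contains a detectable face."""
    x, y, w, h = frame_rect
    segmented = image[y:y + h, x:x + w]               # segmented image corresponding to the frame
    gray = cv2.cvtColor(segmented, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0  # True suggests the image has the first authenticity
```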
Step 520, determining a second authenticity of the at least one image based on the at least one photographing parameter and the at least one reference image of the at least one qualified target.
When the target object contained in the image is a qualified target object, the image has the second authenticity. A qualified target object is a target object that has been verified in advance. For example, in a ride-hailing application scenario, the qualified target object may be a driver's face that was verified by the ride-hailing platform during driver registration. For another example, in a face payment application scenario, the qualified target object may be the pupil of a payer whose payment authority has been verified by the payment platform.
The reference image is an image containing a qualified target object. In some embodiments, the reference image may be pre-stored in a storage device, and the authenticity determination module may retrieve the reference image from the storage device over a network. In some embodiments, the authenticity determination module may determine the second authenticity of the image based on each reference image in the storage device. For example, in a ride-hailing application scenario, the authenticity determination module may determine the second authenticity of the image received from the client based on the reference images of all drivers in the storage device. Alternatively, the authenticity determination module may retrieve the reference image of the user corresponding to the client based on the identification information of the client, and determine the second authenticity of the image based on that user's reference image. For example, the authenticity determination module may retrieve from the storage device the reference image of the driver bound to the client, based on the identification of the client, in order to confirm the second authenticity of the image received from the client.
In some embodiments, for each of the at least one image, the authenticity determination module may generate a first image corresponding to the image and a second image corresponding to each of the at least one reference image based on the at least one shooting parameter. The first image and the second image correspond to the same or similar shooting parameters. Taking a human face as an example, when a client user takes an image, the user may rotate or move the head in order to place the face in the shooting frame, so the captured image corresponds to the shooting parameters of the shooting frame. The reference image, in contrast, is usually photographed under preset shooting parameters. For example, the reference image of a driver on a ride-hailing platform may be a face identification photo taken under preset parameters. To avoid different shooting parameters affecting the authenticity judgment result, the at least one image and the at least one reference image need to be normalized so that they correspond to the same or similar shooting parameters.
In some embodiments, for each of the at least one image, the authenticity determination module may treat the image, or the segmented image corresponding to the shooting frame in the image, as the first image. The authenticity determination module may adjust each of the at least one reference image based on the at least one shooting parameter (or a portion thereof) to obtain at least one second image. For example, if the shooting parameters include a shooting angle of 15°, the authenticity determination module may adjust the reference image so that the angle between the reference image and the length direction of the screen is 15°. For another example, if the shooting parameters include a shooting center point with position coordinates (25, 25), the authenticity determination module may move the center of the reference image to the coordinates (25, 25). As yet another example, if the shooting distance of the reference image is 0.1 m and the shooting parameters include a shooting distance of 0.5 m, the authenticity determination module may reduce the reference image by a factor of 5. The authenticity determination module may take the adjusted reference image as the corresponding second image.
In some embodiments, the authenticity determination module may treat each of the at least one reference image as a second image. For each of the at least one image, the authenticity determination module may adjust the image, or its corresponding segmented image, based on the at least one shooting parameter (or a portion thereof) to generate the first image. For example, if the shooting parameters include a shooting angle of 15° and the reference image was photographed at a shooting angle of 0°, the authenticity determination module may rotate the image by -15° so that the adjusted shooting angle of the image is the same as that of the reference image. The authenticity determination module may take the adjusted image as the corresponding first image. In some embodiments, for each of the at least one image, the authenticity determination module determines the segmented image corresponding to the shooting frame and then adjusts the segmented image to generate the first image. In some embodiments, the authenticity determination module may adjust the at least one image and the at least one reference image, respectively, so that the first image and the second image both correspond to standard or otherwise identical shooting parameters.
Further, for each of the at least one image, the authenticity determination module may determine the second authenticity of the image based on the similarity between its corresponding first image and the at least one second image. For example, the authenticity determination module may determine the similarity between a first image feature of the first image and a second image feature of each second image to determine the second authenticity of the image. The image features may include color features, texture features, shape features, depth features, and the like, or any combination thereof. The similarity between the first image feature and the second image feature can be calculated as a vector similarity, for example, using the Euclidean distance, the Manhattan distance, the cosine similarity, or the like. If the similarity between the first image feature of the first image and the second image feature of a second image exceeds a certain threshold, the first image and the second image can be considered images of the same object; that is, the image corresponding to the first image is an image of a qualified target object and has the second authenticity. If the similarity between the first image feature of the first image and the second image features of all the second images does not exceed the threshold, the image corresponding to the first image is considered not to have the second authenticity.
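For instance, using cosine similarity, the decision could look like the following sketch (the 0.8 threshold and the helper name are illustrative assumptions):

```python
import numpy as np

def has_second_authenticity(first_feature, second_features, threshold=0.8):
    """Return True if the first image's feature matches any second image's feature."""
    f = np.asarray(first_feature, dtype=float)
    f = f / np.linalg.norm(f)
    for feature in second_features:
        g = np.asarray(feature, dtype=float)
        similarity = float(f @ (g / np.linalg.norm(g)))   # cosine similarity
        if similarity > threshold:
            return True                                   # image of a qualified target object
    return False
```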
In some embodiments, the authenticity determination module may process the first image and each of the at least one second image with an image comparison model to determine the second authenticity of the image. For example, the authenticity determination module may input the first image together with one second image into the image comparison model, and the image comparison model may output the similarity between the first image and the second image and/or a determination of whether they are similar. For purposes of illustration, FIG. 6 shows an exemplary block diagram of an image comparison model. As shown, the image comparison model 600 may include a feature extraction layer 605, a similarity calculation layer 608, and a discrimination layer 610.
In some embodiments, one first image and one second image may constitute an image pair. The image comparison model 600 may analyze the image pair to determine whether the first image and the second image in the pair are similar. For example, as shown in FIG. 6, an image pair consisting of a first image 603-m and a second image 604-n may be input into the image comparison model 600.
The feature extraction layer 605 may be used to process the first image and the second image to obtain a first image feature 606 of the first image 603-m and a second image feature 607 of the second image 604-n. In some embodiments, the feature extraction layer 605 may be a convolutional neural network model such as ResNet, ResNeXt, SE-Net, DenseNet, MobileNet, ShuffleNet, RegNet, EfficientNet, or Inception, or a recurrent neural network model. In some embodiments, the first image 603-m and the second image 604-n may be concatenated before being input to the feature extraction layer 605, in which case the output of the feature extraction layer 605 may be a feature vector that concatenates the first image feature 606 of the first image 603-m and the second image feature 607 of the second image 604-n.
The similarity calculation layer 608 may be used to determine a similarity 609 between the first image feature 606 and the second image feature 607. The discrimination layer 610 may be configured to output, based on the similarity 609, a determination result of whether the first image 603-m and the second image 604-n are similar. For example, the discrimination layer 610 may compare the similarity 609 with a similarity threshold: the first image 603-m and the second image 604-n are similar if the similarity between the first image feature 606 and the second image feature 607 exceeds the similarity threshold. In some embodiments, the second authenticity of the image 601-m corresponding to the first image 603-m may be determined based on the determination results 611 of whether the first image 603-m is similar to each second image 604 corresponding to each reference image 602. For example, if the first image 603-m is not similar to any of the second images 604, the image 601-m does not have the second authenticity; if the first image 603-m is similar to the second image 604-1, the image 601-m has the second authenticity.
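A possible realization of this three-part structure is a Siamese-style network with a shared backbone; the sketch below uses PyTorch, and the ResNet-18 backbone and the 0.5 threshold are assumptions rather than choices fixed by this disclosure:

```python
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class ImageComparisonModel(nn.Module):
    """Feature extraction layer + similarity calculation layer + discrimination layer."""

    def __init__(self, threshold=0.5):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()               # feature extraction layer: 512-d features
        self.feature_extractor = backbone
        self.threshold = threshold

    def forward(self, first_image, second_image):
        f1 = self.feature_extractor(first_image)          # first image feature
        f2 = self.feature_extractor(second_image)         # second image feature
        similarity = F.cosine_similarity(f1, f2, dim=1)   # similarity calculation layer
        is_similar = similarity > self.threshold          # discrimination layer
        return similarity, is_similar

# usage: similarity, same_object = ImageComparisonModel()(first_batch, second_batch)
```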
In some embodiments, the authenticity determination module may input multiple image pairs of first and second images into the image comparison model 600 together, and the image comparison model 600 may output the similarity determination result for each of the image pairs simultaneously. In some embodiments, the image comparison model 600 is a machine learning model with preset parameters, i.e., model parameters learned during the training of the machine learning model. Taking a neural network as an example, the model parameters include weights and biases. The preset parameters of the image comparison model 600 are generated by a training process. For example, the model acquisition module may train an initial image comparison model on a plurality of labeled training samples to obtain the image comparison model.
The training samples include one or more sample image pairs with labels. Each sample image pair includes a first sample image and a second sample image. Wherein the first sample image and the second sample image may be images of the same or different objects. The label of the training sample may indicate whether the first sample image and the second sample image are similar (or are pictures of the same object).
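A minimal training sketch over such labelled sample pairs might look as follows (the loss, the logit scaling, and the hyper-parameters are assumptions for illustration only):

```python
import torch
import torch.nn as nn

def train_image_comparison_model(model, labelled_pairs, epochs=5, lr=1e-4):
    """Train an initial image comparison model on labelled sample image pairs.

    labelled_pairs yields (first_sample, second_sample, label) batches, where
    label is 1.0 if both samples show the same object and 0.0 otherwise.
    """
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for first, second, label in labelled_pairs:
            similarity, _ = model(first, second)
            loss = loss_fn(similarity * 10.0, label)   # scale cosine similarity into a logit
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
    return model
```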
In some embodiments, the image comparison model may be pre-trained by the processing device or a third party and stored in the storage device, and the processing device may invoke it directly from the storage device.
In some embodiments, the authenticity determination module may determine the authenticity of the image based only on the first authenticity. For example, if the at least one image has the first authenticity, the at least one image is considered authentic and passes target identification. In some embodiments, the authenticity determination module may determine the authenticity of the image based on both the first authenticity and the second authenticity. For example, it may first determine that the at least one image has the first authenticity and then determine whether the target object in the at least one image is a qualified target object; if the image also has the second authenticity, the at least one image is considered authentic and passes target identification. The first authenticity analysis is based on simple features of the image and is simpler and requires fewer computational resources than the second authenticity analysis. By performing only the first authenticity analysis, or performing the second authenticity analysis only after judging that the image has the first authenticity, the efficiency of target identification can be improved, the steps of target identification are simplified, and the waste of computing resources is reduced (for example, computing resources are not spent on a second authenticity analysis of false pictures uploaded by hijackers).
In some embodiments, the authenticity determination module may determine the authenticity of the image based directly on the second authenticity. For example, if the at least one image has the second authenticity, the at least one image is considered authentic and passes target identification.
In some embodiments, the authenticity determination module may select different target identification modes based on the reference information. For example, if the user's number of historical cheating attempts is greater than a specific threshold, the authenticity determination module may select a mode that first determines the first authenticity and then determines the second authenticity. For another example, if the user's number of historical cheating attempts is 0, the authenticity determination module may select a mode that directly determines the second authenticity.
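In other words, the mode selection can be a simple rule over the reference information; the sketch below uses a hypothetical cheating-count threshold of 3:

```python
def choose_identification_mode(historical_cheating_count, threshold=3):
    """Pick a target identification mode from the reference information."""
    if historical_cheating_count == 0:
        return "second_authenticity_only"          # directly determine the second authenticity
    if historical_cheating_count > threshold:
        return "first_then_second_authenticity"    # first authenticity, then second authenticity
    return "first_then_second_authenticity"
```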
FIG. 7 is an exemplary flow chart illustrating a method of object recognition applied to a client according to some embodiments of the present description. As shown in fig. 7, the process 700 includes the following steps.
Step 710, receiving a shooting instruction from a server. In some embodiments, step 710 may be performed by an instruction receiving module of the client.
As described above, the shooting instruction is an instruction instructing the client to display a shooting frame. In some embodiments, the shooting instruction may include shooting parameters related to the shooting frame, for example, the shape, size, position, and display parameters of the shooting frame. In some embodiments, the shooting instruction may further include a comparison template image of the target object, instructing the client to display the comparison template image within the shooting frame. In some embodiments, the shooting instruction includes a template image of the target object, and the client may generate a comparison template image based on the template image and the shooting parameters. For a detailed description of the shooting instruction, reference may be made to other parts of the present application, for example, step 220.
And 720, displaying a shooting frame based on the at least one shooting parameter. In some embodiments, step 720 may be performed by a camera box display module of the client.
As described above, the photographing frame refers to a specific area displayed on a screen of a client (e.g., the client 130) in which a user of the client can be guided to place a target object when photographing.
In some embodiments, the shooting parameters may include any parameters related to the shape, size, position, display manner, and the like of the shooting frame. The shooting frame display module may generate a shooting frame based on the at least one shooting parameter and instruct the client to display the shooting frame within the screen display area. For example, the shooting frame display module may generate a shooting frame based on the shape, size, and position parameters, and instruct the client to display the shooting frame in a particular manner (e.g., with a particular color or blinking frequency) based on the display parameters. In some embodiments, the client may obtain a preset shooting frame with a specific shape, size, position, and the like, and the shooting frame display module may rotate, zoom, or translate the preset shooting frame based on the shooting parameters to generate the shooting frame. The process of adjusting the shooting frame based on the shooting parameters is similar to the process of adjusting the template image based on the shooting parameters described in FIGs. 3 and 4, and is not repeated here.
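On the client side, a rotated, scaled, and repositioned frame can then be drawn on the camera preview from these parameters; the sketch below treats the frame as an OpenCV rotated rectangle and is only one possible rendering under assumed parameter names:

```python
import cv2
import numpy as np

def draw_shooting_frame(preview, center_xy, size_wh, angle_deg, color=(0, 255, 0)):
    """Overlay the shooting frame on the camera preview image."""
    rect = (tuple(center_xy), tuple(size_wh), angle_deg)    # rotated-rectangle description
    corners = cv2.boxPoints(rect).astype(np.int32)          # the frame's four corner points
    cv2.polylines(preview, [corners], isClosed=True, color=color, thickness=2)
    return preview
```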
In some embodiments, the shooting frame display module may instruct the client to display the comparison template image within the shooting frame, similar to FIG. 4. When the client displays the comparison template image, the user can align the contour of the target object with the contour of the comparison template image, which improves the accuracy of target identification. For example, if the target object is the user's face, the user may align the face contour with the face image template (i.e., the comparison template image). Additionally or alternatively, the shooting frame display module may instruct the client to display at least one keypoint of the target object in the shooting frame or the comparison template image. Taking the user's face as an example, the left eye, right eye, nose tip, left mouth corner, and right mouth corner may be displayed in the face image template, and the user may align these parts of the face with the corresponding keypoints in the face image template.
In some embodiments, the shooting frame display module may directly acquire the contrast template image and display the contrast template image on the screen of the client. At this time, the edge of the comparison template image may be regarded as a photographing frame.
At step 730, at least one captured image is acquired based on the image capture element. In some embodiments, step 730 may be performed by an image acquisition module.
The captured image is an image acquired by the image capture element of the client. A captured image may or may not contain the target object. In some embodiments, the image acquisition module may acquire the at least one captured image based on a video captured by the image capture element. Specifically, the image acquisition module may extract at least one frame from the video as the at least one captured image. For example, the image acquisition module may randomly extract n frames from the video taken by the image capture element as captured images. For another example, the image acquisition module may first recognize the video frames in which the target object is inside the shooting frame, and extract n such frames as captured images.
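Frame extraction from the captured video could be as simple as the following sketch (random sampling with OpenCV; the helper name and the default of three frames are assumptions):

```python
import random
import cv2

def sample_captured_images(video_path, n=3):
    """Randomly extract n frames from the captured video as captured images."""
    capture = cv2.VideoCapture(video_path)
    total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for index in sorted(random.sample(range(total), min(n, total))):
        capture.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = capture.read()
        if ok:
            frames.append(frame)
    capture.release()
    return frames
```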
In some embodiments, the image acquisition module may instruct the image capture element to acquire a captured image based on a confirmation instruction. The confirmation instruction is a shooting instruction triggered by the user through a confirmation operation, for example, a manual input operation or a voice input operation. Specifically, after the user adjusts the target object into the shooting frame, the user may trigger the confirmation instruction through the confirmation operation, and after the image acquisition module receives the confirmation instruction, the image capture element may take an image. In some embodiments, the image acquisition module may automatically instruct the image capture element to take an image, for example, when a target object (such as a human face) is detected within the shooting frame.
Step 740, sending the at least one captured image to the server to determine the authenticity of the at least one captured image. In some embodiments, step 740 may be performed by the image sending module.
In some embodiments, the image sending module may send the at least one captured image acquired by the image capture element to the server via the network as the image(s) on which the authenticity judgment (e.g., the determination of the second authenticity) is performed. In some embodiments, if the client is hijacked, the hijacker may upload an image or video through the client device after receiving the shooting instruction; in that case, step 730 may be omitted, and the client sends the image or video uploaded by the hijacker to the server, which judges its first authenticity and/or second authenticity. In some embodiments, the image sending module preprocesses the captured image and sends the preprocessed captured image to the server for further analysis. The preprocessing of the captured image is similar to the preprocessing of the image by the image receiving module of the server and is not repeated here. For a detailed description of the server's authenticity judgment of the at least one captured image, refer to FIG. 5 and its related description.
In some embodiments, the client may also receive the authenticity judgment result of the captured image from the server through the network. Optionally, the client may display guidance information based on the authenticity judgment result. The guidance information prompts the user about the next operation based on the authenticity judgment result and may include voice information, text information, image information, and the like. For example, in a ride-hailing application scenario, when the authenticity judgment result reflects that the target passes identification, i.e., the platform verifies that the order-receiving driver A is the driver user registered on the platform, the guidance information may be the voice message "Check passed, please start driving." For another example, in a face payment application scenario, when the authenticity judgment result reflects that the target fails identification, the guidance information may be the text message "Check failed, please re-identify" displayed in the screen display area of the client.
In some embodiments, the client may further determine the guidance information based on reference information, where the reference information may include shooting environment information, the user's historical behavior, the user's personal information, and the like. For example, in a ride-hailing application scenario, if the authenticity judgment result indicates that the target fails identification, the client may determine, based on the shooting environment information "light intensity < 10 lux", that the guidance information is "Check failed, please move to a brighter environment and try again." For another example, in a face payment application scenario, when the authenticity judgment result reflects that the target fails identification, the client may determine, based on the user's historical behavior "number of historical target identification failures caused by user cheating > 10", that the guidance information is "Verification failed, please perform identification in person." By determining the guidance information based on the reference information, some embodiments of the present description can guide, prompt, or warn users with different target identification intentions and different operation behaviors, improving the pertinence and effectiveness of the guidance information and thereby the accuracy of target identification.
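The mapping from the judgment result and the reference information to guidance information can be a small rule table; the thresholds and wording below are illustrative assumptions:

```python
def guidance_message(passed, light_lux=None, cheat_failures=0):
    """Choose guidance information from the authenticity result and reference information."""
    if passed:
        return "Check passed, please start driving."
    if light_lux is not None and light_lux < 10:
        return "Check failed, please move to a brighter environment and try again."
    if cheat_failures > 10:
        return "Verification failed, please perform identification in person."
    return "Check failed, please re-identify."
```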
The beneficial effects that may be brought by the embodiments of the present description include, but are not limited to: (1) the user is guided by the shooting frame displayed on the client to adjust the target object into the shooting frame to obtain an image, and the authenticity of the image is judged based on the shooting parameters of the shooting frame, which can effectively determine whether the client is hijacked and/or whether the image shows a qualified target object; (2) the difficulty coefficient is determined based on the reference information, and the shooting parameters of the shooting frame are then determined based on the difficulty coefficient, so that different shooting parameters can be set for different scenarios, improving the accuracy of the authenticity judgment as well as the applicability and flexibility of target identification; (3) the first authenticity analysis is based on simple features of the image and is simpler and cheaper in computational resources than the second authenticity analysis; by performing only the first authenticity analysis, or performing the second authenticity analysis only after judging that the image has the first authenticity, the efficiency of target identification can be improved, the steps of target identification are simplified, and the waste of computing resources is reduced (for example, computing resources are not spent on a second authenticity analysis of false pictures uploaded by hijackers); (4) selecting different modes for target identification based on the reference information can improve the adaptability and efficiency of target identification.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Some embodiments use numerals describing quantities of components, attributes, and the like; it should be understood that such numerals used in the description of the embodiments are in some instances modified by the terms "about", "approximately", or "substantially". Unless otherwise indicated, "about", "approximately", or "substantially" indicates that the number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and employ a general digit-preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments are approximations, in specific examples such numerical values are set forth as precisely as practicable.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents are hereby incorporated by reference into this specification, except for any application history document that is inconsistent with or conflicts with the contents of this specification, and any document that limits the broadest scope of the claims of this specification (whether currently or later appended to this specification). It is to be understood that if the descriptions, definitions, and/or use of terms in the materials accompanying this specification are inconsistent with or contrary to those in this specification, the descriptions, definitions, and/or use of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (10)

1. A method of object recognition, comprising:
acquiring at least one shooting parameter related to a shooting frame;
sending a shooting instruction to a client, wherein the shooting instruction instructs the client to display the shooting frame based on the at least one shooting parameter;
receiving at least one image from the client; and
determining authenticity of the at least one image based on the at least one photographing parameter.
2. The method of claim 1, further comprising:
acquiring a template image of a target object; and
adjusting the template image based on the at least one photographing parameter to generate a contrast template image, wherein the photographing instruction further instructs the client to display the contrast template image within the photographing frame.
3. The method of claim 1, wherein the obtaining at least one photographing parameter related to the photographing frame comprises:
randomly generating the at least one photographing parameter; or
determining the at least one photographing parameter based on reference information, wherein the determining the at least one photographing parameter based on the reference information includes: determining a photographing difficulty coefficient based on the reference information; and determining the at least one photographing parameter based on the photographing difficulty coefficient.
4. The method of claim 1, wherein the determining the authenticity of the at least one image based on the at least one photographing parameter comprises:
preprocessing the at least one image to generate at least one preprocessed image; and
determining authenticity of the at least one image based on the at least one photographing parameter and the preprocessed at least one image.
5. The method of claim 4, wherein pre-processing the at least one image comprises performing at least one of the following operations on each of the at least one image:
performing target detection on the image to determine whether the image contains a target object;
performing quality analysis on the image to determine whether the image meets quality requirements; or
performing image segmentation on the image to generate a segmented image corresponding to the shooting frame.
6. The method of claim 1, wherein the determining the authenticity of the at least one image based on the at least one photographing parameter comprises at least one of:
determining a first authenticity of the at least one image based on the at least one shooting parameter, the first authenticity reflecting whether the at least one image is an image shot by the client based on a shooting instruction;
determining a second authenticity of the at least one image based on the at least one photographing parameter and at least one reference image of at least one qualified object, the second authenticity reflecting whether the at least one image is an image of one of the at least one qualified object.
7. A method of object recognition, comprising:
receiving a photographing instruction from a server, the photographing instruction including at least one photographing parameter related to a photographing frame;
displaying the shooting frame based on the at least one shooting parameter;
acquiring at least one shot image based on the image acquisition element; and
sending the at least one shot image to the server to determine the authenticity of the at least one shot image.
8. The method of claim 7, wherein the photographing instruction further comprises a template image of the target object, the method further comprising:
displaying a contrast template image of the target object in the shooting frame;
receiving an authenticity judgment result of the shot image from the server; and
displaying guidance information based on the authenticity judgment result.
9. A system for object recognition, comprising:
the parameter acquisition module is used for acquiring at least one shooting parameter related to the shooting frame;
the instruction sending module is used for sending a shooting instruction to a client, and the shooting instruction is used for instructing the client to display the shooting frame based on the at least one shooting parameter;
an image receiving module for receiving at least one image from the client; and
an authenticity determination module for determining authenticity of the at least one image based on the at least one photographing parameter.
10. A system for object recognition, performed by a client, comprising:
an instruction receiving module, configured to receive a shooting instruction from a server, where the shooting instruction includes at least one shooting parameter related to a shooting frame;
the shooting frame display module is used for displaying the shooting frame based on the at least one shooting parameter;
the image acquisition module is used for acquiring at least one shot image based on the image acquisition element; and
the image sending module is used for sending the at least one shot image to the server so as to determine the authenticity of the at least one shot image.
CN202110424645.8A 2021-04-20 2021-04-20 Target identification method and system Pending CN113158893A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110424645.8A CN113158893A (en) 2021-04-20 2021-04-20 Target identification method and system
PCT/CN2022/087915 WO2022222957A1 (en) 2021-04-20 2022-04-20 Method and system for identifying target

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110424645.8A CN113158893A (en) 2021-04-20 2021-04-20 Target identification method and system

Publications (1)

Publication Number Publication Date
CN113158893A true CN113158893A (en) 2021-07-23

Family

ID=76869050

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110424645.8A Pending CN113158893A (en) 2021-04-20 2021-04-20 Target identification method and system

Country Status (2)

Country Link
CN (1) CN113158893A (en)
WO (1) WO2022222957A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022222957A1 (en) * 2021-04-20 2022-10-27 北京嘀嘀无限科技发展有限公司 Method and system for identifying target

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102018208604A1 (en) * 2018-05-30 2019-12-05 Siemens Aktiengesellschaft Determining a recording behavior of a recording unit
CN112183270A (en) * 2020-09-18 2021-01-05 支付宝实验室(新加坡)有限公司 Method and device for optimizing shooting parameters of identity authentication and electronic equipment
CN113158893A (en) * 2021-04-20 2021-07-23 北京嘀嘀无限科技发展有限公司 Target identification method and system

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105590043A (en) * 2014-10-22 2016-05-18 腾讯科技(深圳)有限公司 Authentication method, device and system
WO2018086262A1 (en) * 2016-11-08 2018-05-17 华为技术有限公司 Method for acquiring photographing reference data, mobile terminal and server
CN107566654A (en) * 2017-09-27 2018-01-09 广东欧珀移动通信有限公司 Solve lock control method and Related product
US20210227151A1 (en) * 2018-09-21 2021-07-22 Huawei Technologies Co., Ltd. Photographing Method, Apparatus, And Device
CN111159749A (en) * 2019-12-31 2020-05-15 Oppo广东移动通信有限公司 Photographing method, photographing device, mobile terminal and computer readable storage medium
CN111161259A (en) * 2019-12-31 2020-05-15 支付宝(杭州)信息技术有限公司 Method and device for detecting whether image is tampered or not and electronic equipment
CN111654624A (en) * 2020-05-29 2020-09-11 维沃移动通信有限公司 Shooting prompting method and device and electronic equipment
CN111881844A (en) * 2020-07-30 2020-11-03 北京嘀嘀无限科技发展有限公司 Method and system for judging image authenticity
CN112312021A (en) * 2020-10-30 2021-02-02 维沃移动通信有限公司 Shooting parameter adjusting method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AI Hong et al.: "The algorithm research of infrared small target detection and recognition", Proceedings of 2013 2nd International Conference on Measurement, Information and Control *
LIU Gang et al.: "Design and Implementation of a Person Re-identification System", Military Communication Technology, no. 01

Also Published As

Publication number Publication date
WO2022222957A1 (en) 2022-10-27

Similar Documents

Publication Publication Date Title
KR102406432B1 (en) Identity authentication methods and devices, electronic devices and storage media
CN103383723B (en) Method and system for spoof detection for biometric authentication
JP5076563B2 (en) Face matching device
CN109255299A (en) Identity identifying method and device, electronic equipment and storage medium
CN106982426A (en) A kind of method and system for remotely realizing old card system of real name
CN103678984A (en) Method for achieving user authentication by utilizing camera
KR102145132B1 (en) Surrogate Interview Prevention Method Using Deep Learning
CN109858375A (en) Living body faces detection method, terminal and computer readable storage medium
KR102593624B1 (en) Online Test System using face contour recognition AI to prevent the cheating behaviour and method thereof
CN107609364A (en) User identification confirmation method and apparatus
CN108140114A (en) Iris recognition
CN108805005A (en) Auth method and device, electronic equipment, computer program and storage medium
CN113111806A (en) Method and system for object recognition
CN109492555A (en) Newborn identity identifying method, electronic device and computer readable storage medium
KR102146552B1 (en) Non face to face authentication system
CN113111810B (en) Target identification method and system
KR101725219B1 (en) Method for digital image judging and system tereof, application system, and authentication system thereof
WO2022222957A1 (en) Method and system for identifying target
CN113111807A (en) Target identification method and system
KR20200080533A (en) Apparatus and Method for Recognizing Fake Face By Using Minutia Data Variation
KR102581415B1 (en) UBT system using face contour recognition AI to prevent the cheating behaviour and method thereof
KR20230110681A (en) Online Test System using face contour recognition AI to prevent the cheating behaviour by using a front camera of examinee terminal and an auxiliary camera and method thereof
KR20230007250A (en) UBT system using face contour recognition AI and method thereof
CN108764033A (en) Auth method and device, electronic equipment, computer program and storage medium
CN111767845B (en) Certificate identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination