WO2021027329A1 - Information push method, device, and computer equipment based on image recognition - Google Patents

Information push method, device, and computer equipment based on image recognition

Info

Publication number
WO2021027329A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
feature vector
tag
target
palmprint
Prior art date
Application number
PCT/CN2020/087489
Other languages
English (en)
French (fr)
Inventor
夏新
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021027329A1 publication Critical patent/WO2021027329A1/zh


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 Facial expression recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to an information push method, device, and computer equipment based on image recognition.
  • At present, when recommending users of the same type to a target user, clustering is usually performed based on a single user characteristic (such as hobbies, e.g., hiking, football, or basketball), and the users located in the same cluster as the target user are recommended as the target user's similar users.
  • The inventor realized that classifying users by clustering and then recommending similar users may result in some cluster containing a large number of users, which lowers the accuracy of similar-user screening and makes it impossible to accurately obtain the similar users closest to the target user.
  • The embodiments of the present application provide an information push method, device, computer equipment, and storage medium based on image recognition, aiming to solve the prior-art problem that classifying users by clustering and then recommending similar users may result in some cluster containing a large number of users, lowering the accuracy of similar-user screening and making it impossible to accurately obtain the similar users closest to the target user.
  • In a first aspect, an embodiment of the present application provides an information push method based on image recognition, which includes:
  • obtaining the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculating the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library, and taking the micro-expression feature vector in the library whose similarity value to the micro-expression recognition feature vector is the maximum as the target micro-expression feature vector;
  • In a second aspect, an embodiment of the present application provides an information push device based on image recognition, which includes:
  • the palmprint vector obtaining unit is used to receive the palm part picture uploaded by the uploader, and obtain the palmprint recognition vector corresponding to the palm part picture through palmprint recognition;
  • the first target vector acquiring unit, configured to calculate the similarity between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library, and take the palmprint feature vector in the library whose similarity value to the palmprint recognition vector is the maximum as the target palmprint feature vector;
  • a first label acquiring unit configured to acquire the user's first label corresponding to the target palmprint feature vector as the first label corresponding to the palm part picture;
  • the target picture obtaining unit is configured to receive the facial video uploaded by the uploader, and preprocess the facial video by optical flow method to obtain the target picture in the facial video;
  • the second target vector obtaining unit, used to obtain the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculate the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library, and take the micro-expression feature vector in the library whose similarity value to the micro-expression recognition feature vector is the maximum as the target micro-expression feature vector;
  • a second tag obtaining unit configured to obtain a user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video;
  • a tag combination obtaining unit configured to combine the first tag and the second tag to obtain a user tag combination corresponding to the user of the uploading terminal;
  • the list sending unit, configured to obtain, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold as a target tag combination set, and to send the users corresponding to the target tag combination set to the uploader as a recommended user list.
  • In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor, where the processor, when executing the computer program, implements the image recognition-based information push method described in the first aspect.
  • In a fourth aspect, the embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the image recognition-based information push method described in the first aspect.
  • the embodiments of the present application provide an information push method, device, computer equipment, and storage medium based on image recognition.
  • The method obtains the first tag based on the palm picture and the second tag from the facial video, and obtains similar users from the user tag library as recommended users according to the user tag combination composed of the first tag and the second tag, which enables more accurate matching of similar users for recommendation.
  • FIG. 1 is a schematic diagram of an application scenario of an information push method based on image recognition provided by an embodiment of the application;
  • FIG. 2 is a schematic flowchart of an image recognition-based information push method provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of a sub-flow of an image recognition-based information push method provided by an embodiment of the application;
  • FIG. 4 is a schematic diagram of another sub-flow of the image recognition-based information push method provided by an embodiment of the application;
  • FIG. 5 is a schematic diagram of another sub-flow of the image recognition-based information push method provided by an embodiment of the application;
  • FIG. 6 is a schematic diagram of another sub-flow of the image recognition-based information push method provided by an embodiment of the application;
  • FIG. 7 is a schematic block diagram of an image recognition-based information push device provided by an embodiment of the application.
  • FIG. 8 is a schematic block diagram of subunits of an image recognition-based information push device provided by an embodiment of this application.
  • FIG. 9 is a schematic block diagram of another subunit of the image recognition-based information push device provided by an embodiment of the application.
  • FIG. 10 is a schematic block diagram of another subunit of the image recognition-based information push device provided by an embodiment of the application.
  • FIG. 11 is a schematic block diagram of another subunit of the image recognition-based information push device provided by an embodiment of the application.
  • FIG. 12 is a schematic block diagram of a computer device provided by an embodiment of the application.
  • Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of an application scenario of the image recognition-based information push method provided by an embodiment of the application; FIG. 2 is a schematic flowchart of the image recognition-based information push method provided by an embodiment of the application.
  • the information push method based on image recognition is applied to the server, and the method is executed by application software installed in the server.
  • the method includes steps S110 to S180.
  • The first is the server, which has the following functions: first, receiving the palm picture uploaded by the uploader to obtain the first tag; second, receiving the facial video uploaded by the uploader to obtain the second tag; and third, obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination (the user tag combination is obtained by combining the first tag and the second tag) exceeds a preset similarity threshold as a target tag combination set, and sending the users corresponding to the target tag combination set to the uploader as a recommended user list.
  • The second is the uploader, which is used to upload palm pictures or facial videos to the server.
  • the user can first perform palm print recognition on the palm part through the uploading terminal, and then the palm print recognition vector corresponding to the palm part picture can be obtained.
  • In an embodiment, as shown in FIG. 3, step S110 includes:
  • S111: Perform palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
  • S112: Obtain the feature vector of the palmprint region-of-interest picture through a convolutional neural network as the target palmprint feature vector.
  • In this embodiment, the user can use the camera of the uploader (such as a smartphone) to take a palm picture, and then perform palm segmentation based on skin color detection to obtain the palmprint region of interest (ROI) of the palm.
  • The principle of palm segmentation based on skin color detection to obtain the palmprint region of interest is as follows: the difference between human skin color and the background color is used to separate the palm from the background.
  • In specific implementation, the palm picture is first converted from the RGB space to the YCrCb space, and skin color segmentation is performed in the YCrCb space to obtain the palm contour image; the palmprint region of interest is then extracted according to the feature points on the palm contour image.
  • Specifically, based on the spatial distribution characteristics of skin color, the skin color can be accurately separated out to obtain the palm contour image.
  • The palmprint region of interest can then be extracted from the palm contour image by an extraction method based on fingertip points, as sketched below.
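  • For illustration, the following is a minimal Python/OpenCV sketch of this skin-color segmentation step; the Cr/Cb bounds are commonly used skin-color values assumed for the example, not values specified by this application.

```python
import cv2
import numpy as np

def palm_contour_from_skin(palm_bgr: np.ndarray) -> np.ndarray:
    """Separate the palm from the background by skin color in YCrCb space."""
    ycrcb = cv2.cvtColor(palm_bgr, cv2.COLOR_BGR2YCrCb)
    # Commonly used skin-color bounds on the Cr/Cb channels (illustrative
    # assumption, not fixed by the patent); Y is left unconstrained.
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Keep the largest connected skin region as the palm contour.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea)
```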
  • In an embodiment, as shown in FIG. 4, step S111 includes:
  • S1111: Convert the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
  • S1112: Screen out the first valley point between the index finger and the middle finger and the second valley point between the ring finger and the little finger in the converted picture, and obtain the current length of the current line between the two valley points and the current deflection angle between the current line and the X axis;
  • S1113: Obtain the midpoint of the current line as the current midpoint, and obtain the current perpendicular of the current line through the current midpoint, so as to obtain the target point on the current perpendicular that extends toward the palm and whose distance from the current midpoint is one half of the current length;
  • S1114: Rotate the converted picture counterclockwise by the current deflection angle, and take a square area of current length × current length with the target point as the center, to obtain the palmprint region-of-interest picture of the palm.
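  • The following is a minimal geometric sketch of steps S1111 to S1114, assuming the two valley points have already been located (valley detection itself is not shown); the assumption that the palm lies below the valley-to-valley line in image coordinates is the example's, not this application's.

```python
import math
import cv2
import numpy as np

def palmprint_roi(converted: np.ndarray, valley1, valley2) -> np.ndarray:
    """Crop the palmprint ROI given the two finger-valley points (x, y)."""
    (x1, y1), (x2, y2) = valley1, valley2
    length = math.hypot(x2 - x1, y2 - y1)               # current length L
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))  # deflection from X axis
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0           # current midpoint

    # Rotate the picture so the valley-to-valley line becomes horizontal.
    h, w = converted.shape[:2]
    rot = cv2.getRotationMatrix2D((mx, my), angle, 1.0)
    rotated = cv2.warpAffine(converted, rot, (w, h))

    # Target point: L/2 from the midpoint toward the palm (image y grows down).
    cx, cy = int(mx), int(my + length / 2.0)
    half = int(length / 2.0)
    return rotated[cy - half:cy + half, cx - half:cx + half]  # L x L square
```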
  • After the palmprint region of interest is obtained, the preprocessing of the palm picture is complete; the palmprint recognition vector can then be extracted from the palmprint region of interest.
  • A recognition algorithm based on image transformation can be used to extract the palmprint recognition vector: a Fourier transform is applied to the palmprint region of interest to obtain its amplitude-frequency response.
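  • As a sketch of this transform step, combined with the concentric-ring summation described in the detailed description below (which yields an 8-dimensional palmprint recognition vector), one possible implementation is:

```python
import numpy as np

def palmprint_vector(roi_gray: np.ndarray, n_rings: int = 8) -> np.ndarray:
    """Fourier-transform the palmprint ROI and sum the magnitude over
    concentric rings about the spectrum centre, giving an 8-D vector."""
    spectrum = np.fft.fftshift(np.fft.fft2(roi_gray))
    magnitude = np.abs(spectrum)            # amplitude-frequency response
    h, w = magnitude.shape
    yy, xx = np.indices(magnitude.shape)
    radius = np.hypot(yy - h / 2.0, xx - w / 2.0)
    step = radius.max() / n_rings
    # Sum the magnitudes ring by ring, from the innermost circle outward.
    feats = [magnitude[(radius >= i * step) & (radius < (i + 1) * step)].sum()
             for i in range(n_rings)]
    return np.asarray(feats)                # the palmprint recognition vector
```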
  • When calculating the similarity value between the palmprint recognition vector and each palmprint feature vector in the palmprint feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
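  • Both measures are standard; a minimal sketch follows, with the choice of Pearson similarity for the library lookup being an illustrative assumption:

```python
import numpy as np

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

def pearson_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Pearson correlation coefficient between the two vectors, in [-1, 1].
    return float(np.corrcoef(a, b)[0, 1])

def best_match(query: np.ndarray, library: dict[str, np.ndarray]) -> str:
    """Return the key of the library vector most similar to the query
    (here: highest Pearson similarity, one of the two options above)."""
    return max(library, key=lambda k: pearson_similarity(query, library[k]))
```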
  • The pre-built palmprint feature vector library stores a plurality of palmprint feature vectors extracted in advance (for example, all 8-dimensional column vectors), and each feature vector is preset with a corresponding user first tag.
  • With this data foundation, the palmprint feature vector in the palmprint feature vector library most similar to the palmprint recognition vector can be determined as the target palmprint feature vector.
  • After the palmprint feature vector with the maximum similarity value to the palmprint recognition vector is obtained as the target palmprint feature vector, the user first tag corresponding to it can be obtained as the first tag corresponding to the palm picture; for example, the first tag corresponding to the palm picture is attribute A.
  • S140: Receive the facial video uploaded by the uploader, and preprocess the facial video using the optical flow method to obtain the target picture in the facial video.
  • After the user's facial video is captured by the uploader's camera, micro-expression analysis needs to be performed on it. In specific implementation, micro-expression analysis can be performed by the optical flow method to obtain the target picture in the facial video.
  • In an embodiment, as shown in FIG. 5, step S140 includes:
  • S141: Acquire the velocity vector feature corresponding to each pixel of each frame of picture in the facial video;
  • S142: If the velocity vector feature of at least one frame of picture in the facial video does not keep changing continuously, take the corresponding picture as the target picture in the facial video.
  • When the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, hence the name optical flow.
  • The optical flow expresses the change of the image, contains information about the target's motion, and can be used to determine the target's motion.
  • Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts carrying optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
  • Optical flow is defined on the basis of points. Specifically, let (u, v) be the optical flow of image point (x, y); then (x, y, u, v) is called an optical flow point.
  • The set of all optical flow points is called the optical flow field.
  • When an object with optical characteristics moves in three-dimensional space, a corresponding image motion field, or image velocity field, is formed on the image plane.
  • In the ideal case, the optical flow field corresponds to the motion field.
  • Each pixel in the image is assigned a velocity vector, forming a motion vector field. Based on the velocity vector feature of each pixel, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the entire image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vectors formed by the moving object necessarily differ from those of the background, so the position of the moving object can be calculated. Preprocessing is performed by the optical flow method in this way to obtain the target picture in the facial video, as sketched below.
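  • A minimal sketch of this frame selection, using Farneback dense optical flow as one concrete choice of optical flow method (the application does not fix one) and an illustrative motion threshold:

```python
import cv2
import numpy as np

def target_pictures(frames: list[np.ndarray],
                    motion_thresh: float = 1.0) -> list[np.ndarray]:
    """Pick the frames whose dense optical-flow field departs from the smooth
    background flow, i.e. frames where facial (micro-expression) motion occurs."""
    targets = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense flow: one (u, v) velocity vector per pixel (Farneback's method).
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        # A burst of above-background motion marks a candidate target picture
        # (the threshold value is an illustrative assumption).
        if mag.mean() > motion_thresh:
            targets.append(frame)
        prev = gray
    return targets
```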
  • After the target picture corresponding to the facial video is obtained, the micro-expression recognition feature vector of the target picture can be obtained through a convolutional neural network.
  • For the specific process, refer to obtaining the feature vector of the palmprint region-of-interest picture through the convolutional neural network.
  • When calculating the similarity value between the micro-expression recognition feature vector and each micro-expression feature vector in the micro-expression feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
  • In an embodiment, as shown in FIG. 6, step S150 includes:
  • S151: Preprocess the target picture to obtain a preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture, where preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
  • S152: Input the picture pixel matrix corresponding to the preprocessed picture into the input layer of the convolutional neural network model to obtain feature maps;
  • S153: Input the feature maps into the pooling layer of the convolutional neural network model to obtain the one-dimensional vector corresponding to the feature maps;
  • S154: Input the one-dimensional vector corresponding to the feature maps into the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature maps.
  • grayscale, edge detection, and binarization are sequentially performed on the target picture to obtain the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture.
  • Although a color image contains more information, processing a color image directly slows down execution on the server and increases storage requirements. Grayscale conversion of color images is a basic image-processing method that is widely used in the field of pattern recognition; reasonable grayscale conversion greatly helps the extraction and subsequent processing of image information, saves storage space, and speeds up processing.
  • Edge detection examines the change in gray level of the image's pixels within a certain neighborhood and identifies the points in a digital image where the brightness changes markedly.
  • Image edge detection can greatly reduce the amount of data, eliminate irrelevant information, and preserve the important structural attributes of the image.
  • Many operators can be used for edge detection; besides the commonly used Sobel operator, there are also the Laplacian edge detection operator, the Canny edge detection operator, and so on.
  • To reduce the influence of noise, the edge-detected image needs to be binarized; binarization is a type of image thresholding. According to how the threshold is selected, binarization methods can be divided into the global threshold method, the dynamic threshold method, and the local threshold method.
  • The maximum between-class variance method (also called the Otsu algorithm) is commonly used for thresholding to eliminate pixels with smaller gradient values; after binarization, each pixel value of the image is 0 or 255.
  • At this point, the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture are obtained; a sketch of this preprocessing chain follows.
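  • A minimal OpenCV sketch of the grayscale, edge detection, and Otsu binarization chain, using the Sobel operator as one of the operators named above:

```python
import cv2

def preprocess(target_bgr):
    """Grayscale -> edge detection -> Otsu binarization, in that order,
    returning the preprocessed picture (its pixel matrix)."""
    gray = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2GRAY)  # grayscale conversion
    edges = cv2.Sobel(gray, cv2.CV_8U, 1, 1, ksize=3)    # Sobel edge response
    # Otsu's method picks the global threshold automatically; output is 0 or 255.
    _, binary = cv2.threshold(edges, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```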
  • When obtaining the picture feature vector of a picture, the picture pixel matrix corresponding to the preprocessed picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum values of the feature maps; finally, that one-dimensional vector is input into the fully connected layer to obtain the micro-expression recognition feature vector corresponding to the preprocessed picture.
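  • A minimal PyTorch sketch of this input layer → feature maps → per-map max pooling → fully connected pipeline; all layer sizes are illustrative assumptions, since this application fixes none of them:

```python
import torch
import torch.nn as nn

class MicroExprNet(nn.Module):
    """Minimal conv -> pool -> fully-connected pipeline; the channel count,
    kernel size, and output dimension are illustrative assumptions."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # feature maps
        self.pool = nn.AdaptiveMaxPool2d(1)   # maximum value per feature map
        self.fc = nn.Linear(8, feat_dim)      # micro-expression feature vector

    def forward(self, pixel_matrix: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv(pixel_matrix))  # input layer -> feature maps
        x = self.pool(x).flatten(1)              # 1-D vector of per-map maxima
        return self.fc(x)                        # fully connected layer

# Usage: a 1-channel preprocessed picture as a (batch, 1, H, W) tensor.
vec = MicroExprNet()(torch.randn(1, 1, 48, 48))  # -> shape (1, 64)
```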
  • After the micro-expression feature vector with the maximum similarity value to the micro-expression recognition feature vector is obtained as the target micro-expression feature vector, the user second tag corresponding to the target micro-expression feature vector can be obtained as the second tag corresponding to the facial video; for example, the second tag corresponding to the facial video is attribute B.
  • For example, if the first tag is attribute A and the second tag is attribute B, combining the first tag and the second tag yields attribute A + attribute B, which is used as the user tag combination corresponding to the user of the uploader.
  • In the pre-built user tag library, a tag or tag combination is set for each user; for example, the tag corresponding to user 1 is attribute A, the tag corresponding to user 2 is attribute A + attribute B, the tag corresponding to user 3 is attribute C, ..., and the tag corresponding to user N is attribute C + attribute D.
  • Where the user tag combination is attribute A + attribute B, the tag combinations in the user tag library whose tag similarity value with the user tag combination exceeds the preset similarity threshold, such as attribute A (corresponding to user 1) and attribute A + attribute B (corresponding to user 2), form the target tag combination set.
  • The users corresponding to each target tag combination in the target tag combination set form a recommended user list (for example, including user 1 and user 2), and the recommended user list is sent to the uploader.
  • Each user entry in the recommended user list includes at least a user name, a tag combination, and basic user information (such as gender, home address, and contact number). Using multiple dimensions such as pictures and micro-expressions as the source of user tags allows users to be divided at a finer granularity; moreover, there is no need to cluster users in advance, since a pre-built user tag library suffices as the data foundation for the target user's similar users, reducing the amount of data processing. A sketch of this screening step follows.
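  • A minimal sketch of the screening step; the tag strings and the similarity callable are illustrative assumptions (the edit-distance function sketched further below could serve as the similarity measure):

```python
def recommend(user_tags: str, tag_library: dict, sim_thresh: float,
              similarity) -> list:
    """Collect the users whose tag combination is close enough to the
    uploader's user tag combination.

    tag_library maps user name -> tag combination string, e.g.
    {"user 1": "A", "user 2": "A+B", "user 3": "C"} (illustrative names).
    `similarity` is any tag-similarity function; note that if the string
    edit distance sketched below is used directly, smaller means more
    similar, so the comparison would become `<= sim_thresh` instead.
    """
    return [user for user, tags in tag_library.items()
            if similarity(tags, user_tags) > sim_thresh]
```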
  • In an embodiment, before step S180, the method further includes: obtaining the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
  • The string edit distance is the minimum number of single-character edits (such as substitution, insertion, and deletion) required to change one string into another.
  • For example, changing the string "kitten" into the string "sitting" requires only 3 single-character edits, namely sitten (k→s), sittin (e→i), sitting (_→g); therefore, the string edit distance between "kitten" and "sitting" is 3.
  • Once the string edit distance between each tag or tag combination in the user tag library and the user tag combination is obtained, the tag similarity value between them can be obtained and used as a numerical reference for screening similar users.
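  • A minimal sketch of this edit-distance computation, verified against the "kitten"/"sitting" example above:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character substitutions,
    insertions, and deletions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

assert edit_distance("kitten", "sitting") == 3  # the example above
```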
  • This method obtains the similar users corresponding to the uploader based on the multi-dimensional features of palm pictures and micro-expressions, without clustering a massive number of users in advance, which reduces the amount of data processing; moreover, the multiple dimensions of palm pictures and micro-expressions enable a finer-grained division of user tags.
  • FIG. 7 is a schematic block diagram of an image recognition-based information push device provided by an embodiment of the present application.
  • the image recognition-based information pushing device 100 can be configured in a server.
  • The image recognition-based information pushing device 100 includes a palmprint vector acquiring unit 110, a first target vector acquiring unit 120, a first tag acquiring unit 130, a target picture acquiring unit 140, a second target vector acquiring unit 150, a second tag acquiring unit 160, a tag combination acquiring unit 170, and a list sending unit 180.
  • the palm print vector obtaining unit 110 is configured to receive the palm part picture uploaded by the uploader, and obtain the palm print recognition vector corresponding to the palm part picture through palm print recognition.
  • the user may first perform palmprint recognition on the palm part through the uploading terminal, and then the palmprint recognition vector corresponding to the palm part picture can be obtained.
  • the palmprint vector obtaining unit 110 includes:
  • the skin color detection unit 111 is configured to perform palm segmentation on the palm part image based on skin color detection to obtain a palm print region of interest image of the palm;
  • the region of interest extraction unit 112 is configured to obtain a feature vector of the palmprint region of interest picture through a convolutional neural network, and use it as a target palmprint feature vector.
  • In this embodiment, the user can use the camera of the uploader (such as a smartphone) to take a palm picture, and then perform palm segmentation based on skin color detection to obtain the palmprint region of interest (ROI) of the palm.
  • The principle of palm segmentation based on skin color detection to obtain the palmprint region of interest is as follows: the difference between human skin color and the background color is used to separate the palm from the background.
  • In specific implementation, the palm picture is first converted from the RGB space to the YCrCb space, and skin color segmentation is performed in the YCrCb space to obtain the palm contour image; the palmprint region of interest is then extracted according to the feature points on the palm contour image.
  • Specifically, based on the spatial distribution characteristics of skin color, the skin color can be accurately separated out to obtain the palm contour image.
  • The palmprint region of interest can then be extracted from the palm contour image by an extraction method based on fingertip points.
  • the skin color detecting unit 111 includes:
  • the picture space conversion unit 1111, configured to convert the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
  • the current line acquiring unit 1112, configured to screen out the first valley point between the index finger and the middle finger and the second valley point between the ring finger and the little finger in the converted picture, and obtain the current length of the current line between the two valley points and the current deflection angle between the current line and the X axis;
  • the target point acquiring unit 1113, configured to obtain the midpoint of the current line as the current midpoint, and obtain the current perpendicular of the current line through the current midpoint, so as to obtain the target point on the current perpendicular that extends toward the palm and whose distance from the current midpoint is one half of the current length;
  • the palmprint region-of-interest picture acquiring unit 1114, configured to rotate the converted picture counterclockwise by the current deflection angle, and take a square area of current length × current length with the target point as the center, to obtain the palmprint region-of-interest picture of the palm.
  • In this embodiment, after the palmprint region of interest is obtained, the preprocessing of the palm picture is complete; the palmprint recognition vector is then extracted from the palmprint region of interest.
  • A recognition algorithm based on image transformation can be used to extract the palmprint recognition vector: a Fourier transform is applied to the palmprint region of interest to obtain its amplitude-frequency response.
  • The first target vector acquiring unit 120 is configured to calculate the similarity between the palmprint recognition vector and each palmprint feature vector in the pre-built palmprint feature vector library, and take the palmprint feature vector in the library whose similarity value to the palmprint recognition vector is the maximum as the target palmprint feature vector.
  • When calculating the similarity value between the palmprint recognition vector and each palmprint feature vector in the palmprint feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
  • the first label obtaining unit 130 is configured to obtain the user's first label corresponding to the target palmprint feature vector as the first label corresponding to the palm part picture.
  • The pre-built palmprint feature vector library stores a plurality of palmprint feature vectors extracted in advance (for example, all 8-dimensional column vectors), and each feature vector is preset with a corresponding user first tag.
  • With this data foundation, the palmprint feature vector in the palmprint feature vector library most similar to the palmprint recognition vector can be determined as the target palmprint feature vector.
  • After the palmprint feature vector with the maximum similarity value to the palmprint recognition vector is obtained as the target palmprint feature vector, the user first tag corresponding to it can be obtained as the first tag corresponding to the palm picture; for example, the first tag corresponding to the palm picture is attribute A.
  • the target picture obtaining unit 140 is configured to receive the facial video uploaded by the uploader, and preprocess the facial video by optical flow method to obtain the target picture in the facial video.
  • After the user's facial video is captured by the uploader's camera, micro-expression analysis needs to be performed on it. In specific implementation, micro-expression analysis can be performed by the optical flow method to obtain the target picture in the facial video.
  • the target picture obtaining unit 140 includes:
  • the velocity vector feature acquiring unit 141 is configured to acquire velocity vector features corresponding to each pixel of each frame of the picture in the facial video;
  • the target picture selecting unit 142 is configured to, if the velocity vector feature of at least one frame of pictures in the facial video does not keep changing continuously, use the corresponding picture as the target picture in the facial video.
  • When the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, hence the name optical flow.
  • The optical flow expresses the change of the image, contains information about the target's motion, and can be used to determine the target's motion.
  • Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts carrying optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
  • Optical flow is defined on the basis of points. Specifically, let (u, v) be the optical flow of image point (x, y); then (x, y, u, v) is called an optical flow point.
  • The set of all optical flow points is called the optical flow field.
  • When an object with optical characteristics moves in three-dimensional space, a corresponding image motion field, or image velocity field, is formed on the image plane.
  • In the ideal case, the optical flow field corresponds to the motion field.
  • Each pixel in the image is assigned a velocity vector, forming a motion vector field. Based on the velocity vector feature of each pixel, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the entire image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vectors formed by the moving object necessarily differ from those of the background, so the position of the moving object can be calculated. Preprocessing is performed by the optical flow method in this way to obtain the target picture in the facial video.
  • The second target vector obtaining unit 150 is configured to obtain the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculate the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in the pre-built micro-expression feature vector library, and take the micro-expression feature vector in the library whose similarity value to the micro-expression recognition feature vector is the maximum as the target micro-expression feature vector.
  • After the target picture corresponding to the facial video is obtained, the micro-expression recognition feature vector of the target picture can be obtained through the convolutional neural network.
  • For the specific process, refer to obtaining the feature vector of the palmprint region-of-interest picture through the convolutional neural network.
  • When calculating the similarity value between the micro-expression recognition feature vector and each micro-expression feature vector in the micro-expression feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
  • the second target vector obtaining unit 150 includes:
  • the preprocessing unit 151 is configured to preprocess the target picture to obtain a preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture, where preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
  • the convolution unit 152 is configured to input the picture pixel matrix corresponding to the preprocessed picture to the input layer of the convolutional neural network model to obtain a feature map;
  • the pooling unit 153 is configured to input the feature map to the pooling layer in the convolutional neural network model to obtain a one-dimensional vector corresponding to the feature map;
  • the fully connected unit 154 is used to input the one-dimensional vector corresponding to the feature map to the fully connected layer in the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature map.
  • grayscale, edge detection, and binarization are sequentially performed on the target picture to obtain the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture.
  • Although a color image contains more information, processing a color image directly slows down execution on the server and increases storage requirements. Grayscale conversion of color images is a basic image-processing method that is widely used in the field of pattern recognition; reasonable grayscale conversion greatly helps the extraction and subsequent processing of image information, saves storage space, and speeds up processing.
  • Edge detection examines the change in gray level of the image's pixels within a certain neighborhood and identifies the points in a digital image where the brightness changes markedly.
  • Image edge detection can greatly reduce the amount of data, eliminate irrelevant information, and preserve the important structural attributes of the image.
  • Many operators can be used for edge detection; besides the commonly used Sobel operator, there are also the Laplacian edge detection operator, the Canny edge detection operator, and so on.
  • To reduce the influence of noise, the edge-detected image needs to be binarized; binarization is a type of image thresholding. According to how the threshold is selected, binarization methods can be divided into the global threshold method, the dynamic threshold method, and the local threshold method.
  • The maximum between-class variance method (also called the Otsu algorithm) is commonly used for thresholding to eliminate pixels with smaller gradient values; after binarization, each pixel value of the image is 0 or 255.
  • At this point, the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture are obtained.
  • When obtaining the picture feature vector of a picture, the picture pixel matrix corresponding to the preprocessed picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum values of the feature maps; finally, that one-dimensional vector is input into the fully connected layer to obtain the micro-expression recognition feature vector corresponding to the preprocessed picture.
  • the second tag obtaining unit 160 is configured to obtain a user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video.
  • After the micro-expression feature vector with the maximum similarity value to the micro-expression recognition feature vector is obtained as the target micro-expression feature vector, the user second tag corresponding to the target micro-expression feature vector can be obtained as the second tag corresponding to the facial video; for example, the second tag corresponding to the facial video is attribute B.
  • the tag combination obtaining unit 170 is configured to combine the first tag and the second tag to obtain a user tag combination corresponding to the user of the uploader.
  • For example, if the first tag is attribute A and the second tag is attribute B, combining the first tag and the second tag yields attribute A + attribute B, which is used as the user tag combination corresponding to the user of the uploader.
  • The list sending unit 180 is configured to obtain, from the pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds the preset similarity threshold as a target tag combination set, and to send the users corresponding to the target tag combination set to the uploader as a recommended user list.
  • In the pre-built user tag library, a tag or tag combination is set for each user; for example, the tag corresponding to user 1 is attribute A, the tag corresponding to user 2 is attribute A + attribute B, the tag corresponding to user 3 is attribute C, ..., and the tag corresponding to user N is attribute C + attribute D.
  • Where the user tag combination is attribute A + attribute B, the tag combinations in the user tag library whose tag similarity value with the user tag combination exceeds the preset similarity threshold, such as attribute A (corresponding to user 1) and attribute A + attribute B (corresponding to user 2), form the target tag combination set.
  • The users corresponding to each target tag combination in the target tag combination set form a recommended user list (for example, including user 1 and user 2), and the recommended user list is sent to the uploader.
  • Each user entry in the recommended user list includes at least a user name, a tag combination, and basic user information (such as gender, home address, and contact number). Using multiple dimensions such as pictures and micro-expressions as the source of user tags allows users to be divided at a finer granularity; moreover, there is no need to cluster users in advance, since a pre-built user tag library suffices as the data foundation for the target user's similar users, reducing the amount of data processing.
  • the image recognition-based information pushing device 100 further includes:
  • the tag similarity value calculation unit, configured to obtain the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
  • The string edit distance is the minimum number of single-character edits (such as substitution, insertion, and deletion) required to change one string into another.
  • For example, changing the string "kitten" into the string "sitting" requires only 3 single-character edits, namely sitten (k→s), sittin (e→i), sitting (_→g); therefore, the string edit distance between "kitten" and "sitting" is 3.
  • Once the string edit distance between each tag or tag combination in the user tag library and the user tag combination is obtained, the tag similarity value between them can be obtained and used as a numerical reference for screening similar users.
  • This device obtains the similar users corresponding to the uploader based on the multi-dimensional features of palm pictures and micro-expressions, without clustering a massive number of users in advance, which reduces the amount of data processing; moreover, the multiple dimensions of palm pictures and micro-expressions enable a finer-grained division of user tags.
  • the above-mentioned image recognition-based information pushing device can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in FIG. 12.
  • FIG. 12 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 is a server, and the server may be an independent server or a server cluster composed of multiple servers.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. When executed, the computer program 5032 can cause the processor 502 to perform the information push method based on image recognition.
  • the processor 502 is used to provide calculation and control capabilities, and support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
  • the processor 502 can execute the information push method based on image recognition.
  • the network interface 505 is used for network communication, such as providing data information transmission.
  • The structure shown in FIG. 12 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 500 to which the solution is applied; the specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
  • the processor 502 is configured to run a computer program 5032 stored in a memory to implement the image recognition-based information push method disclosed in the embodiment of the present application.
  • The embodiment of the computer device shown in FIG. 12 does not constitute a limitation on the specific configuration of the computer device; in other embodiments, the computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
  • For example, in some embodiments, the computer device may include only a memory and a processor; in such embodiments, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 12 and are not repeated here.
  • the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • In another embodiment of the present application, a computer-readable storage medium is provided; the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program, where the computer program is executed by a processor to implement the image recognition-based information push method disclosed in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

This application relates to the field of artificial intelligence, and specifically discloses an information push method, device, computer equipment, and storage medium based on image recognition. A target palmprint feature vector and its corresponding first tag are obtained by recognizing the palm picture uploaded by an uploader; a target micro-expression feature vector and its corresponding second tag are obtained by performing micro-expression analysis on the facial video uploaded by the uploader; the first tag and the second tag are combined to obtain the corresponding user tag combination; the tag combinations in the user tag library whose tag similarity value with the user tag combination exceeds a similarity threshold are obtained as a target tag combination set, and the corresponding users are sent to the uploader as a recommended user list. The method uses the user tag combination composed of the first tag obtained from the palm picture and the second tag obtained from the facial video to obtain similar users from the user tag library based on multi-dimensional features, enabling more accurate matching of similar users for recommendation.

Description

Information push method, device, and computer equipment based on image recognition
This application claims priority to the Chinese patent application filed with the China Patent Office on August 15, 2019, with application number 201910752720.6 and invention title "Information push method, device, and computer equipment based on image recognition", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to an information push method, device, and computer equipment based on image recognition.
Background
At present, when recommending users of the same type to a target user, clustering is generally performed based on a single user characteristic (such as hobbies, e.g., hiking, football, or basketball), and the user information of users located in the same cluster as the target user is recommended as the target user's similar users. The inventor realized that classifying users by clustering and then recommending similar users may result in some cluster containing a large number of users, which lowers the accuracy of similar-user screening and makes it impossible to accurately obtain the similar users closest to the target user.
Summary
The embodiments of this application provide an information push method, device, computer equipment, and storage medium based on image recognition, aiming to solve the prior-art problem that classifying users by clustering and then recommending similar users may result in some cluster containing a large number of users, lowering the accuracy of similar-user screening and making it impossible to accurately obtain the similar users closest to the target user.
In a first aspect, an embodiment of this application provides an information push method based on image recognition, which includes:
receiving a palm picture uploaded by an uploader, and obtaining, through palmprint recognition, the palmprint recognition vector corresponding to the palm picture;
calculating the similarity between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library, and obtaining the palmprint feature vector in the palmprint feature vector library whose similarity value to the palmprint recognition vector is the maximum, as a target palmprint feature vector;
obtaining the user first tag corresponding to the target palmprint feature vector as the first tag corresponding to the palm picture;
receiving a facial video uploaded by the uploader, and preprocessing the facial video by an optical flow method to obtain a target picture in the facial video;
obtaining the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculating the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library, and obtaining the micro-expression feature vector in the micro-expression feature vector library whose similarity value to the micro-expression recognition feature vector is the maximum, as a target micro-expression feature vector;
obtaining the user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video;
combining the first tag and the second tag to obtain a user tag combination corresponding to the user of the uploader; and
obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold as a target tag combination set, and sending the users corresponding to the target tag combination set to the uploader as a recommended user list.
In a second aspect, an embodiment of this application provides an information push device based on image recognition, which includes:
a palmprint vector acquiring unit, configured to receive the palm picture uploaded by the uploader, and obtain the palmprint recognition vector corresponding to the palm picture through palmprint recognition;
a first target vector acquiring unit, configured to calculate the similarity between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library, and take the palmprint feature vector in the palmprint feature vector library whose similarity value to the palmprint recognition vector is the maximum as the target palmprint feature vector;
a first tag acquiring unit, configured to obtain the user first tag corresponding to the target palmprint feature vector as the first tag corresponding to the palm picture;
a target picture acquiring unit, configured to receive the facial video uploaded by the uploader, and preprocess the facial video by the optical flow method to obtain the target picture in the facial video;
a second target vector acquiring unit, configured to obtain the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculate the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library, and take the micro-expression feature vector in the micro-expression feature vector library whose similarity value to the micro-expression recognition feature vector is the maximum as the target micro-expression feature vector;
a second tag acquiring unit, configured to obtain the user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video;
a tag combination acquiring unit, configured to combine the first tag and the second tag to obtain the user tag combination corresponding to the user of the uploader; and
a list sending unit, configured to obtain, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold as a target tag combination set, and send the users corresponding to the target tag combination set to the uploader as a recommended user list.
In a third aspect, an embodiment of this application further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor, where the processor, when executing the computer program, implements the information push method based on image recognition described in the first aspect.
In a fourth aspect, the embodiments of this application further provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the information push method based on image recognition described in the first aspect.
The embodiments of this application provide an information push method, device, computer equipment, and storage medium based on image recognition. The method obtains the first tag corresponding to the palm picture and the second tag corresponding to the facial video, and obtains similar users from the user tag library as recommended users according to the user tag combination composed of the first tag and the second tag, which enables more accurate matching of similar users for recommendation.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario of the information push method based on image recognition provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of the information push method based on image recognition provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a sub-flow of the information push method based on image recognition provided by an embodiment of this application;
FIG. 4 is a schematic diagram of another sub-flow of the information push method based on image recognition provided by an embodiment of this application;
FIG. 5 is a schematic diagram of another sub-flow of the information push method based on image recognition provided by an embodiment of this application;
FIG. 6 is a schematic diagram of another sub-flow of the information push method based on image recognition provided by an embodiment of this application;
FIG. 7 is a schematic block diagram of the information push device based on image recognition provided by an embodiment of this application;
FIG. 8 is a schematic block diagram of a subunit of the information push device based on image recognition provided by an embodiment of this application;
FIG. 9 is a schematic block diagram of another subunit of the information push device based on image recognition provided by an embodiment of this application;
FIG. 10 is a schematic block diagram of another subunit of the information push device based on image recognition provided by an embodiment of this application;
FIG. 11 is a schematic block diagram of another subunit of the information push device based on image recognition provided by an embodiment of this application;
FIG. 12 is a schematic block diagram of the computer device provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative work fall within the protection scope of this application.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of an application scenario of the information push method based on image recognition provided by an embodiment of this application; FIG. 2 is a schematic flowchart of the method. The information push method based on image recognition is applied to a server and is executed by application software installed in the server.
As shown in FIG. 2, the method includes steps S110 to S180.
S110: Receive the palm picture uploaded by the uploader, and obtain the palmprint recognition vector corresponding to the palm picture through palmprint recognition.
In this embodiment, to understand the usage scenario of the technical solution more clearly, the terminals involved are introduced below. In this application, the technical solution is described from the perspective of the server.
The first is the server, which has the following functions: first, receiving the palm picture uploaded by the uploader to obtain the first tag; second, receiving the facial video uploaded by the uploader to obtain the second tag; and third, obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination (obtained by combining the first tag and the second tag) exceeds a preset similarity threshold as a target tag combination set, and sending the users corresponding to the target tag combination set to the uploader as a recommended user list.
The second is the uploader, which is used to upload palm pictures or facial videos to the server.
The user can first perform palmprint recognition on the palm through the uploader to obtain the palmprint recognition vector corresponding to the palm picture.
In an embodiment, as shown in FIG. 3, step S110 includes:
S111: Perform palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
S112: Obtain the feature vector of the palmprint region-of-interest picture through a convolutional neural network as the target palmprint feature vector.
In this embodiment, the user can use the camera of the uploader (such as a smartphone) to take a palm picture, and then perform palm segmentation based on skin color detection to obtain the palmprint region of interest (ROI) of the palm.
The principle of palm segmentation based on skin color detection is as follows: the difference between human skin color and the background color is used to separate the palm from the background. In specific implementation, the palm picture is first converted from the RGB space to the YCrCb space, and skin color segmentation is performed on the palm picture in the YCrCb space to obtain the palm contour image. The palmprint region of interest is then extracted according to the feature points on the palm contour image.
Specifically, based on the spatial distribution characteristics of skin color, the skin color can be accurately separated out to obtain the palm contour image. The palmprint region of interest can then be extracted from the palm contour image by an extraction method based on fingertip points.
In an embodiment, as shown in FIG. 4, step S111 includes:
S1111: Convert the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
S1112: Screen out the first valley point between the index finger and the middle finger and the second valley point between the ring finger and the little finger in the converted picture, and obtain the current length of the current line between the first valley point and the second valley point and the current deflection angle between the current line and the X axis;
S1113: Obtain the midpoint of the current line as the current midpoint according to the current line, and obtain the current perpendicular of the current line according to the current midpoint, so as to obtain the target point on the current perpendicular that extends toward the palm and whose distance from the current midpoint is one half of the current length;
S1114: Rotate the converted picture counterclockwise by the current deflection angle, and take a square area of current length × current length with the target point as the center, to obtain the palmprint region-of-interest picture of the palm.
In this embodiment, the conversion formulas from the RGB space to the YCrCb space are as follows:
Y = 0.299·R + 0.587·G + 0.114·B
Cr = 0.5·R − 0.4187·G − 0.0813·B + 128
Cb = −0.1687·R − 0.3313·G + 0.5·B + 128
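For illustration, the following is a direct per-pixel implementation of the conversion formulas above (a sketch assuming 8-bit RGB input; clipping the result to [0, 255] is the example's choice):

```python
import numpy as np

def rgb_to_ycrcb(rgb: np.ndarray) -> np.ndarray:
    """Apply the RGB-to-YCrCb formulas above per pixel (rgb: H x W x 3, uint8)."""
    r, g, b = (rgb[..., i].astype(np.float32) for i in range(3))
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    return np.clip(np.stack([y, cr, cb], axis=-1), 0, 255).astype(np.uint8)
```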
The process of extracting the palmprint region of interest by the fingertip-point-based extraction method is as follows:
11) Screen out the first valley point A between the index finger and the middle finger and the second valley point B between the ring finger and the little finger, obtaining the length L of line AB and the deflection angle ANGLE between line AB and the X axis;
12) Obtain the midpoint of line AB, draw the perpendicular of line AB through this midpoint, and extend along the perpendicular toward the palm by L/2 to find point C;
13) Rotate the image counterclockwise by −ANGLE, and extract a square area of size L × L centered on point C as the palmprint region of interest.
After the palmprint region of interest is obtained, the preprocessing of the palm picture is complete; the palmprint recognition vector can then be extracted from the palmprint region of interest.
A recognition algorithm based on image transformation can be used to extract the palmprint recognition vector from the palmprint region of interest: a Fourier transform is applied to the palmprint region of interest to obtain its amplitude-frequency response.
Eight concentric circles are drawn on the amplitude-frequency response of the palmprint region of interest, dividing it into 8 regions; the gray values of all pixels in each region are summed to obtain the feature point of the corresponding region, and the feature values are concatenated in order from the inside out, starting from the circle center, to obtain an 8-dimensional column vector, which is the palmprint recognition vector.
When obtaining the picture feature vector of the palm picture, the pixel matrix corresponding to the palm picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain multiple feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum value of each feature map; finally, that one-dimensional vector is input into the fully connected layer to obtain the palmprint recognition vector corresponding to the palm picture.
S120: Calculate the similarity between the palmprint recognition vector and each palmprint feature vector in the pre-built palmprint feature vector library, and obtain the palmprint feature vector in the library whose similarity value to the palmprint recognition vector is the maximum, as the target palmprint feature vector.
In this embodiment, when calculating the similarity value between the palmprint recognition vector and each palmprint feature vector in the palmprint feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
S130: Obtain the user first tag corresponding to the target palmprint feature vector as the first tag corresponding to the palm picture.
In this embodiment, the pre-built palmprint feature vector library stores a plurality of palmprint feature vectors extracted in advance (for example, all 8-dimensional column vectors), and each feature vector is preset with a corresponding user first tag. With this data foundation, the palmprint feature vector in the palmprint feature vector library most similar to the palmprint recognition vector can be determined as the target palmprint feature vector.
After the palmprint feature vector with the maximum similarity value to the palmprint recognition vector is obtained as the target palmprint feature vector, the user first tag corresponding to the target palmprint feature vector can be obtained as the first tag corresponding to the palm picture; for example, the first tag corresponding to the palm picture is attribute A.
S140: Receive the facial video uploaded by the uploader, and preprocess the facial video by the optical flow method to obtain the target picture in the facial video.
In this embodiment, after the user's facial video is captured by the uploader's camera, micro-expression analysis needs to be performed on it. In specific implementation, micro-expression analysis can be performed by the optical flow method to obtain the target picture in the facial video.
In an embodiment, as shown in FIG. 5, step S140 includes:
S141: Acquire the velocity vector feature corresponding to each pixel of each frame of picture in the facial video;
S142: If the velocity vector feature of at least one frame of picture in the facial video does not keep changing continuously, take the corresponding picture as the target picture in the facial video.
In this embodiment, when the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, hence the name optical flow. Optical flow expresses the change of the image, contains information about the target's motion, and can be used to determine the target's motion. Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts carrying optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
Optical flow is defined on the basis of points. Specifically, let (u, v) be the optical flow of image point (x, y); then (x, y, u, v) is called an optical flow point. The set of all optical flow points is called the optical flow field. When an object with optical characteristics moves in three-dimensional space, a corresponding image motion field, or image velocity field, is formed on the image plane. In the ideal case, the optical flow field corresponds to the motion field.
Each pixel in the image is assigned a velocity vector, forming a motion vector field. Based on the velocity vector feature of each pixel, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the entire image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vectors formed by the moving object necessarily differ from those of the background, so the position of the moving object can be calculated. Preprocessing is performed by the optical flow method in this way to obtain the target picture in the facial video.
S150: Obtain the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculate the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in the pre-built micro-expression feature vector library, and obtain the micro-expression feature vector in the library whose similarity value to the micro-expression recognition feature vector is the maximum, as the target micro-expression feature vector.
In this embodiment, after the target picture corresponding to the facial video is obtained, the micro-expression recognition feature vector of the target picture can be obtained through the convolutional neural network; for the specific process, refer to obtaining the feature vector of the palmprint region-of-interest picture through the convolutional neural network.
When calculating the similarity value between the micro-expression recognition feature vector and each micro-expression feature vector in the micro-expression feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
In an embodiment, as shown in FIG. 6, step S150 includes:
S151: Preprocess the target picture to obtain a preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture, where preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
S152: Input the picture pixel matrix corresponding to the preprocessed picture into the input layer of the convolutional neural network model to obtain feature maps;
S153: Input the feature maps into the pooling layer of the convolutional neural network model to obtain the one-dimensional vector corresponding to the feature maps;
S154: Input the one-dimensional vector corresponding to the feature maps into the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature maps.
In this embodiment, grayscale conversion, edge detection, and binarization are performed sequentially on the target picture to obtain the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture.
Although a color image contains more information, processing a color image directly slows down execution on the server and increases storage requirements. Grayscale conversion of color images is a basic image-processing method that is widely used in the field of pattern recognition; reasonable grayscale conversion greatly helps the extraction and subsequent processing of image information, saves storage space, and speeds up processing.
Edge detection examines the change in gray level of the image's pixels within a certain neighborhood and identifies the points in a digital image where the brightness changes markedly. Edge detection can greatly reduce the amount of data, eliminate irrelevant information, and preserve the important structural attributes of the image. Many operators can be used for edge detection; besides the commonly used Sobel operator, there are also the Laplacian edge detection operator, the Canny edge detection operator, and so on.
To reduce the influence of noise, the edge-detected image needs to be binarized; binarization is a type of image thresholding. According to how the threshold is selected, binarization methods can be divided into the global threshold method, the dynamic threshold method, and the local threshold method. The maximum between-class variance method (also called the Otsu algorithm) is commonly used for thresholding to eliminate pixels with smaller gradient values; after binarization, each pixel value of the image is 0 or 255. At this point, the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture are obtained.
When obtaining the picture feature vector of a picture, the picture pixel matrix corresponding to the preprocessed picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum values of the feature maps; finally, that one-dimensional vector is input into the fully connected layer to obtain the micro-expression recognition feature vector corresponding to the preprocessed picture.
S160: Obtain the user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video.
In this embodiment, after the micro-expression feature vector with the maximum similarity value to the micro-expression recognition feature vector is obtained as the target micro-expression feature vector, the user second tag corresponding to the target micro-expression feature vector can be obtained as the second tag corresponding to the facial video; for example, the second tag corresponding to the facial video is attribute B.
S170: Combine the first tag and the second tag to obtain the user tag combination corresponding to the user of the uploader.
In this embodiment, for example, if the first tag is attribute A and the second tag is attribute B, combining the first tag and the second tag yields attribute A + attribute B, which is used as the user tag combination corresponding to the user of the uploader.
S180: Obtain, from the pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds the preset similarity threshold as a target tag combination set, and send the users corresponding to the target tag combination set to the uploader as a recommended user list.
In this embodiment, a tag or tag combination is set for each user in the pre-built user tag library; for example, the tag corresponding to user 1 is attribute A, the tag corresponding to user 2 is attribute A + attribute B, the tag corresponding to user 3 is attribute C, ..., and the tag corresponding to user N is attribute C + attribute D. Where the user tag combination is attribute A + attribute B, the tag combinations in the user tag library whose tag similarity value with the user tag combination exceeds the preset similarity threshold, such as attribute A (corresponding to user 1) and attribute A + attribute B (corresponding to user 2), form the target tag combination set; the users corresponding to each target tag combination in the target tag combination set form a recommended user list (for example, including user 1 and user 2), and the recommended user list is sent to the uploader. Each user entry in the recommended user list includes at least a user name, a tag combination, and basic user information (such as gender, home address, and contact number). Using multiple dimensions such as pictures and micro-expressions as the source of user tags allows users to be divided at a finer granularity. Moreover, there is no need to cluster users in advance; a pre-built user tag library suffices as the data foundation for the target user's similar users, reducing the amount of data processing.
In an embodiment, before step S180, the method further includes:
obtaining the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
In this embodiment, the string edit distance is the minimum number of single-character edits (such as substitution, insertion, and deletion) required to change one string into another. For example, changing the string "kitten" into the string "sitting" requires only 3 single-character edits, namely sitten (k→s), sittin (e→i), sitting (_→g); therefore, the string edit distance between "kitten" and "sitting" is 3. Once the string edit distance between each tag or tag combination in the user tag library and the user tag combination is obtained, the tag similarity value between them can be obtained and used as a numerical reference for screening similar users.
This method obtains the similar users corresponding to the uploader based on the multi-dimensional features of palm pictures and micro-expressions, without clustering a massive number of users in advance, which reduces the amount of data processing; moreover, the multiple dimensions of palm pictures and micro-expressions enable a finer-grained division of user tags.
An embodiment of this application further provides an information push device based on image recognition, which is configured to perform any embodiment of the aforementioned information push method based on image recognition. Specifically, please refer to FIG. 7, which is a schematic block diagram of the information push device based on image recognition provided by an embodiment of this application. The information push device 100 based on image recognition can be configured in a server.
As shown in FIG. 7, the information push device 100 based on image recognition includes a palmprint vector acquiring unit 110, a first target vector acquiring unit 120, a first tag acquiring unit 130, a target picture acquiring unit 140, a second target vector acquiring unit 150, a second tag acquiring unit 160, a tag combination acquiring unit 170, and a list sending unit 180.
The palmprint vector acquiring unit 110 is configured to receive the palm picture uploaded by the uploader, and obtain the palmprint recognition vector corresponding to the palm picture through palmprint recognition.
In this embodiment, the user can first perform palmprint recognition on the palm through the uploader to obtain the palmprint recognition vector corresponding to the palm picture.
In an embodiment, as shown in FIG. 8, the palmprint vector acquiring unit 110 includes:
a skin color detection unit 111, configured to perform palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
a region-of-interest extraction unit 112, configured to obtain the feature vector of the palmprint region-of-interest picture through a convolutional neural network as the target palmprint feature vector.
In this embodiment, the user can use the camera of the uploader (such as a smartphone) to take a palm picture, and then perform palm segmentation based on skin color detection to obtain the palmprint region of interest (ROI) of the palm.
The principle of palm segmentation based on skin color detection is as follows: the difference between human skin color and the background color is used to separate the palm from the background. In specific implementation, the palm picture is first converted from the RGB space to the YCrCb space, and skin color segmentation is performed on the palm picture in the YCrCb space to obtain the palm contour image. The palmprint region of interest is then extracted according to the feature points on the palm contour image.
Specifically, based on the spatial distribution characteristics of skin color, the skin color can be accurately separated out to obtain the palm contour image. The palmprint region of interest can then be extracted from the palm contour image by an extraction method based on fingertip points.
In an embodiment, as shown in FIG. 9, the skin color detection unit 111 includes:
a picture space conversion unit 1111, configured to convert the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
a current line acquiring unit 1112, configured to screen out the first valley point between the index finger and the middle finger and the second valley point between the ring finger and the little finger in the converted picture, and obtain the current length of the current line between the first valley point and the second valley point and the current deflection angle between the current line and the X axis;
a target point acquiring unit 1113, configured to obtain the midpoint of the current line as the current midpoint according to the current line, and obtain the current perpendicular of the current line according to the current midpoint, so as to obtain the target point on the current perpendicular that extends toward the palm and whose distance from the current midpoint is one half of the current length;
a palmprint region-of-interest picture acquiring unit 1114, configured to rotate the converted picture counterclockwise by the current deflection angle, and take a square area of current length × current length with the target point as the center, to obtain the palmprint region-of-interest picture of the palm.
In this embodiment, after the palmprint region of interest is obtained, the preprocessing of the palm picture is complete; the palmprint recognition vector is then extracted from the palmprint region of interest.
A recognition algorithm based on image transformation can be used to extract the palmprint recognition vector from the palmprint region of interest: a Fourier transform is applied to the palmprint region of interest to obtain its amplitude-frequency response.
Eight concentric circles are drawn on the amplitude-frequency response of the palmprint region of interest, dividing it into 8 regions; the gray values of all pixels in each region are summed to obtain the feature point of the corresponding region, and the feature values are concatenated in order from the inside out, starting from the circle center, to obtain an 8-dimensional column vector, which is the palmprint recognition vector.
When obtaining the picture feature vector of the palm picture, the pixel matrix corresponding to the palm picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain multiple feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum value of each feature map; finally, that one-dimensional vector is input into the fully connected layer to obtain the palmprint recognition vector corresponding to the palm picture.
The first target vector acquiring unit 120 is configured to calculate the similarity between the palmprint recognition vector and each palmprint feature vector in the pre-built palmprint feature vector library, and take the palmprint feature vector in the library whose similarity value to the palmprint recognition vector is the maximum as the target palmprint feature vector.
In this embodiment, when calculating the similarity value between the palmprint recognition vector and each palmprint feature vector in the palmprint feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
The first tag acquiring unit 130 is configured to obtain the user first tag corresponding to the target palmprint feature vector as the first tag corresponding to the palm picture.
In this embodiment, the pre-built palmprint feature vector library stores a plurality of palmprint feature vectors extracted in advance (for example, all 8-dimensional column vectors), and each feature vector is preset with a corresponding user first tag. With this data foundation, the palmprint feature vector in the palmprint feature vector library most similar to the palmprint recognition vector can be determined as the target palmprint feature vector.
After the palmprint feature vector with the maximum similarity value to the palmprint recognition vector is obtained as the target palmprint feature vector, the user first tag corresponding to the target palmprint feature vector can be obtained as the first tag corresponding to the palm picture; for example, the first tag corresponding to the palm picture is attribute A.
The target picture acquiring unit 140 is configured to receive the facial video uploaded by the uploader, and preprocess the facial video by the optical flow method to obtain the target picture in the facial video.
In this embodiment, after the user's facial video is captured by the uploader's camera, micro-expression analysis needs to be performed on it. In specific implementation, micro-expression analysis can be performed by the optical flow method to obtain the target picture in the facial video.
In an embodiment, as shown in FIG. 10, the target picture acquiring unit 140 includes:
a velocity vector feature acquiring unit 141, configured to acquire the velocity vector feature corresponding to each pixel of each frame of picture in the facial video;
a target picture selecting unit 142, configured to, if the velocity vector feature of at least one frame of picture in the facial video does not keep changing continuously, take the corresponding picture as the target picture in the facial video.
In this embodiment, when the human eye observes a moving object, the scene of the object forms a series of continuously changing images on the retina. This series of continuously changing information constantly "flows through" the retina (that is, the image plane), like a "flow" of light, hence the name optical flow. Optical flow expresses the change of the image, contains information about the target's motion, and can be used to determine the target's motion. Optical flow has three elements: first, a motion velocity field, which is a necessary condition for forming optical flow; second, parts carrying optical characteristics, such as gray-scale pixels, which can carry motion information; and third, the imaging projection from the scene to the image plane, which makes the flow observable.
Optical flow is defined on the basis of points. Specifically, let (u, v) be the optical flow of image point (x, y); then (x, y, u, v) is called an optical flow point. The set of all optical flow points is called the optical flow field. When an object with optical characteristics moves in three-dimensional space, a corresponding image motion field, or image velocity field, is formed on the image plane. In the ideal case, the optical flow field corresponds to the motion field.
Each pixel in the image is assigned a velocity vector, forming a motion vector field. Based on the velocity vector feature of each pixel, the image can be analyzed dynamically. If there is no moving target in the image, the optical flow vector changes continuously over the entire image area. When there is a moving object in the image (when the user shows a micro-expression, the face moves, which is equivalent to a moving object), there is relative motion between the target and the background; the velocity vectors formed by the moving object necessarily differ from those of the background, so the position of the moving object can be calculated. Preprocessing is performed by the optical flow method in this way to obtain the target picture in the facial video.
The second target vector acquiring unit 150 is configured to obtain the micro-expression recognition feature vector of the target picture through a convolutional neural network, calculate the similarity between the micro-expression recognition feature vector and each micro-expression feature vector in the pre-built micro-expression feature vector library, and take the micro-expression feature vector in the library whose similarity value to the micro-expression recognition feature vector is the maximum as the target micro-expression feature vector.
In this embodiment, after the target picture corresponding to the facial video is obtained, the micro-expression recognition feature vector of the target picture can be obtained through the convolutional neural network; for the specific process, refer to obtaining the feature vector of the palmprint region-of-interest picture through the convolutional neural network.
When calculating the similarity value between the micro-expression recognition feature vector and each micro-expression feature vector in the micro-expression feature vector library, the Euclidean distance or the Pearson similarity between the two vectors can be calculated to judge the similarity between them.
In an embodiment, as shown in FIG. 11, the second target vector acquiring unit 150 includes:
a preprocessing unit 151, configured to preprocess the target picture to obtain a preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture, where preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
a convolution unit 152, configured to input the picture pixel matrix corresponding to the preprocessed picture into the input layer of the convolutional neural network model to obtain feature maps;
a pooling unit 153, configured to input the feature maps into the pooling layer of the convolutional neural network model to obtain the one-dimensional vector corresponding to the feature maps;
a fully connected unit 154, configured to input the one-dimensional vector corresponding to the feature maps into the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature maps.
In this embodiment, grayscale conversion, edge detection, and binarization are performed sequentially on the target picture to obtain the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture.
Although a color image contains more information, processing a color image directly slows down execution on the server and increases storage requirements. Grayscale conversion of color images is a basic image-processing method that is widely used in the field of pattern recognition; reasonable grayscale conversion greatly helps the extraction and subsequent processing of image information, saves storage space, and speeds up processing.
Edge detection examines the change in gray level of the image's pixels within a certain neighborhood and identifies the points in a digital image where the brightness changes markedly. Edge detection can greatly reduce the amount of data, eliminate irrelevant information, and preserve the important structural attributes of the image. Many operators can be used for edge detection; besides the commonly used Sobel operator, there are also the Laplacian edge detection operator, the Canny edge detection operator, and so on.
To reduce the influence of noise, the edge-detected image needs to be binarized; binarization is a type of image thresholding. According to how the threshold is selected, binarization methods can be divided into the global threshold method, the dynamic threshold method, and the local threshold method. The maximum between-class variance method (also called the Otsu algorithm) is commonly used for thresholding to eliminate pixels with smaller gradient values; after binarization, each pixel value of the image is 0 or 255. At this point, the preprocessed picture and the picture pixel matrix corresponding to the preprocessed picture are obtained.
When obtaining the picture feature vector of a picture, the picture pixel matrix corresponding to the preprocessed picture is first obtained and used as the input of the input layer of the convolutional neural network model to obtain feature maps; the feature maps are then input into the pooling layer to obtain the one-dimensional vector corresponding to the maximum values of the feature maps; finally, that one-dimensional vector is input into the fully connected layer to obtain the micro-expression recognition feature vector corresponding to the preprocessed picture.
The second tag acquiring unit 160 is configured to obtain the user second tag corresponding to the target micro-expression feature vector as the second tag corresponding to the facial video.
In this embodiment, after the micro-expression feature vector with the maximum similarity value to the micro-expression recognition feature vector is obtained as the target micro-expression feature vector, the user second tag corresponding to the target micro-expression feature vector can be obtained as the second tag corresponding to the facial video; for example, the second tag corresponding to the facial video is attribute B.
标签组合获取单元170,用于将所述第一标签与所述第二标签进行组合,得到与所述上传端的用户对应的用户标签组合。
在本实施例中,例如所述第一标签为A属性,所述第二标签为B属性,则将所述第一标签与所述第二标签进行组合得到A属性+B属性,以A属性+B属性作为与所述上传端的用户对应的用户标签组合。
The list sending unit 180 is configured to obtain, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold, as a target tag combination set, and obtain the users corresponding to the target tag combination set as a recommended user list to be sent to the uploading terminal.
In this embodiment, a tag or tag combination is set for every user in the pre-built user tag library; for example, in the library the tag corresponding to user 1 is attribute A, the tag corresponding to user 2 is attribute A + attribute B, the tag corresponding to user 3 is attribute C, ..., and the tag corresponding to user N is attribute C + attribute D. If the user tag combination is attribute A + attribute B, the tag combinations in the user tag library whose tag similarity value with the user tag combination exceeds the preset similarity threshold, such as attribute A (corresponding to user 1) and attribute A + attribute B (corresponding to user 2), form the target tag combination set; the users corresponding to each target tag combination in the set form the recommended user list (for example, including user 1 and user 2), and the recommended user list is sent to the uploading terminal. Each user record in the recommended user list includes at least the user name (that is, the user's real name), the tag combination, and basic user information (such as gender, home address, and contact number). Using multiple dimensions such as pictures and micro-expressions as the sources of user tags enables a finer-grained division of users. Moreover, there is no need to cluster users in advance; pre-building the user tag library suffices as the data basis for finding users similar to the target user, which reduces the amount of data processing.
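A minimal sketch of this screening step, assuming a similarity function in which larger values mean more similar (for the edit-distance measure described below, one such choice is 1 / (1 + distance)); the toy library and threshold mirror the example above and are illustrative only:

```python
def recommend(user_combo, tag_library, similarity, threshold):
    """tag_library maps user name -> tag or tag combination; returns the
    recommended user list: users whose tag similarity exceeds the threshold."""
    return [user for user, combo in tag_library.items()
            if similarity(user_combo, combo) > threshold]

# Toy usage (edit_distance as defined below):
# sim = lambda a, b: 1.0 / (1.0 + edit_distance(a, b))
# recommend("A+B", {"user1": "A", "user2": "A+B", "user3": "C"}, sim, 0.3)
# -> ["user1", "user2"]
```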
In an embodiment, the image recognition-based information push apparatus 100 further includes:
a tag similarity value calculating unit, configured to obtain the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
In this embodiment, the string edit distance is the minimum number of single-character edits (such as substitution, insertion, or deletion) required to change one string into the other. For example, changing the string "kitten" into the string "sitting" takes only three single-character edit operations, namely sitten (k→s), sittin (e→i), sitting (_→g); therefore the string edit distance between "kitten" and "sitting" is 3. Once the string edit distance between each tag or tag combination in the user tag library and the user tag combination has been obtained, the tag similarity value between them is available as the numeric reference for screening similar users.
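A standard single-row dynamic-programming implementation of this edit distance (Levenshtein distance) follows; it reproduces the kitten → sitting example:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character substitutions,
    insertions, and deletions needed to turn string `a` into string `b`."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))                   # distances from "" to each prefix of b
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i                # prev holds dp[i-1][j-1]
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # delete a[i-1]
                        dp[j - 1] + 1,        # insert b[j-1]
                        prev + (a[i - 1] != b[j - 1]))  # substitute (or match)
            prev = cur
    return dp[n]

assert edit_distance("kitten", "sitting") == 3
```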
This apparatus finds users similar to the user of the uploading terminal based on the multi-dimensional features of the palm picture and the micro-expression, without clustering a massive number of users in advance, which reduces the amount of data processing; moreover, the multiple dimensions of palm picture and micro-expression enable a finer-grained division of user tags.
The above image recognition-based information push apparatus may be implemented in the form of a computer program, and the computer program may run on a computer device as shown in FIG. 12.
Please refer to FIG. 12, a schematic block diagram of a computer device provided by an embodiment of this application. The computer device 500 is a server; the server may be a standalone server or a server cluster composed of multiple servers.
Referring to FIG. 12, the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. When executed, the computer program 5032 may cause the processor 502 to perform the image recognition-based information push method.
The processor 502 is configured to provide computing and control capabilities to support the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, it may cause the processor 502 to perform the image recognition-based information push method.
The network interface 505 is used for network communication, such as providing transmission of data information. Those skilled in the art can understand that the structure shown in FIG. 12 is merely a block diagram of part of the structure related to the solution of this application and does not limit the computer device 500 to which the solution of this application is applied; the specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory, so as to implement the image recognition-based information push method disclosed in the embodiments of this application.
Those skilled in the art can understand that the embodiment of the computer device shown in FIG. 12 does not limit the specific construction of the computer device; in other embodiments, the computer device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 12 and are not repeated here.
It should be understood that in the embodiments of this application the processor 502 may be a central processing unit (CPU), and the processor 502 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
Another embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. The computer-readable storage medium stores a computer program, where the computer program, when executed by a processor, implements the image recognition-based information push method disclosed in the embodiments of this application.

Claims (20)

  1. An image recognition-based information push method, comprising:
    receiving a palm picture uploaded by an uploading terminal, and obtaining, by palmprint recognition, a palmprint recognition vector corresponding to the palm picture;
    performing similarity calculation between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library to obtain the palmprint feature vector in the palmprint feature vector library whose similarity value with the palmprint recognition vector is the maximum, as a target palmprint feature vector;
    obtaining a user first tag corresponding to the target palmprint feature vector, as a first tag corresponding to the palm picture;
    receiving a facial video uploaded by the uploading terminal, preprocessing the facial video by an optical flow method, and obtaining a target picture in the facial video;
    obtaining a micro-expression recognition feature vector of the target picture through a convolutional neural network, and performing similarity calculation between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library to obtain the micro-expression feature vector in the micro-expression feature vector library whose similarity value with the micro-expression recognition feature vector is the maximum, as a target micro-expression feature vector;
    obtaining a user second tag corresponding to the target micro-expression feature vector, as a second tag corresponding to the facial video;
    combining the first tag with the second tag to obtain a user tag combination corresponding to the user of the uploading terminal; and
    obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold, as a target tag combination set, and obtaining the users corresponding to the target tag combination set as a recommended user list to be sent to the uploading terminal.
  2. The image recognition-based information push method according to claim 1, wherein the obtaining, by palmprint recognition, a palmprint recognition vector corresponding to the palm picture comprises:
    performing palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
    obtaining a feature vector of the palmprint region-of-interest picture through a convolutional neural network, as the palmprint recognition vector.
  3. The image recognition-based information push method according to claim 2, wherein the performing palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm comprises:
    converting the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
    screening out, in the converted picture, a first valley point between the index finger and the middle finger and a second valley point between the ring finger and the little finger, and obtaining the current length of the current connecting line between the first valley point and the second valley point and the current deflection angle between the current connecting line and the X-axis;
    obtaining the midpoint of the current connecting line as the current midpoint, and obtaining the current perpendicular of the current connecting line through the current midpoint, so as to locate, on the current perpendicular, the target point that extends toward the palm and whose distance from the current midpoint is one half of the current length;
    rotating the converted picture counterclockwise by the current deflection angle, and taking the square region of size current length × current length centered on the target point, so as to obtain the palmprint region-of-interest picture of the palm.
  4. The image recognition-based information push method according to claim 1, wherein the preprocessing the facial video by the optical flow method and obtaining a target picture in the facial video comprises:
    obtaining the velocity vector feature corresponding to each pixel of each frame picture in the facial video;
    if the velocity vector features of at least one frame picture in the facial video do not keep changing continuously, taking the corresponding picture as the target picture in the facial video.
  5. The image recognition-based information push method according to claim 1, wherein the obtaining a micro-expression recognition feature vector of the target picture through a convolutional neural network comprises:
    preprocessing the target picture to obtain a preprocessed picture and a picture pixel matrix corresponding to the preprocessed picture, wherein preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
    inputting the picture pixel matrix corresponding to the preprocessed picture to the input layer of the convolutional neural network model to obtain a feature map;
    inputting the feature map to the pooling layer of the convolutional neural network model to obtain a one-dimensional vector corresponding to the feature map;
    inputting the one-dimensional vector corresponding to the feature map to the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature map.
  6. The image recognition-based information push method according to claim 1, wherein before the obtaining, from the pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds the preset similarity threshold as the target tag combination set, the method further comprises:
    obtaining the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
  7. An image recognition-based information push apparatus, comprising:
    a palmprint vector acquiring unit, configured to receive a palm picture uploaded by an uploading terminal and obtain, by palmprint recognition, a palmprint recognition vector corresponding to the palm picture;
    a first target vector acquiring unit, configured to perform similarity calculation between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library to obtain the palmprint feature vector in the palmprint feature vector library whose similarity value with the palmprint recognition vector is the maximum, as a target palmprint feature vector;
    a first tag acquiring unit, configured to obtain a user first tag corresponding to the target palmprint feature vector, as a first tag corresponding to the palm picture;
    a target picture acquiring unit, configured to receive a facial video uploaded by the uploading terminal, preprocess the facial video by an optical flow method, and obtain a target picture in the facial video;
    a second target vector acquiring unit, configured to obtain a micro-expression recognition feature vector of the target picture through a convolutional neural network, perform similarity calculation between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library, and obtain the micro-expression feature vector in the micro-expression feature vector library whose similarity value with the micro-expression recognition feature vector is the maximum, as a target micro-expression feature vector;
    a second tag acquiring unit, configured to obtain a user second tag corresponding to the target micro-expression feature vector, as a second tag corresponding to the facial video;
    a tag combination acquiring unit, configured to combine the first tag with the second tag to obtain a user tag combination corresponding to the user of the uploading terminal; and
    a list sending unit, configured to obtain, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold, as a target tag combination set, and obtain the users corresponding to the target tag combination set as a recommended user list to be sent to the uploading terminal.
  8. The image recognition-based information push apparatus according to claim 7, wherein the target picture acquiring unit comprises:
    a velocity vector feature acquiring unit, configured to obtain the velocity vector feature corresponding to each pixel of each frame picture in the facial video;
    a target picture selecting unit, configured to, if the velocity vector features of at least one frame picture in the facial video do not keep changing continuously, take the corresponding picture as the target picture in the facial video.
  9. A computer device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    receiving a palm picture uploaded by an uploading terminal, and obtaining, by palmprint recognition, a palmprint recognition vector corresponding to the palm picture;
    performing similarity calculation between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library to obtain the palmprint feature vector in the palmprint feature vector library whose similarity value with the palmprint recognition vector is the maximum, as a target palmprint feature vector;
    obtaining a user first tag corresponding to the target palmprint feature vector, as a first tag corresponding to the palm picture;
    receiving a facial video uploaded by the uploading terminal, preprocessing the facial video by an optical flow method, and obtaining a target picture in the facial video;
    obtaining a micro-expression recognition feature vector of the target picture through a convolutional neural network, and performing similarity calculation between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library to obtain the micro-expression feature vector in the micro-expression feature vector library whose similarity value with the micro-expression recognition feature vector is the maximum, as a target micro-expression feature vector;
    obtaining a user second tag corresponding to the target micro-expression feature vector, as a second tag corresponding to the facial video;
    combining the first tag with the second tag to obtain a user tag combination corresponding to the user of the uploading terminal; and
    obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold, as a target tag combination set, and obtaining the users corresponding to the target tag combination set as a recommended user list to be sent to the uploading terminal.
  10. The computer device according to claim 9, wherein when the processor executes the computer program, the step of obtaining, by palmprint recognition, the palmprint recognition vector corresponding to the palm picture comprises:
    performing palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
    obtaining a feature vector of the palmprint region-of-interest picture through a convolutional neural network, as the palmprint recognition vector.
  11. The computer device according to claim 10, wherein when the processor executes the computer program, the step of performing palm segmentation on the palm picture based on skin color detection to obtain the palmprint region-of-interest picture of the palm comprises:
    converting the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
    screening out, in the converted picture, a first valley point between the index finger and the middle finger and a second valley point between the ring finger and the little finger, and obtaining the current length of the current connecting line between the first valley point and the second valley point and the current deflection angle between the current connecting line and the X-axis;
    obtaining the midpoint of the current connecting line as the current midpoint, and obtaining the current perpendicular of the current connecting line through the current midpoint, so as to locate, on the current perpendicular, the target point that extends toward the palm and whose distance from the current midpoint is one half of the current length;
    rotating the converted picture counterclockwise by the current deflection angle, and taking the square region of size current length × current length centered on the target point, so as to obtain the palmprint region-of-interest picture of the palm.
  12. The computer device according to claim 9, wherein when the processor executes the computer program, the step of preprocessing the facial video by the optical flow method and obtaining the target picture in the facial video comprises:
    obtaining the velocity vector feature corresponding to each pixel of each frame picture in the facial video;
    if the velocity vector features of at least one frame picture in the facial video do not keep changing continuously, taking the corresponding picture as the target picture in the facial video.
  13. The computer device according to claim 9, wherein when the processor executes the computer program, the step of obtaining the micro-expression recognition feature vector of the target picture through the convolutional neural network comprises:
    preprocessing the target picture to obtain a preprocessed picture and a picture pixel matrix corresponding to the preprocessed picture, wherein preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
    inputting the picture pixel matrix corresponding to the preprocessed picture to the input layer of the convolutional neural network model to obtain a feature map;
    inputting the feature map to the pooling layer of the convolutional neural network model to obtain a one-dimensional vector corresponding to the feature map;
    inputting the one-dimensional vector corresponding to the feature map to the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature map.
  14. The computer device according to claim 9, wherein before the step of obtaining, from the pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds the preset similarity threshold as the target tag combination set, the processor, when executing the computer program, is further configured to implement the following step:
    obtaining the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps:
    receiving a palm picture uploaded by an uploading terminal, and obtaining, by palmprint recognition, a palmprint recognition vector corresponding to the palm picture;
    performing similarity calculation between the palmprint recognition vector and each palmprint feature vector in a pre-built palmprint feature vector library to obtain the palmprint feature vector in the palmprint feature vector library whose similarity value with the palmprint recognition vector is the maximum, as a target palmprint feature vector;
    obtaining a user first tag corresponding to the target palmprint feature vector, as a first tag corresponding to the palm picture;
    receiving a facial video uploaded by the uploading terminal, preprocessing the facial video by an optical flow method, and obtaining a target picture in the facial video;
    obtaining a micro-expression recognition feature vector of the target picture through a convolutional neural network, and performing similarity calculation between the micro-expression recognition feature vector and each micro-expression feature vector in a pre-built micro-expression feature vector library to obtain the micro-expression feature vector in the micro-expression feature vector library whose similarity value with the micro-expression recognition feature vector is the maximum, as a target micro-expression feature vector;
    obtaining a user second tag corresponding to the target micro-expression feature vector, as a second tag corresponding to the facial video;
    combining the first tag with the second tag to obtain a user tag combination corresponding to the user of the uploading terminal; and
    obtaining, from a pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds a preset similarity threshold, as a target tag combination set, and obtaining the users corresponding to the target tag combination set as a recommended user list to be sent to the uploading terminal.
  16. The storage medium according to claim 15, wherein the computer program, when executed by the processor, causes the processor to perform the step of obtaining, by palmprint recognition, the palmprint recognition vector corresponding to the palm picture, comprising:
    performing palm segmentation on the palm picture based on skin color detection to obtain a palmprint region-of-interest picture of the palm;
    obtaining a feature vector of the palmprint region-of-interest picture through a convolutional neural network, as the palmprint recognition vector.
  17. The storage medium according to claim 16, wherein the computer program, when executed by the processor, causes the processor to perform the step of performing palm segmentation on the palm picture based on skin color detection to obtain the palmprint region-of-interest picture of the palm, comprising:
    converting the palm picture from the RGB space to the YCrCb space to obtain a converted picture;
    screening out, in the converted picture, a first valley point between the index finger and the middle finger and a second valley point between the ring finger and the little finger, and obtaining the current length of the current connecting line between the first valley point and the second valley point and the current deflection angle between the current connecting line and the X-axis;
    obtaining the midpoint of the current connecting line as the current midpoint, and obtaining the current perpendicular of the current connecting line through the current midpoint, so as to locate, on the current perpendicular, the target point that extends toward the palm and whose distance from the current midpoint is one half of the current length;
    rotating the converted picture counterclockwise by the current deflection angle, and taking the square region of size current length × current length centered on the target point, so as to obtain the palmprint region-of-interest picture of the palm.
  18. The storage medium according to claim 15, wherein the computer program, when executed by the processor, causes the processor to perform the step of preprocessing the facial video by the optical flow method and obtaining the target picture in the facial video, comprising:
    obtaining the velocity vector feature corresponding to each pixel of each frame picture in the facial video;
    if the velocity vector features of at least one frame picture in the facial video do not keep changing continuously, taking the corresponding picture as the target picture in the facial video.
  19. The storage medium according to claim 15, wherein the computer program, when executed by the processor, causes the processor to perform the step of obtaining the micro-expression recognition feature vector of the target picture through the convolutional neural network, comprising:
    preprocessing the target picture to obtain a preprocessed picture and a picture pixel matrix corresponding to the preprocessed picture, wherein preprocessing the target picture means sequentially performing grayscale conversion, edge detection, and binarization on the target picture;
    inputting the picture pixel matrix corresponding to the preprocessed picture to the input layer of the convolutional neural network model to obtain a feature map;
    inputting the feature map to the pooling layer of the convolutional neural network model to obtain a one-dimensional vector corresponding to the feature map;
    inputting the one-dimensional vector corresponding to the feature map to the fully connected layer of the convolutional neural network model to obtain the micro-expression recognition feature vector corresponding to the feature map.
  20. The storage medium according to claim 15, wherein before the step of obtaining, from the pre-built user tag library, the tag combinations whose tag similarity value with the user tag combination exceeds the preset similarity threshold as the target tag combination set, the computer program, when executed by the processor, further causes the processor to perform the following step:
    obtaining the string edit distance between each tag or tag combination in the user tag library and the user tag combination, as the tag similarity value between each tag or tag combination in the user tag library and the user tag combination.
PCT/CN2020/087489 2019-08-15 2020-04-28 Image recognition-based information push method and apparatus, and computer device WO2021027329A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910752720.6A CN110765832A (zh) 2019-08-15 2019-08-15 Image recognition-based information push method and apparatus, and computer device
CN201910752720.6 2019-08-15

Publications (1)

Publication Number Publication Date
WO2021027329A1 true WO2021027329A1 (zh) 2021-02-18

Family

ID=69329330

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087489 WO2021027329A1 (zh) 2019-08-15 2020-04-28 Image recognition-based information push method and apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN110765832A (zh)
WO (1) WO2021027329A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239233A (zh) * 2021-05-31 2021-08-10 平安科技(深圳)有限公司 Personalized image recommendation method, apparatus and device, and storage medium
CN115129825A (zh) * 2022-08-25 2022-09-30 广东知得失网络科技有限公司 Patent information push method and system
WO2024183805A1 (zh) * 2023-03-08 2024-09-12 北京有竹居网络技术有限公司 Tag determination method, information recommendation method, apparatus, device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765832A (zh) * 2019-08-15 2020-02-07 深圳壹账通智能科技有限公司 Image recognition-based information push method and apparatus, and computer device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010049300A (ja) * 2008-08-19 2010-03-04 Olympus Imaging Corp Image retrieval system, image retrieval method, and image retrieval program
CN108305180A (zh) * 2017-01-13 2018-07-20 中国移动通信有限公司研究院 Friend recommendation method and apparatus
CN109766917A (zh) * 2018-12-18 2019-05-17 深圳壹账通智能科技有限公司 Interview video data processing method and apparatus, computer device and storage medium
CN109785045A (zh) * 2018-12-14 2019-05-21 深圳壹账通智能科技有限公司 Push method and apparatus based on user behavior data
CN110765832A (zh) * 2019-08-15 2020-02-07 深圳壹账通智能科技有限公司 Image recognition-based information push method and apparatus, and computer device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7054470B2 (en) * 1999-12-02 2006-05-30 International Business Machines Corporation System and method for distortion characterization in fingerprint and palm-print image sequences and using this distortion as a behavioral biometrics
US11645835B2 (en) * 2017-08-30 2023-05-09 Board Of Regents, The University Of Texas System Hypercomplex deep learning methods, architectures, and apparatus for multimodal small, medium, and large-scale data representation, analysis, and applications
CN109857893A (zh) * 2019-01-16 2019-06-07 平安科技(深圳)有限公司 Picture retrieval method and apparatus, computer device and storage medium
CN109785066A (zh) * 2019-01-17 2019-05-21 深圳壹账通智能科技有限公司 Micro-expression-based product recommendation method, apparatus and device, and storage medium



Also Published As

Publication number Publication date
CN110765832A (zh) 2020-02-07

Similar Documents

Publication Publication Date Title
AU2022252799B2 (en) System and method for appearance search
WO2021027329A1 (zh) Image recognition-based information push method and apparatus, and computer device
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
US10846554B2 (en) Hash-based appearance search
US10949702B2 (en) System and a method for semantic level image retrieval
US20180157916A1 (en) System and method for cnn layer sharing
WO2021027325A1 (zh) Video similarity acquisition method and apparatus, computer device and storage medium
Fang et al. Deep3DSaliency: Deep stereoscopic video saliency detection model by 3D convolutional networks
Ge et al. Co-saliency detection via inter and intra saliency propagation
WO2017181892A1 (zh) Foreground segmentation method and apparatus
JP4098021B2 (ja) Scene identification method and apparatus, and program
US20230326173A1 (en) Image processing method and apparatus, and computer-readable storage medium
CN111079613B (zh) Posture recognition method and apparatus, electronic device and storage medium
CN110516731B (zh) Deep learning-based visual odometry feature point detection method and system
Wang et al. Robust pixelwise saliency detection via progressive graph rankings
US20210034915A1 (en) Method and apparatus for object re-identification
Han et al. Saliency detection method using hypergraphs on adaptive multiscales
Kalboussi et al. A spatiotemporal model for video saliency detection
Jia et al. An adaptive framework for saliency detection
Jang et al. Exposed body component-based harmful image detection in ubiquitous sensor data
Khryashchev et al. Improving audience analysis system using face image quality assessment
Lin et al. Advanced Superpixel-Based Features and Machine Learning Based Saliency Detection
Simões et al. A fast and accurate algorithm for detecting and tracking moving hand gestures
Cheng Application of Deep Learning to Three Problems in Image Analysis and Image Processing: Automatic Image Cropping, Remote Heart Rate Estimation and Quadrilateral Detection
CN113971671A (zh) Instance segmentation method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20853441

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20853441

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.08.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20853441

Country of ref document: EP

Kind code of ref document: A1