CN116977404A - Target identification method, device, equipment and storage medium - Google Patents

Target identification method, device, equipment and storage medium

Info

Publication number
CN116977404A
CN116977404A (application CN202211530911.6A)
Authority
CN
China
Prior art keywords
target
candidate
character
image frame
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211530911.6A
Other languages
Chinese (zh)
Inventor
徐东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202211530911.6A priority Critical patent/CN116977404A/en
Publication of CN116977404A publication Critical patent/CN116977404A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/80Special adaptations for executing a specific game genre or game mode
    • A63F13/822Strategy games; Role-playing games

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides a target identification method, apparatus, device and storage medium, wherein the method comprises the following steps: determining a target character region in an initial image frame, the target character region being the region in which a target character is located in the initial image frame; selecting a first candidate character region set from a first image frame, the first image frame being the frame following the initial image frame; determining the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set, to obtain a plurality of phase differences; and determining first position information of the target character in the first image frame according to the minimum phase difference among the plurality of phase differences, the center point of the target character region, and the center point of the candidate character region corresponding to the minimum phase difference. By adopting the embodiment of the application, the accuracy of target identification can be improved, thereby improving the effect of target identification.

Description

Target identification method, device, equipment and storage medium
Technical Field
The present application relates to computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for target identification.
Background
With the development of internet technology, electronic games, that is, interactive games running on an electronic device platform, have become a distinctive class of application. Among them, the massively multiplayer online role-playing game (MMORPG) is a role-playing game in which many players participate; each player can play a virtual character and control that character to perform various activities.
In order to better provide character responses and a good game experience for the player, the movement trajectories of game characters need to be identified. At present, a character can be identified by the frame difference method, that is, target detection is performed using the difference between the background and the target in adjacent frames of an image sequence. However, in a dynamic scene, when an object moves slowly, missed detections easily occur: the overlapping part of the target cannot be detected, holes and similar artifacts appear, and the effect of target identification is poor.
Therefore, how to improve the effect of target recognition is a technical problem that currently needs to be solved.
Disclosure of Invention
The embodiment of the application provides a target identification method, a device, terminal equipment and a storage medium, which can improve the accuracy of target identification, thereby improving the target identification effect.
In a first aspect, an embodiment of the present application provides a method for identifying a target, where the method includes:
determining a target character area in an initial image frame, wherein the target character area is an area where a target character is located in the initial image frame;
selecting a first candidate character region set from a first image frame, wherein the first image frame is the next frame of the initial image frame;
determining the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set, to obtain a plurality of phase differences;
and determining first position information of the target character in the first image frame according to the minimum phase difference in the plurality of phase differences, the center point of the target character area and the center point of the candidate character area corresponding to the minimum phase difference.
In a second aspect, an embodiment of the present application provides an object recognition apparatus, including:
a determining unit configured to determine a target character area in an initial image frame, the target character area being an area in which a target character is located in the initial image frame;
a selecting unit, configured to select a first candidate character region set from a first image frame, where the first image frame is a frame next to the initial image frame;
The determining unit is further configured to determine, in the Laplace domain, the phase difference between the target character region and each candidate character region in the first candidate character region set, so as to obtain a plurality of phase differences;
the determining unit is further configured to determine first position information of the target character in the first image frame according to a minimum phase difference among the plurality of phase differences, a center point of the target character region, and a center point of a candidate character region corresponding to the minimum phase difference.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a communication interface, and a memory, where the processor, the communication interface, and the memory are connected to each other, where the memory stores executable program code, and the processor is configured to invoke the executable program code to perform the method for identifying an object according to the first aspect.
In a fourth aspect, an embodiment of the present application further provides a computer readable storage medium, where instructions are stored, which when executed on a computer, cause the computer to perform the method for object recognition according to the first aspect.
In a fifth aspect, embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored on a computer readable storage medium. The processor of the terminal device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the terminal device performs the method of object recognition described in the first aspect.
In the embodiment of the application, the target character region in which the target is located in the initial image frame is determined, a first candidate character region set is selected from the frame following the initial image frame, the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set is then determined to obtain a plurality of phase differences, and the first position information of the target character in the first image frame is determined according to the minimum phase difference, the center point of the target character region, and the center point of the candidate character region corresponding to the minimum phase difference. In this way, the image information of the target character in the initial image frame and the image information of the candidate character regions in the next frame are converted into the Laplace domain for computation; that is, the translation of the target in the image domain is converted into a phase change in the Laplace domain, so that the position of the target in the next frame can be determined, making target identification more accurate and improving its effect.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the description of the embodiments will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an architecture of a target recognition system according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for object recognition according to an embodiment of the present application;
FIG. 3 is an interface schematic diagram of a method for object recognition according to an embodiment of the present application;
FIG. 4 is another interface diagram of a method for object recognition according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another interface of a method for object recognition according to an embodiment of the present application;
FIG. 6 is a timing diagram of a method for object recognition according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an object recognition device according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Before describing embodiments of the present application in further detail, the terms and terminology involved in the embodiments of the present application are explained as follows.
1. Laplace transform
The Laplace transform is a mathematical integral transform: a linear transform that converts a function of a real variable t (t ≥ 0) into a function of a complex variable s. By converting a time-domain problem into a Laplace-domain problem through this mathematical transformation, a high-order differential equation in the time domain can be turned into an algebraic equation in the Laplace domain and solved there.
In the present application, the time domain may correspond to an image domain (which may also be referred to as an image space), the image domain being the image space in which the target character is located; the Laplace domain may hold the transformation information obtained by applying the Laplace transform to an original image in the image domain, and the transformation information may include phase information. Applying the Laplace transform to an original image in the image domain may be referred to as the forward transform, and turning the resulting transformation information in the Laplace domain back into an image in the image domain may be referred to as the inverse transform. It can be understood that when the original image in the image domain undergoes the Laplace transform, the resulting transformation information is lossless with respect to the information contained in the original image, which is equivalent to the Laplace transform having the property of "translation invariance". As can be seen from the forward and inverse transform formulas, the phase difference affects the magnitude of the final f(t). Therefore, the offset of the target character can be converted into the Laplace domain for computation; that is, the offset information of the target character is determined based on the magnitude of the phase.
Specifically, a set of formulas for the Laplace transform and its inverse may be as shown in formula 1 and formula 2:

F(s) = \int_0^{\infty} f(t) e^{-st} \, dt   (formula 1)

Formula 1 is the forward Laplace transform: it defines the Laplace transform F(s) of the function f(t) over the interval [0, ∞). F(s) is called the image function of f(t), f(t) is called the original function of F(s), F(s) is a function of the complex variable s, and e^{-st} is the convergence factor. In the embodiment of the application, f(t) may represent an original image in the image domain, F(s) may represent transformation information in the Laplace domain, and s may carry the phase.

f(t) = \frac{1}{2\pi j} \int_{\sigma - j\infty}^{\sigma + j\infty} F(s) e^{st} \, ds   (formula 2)

Formula 2 is the definition of the inverse Laplace transform, i.e., the operation of recovering the original function from the known Laplace transform F(s) of the function f(t). In the embodiment of the application, f(t) may represent the original image in the image domain, or the image information in the image domain obtained by inverse transformation of the transformation information in the Laplace domain.
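As an illustration of formula 1, the forward transform can be approximated numerically by truncating the improper integral. The following Python sketch is not part of the patent; the function and parameter names are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def laplace_forward(f, s, t_max=50.0, num=5000):
    """Approximate formula 1, F(s) = integral from 0 to infinity of
    f(t) * e^(-s*t) dt, by truncating at t_max and using the trapezoidal rule."""
    t = np.linspace(0.0, t_max, num)
    return np.trapz(f(t) * np.exp(-s * t), t)

# Example: f(t) = e^(-2t) has the closed-form image function F(s) = 1/(s + 2).
print(laplace_forward(lambda t: np.exp(-2.0 * t), s=1.0))  # approx. 0.3333
print(1.0 / (1.0 + 2.0))                                   # exact value
```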
2. Phase of
Phase is the position within its cycle of a wave at a particular moment: a measure of whether it is at a peak, a trough, or some point in between. Phase describes a measure of the change in the waveform of a signal and is usually expressed in degrees (an angle), also called the phase angle. In a spherical coordinate system, coordinates on the x-axis and the y-axis can be determined from the radius and the phase angle.
In the application, the phase difference is obtained by applying the Laplace transform to the image of the target character region in the image domain and to each candidate character region in the candidate character region set; the distance by which the target character has shifted can then be determined from the phase angle and radius corresponding to the phase difference.
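A minimal sketch of the relationship just described, assuming only the standard library; the variable names are illustrative, not from the patent.

```python
import math

# Phase angle of a complex transform coefficient a + bj, in radians.
coeff = complex(3.0, 4.0)
phase_angle = math.atan2(coeff.imag, coeff.real)
magnitude = abs(coeff)  # the amplitude information of the same coefficient

# Given a radius and a phase angle, the x- and y-axis components follow
# directly, which is how a (radius, phase angle) pair yields an offset.
radius = 5.0
x_offset = radius * math.cos(phase_angle)
y_offset = radius * math.sin(phase_angle)
print(x_offset, y_offset)  # 3.0, 4.0 for this coefficient
```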
The embodiment of the application provides a target recognition scheme that can be applied to various target recognition applications or target recognition platforms, where a target recognition application or target recognition system refers to an application or system for recognizing a given target in continuous image frames. Specifically, the target recognition application or target recognition system may determine the target character region in the initial image frame, select the first candidate character region set from the first image frame, determine the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set to obtain a plurality of phase differences, and determine the first position information of the target character in the first image frame according to the minimum phase difference among the plurality of phase differences, the center point of the target character region, and the center point of the candidate character region corresponding to the minimum phase difference. In this way, by converting the offset of the target character in the image domain into a phase change in the Laplace domain, the displacement of the target character can be accurately estimated from the phase change, improving the accuracy of target identification and thereby its effect.
The target recognition scheme provided by the embodiment of the application relates to artificial intelligence, machine learning, computer vision and other technologies, wherein:
artificial intelligence (Artificial Intelligence, AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision. The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operating (interactive) systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning (deep learning) and other directions.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning (DL) generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and demonstration learning.
Computer Vision (CV) is the science of studying how to make a machine "see"; more specifically, it replaces human eyes with cameras and computers to recognize and measure targets and perform other machine vision tasks, and further performs graphic processing so that the computer produces images more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, optical character recognition (OCR), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
Referring to fig. 1, fig. 1 is a schematic architecture diagram of a target recognition system according to an embodiment of the present application, where the target recognition system may include a terminal device, for example, a terminal device 101 of a user, and the target recognition system may further include two electronic devices, that is, a target recognition device 102 and a location information verification device 103. The target identifying device 102 may be directly or indirectly connected to the plurality of terminal devices through a wired or wireless manner, and the location information verifying device 103 may directly or indirectly connect to the target identifying device 102 through a wired or wireless manner. Alternatively, the target recognition device 102 and the location information verification device 103 may be the same electronic device or two different electronic devices, which is not limited in the present application. It should be noted that the number and the form of the devices shown in fig. 1 are used as examples, and are not limited to the embodiments of the present application, and the object recognition system may include only one terminal device in practical application, such as a terminal device of one user, or may include terminal devices of more than three users in practical application. The object recognition system may in practice also comprise at least two object recognition devices. The embodiment of the present application is drawn and explained taking three user terminal devices 101 and one electronic device (i.e., the object recognition device 102 and the position information verification device 103 are the same electronic device) as an example.
As shown in fig. 1, three terminal devices in the terminal devices 101 of the user may be terminal devices of three different users, the object recognition device 102 may be an electronic device that provides a game interface of the MMORPG, and the user may play a virtual character in the game provided by the object recognition device 102 through the terminal devices 101 of the user and control the character to perform various movements, complete tasks, and so on. In order to enhance the game experience of the user, so that the game response is more accurate and smooth, the target recognition device 102 can recognize the role played by the user in the game picture, determine the position information of the role in each image frame, further determine the movement track of the role, and further interact with the user according to the position of the role. The target recognition device 102 may determine an area in which the target character is located in the initial image frame, i.e., a target character area, and further select a first candidate character area set from a first image frame, where the first image frame is a next frame of the initial image frame. And then the target recognition device 102 determines the phase difference of each candidate character region in the target character region and the first candidate character region set in the laplace domain to obtain a plurality of phase differences, and then the target recognition device 102 determines the first position information of the target character in the first image frame according to the minimum phase difference in the plurality of phase differences, the center point of the target character region and the center point of the candidate character region corresponding to the minimum phase difference.
Further, after the first position information is obtained, the position information verification device 103 may perform validity verification on it; if the verification finds the first position information to be valid, it may be taken as the position of the target character in the first image frame. Otherwise, if the verification finds the first position information to be invalid, a second candidate character region set may be selected from the first image frame, and the target recognition device 102 identifies the target character again until the first position information of the target character in the first image frame is obtained. Further, the target recognition device 102 can identify the position information of the target character in each image frame and thereby obtain the motion trail of the target character.
The terminal device (such as the terminal device 101 of the user) may be, but not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, and the like. The object recognition device 102 and the location information verification device 103 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc.; the target identifying device 102 and the location information verifying device 103 may be servers, for example, independent physical servers, a server cluster or a distributed system formed by a plurality of physical servers, or cloud servers providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Through the target recognition system, the target character region in which the target is located in the initial image frame is determined, a first candidate character region set is selected from the frame following the initial image frame, the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set is determined to obtain a plurality of phase differences, and the first position information of the target character in the first image frame is then determined according to the minimum phase difference, the center point of the target character region, and the center point of the candidate character region corresponding to the minimum phase difference. In this way, the image information of the target character in the initial image frame and the image information of the candidate character regions in the next frame are converted into the Laplace domain for computation; that is, the translation of the target in the image domain is converted into a phase change in the Laplace domain, so that the target's position in the next frame can be determined, making target identification more accurate and improving its effect.
In one implementation, the initial image frame, the first image frame, the target character region, the first candidate character region set, the phase difference, and the first position information may be stored in a blockchain, so that the initial image frame, the first image frame, the target character region, the first candidate character region set, the phase difference, and the first position information may be prevented from being tampered with. The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like, is essentially a decentralised database, and is a series of data blocks which are generated by correlation by using a cryptography method, and each data block contains information of a batch of network transactions and is used for verifying the validity (anti-counterfeiting) of the information and generating a next block.
It may be understood that the target recognition system described in the embodiment of the present application is intended to describe the technical solution of the embodiment more clearly and does not constitute a limitation on it; those skilled in the art will know that, as system architectures evolve and new service scenarios appear, the technical solution provided by the embodiment of the present application is equally applicable to similar technical problems.
Based on the above target recognition scheme and target recognition system, the embodiment of the present application provides a method of target recognition. The target recognition method described in the embodiment of the present application may be performed by an electronic device, which may be the target recognition device 102 in the target recognition system shown in FIG. 1. If the target recognition device is a server, it may be a dedicated server or an internet application server, through which not only the relevant steps of the embodiment of the present application may be executed but also other services may be provided. Referring to FIG. 2, FIG. 2 is a flowchart of a target recognition method according to an embodiment of the present application; the target recognition method includes steps S201 to S204 as follows:
S201, a target character area in the initial image frame is determined.
In the embodiment of the present application, the initial image frame is the first frame in which the target is identified; it may be, for example, the initial game picture in a game, or the first frame in which a certain game character appears in the game interface. The target may be a character in human, animal, or other form in the initial image frame, i.e., the target character; it may be, for example, a certain game character in the game interface. The target character region may be the region in which the target character is located in the initial image frame, i.e., the region of the initial image frame that contains the target character. Alternatively, the target character region may be a region of any shape; for convenience of description, the embodiment of the present application depicts and explains the target character region as a rectangular region. The size of the determined target character region may be smaller than the size of the initial image frame, which reduces the amount of computation in subsequent target recognition. For example, in a game interface containing a plurality of game characters, the target character region may contain the target character and may further contain game scene information, i.e., the background information behind the target character.
It should be noted that the embodiment of the present application is explained taking the game field as an example, that is, identifying a target character in the image frames of a game interface. In practical applications, the embodiment may also be applied to identifying a target in video, that is, in continuous image frames, and specifically to scenarios such as target monitoring and target identification.
In one possible implementation, the target character region in the initial image frame may be recognized automatically by the target recognition device. Specifically, the target recognition device inputs the initial image frame into a target recognition model, and the target recognition model recognizes and outputs the target character region in the initial image frame. The target recognition model may be a pre-trained model that recognizes characters in the game. The target recognition model may be, for example, a convolutional neural network (Convolutional Neural Networks, CNN) or another network, such as a You Only Look Once (YOLO) model, which is not limited by the present application. Alternatively, the target characters may be all characters in the initial image frame, characters specified by the user in the initial image frame, or characters of a certain class in the initial image frame, for example player characters only, so that non-player characters (NPCs) are not identified.
In another possible implementation, the target character region in the initial image frame may also be set manually by the user, who may select the target character by hand. For example, the target recognition device may receive user input containing the target character region of the target character, that is, the user may draw a box around a region to serve as the target character region; the boxed region may have any shape, which is not limited in this application. The user may instead input the position of the center point of the target character, such as the coordinates of the center point, and the target recognition device may determine the target character region around that center point according to preset size and shape information of the target character region. If there are a plurality of target characters in the initial image frame, the target character regions corresponding to the target characters may overlap. Alternatively, the markings of the target character regions of different target characters may differ; for example, different target characters may be marked with identification boxes of different colors to distinguish them.
Further, the target recognition device may select a first candidate character region set from the first image frame after determining the target character region in the initial image frame, and determine first position information of the target character in the first image frame according to the target character region and each candidate character region in the first candidate character region set.
S202, selecting a first candidate character area set from the first image frame.
In the embodiment of the present application, the first image frame is the frame following the initial image frame. It can be understood that, in order to capture the activity of the user operating the target character, the position information of the target character can be identified in every frame; thus the position of the target character is identified in the first image frame, the frame after the initial one. The first candidate character region set is a plurality of candidate regions for the target character. A candidate character region is a candidate for the region in which the target character is located; the target character may or may not actually be present in any given selected candidate character region. Because the candidate character regions need to be processed, a candidate character region may also be referred to as a "template". It should be understood that "candidate character region", "template", "target transfer template", and similar terms are merely the words used in this embodiment; their meanings have been described herein, and the names themselves are not intended to limit the embodiment in any way.
Referring to fig. 3 together, fig. 3 is an interface schematic diagram of a target recognition method according to an embodiment of the present application, where the interface may be a game interface of a certain MMORPG game, and the game interface may include character information, scene information, target character position information, other character information, dialogue information, and operation information, as shown in a game interface on the upper side of fig. 3. By way of example, the character information may be in the upper left corner of the game interface, the scene information may be a background image of the game interface, and game operation information included in the scene may be included in the background image, such as "move flowers, seawater (lowest 24 levels) |non-throwable rod", "fish in water", and "starfish, snapper, oyster, octopus, sea cucumber, globefish; the grade is not enough: sardine, flatfish silk, mountain sea destroy, grouper, and Ming Yuzi). The dialogue information included in the game interface may be announcement information at the bottom and middle of the game interface. The game interface may also include operation information, which may be a plurality of operation icons, each icon corresponding to a character's activity, such as an operation in the upper right corner of the game interface, or an operation control for operating a game character, as shown in the lower left corner of the game interface.
The target character position information included in the game interface may be a target character in the central position of fig. 3, and identification information for identifying information such as the name and the style of the target character is further included above the target character. At least one other role information may also be included in the game interface, which may include other roles played by other users, as well as NPCs. The game interface on the upper side of fig. 3 may be an initial image frame and the game interface on the lower side of fig. 3 may be the next frame of the initial image frame, i.e., the first image frame. After the user operates the target character to move, the target character moves leftward from the dotted line position to the solid line position as shown in the lower game interface of fig. 3. The game interface includes a target character area 31. Further, if continued identification of the target is desired, a target character area 32 may also be included in the first image frame. The target character area 32 may be one of a first set of candidate character areas selected in the first image frame.
In one possible implementation, the first candidate character region set may be selected from the first image frame by the target recognition device. Since the range of movement of the target character between two adjacent image frames is not large, a plurality of candidate character regions may be selected from around the target character as the first candidate character region set. For example, the target recognition device may determine the position of the center point of the target character region, determine the positions of candidate-region center points according to that center point and a distance threshold, and select N candidate character regions according to those center-point positions and the size and shape of the candidate character regions, N being an integer greater than 1; the N candidate character regions serve as the first candidate character region set. The distance between the center point of the target character region and the center point of each candidate character region is less than the distance threshold. Alternatively, the distance between the center point of the target character region and the center point of each candidate character region may be a fixed distance, which is not limited by the present application.
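The selection step just described might be sketched as follows. The grid sampling of candidate centers is an assumption made for illustration (the patent only requires candidate centers within the distance threshold), and all names are illustrative.

```python
import itertools

def select_candidate_regions(center, distance_threshold, step, region_size, n_max):
    """Return up to n_max candidate regions as (center, size) tuples whose
    centers lie strictly within distance_threshold of the target's center."""
    cx, cy = center
    candidates = []
    offsets = range(-distance_threshold, distance_threshold + 1, step)
    for dx, dy in itertools.product(offsets, offsets):
        if (dx, dy) != (0, 0) and (dx * dx + dy * dy) ** 0.5 < distance_threshold:
            candidates.append(((cx + dx, cy + dy), region_size))
    return candidates[:n_max]

# N candidate regions around a target centered at (320, 240).
first_candidate_set = select_candidate_regions(
    center=(320, 240), distance_threshold=40, step=10,
    region_size=(64, 96), n_max=32)
```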
In another possible implementation, the target recognition device may also determine candidate character areas based on the scene information, the activity of the target character, and the like. For example, the target recognition device may acquire activity information of the target character, such as a task performed by the target character, and acquire scene information, such as an active region and an active route of the target character, so as to determine candidate character regions according to a character performed by the target character, the active region of the target character, and the route of the target character. For example, if a task of a certain target character is to acquire a certain prop in a river, the target recognition device may select N candidate character areas with a distance from the center point of the target character area smaller than a distance threshold from the river area as candidate character areas of the target character. In still another example, if a certain target character needs to be transferred to another location from a certain transfer point, the target recognition apparatus may select N candidate character areas from the areas between the current location information of the target character and the location of the transfer point.
Referring to FIG. 4 together, FIG. 4 is another interface schematic diagram of a method for identifying a target according to an embodiment of the present application. As shown in the game interface in FIG. 4, the position of the target character in the initial image frame is the dotted-line position, and the position of the target character in the first image frame may be the solid-line position. A first candidate character region set surrounding the target character may be as shown by the gray filled area 41 in FIG. 4, and the first candidate character region set may include a plurality of candidate character regions, as shown by the multiple gray image areas in FIG. 4. The candidate character regions in FIG. 4 are only examples, and other candidate character regions may be included. In the first image frame, the target character may be contained in one of the candidate character regions.
In one possible implementation, the initial image frame further includes a reference character region, which is the region in the initial image frame where a reference character is located; the reference character is different from the target character and may be any character other than the target character in the game interface (e.g., in the initial image frame and the first image frame). Specifically, the target recognition device may determine the distance between the center point of the target character region and the center point of the reference character region and, if the distance is less than a distance threshold, select a third candidate character region set corresponding to the reference character from the first image frame. Selecting the first candidate character region set from the first image frame may then specifically include selecting from the first image frame a fourth candidate character region set corresponding to the target character and combining it with the third candidate character region set corresponding to the reference character; that is, the target recognition device may determine the first candidate character region set from the third and fourth candidate character region sets.
Optionally, the target recognition device may further determine a distance between the position of the center point of the target character and the position of the center point of the reference character, and determine that the target character is closer to the position of the reference character if the distance is less than the distance threshold. Alternatively, in the distance calculation, the positions of other points may be selected for calculation, for example, the lowest point or the highest point of the game character, which is not limited in the present application.
In the case where the reference character is closer to the target character, the candidate character areas of other characters (i.e., the reference character) in the vicinity of the target character may be used as candidate character areas of the target character in order to better recognize the target character. That is, the third candidate character region set corresponding to the reference character includes a plurality of candidate character regions of the reference character, the fourth candidate character region set includes a plurality of candidate character regions of the target character, and the candidate character regions included in the third candidate character region set and the fourth candidate character region set are both candidate character regions in the first candidate character region set. Alternatively, the first candidate character region set may be a plurality of candidate character regions selected from the third candidate character region set and the fourth candidate character region set, for example, a candidate character region selected to be less than a set distance threshold from the target character.
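A minimal sketch of the pooling rule described above, with illustrative names; regions are assumed to be the (center, size) tuples used in the candidate-selection sketch earlier.

```python
def build_first_candidate_set(target_center, reference_center,
                              distance_threshold, fourth_set, third_set):
    """If the reference character's center is within distance_threshold of the
    target character's center, pool the reference character's candidate
    regions (third set) with the target's own (fourth set)."""
    dx = target_center[0] - reference_center[0]
    dy = target_center[1] - reference_center[1]
    if (dx * dx + dy * dy) ** 0.5 < distance_threshold:
        return fourth_set + third_set
    return fourth_set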
Referring to fig. 5 together, fig. 5 is a schematic diagram of another interface of a method for identifying a target according to an embodiment of the present application, where the game interface on the upper side of fig. 5 further includes a reference character beside the target character, as shown by the character filled with black lines in fig. 5. As shown in the lower game interface of fig. 5, the reference character region may be a gray-filled region 51, the target character region may be a gray region, the target recognition device may determine a distance between a center point of the reference character region and a center point of the target character region, and in the case where the target recognition device determines that the distance is less than the distance threshold, a third candidate character region set corresponding to the reference character may be selected from the first image frame, and the candidate character region included in the third candidate character region set may be a diagonal-filled region 53 in the lower game interface of fig. 5. The target recognition device may use the candidate character region of the reference character as a candidate character region in the first candidate character region set together when selecting the candidate character region of the target character in the first image frame, i.e., the dark gray filled region 52 and the diagonal filled region 53 as the candidate character region in the first candidate character region set together.
Further, after the target recognition device determines the target character region and the first candidate character region set, the target recognition device may recognize the position of the target character in the first image frame, specifically may determine a phase difference of each candidate character region in the target character region and the first candidate character region set in the laplace domain, and determine the first position information of the target character in the first image frame according to the minimum phase difference, the position information of the target character region, and the position information of the candidate character region corresponding to the minimum phase difference.
S203, determining the phase difference, in the Laplace domain, between the target character region and each candidate character region in the first candidate character region set, to obtain a plurality of phase differences.
In the embodiment of the application, converting an image in the image domain into transformation information in the Laplace domain has the property of "translation invariance"; that is, although the process of converting an image into Laplace-domain transformation information can be called a "translation", the information contained before and after the Laplace transform is unchanged. The target recognition device can therefore determine phase information in the Laplace domain from the target character region and each candidate character region, and determine the position information of the target character in the image domain from the change in phase. This makes it possible to identify the position of a moving object at a very high frame rate, that is, to identify the offset by which a moving object translates in the image domain within a small time window.
In one possible implementation, the target recognition device may perform Laplace transform processing on the pixels contained in the target character region and on the pixels contained in each candidate character region in the first candidate character region set, obtaining a target transformation matrix and a plurality of candidate transformation matrices. It may be appreciated that the image corresponding to the target character region and to each candidate character region in the first candidate character region set is subjected to the forward Laplace transform to obtain a transformation matrix: the target transformation matrix is the one obtained from the target character region, and the plurality of candidate transformation matrices are those obtained from each candidate character region in the first candidate character region set, one candidate character region corresponding to one candidate transformation matrix.
Further, the target recognition device may determine the plurality of phase differences from the elements in the target transformation matrix and the elements in each of the plurality of candidate transformation matrices. The Laplace transform may be applied to each pixel of an image to obtain transformation information, and the transformation information includes phase information. In the embodiment of the application, the process of applying the Laplace transform to an image to obtain transformation information and extracting phase information from it may be called the intelligent representation process; the transformation information may also include amplitude information. Optionally, the phase information extracted from the transformation information may be stored in a data table to obtain a lookup table, so that when phase differences are determined later, the stored phase information can be fetched from the lookup table and used directly. Establishing the lookup table stores the transformation information, specifically the phase information, of already-processed image regions; it can be read directly when needed, without performing the Laplace transform on a character every time, making the computation more efficient. Optionally, the phase information extracted for the reference character of the target character and for the candidate character regions of the reference character may also be stored in the lookup table to facilitate subsequent operations.
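The lookup-table idea can be sketched as below. The patent does not spell out its discrete 2-D Laplace implementation, so this sketch substitutes the phase of a 2-D FFT, a related integral transform, purely to make the caching pattern concrete; all names are illustrative.

```python
import numpy as np

_phase_lookup = {}  # region identifier -> cached M x N phase matrix

def region_phase(region_id, pixels):
    """Compute a region's phase matrix once and serve later requests from
    the lookup table, so no character is re-transformed on every query."""
    if region_id not in _phase_lookup:
        # Stand-in transform: phase of a 2-D FFT of the region's pixels.
        _phase_lookup[region_id] = np.angle(np.fft.fft2(pixels.astype(float)))
    return _phase_lookup[region_id]
```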
The phase difference, in the Laplace domain, between the target character region and each of the plurality of candidate character regions determined by the target recognition device may be as shown in formula 3:

\Delta\varphi(u, v) = \varphi_c(u, v) - \varphi_t(u, v)   (formula 3)

In formula 3, \Delta\varphi(u, v) indicates the phase difference, \varphi_c(u, v) represents the phase corresponding to the candidate character region, and \varphi_t(u, v) represents the phase corresponding to the target character region; the resulting phase difference is determined by the relation between the two sets of coordinates.

Here (u, v) may represent a two-dimensional coordinate point after the Laplace transform, which may also be understood as the coordinate information of a spectrum, and m and n represent offset information, that is, the offset distances of the target character along the x-axis and the y-axis in the image domain. The values of m and n may be determined by the distances between the center point of the target character region and the center point of each candidate character region in the first candidate character region set. For example, knowing the position information (e.g., two-dimensional coordinates) of the center point of the target character region and the position information (e.g., two-dimensional coordinates) of the center point of each candidate character region in the first candidate character region set, the offset distances of the target character along the x-axis and the y-axis can be determined from the two coordinates, i.e., the values of m and n are obtained.
Note that M and N represent the computation period of the Laplace transform, and their values are constant. Because the image function F(s) is an infinite, non-periodic function, if no computation period were set, the amount of computation for a single target character would be enormous, and the relationship between phase and offset information could not be determined on that basis. Here M and N may refer to the size M×N of the window over which the matrix convolution operation is performed during the Laplace transform processing, which can be understood as the size of the transformation matrix, that is, the transformation matrix contains M×N elements. Optionally, M and N may be scaled down from the size of the initial image frame (which is also the size of the first image frame); for example, M and N may be one sixth of the width and height of the initial image frame, respectively. Alternatively, in practical applications, M and N may be scaled down according to the area of the initial image frame, for example so that the area M×N is one sixth of the area of the initial image frame while the ratio of M to N follows the length and width of the initial image frame. M and N may also take other values; the application is not limited in this regard.
Specifically, the target recognition device determines the plurality of phase differences from the elements of the target transformation matrix and the elements of each candidate transformation matrix as follows: the sum of the absolute values of the differences between each first element in the target transformation matrix and the corresponding second element in a target candidate transformation matrix is computed to obtain a first phase difference, where a first element is any element of the target transformation matrix and the target candidate transformation matrix is any one of the plurality of candidate transformation matrices. The first phase differences between the target transformation matrix and each target candidate transformation matrix then form the plurality of phase differences. Each element of a transformation matrix may correspond to a pixel value of the image in the image domain, each element may include a phase, and the phase difference may be obtained from the sum of the differences between the elements of the target transformation matrix and those of each of the plurality of candidate transformation matrices.
The phase difference between the target character region and each of the plurality of candidate character regions determined by the target recognition device in the Laplace domain may be as shown in formula 4 and formula 5:

t_{(m,n)}[x, y] = t[x - m, y - n]   (formula 4)

In formula 4, for convenience of the subsequent formulation, the position information of the target character (whose initial coordinates are (x, y)) after an offset of m and n occurs is denoted t_{(m,n)}.

\Delta\varphi_{\min} = \min_{(m,n)} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} \left| \varphi_{(m,n)}(u, v) - \varphi_t(u, v) \right|   (formula 5)

In formula 5, \Delta\varphi_{\min} represents the minimum phase difference, and the minimum is taken over the absolute differences between the phase \varphi_{(m,n)}(u, v) in a candidate transformation matrix and the phase \varphi_t(u, v) in the target transformation matrix. Here (u, v) may represent the position of an element in a candidate transformation matrix. It can be understood that M and N are the dimensions of the transformation matrix, indicating that the transformation matrix contains M×N elements (since u and v start from 0, they run up to M−1 and N−1). Formula 5 may thus be read as the sum of the absolute values of the phase differences between each element of a given candidate transformation matrix (e.g., the target candidate transformation matrix) and the element at the corresponding position of the target transformation matrix, yielding the first phase difference. For example, if the first element of the target transformation matrix is the element at position (0, 0), the second element is the element at position (0, 0) of the target candidate transformation matrix. The expression in formula 5 may be referred to as an L1 norm.
It will be appreciated that as many phase differences can be calculated according to Equation 5 as there are candidate character areas in the first candidate character area set. Further, the target recognition apparatus may determine the minimum phase difference from the plurality of phase differences.
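To make the Equation 4/5 computation concrete, the following is a minimal sketch, assuming all regions share the same M×N shape and using the phase of the 2-D DFT as a stand-in for the patent's Laplace-domain transform (the patent's actual transform may differ); all function names are illustrative:

```python
import numpy as np

def phase_matrix(region: np.ndarray) -> np.ndarray:
    # Stand-in for the patent's M x N "Laplace transform": take the phase
    # of the 2-D DFT of the region; the patent's actual transform may differ.
    return np.angle(np.fft.fft2(region))

def min_phase_difference(target_region: np.ndarray,
                         candidate_regions: list) -> tuple:
    """Equation-5-style search: for each candidate, sum the absolute
    element-wise differences between its phase matrix and the target's
    (the L1 "first phase difference"), then keep the smallest sum."""
    phi_t = phase_matrix(target_region)            # target transformation matrix phase
    diffs = []
    for cand in candidate_regions:                 # all regions share the M x N shape
        phi_c = phase_matrix(cand)                 # candidate transformation matrix phase
        diffs.append(float(np.abs(phi_c - phi_t).sum()))
    best = int(np.argmin(diffs))                   # index of the minimum phase difference
    return best, diffs[best]
```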
S204, determining first position information of the target character in the first image frame according to the minimum phase difference in the plurality of phase differences, the center point of the target character area and the center point of the candidate character area corresponding to the minimum phase difference.
After calculating the plurality of phase differences, the target recognition device can determine the minimum phase difference. Because the Laplace transform concentrates the image energy in the edge and mid-frequency regions, the information of irrelevant regions such as the background is filtered out in the phase differences after the transform; in effect, such regions cancel out, which makes it easier to focus on the target character. Moreover, the smaller the phase difference, the more similar the candidate character region is to the target character region, and the more likely the candidate character region is the region containing the target character; the minimum phase difference is therefore determined from the plurality of phase differences. The target recognition device may then determine the position information of the target character in the target image frame based on the minimum phase difference. It should be noted that the values of m and n corresponding to the minimum phase difference are not necessarily the m and n used when enumerating the candidate character areas, because the relative offsets represented by the center points of the candidate character areas do not necessarily coincide with the center position of the target character; the m and n determined from the minimum phase difference can represent the relative offset of the target character's center point, so the identified position information of the target character is more accurate.
In one possible implementation, the target recognition device may determine the position information of the target character in the target image frame according to the minimum phase difference, the center point of the target character region, and the center point of the candidate character region corresponding to the minimum phase difference. Specifically, the target recognition device may determine the phase angle according to the minimum phase difference, and obtain radius information from the distance between the position of the center point of the target character area and the position of the center point of the candidate character area corresponding to the minimum phase difference. It may then determine the offset information of the target character according to the phase angle and the radius information, where the offset information includes offsets along the image coordinate axes. Finally, the target recognition device may determine the position information of the target character in the target image frame according to the offset information and the center point of the target character area.
The minimum phase difference is a phase, and this phase is the phase angle (i.e., its angular representation). In a polar coordinate system, the offsets along the image coordinate axes, i.e., the x-axis and y-axis directions of the two-dimensional image, can be obtained from the radius and the phase angle. The radius information may be the distance between the position of the center point of the target character area and the position of the center point of the candidate character area corresponding to the minimum phase difference; that is, since the coordinates of both center points are known, the radius can be determined by the Euclidean distance formula, and the radius included in the radius information is the distance between the two center points. Further, the target recognition apparatus may determine the offset information of the target character from the phase angle and the radius information; specifically, the offset information of the target character may be as shown in Equation 6 and Equation 7:

$$m = R\cos\theta \qquad \text{(Equation 6)}$$
$$n = R\sin\theta \qquad \text{(Equation 7)}$$

In Equations 6 and 7, R represents the radius, and $\theta$ represents the phase angle determined from the minimum phase difference of Equation 5. m represents the offset distance of the target character along the x-axis of the two-dimensional image, and n represents the offset distance along the y-axis. Further, the first position information of the target character may be as shown in Equation 8 and Equation 9:
$$x' = x + m \qquad \text{(Equation 8)}$$
$$y' = y + n \qquad \text{(Equation 9)}$$
In Equations 8 and 9, (x', y') represents the first position information of the target character, x and y represent the position of the target character in the initial image frame, m represents the offset distance along the x-axis of the two-dimensional image, and n represents the offset distance along the y-axis. If the first image frame is taken as frame t and the initial image frame as frame t−1, the first position information is obtained by adding the offset distances of the target character at frame t to the coordinates x and y of frame t−1, respectively.
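Putting Equations 6 through 9 together, the following is a minimal sketch of the position update, assuming the phase angle has already been extracted from the minimum phase difference; the function name is illustrative:

```python
import math

def update_position(x: float, y: float,
                    target_center: tuple,
                    best_center: tuple,
                    phase_angle: float) -> tuple:
    """Sketch of Equations 6-9: radius from the Euclidean distance between
    the two center points, offsets m and n from the polar decomposition,
    and the new position (x', y') from the old position plus the offsets."""
    dx = best_center[0] - target_center[0]
    dy = best_center[1] - target_center[1]
    r = math.hypot(dx, dy)                 # radius information (Euclidean distance)
    m = r * math.cos(phase_angle)          # Equation 6: offset along the x axis
    n = r * math.sin(phase_angle)          # Equation 7: offset along the y axis
    return x + m, y + n                    # Equations 8 and 9
```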
In one possible implementation, the target recognition device may continuously recognize the position of the target character in each image frame and store the position information of the target character for each frame, i.e., recognize the target character with a memory. The target recognition apparatus may determine the moving track of the target character based on the position information (e.g., coordinates) recognized in the respective image frames. Optionally, the position of the target character in the next frame can be predicted from its moving track, i.e., the position information in the next frame is determined from the moving distance and direction along the track. Further, when the target character disappears, the target recognition device may record the disappearance. Optionally, if the minimum phase difference is greater than a preset phase-difference threshold, the target character is determined to have disappeared.
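The following is a minimal sketch of such a memory, assuming a linear extrapolation of the last displacement for prediction and the phase-difference threshold for disappearance; the class and its behavior are illustrative assumptions, not the patent's method:

```python
from typing import List, Optional, Tuple

class TrackMemory:
    """Hypothetical per-character memory of positions across frames."""

    def __init__(self, vanish_threshold: float):
        self.positions: List[Tuple[float, float]] = []
        self.vanish_threshold = vanish_threshold
        self.disappeared = False

    def record(self, position: Tuple[float, float], min_phase_diff: float) -> None:
        # A minimum phase difference above the preset threshold is read as
        # the target character having disappeared, as described above.
        if min_phase_diff > self.vanish_threshold:
            self.disappeared = True
            return
        self.positions.append(position)

    def predict_next(self) -> Optional[Tuple[float, float]]:
        # Extrapolate the last displacement along the moving track.
        if len(self.positions) < 2:
            return None
        (x0, y0), (x1, y1) = self.positions[-2], self.positions[-1]
        return (x1 + (x1 - x0), y1 + (y1 - y0))
```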
Optionally, the target recognition device may also use the coordinate information of the target character to determine information such as the historical win rate or death coordinates of the player controlling the target character. The target character can be recognized by the target recognition device at the same time, and when the target character disappears, the marker frame used to mark the target character area of the target character disappears as well.
In one possible implementation, after determining the first position information of the target character, the target recognition device may also verify whether the first position information is valid. Specifically, the target recognition device may determine the image representation information of the target character region, and perform inverse Laplace transform processing on the sum of the minimum phase difference among the plurality of phase differences and a target phase to obtain an inverse-transform image, where the target phase is the phase obtained by performing Laplace transform processing on the target character region. It may further determine the image representation information of the inverse-transform image, determine a similarity from the image representation information of the target character region and that of the inverse-transform image, and determine that the first position information in the target image frame is valid position information when the similarity reaches a similarity threshold.
Here the object recognition apparatus compares the image obtained by inverse-transforming the candidate character region with the image of the target character region. This is because the image of the target character itself changes little between adjacent frames before and after the character moves, so if the similarity reaches the similarity threshold, the recognized first position information is determined to be valid. Of the two compared images, one may be the inverse transform based on the minimum phase difference, and the other the original target character region. For example, if the position of the target character in the initial image frame is (x, y) and the position identified by the target identification device is (x', y'), the image-domain images obtained by inverse-transforming, from the Laplace domain, the transform information of the target character region and of the candidate character region corresponding to the minimum phase difference may be as shown in Equation 10 and Equation 11:
$$c_t[x,y] = \mathcal{L}^{-1}\bigl\{\varphi_t[u,v]\bigr\} \qquad \text{(Equation 10)}$$
$$c_{t'}[x,y] = \mathcal{L}^{-1}\bigl\{\varphi_t[u,v] + \Delta\varphi_{\min}[u,v]\bigr\} \qquad \text{(Equation 11)}$$

In Equation 10, M and N are the Laplace-transform calculation periods (the inverse transform is taken over the M×N window), $\varphi_t[u,v]$ represents the phase corresponding to the target character region, and the equation represents the image obtained by inverse-transforming the target character region from the Laplace domain. In Equation 11, $\Delta\varphi_{\min}[u,v]$ is the element-wise phase difference corresponding to the minimum phase difference of Equation 5, and the equation represents the image obtained by inverse-transforming, from the Laplace domain, the candidate character region corresponding to the minimum phase difference. As can be seen from Equations 10 and 11, the two differ exactly by the phase difference $\Delta\varphi_{\min}$; from this it can be seen that the minimum phase difference calculated by Equation 5 can determine the offset distance of the target character.
In the specific verification process, the target recognition device may perform the inverse Laplace transform processing on the sum of the minimum phase difference and the target phase, i.e., the sum of the minimum phase difference and the phase obtained by Laplace-transforming the target character region, to obtain an inverse-transform image in the image domain; the sum of the two phases may be referred to as the information generated by phase minimization. The target recognition device may then determine the image representation information of the inverse-transform image, where image representation information refers to vectorizing the image to obtain its image vector. Further, the target recognition device may determine the image representation information of the target character region, which may also be referred to as convolution information, i.e., vectorize the target character region to obtain its image vector, where image vectorization expresses an image in vector form; the similarity of the two image vectors is then calculated. The similarity of the image vectors serves as an image quality index, so this verification method performs verification based on a pixel-level image quality index.
Optionally, the similarity of the vectors may be calculated as the cosine similarity of the two vectors, and the first position information is determined to be valid position information when the similarity reaches the similarity threshold. Optionally, the similarity threshold may be 0.9 or 1, which is not limited by the present application.
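The following is a minimal sketch of this verification step, assuming the element-wise phase-difference matrix of the best candidate is available, assuming unit magnitude in the inverse transform (only the phase is retained), and again using the 2-D inverse DFT as a stand-in for the patent's inverse Laplace transform; all names are illustrative:

```python
import numpy as np

def verify_position(target_region: np.ndarray,
                    target_phase: np.ndarray,
                    phase_diff: np.ndarray,
                    threshold: float = 0.9) -> bool:
    """Sketch of the validity check: inverse-transform the sum of the target
    phase and the (element-wise) minimum phase difference, vectorize both
    images, and compare their cosine similarity against the threshold."""
    # Unit magnitude is assumed, since only the phase is retained; the 2-D
    # inverse DFT stands in for the patent's inverse Laplace transform.
    recovered = np.real(np.fft.ifft2(np.exp(1j * (target_phase + phase_diff))))
    a = target_region.astype(float).ravel()    # image vector of the target region
    b = recovered.ravel()                      # image vector of the inverse image
    cos_sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cos_sim >= threshold
```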
Further, when the first position information in the target image frame is valid position information, the target recognition device selects a plurality of candidate character areas from a second image frame, where the second image frame is the frame following the target image frame, and then determines second position information of the target character in the second image frame according to the target character area and each of the candidate character areas selected from the second image frame. In other words, if the identified first position information is valid, the target character can continue to be recognized; this mode is carried on in the next image frame and the one after it.
In one possible implementation, when the similarity is lower than the similarity threshold, the target recognition device determines that the first position information is invalid position information and selects a second candidate character region set from the first image frame, where each candidate character region in the second candidate character region set is different from each candidate character region in the first candidate character region set. The target recognition device may then perform the operation of determining the phase difference, in the Laplace domain, between the target character region and each candidate character region in the second candidate character region set, until the first position information of the target character in the first image frame is obtained. That is, if the target recognition device determines that the similarity is below the similarity threshold, the recognition result for the target character has a large error, so a fresh set of candidate character regions can be selected (the second candidate character region set) and the newly selected candidate character regions recognized; the minimum phase difference is determined through the steps above and validity verification is performed again, until valid first position information is obtained.
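A minimal driver for this retry loop might look as follows, assuming `select_candidates`, `compute_position`, and `is_valid` are hypothetical helpers along the lines of the sketches above (candidate selection must return regions disjoint from those already tried); none of these names come from the patent:

```python
def locate_target(target_region, frame, select_candidates, max_rounds=5):
    """Hypothetical driver for the retry loop: if verification fails, draw a
    fresh, disjoint candidate set from the same frame and search again."""
    tried = []
    for _ in range(max_rounds):
        candidates = select_candidates(frame, exclude=tried)   # disjoint set
        if not candidates:
            return None                      # no valid position found
        best, _ = min_phase_difference(target_region, candidates)
        position = compute_position(target_region, candidates[best])  # Eqs. 6-9
        if is_valid(target_region, candidates[best]):          # verification step
            return position
        tried.extend(candidates)             # force re-selection next round
    return None
```

Once a frame yields valid position information, the same driver can simply be run on the following frame, matching the frame-by-frame continuation described above.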
Referring to fig. 6, fig. 6 is a timing diagram of a target recognition method according to an embodiment of the present application. As shown in fig. 6, the pixel value of a pixel point in the image domain is denoted c(x, y) (e.g., for the image of the target character region); the image containing these pixel points is Laplace-transformed to obtain t[u, v], the phase information is extracted, and the minimum phase difference is determined. It will be understood that the relationship between the phase and the pixels in the image domain can be recovered in either direction through the inverse Laplace transform processing. The position information of the target character (e.g., the first position information) is then determined from the minimum phase difference, and the recognition validity of the position information is verified; if it is valid, the target character continues to be recognized in this cyclically repeating manner. If the position information is invalid, the target character is recognized again, which amounts to executing the re-acquisition mode: the candidate character areas are re-selected (e.g., the candidate character areas included in the second candidate character region set), the Laplace transform information t[u, v] is calculated, the phase information is extracted, the minimum phase difference is determined, the position information of the target character is determined from it, and the recognition validity verification is performed, going back and forth in this way. The method performs well in recognizing game characters in MMORPG games, can reach an accuracy above 90% in recognizing target characters, and has a good practical effect in game automation.
It should be noted that the same symbolic expression may appear in different formulas or drawings with different meanings. For example, in Equation 3 the phase symbol denotes the phase corresponding to the candidate character area, whereas elsewhere it denotes the phase in the target transformation matrix, and the transform information of the candidate character area includes its phase information. As another example, in Equation 5 the phase symbol denotes the phase in the candidate transformation matrix, while in fig. 6 it denotes the phase of the target character region obtained by the Laplace transform processing.
In the embodiment of the application, the target character area where the target in the initial image frame is located is determined, and a first candidate character area set is selected from the frame following the initial image frame. The phase difference, in the Laplace domain, between the target character area and each candidate character area in the first candidate character area set is then determined, yielding a plurality of phase differences, and the first position information of the target character in the first image frame is determined from the minimum phase difference, the center point of the target character area, and the center point of the candidate character area corresponding to the minimum phase difference. In this way, the image information of the target character in the initial image frame and the image information of the candidate character areas in the next frame are converted into the Laplace domain for computation; that is, the translation of the target in the image domain is converted into a phase change in the Laplace domain, so that the position information of the target in the next frame can be determined, making target identification more accurate and improving the effect of target identification.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an object recognition device according to an embodiment of the present application, where the object recognition device 70 according to an embodiment of the present application may be disposed on an electronic apparatus, and the electronic apparatus may be the object recognition apparatus in fig. 1. The object recognition device 70 includes the following elements:
a determining unit 701 configured to determine a target character area in an initial image frame, where the target character area is an area where a target character is located in the initial image frame;
a selecting unit 702, configured to select a first candidate character region set from a first image frame, where the first image frame is a frame next to the initial image frame;
the determining unit 701 is further configured to determine a phase difference between the target character area and each candidate character area in the first candidate character area set in a laplace domain, so as to obtain a plurality of phase differences;
the determining unit 701 is further configured to determine first position information of the target character in the first image frame based on a minimum phase difference among the plurality of phase differences, a center point of the target character area, and a center point of a candidate character area corresponding to the minimum phase difference.
In one implementation, the determining unit 701 is further configured to determine image representation information of the target character area;
a processing unit 703 configured to perform a laplace inverse transform on a sum of a minimum phase difference among the plurality of phase differences and a target phase, the target phase being a phase obtained by performing the laplace transform on the target character region, to obtain an inverse transform image;
the determining unit 701 is further configured to determine image representation information of the inverse transformed image;
the determining unit 701 is further configured to determine a similarity based on the image representation information of the target character region and the image representation information of the inverse transformed image;
the determining unit 701 is further configured to determine that the first location information in the target image frame is valid location information when the similarity reaches a similarity threshold.
In one implementation, the selecting unit 702 is further configured to select a plurality of candidate character areas from a second image frame, where the second image frame is a frame next to the target image frame, when the first position information in the target image frame is valid position information;
The determining unit 701 is further configured to determine second position information of the target character in the second image frame according to the target character region and each candidate character region of a plurality of candidate character regions selected from the second image frame.
In one implementation manner, the determining unit 701 is further configured to determine that the first location information is invalid location information when the similarity is lower than the similarity threshold;
the selecting unit 702 is further configured to select a second candidate character region set from the first image frame; each candidate character region in the second set of candidate character regions is different from each candidate character region in the first set of candidate character regions;
the determining unit 701 is further configured to perform an operation of determining a phase difference in the laplace domain for each candidate character region in the target character region and the second candidate character region set until first position information of the target character in the first image frame is obtained.
In one implementation manner, the determining unit 701 is configured to determine a phase difference between the target character area and each of the plurality of candidate character areas in the laplace domain, to obtain a plurality of phase differences, and specifically is configured to:
Carrying out Laplacian transformation on pixels included in the target character region and pixels included in each candidate character region in the first candidate character region set to obtain a target transformation matrix and a plurality of candidate transformation matrices;
the plurality of phase differences are determined based on the respective elements in the target transformation matrix and the respective elements in each of the plurality of candidate transformation matrices.
In one implementation manner, the determining unit 701 is configured to determine the plurality of phase differences according to each element in the target transformation matrix and each element in each candidate transformation matrix in the plurality of candidate transformation matrices, and is specifically configured to:
determining the sum of the absolute values of the differences between a first element in the target transformation matrix and a second element, in a target candidate transformation matrix, corresponding to the first element, to obtain a first phase difference, wherein the first element is any element in the target transformation matrix, and the target candidate transformation matrix is any candidate transformation matrix in the plurality of candidate transformation matrices;
and determining a first phase difference between the target transformation matrix and each of the plurality of candidate transformation matrices as the plurality of phase differences.
In one implementation manner, the determining unit 701 is configured to determine, based on the minimum phase difference, a center point of the target character area, and a center point of a candidate character area corresponding to the minimum phase difference, position information of the target character in the target image frame, specifically configured to:
determining a phase angle according to the minimum phase difference;
obtaining radius information according to the distance between the position information of the center point of the target character area and the position information of the center point of the candidate character area corresponding to the minimum phase difference;
determining offset information of the target character according to the phase angle and the radius information, wherein the offset information comprises offset information in the direction of an image coordinate axis;
and determining position information of the target character in the target image frame according to the offset information and the center point of the target character area.
In one implementation manner, the initial image frame further includes a reference character area, where the reference character area is an area where a reference character in the initial image frame is located, and the reference character is different from the target character;
the determining unit 701 is further configured to determine a distance between a center point of the target character area and a center point of a reference character area of the reference character;
The selecting unit 702 is further configured to select a third candidate character region set corresponding to the reference character from the first image frame if the distance is less than a distance threshold;
the selecting unit 702 is configured to select the first candidate character area set from the first image frame, specifically:
selecting a fourth candidate character region set corresponding to the target character from the first image frame;
and determining a first candidate character area set according to the third candidate character area set and the fourth candidate character area set.
According to one embodiment of the application, the steps involved in the method shown in fig. 2 may be performed by the units in the object recognition device shown in fig. 7. For example, step S201 shown in fig. 2 is performed by the determining unit 701 shown in fig. 7, and step S202 by the selecting unit 702 shown in fig. 7; as another example, steps S203 and S204 shown in fig. 2 are performed by the determining unit 701 shown in fig. 7.
According to another embodiment of the present application, the units in the object recognition device shown in fig. 7 may be separately or completely combined into one or several other units, or some unit(s) thereof may be further split into multiple functionally smaller units, which can achieve the same operation without affecting the technical effects of the embodiments of the present application. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of the present application, the object recognition device may also include other units, and in practical applications these functions may be implemented with the assistance of other units and through the cooperation of multiple units.
In the embodiment of the application, the target character area where the target in the initial image frame is located is determined, and a first candidate character area set is selected from the frame following the initial image frame. The phase difference, in the Laplace domain, between the target character area and each candidate character area in the first candidate character area set is then determined, yielding a plurality of phase differences, and the first position information of the target character in the first image frame is determined from the minimum phase difference, the center point of the target character area, and the center point of the candidate character area corresponding to the minimum phase difference. In this way, the image information of the target character in the initial image frame and the image information of the candidate character areas in the next frame are converted into the Laplace domain for computation; that is, the translation of the target in the image domain is converted into a phase change in the Laplace domain, so that the position information of the target in the next frame can be determined, making target identification more accurate and improving the effect of target identification.
Based on the above description of the embodiments of the target recognition method, the embodiment of the present application also discloses an electronic device. Referring to fig. 8, the electronic device may at least include a processor 801, a communication interface 802, and a computer storage medium 803, which may be connected within the electronic device by a bus or in other ways.
The computer storage medium 803 is a memory device in the electronic device for storing programs and data. It is understood that the computer storage medium 803 here may include a built-in storage medium of the electronic device, and of course may also include an extended storage medium supported by the electronic device. The computer storage medium 803 provides storage space that stores the operating system of the electronic device. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 801. Note that the computer storage medium here may be a high-speed RAM memory; optionally, it may also be at least one computer storage medium remote from the foregoing processor. The foregoing processor may be referred to as a central processing unit (Central Processing Unit, CPU); it is the core and control center of the electronic device, and is adapted to load and execute the one or more instructions so as to implement the corresponding method flow or function.
In one implementation, the processor 801 may load and execute one or more first instructions stored in the computer storage medium to implement the corresponding steps of the method described above in connection with the target recognition method embodiment; in a particular implementation, one or more first instructions in the computer storage medium are loaded by the processor 801 to perform the following steps:
Determining a target character area in an initial image frame, wherein the target character area is an area where a target character is located in the initial image frame;
selecting a first candidate character region set from a first image frame, wherein the first image frame is the next frame of the initial image frame;
determining phase differences of each candidate character region in the target character region and the first candidate character region set in the Laplace domain to obtain a plurality of phase differences;
and determining first position information of the target character in the first image frame according to a minimum phase difference among the plurality of phase differences, a center point of the target character region, and a center point of a candidate character region corresponding to the minimum phase difference.
In one implementation, the processor 801 loads and executes one or more first instructions stored in the computer storage medium, and is further configured to perform the following steps:
determining image representation information of the target character area;
performing a laplace inverse transform on a sum of a minimum phase difference and a target phase among the plurality of phase differences to obtain an inverse transform image, wherein the target phase is a phase obtained by performing the laplace transform on the target character region;
Determining image representation information of the inverse transformed image;
determining a similarity based on the image representation information of the target character region and the image representation information of the inverse transformed image;
and determining the first position information in the target image frame as effective position information when the similarity reaches a similarity threshold.
In one implementation, the processor 801 loads and executes one or more first instructions stored in the computer storage medium, and is further configured to perform the following steps:
selecting a plurality of candidate character areas from a second image frame, which is a next frame of the target image frame, in a case where the first position information in the target image frame is valid position information;
and determining second position information of the target character in the second image frame according to the target character region and each candidate character region in a plurality of candidate character regions selected from the second image frame.
In one implementation, the processor 801 loads and executes one or more first instructions stored in the computer storage medium, and is further configured to perform the following steps:
determining that the first position information is invalid position information when the similarity is lower than the similarity threshold;
Selecting a second candidate character region set from the first image frame; each candidate character region in the second set of candidate character regions is different from each candidate character region in the first set of candidate character regions;
and executing the operation of determining the phase difference of each candidate character area in the Laplacian domain in the target character area and the second candidate character area set until the first position information of the target character in the first image frame is obtained.
In one implementation, the processor 801 loads and executes one or more first instructions stored in a computer storage medium to determine a phase difference between the target character region and each of the plurality of candidate character regions in the laplace domain, to obtain a plurality of phase differences, specifically for:
carrying out Laplacian transformation on pixels included in the target character region and pixels included in each candidate character region in the first candidate character region set to obtain a target transformation matrix and a plurality of candidate transformation matrices;
the plurality of phase differences are determined based on the respective elements in the target transformation matrix and the respective elements in each of the plurality of candidate transformation matrices.
In one implementation, the processor 801 loads and executes one or more first instructions stored in a computer storage medium to determine the plurality of phase differences according to the respective elements of the target transformation matrix and the respective elements of each of the plurality of candidate transformation matrices, and is specifically configured to:
determining the sum of the absolute values of the differences between a first element in the target transformation matrix and a second element, in a target candidate transformation matrix, corresponding to the first element, to obtain a first phase difference, wherein the first element is any element in the target transformation matrix, and the target candidate transformation matrix is any candidate transformation matrix in the plurality of candidate transformation matrices;
and determining a first phase difference between the target transformation matrix and each of the plurality of candidate transformation matrices as the plurality of phase differences.
In one implementation, the processor 801 loads and executes one or more first instructions stored in a computer storage medium to determine the location information of the target character in the target image frame according to the minimum phase difference, the center point of the target character area, and the center point of the candidate character area corresponding to the minimum phase difference, specifically:
Determining a phase angle according to the minimum phase difference;
obtaining radius information according to the distance between the position information of the center point of the target character area and the position information of the center point of the candidate character area corresponding to the minimum phase difference;
determining offset information of the target character according to the phase angle and the radius information, wherein the offset information comprises offset information in the direction of an image coordinate axis;
and determining position information of the target character in the target image frame according to the offset information and the center point of the target character area.
In one implementation manner, the initial image frame further includes a reference character area, where the reference character area is an area where a reference character in the initial image frame is located, and the reference character is different from the target character; the processor 801 loads and executes one or more first instructions stored in the computer storage medium to further perform the steps of:
determining a distance between a center point of the target character region and a center point of a reference character region of the reference character;
selecting a third candidate character region set corresponding to the reference character from the first image frame when the distance is smaller than a distance threshold;
The selecting the first candidate character region set from the first image frame includes:
selecting a fourth candidate character region set corresponding to the target character from the first image frame;
and determining a first candidate character area set according to the third candidate character area set and the fourth candidate character area set.
The specific implementation of each step executed by the processor 801 in the embodiment of the present application may refer to the description of the related content in the foregoing embodiment, and the same technical effects may be achieved, which is not repeated herein.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and a processor runs the computer program to enable the electronic device to execute the method provided by the previous embodiment.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the electronic device performs the method provided by the foregoing embodiment.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
Those skilled in the art will appreciate that the processes implementing all or part of the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, and the program may be stored in a computer readable storage medium, and the program may include the processes of the embodiments of the methods as above when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random-access Memory (Random Access Memory, RAM), or the like.
The above disclosure is only a preferred embodiment of the present application, and it should be understood that the scope of the application is not limited thereto; those skilled in the art will appreciate that equivalent changes of all or part of the procedures described above, made according to the claims, still fall within the scope of the present application.

Claims (11)

1. A method of object recognition, comprising:
determining a target character area in an initial image frame, wherein the target character area is an area where a target character is located in the initial image frame;
Selecting a first candidate character region set from a first image frame, wherein the first image frame is the next frame of the initial image frame;
determining phase differences of each candidate role area in the target role area and the first candidate role area set in the Laplace domain to obtain a plurality of phase differences;
and determining first position information of the target character in the first image frame according to the minimum phase difference in the plurality of phase differences, the center point of the target character area and the center point of the candidate character area corresponding to the minimum phase difference.
2. The method according to claim 1, wherein the method further comprises:
determining image representation information of the target character area;
performing Laplace inverse transformation on the sum of the minimum phase difference and the target phase in the plurality of phase differences to obtain an inverse transformation image, wherein the target phase is a phase obtained by performing Laplace transformation on the target character region;
determining image representation information of the inverse transformed image;
determining a similarity according to the image representation information of the target character region and the image representation information of the inverse transformation image;
And determining the first position information in the target image frame as effective position information under the condition that the similarity reaches a similarity threshold value.
3. The method according to claim 2, wherein the method further comprises:
selecting a plurality of candidate character areas from a second image frame, which is a next frame of the target image frame, in the case that the first position information in the target image frame is valid position information;
and determining second position information of the target character in the second image frame according to the target character area and each candidate character area in a plurality of candidate character areas selected from the second image frame.
4. The method according to claim 2, wherein the method further comprises:
determining that the first position information is invalid position information under the condition that the similarity is lower than the similarity threshold value;
selecting a second candidate character region set from the first image frame; each candidate character region in the second set of candidate character regions is different from each candidate character region in the first set of candidate character regions;
And executing the operation of determining the phase difference of each candidate character area in the target character area and the second candidate character area set in the Laplace domain until the first position information of the target character in the first image frame is obtained.
5. The method of claim 1, wherein the determining the phase difference in the laplace domain for each of the target character region and the plurality of candidate character regions results in a plurality of phase differences, comprising:
carrying out Laplacian transformation on pixels included in the target role region and pixels included in each candidate role region in the first candidate role region set to obtain a target transformation matrix and a plurality of candidate transformation matrices;
the plurality of phase differences is determined based on the respective elements in the target transformation matrix and the respective elements in each of the plurality of candidate transformation matrices.
6. The method of claim 5, wherein determining the plurality of phase differences from the respective elements in the target transformation matrix and the respective elements in each of the plurality of candidate transformation matrices comprises:
Determining the sum of the absolute values of the differences between a first element in the target transformation matrix and a second element, in a target candidate transformation matrix, corresponding to the first element, to obtain a first phase difference, wherein the first element is any element in the target transformation matrix, and the target candidate transformation matrix is any candidate transformation matrix in the plurality of candidate transformation matrices;
and determining a first phase difference between the target transformation matrix and each target candidate transformation matrix of the plurality of candidate transformation matrices as the plurality of phase differences.
7. The method according to any one of claims 1 to 6, wherein the determining the position information of the target character in the target image frame according to the minimum phase difference, the center point of the target character area, and the center point of the candidate character area corresponding to the minimum phase difference includes:
determining a phase angle from the minimum phase difference;
obtaining radius information according to the distance between the position information of the central point of the target character area and the position information of the central point of the candidate character area corresponding to the minimum phase difference;
determining offset information of the target character according to the phase angle and the radius information, wherein the offset information comprises offset information in the direction of an image coordinate axis;
And determining the position information of the target character in the target image frame according to the offset information and the center point of the target character area.
8. The method according to any one of claims 1-6, wherein the initial image frame further includes a reference character area, the reference character area being an area in which a reference character is located in the initial image frame, the reference character being different from the target character; the method further comprises the steps of:
determining a distance between a center point of the target character region and a center point of a reference character region of the reference character;
selecting a third candidate character region set corresponding to the reference character from the first image frame under the condition that the distance is smaller than a distance threshold;
the selecting a first candidate character region set from the first image frame includes:
selecting a fourth candidate character region set corresponding to the target character from the first image frame;
and determining a first candidate character area set according to the third candidate character area set and the fourth candidate character area set.
9. An object recognition apparatus, comprising:
a determining unit configured to determine a target character area in an initial image frame, the target character area being an area in which a target character is located in the initial image frame;
A selecting unit, configured to select a first candidate character region set from a first image frame, where the first image frame is a frame next to the initial image frame;
the determining unit is further configured to determine a phase difference of each candidate character region in the target character region and the first candidate character region set in the laplace domain, so as to obtain a plurality of phase differences;
the determining unit is further configured to determine first position information of the target character in the first image frame according to a minimum phase difference among the plurality of phase differences, a center point of the target character region, and a center point of a candidate character region corresponding to the minimum phase difference.
10. An electronic device comprising a processor, a communication interface and a memory, the processor, the communication interface and the memory being interconnected, wherein the memory stores executable program code, the processor being configured to invoke the executable program code to perform the method of any of claims 1-8.
11. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, performs the method of any of claims 1-8.
CN202211530911.6A 2022-12-01 2022-12-01 Target identification method, device, equipment and storage medium Pending CN116977404A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211530911.6A CN116977404A (en) 2022-12-01 2022-12-01 Target identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116977404A true CN116977404A (en) 2023-10-31



Legal Events

Date Code Title Description
PB01 Publication