CN111522988A - Image positioning model obtaining method and related device - Google Patents

Image positioning model obtaining method and related device

Info

Publication number
CN111522988A
CN111522988A (application number CN202010478436.7A; granted publication CN111522988B)
Authority
CN
China
Prior art keywords
image
positioning
loss function
model
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010478436.7A
Other languages
Chinese (zh)
Other versions
CN111522988B (en)
Inventor
葛艺潇
朱烽
王海波
赵瑞
李鸿升
Current Assignee
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202010478436.7A priority Critical patent/CN111522988B/en
Publication of CN111522988A publication Critical patent/CN111522988A/en
Priority to PCT/CN2020/113099 priority patent/WO2021237973A1/en
Priority to TW110100513A priority patent/TWI780563B/en
Application granted granted Critical
Publication of CN111522988B publication Critical patent/CN111522988B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Processing (AREA)

Abstract

The embodiments of the present application provide an image positioning model obtaining method and a related device. The method comprises: determining the similarity between a target image and K first sample positioning images according to a first image positioning model to obtain a first similarity vector, where K is an integer greater than 1; determining a first target loss function according to the first similarity vector; and adjusting an initial model according to the first target loss function to obtain a second image positioning model, where the initial model is obtained by initializing the first image positioning model. The method can improve the accuracy of the positioning model in positioning images.

Description

Image positioning model obtaining method and related device
Technical Field
The application relates to the technical field of data processing, and in particular to an image positioning model obtaining method and a related device.
Background
Image positioning technology aims to match, in a large-scale database, the reference image most similar to a target image and to take the GPS location tagged on that reference image as the geographic position of the target image. Image positioning is currently realized mainly by three methods: image retrieval, 3D structure matching, and classification by geographic position.
At present, to avoid being misled by false positive samples (image sample pairs whose GPS locations are close but whose pictures do not overlap), only the single best-matching sample is selected as the positive sample for training. However, learning from only the best-matching sample cannot produce a network that is robust to varying viewing angles, lighting, and other conditions, so the accuracy of the trained network model is low when positioning images.
Disclosure of Invention
The embodiment of the application provides an image positioning model obtaining method and a related device, which can improve the accuracy of a positioning model in positioning an image.
A first aspect of an embodiment of the present application provides an image localization model obtaining method, including:
determining the similarity between a target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector, wherein K is an integer greater than 1;
determining a first target loss function according to the first similarity vector;
and adjusting the initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained by initializing the first image positioning model.
In this example, the similarities between the target image and the K first sample positioning images are determined by the first image positioning model to obtain a first similarity vector, the first target loss function is determined according to that similarity vector, and the initial model is adjusted according to the first target loss function to obtain the second image positioning model. The initial model thus undergoes similarity-supervised learning under a loss function determined from the first image positioning model, the target image, and the K first sample positioning images, which improves the accuracy of the second image positioning model in image positioning.
With reference to the first aspect, in a possible implementation manner, determining similarities between a target image and K first sample positioning images according to a first image positioning model to obtain a first similarity vector includes:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of a target image according to the first image positioning model;
and determining a first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
In this example, each of the K first sample positioning images is split into N sub first sample positioning images, and the first similarity vector is determined from the feature values of the K × N sub images together with the feature value of the target image. The first similarity vector can therefore be determined at a fine granularity, which improves how accurately it reflects the samples and, in turn, the accuracy of the determined second image positioning model.
With reference to the first aspect, in one possible implementation manner, the determining a first objective loss function according to the first similarity vector includes:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the hard negative sample image corresponding to the target image;
a first target loss function is determined based on the first sub-loss function and the second sub-loss function.
In this example, the first target loss function is determined from the first sub-loss function, which is derived from the first similarity vector, and the second sub-loss function, which is derived from the hard negative sample image corresponding to the target image. Because both an accurate first similarity vector and a hard-negative-based sub-loss contribute, the accuracy of the determined first target loss function is improved.
With reference to the first aspect, in a possible implementation manner, determining a first sub-loss function according to the first similarity vector includes:
obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
a first sub-loss function is determined based on the first similarity vector and the second similarity vector.
In this example, the first sub-loss function is determined from the second similarity vector produced by the initial model and the first similarity vector. The second similarity vector can thus be supervised by the similarity vector determined by the first image positioning model, which improves the accuracy of the first sub-loss function; and because the first similarity vector supervises the second similarity vector, the accuracy of the second image positioning model in positioning images is also improved.
With reference to the first aspect, in one possible implementation manner, determining a first target loss function according to the first sub-loss function and the second sub-loss function includes:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain a first target loss function.
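The weighted combination using the loss weighting factors described above can be sketched minimally. The default factor values below are illustrative assumptions; the embodiment only states that each sub-loss has a corresponding loss weighting factor.

```python
def first_target_loss(first_sub_loss, second_sub_loss, weight_1=0.5, weight_2=0.5):
    """Combine the two sub-loss values into the first target loss.

    Weight each sub-loss by its corresponding loss weighting factor and sum.
    The 0.5/0.5 defaults are placeholders, not values from the embodiment.
    """
    return weight_1 * first_sub_loss + weight_2 * second_sub_loss
```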
With reference to the first aspect, in one possible implementation manner, the method further includes:
receiving an image to be marked;
acquiring K second sample positioning images corresponding to the images to be marked;
splitting each second sample positioning image in the K second sample positioning images to obtain N sub-second sample positioning images corresponding to each second sample positioning image;
and determining similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
In this example, the similarity labels between the image to be marked and the N sub second sample positioning images of each second sample positioning image are determined by the second image positioning model. Compared with determining labels using an image positioning model trained on a single (best-matching) sample pair, as in existing schemes, this improves the accuracy of the obtained similarity labels.
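The label-generation step above can be sketched as follows. The `extract_feature` and `split_image` callables and the inner-product similarity are hypothetical stand-ins for the trained second image positioning model and the splitting procedure; the embodiment does not prescribe these exact forms.

```python
def generate_similarity_labels(extract_feature, split_image, image_to_mark, second_sample_images):
    """For each second sample positioning image, split it into sub-images and
    label each sub-image with its similarity to the image to be marked.

    extract_feature: hypothetical feature extractor (the trained model).
    split_image: hypothetical splitter returning the N sub second sample images.
    Returns K lists of N similarity labels each.
    """
    query = extract_feature(image_to_mark)
    labels = []
    for sample in second_sample_images:
        sub_images = split_image(sample)
        # Inner-product similarity between query feature and each sub-image feature.
        labels.append([sum(q * s for q, s in zip(query, extract_feature(sub)))
                       for sub in sub_images])
    return labels
```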
With reference to the first aspect, in a possible implementation manner, the first image positioning model includes a basic image positioning model, and the basic image positioning model is a model obtained by training with a sample pair consisting of the target image and the image with the highest similarity among the K first sample positioning images.
With reference to the first aspect, in one possible implementation manner, the method further includes:
determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
adjusting the initial model according to the second target loss function to obtain a third image positioning model;
the first image localization model is replaced with a third image localization model.
A second aspect of the embodiments of the present application provides an image positioning method, including:
receiving an image to be detected;
and positioning the image to be detected according to the second image positioning model in any one of the first aspect to obtain positioning information corresponding to the image to be detected.
A third aspect of the embodiments of the present application provides an image localization model obtaining apparatus, including:
the first determining unit is used for determining the similarity between the target image and K first sample positioning images according to the first image positioning model so as to obtain a first similarity vector, and K is an integer greater than 1;
a second determining unit, configured to determine a first target loss function according to the first similarity vector;
and the adjusting unit is used for adjusting the initial model according to the first target loss function to obtain a second image positioning model, and the initial model is obtained after the first image positioning model is initialized.
With reference to the third aspect, in a possible implementation manner, the first determining unit is configured to:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of a target image according to the first image positioning model;
and determining a first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
With reference to the third aspect, in a possible implementation manner, the second determining unit is configured to:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the hard negative sample image corresponding to the target image;
a first target loss function is determined based on the first sub-loss function and the second sub-loss function.
With reference to the third aspect, in a possible implementation manner, in determining the first sub-loss function according to the first similarity vector, the second determining unit is configured to:
obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
a first sub-loss function is determined based on the first similarity vector and the second similarity vector.
With reference to the third aspect, in one possible implementation manner, in determining the first target loss function according to the first sub-loss function and the second sub-loss function, the second determining unit is configured to:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain a first target loss function.
With reference to the third aspect, in one possible implementation manner, the apparatus is further configured to:
receiving an image to be marked;
acquiring K second sample positioning images corresponding to the images to be marked;
splitting each second sample positioning image in the K second sample positioning images to obtain N sub-second sample positioning images corresponding to each second sample positioning image;
and determining similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
With reference to the third aspect, in a possible implementation manner, the first image positioning model includes a basic image positioning model, and the basic image positioning model is a model obtained by training with a sample pair consisting of the target image and the image with the highest similarity among the K first sample positioning images.
With reference to the third aspect, in one possible implementation manner, the apparatus is further configured to:
determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
adjusting the initial model according to the second target loss function to obtain a third image positioning model;
the first image localization model is replaced with a third image localization model.
A fourth aspect of the embodiments of the present application provides an image positioning apparatus, including:
the receiving unit is used for receiving an image to be detected;
and the positioning unit is used for positioning the image to be detected according to the second image positioning model in any one of the third aspect to obtain the positioning information corresponding to the image to be detected.
A fifth aspect of the embodiments of the present application provides a terminal, including a processor, an input device, an output device, and a memory that are connected to one another. The memory is used to store a computer program comprising program instructions, and the processor is configured to call the program instructions and execute the steps described in the first aspect or the second aspect of the embodiments of the present application.
A sixth aspect of embodiments of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, wherein the computer program causes a computer to perform some or all of the steps as described in the first or second aspect of embodiments of the present application.
A seventh aspect of embodiments of the present application provides a computer program product, wherein the computer program product comprises a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps as described in the first or second aspect of embodiments of the present application. The computer program product may be a software installation package.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic diagram of an application scenario of an image localization model according to an embodiment of the present application;
FIG. 1b is a schematic flowchart of an image localization model obtaining method according to an embodiment of the present application;
FIG. 2a is a schematic diagram of a sample image according to an embodiment of the present application;
FIG. 2b is a schematic diagram illustrating a first sample positioning image according to an embodiment of the present application;
FIG. 2c is a schematic diagram illustrating another example of a first sample positioning image according to an embodiment of the present application;
FIG. 2d is a schematic diagram of a first sample positioning image according to an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram illustrating another method for acquiring an image localization model according to an embodiment of the present application;
FIG. 4 is a schematic flow chart diagram illustrating another method for acquiring an image localization model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an image localization model obtaining apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an image positioning apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device described in the embodiments of the present invention may include a smart phone (e.g., an Android phone, an iOS phone, or a Windows phone), a tablet computer, a palmtop computer, a vehicle data recorder, a traffic guidance platform, a server, a notebook computer, a Mobile Internet Device (MID), or a wearable device (e.g., a smart watch or a Bluetooth headset). These are merely examples, not an exhaustive list; the electronic device may also be a server, a video matrix, or an Internet of Things device, which is not limited here. The terminal and the electronic device in the embodiments of the application may be the same device.
In order to better understand the image positioning model obtaining method provided by the embodiments of the present application, the application scenario of an image positioning model determined by this method is first briefly described. As shown in fig. 1a, the image positioning model may be applied to an electronic device. When a user needs to determine a position, for example, to inform other people of the current position, the user may acquire an image near the current position through the electronic device: if the user is beside xx building, the image may show the area near the xx building, and this acquired image serves as the image to be detected. The electronic device performs positioning analysis and calculation on the image to be detected through the image positioning model to obtain the corresponding positioning information, i.e., the position information of the area reflected by the image to be detected (xx building). For example, the position information may be that of a landmark building in the image to be detected, where the landmark may be a building selected by the user, a building determined through the image positioning model, or another landmark; this is only an example. In this way, the current position of the user can be determined through the image positioning model, bringing greater convenience to the user. However, the positioning accuracy of existing image positioning models is not high, because model training usually adopts a single sample pair to train the initial model. The image positioning model therefore needs to be trained in an optimized way so as to improve its accuracy when positioning images.
The following embodiments mainly describe how the initial model is adjusted so as to improve the accuracy of the adjusted image positioning model in positioning images.
As shown in fig. 1b, the image localization model obtaining method is applied to an electronic device, and the method includes steps 101 to 103, as follows:
101. The electronic device determines the similarity between the target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector, wherein K is an integer greater than 1.
The K first sample positioning images may be sample images determined from the GPS (Global Positioning System) positioning information of the target image, for example, images within a preset range of the position indicated by that GPS information, such as map images within 10 meters of the indicated position. The target image may be acquired through a mobile terminal such as a mobile phone or a computer, and it is used to form sample pairs for adjusting the initial model; that is, the target image and the K first sample positioning images constitute the sample pairs used to adjust the initial model. The preset range may be set according to empirical values or historical data.
The similarity labels between the K first sample positioning images and the target image may take values in the interval [0, 1], including the endpoints 0 and 1. Fig. 2a shows one possible target image and first sample positioning images, where the similarity labels of the first sample positioning images include 0.45, 0.35, and so on.
The elements of the first similarity vector may include the similarity between the target image and each first sample positioning image, as well as the similarities between the target image and the sub-images obtained by splitting that first sample positioning image. Splitting a first sample positioning image yields multiple sub first sample positioning images; the image may be split into sub-images of equal area or into sub-images of different areas.
The electronic device may be used only for adjusting the initial model, or for both adjusting the initial model and using the resulting image positioning model for image positioning.
102. The electronic device determines a first target loss function according to the first similarity vector.
A corresponding loss function may be determined from the first similarity vector, and a first target loss function may be determined at least from the corresponding loss function.
103. The electronic device adjusts the initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained by initializing the first image positioning model.
The initial model is trained with the first target loss function on a sample set comprising the target image and the K first sample positioning images to obtain the second image positioning model. That the initial model is obtained after the first image positioning model is initialized can be understood as initializing the model parameters of the first image positioning model to obtain the initial model; the first image positioning model itself was obtained by training the initial model on a sample set comprising the target image and the K first sample positioning images.
In this example, the similarities between the target image and the K first sample positioning images are determined by the first image positioning model to obtain a first similarity vector, the first target loss function is determined according to that similarity vector, and the initial model is adjusted according to the first target loss function to obtain the second image positioning model. The initial model thus undergoes similarity-supervised learning under a loss function determined from the first image positioning model, the target image, and the K first sample positioning images, which improves the accuracy of the second image positioning model in image positioning.
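The similarity supervision learning described above can be sketched in minimal form. The inner-product similarity, the softmax temperature, and the cross-entropy distillation loss below are illustrative assumptions, not the patent's exact formulas; the embodiment only specifies that a similarity vector from the first image positioning model supervises the initial model.

```python
import math

def similarity_vector(query_feature, sample_features, temperature=0.1):
    """Softmax-normalized inner-product similarities between the target image's
    feature and the features of the sample positioning images (or sub-images)."""
    sims = [sum(q * f for q, f in zip(query_feature, feat)) for feat in sample_features]
    exps = [math.exp(s / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def first_sub_loss(teacher_vector, student_vector, eps=1e-12):
    """Cross-entropy between the first image positioning model's (teacher's)
    similarity vector and the initial model's (student's) similarity vector."""
    return -sum(t * math.log(s + eps) for t, s in zip(teacher_vector, student_vector))
```

Minimizing this sub-loss pushes the initial model's similarity vector toward the supervising model's, which is one common way to realize the similarity supervision the example describes.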
In one possible embodiment, a possible method for determining similarity between a target image and K first sample positioning images according to a first image positioning model to obtain a first similarity vector includes steps a1-a4 as follows:
a1, splitting each first sample positioning image in the K first sample positioning images to obtain N sub first sample positioning images corresponding to each first sample positioning image;
a2, determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
a3, determining a characteristic value of the target image according to the first image positioning model;
a4, determining a first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
When the first sample positioning image is split, the image may be split into a plurality of sub first sample positioning images with the same area, or into a plurality of sub first sample positioning images with different areas, and so on. One possible splitting manner is: splitting the first sample positioning image into two sub first sample positioning images with equal areas, or splitting the first sample positioning image into 4 sub first sample positioning images with equal areas. Specifically, as shown in fig. 2b, the first sample positioning image may be split into an upper sub positioning image and a lower sub positioning image, or the first sample positioning image may be split into a left sub positioning image and a right sub positioning image; as shown in fig. 2c, the first sample positioning image may be split into 4 sub first sample positioning images of equal area.
The N sub first sample positioning images may include sub first sample positioning images obtained by a plurality of different splitting manners. For example, if all sub first sample positioning images obtained by the splitting manners shown in fig. 2b and fig. 2c are used, then N is equal to 8; N may also be any other number, which is merely an example and is not specifically limited herein.
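As a hedged sketch of the splitting manners above (the function name and the NumPy height-by-width-by-channel array representation are illustrative assumptions, not part of the disclosure):

```python
import numpy as np

def split_image(img):
    """Split one sample positioning image into N = 8 sub images:
    upper/lower halves, left/right halves (as in fig. 2b), and
    4 equal-area quarters (as in fig. 2c)."""
    h, w = img.shape[:2]
    halves = [img[:h // 2], img[h // 2:],          # upper, lower
              img[:, :w // 2], img[:, w // 2:]]    # left, right
    quarters = [img[:h // 2, :w // 2], img[:h // 2, w // 2:],
                img[h // 2:, :w // 2], img[h // 2:, w // 2:]]
    return halves + quarters

sub_images = split_image(np.zeros((480, 640, 3)))
```

Here splitting combines both manners, so N = 8 sub images are produced per first sample positioning image.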
When the feature vector corresponding to each first sample positioning image and the feature value of the target image are determined, both can be obtained through calculation by the first image positioning model. The feature vector corresponding to each first sample positioning image can be expressed as:

$f_{p_i} = \left\{ f_{p_i^1}, f_{p_i^2}, \ldots, f_{p_i^N} \right\}$

wherein $f_{p_i^j}$ is the feature value of the j-th sub first sample positioning image of the i-th first sample positioning image.
The first similarity vector may be obtained in a softmax normalization manner, and specifically may be determined by the method shown in the following formula:

$$S(q;\,\theta_{\omega-1}) = \operatorname{softmax}\!\left(\left[\langle f_q, f_{p_1^1}\rangle, \ldots, \langle f_q, f_{p_1^N}\rangle, \ldots, \langle f_q, f_{p_K^1}\rangle, \ldots, \langle f_q, f_{p_K^N}\rangle\right] \big/ \tau_{\omega}\right)$$

wherein $S(q;\,\theta_{\omega-1})$ is the first similarity vector, softmax is a normalization operation, $\tau_{\omega}$ is a hyperparameter (temperature coefficient), $f_q$ is the feature value of the target image, $f_{p_1^1}, \ldots, f_{p_1^N}$ are the feature values of the sub first sample positioning images of the first sample positioning image $p_1$, and $f_{p_K^1}, \ldots, f_{p_K^N}$ are the feature values of the sub first sample positioning images of the first sample positioning image $p_K$.
In this example, each of K first sample positioning images is split to obtain N sub-first sample positioning images, and a first similarity vector is determined according to a feature value of the K × N sub-first sample positioning images and a feature value of a target image, so that the first similarity vector can be determined at a fine granularity, accuracy of the first similarity vector in reflecting the samples is improved, and accuracy of determining a second image positioning model is improved.
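The temperature-scaled softmax over the K × N sub-image similarities can be sketched as follows; the inner-product similarity and the variable names are illustrative assumptions:

```python
import numpy as np

def first_similarity_vector(f_q, sub_features, tau=0.07):
    """Compute a first similarity vector: softmax over inner products
    between the target-image feature f_q and the K*N sub first sample
    positioning image features, scaled by a temperature coefficient tau."""
    sims = np.array([np.dot(f_q, f) for f in sub_features])  # K*N similarities
    logits = sims / tau                                      # temperature scaling
    exp = np.exp(logits - logits.max())                      # numerically stable softmax
    return exp / exp.sum()

f_q = np.array([1.0, 0.0])
subs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
s = first_similarity_vector(f_q, subs, tau=1.0)
```

The sub image whose feature is closest to the target-image feature receives the largest entry of the normalized vector.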
In one possible embodiment, a possible method for determining a first objective loss function based on a first similarity vector comprises steps B1-B3 as follows:
b1, determining a first sub-loss function according to the first similarity vector;
b2, determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
and B3, determining a first target loss function according to the first sub-loss function and the second sub-loss function.
The first sub-loss function may be determined from the first similarity vector together with the similarity vector between the target image and the first sample positioning images determined by the initial model.
The difficult negative sample image corresponding to the target image may be understood as a negative sample whose similarity with the target image is lower than a preset threshold among the negative samples corresponding to the target image, where the preset threshold may be set by an empirical value or historical data.
When determining the second sub-loss function, it may be determined by the method shown in the following formula:

$$L_{tri}(\theta_{\omega}) = \max\!\left(0,\; m + \left\| f_q - f_{p^*} \right\|_2^2 - \left\| f_q - f_{n^*} \right\|_2^2\right)$$

wherein $L_{tri}(\theta_{\omega})$ is the second sub-loss function, $m$ is a margin, $f_{p^*}$ is the feature value of the positive sample image with the highest similarity label among the K first sample positioning images, $f_{n^*}$ is the feature value of the negative sample image with the lowest similarity label, and K is the number of the first sample positioning images.
The first sub-loss function and the second sub-loss function may be weighted to obtain a first target loss function.
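One hedged sketch of a second sub-loss of this kind, assuming a margin-based triplet form over the highest-similarity positive and the difficult negative (the squared Euclidean distance and the margin value are assumptions, not stated in this disclosure):

```python
import numpy as np

def second_sub_loss(f_q, f_pos, f_neg, margin=0.1):
    """Triplet-style loss: f_pos is the feature of the positive sample
    with the highest similarity label, f_neg the feature of the difficult
    negative with the lowest similarity label; pushes the positive closer
    to the target feature f_q than the negative by at least the margin."""
    d_pos = np.sum((f_q - f_pos) ** 2)   # squared distance to positive
    d_neg = np.sum((f_q - f_neg) ** 2)   # squared distance to hard negative
    return max(0.0, margin + d_pos - d_neg)

zero = np.zeros(2)
loss_easy = second_sub_loss(zero, zero, np.ones(2))           # well separated
loss_hard = second_sub_loss(zero, np.array([1.0, 0.0]), zero) # violated margin
```

When the positive is already much closer than the negative, the hinge clips the loss to zero; otherwise the violation plus the margin is penalized.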
In this example, the first target loss function may be determined according to the first sub-loss function determined by the first similarity vector and the second sub-loss function determined by the difficult negative sample image corresponding to the target image. Since the first similarity vector is accurate and the second sub-loss function is determined by the difficult negative sample image, accuracy in determining the first target loss function is improved.
In a possible embodiment, a possible method for determining the first sub-loss function according to the first similarity vector comprises steps C1-C2, as follows:
c1, obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
and C2, determining a first sub-loss function according to the first similarity vector and the second similarity vector.
The method for obtaining the second similarity vector may refer to the method for obtaining the first similarity vector in the foregoing embodiment, and in specific implementation, the initial model is used for calculation to obtain the second similarity vector.
A cross entropy operation may be applied to the first similarity vector and the second similarity vector to obtain the first sub-loss function. For example, the first sub-loss function may be obtained by the following formula:

$$L_{soft}(\theta_{\omega}) = \ell_{ce}\!\left(S(q;\,\theta_{\omega}),\; S(q;\,\theta_{\omega-1})\right)$$

wherein $L_{soft}(\theta_{\omega})$ is the first sub-loss function, $S(q;\,\theta_{\omega})$ is the second similarity vector determined by the initial model, $S(q;\,\theta_{\omega-1})$ is the first similarity vector determined by the first image positioning model, $\ell_{ce}(\cdot)$ is a cross entropy operation, and $\omega$ is a positive integer greater than or equal to 2; when the above formula is used to represent a plurality of adjustments, $\omega$ can be understood as the number of adjustments.

$\ell_{ce}(\cdot)$ can be expressed as:

$$\ell_{ce}(\hat{y},\, y) = -\sum_{i} y_i \log \hat{y}_i$$

wherein $y$ and $\hat{y}$ are the elements on which the cross entropy operation is performed.
In this example, the first sub-loss function is determined by the second similarity vector determined by the initial model and the first similarity vector, so that the second similarity vector can be supervised by the similarity vector determined by the first image positioning model. This improves accuracy in determining the first sub-loss function, and because the first similarity vector supervises the second similarity vector, the accuracy of the second image positioning model in positioning an image can also be improved.
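The supervision of the second similarity vector by the first similarity vector can be sketched as a cross entropy between two normalized similarity vectors; the function and variable names are illustrative:

```python
import numpy as np

def first_sub_loss(second_sim, first_sim, eps=1e-12):
    """Cross entropy l_ce(second_sim, first_sim): the first similarity
    vector (from the first image positioning model) acts as the soft
    target supervising the second similarity vector (from the initial
    model being trained). eps guards against log(0)."""
    return float(-np.sum(first_sim * np.log(second_sim + eps)))

# identical distributions give the entropy of the target, here log(2)
loss = first_sub_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The loss is minimized (down to the target's entropy) when the second similarity vector matches the first, which is exactly the supervision described above.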
In one possible embodiment, one possible method for determining the first target loss function according to the first sub-loss function and the second sub-loss function may be:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain a first target loss function.
The loss weighting factors correspond to the first sub-loss function and the second sub-loss function respectively. One possible way of setting the loss weighting factors is: the loss weighting factor of the first sub-loss function is $\lambda$, and the loss weighting factor of the second sub-loss function is 1.
The method of obtaining the first target loss function may be the method shown in the following formula:

$$L(\theta_{\omega}) = L_{tri}(\theta_{\omega}) + \lambda\, L_{soft}(\theta_{\omega})$$

wherein $L(\theta_{\omega})$ is the first target loss function, $L_{tri}(\theta_{\omega})$ is the second sub-loss function, $L_{soft}(\theta_{\omega})$ is the first sub-loss function, and $\lambda$ is the weighting factor.
In a possible embodiment, the image to be marked may be further labeled to obtain similarity labels between the image to be marked and the corresponding sample positioning images, which may specifically include steps D1-D4:
d1, receiving an image to be marked;
d2, acquiring K second sample positioning images corresponding to the images to be marked;
d3, splitting each second sample positioning image in the K second sample positioning images to obtain N sub second sample positioning images corresponding to each second sample positioning image;
d4, determining the similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
The method for obtaining the second sample positioning images may refer to the method for obtaining the first sample positioning images in the foregoing embodiment, and details are not repeated here. Step D3 may refer to the method shown in step A1, and is likewise not described here again.
When the similarity labels are obtained, calculation can be carried out through the second image positioning model to obtain the similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image. In the specific calculation, the distance between the feature value of the image to be marked and the feature values of the N sub second sample positioning images may be converted into a similarity, and the similarity may be determined as the corresponding similarity label.
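Converting feature distances into similarity labels can be sketched as follows; using the negative squared Euclidean distance with a softmax normalization is an assumption, since the disclosure only states that the distance is converted into the label:

```python
import numpy as np

def similarity_labels(f_marked, sub_features, tau=0.07):
    """Turn distances between the feature of the image to be marked and
    the features of the N sub second sample positioning images into
    normalized similarity labels (closer feature -> larger label)."""
    d = np.array([np.sum((f_marked - f) ** 2) for f in sub_features])
    logits = -d / tau                        # small distance -> large logit
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()

labels = similarity_labels(np.array([0.0, 0.0]),
                           [np.array([0.0, 0.0]), np.array([1.0, 1.0])],
                           tau=1.0)
```

The sub image whose feature is nearest to the image to be marked receives the highest similarity label, matching the convention described for fig. 2d below.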
In this example, the similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image are determined by the second image positioning model. Compared with the existing scheme, in which the similarity labels are determined by an image positioning model obtained by training with a single sample pair (optimal sample pair), the accuracy of the obtained similarity labels can be improved.
In one possible embodiment, the first image positioning model includes a basic image positioning model, and the basic image positioning model is a model obtained by training with a sample pair comprising the target image and the image with the highest similarity to the target image among the K first sample positioning images.
In a possible embodiment, the method for obtaining the first image localization model further includes steps E1-E3, as follows:
e1, determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
e2, adjusting the initial model according to the second target loss function to obtain a third image positioning model;
e3, replacing the first image localization model with the third image localization model.
The implementation of step E1 can refer to the method for determining the first objective loss function in the foregoing embodiment, and the implementation of step E2 can refer to the method for determining the second image localization model in the foregoing embodiment.
In a possible embodiment, the second image localization model may be used to localize the image to be detected, so as to obtain localization information corresponding to the image to be detected, which may specifically include steps F1-F2, as follows:
f1, receiving an image to be detected;
and F2, positioning the image to be detected according to the second image positioning model in any embodiment to obtain positioning information corresponding to the image to be detected.
In this example, the image to be detected is positioned by the second image positioning model, so that the accuracy in acquiring the positioning information can be improved.
In a possible implementation manner, the method includes adjusting the image positioning model according to the loss function multiple times to obtain a final image positioning model. The detailed method is as follows:
training the initial model by using the image with the highest similarity to the target image among the K first sample positioning images as a sample pair with the target image, to obtain a basic image positioning model; determining the similarity between the target image and the K first sample positioning images by the basic image positioning model to obtain a first similarity vector, and determining a first sub-loss function according to the first similarity vector; determining a second sub-loss function according to the initial model, the target image and the difficult negative sample corresponding to the target image; performing a weighting operation on the first sub-loss function and the second sub-loss function to obtain a first target loss function, and adjusting the initial model through the first target loss function to obtain a second image positioning model; and determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images, adjusting and training the initial model according to the second target loss function to obtain a third image positioning model, and repeating the above steps to obtain a final image positioning model. Fig. 2d shows three adjustments of the initial model; when the initial model is adjusted for the first time, the K first sample positioning images have already been split (not shown in the figure). The similarity bars shown in the figure may be understood as similarities or similarity labels, where the value of the similarity label is larger when the similarity is high and smaller when the similarity is low. In fig. 2d, the similarity labels of the sub first sample positioning images calculated by the model after three adjustments are more accurate than those calculated by the model after the first adjustment.
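The repeated adjust-and-retrain procedure above can be sketched as a loop of model generations, each supervised by the previous generation; `train_step` and the generation bookkeeping are hypothetical stand-ins for the actual training of one model:

```python
def train_image_localization_models(initial_params, train_step, num_generations=3):
    """Sketch of the iterative procedure: generation 1 trains the basic
    model from the optimal sample pair (no teacher); each later generation
    re-initializes the model and trains it with a target loss whose soft
    part is supervised by the previous generation's model."""
    teacher = None                      # generation 1 has no teacher model
    models = []
    for _ in range(num_generations):
        params = dict(initial_params)   # re-initialize the initial model
        params = train_step(params, teacher)
        models.append(params)
        teacher = params                # this generation supervises the next
    return models

# toy stand-in for one generation of training (hypothetical)
def fake_step(params, teacher):
    return {"gen": (teacher["gen"] + 1) if teacher else 1}

models = train_image_localization_models({"gen": 0}, fake_step, 3)
```

With three generations this mirrors the three adjustments shown in fig. 2d, where each later model produces more accurate similarity labels than the first.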
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating another method for obtaining an image localization model according to an embodiment of the present disclosure. As shown in fig. 3, the image positioning model obtaining method includes steps 301-306, as follows:
301. splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image, wherein K is an integer greater than 1;
the K first sample positioning images may be sample images determined according to GPS positioning information of the target image, for example, images within a preset range of the position indicated by the GPS positioning information of the target image, specifically, map images within a range of 10 meters of the indicated position, and the like. The preset range may be set by an empirical value or historical data.
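Selecting candidate images within a preset range of the target image's GPS position can be sketched as follows; the haversine great-circle formula and the 10-meter radius are assumptions for illustration, the disclosure only requiring images within some preset range:

```python
import math

def within_range(gps_a, gps_b, radius_m=10.0):
    """Haversine check: is candidate GPS position gps_b within radius_m
    meters of the target image's GPS position gps_a? Coordinates are
    (latitude, longitude) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*gps_a, *gps_b))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371000.0 * 2 * math.asin(math.sqrt(a)) <= radius_m

target = (22.5431, 114.0579)                      # hypothetical GPS position
candidates = [(22.5431, 114.0579),                # same spot
              (22.54315, 114.0579),               # a few meters away
              (22.6000, 114.1000)]                # kilometers away
sample_images = [c for c in candidates if within_range(target, c)]
```

Only the candidates within the 10-meter radius survive the filter and would serve as first sample positioning images.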
302. Determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
the feature vector includes a plurality of elements.
303. Determining a characteristic value of a target image according to the first image positioning model;
304. determining a first similarity vector according to the characteristic vector corresponding to each first sample positioning image and the characteristic value of the target image;
305. determining a first target loss function according to the first similarity vector;
306. and adjusting the initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained after the first image positioning model is initialized.
The initial model is trained through a sample set comprising the target image and the K first sample positioning images and through the first target loss function to obtain the second image positioning model. The initial model being obtained after the first image positioning model is initialized can be understood as initializing the model parameters of the first image positioning model to obtain the initial model. The first image positioning model is itself obtained by training the initial model through a sample set comprising the target image and the K first sample positioning images.
In this example, each of K first sample positioning images is split to obtain N sub-first sample positioning images, and a first similarity vector is determined according to a feature value of the K × N sub-first sample positioning images and a feature value of a target image, so that the first similarity vector can be determined at a fine granularity, accuracy of the first similarity vector in reflecting the samples is improved, and accuracy of determining a second image positioning model is improved.
Referring to fig. 4, fig. 4 is a schematic flowchart of another image localization model obtaining method according to an embodiment of the present disclosure. As shown in fig. 4, the image positioning model obtaining method includes steps 401-405, as follows:
401. determining the similarity between a target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector, wherein K is an integer greater than 1;
402. determining a first sub-loss function according to the first similarity vector;
403. determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
404. determining a first target loss function according to the first sub-loss function and the second sub-loss function;
405. and adjusting the initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained after the first image positioning model is initialized.
The initial model is trained through a sample set comprising the target image and the K first sample positioning images and through the first target loss function to obtain the second image positioning model. The initial model being obtained after the first image positioning model is initialized can be understood as initializing the model parameters of the first image positioning model to obtain the initial model. The first image positioning model is itself obtained by training the initial model through a sample set comprising the target image and the K first sample positioning images.
In this example, the first target loss function may be determined according to the first sub-loss function determined by the first similarity vector and the second sub-loss function determined by the difficult negative sample image corresponding to the target image, so that the first target loss function may be determined according to the accurate first similarity vector and the second sub-loss function determined by the difficult negative sample image, and accuracy in determining the first target loss function is improved.
In accordance with the foregoing embodiments, please refer to fig. 5, which is a schematic structural diagram of a terminal according to an embodiment of the present application. As shown in the figure, the terminal includes a processor, an input device, an output device, and a memory, which are connected to each other, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions, the program including instructions for performing the steps of the following method:
an image localization model acquisition method, the method comprising:
determining the similarity between a target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector, wherein K is an integer greater than 1;
determining a first target loss function according to the first similarity vector;
and adjusting the initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained after the first image positioning model is initialized.
In one possible implementation manner, determining similarity between the target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector includes:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of a target image according to the first image positioning model;
and determining a first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
In one possible implementation, determining a first objective loss function according to the first similarity vector includes:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
a first target loss function is determined based on the first sub-loss function and the second sub-loss function.
In one possible implementation manner, determining a first sub-loss function according to the first similarity vector includes:
obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
a first sub-loss function is determined based on the first similarity vector and the second similarity vector.
In one possible implementation, determining a first target loss function according to the first sub-loss function and the second sub-loss function includes:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain a first target loss function.
In one possible implementation, the method further includes:
receiving an image to be marked;
acquiring K second sample positioning images corresponding to the images to be marked;
splitting each second sample positioning image in the K second sample positioning images to obtain N sub-second sample positioning images corresponding to each second sample positioning image;
and determining similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
In one possible implementation manner, the first image positioning model includes a basic image positioning model, and the basic image positioning model is a model obtained by training with a sample pair comprising the target image and the image with the highest similarity to the target image among the K first sample positioning images.
In one possible implementation, the method further includes:
determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
adjusting the initial model according to the second target loss function to obtain a third image positioning model;
the first image localization model is replaced with a third image localization model.
A method of image localization, the method comprising:
receiving an image to be detected;
and positioning the image to be detected according to the second image positioning model in any of the above embodiments to obtain positioning information corresponding to the image to be detected.
The above description has introduced the solution of the embodiments of the present application mainly from the perspective of the method-side implementation process. It can be understood that, in order to implement the above functions, the terminal includes corresponding hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation should not be considered as going beyond the scope of the present application.
In the embodiment of the present application, the terminal may be divided into the functional units according to the above method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
In accordance with the above, please refer to fig. 6, fig. 6 is a schematic structural diagram of an image positioning model obtaining apparatus according to an embodiment of the present application. As shown in fig. 6, the apparatus includes:
a first determining unit 601, configured to determine, according to the first image positioning model, similarities between the target image and K first sample positioning images to obtain a first similarity vector, where K is an integer greater than 1;
a second determining unit 602, configured to determine a first target loss function according to the first similarity vector;
an adjusting unit 603, configured to adjust the initial model according to the first target loss function to obtain a second image positioning model, where the initial model is obtained after the first image positioning model is initialized.
In one possible implementation manner, the first determining unit 601 is configured to:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of a target image according to the first image positioning model;
and determining a first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
In one possible implementation manner, the second determining unit 602 is configured to:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
a first target loss function is determined based on the first sub-loss function and the second sub-loss function.
In one possible implementation manner, in determining the first sub-loss function according to the first similarity vector, the second determining unit 602 is configured to:
obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
a first sub-loss function is determined based on the first similarity vector and the second similarity vector.
In one possible implementation manner, in determining the first target loss function according to the first sub-loss function and the second sub-loss function, the second determining unit 602 is configured to:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain a first target loss function.
In one possible implementation, the apparatus is further configured to:
receiving an image to be marked;
acquiring K second sample positioning images corresponding to the images to be marked;
splitting each second sample positioning image in the K second sample positioning images to obtain N sub-second sample positioning images corresponding to each second sample positioning image;
and determining similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
In one possible implementation manner, the first image positioning model includes a basic image positioning model, and the basic image positioning model is a model obtained by training with a sample pair comprising the target image and the image with the highest similarity to the target image among the K first sample positioning images.
In one possible implementation, the apparatus is further configured to:
determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
adjusting the initial model according to the second target loss function to obtain a third image positioning model;
the first image localization model is replaced with a third image localization model.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an image positioning apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the apparatus includes:
a receiving unit 701 configured to receive an image to be detected;
a positioning unit 702, configured to position the image to be detected according to the second image positioning model in any of the above embodiments, to obtain positioning information corresponding to the image to be detected.
Embodiments of the present application also provide a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute a part or all of the steps of any one of the image localization model obtaining methods or the image localization methods described in the above method embodiments.
Embodiments of the present application further provide a computer program product, which includes a non-transitory computer-readable storage medium storing a computer program, and the computer program causes a computer to execute some or all of the steps of any one of the image localization model obtaining methods or the image localization methods described in the above method embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules involved are not necessarily required by the present application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative; for instance, the division of the units is only one type of logical function division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
If the integrated unit is implemented in the form of a software program module and sold or used as a stand-alone product, it may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by related hardware instructed by a program, and the program may be stored in a computer-readable memory, which may include: a flash disk, a read-only memory, a random access memory, a magnetic disk, an optical disk, and the like.
The embodiments of the present application have been described in detail above, and the principles and implementations of the present application are illustrated herein by specific examples; the above description of the embodiments is only provided to help understand the method and core idea of the present application. Meanwhile, for a person skilled in the art, there may be variations in the specific implementation and application scope according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation of the present application.

Claims (14)

1. An image positioning model obtaining method, characterized in that the method comprises:
determining the similarity between a target image and K first sample positioning images according to the first image positioning model to obtain a first similarity vector, wherein K is an integer greater than 1;
determining a first target loss function according to the first similarity vector;
and adjusting an initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained after the first image positioning model is initialized.
2. The method of claim 1, wherein determining the similarity between the target image and the K first sample positioning images according to the first image positioning model to obtain a first similarity vector comprises:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to the N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of the target image according to the first image positioning model;
and determining the first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
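The split-and-compare step of claim 2 can be sketched as follows. The per-region "feature" here is just the mean pixel value, a toy stand-in for the model's learned feature extractor, and aggregating by the best-matching sub-image per reference is an illustrative choice the claim does not fix:

```python
def split_image(img, n_rows, n_cols):
    """Split a 2-D image (list of rows) into n_rows * n_cols sub-images."""
    sh, sw = len(img) // n_rows, len(img[0]) // n_cols
    subs = []
    for r in range(n_rows):
        for c in range(n_cols):
            subs.append([row[c * sw:(c + 1) * sw]
                         for row in img[r * sh:(r + 1) * sh]])
    return subs

def mean_feature(img):
    """Toy feature value: mean pixel (stand-in for the learned extractor)."""
    vals = [v for row in img for v in row]
    return sum(vals) / len(vals)

def similarity_vector(target, sample_images, n_rows=2, n_cols=2):
    """One similarity per sample image: negative distance between the
    target's feature and the best-matching sub-image feature."""
    q = mean_feature(target)
    sims = []
    for ref in sample_images:
        feats = [mean_feature(s) for s in split_image(ref, n_rows, n_cols)]
        sims.append(-min(abs(q - f) for f in feats))
    return sims
```

A reference identical to the target scores 0 (the maximum here), and more dissimilar references score lower, giving the first similarity vector over the K samples.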
3. The method according to claim 1 or 2, wherein said determining a first target loss function from said first similarity vector comprises:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
and determining the first target loss function according to the first sub-loss function and the second sub-loss function.
4. The method of claim 3, wherein determining a first sub-loss function based on the first similarity vector comprises:
obtaining the similarity between the target image and the K first sample positioning images according to the initial model to obtain a second similarity vector;
and determining the first sub-loss function according to the first similarity vector and the second similarity vector.
5. The method of claim 3 or 4, wherein determining the first target loss function from the first sub-loss function and the second sub-loss function comprises:
and calculating the first sub-loss function and the second sub-loss function according to the loss weighting factors corresponding to the first sub-loss function and the second sub-loss function to obtain the first target loss function.
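One common concrete form of this combination is a soft-label (distillation-style) term over the two similarity vectors plus a hinge term against the hard negative, combined with per-loss weighting factors. This is a hedged sketch: the claims specify only that two sub-losses are weighted and combined, so the KL and hinge forms and the weight values below are assumptions:

```python
import math

def _softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def soft_label_loss(teacher_sims, student_sims):
    """First sub-loss sketch: KL divergence pushing the student's
    similarity distribution toward the teacher's (soft labels)."""
    p, q = _softmax(teacher_sims), _softmax(student_sims)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def hard_negative_loss(pos_sim, hard_neg_sim, margin=0.1):
    """Second sub-loss sketch: hinge pushing the hard negative's
    similarity at least `margin` below the positive's."""
    return max(0.0, margin + hard_neg_sim - pos_sim)

def first_target_loss(loss1, loss2, w1=1.0, w2=0.5):
    """Weighted combination of the two sub-losses; w1 and w2 play the role
    of the loss weighting factors (values are illustrative)."""
    return w1 * loss1 + w2 * loss2
```

The KL term is zero when the two similarity vectors agree, and the hinge term is zero once the hard negative is already separated from the positive by the margin.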
6. The method according to any one of claims 1-5, further comprising:
receiving an image to be marked;
acquiring K second sample positioning images corresponding to the image to be marked;
splitting each second sample positioning image in the K second sample positioning images to obtain N sub-second sample positioning images corresponding to each second sample positioning image;
and determining similarity labels corresponding to the image to be marked and the N sub second sample positioning images corresponding to each second sample positioning image through the second image positioning model.
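The auto-labelling step of claim 6 can be sketched in the same toy setting: run the trained model over the unlabeled query and every sub-region of each sample image, and store the resulting similarities as soft labels. The 1-D "images", the mean-value "feature", and the `1/(1 + distance)` similarity are all illustrative assumptions standing in for the second image positioning model:

```python
def split_1d(seq, n):
    """Split a 1-D 'image' into n equal sub-regions."""
    size = len(seq) // n
    return [seq[i * size:(i + 1) * size] for i in range(n)]

def mean_feat(region):
    """Toy feature: mean value (stand-in for the trained model)."""
    return sum(region) / len(region)

def generate_similarity_labels(image_to_mark, sample_images, n=2):
    """Soft similarity labels between the unlabeled query and each
    sub-region of each sample image, produced by the (toy) model."""
    q = mean_feat(image_to_mark)
    labels = []
    for ref in sample_images:
        labels.append([1.0 / (1.0 + abs(q - mean_feat(r)))
                       for r in split_1d(ref, n)])
    return labels
```

The resulting region-level labels can then supervise later training rounds without any manual annotation.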
7. The method according to any one of claims 1-6, wherein the first image positioning model comprises a basic image positioning model, and the basic image positioning model is a model obtained by training with the target image and the image having the highest similarity to the target image among the K first sample positioning images as a sample pair.
8. The method according to any one of claims 1-7, further comprising:
determining a second target loss function according to the second image positioning model, the target image and the K first sample positioning images;
adjusting the initial model according to the second target loss function to obtain a third image positioning model;
replacing the first image localization model with the third image localization model.
9. An image positioning method, characterized in that the method comprises:
receiving an image to be detected;
and positioning the image to be detected according to the second image positioning model obtained by the method of any one of claims 1-8, to obtain positioning information corresponding to the image to be detected.
10. An image positioning model obtaining apparatus, characterized in that the apparatus comprises:
the first determining unit is used for determining the similarity between the target image and K first sample positioning images according to the first image positioning model so as to obtain a first similarity vector, and K is an integer greater than 1;
a second determining unit, configured to determine a first target loss function according to the first similarity vector;
and the adjusting unit is used for adjusting an initial model according to the first target loss function to obtain a second image positioning model, wherein the initial model is obtained after the first image positioning model is initialized.
11. The apparatus of claim 10, wherein the first determining unit is configured to:
splitting each first sample positioning image in the K first sample positioning images to obtain N sub-first sample positioning images corresponding to each first sample positioning image;
determining feature values corresponding to the N sub-first sample positioning images corresponding to each first sample positioning image according to the first image positioning model to obtain a feature vector corresponding to each first sample positioning image;
determining a characteristic value of the target image according to the first image positioning model;
and determining the first similarity vector according to the feature vector corresponding to each first sample positioning image and the feature value of the target image.
12. The apparatus according to claim 10 or 11, wherein the second determining unit is configured to:
determining a first sub-loss function according to the first similarity vector;
determining a second sub-loss function according to the difficult negative sample image corresponding to the target image;
and determining the first target loss function according to the first sub-loss function and the second sub-loss function.
13. A terminal, comprising a processor, an input device, an output device, and a memory, the processor, the input device, the output device, and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-9.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-9.
CN202010478436.7A 2020-05-29 2020-05-29 Image positioning model obtaining method and related device Active CN111522988B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010478436.7A CN111522988B (en) 2020-05-29 2020-05-29 Image positioning model obtaining method and related device
PCT/CN2020/113099 WO2021237973A1 (en) 2020-05-29 2020-09-02 Image positioning model acquisition method and apparatus, and terminal and storage medium
TW110100513A TWI780563B (en) 2020-05-29 2021-01-06 Image positioning model acquisition method, terminal and computer-readable storage medium


Publications (2)

Publication Number Publication Date
CN111522988A true CN111522988A (en) 2020-08-11
CN111522988B CN111522988B (en) 2022-07-15

Family

ID=71909243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010478436.7A Active CN111522988B (en) 2020-05-29 2020-05-29 Image positioning model obtaining method and related device

Country Status (3)

Country Link
CN (1) CN111522988B (en)
TW (1) TWI780563B (en)
WO (1) WO2021237973A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021237973A1 (en) * 2020-05-29 2021-12-02 深圳市商汤科技有限公司 Image positioning model acquisition method and apparatus, and terminal and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898643A (en) * 2018-06-15 2018-11-27 广东数相智能科技有限公司 Image generating method, device and computer readable storage medium
AU2018101640A4 (en) * 2018-11-01 2018-12-06 Macau University Of Science And Technology A system and method for image processing
CN110413812A (en) * 2019-08-06 2019-11-05 北京字节跳动网络技术有限公司 Training method, device, electronic equipment and the storage medium of neural network model
CN110532417A (en) * 2019-09-02 2019-12-03 河北省科学院应用数学研究所 Image search method, device and terminal device based on depth Hash
CN111178249A (en) * 2019-12-27 2020-05-19 杭州艾芯智能科技有限公司 Face comparison method and device, computer equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9792532B2 (en) * 2013-06-28 2017-10-17 President And Fellows Of Harvard College Systems and methods for machine learning enhanced by human measurements
CN106202329B (en) * 2016-07-01 2018-09-11 北京市商汤科技开发有限公司 Sample data processing, data identification method and device, computer equipment
CN107145900B (en) * 2017-04-24 2019-07-26 清华大学 Pedestrian based on consistency constraint feature learning recognition methods again
US10896218B2 (en) * 2017-12-22 2021-01-19 Oracle International Corporation Computerized geo-referencing for images
CN110070579A (en) * 2019-03-16 2019-07-30 平安科技(深圳)有限公司 Localization method, device, equipment and storage medium based on image detection
CN110347854B (en) * 2019-06-13 2022-02-22 西安理工大学 Image retrieval method based on target positioning
CN110472092B (en) * 2019-07-15 2021-11-16 清华大学 Geographical positioning method and system of street view picture
CN111522988B (en) * 2020-05-29 2022-07-15 深圳市商汤科技有限公司 Image positioning model obtaining method and related device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU, NA: "Research on Video-Based Multi-Object Tracking Algorithm", China Master's Theses Full-Text Database *

Also Published As

Publication number Publication date
TWI780563B (en) 2022-10-11
WO2021237973A1 (en) 2021-12-02
TW202145075A (en) 2021-12-01
CN111522988B (en) 2022-07-15

Similar Documents

Publication Publication Date Title
CN111046744A (en) Method and device for detecting attention area, readable storage medium and terminal equipment
CN111832447B (en) Building drawing component identification method, electronic equipment and related product
US20190087683A1 (en) Method and apparatus for outputting information
CN110765882B (en) Video tag determination method, device, server and storage medium
CN111623765B (en) Indoor positioning method and system based on multi-mode data
CN111323024B (en) Positioning method and device, equipment and storage medium
CN113537254B (en) Image feature extraction method and device, electronic equipment and readable storage medium
CN113449700B (en) Training of video classification model, video classification method, device, equipment and medium
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
US10645297B2 (en) System, method, and program for adjusting angle of camera
CN111784776A (en) Visual positioning method and device, computer readable medium and electronic equipment
CN113361710A (en) Student model training method, picture processing device and electronic equipment
CN113822427A (en) Model training method, image matching device and storage medium
CN111522988B (en) Image positioning model obtaining method and related device
CN111310595B (en) Method and device for generating information
CN116030323B (en) Image processing method and device
CN113515983A (en) Model training method, mobile object identification method, device and equipment
CN115984853A (en) Character recognition method and device
CN115618099A (en) Neural network architecture searching method and device and electronic equipment
CN111275183B (en) Visual task processing method, device and electronic system
CN111104945A (en) Object identification method and related product
CN112927291B (en) Pose determining method and device of three-dimensional object, electronic equipment and storage medium
WO2023132040A1 (en) Action localization apparatus, control method, and non-transitory computer-readable storage medium
CN116434178A (en) Vehicle re-identification method and device, storage medium and electronic equipment
CN118015290A (en) Image feature processing method, image comparison method, model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (country code: HK; legal event code: DE; document number: 40026302)

GR01 Patent grant