CN113360791A - Interest point query method and device of electronic map, roadside equipment and vehicle
- Publication number: CN113360791A
- Application number: CN202110730642.7A
- Authority: CN (China)
- Prior art keywords
- text
- text content
- image
- interest point
- coding
- Legal status: Granted
Classifications
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
- G06F16/29—Geographical information databases
- G06F16/5846—Retrieval of still image data characterised by using metadata automatically derived from the content, using extracted text
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22—Matching criteria, e.g. proximity measures
- G06F40/30—Semantic analysis
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present disclosure provides an interest point query method and apparatus for an electronic map, a roadside device, and a vehicle, and relates to the technical fields of deep learning and intelligent transportation within the technical field of artificial intelligence. The method includes: identifying a signboard image to be queried to obtain initial text content of the signboard image and attribute information of the initial text content; filtering the initial text content according to the attribute information to filter out invalid text content and obtain valid text content; and, if the valid text content matches the text content of a preset interest point in the electronic map, determining the preset interest point as the interest point of the signboard image to be queried. Filtering reduces the text content that has to be matched against interest point text content, which improves matching efficiency and therefore query efficiency, and it avoids noise interference from the invalid text content, which improves the accuracy and reliability of the query.
Description
Technical Field
The present disclosure relates to the technical fields of deep learning and intelligent transportation within the technical field of artificial intelligence, and in particular to a point-of-interest query method and apparatus for an electronic map, a roadside device, and a vehicle.
Background
In the electronic map, a Point of Interest (POI) may be a house, a shop, a mailbox, a bus station, etc.
In the prior art, a commonly used point-of-interest query method is the Longest Common Subsequence (LCS) query method. For example, a character string corresponding to the signboard image to be queried (character string A) and a character string corresponding to each interest point (character string B) are determined, the common character string between character string A and each character string B is found, and the character string B that shares the longest common character string with character string A is taken as the interest point corresponding to the signboard image to be queried.
However, because all character strings of the signboard image are determined and compared against all interest point character strings, the comparison takes a long time, which results in the technical problem of low query efficiency.
Disclosure of Invention
The present disclosure provides an interest point query method and apparatus for an electronic map, a roadside device, and a vehicle, which are used to improve query efficiency.
According to a first aspect of the present disclosure, there is provided a method for inquiring a point of interest of an electronic map, including:
identifying a signboard image to be inquired to obtain initial text content of the signboard image to be inquired and attribute information of the initial text content;
filtering the initial text content according to the attribute information of the initial text content to filter invalid text content in the initial text content to obtain valid text content;
and if the effective text content is matched with the text content of a preset interest point in the electronic map, determining that the preset interest point is the interest point of the signboard image to be inquired, wherein the electronic map is provided with a plurality of interest points, and each interest point in the interest points is provided with text content.
According to a second aspect of the present disclosure, there is provided a method for training a signboard text filtering model, comprising:
obtaining a first sample set comprising a plurality of sample sign images;
determining rectangular boxes for framing the interest point names of the sample signboard images, and determining image information and text position information of each rectangular box;
inputting the image information, the text position information and the interest point name in each rectangular box into a Board-transformer model framework, training the Board-transformer model framework, and generating a signboard text filtering model, wherein the signboard text filtering model is used for filtering invalid text contents in the signboard image to be inquired.
According to a third aspect of the present disclosure, there is provided an interest point query apparatus for an electronic map, including:
the identification unit is used for identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content;
the filtering unit is used for filtering the initial text content according to the attribute information of the initial text content so as to filter invalid text content in the initial text content to obtain valid text content;
a first determining unit, configured to determine that a preset interest point is an interest point of the signboard image to be queried if the valid text content matches text content of the preset interest point in the electronic map, where the electronic map has multiple interest points, and each of the multiple interest points has text content.
According to a fourth aspect of the present disclosure, there is provided a training apparatus for a signboard text filtering model, comprising:
an acquisition unit configured to acquire a first sample set including a plurality of sample signboard images therein;
a second determining unit, configured to determine a rectangular frame for framing the interest point name of each sample signboard image, and determine image information and text position information of each rectangular frame;
and the training unit is used for inputting the image information, the text position information and the interest point name in each rectangular box into a Board-transformer model frame, training the Board-transformer model frame and generating a signboard text filtering model, wherein the signboard text filtering model is used for filtering invalid text contents in the signboard image to be inquired.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect; or to enable the at least one processor to perform the method of the second aspect.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of the first aspect; alternatively, the computer instructions are for causing the computer to perform the method of the second aspect.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising: a computer program stored in a readable storage medium, from which the computer program can be read by at least one processor of an electronic device, execution of the computer program by the at least one processor causing the electronic device to perform the method of the first aspect; alternatively, execution of the computer program by the at least one processor causes the electronic device to perform the method of the second aspect.
According to an eighth aspect of the present disclosure, there is provided a vehicle comprising: the apparatus of the second aspect.
According to a ninth aspect of the present disclosure, there is provided a roadside apparatus including: the apparatus of the second aspect.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a scene diagram of a point of interest query method of an electronic map in which an embodiment of the present disclosure may be implemented;
FIG. 2 is a scene diagram of a point of interest query method of an electronic map in which an embodiment of the present disclosure may be implemented;
FIG. 3 is a schematic view of a point-of-interest query method of an electronic map in the related art;
FIG. 4 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 6 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 7 is a schematic diagram of the principles of training a text matching model according to the present disclosure;
FIG. 8 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 9 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a point of interest query method of an electronic map according to the present disclosure;
FIG. 11 is a schematic diagram according to a sixth embodiment of the present disclosure;
FIG. 12 is a schematic diagram of the training of a signboard text filtering model according to the present disclosure;
FIG. 13 is a schematic diagram according to a seventh embodiment of the present disclosure;
FIG. 14 is a schematic diagram according to an eighth embodiment of the present disclosure;
FIG. 15 is a schematic diagram according to a ninth embodiment of the present disclosure;
FIG. 16 is a schematic diagram according to a tenth embodiment of the present disclosure;
FIG. 17 is a block diagram of an electronic device for implementing a point of interest query method of an electronic map according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The interest point is a term in a geographic information system, and generally refers to all geographic objects which can be abstracted as points, especially some geographic entities closely related to the life of people, such as schools, banks, restaurants, gas stations, hospitals, supermarkets, and the like. The main purpose of the interest points is to describe the addresses of the things or events, so that the description capability and the query capability of the positions of the things or events can be greatly enhanced, and the accuracy and the speed of geographic positioning are improved.
With the wide application of artificial intelligence in intelligent transportation and smart cities, in order to facilitate the traveling of users, interest points are marked in the electronic map, for example, interest points representing schools, banks and the like can be marked in the electronic map.
A point-of-interest query (also referred to as a point-of-interest search, or a map search) is one of the basic technologies of location information services and directly affects the service experience of the user. Technically, point-of-interest query is a branch of Web search that supports users in searching for points of interest related to a geographic location.
In one example, the interest point query may be applied to update the interest points marked on the electronic map, such as adding missing interest points on the electronic map, adding newly added interest points, deleting outdated interest points, and the like.
For example, with respect to a newly constructed office building, points of interest of the newly constructed office building may be added to the electronic map based on the location information of the newly constructed office building and the like.
For example, in conjunction with the application scenario shown in FIG. 1, user 101 may transmit a signboard image of a newly constructed office building to computer 102.
The computer 102 may determine whether to mark an interest point corresponding to the signboard image in the electronic map by using an interest point query method, and if it is determined that the interest point corresponding to the signboard image is not marked in the electronic map, the computer 102 may mark an interest point of a newly-built office building in the electronic map according to the signboard image.
In another example, the point of interest query may be applied to path planning, thereby implementing services such as automatic driving.
For example, in connection with the application scenario as shown in fig. 2, a vehicle 201 is traveling on a road 202.
A user (not shown in the figure) in the vehicle 201 can input a signboard image of a destination to the vehicle 201.
The vehicle 201 determines the interest point of the signboard image in the electronic map by using the interest point query method, and determines the driving path according to the current position and the interest point, thereby realizing automatic driving.
As can be seen from fig. 3, in the related art, a commonly used method of querying a point of interest includes:
the first step is as follows: and performing Optical Character Recognition (OCR) on the signboard image to be queried to obtain the text content of the signboard image to be queried. Specifically, the text content is "XX medicine, NO233XX garden shop" as shown in fig. 3.
The second step is as follows: the textual content is text matched (LCS) to the point of interest names in a point of interest name repository, such as the POI name repository shown in fig. 3.
The interest point name library is as follows: the name library is formed by the names of the interest points marked in the electronic map. For example, the point of interest name repository may include "XX Garden, XX Bank" as shown in FIG. 3.
The specific text matching method may include: determining the character string of the text content (i.e., the character string of "XX medicine, NO233XX garden shop"); determining the character string of each interest point name in the interest point name library (i.e., the character string of "XX garden", the character string of "XX bank", and so on); determining the common character string between the text content and each interest point name; determining the longest common character string among these common character strings; determining the interest point name corresponding to the longest common character string as the interest point name of the signboard image to be queried; and determining the interest point marked on the electronic map under that interest point name as the interest point of the signboard image to be queried (i.e., the query result shown in fig. 3).
On the one hand, however, when the interest point of the signboard image to be queried is determined with the related-art method shown in fig. 3, all of the text content of the signboard image is matched against the interest point names, so invalid text content within that text content can lower the accuracy of the query result.
For example, "NO 233XX garden shop" shown in fig. 3 is invalid text content; although an interest point name can be matched for "NO 233XX garden shop", the interest point matched on the electronic map is "XX bank". That is, interference from the invalid portion of the text content leads to a wrong interest point being obtained.
On the other hand, when the interest point of the signboard image to be queried is determined with the related-art method shown in fig. 3, all of the text content of the signboard image is matched against the interest point names, so more text content is used for text matching and the matching takes a relatively long time, which results in the technical problem of low query efficiency.
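The related-art matching described above can be illustrated with a short, non-limiting sketch. The snippet below assumes the common-string comparison is implemented as a longest-common-substring search over plain Python strings; the signboard text and the interest point names are placeholders echoing fig. 3:

```python
def longest_common_substring_len(a: str, b: str) -> int:
    """Length of the longest common substring of a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def related_art_query(sign_text: str, poi_names: list[str]) -> str:
    """Return the POI name sharing the longest common string with the sign text."""
    return max(poi_names, key=lambda name: longest_common_substring_len(sign_text, name))


poi_name_library = ["XX garden", "XX bank"]
# The invalid portion of the sign text dominates the comparison, so a wrong
# interest point is returned - the noise problem discussed above.
print(related_art_query("XX medicine, NO233XX garden shop", poi_name_library))
```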
In order to avoid at least one of the above technical problems, the inventors of the present disclosure have made creative efforts to obtain the inventive concept of the present disclosure: after the text content of the signboard image to be inquired is obtained, filtering the invalid text content in the text content so as to match the text content of the interest point based on the valid text content, thereby obtaining the interest point of the signboard image to be inquired.
Based on this inventive concept, the present disclosure provides an interest point query method and apparatus for an electronic map, a roadside device, and a vehicle, which are applied to the technical fields of deep learning and intelligent transportation within artificial intelligence and, in particular, can be applied to automatic driving scenarios to improve query efficiency and accuracy.
Fig. 4 is a schematic diagram according to a first embodiment of the present disclosure, and as shown in fig. 4, the method for querying a point of interest of an electronic map provided by the present embodiment includes:
S401: and identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content.
For example, the execution subject of this embodiment may be a point of interest query device (hereinafter, referred to as a query device for short) of an electronic map, and the query device may be a server (including a local server and a cloud server, where the server may be a cloud control platform, a vehicle-road cooperative management platform, a central subsystem, an edge computing platform, a cloud computing platform, and the like), may also be a roadside device, may also be a vehicle (such as a vehicle-mounted terminal in a vehicle, and the like), may also be a terminal device, may also be a processor, may also be a chip, and the like, which is not limited in this embodiment.
In a system architecture of intelligent transportation vehicle-road cooperation, the road side equipment comprises road side sensing equipment with a computing function and road side computing equipment connected with the road side sensing equipment, the road side sensing equipment (such as a road side camera) is connected to the road side computing equipment (such as a Road Side Computing Unit (RSCU)), the road side computing equipment is connected to a server, and the server can communicate with an automatic driving vehicle or an auxiliary driving vehicle in various modes; or the roadside sensing device comprises a calculation function, and the roadside sensing device is directly connected to the server. The above connections may be wired or wireless.
The attribute information of the initial text content refers to attributes of the initial text, such as semantic attributes, position attributes, image attributes, and the like. That is, the attribute information of the initial text content may be used to describe the initial text content from different dimensions.
In some embodiments, the signboard image to be queried may be recognized by an optical character recognition method, so as to obtain the initial text content and the attribute information of the initial text content.
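As a non-limiting illustration of the kind of output S401 can produce, the sketch below uses pytesseract as a stand-in optical character recognition engine (the present disclosure does not prescribe any specific engine); the TextLine container simply groups the semantic, position, and image attributes described above:

```python
from dataclasses import dataclass
from PIL import Image
import pytesseract


@dataclass
class TextLine:
    text: str                       # semantic attribute: the recognized characters
    box: tuple[int, int, int, int]  # position attribute: (left, top, width, height) in pixels
    image_id: str                   # image attribute: identifier of the source signboard image


def recognize_signboard(image_path: str) -> list[TextLine]:
    """Recognize a signboard image and return its initial text content with attribute information."""
    data = pytesseract.image_to_data(Image.open(image_path),
                                     output_type=pytesseract.Output.DICT)
    lines = []
    for text, left, top, w, h in zip(data["text"], data["left"], data["top"],
                                     data["width"], data["height"]):
        if text.strip():  # keep only non-empty recognized tokens
            lines.append(TextLine(text=text, box=(left, top, w, h), image_id=image_path))
    return lines
```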
S402: and filtering the initial text content according to the attribute information of the initial text content to filter invalid text content in the initial text content to obtain valid text content.
Illustratively, in conjunction with the related art as shown in fig. 3, the initial text content includes "XX medicine, NO233XX garden shop", and in this embodiment, the query device performs filtering processing on the invalid text "NO 233XX garden shop" in the initial text content "XX medicine, NO233XX garden shop" based on the attribute information of the initial text content, resulting in the valid text "XX medicine".
S403: and if the effective text content is matched with the text content of the preset interest point in the electronic map, determining the preset interest point as the interest point of the signboard image to be inquired.
The electronic map is provided with a plurality of interest points, and each interest point in the interest points is provided with text content.
In combination with the above example, by performing filtering processing on the invalid text "NO 233XX garden shop", and matching the text content of the preset point of interest with the valid text "XX medicine", as shown in fig. 3, matching "XX medicine" with "XX garden, XX bank", and the like respectively, the text content of the preset point of interest matching with "XX medicine" is obtained, and thus the point of interest of the signboard image to be queried is obtained.
Based on the above analysis, an embodiment of the present disclosure provides a point-of-interest query method for an electronic map. The method includes: identifying the signboard image to be queried to obtain the initial text content of the signboard image to be queried and the attribute information of the initial text content; filtering the initial text content according to the attribute information of the initial text content to filter out invalid text content and obtain valid text content; and, if the valid text content matches the text content of a preset interest point in the electronic map, determining the preset interest point as the interest point of the signboard image to be queried, where the electronic map has a plurality of interest points and each of the plurality of interest points has text content. In this embodiment, because the invalid text content is filtered out after the text content of the signboard image to be queried is obtained, matching against the interest point text content is performed on the basis of the valid text content only. This reduces the text content used for matching with the interest point text content, which improves matching efficiency and therefore query efficiency; and filtering the invalid text content avoids noise interference with the matching, which improves the accuracy and reliability of the query.
Fig. 5 is a schematic diagram according to a second embodiment of the present disclosure, and as shown in fig. 5, the method for querying a point of interest of an electronic map provided by the present embodiment includes:
S501: and identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content.
For an exemplary implementation principle of S501, reference may be made to the first embodiment, which is not described herein again.
Wherein the attribute information of the initial text content includes: semantic attributes, location attributes, and image attributes.
The semantic attribute of the initial text content refers to information related to the semantics of the initial text content (such as the name of an interest point in the initial text content); the position attribute of the initial text content refers to information related to the position of the initial text content (e.g., the pixel position of the initial text content); the image attribute of the initial text content refers to information related to an image (e.g., an image identifier) of the initial text content.
S502: and respectively coding the semantic attribute, the position attribute and the image attribute of the initial text content to obtain a coding feature set comprising respective corresponding coding features.
Illustratively, S502 may include the steps of:
the first step is as follows: and coding the semantic attributes of the initial text content according to a first coder of a pre-trained signboard text filtering model to obtain a first coding characteristic.
The second step is as follows: and coding the position attribute of the initial text content according to a first coder of the signboard text filtering model to obtain a second coding characteristic.
The third step: and coding the image attribute of the initial text content according to the first coder of the signboard text filtering model to obtain a third coding characteristic.
Wherein the signboard text filtering model is: generated by training the Board-transformer model framework based on each sample signboard image in the first sample set.
It should be noted that, in this embodiment, the signboard text filtering model generated by training based on the multi-modal Board-transformer model framework is used to perform coding processing on the semantic attribute, the position attribute, and the image attribute of the initial text content, so that parallel coding processing can be implemented without affecting the coding processing, and the accuracy and reliability of each coding feature in the obtained coding feature set are improved.
S503: and determining a first leading coding feature from the coding feature set, and, with the first leading coding feature as the filtering reference coding feature, filtering the coding features in the coding feature set other than the first leading coding feature, so as to filter out the invalid text content in the initial text content and obtain the valid text content.
It should be understood that "first" in the first leading coding feature of this embodiment is only used to distinguish it from the second leading coding feature mentioned below, and is not to be construed as a limitation on the first leading coding feature.
For example, if the coding feature set includes m coding features and the first leading coding feature is coding feature a, then coding feature a is used as the filtering reference coding feature and the remaining (m-1) coding features (that is, the coding features in the set other than coding feature a) are filtered to remove invalid coding features, thereby filtering out the invalid text content in the initial text content.
It should be noted that, in this embodiment, by performing filtering processing on the invalid text content in the initial text content based on the respective corresponding encoding features of the semantic attribute, the position attribute, and the image attribute of the initial text content, the invalid text content can be filtered from multiple dimensions, so that the valid text content has the technical effects of higher accuracy and reliability.
In some embodiments, the principles of the filtering process may include: and determining the association degree between each coding feature except the first leading coding feature in the coding feature set and the first leading coding feature, and filtering the coding features of which the association degree is smaller than a preset association degree threshold.
The association degree can be realized by calculating the similarity, and the greater the similarity is, the higher the association degree is, and conversely, the smaller the similarity is, the lower the association degree is.
Specifically, when filtering invalid coding features by calculating the similarity, the following example may be referred to:
and (3) calculating the similarity between any coding feature and the coding feature a aiming at any coding feature in the (m-1) coding features, filtering the any coding feature if the similarity is smaller than a preset similarity threshold, otherwise, reserving the any coding feature if the similarity is larger than or equal to the similarity threshold, performing fusion processing on the reserved coding feature and the coding feature a after all reserved coding features are obtained, obtaining the coding feature after the fusion processing, and determining the text content corresponding to the coding feature after the fusion processing as effective text content.
The similarity threshold may be set by the querying device based on the needs, history, and tests, which is not limited in this embodiment.
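A minimal sketch of the similarity-based filtering and fusion just described is given below. It assumes cosine similarity over feature vectors and a simple mean fusion; the threshold value and the fusion scheme are illustrative choices, not details fixed by the present disclosure:

```python
import numpy as np


def filter_and_fuse(features: list[np.ndarray], lead_index: int = 0,
                    sim_threshold: float = 0.5) -> np.ndarray:
    """Filter the coding features against the leading coding feature and fuse the survivors.

    features   -- the m coding features in the coding feature set
    lead_index -- index of the first leading coding feature (the filtering reference)
    """
    lead = features[lead_index]
    kept = [lead]
    for i, feat in enumerate(features):
        if i == lead_index:
            continue
        sim = float(np.dot(lead, feat) / (np.linalg.norm(lead) * np.linalg.norm(feat)))
        if sim >= sim_threshold:        # below the threshold -> treated as invalid and filtered out
            kept.append(feat)
    return np.mean(kept, axis=0)        # fusion: here simply the mean of the retained features
```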
It should be noted that, in this embodiment, by performing filtering processing on the coding features based on the association degree, filtering of the interference coding features can be achieved, so that the interference text content is filtered as the invalid text content, and the technical effects of improving the accuracy and reliability of filtering the invalid text content are achieved.
S504: and if the effective text content is matched with the text content of the preset interest point in the electronic map, determining the preset interest point as the interest point of the signboard image to be inquired.
The electronic map is provided with a plurality of interest points, and each interest point in the interest points is provided with text content.
For example, regarding the implementation principle of S504, reference may be made to the first embodiment, and details are not described here.
Fig. 6 is a schematic diagram according to a third embodiment of the present disclosure, and as shown in fig. 6, the method for querying a point of interest of an electronic map provided by the present embodiment includes:
S601: and identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content.
For an exemplary implementation principle of S601, reference may be made to the first embodiment, which is not described herein again.
S602: and filtering the initial text content according to the attribute information of the initial text content to filter invalid text content in the initial text content to obtain valid text content.
For an exemplary implementation principle of S602, reference may be made to the first embodiment or the second embodiment, which is not described herein again.
S603: inputting the effective text content and the text content of each interest point of the electronic map into a pre-trained text matching model, and performing matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be inquired.
And the text matching model is generated by performing triple loss training on the self-supervision model frame based on the text content of each interest point in the second sample set.
It should be noted that, in this embodiment, the valid text content and the text content of each interest point of the electronic map are matched by a text matching model generated by performing triplet loss training on a self-supervised model framework. In other words, the matching result (the text content of the preset interest point in the electronic map that matches the valid text content) is determined from the three dimensions of the triplet, namely the origin (anchor) text content, the positive text content, and the negative text content. The matching is therefore comprehensive and semantic features are fully taken into account, which improves the accuracy and reliability of matching and, in turn, the accuracy and reliability of the interest point determined for the signboard image to be queried.
In some embodiments, the text matching model is generated by adjusting parameters of the self-supervised model framework based on first difference information between origin text content and positive text content and second difference information between the origin text content and negative text content. The origin text content is randomly selected from the interest point text contents; the positive text content is interest point text content of the same type as the origin text content; and interest point text content of a type different from the origin text content is determined as the negative text content.
For example, in some embodiments, referring to FIG. 7, the text matching model may be trained as follows: the origin text content, the positive text content, and the negative text content are respectively input into a language representation model (e.g., BERT shown in FIG. 7), and the output text feature u of the origin text content, text feature v of the positive text content, and text feature w of the negative text content are trained with a triplet loss function (e.g., the triplet loss shown in FIG. 7); specifically, the parameters of the self-supervised model framework are adjusted to obtain the text matching model.
It should be understood that the above examples are merely exemplary illustrations that may use BERT as a language representation model and are not to be construed as limiting the model. Similarly, the loss function is also only exemplarily illustrated by the triplet loss, but is not to be construed as a limitation of the loss function.
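As a rough, non-authoritative sketch of one triplet training step under the assumptions above (a BERT-style encoder providing the text features u, v, w, and PyTorch's built-in triplet margin loss standing in for the triplet loss function; the model name, pooling scheme, margin, and example texts are illustrative):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")


def embed(texts):
    """Encode a batch of interest point texts into sentence features (mean-pooled BERT outputs)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state.mean(dim=1)


triplet_loss = torch.nn.TripletMarginLoss(margin=0.5)
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

# origin (anchor), positive, and negative interest point text contents from the second sample set
u = embed(["XX medicine (branch A)"])   # origin text content
v = embed(["XX medicine (branch B)"])   # positive: same type as the origin text content
w = embed(["XX bank"])                  # negative: different type from the origin text content

loss = triplet_loss(u, v, w)            # first/second difference information expressed as a triplet loss
loss.backward()
optimizer.step()
```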
Fig. 8 is a schematic diagram of a fourth embodiment of the present disclosure, and as shown in fig. 8, the method for querying a point of interest of an electronic map provided by the present embodiment includes:
S801: and identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content.
For an exemplary implementation principle of S801, reference may be made to the first embodiment, which is not described herein again.
S802: and filtering the initial text content according to the attribute information of the initial text content to filter the invalid text content in the initial text content to obtain the valid text content.
For an exemplary implementation principle of S802, reference may be made to the first embodiment or the second embodiment, which is not described herein again.
S803: and inputting the effective text content and the text of each interest point of the electronic map into a pre-trained text matching model, and aligning and matching the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be inquired.
And the text matching model is generated by training a face recognition model framework ArcFace based on the text content of each interest point in the third sample set.
It should be noted that, in this embodiment, a text matching model generated based on a face recognition model frame is introduced, and the effective text content and the text content of each interest point of the electronic map are aligned and matched, so as to obtain the interest point of the signboard image to be queried, and the image features of the effective text content are fully considered, so that the technical effects of accuracy and reliability of the determined interest point of the signboard image to be queried can be improved.
In some embodiments, S803 may include the steps of:
the first step is as follows: and performing text detection on the effective text content to obtain a text image of the effective text content, and aligning the text image with the text image of the text content of each interest point of the electronic map.
For the alignment processing principle, reference may be made to the principle of performing alignment processing on a face image in the related art, and details are not described here.
The second step is as follows: and performing feature extraction on the text image of the aligned effective text content to obtain a first image feature, performing feature extraction on the text image of the text content of each interest point of the electronic map to obtain a second image feature, and performing matching processing on the first image feature and the second image feature to obtain the interest point of the signboard image to be inquired.
It should be noted that, in this embodiment, by aligning the text images to be matched and matching the image features obtained after the alignment, the matching is targeted, so as to improve the matching accuracy, and further achieve the technical effects of accuracy and reliability of the points of interest of the signboard images to be queried.
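The first and second steps above can be sketched roughly as follows, assuming the aligned text images have already been turned into feature vectors by the ArcFace-trained backbone (the feature vectors, POI names, and threshold here are placeholders supplied by the caller):

```python
import numpy as np


def match_point_of_interest(query_feature: np.ndarray,
                            poi_features: dict[str, np.ndarray],
                            sim_threshold: float = 0.6):
    """Match the first image feature (signboard text image) against the second image
    features (interest point text images) and return the best POI name, or None."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    best_name, best_sim = None, -1.0
    for name, feature in poi_features.items():
        sim = cosine(query_feature, feature)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= sim_threshold else None
```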
In some embodiments, the text matching model is generated by training ArcFace according to a third image feature and a fourth image feature. The third image feature is obtained by performing feature extraction on the aligned text images of the text content of each interest point in the third sample set, and the fourth image feature is obtained by performing feature extraction on the preset standard text image. The alignment processing refers to: performing text detection on the text content of each interest point in the third sample set to obtain a text image of that text content, and aligning the text image with a preset standard text image.
The preset standard text image may be obtained by performing mean processing on the text image of each interest point in the third sample set.
That is, the method of training and generating the text matching model may include: performing text detection on the text content of each interest point in the third sample set to obtain a text image of that text content; aligning each text image with a preset standard text image; performing feature extraction on the aligned text images to obtain a third image feature; performing feature extraction on the preset standard text image to obtain a fourth image feature; and training ArcFace according to the third image feature and the fourth image feature to generate the text matching model.
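For completeness, a compact sketch of the additive angular margin loss that an ArcFace-style framework applies during this training, computed here over the third image features and their interest point class labels; the scale s, margin m, and feature dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ArcFaceHead(nn.Module):
    """Additive angular margin (ArcFace-style) classification head over POI text-image features."""

    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        cos = F.linear(F.normalize(features), F.normalize(self.weight))   # cos(theta) per class
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = torch.cos(theta + self.m)                                # add the angular margin
        onehot = F.one_hot(labels, cos.size(1)).float()
        logits = self.s * (onehot * target + (1 - onehot) * cos)
        return F.cross_entropy(logits, labels)


# toy usage: 4 POI text-image features of dimension 128, 10 POI classes
head = ArcFaceHead(feat_dim=128, num_classes=10)
loss = head(torch.randn(4, 128), torch.randint(0, 10, (4,)))
loss.backward()
```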
Based on the third embodiment and the fourth embodiment, in the present embodiment, different methods may be adopted to implement matching between the effective text content and the text content of each interest point in the electronic map, so that the technical effects of flexibility and diversity of matching may be achieved.
Fig. 9 is a schematic diagram according to a fifth embodiment of the present disclosure, and as shown in fig. 9, the method for querying a point of interest of an electronic map provided by the present embodiment includes:
S901: and identifying the signboard image to be inquired to obtain the semantic attribute, the position attribute and the image attribute of the signboard image to be inquired.
In conjunction with the schematic diagram shown in fig. 10, the signboard image to be queried may be input to an optical character recognition model (i.e., an OCR module as shown in fig. 10), recognized by the optical character recognition model, and output semantic attributes (i.e., OCR text shown in fig. 10), position attributes (i.e., OCR position as shown in fig. 10), and image attributes (i.e., OCR image as shown in fig. 10).
S902: and inputting the semantic attribute, the position attribute and the image attribute into a signboard text filtering model, and filtering according to the signboard text filtering model to obtain the effective text content of the signboard image to be inquired.
In conjunction with the schematic diagram shown in fig. 10, the sign text filtering model (i.e., the OCR text Board-transformer shown in fig. 10) filters the invalid text content of the sign image to be queried according to the OCR text, the OCR image, and the OCR position, so as to obtain valid text content (i.e., the valid POI text shown in fig. 10).
S903: and matching the effective text content with each interest point text content of the electronic map to obtain the interest point of the signboard image to be inquired.
In conjunction with the schematic diagram shown in fig. 10, the text content of each point of interest of the electronic map may be stored in the POI name library shown in fig. 10, the valid POI text and the text content of each point of interest in the POI name library are input into a text matching model (i.e., POI-match shown in fig. 10), and the point of interest (not shown in fig. 10) associated with the signboard image to be queried is output.
In some embodiments, S903 may include: according to the position attribute of the signboard image to be queried, selecting, from the text content of each interest point of the electronic map, the interest point text content whose position attribute falls within a preset range, and matching the valid text content against the interest point text content within the preset range to obtain the interest point of the signboard image to be queried.
The preset range may be set based on a demand, a history, a test, and the like, and this embodiment is not limited.
It should be noted that, in this embodiment, matching is performed within a preset range, so that the matching times can be reduced, the matching efficiency can be improved, noise interference can be avoided, and the matching accuracy and reliability can be improved.
The matching process can be understood as a similarity calculation; by setting a similarity threshold, the interest point corresponding to the interest point text content with the best matching result can be determined as the interest point of the signboard image to be queried.
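A small sketch of this variant of S903 follows, assuming each interest point record carries planar coordinates and that text_similarity is whatever scoring function the text matching model provides (the record layout, radius, and threshold are illustrative):

```python
import math


def select_candidates(sign_position, pois, radius=500.0):
    """Keep only the interest points whose position falls within the preset range."""
    x0, y0 = sign_position
    return [p for p in pois if math.hypot(p["x"] - x0, p["y"] - y0) <= radius]


def query_point_of_interest(valid_text, sign_position, pois, text_similarity, sim_threshold=0.8):
    """Match the valid text content only against interest points inside the preset range."""
    candidates = select_candidates(sign_position, pois)
    if not candidates:
        return None
    best = max(candidates, key=lambda p: text_similarity(valid_text, p["name"]))
    if text_similarity(valid_text, best["name"]) >= sim_threshold:
        return best
    return None   # no match: the signboard's interest point may need to be added to the map
```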
In some embodiments, for a scenario in which the method of the present embodiment is applied to a point of interest marked in an electronic map, whether any point of interest has been marked in the electronic map can be verified by the method of the present embodiment.
For example, if, when the valid text content is matched against the text content of each interest point in the electronic map, no interest point text content matches the valid text content, this indicates that no corresponding interest point has yet been marked in the electronic map for the signboard image to be queried. In that case, the interest point corresponding to the signboard image to be queried may be marked in the electronic map according to the position attribute of the signboard image, and the corresponding POI name (which may be determined based on the valid text content) may be added to the POI name library.
Fig. 11 is a schematic diagram of a sixth embodiment of the present disclosure, and as shown in fig. 11, the method for training a signboard text filtering model provided in this embodiment includes:
S1101: a first sample set is obtained, the first sample set including a plurality of sample signboard images.
S1102: rectangular boxes for framing the interest point names of each sample signboard image are determined, and image information and text position information of each rectangular box are determined.
S1103: inputting the image information, the text position information and the interest point name in each rectangular frame into a Board-transformer model frame, and training the Board-transformer model frame to generate a signboard text filtering model.
The signboard text filtering model is used for filtering invalid text contents in the signboard image to be inquired.
In some embodiments, S1103 may include the steps of:
the first step is as follows: and respectively coding the input image information and text position information of each rectangular frame and the interest point name in each rectangular frame according to a Board-transformer model frame to obtain the coding characteristics corresponding to the image information and text position information of each rectangular frame and the interest point name in each rectangular frame.
The second step is as follows: and training the Board-transformer model frame according to the image information and the text position information of each rectangular frame and the coding characteristics corresponding to the interest point names in each rectangular frame to generate a signboard text filtering model.
In some embodiments, the second step may comprise the sub-steps of:
the first substep: and determining the image information, the text position information and the second first encoding characteristic in the encoding characteristics corresponding to the interest point name in each rectangular frame.
The second substep: and according to the second first-order coding features, filtering and fusing the image information and the text position information of each rectangular frame and the coding features corresponding to the interest point names in each rectangular frame to obtain the coding features after filtering and fusing, and adjusting the parameters of the Board-transformer model frame according to the coding features after filtering and fusing to obtain the signboard text filtering model.
In some embodiments, the second substep may comprise: and filtering the image information and the text position information of each rectangular frame and invalid coding features in the coding features corresponding to the interest point names in each rectangular frame based on the second first coding features, and fusing the image information and the text position information of each rectangular frame and the valid coding features in the coding features corresponding to the interest point names in each rectangular frame by taking the second first coding features as basic features to obtain the coding features after filtering and fusing.
In some embodiments, the Board-transformer model framework includes a plurality of encoders, and the encoders for encoding the input image information, text position information, and interest point names in each rectangular box are different from each other.
In some embodiments, the image information of each rectangular box comprises an image identification of each rectangular box, the text position information of each rectangular box comprises a position, a length, and a width of a center point of each rectangular box, and the interest point name in each rectangular box comprises pixels of the interest point name in each rectangular box.
Exemplarily, in combination with the training principle shown in FIG. 12, a text encoder (the text Embedder shown in FIG. 12) encodes the interest point name, a position encoder (the position Embedder shown in FIG. 12) encodes the text position information, and an image encoder (the image Embedder shown in FIG. 12) encodes the image information. Each encoder feeds its coding features into a Transformer, and the Transformer determines the second leading coding feature and filters and fuses the other coding features based on it; when a preset number of iterations is reached or a preset training requirement is met, the signboard text filtering model is obtained.
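A highly simplified, non-authoritative sketch of the training loop of FIG. 12 is given below. The three encoders (text, position, image) and the Transformer are plain PyTorch modules; the layer sizes, the learned leading token, and the binary valid/invalid target are illustrative assumptions rather than details fixed by the present disclosure:

```python
import torch
import torch.nn as nn


class BoardTransformer(nn.Module):
    """Toy Board-transformer: separate embedders for text, position, and image inputs,
    a Transformer encoder, and a head that classifies each box as valid/invalid text."""

    def __init__(self, vocab_size=8000, n_images=1000, d=128):
        super().__init__()
        self.text_embedder = nn.Embedding(vocab_size, d)     # interest point name tokens
        self.pos_embedder = nn.Linear(4, d)                   # center x, center y, length, width
        self.image_embedder = nn.Embedding(n_images, d)       # image identifier of each box
        self.lead = nn.Parameter(torch.zeros(1, 1, d))        # leading coding feature token
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True), num_layers=2)
        self.head = nn.Linear(d, 2)                            # valid vs. invalid text

    def forward(self, text_ids, boxes, image_ids):
        tokens = (self.text_embedder(text_ids)
                  + self.pos_embedder(boxes)
                  + self.image_embedder(image_ids))
        x = torch.cat([self.lead.expand(tokens.size(0), -1, -1), tokens], dim=1)
        x = self.transformer(x)
        return self.head(x[:, 1:])                             # per-box valid/invalid logits


model = BoardTransformer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# one toy batch: 1 sample signboard image with 3 rectangular boxes
text_ids = torch.randint(0, 8000, (1, 3))
boxes = torch.rand(1, 3, 4)
image_ids = torch.randint(0, 1000, (1, 3))
labels = torch.tensor([[1, 0, 0]])            # 1 = interest point name, 0 = invalid text

logits = model(text_ids, boxes, image_ids)
loss = loss_fn(logits.reshape(-1, 2), labels.reshape(-1))
loss.backward()
optimizer.step()
```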
Fig. 13 is a schematic diagram according to a seventh embodiment of the disclosure, and as shown in fig. 13, an apparatus 1300 for querying a point of interest of an electronic map provided in this embodiment includes:
the identifying unit 1301 is configured to identify the signboard image to be queried, so as to obtain an initial text content of the signboard image to be queried and attribute information of the initial text content.
The filtering unit 1302 is configured to filter the initial text content according to the attribute information of the initial text content, so as to filter the invalid text content in the initial text content, and obtain the valid text content.
The first determining unit 1303 is configured to determine a preset interest point as the interest point of the signboard image to be queried if the valid text content matches the text content of the preset interest point in the electronic map, where the electronic map has a plurality of interest points, and each of the plurality of interest points has text content.
Fig. 14 is a schematic diagram according to an eighth embodiment of the present disclosure, and as shown in fig. 14, the apparatus 1400 for querying a point of interest of an electronic map provided in this embodiment includes:
and an identifying unit 1401, configured to identify the signboard image to be queried, so as to obtain an initial text content of the signboard image to be queried and attribute information of the initial text content.
The filtering unit 1402 is configured to filter the initial text content according to the attribute information of the initial text content, so as to filter the invalid text content in the initial text content, and obtain the valid text content.
As can be appreciated in conjunction with FIG. 14, in some embodiments, the filtering unit 1402 includes:
a first encoding subunit 14021, configured to perform encoding processing on the semantic attribute, the position attribute, and the image attribute of the initial text content, respectively, to obtain an encoding feature set including corresponding encoding features.
The filtering subunit 14022 is configured to determine a first leading coding feature from the coding feature set, and filter, with the first leading coding feature as the filtering reference coding feature, the coding features in the coding feature set other than the first leading coding feature, so as to filter the invalid text content in the initial text content and obtain the valid text content.
In some embodiments, the filtering subunit 14022 is configured to determine the degree of association between the first leading coding feature and each other coding feature in the coding feature set, and to filter out the coding features whose degree of association is smaller than a preset association degree threshold.
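A minimal sketch of this association-degree filtering is given below, assuming cosine similarity is used as the degree of association; both the similarity measure and the threshold value are assumptions for illustration, since the disclosure does not fix either.

```python
import torch
import torch.nn.functional as F

def filter_by_association(leading_feat, other_feats, threshold=0.35):
    """Keep only coding features whose cosine similarity (used here as the
    "association degree") with the leading coding feature reaches the preset
    threshold. Threshold value and metric are illustrative assumptions."""
    sims = F.cosine_similarity(other_feats, leading_feat.unsqueeze(0), dim=-1)  # (N,)
    keep = sims >= threshold
    return other_feats[keep], keep
```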
In some embodiments, the first encoding subunit 14021 is configured to encode the semantic attribute of the initial text content according to a first encoder of a pre-trained signboard text filtering model to obtain a first coding feature, encode the position attribute of the initial text content according to the first encoder of the signboard text filtering model to obtain a second coding feature, and encode the image attribute of the initial text content according to the first encoder of the signboard text filtering model to obtain a third coding feature; wherein the signboard text filtering model is generated by training the Board-transformer model framework based on each sample signboard image in the first sample set.
A first determining unit 1403, configured to determine a preset interest point as the interest point of the signboard image to be queried if the valid text content matches with a text content of a preset interest point in an electronic map, where the electronic map has a plurality of interest points, and each interest point in the plurality of interest points has a text content.
As can be seen in fig. 14, in some embodiments, the first determining unit 1403 includes:
an input subunit 14031, configured to input the valid text content and the text content of each point of interest that the electronic map has to the pre-trained text matching model.
And the matching subunit 14032 is configured to perform matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be queried, where the text matching model is generated by performing triplet loss training on a self-supervised model framework based on the text content of each interest point in the second sample set.
In some embodiments, the text matching model is generated by adjusting parameters of the self-supervised model framework based on first difference information between origin text content and positive text content and second difference information between the origin text content and negative text content, where the origin text content is randomly selected from the interest point text contents, the positive text content is interest point text content of the same type as the origin text content, and the negative text content is interest point text content of a different type from the origin text content.
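As a rough illustration of the triplet-style adjustment described above, the sketch below treats the origin text content as the anchor of a standard triplet loss. The encoder architecture, tokenization, and hyperparameters are all assumptions for demonstration, not the patented text matching model.

```python
import torch
import torch.nn as nn

class NameEncoder(nn.Module):
    """Hypothetical encoder mapping interest point name token ids to one vector."""
    def __init__(self, vocab_size=8000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings, then project to a fixed-size vector.
        return self.proj(self.embed(token_ids).mean(dim=1))

encoder = NameEncoder()
triplet = nn.TripletMarginLoss(margin=1.0)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def training_step(origin_ids, positive_ids, negative_ids):
    anchor = encoder(origin_ids)      # origin text content
    positive = encoder(positive_ids)  # same-type interest point text content
    negative = encoder(negative_ids)  # different-type interest point text content
    loss = triplet(anchor, positive, negative)  # uses both difference terms jointly
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```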
In other embodiments, the input subunit 14031 is configured to input the valid text content and the text content of each point of interest possessed by the electronic map into a pre-trained text matching model.
And the matching subunit 14032 is configured to perform alignment processing and matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain an interest point of the signboard image to be queried, where the text matching model is generated by training the face recognition model framework ArcFace based on the text content of each interest point in the third sample set.
In some embodiments, matching subunit 14032 includes:
and the detection module is used for performing text detection on the effective text content to obtain a text image of the effective text content, and aligning the text image with the text image of the text content of each interest point of the electronic map.
And the extraction module is used for extracting the characteristics of the text image of the effective text content after the alignment processing to obtain first image characteristics, and extracting the characteristics of the text image of the text content of each interest point of the electronic map to obtain second image characteristics.
And the matching module is used for matching the first image characteristic and the second image characteristic to obtain an interest point of the signboard image to be inquired.
In some embodiments, the text matching model is generated by training the ArcFace according to a third image feature and a fourth image feature, the third image feature is obtained by performing feature extraction on a text image of text content of each interest point in a third sample set after alignment processing, the fourth image feature is obtained by performing feature extraction on a preset standard text image, and the alignment processing refers to: and performing text detection on the text content of each interest point in the third sample set to obtain a text image of the text content of each interest point in the third sample set, and aligning the text image with a preset standard text image.
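The matching module's final comparison can be pictured as a nearest-neighbour search over the extracted image features. The sketch below assumes the first and second image features were produced upstream by an ArcFace-trained backbone; the score threshold and the cosine-similarity comparison are invented values for illustration.

```python
import numpy as np

def match_point_of_interest(query_feat, poi_feats, poi_names, min_score=0.6):
    """Compare the feature of the query signboard text image against the
    features of each interest point's text image and return the best match
    above a score threshold. Feature extraction itself is assumed done."""
    q = query_feat / np.linalg.norm(query_feat)
    p = poi_feats / np.linalg.norm(poi_feats, axis=1, keepdims=True)
    scores = p @ q                     # cosine similarity against every interest point
    best = int(np.argmax(scores))
    if scores[best] >= min_score:
        return poi_names[best], float(scores[best])
    return None, float(scores[best])
```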
Fig. 15 is a schematic diagram of a ninth embodiment of the present disclosure, and as shown in fig. 15, the present embodiment provides a training apparatus 1500 for a signboard text filtering model, including:
An acquiring unit 1501, configured to acquire a first sample set, where the first sample set includes a plurality of sample signboard images.
A second determining unit 1502, configured to determine rectangular boxes for framing the interest point names of each sample signboard image, and to determine image information and text position information of each rectangular box.
The training unit 1503 is configured to input the image information, the text position information, and the interest point name in each rectangular frame into a Board-transformer model frame, train the Board-transformer model frame, and generate a signboard text filtering model, where the signboard text filtering model is used to filter invalid text content in the signboard image to be queried.
Fig. 16 is a schematic diagram of a tenth embodiment of the present disclosure, and as shown in fig. 16, the embodiment provides a training apparatus 1600 for a signboard text filtering model, which includes:
an acquiring unit 1601 is configured to acquire a first sample set, where the first sample set includes a plurality of sample signboard images.
A second determining unit 1602, configured to determine rectangular boxes for framing the interest point names of each sample signboard image, and determine image information and text position information of each rectangular box.
The training unit 1603 is configured to input the image information, the text position information, and the interest point name in each rectangular box into a Board-transformer model frame, train the Board-transformer model frame, and generate a signboard text filtering model, where the signboard text filtering model is used to filter invalid text content in a signboard image to be queried.
As may be appreciated in conjunction with FIG. 16, in some embodiments, training unit 1603 includes:
a second encoding subunit 16031, configured to perform encoding processing on the input image information, text position information, and interest point name in each rectangular frame according to a Board-transformer model framework, to obtain encoding features corresponding to the image information, the text position information, and the interest point name in each rectangular frame.
A training subunit 16032, configured to train the Board-transformer model framework according to the image information and the text position information of each rectangular frame and the coding features corresponding to the interest point names in each rectangular frame, so as to generate a signboard text filtering model.
In some embodiments, training subunit 16032 includes:
and the determining module is used for determining the image information and the text position information of each rectangular frame and the second first encoding characteristic in the encoding characteristics corresponding to the interest point name in each rectangular frame.
And the processing module is used for filtering and fusing the image information and the text position information of each rectangular frame and the coding features corresponding to the interest point names in each rectangular frame according to the second first-order coding features to obtain the coding features after filtering and fusing.
And the adjusting module is used for adjusting the parameters of the Board-transformer model frame according to the coding characteristics after the filtering and fusing processing to obtain a signboard text filtering model.
In some embodiments, the processing module comprises:
and the filtering submodule is used for filtering the image information and the text position information of each rectangular frame and invalid coding features in the coding features corresponding to the interest point names in each rectangular frame based on the second first-order coding features.
And the processing submodule is used for carrying out fusion processing on the image information and the text position information of each rectangular frame and the effective coding features in the coding features corresponding to the interest point names in each rectangular frame by taking the second first-order coding feature as a basic feature to obtain the coding features after filtering and fusion processing.
In some embodiments, the Board-transformer model framework includes a plurality of encoders, and the encoders for encoding the input image information, text position information, and interest point names in each rectangular box are different from each other.
In some embodiments, the image information of each rectangular box comprises an image identification of each rectangular box, the text position information of each rectangular box comprises the position of the center point of each rectangular box together with the length and width of the box, and the interest point name in each rectangular box comprises the pixels of the interest point name in each rectangular box.
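For clarity, the three per-box inputs just listed might be packaged as in the sketch below; the field names, the orientation of length and width, and the cropping helper are assumptions for illustration only.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class BoxSample:
    """Illustrative container for one rectangular box, mirroring the three inputs above."""
    image_id: str            # image identification of the box / source signboard
    center_xywh: np.ndarray  # (cx, cy, length, width) of the rectangular box
    name_pixels: np.ndarray  # cropped pixels of the interest point name inside the box

def crop_box(image: np.ndarray, cx: float, cy: float, length: float, width: float) -> np.ndarray:
    # Crop the pixels of the interest point name from the signboard image
    # (length taken as horizontal extent, width as vertical extent).
    x0 = max(int(cx - length / 2), 0)
    y0 = max(int(cy - width / 2), 0)
    return image[y0:y0 + int(width), x0:x0 + int(length)]
```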
The present disclosure also provides an electronic device and a readable storage medium according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising a computer program stored in a readable storage medium. At least one processor of the electronic device can read the computer program from the readable storage medium, and when the at least one processor executes the computer program, the electronic device performs the solution provided by any of the embodiments described above.
Fig. 17 illustrates a schematic block diagram of an example electronic device 1700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 17, the electronic apparatus 1700 includes a computing unit 1701 that can perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 1702 or a computer program loaded from a storage unit 1708 into a Random Access Memory (RAM) 1703. In the RAM 1703, various programs and data required for the operation of the device 1700 can also be stored. The computing unit 1701, the ROM 1702, and the RAM 1703 are connected to each other through a bus 1704. An input/output (I/O) interface 1705 is also connected to the bus 1704.
Various components in the device 1700 are connected to the I/O interface 1705, including: an input unit 1706 such as a keyboard, a mouse, and the like; an output unit 1707 such as various types of displays, speakers, and the like; a storage unit 1708 such as a magnetic disk, optical disk, or the like; and a communication unit 1709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 1709 allows the device 1700 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 1701 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 1701 executes the various methods and processes described above, such as the point-of-interest query method of an electronic map. For example, in some embodiments, the point-of-interest query method of the electronic map may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1700 via the ROM 1702 and/or the communication unit 1709. When the computer program is loaded into the RAM 1703 and executed by the computing unit 1701, one or more steps of the above-described point-of-interest query method of the electronic map may be performed. Alternatively, in other embodiments, the computing unit 1701 may be configured in any other suitable manner (e.g., by means of firmware) to perform the point-of-interest query method of an electronic map.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in the cloud computing service system and overcomes the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server combined with a blockchain.
According to another aspect of the disclosed embodiments, there is provided a vehicle including: the apparatus for inquiring the interest point of the electronic map according to any one of the above embodiments.
According to another aspect of an embodiment of the present disclosure, there is provided a roadside apparatus including: the apparatus for inquiring the interest point of the electronic map according to any one of the above embodiments.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in this application may be performed in parallel, sequentially, or in a different order, and are not limited herein as long as the desired results of the technical solutions provided by the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (35)
1. An interest point query method of an electronic map comprises the following steps:
identifying a signboard image to be inquired to obtain initial text content of the signboard image to be inquired and attribute information of the initial text content;
filtering the initial text content according to the attribute information of the initial text content to filter invalid text content in the initial text content to obtain valid text content;
and if the effective text content is matched with the text content of a preset interest point in the electronic map, determining that the preset interest point is the interest point of the signboard image to be inquired, wherein the electronic map is provided with a plurality of interest points, and each interest point in the interest points is provided with text content.
2. The method of claim 1, wherein the attribute information of the initial text content comprises: semantic attributes, location attributes, and image attributes; filtering the initial text content according to the attribute information of the initial text content to filter invalid text content in the initial text content to obtain valid text content, including:
respectively coding the semantic attribute, the position attribute and the image attribute of the initial text content to obtain a coding feature set comprising respective corresponding coding features;
and determining a first leading coding feature from the coding feature set, and performing filtering processing on the coding features except the first leading coding feature in the coding feature set by taking the first leading coding feature as a filtering reference coding feature to filter invalid text contents in the initial text contents to obtain the valid text contents.
3. The method according to claim 2, wherein the filtering the coding features in the coding feature set except for the first leading coding feature with the first leading coding feature as a filtering reference coding feature comprises:
determining the association degree between each coding feature in the coding feature set except the first leading coding feature and the first leading coding feature, and filtering the coding features of which the association degree is smaller than a preset association degree threshold.
4. The method according to claim 2 or 3, wherein the encoding processing is performed on the semantic attribute, the position attribute, and the image attribute of the initial text content, respectively, to obtain an encoding feature set including respective corresponding encoding features, and the encoding feature set includes:
according to a first encoder of a pre-trained signboard text filtering model, encoding the semantic attributes of the initial text content to obtain a first encoding characteristic;
coding the position attribute of the initial text content according to a first coder of the signboard text filtering model to obtain a second coding characteristic;
coding the image attribute of the initial text content according to a first coder of the signboard text filtering model to obtain a third coding characteristic;
wherein the sign text filtering model is: generated by training the Board-transformer model framework based on each sample signboard image in the first sample set.
5. The method according to any one of claims 1 to 4, wherein if there is a match between the valid text content and text content of a preset point of interest in an electronic map, determining that the preset point of interest is a point of interest of the signboard image to be queried comprises:
inputting the effective text content and the text content of each interest point of the electronic map into a pre-trained text matching model, and performing matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be queried, wherein the text matching model is generated by performing triplet loss training on a self-supervised model framework based on the text content of each interest point in a second sample set.
6. The method of claim 5, wherein the text matching model is generated by adjusting parameters of the self-supervised model framework based on first difference information between origin text content and positive text content and second difference information between the origin text content and negative text content, wherein the origin text content is randomly selected from the interest point text contents, the positive text content is interest point text content of the same type as the origin text content, and the negative text content is interest point text content of a different type from the origin text content.
7. The method according to any one of claims 1 to 4, wherein if there is a match between the valid text content and text content of a preset point of interest in an electronic map, determining that the preset point of interest is a point of interest of the signboard image to be queried comprises:
inputting the effective text content and the text content of each interest point of the electronic map into a pre-trained text matching model, and performing alignment processing and matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be queried, wherein the text matching model is generated by training the face recognition model framework ArcFace based on the text content of each interest point in a third sample set.
8. The method of claim 7, wherein the aligning and matching the effective text content with the text content of each interest point of the electronic map according to the text matching model to obtain the interest point corresponding to the signboard image to be queried comprises:
performing text detection on the effective text content to obtain a text image of the effective text content, and aligning the text image with a text image of the text content of each interest point of the electronic map;
and performing feature extraction on the text image of the aligned effective text content to obtain a first image feature, performing feature extraction on the text image of the text content of each interest point of the electronic map to obtain a second image feature, and performing matching processing on the first image feature and the second image feature to obtain the interest point of the signboard image to be inquired.
9. The method according to claim 8, wherein the text matching model is generated by training the ArcFace according to a third image feature and a fourth image feature, the third image feature is obtained by performing feature extraction on a text image of text content of each interest point in the third sample set after alignment processing, the fourth image feature is obtained by performing feature extraction on the preset standard text image, and the alignment processing is: and performing text detection on the text content of each interest point in the third sample set to obtain a text image of the text content of each interest point in the third sample set, and aligning the text image with a preset standard text image.
10. A method of training a sign text filtering model, comprising:
obtaining a first sample set comprising a plurality of sample sign images;
determining rectangular boxes for framing the interest point names of the sample signboard images, and determining image information and text position information of each rectangular box;
inputting the image information, the text position information and the interest point name in each rectangular box into a Board-transformer model framework, training the Board-transformer model framework, and generating a signboard text filtering model, wherein the signboard text filtering model is used for filtering invalid text contents in the signboard image to be inquired.
11. The method of claim 10, wherein inputting the image information, the text position information, and the interest point name in each of the rectangular boxes into the Board-transformer model framework, training the Board-transformer model framework, and generating a signboard text filtering model comprises:
respectively coding the input image information and text position information of each rectangular frame and the interest point name in each rectangular frame according to the Board-transformer model framework to obtain coding characteristics corresponding to the image information and text position information of each rectangular frame and the interest point name in each rectangular frame;
and training the Board-transformer model frame according to the image information and the text position information of each rectangular box and the coding characteristics corresponding to the interest point names in each rectangular box to generate the signboard text filtering model.
12. The method of claim 11, wherein training the Board-transformer model framework according to the image information, the text position information and the coding features corresponding to the interest point names in each of the rectangular boxes to generate the signboard text filtering model comprises:
determining the second leading coding feature among the coding features corresponding to the image information and the text position information of each rectangular box and the interest point name in each rectangular box;
and filtering and fusing, according to the second leading coding feature, the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame to obtain coding features after filtering and fusion processing, and adjusting the parameters of the Board-transformer model frame according to the coding features after filtering and fusion processing to obtain the signboard text filtering model.
13. The method according to claim 12, wherein performing filtering and fusion processing on the image information and the text position information of each rectangular frame and the coding features corresponding to the interest point names in each rectangular frame according to the second leading coding feature to obtain the coding features after filtering and fusion processing includes:
based on the second leading coding feature, filtering out the invalid coding features among the coding features corresponding to the image information and the text position information of each rectangular box and the interest point name in each rectangular box;
and fusing, with the second leading coding feature as the base feature, the valid coding features among the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame, to obtain the coding features after filtering and fusion processing.
14. The method according to any one of claims 11 to 13, wherein the Board-transformer model framework comprises a plurality of encoders, and the encoders for encoding the inputted image information, text position information, and interest point name in each rectangular box are different from each other.
15. The method according to any one of claims 10 to 14, wherein the image information of each of the rectangular boxes comprises an image identification of each of the rectangular boxes, the text position information of each of the rectangular boxes comprises the position of the center point of each of the rectangular boxes together with the length and width of each of the rectangular boxes, and the interest point name in each of the rectangular boxes comprises the pixels of the interest point name in each of the rectangular boxes.
16. An interest point inquiring apparatus of an electronic map, comprising:
the identification unit is used for identifying the signboard image to be inquired to obtain the initial text content of the signboard image to be inquired and the attribute information of the initial text content;
the filtering unit is used for filtering the initial text content according to the attribute information of the initial text content so as to filter invalid text content in the initial text content to obtain valid text content;
a first determining unit, configured to determine that a preset interest point is an interest point of the signboard image to be queried if the valid text content matches text content of the preset interest point in the electronic map, where the electronic map has multiple interest points, and each of the multiple interest points has text content.
17. The apparatus of claim 16, wherein the attribute information of the initial text content comprises: semantic attributes, location attributes, and image attributes; the filter unit includes:
the first coding subunit is used for respectively coding the semantic attribute, the position attribute and the image attribute of the initial text content to obtain a coding feature set comprising coding features corresponding to the semantic attribute, the position attribute and the image attribute;
and the filtering subunit is configured to determine a first leading encoding characteristic from the encoding characteristic set, and filter, by using the first leading encoding characteristic as a filtering reference encoding characteristic, the encoding characteristics in the encoding characteristic set except for the first leading encoding characteristic to filter invalid text content in the initial text content, so as to obtain the valid text content.
18. The apparatus of claim 17, wherein the filtering subunit is configured to determine a degree of association between each coding feature in the set of coding features except the first leading coding feature and the first leading coding feature, and filter the coding features whose degree of association is smaller than a preset association degree threshold.
19. The apparatus according to claim 17 or 18, wherein the first encoding subunit is configured to perform encoding processing on semantic attributes of the initial text content according to a first encoder of a pre-trained signboard text filtering model to obtain a first encoding feature, perform encoding processing on position attributes of the initial text content according to the first encoder of the signboard text filtering model to obtain a second encoding feature, and perform encoding processing on image attributes of the initial text content according to the first encoder of the signboard text filtering model to obtain a third encoding feature;
wherein the sign text filtering model is: generated by training the Board-transformer model framework based on each sample signboard image in the first sample set.
20. The apparatus according to any one of claims 16 to 19, wherein the first determining unit comprises:
the input subunit is used for inputting the effective text content and the text content of each interest point of the electronic map into a pre-trained text matching model;
and the matching subunit is used for performing matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be queried, wherein the text matching model is generated by performing triplet loss training on a self-supervised model framework based on the text content of each interest point in a second sample set.
21. The apparatus of claim 20, wherein the text matching model is generated by adjusting parameters of the self-supervised model framework based on first difference information between origin text content and positive text content and second difference information between the origin text content and negative text content, wherein the origin text content is randomly selected from the interest point text contents, the positive text content is interest point text content of the same type as the origin text content, and the negative text content is interest point text content of a different type from the origin text content.
22. The apparatus according to any one of claims 16 to 19, wherein the first determining unit comprises:
the input subunit is used for inputting the effective text content and the text content of each interest point of the electronic map into a pre-trained text matching model;
and the matching subunit is used for performing alignment processing and matching processing on the effective text content and the text content of each interest point of the electronic map according to the text matching model to obtain the interest point of the signboard image to be inquired, wherein the text matching model is generated by training a face recognition model framework ArcFace based on the text content of each interest point in a third sample set.
23. The apparatus of claim 22, wherein the matching subunit comprises:
the detection module is used for carrying out text detection on the effective text content to obtain a text image of the effective text content, and aligning the text image with a text image of the text content of each interest point of the electronic map;
the extraction module is used for extracting the characteristics of the text image of the effective text content after the alignment processing to obtain first image characteristics, and extracting the characteristics of the text image of the text content of each interest point of the electronic map to obtain second image characteristics;
and the matching module is used for matching the first image characteristic and the second image characteristic to obtain an interest point of the signboard image to be inquired.
24. The apparatus of claim 23, wherein the text matching model is generated by training the ArcFace according to a third image feature and a fourth image feature, the third image feature is obtained by feature extraction of a text image of text content of each interest point in the third sample set after alignment processing, the fourth image feature is obtained by feature extraction of the preset standard text image, and alignment processing is performed by: and performing text detection on the text content of each interest point in the third sample set to obtain a text image of the text content of each interest point in the third sample set, and aligning the text image with a preset standard text image.
25. An apparatus for training a signboard text filtering model, comprising:
an acquisition unit configured to acquire a first sample set including a plurality of sample signboard images therein;
a second determining unit, configured to determine a rectangular frame for framing the interest point name of each sample signboard image, and determine image information and text position information of each rectangular frame;
and the training unit is used for inputting the image information, the text position information and the interest point name in each rectangular box into a Board-transformer model frame, training the Board-transformer model frame and generating a signboard text filtering model, wherein the signboard text filtering model is used for filtering invalid text contents in the signboard image to be inquired.
26. The apparatus of claim 25, wherein the training unit comprises:
the second coding subunit is configured to perform coding processing on the input image information and text position information of each rectangular frame and the name of the interest point in each rectangular frame according to the Board-transformer model framework, so as to obtain coding features corresponding to the image information and text position information of each rectangular frame and the name of the interest point in each rectangular frame;
and the training subunit is used for training the Board-transformer model framework according to the image information and the text position information of each rectangular frame and the coding features corresponding to the interest point names in each rectangular frame to generate the signboard text filtering model.
27. The apparatus of claim 26, wherein the training subunit comprises:
the determining module is used for determining the second leading coding feature among the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame;
the processing module is used for filtering and fusing, according to the second leading coding feature, the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame, to obtain the coding features after filtering and fusion processing;
and the adjusting module is used for adjusting the parameters of the Board-transformer model frame according to the coding characteristics after filtering and fusing to obtain the signboard text filtering model.
28. The apparatus of claim 27, wherein the processing module comprises:
the filtering submodule is used for filtering out, based on the second leading coding feature, the invalid coding features among the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame;
and the processing submodule is used for fusing, with the second leading coding feature as the base feature, the valid coding features among the coding features corresponding to the image information and the text position information of each rectangular frame and the interest point name in each rectangular frame, to obtain the coding features after filtering and fusion processing.
29. The apparatus according to any one of claims 26 to 28, wherein the Board-transformer model framework comprises a plurality of encoders, and the encoders for encoding the inputted image information, text position information, and interest point name in each of the rectangular boxes are different from each other.
30. The apparatus according to any one of claims 25 to 29, wherein the image information of each of the rectangular boxes comprises an image identification of each of the rectangular boxes, the text position information of each of the rectangular boxes comprises the position of the center point of each of the rectangular boxes together with the length and width of each of the rectangular boxes, and the interest point name in each of the rectangular boxes comprises the pixels of the interest point name in each of the rectangular boxes.
31. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 9; or to enable the at least one processor to perform the method of any of claims 10 to 15.
32. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1 to 9; alternatively, the computer instructions are for causing the computer to perform the method of any one of claims 10 to 15.
33. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9; alternatively, the computer program, when executed by a processor, implements the method of any of claims 10 to 15.
34. A vehicle, comprising: the apparatus of any one of claims 16 to 24.
35. A roadside apparatus comprising: the apparatus of any one of claims 16 to 24.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110730642.7A CN113360791B (en) | 2021-06-29 | 2021-06-29 | Interest point query method and device of electronic map, road side equipment and vehicle |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113360791A true CN113360791A (en) | 2021-09-07 |
CN113360791B CN113360791B (en) | 2023-07-18 |
Family
ID=77537227
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110730642.7A Active CN113360791B (en) | 2021-06-29 | 2021-06-29 | Interest point query method and device of electronic map, road side equipment and vehicle |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113360791B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002102055A1 (en) * | 2001-06-12 | 2002-12-19 | International Business Machines Corporation | Method of invisibly embedding and hiding data into soft-copy text documents |
CN111191028A (en) * | 2019-12-16 | 2020-05-22 | 浙江大搜车软件技术有限公司 | Sample labeling method and device, computer equipment and storage medium |
KR20210042275A (en) * | 2020-05-27 | 2021-04-19 | 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. | A method and a device for detecting small target |
CN111984876A (en) * | 2020-06-29 | 2020-11-24 | 北京百度网讯科技有限公司 | Interest point processing method, device, equipment and computer readable storage medium |
CN112633380A (en) * | 2020-12-24 | 2021-04-09 | 北京百度网讯科技有限公司 | Interest point feature extraction method and device, electronic equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
YURUI LI 等: "POI Representation Learning by a Hybrid Model", 《2019 20TH IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM)》 * |
ZHOU Changli; CHEN Yonghong; TIAN Hui; CAI Shaobin: "K-nearest neighbor query method over road networks protecting location privacy and query content privacy", Journal of Software, no. 02 |
CHAI Bianfang; FU; AN Sufang; HU Jichao: "XML index supporting path query and information retrieval", Software Guide, no. 03 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113901257A (en) * | 2021-10-28 | 2022-01-07 | 北京百度网讯科技有限公司 | Map information processing method, map information processing device, map information processing equipment and storage medium |
CN113901257B (en) * | 2021-10-28 | 2023-10-27 | 北京百度网讯科技有限公司 | Map information processing method, device, equipment and storage medium |
US11934449B2 (en) | 2021-10-28 | 2024-03-19 | Beijing Baidu Netcom Science Technology Co., Ltd. | Method and apparatus for processing map information, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113360791B (en) | 2023-07-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |