CN112818865A - Vehicle-mounted field image identification method, identification model establishing method, device, electronic equipment and readable storage medium

Info

Publication number: CN112818865A
Application number: CN202110145812.5A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: target area, training, recognition, information, image
Inventors: 冯辉, 苟巍, 沈海峰
Applicant and current assignee: Beijing Didi Infinity Technology and Development Co., Ltd.
Legal status: Pending (assumed status; not a legal conclusion)

Application filed by Beijing Didi Infinity Technology and Development Co., Ltd.; priority to CN202110145812.5A; publication of CN112818865A.


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/60: Type of objects
    • G06V20/62: Text, e.g. of license plates, overlay texts or captions on TV images

Abstract

After an image to be recognized having a plurality of areas containing content information is obtained, the image to be recognized is processed using a pre-established recognition model: a target area is determined from the plurality of areas according to the position information of each area, and information recognition is performed on the target area to obtain a recognition result. The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each training positive sample having a plurality of areas containing content information. This scheme avoids the prior-art problem of recognizing all content-bearing areas indiscriminately, which makes it difficult to focus recognition on the required area, and prevents other areas from interfering with the recognition result.

Description

Vehicle-mounted field image identification method, identification model establishing method, device, electronic equipment and readable storage medium
Technical Field
The present application relates to the technical field of computer networks, and in particular to a vehicle-mounted field image identification method, an identification model establishing method, an apparatus, an electronic device, and a readable storage medium.
Background
Most existing information identification methods target information identification in natural scenes; there is currently no information identification method designed for special scenes. Information identification technology for natural scenes identifies all information with the same attribute in an image, so an open-source natural-scene identification approach cannot satisfy the requirement, arising in some special scenes, of identifying only part of the information from among multiple pieces of information with the same attribute.
Disclosure of Invention
In view of the above, an object of the present application is to provide a vehicle-mounted field image recognition method, a recognition model establishing method, an apparatus, an electronic device, and a readable storage medium, which can improve recognition efficiency by first determining a target region from a plurality of regions having the same attribute and then recognizing only the target region.
In a first aspect, an embodiment of the present application provides a vehicle-mounted domain image recognition method, where the method includes:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be recognized by using a pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
In an alternative embodiment, the recognition model comprises a detection submodel and a recognition submodel;
the step of processing the image to be recognized by using the pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result comprises the following steps:
detecting based on the position information by utilizing a pre-established detection submodel to determine a target area from the plurality of areas;
and carrying out information identification on the target area of the image to be identified by utilizing a pre-established identification submodel to obtain an identification result, wherein the identification submodel is obtained by training on the basis of a plurality of training positive samples and a plurality of extension samples obtained by extension according to the content information of the plurality of training positive samples.
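To make the two-stage flow above concrete, here is a minimal Python sketch; the `detector` and `recognizer` objects, their `predict` methods, and the box format are illustrative assumptions, not the actual submodels of this application.

```python
import cv2

def recognize_sticker(image_path, detector, recognizer):
    """Run the two-stage pipeline: detect the target area, then
    recognize only that area. `detector` and `recognizer` stand in
    for the pre-established detection and recognition submodels."""
    image = cv2.imread(image_path)
    # Stage 1: the detection submodel returns the target area's
    # position, assumed here to be an axis-aligned (x, y, w, h) box.
    x, y, w, h = detector.predict(image)
    target = image[y:y + h, x:x + w]
    # Stage 2: the recognition submodel reads the content information
    # (text and digits) inside the target area only, so the other
    # areas cannot interfere with the result.
    return recognizer.predict(target)
```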
In an optional implementation manner, the step of performing information recognition on the target area of the image to be recognized by using a pre-established recognition sub-model to obtain a recognition result includes:
and acquiring character information and digital information contained in the target area of the image to be recognized by utilizing a pre-established recognition sub-model, and acquiring corresponding date information according to the digital information.
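A hedged sketch of the date-derivation step described above follows; the regular expression and the month-then-day sticker layout are assumptions for illustration.

```python
import re
from datetime import date

def parse_disinfection_date(recognized_text, year=2021):
    """Derive date information from the digit information recognized
    in the target area, e.g. "disinfected today 2/26" -> 2021-02-26.
    Assumes the sticker writes the month before the day."""
    match = re.search(r"(\d{1,2})\D+(\d{1,2})", recognized_text)
    if match is None:
        return None
    month, day = int(match.group(1)), int(match.group(2))
    return date(year, month, day)
```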
In an alternative embodiment, the step of determining the target area from the plurality of areas based on the position information by using the pre-established detection submodel includes:
determining the target area from the plurality of areas by using the pre-established detection submodel, based on detection of the rectangular frame and the corner coordinates.
In an optional embodiment, the step of determining the target region from the plurality of regions based on detection of the rectangular frame and the corner coordinates by using the pre-established detection submodel includes:
determining a preliminary target area from the plurality of areas by using the pre-established detection submodel, based on detection of the rectangular frame and the corner coordinates;
constructing a minimum bounding box of the preliminary target area according to the corner coordinates and the edges of the preliminary target area;
and detecting whether the size of the minimum bounding box is within a preset range, and if so, determining the preliminary target area as the target area.
In an optional implementation manner, before the step of performing information recognition on the target area of the image to be recognized to obtain a recognition result, the method further includes:
detecting whether the orientation of the minimum bounding box meets a preset requirement, and if not, performing rotation correction on the target area.
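The minimum-bounding-box construction, size check, and rotation correction described in the last two embodiments can be sketched with OpenCV roughly as follows; the area thresholds and angle tolerance are invented presets, and a real implementation must normalize minAreaRect's version-dependent angle convention.

```python
import cv2
import numpy as np

def check_and_rectify(image, corners, min_area=1_000, max_area=500_000):
    """Build the minimum bounding box of a preliminary target area from
    its corner points, check that its size lies in a preset range, and
    rotation-correct the area when its orientation is off."""
    rect = cv2.minAreaRect(np.asarray(corners, dtype=np.float32))
    (cx, cy), (w, h), angle = rect
    # Size check: discard detections whose minimum bounding box falls
    # outside the preset range (likely background false positives).
    if not (min_area <= w * h <= max_area):
        return None
    # Orientation check: if the box is visibly rotated, rotate the image
    # around the box centre so that the target area becomes upright.
    if abs(angle) > 5:  # tolerance in degrees, an assumed preset
        M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
        image = cv2.warpAffine(image, M, image.shape[1::-1])
    x0, y0 = max(int(cx - w / 2), 0), max(int(cy - h / 2), 0)
    return image[y0:y0 + int(h), x0:x0 + int(w)]
```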
In an optional embodiment, the recognition model further includes a classification submodel, where the classification submodel is obtained by training a classifier in advance by using multiple training positive samples and multiple training negative samples;
the step of processing the image to be recognized by using the pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result further includes:
and classifying and identifying the determined target area by using the classification submodel to obtain a classification result, wherein the classification result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
In a second aspect, an embodiment of the present application provides a method for establishing a vehicle-mounted domain image recognition model, where the method includes:
aiming at each training positive sample in a plurality of acquired training positive samples, acquiring a plurality of areas containing content information in the training positive sample;
marking a target area from the plurality of areas and obtaining position information of the target area;
and training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model.
In an optional embodiment, the identification model includes a detection submodel and an identification submodel, and the step of training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target region to obtain the identification model includes:
training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the multiple training positive samples to obtain multiple expansion samples;
and training the constructed second neural network model by utilizing a plurality of training positive samples and a plurality of extension samples to obtain the recognition submodel.
In an optional embodiment, the identification model further includes a classification sub-model, and the step of training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target region to obtain the identification model further includes:
and training the constructed classifier by using the obtained multiple training negative samples and the multiple training positive samples marked with the position information of the target area to obtain a classification submodel.
In an optional implementation manner, the content information in the target area includes characters and numbers, and the step of performing sample expansion according to the content information in the target area of the plurality of training positive samples to obtain a plurality of expansion samples includes:
extracting numbers contained in the target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the characters.
In an alternative embodiment, the number represents a date, and the step of combining the numbers included in the number set to obtain a plurality of number combinations includes:
for each preset expansion date, obtaining the digits required to write that expansion date;
randomly extracting the corresponding digits from the number set to form a number combination characterizing the expansion date.
In an optional embodiment, the step of extracting numbers included in the target area of each training positive sample to form a number set includes:
extracting numbers contained in a target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers carrying different writing information;
and forming a number set from the numbers in the extended set and the numbers in the plurality of training positive samples.
In an alternative embodiment, the step of obtaining a plurality of extended samples based on the plurality of number combinations and the text includes:
for each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each number combination into any one of the expansion templates to obtain a corresponding expansion sample.
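The template deduction and filling steps can be illustrated as follows, assuming per-digit boxes are annotated; OpenCV inpainting stands in for whatever matting operation the application actually uses.

```python
import cv2
import numpy as np

def make_template(sample, digit_boxes):
    """Cut the handwritten digits out of a training positive sample to
    obtain an expansion template; `digit_boxes` are the annotated
    (x, y, w, h) boxes of the digits in the target area."""
    mask = np.zeros(sample.shape[:2], dtype=np.uint8)
    for x, y, w, h in digit_boxes:
        mask[y:y + h, x:x + w] = 255
    # Inpaint the masked digit slots so the template keeps only the
    # fixed text and background of the sticker.
    return cv2.inpaint(sample, mask, 3, cv2.INPAINT_TELEA)

def fill_template(template, digit_boxes, digit_crops):
    """Fill one number combination (a list of digit image crops) into
    the template's digit slots to synthesize an expansion sample."""
    out = template.copy()
    for (x, y, w, h), crop in zip(digit_boxes, digit_crops):
        out[y:y + h, x:x + w] = cv2.resize(crop, (w, h))
    return out
```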
In an optional embodiment, the position information includes corner coordinates of the target region and a rectangular frame framing the target region;
the step of training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the detection submodel comprises the following steps:
and training the constructed first neural network model by using a plurality of training positive samples carrying the corner coordinates of the target area and the rectangular frame framing the target area, to obtain the detection submodel.
In a third aspect, an embodiment of the present application provides an on-vehicle domain image recognition apparatus, where the apparatus includes:
the device comprises an acquisition module, a recognition module and a processing module, wherein the acquisition module is used for acquiring an image to be recognized, and the image to be recognized is provided with a plurality of areas containing content information;
the identification module is used for processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and identifying information of the target area to obtain an identification result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
In an alternative embodiment, the recognition model comprises a detection submodel and a recognition submodel;
the identification module is used for obtaining an identification result in the following mode:
detecting based on the position information by utilizing a pre-established detection submodel to determine a target area from the plurality of areas;
and carrying out information identification on the target area of the image to be identified by utilizing a pre-established identification submodel to obtain an identification result, wherein the identification submodel is obtained by training on the basis of a plurality of training positive samples and a plurality of extension samples obtained by extension according to the content information of the plurality of training positive samples.
In an alternative embodiment, the identification module is configured to obtain the identification result by:
and acquiring character information and digital information contained in the target area of the image to be recognized by utilizing a pre-established recognition sub-model, and acquiring corresponding date information according to the digital information.
In an alternative embodiment, the identification module is configured to determine the target area by:
and determining a target area from the plurality of areas based on the detection of the coordinates of the rectangular frame and the corner points by utilizing a pre-established detection sub-model.
In an alternative embodiment, the identification module is configured to determine the target area by:
determining a preliminary target area from the plurality of areas according to the corner point coordinates and the rectangular frame of each area;
constructing a minimum bounding box of the preliminary target area according to the corner coordinates and the edges of the preliminary target area;
and detecting whether the size of the minimum bounding box is within a preset range, and if so, determining the preliminary target area as the target area.
In an alternative embodiment, the in-vehicle field image recognition device further comprises a rectification module;
the correcting module is used for detecting whether the orientation of the minimum bounding box meets a preset requirement, and if not, performing rotation correction on the target area.
In an optional embodiment, the recognition model further includes a classification submodel, where the classification submodel is obtained by training a classifier in advance by using multiple training positive samples and multiple training negative samples;
the identification module is also used for classifying and judging the image to be identified in the following way:
and classifying and identifying the determined target area by using the classification submodel to obtain a classification result, wherein the classification result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
In a fourth aspect, an embodiment of the present application provides an apparatus for establishing an on-vehicle domain image recognition model, where the apparatus includes:
the device comprises an obtaining module, a judging module and a judging module, wherein the obtaining module is used for obtaining a plurality of areas containing content information in a training positive sample aiming at each training positive sample in a plurality of obtained training positive samples;
the calibration module is used for calibrating a target area from the plurality of areas and acquiring the position information of the target area;
and the training module is used for training the constructed neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain the recognition model.
In an optional embodiment, the recognition model includes a detection submodel and a recognition submodel, and the training module is configured to obtain the detection submodel and the recognition submodel by:
training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the multiple training positive samples to obtain multiple expansion samples;
and training the constructed second neural network model by utilizing a plurality of training positive samples and a plurality of extension samples to obtain the recognition submodel.
In an optional embodiment, the recognition model further includes a classification submodel, and the training module is further configured to obtain the classification submodel by:
and training the constructed classifier by using the obtained multiple training negative samples and the multiple training positive samples marked with the position information of the target area to obtain a classification submodel.
In an alternative embodiment, the content information in the target area includes words and numbers, and the training module is configured to obtain the extended sample by:
extracting numbers contained in the target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the characters.
In an alternative embodiment, the number represents a date, and the training module is configured to obtain the combination of numbers by:
for each preset expansion date, obtaining the digits required to write that expansion date;
randomly extracting the corresponding digits from the number set to form a number combination characterizing the expansion date.
In an alternative embodiment, the training module is configured to form the set of numbers by:
extracting numbers contained in a target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers carrying different writing information;
and forming a number set from the numbers in the extended set and the numbers in the plurality of training positive samples.
In an alternative embodiment, the training module is configured to obtain the extended samples by:
for each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each number combination into any one of the expansion templates to obtain a corresponding expansion sample.
In an optional embodiment, the position information includes corner coordinates of the target region and a rectangular frame framing the target region;
the training module is used for obtaining a detection submodel in the following modes:
and training the constructed first neural network model by using a plurality of training positive samples carrying the corner coordinates of the target area and the rectangular frame framing the target area, to obtain the detection submodel.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor, a storage medium and a bus, where the storage medium stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the storage medium through the bus, and the processor executes the machine-readable instructions to perform the steps of the method according to any one of the preceding embodiments.
In a sixth aspect, the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the steps of the method according to any one of the preceding embodiments.
In a seventh aspect, the present application provides a computer program product which, when run on a computer, causes the computer to execute the method according to any one of the preceding embodiments.
Based on any one of the above aspects, after an image to be recognized having a plurality of areas containing content information is obtained, the image to be recognized is processed using the pre-established recognition model: a target area is determined from the plurality of areas according to the position information of each area, and information recognition is performed on the target area to obtain a recognition result. The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each training positive sample having a plurality of areas containing content information. This scheme avoids the prior-art problems that areas containing content information are recognized indiscriminately, making it difficult to focus recognition on the required area, and that other areas interfere with the recognition result.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic view illustrating an application scenario of a vehicle-mounted domain image identification method provided by an embodiment of the application;
FIG. 2 is a flowchart illustrating a vehicle-mounted domain image recognition method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a training positive sample provided by an embodiment of the present application;
FIG. 4 is a flowchart illustrating a vehicle-mounted domain image recognition model building method according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a specific method for establishing a detection sub-model and an identification sub-model in the method for establishing an image identification model in a vehicle-mounted field according to the embodiment of the present application;
FIG. 6 is a schematic diagram illustrating a rectangular frame of a calibration target area according to an embodiment of the present disclosure;
fig. 7 is a flowchart illustrating a specific method for obtaining an extended sample in the method for establishing an identification submodel according to the embodiment of the present application;
FIG. 8 is a flowchart illustrating a specific method for constructing a number set in the method for obtaining an extended sample according to an embodiment of the present application;
FIG. 9 is a flowchart illustrating a specific method for constructing a number combination in the method for obtaining an extended sample according to an embodiment of the present application;
FIG. 10 is another flow chart of a specific method for obtaining an extended sample according to an embodiment of the present application;
FIG. 11(a) is a schematic diagram of an expanded template provided by an embodiment of the present application;
FIG. 11(b) is a schematic diagram of an extended sample generated from the extended template according to an embodiment of the present application;
fig. 12 is a flowchart illustrating a specific method for obtaining a target area and a recognition result in the vehicle-mounted domain image recognition method according to the embodiment of the present application;
fig. 13 is a flowchart illustrating a specific method for determining a target area in the vehicle-mounted area image recognition method according to the embodiment of the present application;
FIG. 14 is a diagram illustrating a minimum bounding box of a constructed target region provided by an embodiment of the present application;
FIG. 15 is a schematic diagram illustrating aspect ratio detection and classification discrimination provided by an embodiment of the present application;
FIG. 16 is a second schematic diagram of aspect ratio detection and classification discrimination provided by the embodiments of the present application;
fig. 17 is a functional block diagram showing an in-vehicle field image recognition apparatus according to an embodiment of the present application;
fig. 18 is a functional block diagram showing an in-vehicle field image recognition model building apparatus according to an embodiment of the present application;
fig. 19 shows a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to use the present disclosure, the following embodiments are presented in conjunction with a specific application scenario, "online ride-hailing". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of online ride-hailing, it should be understood that this is only one exemplary embodiment.
It should be noted that in the embodiments of the present application, the term "comprising" is used to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
The terms "passenger," "requestor," "service demander," "customer," "service requestor" are used interchangeably in this application to refer to an individual, entity or tool that can request or subscribe to a service. The terms "driver," "provider," "service provider," "vendor," and "service provider" are used interchangeably herein to refer to an individual, entity, or tool that can provide a service. The term "user" in this application refers primarily to an individual, entity or tool that requests a service, subscribes to a service. For example, the user may be a passenger.
The terms "service request" and "service" are used interchangeably herein to refer to a request initiated by a passenger, a service requester, a driver, a service provider, or a supplier, the like, or any combination thereof. Accepting the "service request" or "service" may be a passenger, a service requester, a driver, a service provider, a supplier, or the like, or any combination thereof. The service request may be charged or free.
The positioning technology used in the present application may be based on the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (COMPASS), the Galileo positioning system, the Quasi-Zenith Satellite System (QZSS), Wireless Fidelity (WiFi) positioning technology, or the like, or any combination thereof. One or more of the above positioning systems may be used interchangeably in this application.
One aspect of the present application relates to a recognition system which, after obtaining an image to be recognized having a plurality of areas containing content information, may process the image using a pre-established recognition model, determine a target area from the plurality of areas according to the position information of each area, and perform information recognition on the target area to obtain a recognition result. The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each training positive sample having a plurality of areas containing content information. This avoids the prior-art problems that areas containing content information are recognized indiscriminately, making it difficult to focus recognition on the required area, and that other areas interfere with the recognition result.
It should be noted that, before the present application was filed, when an image included a plurality of regions having the same attribute, the information in those regions was generally identified without distinction. This approach is difficult to apply to scenes where only part of the region information, among multiple regions with the same attribute, needs to be identified. For example, in an image containing multiple areas with content information (e.g., text information, numerical information, graphic information, etc.), the information in some areas is always fixed while the information in other areas changes, and it is the changed information that needs to be recognized.
Therefore, although some areas also contain content information, since their content is fixed and of no consequence, it does not need to be identified. However, current natural-scene information recognition methods often recognize every area containing content information indiscriminately. This makes recognition inefficient, and the recognition of unwanted information can also affect the result.
However, in the recognition method provided by the present application, the target area can be determined from the plurality of areas using the trained recognition model, and information recognition can be performed on the target area to obtain a recognition result. The target area can thus be recognized with emphasis, avoiding the difficulty, inherent in the existing undifferentiated approach, of focusing on the required area.
First embodiment
Fig. 1 is a schematic architecture diagram of an identification system 100 according to an embodiment of the present disclosure. For example, the identification system 100 may be an online transportation service platform for transportation services such as taxis, designated driving services, express, carpooling, bus services, driver rentals, or regular bus services, or any combination thereof. The identification system 100 may include one or more of a server 110, a network 120, a service requester 130, a service provider 140, and a database 150.
In some embodiments, the server 110 may include a processor. The processor may analyze the information sent by the service provider 140 to perform one or more of the functions described herein. For example, the processor may analyze a plurality of images sent by the service provider 140 so as to establish the recognition model. In some embodiments, a processor may include one or more processing cores (e.g., a single-core processor (S) or a multi-core processor (M)). Merely by way of example, a processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction-set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
In some embodiments, the device types corresponding to the service request end 130 and the service providing end 140 may be mobile devices, such as smart home devices, wearable devices, smart mobile devices, virtual reality devices, or augmented reality devices, and the like, and may also be tablet computers, laptop computers, or built-in devices in motor vehicles, and the like.
In some embodiments, a database 150 may be connected to the network 120 to communicate with one or more components (e.g., the server 110, the service requester 130, the service provider 140, etc.) in the identification system 100. One or more components in the identification system 100 may access data or instructions stored in the database 150 via the network 120. In some embodiments, the database 150 may be directly connected to one or more components in the identification system 100, or the database 150 may be part of the server 110.
The following describes the vehicle-mounted area image recognition method provided by the embodiment of the present application in detail with reference to the content described in the recognition system 100 shown in fig. 1.
Second embodiment
Referring to fig. 2, a flowchart of a vehicle-mounted area image recognition method according to an embodiment of the present disclosure is shown, and the method may be executed by the server 110 in the recognition system 100. It should be understood that, in other embodiments, the order of some steps in the vehicle-mounted field image recognition method according to this embodiment may be interchanged according to actual needs, or some steps may be omitted or deleted. The detailed steps of the identification method are described below.
Step S40, an image to be recognized is acquired, the image to be recognized having a plurality of areas containing content information.
Step S50, processing the image to be recognized by using a pre-established recognition model, determining a target region from the plurality of regions according to the position information of each region, and performing information recognition on the target region to obtain a recognition result.
As a possible application scenario, in order to ensure the safety of passengers, drivers are generally required to regularly disinfect vehicles providing travel services as required, especially during severe epidemics and outbreaks of infectious disease. In addition, the driver needs to place or paste a picture containing disinfection information, such as a car sticker, in the vehicle, and also needs to photograph the car sticker and upload the photo to the server. To ensure that the driver has disinfected the interior of the vehicle and filled in the disinfection date on the car sticker, an administrator needs to manually review the car-sticker images uploaded by drivers; however, given the huge daily order volume, manual review is difficult and labor costs are high. To reduce the workload of manual review and save labor costs, computer vision technology can be adopted to automatically identify the information in the car-sticker image, so that drivers can be supervised to disinfect their vehicles regularly as required.
It should be noted that the identification scheme provided in this embodiment may also be applied in other application scenarios, and this application is only described by taking the identification process of the image containing the disinfection information in the vehicle-mounted field as an example.
In this embodiment, the image to be recognized may be an image sent by a service provider and received by the server, and the server may perform analysis processing on the image to be recognized to detect whether information in the image to be recognized conforms to a relevant specification.
The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each training positive sample having a plurality of areas containing content information. For example, as shown in fig. 3, the image to be recognized may include publicity wording for the disinfection treatment printed on the car sticker, such as "driver and passenger epidemic prevention, security sticker", as well as explanatory wording, such as text explaining the purpose of the car sticker to the passenger, and, in addition, information about the specific disinfection date. Moreover, the car sticker may be located at different positions in the vehicle: on the back of a rear seat, on the back of a front seat, inside a door, and so on. Therefore, in addition to the car sticker itself, the captured image may include other partial area images inside or outside the vehicle, and these area images may interfere with recognition.
In this embodiment, the information that ultimately needs to be recognized with emphasis is the information containing the specific disinfection date mentioned above, because the information in that area changes and the recognition model must later be able to accurately recognize the changed information, such as the date, in that area, so as to determine whether the driver disinfects the vehicle daily as required. The information in the other areas, such as the areas containing publicity and explanatory wording and possibly other partial areas inside or outside the vehicle, is of no real help to the subsequent supervision of the driver's disinfection treatment, so it need not be recognized with emphasis.
Therefore, in this embodiment, the target area, that is, the area containing the changing information (the specific disinfection date), can be determined from the plurality of areas included in the image to be recognized.
Considering that the general style of the car sticker is consistent across vehicles, the position of the target region in the image does not vary greatly; therefore, the position information of each region in the image to be recognized can be obtained.
The trained recognition model is obtained by learning and training based on the position information of the target region in the training positive samples. Therefore, for an image to be recognized, the recognition model can determine the target region based on the position information of each region and perform information recognition on the target region containing the required information, avoiding interference from other regions with the recognition result.
The vehicle-mounted field image recognition method provided by this embodiment thus avoids the problems of existing natural-scene information recognition technology: that areas containing content information are recognized indiscriminately, that it is difficult to recognize the area containing the required information with emphasis, and that information from other areas interferes with the recognition result.
The embodiment of the present application further provides a method for establishing an image recognition model in a vehicle-mounted domain, please refer to fig. 4, and first, a detailed description is given below of a process of obtaining a recognition model by pre-training in the method for establishing a recognition model:
step S10, for each of the obtained plurality of training positive samples, obtaining a plurality of areas containing content information in the training positive sample.
Step S20, a target area is identified from the plurality of areas, and position information of the target area is obtained.
And step S30, training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model.
In the above application scenario, the training positive samples in this embodiment are historical images uploaded by drivers that contain relevant disinfection information. The number of service providers on the travel platform is large, and the angles, sizes and the like of the images they shoot can differ. Thus, the pre-acquired training positive samples may include multiple images uploaded by different service providers, taken at different image sizes, from different angles, and at different times, thereby enriching the diversity of the training samples.
In this embodiment, the obtained positive samples may be as shown in fig. 3, and the constructed neural network model is trained by using a plurality of training positive samples marked with the position information of the target region, so that feature learning may be performed on the training positive samples based on the position information of the target region, so that the trained recognition model may subsequently accurately recognize the target region from a plurality of regions included in the image to be recognized, and further perform recognition detection based on information in the target region.
In this embodiment, in application scenarios where only part of the plurality of regions with content information in an image needs information recognition, the existing natural-scene approach of learning all regions to obtain a recognition model has two drawbacks: learning unnecessary feature information affects the recognition result of the resulting model, and the recognition efficiency of the model is reduced. Therefore, the method for establishing the vehicle-mounted field image recognition model provided by this embodiment first calibrates the target region among the plurality of regions in the image and trains the recognition model with samples marked with the position information of the target region, so that the model can learn the features of the target region with emphasis, which improves the recognition efficiency of the subsequent model.
In this embodiment, the recognition model includes a detection submodel and a recognition submodel. The detection submodel is trained mainly on the position information of the target region marked in the training positive samples, so that it can later detect the target region in the image to be recognized and exclude other regions that would affect the recognition result. The recognition submodel is trained on the specific information contained in the target region, so that it can later analyze and recognize the information in the target region of the image to be recognized.
Further, in this embodiment, the number of training positive samples collected in advance is limited, and because the dates in the images must be recognized, the dates are limited in time. The dates in the collected training positive samples may therefore cover only part of the year and may not include the dates to be recognized later. For example, if a project started on February 1 and the current time is July 15, the collected training positive samples cover February 1 to July 15, and samples from July 15 to the following February 1 are missing.
Therefore, if the model is trained only with the training positive samples from history, the samples are unbalanced: recognition accuracy is high for dates that have appeared in the samples but low for dates that have not. The model overfits; its recognition rate on seen samples is high but its generalization is poor, and the resulting recognition model has low accuracy on unseen dates.
In view of the above, the present embodiment adopts a method of expanding the training positive sample to solve the above problem, please refer to fig. 5, in the present embodiment, the recognition model can be obtained by training in the following manner.
And step S31, training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model.
And step S32, performing sample expansion according to the content information in the target areas of the training positive samples to obtain a plurality of expansion samples.
And step S33, training the constructed second neural network model by using a plurality of training positive samples and a plurality of extension samples to obtain a recognition submodel.
In this embodiment, the first neural network model and the second neural network model may be constructed in advance, and both may adopt, but are not limited to, a Convolutional Neural Network (CNN). The first neural network model may adopt a lightweight MobileNet model, and its learning and training may follow the RetinaFace detection method, which offers high detection performance and high processing efficiency.
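As a rough sketch of this kind of detector, the following PyTorch model pairs a MobileNet backbone with two regression heads, one for the rectangular frame and one for the four corner points, in the spirit of RetinaFace's joint box-and-landmark prediction. It is a drastic simplification (no feature pyramid, no anchors) and an assumption, not the network of this application.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class StickerDetector(nn.Module):
    """MobileNet backbone with two regression heads: one for the
    rectangular frame and one for the four corner points."""

    def __init__(self):
        super().__init__()
        self.backbone = mobilenet_v2(weights=None).features
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.box_head = nn.Linear(1280, 4)     # rectangular frame (x, y, w, h)
        self.corner_head = nn.Linear(1280, 8)  # corner points (x1, y1, ..., x4, y4)

    def forward(self, x):
        feat = self.pool(self.backbone(x)).flatten(1)
        return self.box_head(feat), self.corner_head(feat)
```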
When the position information of the target area of the training positive sample is calibrated in advance, the position information of the target area is calibrated by the corner coordinates of the target area and a rectangular frame framing the target area.
Based on the method, when the detection submodel is trained, the constructed first neural network model can be trained by utilizing a plurality of training positive samples carrying the corner coordinates of the target area and the rectangular frame framing the target area, so that the detection submodel is obtained.
It is considered that the shape of the target area containing the specific date information in the sticker is generally rectangular. In this embodiment, the position information of the target region is calibrated by using the coordinates of the corner points of the target region, and when the target region is a rectangle, the corner points may be four corner points of the target region. In addition, the shape of the target region may be other irregular shapes, and when the target region is other irregular shapes, the corner point may be a plurality of edge points of an outer edge of the irregular shape.
The relative position information of the target area in the image can be accurately calibrated by utilizing the corner coordinates, and the detection sub-model can learn the accurate position information of the target area by training based on the corner coordinates of the target area.
In addition, considering differences in the size, shooting angle and the like of the acquired training positive samples, the relative corner coordinates of the target region within the whole image may differ between training positive samples. If only the corner coordinates of the target region were learned, the diversity of the corner coordinates would degrade the learning effect, making it difficult to accurately determine the target region in subsequent images to be recognized.
Therefore, based on the above consideration, in this embodiment, when the target area is calibrated, a rectangular frame framing the target area is also added. The rectangular frame is axis-aligned, i.e., its edges are horizontal or vertical lines. The edge points of the target area may lie on the edges of the rectangular frame, or the entire target area may lie completely inside the rectangular frame. For example, as shown in fig. 6, the four corner points of the target area are marked, and an upright rectangular frame framing the target area is defined.
The rectangular frame framing the target area calibrates the approximate position of the target area within the whole image, so that by combining the corner coordinates and the rectangular frame, the precise position information and the approximate position information of the target area can be learned respectively. This improves the detection accuracy of the resulting detection submodel on the target area's position while avoiding the interference that position diversity causes to learning.
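A trivial sketch of such an annotation, deriving the axis-aligned rectangular frame from the four labelled corner points (the coordinate values below are invented for illustration):

```python
def rect_from_corners(corners):
    """Derive the axis-aligned rectangular frame from the four labelled
    corner points, so each annotation carries both the precise corner
    coordinates and the coarse framing box."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    rect = (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
    return {"corners": corners, "rect": rect}

# Example annotation for one training positive sample (values invented):
annotation = rect_from_corners([(120, 80), (410, 95), (405, 160), (115, 150)])
```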
A detection submodel for detecting the target area can be trained in the above manner. In this embodiment, however, although the target area can be determined from the plurality of areas in the image based on its position information, problems such as the background and the shooting angle may cause other areas in the image, for example a partial area of the background close to the position of the target area, to be erroneously determined as the target area.
Therefore, in order to avoid the above problem, in this embodiment, the recognition model further includes a classification submodel, and the classification submodel may be obtained by training in advance in the following manner:
and training the constructed classifier by using the obtained multiple training negative samples and the multiple training positive samples marked with the position information of the target area to obtain a classification submodel.
A training positive sample is a sample whose target area contains the required content information, namely content information containing the disinfection date. A training negative sample is a sample containing a partial region with the same position information as the target region in a training positive sample, but whose partial region does not contain the required content information.
The classifier is trained with the training positive samples and training negative samples, and the trained classification submodel is used to judge whether an image contains the required content information. In this way, after the trained detection submodel detects the target area in an image, the classification submodel can judge whether the target area contains the required content information, that is, whether the image is a genuine disinfection car-sticker image.
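A minimal sketch of such classifier training, assuming a PyTorch `classifier` module and a data loader yielding cropped regions with binary labels:

```python
import torch
import torch.nn as nn

def train_classifier(classifier, loader, epochs=5, lr=1e-3):
    """Binary training loop: label 1 for target areas cropped from
    training positive samples, label 0 for look-alike regions taken
    from training negative samples."""
    optimizer = torch.optim.Adam(classifier.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, labels in loader:  # labels: 1 = positive, 0 = negative
            optimizer.zero_grad()
            logits = classifier(images).squeeze(1)
            loss = loss_fn(logits, labels.float())
            loss.backward()
            optimizer.step()
    return classifier
```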
In this embodiment, the classification submodel only needs to judge whether the target region includes the required content information; identification of the specific information contained in the target region is performed by the trained recognition submodel. As described above, to avoid the problem of imbalance in the training positive samples, the samples can be expanded based on the training positive samples.
As can be seen from the above, the changing part of the content information in the target area of each training positive sample is mainly the numbers, i.e., the disinfection date; the text is generally only an explanatory description and is fixed. The dates on the car sticker are generally handwritten by the driver, so they cannot cover all dates in a year, and handwritten digits also vary in form. Therefore, referring to fig. 7, in this embodiment, sample expansion can be performed based on the training positive samples in the following manner:
in step S321, the numbers included in the target area of each training positive sample are extracted to form a number set.
Step S322, combining the numbers included in the number set to obtain a plurality of number combinations.
Step S323, based on the plurality of number combinations and the characters, obtaining a plurality of extended samples.
As can be seen from the above, the target area of a training positive sample contains characters and numbers, where the numbers are date information, such as February 26, and may also include the year. The content information in the target area may be, for example, "disinfected today, February 26". The numbers occupy relatively fixed positions in the target area, and when expanding, it is mainly the dates that need to be extended. Therefore, the numbers contained in the target area can be extracted: the digits at the month positions, such as 1 to 7, and the digits at the day positions, such as 01 to 30. All the extracted numbers form a number set.
In this embodiment, the number set may include a plurality of subsets, and each subset may include numbers with the same number but different writing forms, for example, all numbers 1 may form a subset, and all numbers 2 may form a subset.
When the digital date is extended, a number may be extracted from each digit subset, and the extracted numbers may be combined to obtain a plurality of number combinations; a plurality of extended samples can then be produced by combining the characters contained in the target area with the obtained number combinations.
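The following minimal sketch illustrates this organization of the number set into per-digit subsets and the drawing of one writing form per required digit; the data layout and names are assumptions of the sketch:

import random
from collections import defaultdict

def build_digit_set(extracted):
    """extracted: iterable of (digit_value, crop_image) pairs taken from the
    target areas of the training positive samples."""
    subsets = defaultdict(list)       # one subset per digit value 0..9
    for value, crop in extracted:
        subsets[value].append(crop)   # same digit, different writing forms
    return subsets

def sample_combination(subsets, digits):
    """Draw one crop per required digit, e.g. digits = (2, 2, 6) for Feb. 26."""
    return [random.choice(subsets[d]) for d in digits]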
In this embodiment, considering that the same number may be written in a plurality of different ways, and that the writing forms appearing in the collected training positive samples may not be comprehensive, the digit features used for learning and training can be further enriched to improve the recognition accuracy of the resulting recognition submodel. Referring to fig. 8, in this embodiment, the step of forming the number set may be implemented in the following manner:
in step S3211, numbers included in the target area of each of the training positive samples are extracted.
Step S3212, an extended set including a plurality of digits carrying different writing information is obtained.
Step S3213, the numbers in the extended set and the numbers in the plurality of training positive samples form a number set.
In this embodiment, an extended set may be obtained in advance. The extended set may be a handwriting data set containing a plurality of numbers between 0 and 9 with different writing information, and the individual numbers it contains may be numbers acquired in other scenarios. For handwritten numbers, the same number written by different people, or by the same person at different times and in different scenes, generally carries different writing information, so the writing information of the same number is diverse.
After the numbers contained in the target areas of the training positive samples are extracted, they may be combined with the numbers in the extended set to form the number set. The plurality of numbers with different writing information in the extended set enriches the digit feature information, so that the recognition submodel can learn more digit features, which benefits the accuracy of subsequent date recognition.
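As an illustration, the extended set could be any corpus of handwritten digits acquired in other scenarios; the sketch below folds MNIST (obtained through torchvision) into the per-digit subsets purely as an example of such a corpus, which this application does not name:

from torchvision import datasets

def merge_extended_set(subsets, root="./data", limit_per_digit=200):
    """Enrich the number set with handwritten digits carrying writing
    information absent from the training positive samples."""
    mnist = datasets.MNIST(root, train=True, download=True)
    counts = {d: 0 for d in range(10)}
    for img, label in mnist:          # img is a PIL grayscale image, label 0..9
        if counts[label] < limit_per_digit:
            subsets[label].append(img)
            counts[label] += 1
    return subsets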
Referring to fig. 9, in the present embodiment, when a plurality of combinations of numbers are obtained based on the numbers in the number set, the following steps are performed:
in step S3221, for each preset extension date, the extension numbers required for the extension date are obtained.
Step S3222, numbers corresponding to the extension numbers are randomly extracted from the number set to form a number combination representing the extension date.
In this embodiment, a plurality of extension dates may be set in advance. Following the above example, if the currently collected training samples cover the dates from February 1 to July 15, the extension dates may be each date from July 16 to February 1 of the next year, so that together they constitute all the dates of a whole year.
For each extension date, for example December 5, the extension numbers required are 1, 2 and 5. Thus, a number may be extracted from the subset containing the number 1 and a number from the subset containing the number 2 for the month, and a number from the subset containing the number 5 for the day, resulting in a number combination corresponding to the extension date December 5.
In the above manner, a number combination corresponding to each extension date can be obtained, and each extension date can correspond to a plurality of number combinations whose digits are the same but whose writing forms differ.
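A minimal sketch of steps S3221 and S3222, following the example above in which the extension date December 5 requires the digits 1, 2 and 5, might read as follows (the digit-string convention and variant count are assumptions of the sketch):

import random

def combinations_for_date(subsets, month, day, n_variants=5):
    """Return several number combinations for one preset extension date,
    each with the same digits but independently drawn writing forms."""
    digits = [int(c) for c in f"{month}{day}"]   # e.g. month=12, day=5 -> 1, 2, 5
    return [[random.choice(subsets[d]) for d in digits]
            for _ in range(n_variants)]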
After the number combinations corresponding to the required extension dates are obtained, expansion samples need to be generated from them. Referring to fig. 10, in this embodiment, the expansion samples may be generated in the following manner:
step S3231, for each training positive sample, deducting the numbers in the target area of the training positive sample to obtain a corresponding expansion template.
Step S3232, filling each of the number combinations into any one of the expansion templates to obtain corresponding expansion samples.
In this embodiment, for each training positive sample, the expansion template may be obtained by covering the numbers in the target area of the training positive sample with white or black image cells. In this way, a plurality of expansion templates with different shooting scenes, such as different sizes, different background images and different angles, can be obtained.
Each obtained number combination is filled into one of the expansion templates; specifically, the number combination is filled into the positions corresponding to the date information in the expansion template, forming an expansion sample that retains the other information present in the template. The resulting expansion template is shown in fig. 11(a), and the expansion sample (the target area contained in the expansion sample) obtained by filling a number combination into the template is shown in fig. 11(b).
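The deduct-and-fill procedure of steps S3231 and S3232 could be sketched with PIL as follows; the digit box coordinates, the fill color and the resizing of the crops are assumptions of this sketch:

from PIL import Image

def make_template(sample_img, digit_boxes, fill=(255, 255, 255)):
    """Blank out each digit box with white (or black) image cells,
    leaving an expansion template that keeps the rest of the scene."""
    template = sample_img.copy()
    for x1, y1, x2, y2 in digit_boxes:
        template.paste(Image.new("RGB", (x2 - x1, y2 - y1), fill), (x1, y1))
    return template

def fill_template(template, digit_boxes, digit_crops):
    """Paste one digit crop (resized to fit) into each blanked box,
    producing an expansion sample."""
    sample = template.copy()
    for (x1, y1, x2, y2), crop in zip(digit_boxes, digit_crops):
        sample.paste(crop.convert("RGB").resize((x2 - x1, y2 - y1)), (x1, y1))
    return sample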
In addition to deducting the numbers in the training positive samples to make expansion templates, blank templates collected in different scenes can also serve as expansion templates, where a blank template is a template lacking the numbers of a training positive sample, that is, a template that is blank at the positions corresponding to the date.
Through the above steps, the position information of the target area in each training positive sample is calibrated in advance, and the training positive samples marked with the position information of the target area are used to train the first neural network model to obtain the detection submodel. The trained detection submodel has learned the position information of the target area, so it can subsequently locate the target area in the image to be identified and eliminate the influence of other areas containing content information on the identification.
Further, in order to avoid interference caused by some areas with positions close to the target area, the classifier is trained by using the training positive samples and the training negative samples to obtain the classification submodel. The classification submodel can be used to determine the content information in the target region detected by the detection submodel to determine whether the target region contains the required content information, such as date information.
In addition, considering that the application mainly identifies date information in the target area, the dates covered by the adopted training positive samples are limited, so the samples are unbalanced. Therefore, expansion samples are obtained by expanding the training positive samples, and the recognition submodel is obtained by training the second neural network model with the training positive samples and the expansion samples. This improves the subsequent recognition accuracy of the recognition submodel on images of all dates.
After the recognition model is obtained in the above manner, it can be seen from the above that the recognition model includes a detection submodel and a recognition submodel, where the recognition submodel is obtained by training in advance based on a plurality of training positive samples and a plurality of extended samples obtained by extending according to content information of the plurality of training positive samples. In the step S50 of processing the image to be recognized to obtain the recognition result, the following steps are performed, please refer to fig. 12:
and step S51, detecting based on the position information by using a pre-established detection submodel to determine a target area from the plurality of areas.
And step S52, performing information identification on the target area of the image to be identified by using a pre-established identification submodel to obtain an identification result.
In this embodiment, the detailed process of obtaining the extended samples by expanding the training positive samples can be found in the above embodiments. Recognizing the content information in the target area of the image to be recognized with the recognition submodel trained on both the training positive samples and the extended samples improves the recognition accuracy, and solves the problem of low recognition accuracy when the image to be recognized contains date information that did not appear in the training positive samples.
In this embodiment, in order to accurately locate the position information of the target area in the image, when performing the training of the detection sub-model, the corner coordinates of the target area and the rectangular frame framing the target area may be calibrated, which may be referred to in the above embodiments. Therefore, when the detection submodel is used for determining the target area in the image to be identified, the pre-established detection submodel can be used for determining the target area from the plurality of areas based on the detection of the rectangular frame and the corner coordinates.
In this way, the target area in the image to be recognized can be determined by combining the learned rough position information of the target area (represented by the rectangular frame) with its accurate position information (represented by the corner coordinates). This also avoids the problem that, because the target area may appear at various angles in the image, detecting the rectangular frame alone can leave the text and digit information occupying only a small proportion of the frame, which would impair the subsequent recognition.
In this embodiment, all portions of the image to be recognized that may be target areas can be detected by the detection submodel, but the detection accuracy of the detection submodel cannot reach 100%, so some false detections may occur. If these falsely detected regions are not filtered out before being sent to the subsequent recognition submodel, false recognition results will be produced. Therefore, referring to fig. 13, in the present embodiment, the step of determining the target area from the plurality of areas may be performed as follows:
and step S511, determining a preliminary target area from the plurality of areas based on the detection of the rectangular frame and the corner coordinates by utilizing a pre-established detection sub-model.
And S512, constructing a minimum external frame of the preliminary target area according to the corner point coordinates and the side lines of the preliminary target area.
Step S513, detecting whether the size of the minimum outline frame falls within a preset range, and if so, determining that the preliminary target area is a target area.
In this embodiment, considering that the target area is generally rectangular, after the preliminary target area is determined based on the rectangular frame and the corner coordinates by using the detection submodel, the minimum bounding box of the preliminary target area may be constructed, i.e., the box drawn along the edges and corner points of the preliminary target area. As shown in fig. 14, the outer box is the rectangular frame described above and the box framed along the inner edges is the minimum bounding box. According to prior knowledge, the actual size of the target area lies within a certain range, so whether a preliminary target area is a real target area can be judged by detecting whether the size of its minimum bounding box falls within a preset range. The size may be, among others, the height, the width, or the aspect ratio of the minimum bounding box.
For example, according to prior knowledge, the aspect ratio of the target area in the image generally lies between 6:1 and 9:1; therefore, preliminary target areas whose aspect ratios fall outside this range can be excluded.
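A minimal sketch of steps S512 and S513, building the minimum bounding box from the four corner points with OpenCV and using the 6:1 to 9:1 prior as the preset range, might read:

import cv2
import numpy as np

def passes_size_check(corners, lo=6.0, hi=9.0):
    """corners: the four corner points of a preliminary target area."""
    (_, (w, h), _) = cv2.minAreaRect(np.asarray(corners, dtype=np.float32))
    if min(w, h) == 0:
        return False
    ratio = max(w, h) / min(w, h)     # orientation-independent aspect ratio
    return lo <= ratio <= hi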
On this basis, considering that some regions whose position information is close to that of the real target area may be determined as target areas even though they do not actually contain the required information (for example, a certain region in the shooting background), a classification submodel is further added to the recognition model in this embodiment. The classification submodel is obtained by training a classifier in advance on the collected training positive samples and training negative samples; the specific training process can be referred to in the above embodiment.
And classifying and identifying the determined target area by using the classification submodel to obtain a classification result, wherein the classification result can represent whether the target area contains information corresponding to the content information in the target area of the training positive sample. That is, the classification submodel may identify whether the determined target area includes date information related to sterilization.
For example, as shown in fig. 15, aspect ratio detection may be performed on each determined preliminary target area. If the output is false, that is, the aspect ratio does not meet the requirement, no subsequent processing is needed. If the output is true, that is, the aspect ratio meets the requirement, the subsequent classification judgment of whether the target area contains the required content information is performed: if the judgment result is true, the processing continues with the identification of the content information in the target area; if the result is false, no subsequent identification is needed. The area framed in the upper part of fig. 15 is an area in the shooting background and can be excluded by the aspect ratio detection; the area framed in the lower part of fig. 15, after passing the aspect ratio detection and the classification judgment, can be judged to be an area containing the required content information and can be used for the subsequent identification processing.
Further, as can be seen from fig. 16, the area framed in the upper part of fig. 16 is an area in the shooting background, but since its size is close to that of the real target area, the aspect ratio detection outputs true. However, when the classification submodel performs the true/false classification judgment, it determines that the area does not contain the required content information, that is, the classification result is false, so no subsequent recognition processing is needed. For the area framed in the lower part of fig. 16, both the aspect ratio detection result and the classification result are true, so it can be sent to the subsequent identification processing flow.
In this embodiment, the classification submodel is added to judge whether the information in a determined target area is genuine, so as to further screen out regions that were erroneously determined as target areas but do not contain the required information.
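Putting the checks together, the decision flow of fig. 15 and fig. 16 could be sketched as below; detect, classify and recognize stand in for the three trained submodels and, like the reuse of passes_size_check from the sketch above, are assumptions of this sketch rather than the application's prescribed interface:

def recognize_image(image, detect, classify, recognize):
    """Run the two filters, then recognition, on each preliminary target area."""
    results = []
    for corners, crop in detect(image):        # preliminary target areas
        if not passes_size_check(corners):     # aspect ratio out of range: drop
            continue
        if not classify(crop):                 # lacks required content info: drop
            continue
        results.append(recognize(crop))        # text + digits -> date information
    return results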
As can be seen from the above, the target area of the image contains text information and digit information. When performing recognition, the text information and the digit information contained in the target area of the image to be recognized can be obtained by using the pre-established recognition submodel, and the corresponding date information can be obtained from the digit information.
Considering that the text information and the digit information in the image to be recognized may be inclined at an angle that affects the recognition effect, it can be detected whether the orientation information of the constructed minimum bounding box meets a preset requirement; if not, the target area is rotationally corrected, and information recognition is performed on the target area after the rotation correction.
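Such a correction could be sketched as follows, reading the minimum bounding box's angle from OpenCV and rotating the crop back when it deviates from horizontal by more than an illustrative 5-degree threshold; the threshold and the angle normalization are assumptions of this sketch:

import cv2
import numpy as np

def correct_rotation(crop, corners, max_skew_deg=5.0):
    """Rotate the target-area crop back to horizontal if it is too inclined."""
    angle = cv2.minAreaRect(np.asarray(corners, dtype=np.float32))[2]
    if angle > 45:                     # normalize OpenCV's angle convention
        angle -= 90
    if abs(angle) <= max_skew_deg:     # orientation already meets the requirement
        return crop
    h, w = crop.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(crop, M, (w, h))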
It should be noted that the vehicle-mounted field image identification method in this embodiment is implemented based on the identification model constructed in the above embodiment; where not elaborated here, the description related to the identification model may refer to the related description in that embodiment, and is not repeated again.
Third embodiment
Based on the same application concept, an embodiment of the present application further provides a vehicle-mounted field image recognition device 220 corresponding to the vehicle-mounted field image recognition method; please refer to fig. 17. Since the principle by which the device solves the problem is similar to that of the vehicle-mounted field image recognition method in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 17, a schematic diagram of the vehicle-mounted field image recognition device 220 according to an embodiment of the present application is shown. The device includes: an acquisition module 221 and an identification module 222.
The acquiring module 221 is configured to acquire an image to be recognized, where the image to be recognized has a plurality of areas containing content information.
It is understood that the obtaining module 221 may be configured to perform the step S40, and for a detailed implementation of the obtaining module 221, reference may be made to the content related to the step S40.
The identification module 222 is configured to process the image to be identified by using a pre-established identification model, determine a target region from the multiple regions according to the position information of each region, and perform information identification on the target region to obtain an identification result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
It is understood that the identification module 222 can be used to execute the step S50, and the detailed implementation of the identification module 222 can refer to the content related to the step S50.
In a possible embodiment, the recognition model comprises a detection submodel and a recognition submodel;
the identification module 222 is configured to obtain an identification result by:
detecting based on the position information by utilizing a pre-established detection submodel to determine a target area from the plurality of areas;
and carrying out information identification on the target area of the image to be identified by utilizing a pre-established identification submodel to obtain an identification result, wherein the identification submodel is obtained by training on the basis of a plurality of training positive samples and a plurality of extension samples obtained by extension according to the content information of the plurality of training positive samples.
In a possible embodiment, the identification module 222 is configured to obtain the identification result including the date information by:
and acquiring character information and digital information contained in the target area of the image to be recognized by utilizing a pre-established recognition sub-model, and acquiring corresponding date information according to the digital information.
In one possible embodiment, the identification module 222 is configured to determine the target area by:
and determining a target area from the plurality of areas based on the detection of the coordinates of the rectangular frame and the corner points by utilizing a pre-established detection sub-model.
In one possible embodiment, the identification module 222 is configured to determine the target area by:
determining a preliminary target area from the plurality of areas according to the corner point coordinates and the rectangular frame of each area;
constructing a minimum external frame of the preliminary target area according to the coordinates of the corner points and the side lines of the preliminary target area;
and detecting whether the size of the minimum external frame is within a preset range, and if so, determining the preliminary target area as a target area.
In a possible embodiment, the in-vehicle field image recognition device 220 further includes a correction module;
the correcting module is used for detecting whether the azimuth information of the minimum external frame meets a preset requirement or not, and if not, performing rotation correction on the target area.
In a possible implementation manner, the recognition model further includes a classification submodel, and the classification submodel is obtained by training a classifier by using a plurality of training positive samples and a plurality of training negative samples in advance;
the identification module 222 is further configured to perform classification judgment on the image to be identified by:
and classifying and identifying the determined target area by using the classification submodel to obtain a classification result, wherein the classification result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
In addition, based on the same application concept, an embodiment of the present application further provides a vehicle-mounted field image recognition model establishing device 210 corresponding to the vehicle-mounted field image recognition model establishing method; please refer to fig. 18. Since the principle by which the device solves the problem is similar to that of the vehicle-mounted field image recognition model establishing method in the embodiment of the present application, the implementation of the device may refer to the implementation of the method, and repeated parts are not described again.
Referring to fig. 18, a schematic diagram of the vehicle-mounted field image recognition model establishing device 210 according to the present application is shown. The device includes: an obtaining module 211, a calibration module 212, and a training module 213.
An obtaining module 211, configured to obtain, for each training positive sample of the obtained multiple training positive samples, multiple areas containing content information in the training positive sample.
It is understood that the obtaining module 211 can be configured to execute the step S10, and as to the detailed implementation of the obtaining module 211, reference can be made to the content related to the step S10.
A calibration module 212, configured to calibrate a target area from the multiple areas and obtain position information of the target area.
It is understood that the calibration module 212 may be used to execute the step S20, and the detailed implementation of the calibration module 212 may refer to the above description regarding the step S20.
The training module 213 is configured to train the constructed neural network model by using multiple training positive samples labeled with the position information of the target area, so as to obtain the recognition model.
It is understood that the training module 213 can be used to execute the step S30, and the detailed implementation of the training module 213 can refer to the content related to the step S30.
In a possible embodiment, the recognition model includes a detection submodel and a recognition submodel, and the training module 213 is configured to obtain the detection submodel and the recognition submodel by:
training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the multiple training positive samples to obtain multiple expansion samples;
and training the constructed second neural network model by utilizing a plurality of training positive samples and a plurality of extension samples to obtain the recognition submodel.
In a possible embodiment, the recognition model further includes a classification submodel, and the training module 213 is further configured to obtain the classification submodel by:
and training the constructed classifier by using the obtained multiple training negative samples and the multiple training positive samples marked with the position information of the target area to obtain a classification submodel.
In a possible embodiment, the content information in the target area includes words and numbers, and the training module 213 is configured to obtain the extended sample by:
extracting numbers contained in the target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the characters.
In a possible embodiment, the number represents a date, and the training module 213 is configured to obtain the number combination by:
aiming at each preset expansion date, obtaining the expansion number required by the expansion date;
randomly extracting the number corresponding to the expansion number from the number set to form a number combination characterizing the expansion date.
In one possible embodiment, the training module 213 is configured to form the set of numbers by:
extracting numbers contained in a target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers carrying different writing information;
and forming a digit set by the digits in the extended set and the digits in the training positive samples.
In a possible embodiment, the training module 213 is configured to obtain the extended samples by:
for each training positive sample, deducting the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each digital combination into any one of the expansion templates to obtain corresponding expansion samples.
In a possible implementation, the position information includes corner coordinates of the target area and a rectangular frame framing the target area;
the training module 213 is configured to obtain the detection submodel by:
and training the constructed first neural network model by utilizing the corner point coordinates carrying the target area and a plurality of training positive samples framing the rectangular frame of the target area to obtain a detection sub-model.
Fourth embodiment
Referring to fig. 19, an electronic device 300 is further provided in the present embodiment, and the electronic device 300 may be the server 110. The electronic device 300 includes: a processor 310, a memory 320, and a bus 330. The memory 320 stores machine-readable instructions executable by the processor 310. When the electronic device 300 operates, the processor 310 and the memory 320 communicate via the bus 330, and the machine-readable instructions, when executed by the processor 310, perform the following processes:
in one possible implementation, the instructions executed by the processor 310 include the following processes:
aiming at each training positive sample in a plurality of acquired training positive samples, acquiring a plurality of areas containing content information in the training positive sample;
marking a target area from the plurality of areas and obtaining position information of the target area;
and training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model.
In another possible implementation, the instructions executed by the processor 310 include the following processes:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be recognized by using a pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
With respect to the processes involved in the instructions executed by the processor 310 when the electronic device 300 operates, reference may be made to the related description of the above method embodiments, and details thereof are not described here.
Fifth embodiment
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program executes the steps of the vehicle-mounted domain image recognition model building method or the vehicle-mounted domain image recognition method.
An embodiment of the present application provides a computer program product, which, when running on a computer, causes the computer to execute the steps of the above vehicle-mounted domain image recognition model building method or vehicle-mounted domain image recognition method.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk, a hard disk, or the like, and when executed, the computer program on the storage medium can execute the above-described in-vehicle-field image recognition model building method or in-vehicle-field image recognition method.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the method embodiments, and are not described in detail in this application.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative; for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation; for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted or not executed.

In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A vehicle-mounted field image recognition method is characterized by comprising the following steps:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be recognized by using a pre-established recognition model, determining a target area from a plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
2. The on-vehicle domain image recognition method according to claim 1, wherein the recognition model includes a detection submodel and a recognition submodel;
the step of processing the image to be recognized by using the pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain a recognition result comprises the following steps:
detecting based on the position information by utilizing a pre-established detection submodel to determine a target area from the plurality of areas;
and carrying out information identification on the target area of the image to be identified by utilizing a pre-established identification submodel to obtain an identification result, wherein the identification submodel is obtained by training on the basis of a plurality of training positive samples and a plurality of extension samples obtained by extension according to the content information of the plurality of training positive samples.
3. The vehicle-mounted field image recognition method according to claim 2, wherein the step of performing information recognition on the target area of the image to be recognized by using a pre-established recognition submodel to obtain a recognition result comprises:
and acquiring character information and digital information contained in the target area of the image to be recognized by utilizing a pre-established recognition sub-model, and acquiring corresponding date information according to the digital information.
4. The in-vehicle field image recognition method according to claim 2, wherein the step of determining a target area from the plurality of areas based on the position information detection using a pre-established detection submodel includes:
and determining a target area from the plurality of areas based on the detection of the coordinates of the rectangular frame and the corner points by utilizing a pre-established detection sub-model.
5. A vehicle-mounted field image recognition model building method is characterized by comprising the following steps:
aiming at each training positive sample in a plurality of acquired training positive samples, acquiring a plurality of areas containing content information in the training positive sample;
marking a target area from the plurality of areas and obtaining position information of the target area;
and training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model.
6. The method for building the vehicle-mounted field image recognition model according to claim 5, wherein the recognition model comprises a detection submodel and a recognition submodel, and the step of training the built neural network model by using the plurality of training positive samples marked with the position information of the target area to obtain the recognition model comprises the steps of:
training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the multiple training positive samples to obtain multiple expansion samples;
and training the constructed second neural network model by utilizing a plurality of training positive samples and a plurality of extension samples to obtain the recognition submodel.
7. The vehicle-mounted field image recognition model building method according to claim 6, wherein the content information in the target area includes characters and numbers, and the step of performing sample expansion according to the content information in the target areas of the plurality of training positive samples to obtain a plurality of expansion samples includes:
extracting numbers contained in the target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the characters.
8. An on-vehicle domain image recognition apparatus, characterized in that the apparatus comprises:
the device comprises an acquisition module, a recognition module and a processing module, wherein the acquisition module is used for acquiring an image to be recognized, and the image to be recognized is provided with a plurality of areas containing content information;
the identification module is used for processing the image to be identified by using a pre-established identification model, determining a target area from a plurality of areas according to the position information of each area, and identifying information of the target area to obtain an identification result;
the identification model is obtained by training a plurality of training positive samples marked with position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 7.
CN202110145812.5A 2021-02-02 2021-02-02 Vehicle-mounted field image identification method, identification model establishing method, device, electronic equipment and readable storage medium Pending CN112818865A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110145812.5A CN112818865A (en) 2021-02-02 2021-02-02 Vehicle-mounted field image identification method, identification model establishing method, device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN112818865A true CN112818865A (en) 2021-05-18

Family

ID=75861697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110145812.5A Pending CN112818865A (en) 2021-02-02 2021-02-02 Vehicle-mounted field image identification method, identification model establishing method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN112818865A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871730A (en) * 2017-12-05 2019-06-11 杭州海康威视数字技术股份有限公司 A kind of target identification method, device and monitoring device
WO2020103676A1 (en) * 2018-11-23 2020-05-28 腾讯科技(深圳)有限公司 Image identification method and apparatus, system, and storage medium
WO2020155939A1 (en) * 2019-01-31 2020-08-06 广州视源电子科技股份有限公司 Image recognition method and device, storage medium and processor
CN110738602A (en) * 2019-09-12 2020-01-31 北京三快在线科技有限公司 Image processing method and device, electronic equipment and readable storage medium
CN111275038A (en) * 2020-01-17 2020-06-12 平安医疗健康管理股份有限公司 Image text recognition method and device, computer equipment and computer storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU Wufeng; HE Qianlei; ZHENG Wei: "Research on License Plate Recognition Technology Based on HAAR Features and BP Neural Networks", Electronic Measurement Technology, no. 08 *
YAO Hongge; WANG Cheng; YU Jun; BAI Xiaojun; LI Wei: "Small-Target Ship Recognition in Complex Satellite Images", Journal of Remote Sensing, no. 02 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313124A (en) * 2021-07-29 2021-08-27 佛山市墨纳森智能科技有限公司 Method and device for identifying license plate number based on image segmentation algorithm and terminal equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination