CN112818865B - Vehicle-mounted field image recognition method, recognition model establishment method, device, electronic equipment and readable storage medium


Info

Publication number
CN112818865B
Authority
China (CN)
Prior art keywords
target area, model, recognition, information, training
Legal status
Active
Application number
CN202110145812.5A
Other languages
Chinese (zh)
Other versions
CN112818865A (en)
Inventors
冯辉, 苟巍, 沈海峰
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202110145812.5A
Publication of CN112818865A
Application granted
Publication of CN112818865B


Abstract

The application provides a vehicle-mounted field image recognition method, a recognition model establishment method, a device, an electronic device and a readable storage medium. The recognition model is obtained by training in advance on a plurality of training positive samples annotated with the position information of a target area, each training positive sample having a plurality of areas containing content information. This scheme avoids the prior-art problem that indiscriminately recognizing every area containing content information makes it difficult to focus recognition on the required area, and prevents other areas from interfering with the recognition result.

Description

Vehicle-mounted field image recognition method, recognition model establishment method, device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of computer networks, and in particular, to a vehicle-mounted field image recognition method, a recognition model building method, a device, an electronic apparatus, and a readable storage medium.
Background
Most existing information recognition methods are designed for natural scenes; there is currently no information recognition method for special scenes. Recognition technology for natural scenes identifies all information with the same attribute in an image, so using an open-source natural-scene recognition approach does not meet the requirement, arising in some special scenes, of recognizing only part of the information among multiple pieces of information sharing the same attribute.
Disclosure of Invention
In view of the above, an object of the present application is to provide a vehicle-mounted field image recognition method, a recognition model establishment method, a device, an electronic device, and a readable storage medium, which can improve recognition efficiency by first identifying a target area from a plurality of areas having the same attribute and then recognizing only that target area.
In a first aspect, an embodiment of the present application provides a vehicle-mounted domain image recognition method, where the method includes:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and carrying out information identification on the target area to obtain an identification result;
The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, and each training positive sample has a plurality of areas containing content information.
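As a hedged illustration only, the pipeline of this first aspect might look like the following sketch; the function and variable names are assumptions for illustration, not part of the patent:

```python
def recognize_image(image, detect_model, recognize_model):
    """Minimal sketch: detect candidate regions, select the target region by
    its learned position, and run information recognition only on it."""
    regions = detect_model(image)        # assumed: boxes sorted by confidence,
    if not regions:                      # each box as (x1, y1, x2, y2)
        return None
    # The detection sub-model is trained only on annotated target regions,
    # so the highest-scoring box is taken as the target region here.
    x1, y1, x2, y2 = regions[0]
    crop = image[y1:y2, x1:x2]           # image assumed to be an HxWxC array
    return recognize_model(crop)         # e.g. text such as a disinfection date
```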
In an alternative embodiment, the recognition model includes a detection sub-model and a recognition sub-model;
The step of processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain an identification result includes the following steps:
detecting based on the position information using a pre-established detection sub-model to determine a target region from the plurality of regions;
performing information recognition on the target area of the image to be recognized by using a pre-established recognition sub-model to obtain a recognition result, wherein the recognition sub-model is obtained by training on a plurality of training positive samples and on a plurality of expansion samples obtained by expansion according to the content information of the training positive samples.
In an optional embodiment, the step of performing information recognition on the target area of the image to be identified by using a pre-established recognition sub-model to obtain an identification result includes:
obtaining the text information and number information contained in the target area of the image to be identified by using the pre-established recognition sub-model, and obtaining the corresponding date information from the number information.
In an alternative embodiment, the step of determining the target area from the plurality of areas based on the detection of the location information using a pre-established detection sub-model includes:
determining the target area from the plurality of areas by using a pre-established detection sub-model based on detection of rectangular frames and corner coordinates.
In an alternative embodiment, the step of determining the target area from the plurality of areas based on detection of rectangular frames and corner coordinates by using a pre-established detection submodel includes:
determining a preliminary target area from the plurality of areas by using a pre-established detection sub-model based on detection of rectangular frames and corner coordinates;
constructing a minimum circumscribed frame of the preliminary target area from the corner coordinates and edges of the preliminary target area;
and detecting whether the size of the minimum circumscribed frame is within a preset range, and if so, determining the preliminary target area as the target area.
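A minimal sketch of this size check, using OpenCV's minimum-area rectangle; the area thresholds are illustrative assumptions, not values specified by the patent:

```python
import cv2
import numpy as np

def confirm_target(corners, min_area=2_000, max_area=200_000):
    """Build the minimum circumscribed frame of a preliminary target area from
    its corner points and accept it only if its size is in a preset range."""
    pts = np.asarray(corners, dtype=np.float32)   # e.g. 4 x 2 corner coordinates
    rect = cv2.minAreaRect(pts)                   # ((cx, cy), (w, h), angle)
    w, h = rect[1]
    if min_area <= w * h <= max_area:
        return rect                               # accepted as the target area
    return None                                   # size check failed
```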
In an optional embodiment, before the step of performing information recognition on the target area of the image to be identified to obtain an identification result, the method further includes:
detecting whether the orientation information of the minimum circumscribed frame meets a preset requirement, and if not, performing rotation correction on the target area.
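Continuing the sketch above, the rotation correction could be done as follows; the tilt tolerance is an assumed parameter, and note that OpenCV's angle convention for minAreaRect differs between versions:

```python
import cv2

def correct_orientation(image, rect, max_tilt_deg=5.0):
    """Rotate the image upright when the minimum circumscribed frame is tilted
    beyond a preset tolerance (the 'orientation requirement')."""
    (cx, cy), _, angle = rect                      # rect from cv2.minAreaRect
    if abs(angle) <= max_tilt_deg:
        return image                               # orientation already acceptable
    M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    h, w = image.shape[:2]
    return cv2.warpAffine(image, M, (w, h))        # rotation-corrected image
```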
In an optional embodiment, the recognition model further comprises a classification sub-model, wherein the classification sub-model is obtained by training the classifier by utilizing a plurality of training positive samples and a plurality of training negative samples in advance;
The step of processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain an identification result further includes:
classifying the determined target area by using the classification sub-model to obtain a classification result, wherein the classification result indicates whether the target area contains information corresponding to the content information in the target areas of the training positive samples.
In a second aspect, an embodiment of the present application provides a vehicle-mounted domain image recognition model building method, where the method includes:
for each of the acquired training positive samples, acquiring a plurality of areas containing content information in the training positive sample;
calibrating a target area from among the plurality of areas, and obtaining the position information of the target area;
and training the constructed neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain an identification model.
In an alternative embodiment, the recognition model includes a detection sub-model and a recognition sub-model, and the step of training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model includes:
training the constructed first neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the training positive samples to obtain a plurality of expansion samples;
and training the constructed second neural network model by using the plurality of training positive samples and the plurality of expansion samples to obtain the recognition sub-model.
In an optional embodiment, the recognition model further includes a classification sub-model, and the step of training the constructed neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain the recognition model further includes:
And training the constructed classifier by using the acquired training negative samples and the training positive samples marked with the position information of the target area to obtain a classification sub-model.
In an optional embodiment, the content information in the target area includes text and numbers, and the step of performing sample expansion according to the content information in the target areas of the plurality of training positive samples to obtain a plurality of expansion samples includes:
extracting numbers contained in a target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the text.
In an alternative embodiment, the step of combining the numbers included in the number set to obtain a plurality of number combinations includes:
for each preset expansion date, acquiring the expansion digits required by the expansion date;
randomly extracting digits corresponding to the expansion digits from the number set to form a number combination representing the expansion date.
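As a hedged sketch of this step, assuming the number set is organized as per-digit subsets of cropped digit images (an assumed data layout):

```python
import random

def make_number_combination(expansion_date, digit_subsets):
    """Compose one number combination for a preset expansion date by randomly
    drawing one image of each required digit from its subset.

    digit_subsets: dict mapping '0'..'9' -> list of digit images."""
    month, day = expansion_date                     # e.g. (12, 5) for December 5
    needed = list(str(month)) + list(str(day))      # expansion digits: '1','2','5'
    return [random.choice(digit_subsets[d]) for d in needed]
```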
In an alternative embodiment, the step of extracting numbers contained in the target area of each training positive sample to form a number set includes:
Extracting numbers contained in a target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers carrying different writing information;
and composing the numbers in the expansion set and the numbers in the plurality of training positive samples into a number set.
In an alternative embodiment, the step of obtaining a plurality of expansion samples based on the plurality of number combinations and the text includes:
for each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each number combination into any one of the expansion templates to obtain a corresponding expansion sample.
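A minimal sketch of the template-filling step, assuming each expansion template records the slot rectangles where digits were cut out (the slot annotation is an assumption, not specified by the patent):

```python
import cv2

def fill_template(template, digit_slots, number_combination):
    """Paste one digit image into each cut-out slot of an expansion template.

    template: HxWxC image with the original digits removed;
    digit_slots: list of (x, y, w, h) slot rectangles, one per digit;
    number_combination: list of digit images drawn from the number set."""
    sample = template.copy()
    for (x, y, w, h), digit in zip(digit_slots, number_combination):
        patch = cv2.resize(digit, (w, h))          # fit the digit to its slot
        sample[y:y + h, x:x + w] = patch
    return sample                                  # one generated expansion sample
```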
In an alternative embodiment, the location information includes corner coordinates of the target area and a rectangular frame framing the target area;
The step of training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model comprises the following steps:
and training the constructed first neural network model by utilizing a plurality of training positive samples carrying angular point coordinates of the target area and a rectangular frame framing the target area to obtain a detection sub-model.
In a third aspect, an embodiment of the present application provides an in-vehicle field image recognition apparatus, including:
The acquisition module is used for acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
The identification module is used for processing the image to be identified by utilizing a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and carrying out information identification on the target area to obtain an identification result;
The recognition model is obtained by training a plurality of training positive samples marked with the position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
In an alternative embodiment, the recognition model includes a detection sub-model and a recognition sub-model;
the identification module is used for obtaining an identification result through the following modes:
detecting based on the position information using a pre-established detection sub-model to determine a target region from the plurality of regions;
performing information recognition on the target area of the image to be recognized by using a pre-established recognition sub-model to obtain a recognition result, wherein the recognition sub-model is obtained by training on a plurality of training positive samples and on a plurality of expansion samples obtained by expansion according to the content information of the training positive samples.
In an alternative embodiment, the identification module is configured to obtain the identification result by:
obtaining the text information and number information contained in the target area of the image to be identified by using the pre-established recognition sub-model, and obtaining the corresponding date information from the number information.
In an alternative embodiment, the identification module is configured to determine the target area by:
determining the target area from the plurality of areas by using a pre-established detection sub-model based on detection of rectangular frames and corner coordinates.
In an alternative embodiment, the identification module is configured to determine the target area by:
determining a preliminary target area from the plurality of areas according to the corner coordinates and the rectangular frames of each area;
constructing a minimum circumscribed frame of the preliminary target area from the corner coordinates and edges of the preliminary target area;
and detecting whether the size of the minimum circumscribed frame is within a preset range, and if so, determining the preliminary target area as a target area.
In an optional embodiment, the vehicle-mounted domain image recognition device further comprises a correction module;
The correction module is used for detecting whether the orientation information of the minimum circumscribed frame meets the preset requirement and, if not, performing rotation correction on the target area.
In an optional embodiment, the recognition model further comprises a classification sub-model, wherein the classification sub-model is obtained by training the classifier by utilizing a plurality of training positive samples and a plurality of training negative samples in advance;
the identification module is further used for classifying the image to be identified in the following manner:
classifying the determined target area by using the classification sub-model to obtain a classification result, wherein the classification result indicates whether the target area contains information corresponding to the content information in the target areas of the training positive samples.
In a fourth aspect, an embodiment of the present application provides a vehicle-mounted domain image recognition model building apparatus, including:
The acquisition module is used for acquiring a plurality of areas containing content information in each of the acquired training positive samples;
The calibration module is used for calibrating a target area from the plurality of areas and obtaining the position information of the target area;
and the training module is used for training the constructed neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain an identification model.
In an alternative embodiment, the recognition model includes a detection sub-model and a recognition sub-model, and the training module is configured to obtain the detection sub-model and the recognition sub-model by:
training the constructed first neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
performing sample expansion according to the content information in the target areas of the training positive samples to obtain a plurality of expansion samples;
And training the constructed second neural network model by using a plurality of training positive samples and a plurality of expansion samples to obtain the recognition sub-model.
In an alternative embodiment, the recognition model further comprises a classification sub-model, and the training module is further configured to obtain the classification sub-model by:
And training the constructed classifier by using the acquired training negative samples and the training positive samples marked with the position information of the target area to obtain a classification sub-model.
In an alternative embodiment, the content information in the target area contains text and numbers, and the training module is configured to obtain the expansion samples by:
extracting numbers contained in a target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of number combinations and the text.
In an alternative embodiment, the numbers characterize a date, and the training module is configured to obtain a number combination by:
for each preset expansion date, acquiring the expansion digits required by the expansion date;
randomly extracting digits corresponding to the expansion digits from the number set to form a number combination representing the expansion date.
In an alternative embodiment, the training module is configured to construct the number set by:
Extracting numbers contained in a target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers carrying different writing information;
and composing the numbers in the expansion set and the numbers in the plurality of training positive samples into a number set.
In an alternative embodiment, the training module is configured to obtain the expanded sample by:
for each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each number combination into any one of the expansion templates to obtain a corresponding expansion sample.
In an alternative embodiment, the location information includes corner coordinates of the target area and a rectangular frame framing the target area;
the training module is used for obtaining a detection submodel by the following modes:
and training the constructed first neural network model by utilizing a plurality of training positive samples carrying angular point coordinates of the target area and a rectangular frame framing the target area to obtain a detection sub-model.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor, a storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over a bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of a method as in any of the preceding embodiments.
In a sixth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor performs the steps of a method according to any of the previous embodiments.
In a seventh aspect, embodiments of the present application provide a computer program product which, when run on a computer, causes the computer to perform the method according to any of the preceding embodiments.
Based on any one of the above aspects, after an image to be identified having a plurality of areas containing content information is obtained, the image is processed with a pre-established identification model: a target area is determined from the plurality of areas according to the position information of each area, and information recognition is performed on the target area to obtain an identification result. The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each having a plurality of areas containing content information. This scheme avoids the prior-art problem that indiscriminately recognizing every area containing content information makes it difficult to focus recognition on the required area, and prevents other areas from interfering with the recognition result.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows an application scenario schematic diagram of a vehicle-mounted domain image recognition method provided by an embodiment of the present application;
fig. 2 shows a flowchart of a vehicle-mounted domain image recognition method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a training positive sample according to an embodiment of the present application;
Fig. 4 shows a flowchart of a method for establishing an image recognition model in a vehicle-mounted field according to an embodiment of the present application;
fig. 5 is a flowchart of a specific method for building a detection sub-model and a recognition sub-model in the vehicle-mounted field image recognition model building method provided by the embodiment of the application;
FIG. 6 is a schematic diagram of a rectangular frame for calibrating a target area according to an embodiment of the present application;
FIG. 7 is a flowchart of a specific method for obtaining an extension sample in the method for creating an identification submodel according to the embodiment of the present application;
FIG. 8 is a flowchart of a specific method for constructing a digital set in the method for obtaining an extended sample according to an embodiment of the present application;
FIG. 9 is a flowchart of a specific method for constructing a digital combination in the method for obtaining an extended sample according to the embodiment of the present application;
FIG. 10 is another flow chart of a specific method for obtaining an expanded sample according to an embodiment of the present application;
FIG. 11 (a) is a schematic diagram of an expansion template according to an embodiment of the present application;
FIG. 11 (b) is a schematic diagram of an expansion sample obtained based on an expansion template according to an embodiment of the present application;
Fig. 12 is a flowchart of a specific method for obtaining a target area and a recognition result in the vehicle-mounted domain image recognition method according to the embodiment of the present application;
Fig. 13 is a flowchart of a specific method for determining a target area in the vehicle-mounted domain image recognition method according to the embodiment of the present application;
FIG. 14 illustrates a schematic diagram of a minimum bounding box of a constructed target area provided by an embodiment of the present application;
FIG. 15 shows the first schematic diagram of aspect-ratio detection and classification discrimination provided by an embodiment of the present application;
FIG. 16 shows the second schematic diagram of aspect-ratio detection and classification discrimination provided by an embodiment of the present application;
Fig. 17 is a functional block diagram showing an in-vehicle field image recognition apparatus provided by an embodiment of the present application;
fig. 18 is a functional block diagram showing an in-vehicle field image recognition model building apparatus according to an embodiment of the present application;
fig. 19 shows a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for the purpose of illustration and description only and are not intended to limit the scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this disclosure, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to or removed from the flow diagrams by those skilled in the art under the direction of the present disclosure.
In addition, the described embodiments are only some, but not all, embodiments of the application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
In order to enable those skilled in the art to use the present disclosure, the following embodiments are presented in connection with a specific application scenario, online ride-hailing. It will be apparent to those having ordinary skill in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the application is described primarily in the context of online ride-hailing, it should be understood that this is but one exemplary embodiment.
It should be noted that the term "comprising" will be used in embodiments of the application to indicate the presence of the features stated hereafter, but not to exclude the addition of other features.
The terms "passenger," "requestor," "service requestor," "customer," "service requestor," are used interchangeably herein to refer to a person, entity, or tool that can request or subscribe to a service. The terms "driver," "provider," "service provider," "provider," and "service provider" are used interchangeably herein to refer to a person, entity, or tool that can provide a service. The term "user" in the present application mainly refers to an individual, entity or tool requesting a service, subscribing to a service. For example, the user may be a passenger.
The terms "service request" and "service" are used interchangeably herein to refer to a request initiated by a passenger, a service requester, a driver, a service provider, or a vendor, etc., or any combination thereof. Accepting the "service request" or "service" may be a passenger, a service requester, a driver, a service provider, a vendor, or the like, or any combination thereof. The service request may be either fee-based or free.
The positioning techniques used in the present application may be based on the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the COMPASS navigation system, the Galileo positioning system, the Quasi-Zenith Satellite System (QZSS), wireless fidelity (WiFi) positioning techniques, or the like, or any combination thereof. One or more of the above-described positioning systems may be used interchangeably in the present application.
One aspect of the present application relates to an identification system that, after obtaining an image to be identified having a plurality of areas containing content information, processes the image using a pre-established identification model, determines a target area from the plurality of areas based on the position information of each area, and performs information recognition on the target area to obtain an identification result. The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, each having a plurality of areas containing content information. This avoids the prior-art problem that indiscriminate recognition of areas containing content information makes it difficult to focus recognition on the required area, and prevents other areas from interfering with the recognition result.
It should be noted that, before the present application, when an image included a plurality of regions having the same attribute, the information of those regions was generally recognized without distinction. This approach is hard to apply to scenes where only part of the information among multiple regions sharing the same attribute needs to be recognized. For example, in an image including a plurality of areas carrying content information (e.g., text, numbers, graphics), the information in some areas is always fixed while the information in others varies, and it is the varying information that must be recognized.
Although those other areas also contain content information, their content is fixed and immaterial, so it need not be recognized. Existing information recognition methods for natural scenes, however, recognize all areas containing content information without distinction, which lowers recognition efficiency, and the unnecessary information they pick up can also affect the result.
By contrast, in the identification approach provided by the present application, the trained identification model can determine the target area from the plurality of areas and perform information recognition on that area to obtain the identification result. Recognition can thus be focused on the target area, avoiding the difficulty, inherent in indiscriminate recognition, of giving priority to the required area.
First embodiment
Fig. 1 is a schematic architecture diagram of an identification system 100 according to an embodiment of the present application. For example, the identification system 100 may be an online transportation service platform for transportation services such as taxi, ride-hailing, express, carpool, bus, driver-hire, or shuttle services, or any combination thereof. The identification system 100 may include one or more of a server 110, a network 120, a service request terminal 130, a service provider terminal 140, and a database 150.
In some embodiments, server 110 may include a processor. The processor may analyze the information sent by the service provider terminal 140 to perform one or more of the functions described herein. For example, the processor may analyze and process multiple images sent by the service provider terminal 140 so as to establish the recognition model. In some embodiments, a processor may include one or more processing cores (e.g., a single-core or multi-core processor). By way of example only, the processor may include a central processing unit (CPU), application-specific integrated circuit (ASIC), application-specific instruction-set processor (ASIP), graphics processing unit (GPU), physics processing unit (PPU), digital signal processor (DSP), field-programmable gate array (FPGA), programmable logic device (PLD), controller, microcontroller unit, reduced instruction set computer (RISC), microprocessor, or the like, or any combination thereof.
In some embodiments, the device type corresponding to the service request end 130 and the service providing end 140 may be a mobile device, for example, may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, or an augmented reality device, and may also be a tablet computer, a laptop computer, or a built-in device in a motor vehicle, and so on.
In some embodiments, database 150 may be connected to network 120 to communicate with one or more components in identification system 100 (e.g., server 110, service requester 130, service provider 140, etc.). One or more components in the identification system 100 may access data or instructions stored in the database 150 via the network 120. In some embodiments, database 150 may be directly connected to one or more components in identification system 100, or database 150 may be part of server 110.
The following describes in detail the method for identifying an image in a vehicle field according to the embodiment of the present application, with reference to the description of the identification system 100 shown in fig. 1.
Second embodiment
Referring to fig. 2, a flowchart of a vehicle-mounted domain image recognition method according to an embodiment of the present application is shown, and the method may be executed by the server 110 in the recognition system 100. It should be understood that, in other embodiments, the order of part of the steps in the vehicle-mounted field image recognition method according to the present embodiment may be interchanged according to actual needs, or part of the steps may be omitted or deleted. The detailed steps of the identification method are described below.
Step S40, an image to be identified is acquired, wherein the image to be identified is provided with a plurality of areas containing content information.
Step S50, processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and performing information recognition on the target area to obtain an identification result.
As a possible application scenario, to ensure the travel safety of passengers, drivers are generally required to disinfect vehicles providing travel services regularly, especially when epidemics or infectious diseases are serious. The driver is also required to place or paste a picture containing disinfection information, such as a vehicle sticker, in the vehicle, and then to photograph an image containing the vehicle sticker and upload it to a server. To ensure that a driver has disinfected the vehicle and filled in the disinfection date on the sticker, an administrator would need to manually review the uploaded images, but given the huge daily order volume, manual review is difficult and labor costs are high. To reduce the workload of manual review and save labor cost, computer vision technology can be adopted to automatically recognize information in the vehicle-mounted images, so that drivers can be supervised to disinfect their vehicles regularly as required.
It should be noted that the identification scheme provided in this embodiment may be applied to other application scenarios, and the present application is only described by taking the identification process of the image including the disinfection information in the vehicle-mounted field as an example.
In this embodiment, the image to be identified may be an image sent by a service provider and received by a server, and the server may analyze and process the image to be identified to detect whether information in the image to be identified meets relevant specifications.
The recognition model is obtained by training in advance on a plurality of training positive samples marked with the position information of the target area, and each training positive sample has a plurality of areas containing content information. For example, as shown in fig. 3, the image to be identified may include publicity slogans about disinfection in the vehicle, such as "ride epidemic prevention, safety guard", explanatory text, such as wording that explains the purpose of the sticker to passengers, and information giving the specific disinfection date. Further, the vehicle sticker may be located at different positions in the vehicle: on a back seat, on a front seat back, on the inner side of a door, and so on. The captured image may therefore include, in addition to the vehicle sticker, partial images of the interior or exterior of the vehicle, which may interfere with recognition.
In this embodiment, the information that ultimately needs to be recognized with emphasis is the information containing the specific disinfection date, because the information in that area changes and the recognition model must recognize the changing information, such as the date, accurately in order to determine whether the driver disinfects the vehicle daily. The areas containing publicity slogans and explanatory text, and any other partial areas of the vehicle interior or exterior, carry information that does not actually help in supervising the driver's disinfection, so the information in these areas need not be recognized with emphasis.
Therefore, in this embodiment, the target area can be determined from the plurality of areas included in the image to be identified, where the target area is the area containing the changing information (the specific disinfection date).
Considering that the general layout of the vehicle stickers in different vehicles is consistent, the position of the target area in the image does not vary much; therefore, the position information of each area in the image to be recognized can be obtained.
Because the recognition model is trained on the position information of the target area in the training positive samples, it can determine the target area of the image to be recognized from the position information of each area and then perform information recognition on that target area, avoiding interference from other areas with the identification result.
The vehicle-mounted field image recognition method provided by this embodiment thus avoids the problems of existing natural-scene recognition technology, which recognizes all areas containing content information indiscriminately: focused recognition of the area containing the required information is difficult, and information from other areas interferes with the recognition result.
The embodiment of the application also provides a vehicle-mounted field image recognition model building method, referring to fig. 4, the following first describes in detail a process of obtaining a recognition model by pre-training in the recognition model building method:
Step S10, a plurality of areas containing content information in the training positive samples are obtained for each of the obtained training positive samples.
Step S20, a target area is calibrated from among the plurality of areas, and the position information of the target area is obtained.
Step S30, the constructed neural network model is trained by using a plurality of training positive samples marked with the position information of the target area to obtain an identification model.
In the above application scenario, the training positive samples in this embodiment are historically acquired images, uploaded by drivers, that contain the relevant disinfection information. Because the travel platform has numerous service providers, the angles, sizes, etc. of the images they capture may differ. The pre-acquired training positive samples may therefore include multiple images uploaded by different service providers, taken at different image sizes, different angles, and different times, enriching the diversity of the training samples.
In this embodiment, the constructed neural network model is trained with a plurality of training positive samples (such as the one shown in fig. 3) marked with the position information of the target area, so that features can be learned from the positive samples based on that position information. The trained recognition model can then accurately pick out the target area from the plurality of areas included in an image to be recognized and perform recognition and detection based on the information in the target area.
When, in some application scenarios, only a partial area among a plurality of areas carrying content information needs to be recognized, the existing natural-scene approach of learning all areas to obtain a recognition model has two drawbacks: learning unnecessary feature information degrades the results of the subsequent recognition model, and it also reduces the model's recognition efficiency. The vehicle-mounted field image recognition model establishment method provided by this embodiment therefore first marks the target area among the plurality of areas in each image and trains the recognition model with samples annotated with the target area's position information, so that the model focuses on the features of the target area, which improves the efficiency of subsequent recognition.
In this embodiment, the recognition model includes a detection sub-model and a recognition sub-model. The training of the detection sub-model may be based mainly on the position information of the target area marked in the positive samples, so that the target area can later be detected in the image to be recognized, excluding other areas that would affect the recognition result. The training of the recognition sub-model may be based on the specific information contained in the target area, for the subsequent analysis and recognition of the information in the target area of the image to be recognized.
Further, this embodiment considers that the number of training positive samples acquired in advance is limited and, because the date in the image must be recognized, the dates covered are limited by time. The dates in the collected positive samples may therefore cover only part of the year and not include dates that will need to be recognized later. For example, if the project started on February 1 and the current date is July 15, the collected positive samples cover only the period from February 1 to July 15 and lack samples for the period from July 15 to February 1 of the following year.
Therefore, training the model only on the historically obtained positive samples causes sample imbalance: recognition accuracy is high for images whose dates have appeared in the samples, but low for images whose dates have not. This amounts to model overfitting: the recognition rate on previously seen samples is high, the generalization ability is poor, and the resulting recognition model is inaccurate on unseen samples.
Based on the above consideration, this embodiment adopts a manner of expanding the training positive samples to solve the above problem, and referring to fig. 5, in this embodiment, the recognition model may be obtained through training in the following manner.
Step S31, training the constructed first neural network model by using a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model.
Step S32, performing sample expansion according to the content information in the target areas of the training positive samples to obtain a plurality of expansion samples.
Step S33, training the constructed second neural network model by using the plurality of training positive samples and the plurality of expansion samples to obtain a recognition sub-model.
In this embodiment, the first neural network model and the second neural network model may be pre-constructed, and both may employ, but are not limited to, convolutional neural networks (CNN). The first neural network model can adopt a lightweight MobileNet model, and its training can follow the RetinaFace detection method. This detection method offers good precision and recall together with high processing efficiency.
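As a hedged sketch only, a detector along these lines could be assembled as below; the anchor boxes, feature pyramid, and context modules of the real RetinaFace are omitted, and the head layout is an assumption for illustration:

```python
import torch.nn as nn
import torchvision

class TargetRegionDetector(nn.Module):
    """Simplified sketch: a lightweight MobileNet backbone with a
    RetinaFace-style head that regresses, per feature-map location, an
    objectness score, a rectangular box, and four corner points."""
    def __init__(self):
        super().__init__()
        self.backbone = torchvision.models.mobilenet_v2(weights=None).features
        self.head_score = nn.Conv2d(1280, 1, kernel_size=1)    # objectness
        self.head_box = nn.Conv2d(1280, 4, kernel_size=1)      # x1, y1, x2, y2
        self.head_corners = nn.Conv2d(1280, 8, kernel_size=1)  # four (x, y) corners

    def forward(self, x):
        feat = self.backbone(x)
        return self.head_score(feat), self.head_box(feat), self.head_corners(feat)
```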
When the position information of the target area of each training positive sample is calibrated in advance, it is calibrated using the corner coordinates of the target area and a rectangular frame framing the target area.
Based on the above, when the detection sub-model is trained, the constructed first neural network model can be trained by utilizing a plurality of training positive samples carrying the corner coordinates of the target area and the rectangular frame framing the target area, so as to obtain the detection sub-model.
The target area containing the specific date information in the vehicle sticker is generally rectangular. In this embodiment the position information of the target area is calibrated with its corner coordinates; when the target area is rectangular, the corners may be its four corner points. The target area may also take other, irregular shapes, in which case the corner points may be several edge points on the outer edge of the irregular shape.
The relative position information of the target area in the image can be accurately calibrated by utilizing the angular point coordinates, and the detection sub-model can learn the accurate position information of the target area by training based on the angular point coordinates of the target area.
In addition, because the acquired training positive samples differ in size, shooting angle, and so on, the relative corner coordinates of the target area within the whole image may differ from sample to sample. If only the corner coordinates of the target area were learned, their diversity would degrade learning and make it difficult to accurately determine the target area in subsequent images to be identified.
Based on this consideration, in this embodiment a rectangular frame framing the target area is also added when the target area is calibrated. The rectangular frame may be upright, i.e., its sides horizontal or vertical. The edge points of the target area may lie on the edges of the rectangular frame, or the whole target area may lie entirely inside it; in fig. 6, for example, the four corner points of the target area are marked and the upright rectangular frame frames the target area.
The rectangular frame marks the approximate position of the target area in the whole image. Combining the corner coordinates of the target area with the rectangular frame lets the model learn both the precise and the approximate position of the target area, improving the detection sub-model's localization accuracy while avoiding the interference that positional diversity would otherwise cause in learning.
In this way, a detection sub-model for target-area detection can be trained. However, although the target area can be determined from among the plurality of areas using its position information, problems such as the background and angle at the time of shooting may cause other areas in the image, for example a partial area of the background close to the position of the target area, to be misjudged as the target area.
Therefore, in order to avoid the above-described problem, in the present embodiment, the recognition model further includes a classification sub-model, which can be trained in advance by:
And training the constructed classifier by using the acquired training negative samples and the training positive samples marked with the position information of the target area to obtain a classification sub-model.
A training positive sample is a sample containing a target area with the required content information, the required content information here being content that includes a disinfection date. A training negative sample is a sample containing a partial region with the same position information as the target area of a training positive sample, but that region does not contain the required content information.
The classifier is trained with the training positive and negative samples, and the resulting classification sub-model judges whether an image contains the required content information. In this way, after the target area in an image is detected with the trained detection sub-model, the classification sub-model can judge whether the target area contains the required content information, i.e., whether the image is a genuine disinfection vehicle-sticker image.
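A minimal sketch of such a binary classification sub-model; the backbone choice and head are assumptions, since the patent only specifies training a classifier on positive and negative samples:

```python
import torch.nn as nn
import torchvision

def build_classification_submodel():
    """Binary classifier: does the detected region contain the required
    content information (e.g. a disinfection date) or not?"""
    model = torchvision.models.mobilenet_v2(weights=None)
    model.classifier[-1] = nn.Linear(model.last_channel, 2)  # positive / negative
    return model
```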
In this embodiment, the classification sub-model only needs to judge whether the target area contains the required content information; the specific information in the target area is recognized by the trained recognition sub-model. As noted above, to avoid the imbalance problem of the training positive samples, sample expansion may be performed based on the training positive samples.
The content information in the target area of each training positive sample contains text and numbers. As noted above, the changes in the content information of the target area are mainly changes in the numbers, i.e., the disinfection date, while the text is generally only explanatory and fixed. The dates on the disinfection vehicle sticker are generally handwritten by the driver, so not only can they not cover all dates in a year, the written forms of the digits also vary. Referring to fig. 7, in this embodiment sample expansion can therefore be performed on the basis of the training positive samples as follows:
Step S321, extracting numbers contained in the target area of each training positive sample to form a number set.
Step S322, combining the digits contained in the number set to obtain a plurality of number combinations.
Step S323, obtaining a plurality of expansion samples based on the plurality of number combinations and the text.
As described above, the target area of a training positive sample contains text and numbers, and the numbers are date information, for example February 26, possibly including the year. The content information in the target area may be, for example, "Disinfected on February 26". The numbers occupy relatively fixed positions in the target area, and expansion mainly needs to expand the date. Thus, the digits in the target area can be extracted: the month digits at the month positions, such as 1 to 7, and the day digits at the day positions, such as 01 to 30. All extracted digits form a number set.
In this embodiment, the number set may include a plurality of subsets, each containing instances of the same digit in different written forms; for example, all instances of the digit 1 may form one subset and all instances of the digit 2 another.
When the date is expanded, digits can be drawn from the respective subsets and combined to obtain a plurality of number combinations; combining the text contained in the target area with the obtained number combinations then yields a plurality of expansion samples.
In this embodiment, considering that the same number may be written in many different ways, the writing forms appearing in the collected training positive samples may not be comprehensive. Therefore, to further enrich the number features used for learning and training, and thereby further improve the recognition accuracy of the resulting recognition sub-model, referring to fig. 8, the step of forming the number set may be implemented in the following manner:
Step S3211, extracting the numbers contained in the target area of each training positive sample.
Step S3212, obtaining an expansion set containing a plurality of numbers with different writing forms.
Step S3213, forming the numbers in the expansion set and the numbers in the plurality of training positive samples into a number set.
In this embodiment, an expansion set may be obtained in advance; the expansion set may be a handwriting data set containing the numbers 0 to 9 in a variety of writing forms. The numbers contained in the expansion set may be numbers collected in other scenarios. For handwritten numbers, the same number written by different people, or by the same person at different times and in different scenes, generally differs in form, so the same number appears in many writing forms.
After the numbers contained in the target areas of the training positive samples are extracted, the numbers in the expansion set and the numbers in the plurality of training positive samples may be formed into a number set. Enriching the number features with the varied writing forms in the expansion set allows the recognition sub-model to learn more diverse number features, which benefits the accuracy of subsequent date recognition.
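The following sketch illustrates steps S3212-S3213 using scikit-learn's bundled handwritten-digit set as the expansion set; the concrete dataset is an assumption, since the patent only requires extra numbers with varied writing forms:

```python
from collections import defaultdict
from sklearn.datasets import load_digits

# digit_set as built from the training positive samples (empty stub here;
# see the sketch above).
digit_set = defaultdict(list)

# Illustrative expansion set: scikit-learn's bundled 8x8 handwritten
# digits. Any handwriting dataset collected in other scenarios would do.
expansion = load_digits()

# Steps S3212-S3213: merge the expansion digits into the number set,
# grouped by digit value alongside the digits from the positive samples.
for img, label in zip(expansion.images, expansion.target):
    digit_set[int(label)].append(img)
```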
Referring to fig. 9, in this embodiment, when a plurality of number combinations are obtained based on the numbers in the number set, the following manner may be adopted:
Step S3221, for each preset expansion date, acquiring the expansion numbers required for that expansion date.
Step S3222, randomly extracting numbers corresponding to the expansion numbers from the number set to form a number combination characterizing the expansion date.
In this embodiment, a plurality of expansion dates may be preset. The expansion dates may be each date from July 16 to February 1 of the following year; together with the dates from February 1 to July 15 contained in the currently collected training positive samples, they make up all the dates of a whole year.
For each expansion date, for example December 5, the expansion numbers required are 1, 2 and 5. Thus, a number may be extracted from the subset of 1s and a number from the subset of 2s to form the month, and a number from the subset of 5s as the day, yielding a number combination corresponding to the expansion date of December 5.
In the above manner, a number combination corresponding to each expansion date can be obtained, and each expansion date may correspond to a plurality of number combinations with the same numbers but different writing forms.
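A sketch of steps S3221-S3222, assuming dates are zero-padded to a fixed month-day format (which differs slightly from the handwritten "1, 2, 5" example above); digit_set is the per-digit subset structure from the earlier sketches, stubbed here so the example runs:

```python
import datetime
import random

# Stub subsets; in practice these hold handwritten digit patches.
digit_set = {d: [f"form_of_{d}"] for d in range(10)}

def digits_of(date):
    """Step S3221: the digits needed to write a date, e.g. 12-05 -> 1,2,0,5."""
    return [int(c) for c in f"{date.month:02d}{date.day:02d}"]

# Preset expansion dates: July 16 through February 1 of the following
# year, complementing the Feb 1 - Jul 15 dates in the collected samples.
start, end = datetime.date(2021, 7, 16), datetime.date(2022, 2, 1)
expansion_dates = [start + datetime.timedelta(days=i)
                   for i in range((end - start).days + 1)]

def combination_for(date):
    """Step S3222: randomly pick a handwritten form for each needed digit."""
    return [random.choice(digit_set[d]) for d in digits_of(date)]

combos = {d.isoformat(): combination_for(d) for d in expansion_dates}
```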
After the number combinations corresponding to the required expansion dates are obtained, expansion samples need to be generated based on them. Referring to fig. 10, in this embodiment, the expansion samples may be generated in the following manner:
Step S3231, for each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template.
Step S3232, filling each number combination into any expansion template to obtain a corresponding expansion sample.
In this embodiment, for each training positive sample, the expansion template may be obtained by masking the numbers in the target area of the training positive sample with white or black image patches. In this way, a plurality of expansion templates with different shooting conditions, such as different sizes, background images and angles, can be obtained.
Each obtained number combination is then filled into an expansion template, specifically into the position corresponding to the date information in that template, forming an expansion sample that retains the other information in the template. Fig. 11(a) may represent an obtained expansion template, and fig. 11(b) an expansion sample (the target area of the expansion sample) obtained by filling a number combination into the template.
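The masking and filling of steps S3231-S3232 might look as follows; the box coordinates, patch sizes and the naive resize are assumptions of this sketch:

```python
import numpy as np

def make_template(target_area, digit_boxes, fill=255):
    """Step S3231: mask the digit regions with white (255) or black (0)
    patches to turn a positive sample's target area into a template."""
    template = target_area.copy()
    for x, y, w, h in digit_boxes:
        template[y:y + h, x:x + w] = fill
    return template

def fill_template(template, digit_boxes, patches):
    """Step S3232: paste a number combination into the date positions."""
    out = template.copy()
    for (x, y, w, h), patch in zip(digit_boxes, patches):
        out[y:y + h, x:x + w] = np.resize(patch, (h, w))  # naive fit for the sketch
    return out

# Toy example: a 32x128 grayscale target area with two digit positions.
area = np.full((32, 128), 180, dtype=np.uint8)
boxes = [(60, 8, 12, 16), (76, 8, 12, 16)]
template = make_template(area, boxes)
sample = fill_template(template, boxes,
                       [np.zeros((16, 12), np.uint8),
                        np.ones((16, 12), np.uint8) * 30])
```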
In addition to producing expansion templates by cutting the numbers out of training positive samples, blank templates collected in different scenes may also serve as expansion templates. A blank template is one that, compared with a training positive sample, lacks the numbers, that is, the date positions are left blank.
Through the above steps, by calibrating the position information of the target area in the training positive samples in advance and using the training positive samples marked with that position information, the first neural network model can be trained to obtain the detection sub-model. The detection sub-model learns the position information of the target area, so it can locate the target area in an image to be identified and eliminate the interference of other areas that also contain content information.
Further, to avoid interference from areas whose positions are similar to that of the target area, the classifier is trained with the training positive samples and training negative samples to obtain the classification sub-model. The classification sub-model judges the content information in the target area detected by the detection sub-model, that is, whether the target area contains the required content information, such as date information.
Further, since the present application mainly recognizes date information in the target area, and the dates that can be formed are limited, the collected training positive samples are unbalanced. Therefore, expansion samples are obtained by expanding the training positive samples, and the second neural network model is trained with the training positive samples and the expansion samples to obtain the recognition sub-model. This improves the accuracy with which the recognition sub-model subsequently recognizes images of every date.
After the recognition model is obtained in the above manner, it is known from the above that the recognition model includes a detection sub-model and a recognition sub-model, where the recognition sub-model is trained in advance on a plurality of training positive samples and a plurality of expansion samples expanded from their content information. Referring to fig. 12, step S50 of processing the image to be identified to obtain the identification result may be implemented in the following manner:
Step S51, detecting based on the position information using the pre-established detection sub-model to determine the target area from the plurality of areas.
Step S52, performing information identification on the target area of the image to be identified using the pre-established recognition sub-model to obtain the identification result.
In this embodiment, the detailed process of obtaining the expansion samples from the training positive samples is described in the above embodiment. Identifying the content information in the target area with a recognition sub-model trained on both the training positive samples and the expansion samples improves recognition accuracy, and avoids low accuracy when the image to be identified contains date information that never appeared in the training positive samples.
In this embodiment, in order to accurately locate the position of the target area in the image, the corner coordinates of the target area and the rectangular frame framing the target area may be calibrated when training the detection sub-model, as described in the above embodiment. Therefore, when the detection sub-model is used to determine the target area in the image to be identified, the target area can be determined from the plurality of areas based on detection of the rectangular frame and the corner coordinates.
In this way, the target area in the image to be identified is determined by combining the learned approximate position information of the target area (represented by the rectangular frame) with its accurate position information (represented by the corner coordinates). This avoids the problem that, because the target area may appear at various angles in the image, detecting the rectangular frame alone leaves the text and number information occupying only a small portion of the frame, which would harm subsequent recognition.
In this embodiment, the detection sub-model detects all portions of the image to be identified that may be target areas, but its detection accuracy cannot reach 100%, so some false detections may occur. If these falsely detected areas are passed unprocessed to the subsequent recognition sub-model, erroneous recognition results will be produced. Therefore, referring to fig. 13, the step of determining the target area from the plurality of areas may be performed as follows:
Step S511, determining a preliminary target area from the plurality of areas based on detection of the rectangular frame and corner coordinates using the pre-established detection sub-model.
Step S512, constructing the minimum circumscribed frame of the preliminary target area according to the corner coordinates and edges of the preliminary target area.
Step S513, detecting whether the size of the minimum circumscribed frame is within a preset range, and if so, determining the preliminary target area as the target area.
In this embodiment, considering that the target area is generally rectangular, after the preliminary target area is determined based on the rectangular frame and the corner coordinates by the detection sub-model, the minimum circumscribed frame of the preliminary target area may be constructed, that is, the smallest frame drawn along the edges and corner points of the preliminary target area. As shown in fig. 14, the outer frame is the rectangular frame described above and the frame drawn along the inner edges is the minimum circumscribed frame. According to prior knowledge, the size of a real target area should fall within a certain range, so whether the preliminary target area is a real target area can be judged by detecting whether the size of its minimum circumscribed frame is within a preset range. The size may be the height, width or aspect ratio of the minimum circumscribed frame, among others.
For example, according to prior knowledge, the aspect ratio of the target area in the image is typically between 6:1 and 9:1, so preliminary target areas whose aspect ratios fall outside this range can be excluded.
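A sketch of steps S512-S513, using OpenCV's minimum-area rectangle as the minimum circumscribed frame and the 6:1 to 9:1 aspect-ratio prior from the example above:

```python
import cv2
import numpy as np

def accept_preliminary_area(corners, min_ratio=6.0, max_ratio=9.0):
    """Steps S512-S513: build the minimum circumscribed (rotated) box from
    the four detected corner points and keep the region only if its aspect
    ratio lies in the prior 6:1 - 9:1 range."""
    pts = np.asarray(corners, dtype=np.float32)
    (_, _), (w, h), _ = cv2.minAreaRect(pts)
    if min(w, h) == 0:
        return False
    ratio = max(w, h) / min(w, h)
    return min_ratio <= ratio <= max_ratio

# A slightly tilted ~120x16 region -> ratio ~7.5, accepted.
print(accept_preliminary_area([(10, 10), (130, 14), (129, 30), (9, 26)]))
```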
On this basis, some areas close to the position of the real target area may still be determined as target areas even though they do not contain the required information, for example an area in the shooting background. Therefore, in this embodiment, a classification sub-model is also added to the recognition model. The classification sub-model is obtained by training the classifier in advance on the collected training positive samples and training negative samples; the training process is described in the above embodiment.
The determined target area is classified by the classification sub-model to obtain a classification result, which characterizes whether the target area contains information corresponding to the content information in the target areas of the training positive samples. That is, the classification sub-model identifies whether the determined target area contains date information associated with disinfection.
For example, as shown in fig. 15, the aspect ratio of each determined preliminary target area may be detected first. If the output is false, i.e. the aspect ratio does not meet the requirement, no subsequent processing is needed. If the output is true, a classification judgment is then made as to whether the target area contains the required content information: if the judgment is true, recognition of the content information in the target area proceeds; if false, no further recognition is needed. The area framed by the upper box in fig. 15 lies in the shooting background and can be excluded by the aspect-ratio check; the area framed by the lower box passes both the aspect-ratio check and the classification judgment, so it is judged to contain the required content information and can be sent for subsequent recognition.
Further, as shown in fig. 16, the area framed by the upper box in fig. 16 also lies in the shooting background, but because its size is similar to that of a real target area, the aspect-ratio check outputs true. When the true/false classification is then made by the classification sub-model, however, the area is judged not to contain the required content information, that is, the classification result is false, so no subsequent recognition is needed. The area framed by the lower box in fig. 16 passes both the aspect-ratio check and the classification judgment and can be sent to the subsequent recognition flow.
In this embodiment, adding the classification sub-model to judge whether the information in a determined target area is genuine further screens out areas that were misjudged as target areas but do not contain the required information.
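The overall screening flow of figs. 15 and 16 can be summarized as follows; the three callables are stand-ins for the trained sub-models, not the patent's actual interfaces:

```python
def screen_and_recognize(candidates, aspect_ok, classify, recognize):
    """Aspect-ratio check first, then the classification sub-model's
    true/false judgment; only surviving areas reach the recognition
    sub-model."""
    results = []
    for crop, corners in candidates:
        if not aspect_ok(corners):   # fig. 15: background area rejected by size
            continue
        if not classify(crop):       # fig. 16: size-alike area rejected as false
            continue
        results.append(recognize(crop))
    return results

# Toy usage with stand-in callables:
out = screen_and_recognize(
    [("crop_a", [(0, 0), (120, 0), (120, 16), (0, 16)])],
    aspect_ok=lambda c: True,
    classify=lambda c: True,
    recognize=lambda c: "disinfected on February 26",
)
```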
As can be seen from the above, the target area of the image contains text information and number information. During recognition, the text information and number information contained in the target area of the image to be identified can be obtained using the pre-established recognition sub-model, and the corresponding date information can be obtained from the number information.
Considering that the text information and number information in the image to be identified may be tilted at an angle that affects recognition, whether the orientation information of the delimited minimum circumscribed frame meets a preset requirement is detected; if it does not, the target area is rotationally corrected, and information recognition is then performed on the corrected target area.
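A sketch of the rotation correction, assuming the orientation requirement is that the minimum circumscribed box be level to within a small tolerance; OpenCV's minAreaRect angle convention varies across versions, so the normalization below is also an assumption:

```python
import cv2
import numpy as np

def correct_orientation(image, corners, tolerance_deg=1.0):
    """Rotate the image so the target area's minimum circumscribed box
    is level before recognition."""
    pts = np.asarray(corners, dtype=np.float32)
    (cx, cy), (w, h), angle = cv2.minAreaRect(pts)
    if w < h:                        # make the long side horizontal
        angle -= 90.0
    if abs(angle) <= tolerance_deg:  # orientation already meets the requirement
        return image
    m = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    return cv2.warpAffine(image, m, (image.shape[1], image.shape[0]))

# Toy usage on a blank frame with tilted corner points:
frame = np.zeros((64, 160), dtype=np.uint8)
fixed = correct_orientation(frame, [(10, 10), (130, 14), (129, 30), (9, 26)])
```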
It should be noted that the vehicle-mounted field image recognition method in this embodiment is implemented based on the recognition model constructed in the foregoing embodiment; for details related to the recognition model that are not elaborated here, reference may be made to the description of the foregoing embodiment.
Third embodiment
Based on the same inventive conception, an embodiment of the present application further provides a vehicle-mounted field image recognition device 220 corresponding to the vehicle-mounted field image recognition method; see fig. 17. Since the principle by which the device solves the problem is similar to that of the vehicle-mounted field image recognition method, the implementation of the device may refer to the implementation of the method, and repetition is omitted.
Referring to fig. 17, a schematic diagram of the vehicle-mounted field image recognition device 220 according to an embodiment of the present application is provided; the device includes: an acquisition module 221 and an identification module 222.
The acquiring module 221 is configured to acquire an image to be identified, where the image to be identified has a plurality of areas containing content information.
It will be appreciated that the acquisition module 221 may be configured to perform step S40 described above, and reference may be made to the details of implementation of the acquisition module 221 regarding step S40 described above.
The recognition module 222 is configured to process the image to be recognized by using a pre-established recognition model, determine a target area from the plurality of areas according to the position information of each area, and perform information recognition on the target area to obtain a recognition result;
The recognition model is obtained by training a plurality of training positive samples marked with the position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
It will be appreciated that the identification module 222 may be used to perform step S50 described above, and reference may be made to the details of the implementation of the identification module 222 as described above with respect to step S50.
In a possible implementation manner, the recognition model comprises a detection sub-model and a recognition sub-model;
the identification module 222 is configured to obtain an identification result by:
detecting based on the position information using a pre-established detection sub-model to determine a target region from the plurality of regions;
And carrying out information recognition on the target area of the image to be recognized by utilizing a pre-established recognition sub-model to obtain a recognition result, wherein the recognition sub-model is obtained by training a plurality of training positive samples and a plurality of expansion samples obtained by expanding according to the content information of the training positive samples.
In a possible implementation manner, the identification module 222 is configured to obtain an identification result including date information by:
And obtaining character information and digital information contained in the target area of the image to be identified by utilizing a pre-established identification sub-model, and obtaining corresponding date information according to the digital information.
In a possible implementation manner, the identification module 222 is configured to determine the target area by:
And determining the target area from the areas based on detection of rectangular frames and corner coordinates by using a pre-established detection sub-model.
In a possible implementation manner, the identification module 222 is configured to determine the target area by:
determining a preliminary target area from the plurality of areas according to the corner coordinates and the rectangular frames of each area;
Constructing a minimum circumscribed frame of the preliminary target area according to the corner coordinates and the side lines of the preliminary target area;
and detecting whether the size of the minimum circumscribed frame is within a preset range, and if so, determining the preliminary target area as a target area.
In a possible implementation manner, the vehicle-mounted domain image recognition device 220 further includes a correction module;
And the correction module is used for detecting whether the azimuth information of the minimum external frame meets the preset requirement, and if not, carrying out rotation correction on the target area.
In a possible implementation manner, the recognition model further comprises a classification sub-model, wherein the classification sub-model is obtained by training a classifier by utilizing a plurality of training positive samples and a plurality of training negative samples in advance;
the recognition module 222 is further configured to perform classification judgment on the image to be recognized by:
And classifying and identifying the determined target area by using the classifying sub-model to obtain a classifying result, wherein the classifying result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
In addition, based on the same inventive conception, an embodiment of the present application further provides a vehicle-mounted field image recognition model building device 210 corresponding to the vehicle-mounted field image recognition model building method; see fig. 18. Since the principle by which the device solves the problem is similar to that of the method, the implementation of the device may refer to the implementation of the method, and repetition is omitted.
Referring to fig. 18, a schematic diagram of the vehicle-mounted field image recognition model building device 210 provided by the present application is shown; the device includes: an obtaining module 211, a calibrating module 212 and a training module 213.
The obtaining module 211 is configured to obtain, for each of the obtained plurality of training positive samples, a plurality of regions containing content information in the training positive sample.
It is understood that the obtaining module 211 may be used to perform the step S10 described above, and reference may be made to the details of the implementation of the obtaining module 211 related to the step S10 described above.
The calibration module 212 is configured to identify a target area from the plurality of areas, and obtain location information of the target area.
It will be appreciated that the calibration module 212 may be used to perform step S20 described above, and reference may be made to the details of the implementation of the calibration module 212 described above with respect to step S20.
The training module 213 is configured to train the constructed neural network model by using a plurality of training positive samples marked with the location information of the target area, so as to obtain an identification model.
It will be appreciated that the training module 213 may be used to perform step S30 described above, and reference may be made to the details of the implementation of the training module 213 described above with respect to step S30.
In a possible implementation manner, the recognition model includes a detection sub-model and a recognition sub-model, and the training module 213 is configured to obtain the detection sub-model and the recognition sub-model by:
training the constructed first neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain a detection sub-model;
Sample expansion is carried out according to the content information in the target area of the training positive samples, so that a plurality of expansion samples are obtained;
And training the constructed second neural network model by using a plurality of training positive samples and a plurality of expansion samples to obtain the recognition sub-model.
In a possible implementation manner, the recognition model further includes a classification sub-model, and the training module 213 is further configured to obtain the classification sub-model by:
And training the constructed classifier by using the acquired training negative samples and the training positive samples marked with the position information of the target area to obtain a classification sub-model.
In a possible implementation manner, the content information in the target area includes text and numbers, and the training module 213 is configured to obtain the expansion samples by:
extracting numbers contained in a target area of each training positive sample to form a number set;
combining the numbers contained in the number set to obtain a plurality of number combinations;
and obtaining a plurality of expansion samples based on the plurality of digital combinations and the text.
In a possible implementation, the numbers represent dates, and the training module 213 is configured to obtain a number combination by:
aiming at each preset expansion date, acquiring the expansion numbers required by the expansion date;
randomly extracting numbers corresponding to the expansion numbers from the number set to form a number combination representing the expansion date.
In a possible implementation, the training module 213 is configured to construct the number set by:
extracting the numbers contained in the target area of each training positive sample;
acquiring an expansion set containing a plurality of numbers with different writing forms;
forming the numbers in the expansion set and the numbers in the plurality of training positive samples into a number set.
In a possible implementation manner, the training module 213 is configured to obtain the extended samples by:
For each training positive sample, cutting out the numbers in the target area of the training positive sample to obtain a corresponding expansion template;
and filling each number combination into any expansion template to obtain a corresponding expansion sample.
In a possible implementation manner, the position information includes corner coordinates of the target area and a rectangular frame for framing the target area;
The training module 213 is configured to obtain a detection sub-model by:
and training the constructed first neural network model by utilizing a plurality of training positive samples carrying angular point coordinates of the target area and a rectangular frame framing the target area to obtain a detection sub-model.
Fourth embodiment
Referring to fig. 19, an electronic device 300 is further provided in an embodiment of the application; the electronic device 300 may be the server 110 described above. The electronic device 300 includes: a processor 310, a memory 320 and a bus 330. The memory 320 stores machine-readable instructions executable by the processor 310. When the electronic device 300 is in operation, the processor 310 communicates with the memory 320 via the bus 330, and the machine-readable instructions, when executed by the processor 310, perform the following processes:
in a possible implementation, the instructions executed by the processor 310 include the following procedures:
aiming at each training positive sample in the acquired multiple training positive samples, acquiring multiple areas containing content information in the training positive samples;
identifying a target area from the plurality of areas, and obtaining position information of the target area;
and training the constructed neural network model by utilizing a plurality of training positive samples marked with the position information of the target area to obtain an identification model.
In another possible implementation, the instructions executed by the processor 310 include the following procedure:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be identified by using a pre-established identification model, determining a target area from the plurality of areas according to the position information of each area, and carrying out information identification on the target area to obtain an identification result;
The recognition model is obtained by training a plurality of training positive samples marked with the position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
With respect to the processes involved in the instructions executed by the processor 310 when the electronic device 300 is running, reference may be made to the relevant descriptions in the method embodiments described above, which are not described in detail here.
Fifth embodiment
An embodiment of the application further provides a computer-readable storage medium storing a computer program which, when run by a processor, executes the steps of the vehicle-mounted field image recognition model building method or the vehicle-mounted field image recognition method described above.
An embodiment of the present application provides a computer program product, which when run on a computer, causes the computer to execute the steps of the above-described vehicle-mounted field image recognition model building method or vehicle-mounted field image recognition method.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk, a hard disk, or the like, and the computer program on the storage medium, when executed, can perform the above-described vehicle-mounted area image recognition model building method or vehicle-mounted area image recognition method.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and apparatus described above may refer to the corresponding processes in the method embodiments and are not repeated here. In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; the division into modules is merely a logical functional division, and there may be other divisions in actual implementation: for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, devices or modules, and may be electrical, mechanical or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer readable storage medium executable by a processor. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, etc.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily appreciate variations or alternatives within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (13)

1. A vehicle-mounted field image recognition method, characterized by comprising:
acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
processing the image to be identified by using a pre-established recognition model, and determining a target area from the plurality of areas according to the position information of each area, wherein the recognition model comprises a detection sub-model, and the determining comprises:
Determining a preliminary target area from the plurality of areas by utilizing a detection sub-model established in advance based on detection of position information, wherein the position information comprises a rectangular frame and corner coordinates;
Constructing a minimum circumscribed frame of the preliminary target area according to the corner coordinates and the side lines of the preliminary target area;
Detecting whether the size of the minimum external frame is within a preset range, and if so, determining the preliminary target area as a target area;
performing information identification on the target area by using the recognition model to obtain an identification result;
The recognition model is obtained by training a plurality of training positive samples marked with the position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
2. The vehicle-mounted domain image recognition method according to claim 1, wherein the recognition model further includes a recognition sub-model;
wherein the performing information identification on the target area by using the recognition model to obtain an identification result further comprises:
And carrying out information recognition on the target area of the image to be recognized by utilizing a pre-established recognition sub-model to obtain a recognition result, wherein the recognition sub-model is obtained by training a plurality of training positive samples and a plurality of expansion samples obtained by expanding according to the content information of the training positive samples.
3. The method for identifying an image in a vehicle-mounted area according to claim 2, wherein the identifying the target area of the image to be identified by using a pre-established identification sub-model to obtain an identification result includes:
And obtaining character information and digital information contained in the target area of the image to be identified by utilizing the identification submodel, and obtaining corresponding date information according to the digital information.
4. The vehicle-mounted field image recognition method according to claim 1, wherein before the step of performing information recognition on the target area of the image to be recognized by using the recognition model to obtain a recognition result, the method further comprises:
and detecting whether the azimuth information of the minimum external frame meets a preset requirement, and if the azimuth information does not meet the preset requirement, carrying out rotation correction on the target area.
5. The vehicle-mounted field image recognition method according to claim 2, wherein the recognition model further comprises a classification sub-model, the classification sub-model is obtained by training a classifier in advance by using a plurality of training positive samples and a plurality of training negative samples;
wherein the processing the image to be identified by using the pre-established recognition model, determining a target area from the plurality of areas according to the position information of each area, and performing information identification on the target area by using the recognition model to obtain an identification result further comprises:
And classifying and identifying the determined target area by using the classifying sub-model to obtain a classifying result, wherein the classifying result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
6. An in-vehicle area image recognition apparatus, characterized by comprising:
The acquisition module is used for acquiring an image to be identified, wherein the image to be identified is provided with a plurality of areas containing content information;
the recognition module is used for processing the image to be recognized by using a pre-established recognition model and determining a target area from the plurality of areas according to the position information of each area, wherein the recognition model comprises a detection sub-model, and the determining comprises:
Determining a preliminary target area from the plurality of areas by utilizing a detection sub-model established in advance based on detection of position information, wherein the position information comprises a rectangular frame and corner coordinates;
Constructing a minimum circumscribed frame of the preliminary target area according to the corner coordinates and the side lines of the preliminary target area;
detecting whether the size of the minimum external frame is within a preset range, and if so, determining the preliminary target area as a target area; and
The identification model is used for carrying out information identification on the target area to obtain an identification result;
The recognition model is obtained by training a plurality of training positive samples marked with the position information of the target area in advance, and each training positive sample is provided with a plurality of areas containing content information.
7. The in-vehicle area image recognition apparatus according to claim 6, wherein the recognition model further includes a recognition sub-model;
The identification module is further configured to obtain the identification result by:
And carrying out information recognition on the target area of the image to be recognized by utilizing a pre-established recognition sub-model to obtain a recognition result, wherein the recognition sub-model is obtained by training a plurality of training positive samples and a plurality of expansion samples obtained by expanding according to the content information of the training positive samples.
8. The in-vehicle area image recognition apparatus according to claim 7, wherein the recognition module is configured to obtain the recognition result including the date information by:
And obtaining character information and digital information contained in the target area of the image to be identified by utilizing the identification submodel, and obtaining corresponding date information according to the digital information.
9. The in-vehicle area image recognition device according to claim 6, characterized in that the in-vehicle area image recognition device further comprises a correction module;
And the correction module is used for detecting whether the azimuth information of the minimum external frame meets the preset requirement, and if not, carrying out rotation correction on the target area.
10. The vehicle-mounted field image recognition device according to claim 7, wherein the recognition model further comprises a classification sub-model, the classification sub-model being obtained by training a classifier in advance using a plurality of training positive samples and a plurality of training negative samples;
the identification module is also used for classifying and judging the image to be identified by the following modes:
And classifying and identifying the determined target area by using the classifying sub-model to obtain a classifying result, wherein the classifying result represents whether the target area contains information corresponding to the content information in the target area of the training positive sample.
11. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the method of any one of claims 1 to 5.
12. A computer-readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any of claims 1 to 5.
13. A computer program product comprising computer programs/instructions which, when run on a computer, cause the computer to perform the method of any of claims 1-5.