CN112712124B - Multi-module cooperative object recognition system and method based on deep learning - Google Patents

Multi-module cooperative object recognition system and method based on deep learning

Info

Publication number
CN112712124B
Authority
CN
China
Prior art keywords
video
module
neural network
data
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011641665.2A
Other languages
Chinese (zh)
Other versions
CN112712124A (en)
Inventor
奚照明
杨哲
邵强
梁昭
蔡达
张辉
马琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Aubang Transportation Facilities Engineering Co ltd
Original Assignee
Shandong Aubang Transportation Facilities Engineering Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Aubang Transportation Facilities Engineering Co ltd filed Critical Shandong Aubang Transportation Facilities Engineering Co ltd
Priority to CN202011641665.2A
Publication of CN112712124A
Application granted
Publication of CN112712124B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of deep learning and provides a multi-module cooperative object recognition system and method based on deep learning. The system comprises a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module which are integrated into a whole and work cooperatively. Real-time recognition of objects is achieved through the cooperation of the multiple modules, which removes the current need to upload camera images to a server for recognition processing and avoids the time delay caused by network latency or limited network bandwidth.

Description

Multi-module cooperative object recognition system and method based on deep learning
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a multi-module cooperative object identification system and method based on deep learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Visual information processing builds an intelligent system that simulates human visual ability on the basis of externally perceived data and judges and identifies targets; object recognition is the foundation of visual information processing technology. With the popularization of computers and intelligent terminals and the rapid development of the internet, the rapidly expanding application fields of image and video big data pose challenges for object recognition technology, which is required to be highly efficient, high-performance and even intelligent.
In view of the requirements of balancing efficiency with performance and of intelligence, deep learning is rapidly becoming a research hotspot in computer vision by virtue of its strong modeling and data characterization capabilities. Through function mappings from low-level signals to high-level features, deep learning builds a hierarchical model of the implicit relations within the training data to simulate the visual cognition and reasoning process of the human brain, so that the learned features have stronger generalization and expression capabilities.
However, the inventors have found that, as the quality of video images increases, the existing 5G network infrastructure does not yet provide complete coverage and the bandwidth of a 4G network cannot support real-time transmission of high-quality video images; when a computer must first receive the data returned from the field and then perform further computer vision processing for object recognition, the real-time performance of object recognition can no longer be guaranteed.
Disclosure of Invention
In order to solve at least one technical problem in the background art, the invention provides a multi-module cooperative object recognition system and method based on deep learning, which realize real-time recognition of an object through multi-module cooperation.
In order to achieve the purpose, the invention adopts the following technical scheme:
A first aspect of the invention provides a multi-module cooperative object recognition system based on deep learning.
A multi-module cooperative object recognition system based on deep learning comprises a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module which are integrated into a whole and cooperatively work;
the video input module is used for receiving real-time video data and storing the real-time video data into a specified memory area;
the video processing subsystem module is used for calling original video data of the memory area and decomposing the original video data into basic video data and extended video data;
the intelligent video engine module is used for converting image frame data in the current extended video data into frame data in an image format matched with the neural network model;
the neural network acceleration engine module is used for acquiring the format-converted frame data and obtaining, through a neural network model, the category of the object and the four-point coordinate position information of its contour;
the video graphics subsystem module is used for acquiring basic video data and then drawing a contour frame for identifying the object in the basic video data based on the category of the object and the position information of four-point coordinates of the contour;
and the video output module is used for outputting the video image data with the outline frame of the identified object.
As an embodiment, the base video data maintains the resolution of the original video data.
The technical scheme has the advantages that consistency with the original data can be guaranteed, and the object contour frame identified later can be mapped back onto the original video data more accurately.
In one embodiment, the resolution of the extended video data is matched to a neural network model within the neural network acceleration engine module.
The technical scheme has the advantage that the extended video data can be matched with the neural network model and provides a data basis for it.
As an implementation mode, the video input module, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module are all started after receiving a starting command and all perform initialization operation at the same time.
The technical scheme has the advantage that all the modules are initialized, which guarantees the accuracy of subsequent data processing.
As an embodiment, during the initialization operation, the initialization of the neural network acceleration engine module includes loading a trained neural network model in a specific format.
The technical scheme has the advantage that, before loading, the neural network model trained on a computer is converted in advance into a specific format that can be loaded by the neural network acceleration engine module, which improves the efficiency of video image data processing.
As an embodiment, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module perform multi-thread parallel operation; the VitoVo thread operation is carried out among the video processing subsystem module, the neural network acceleration engine module, the video graphics subsystem module and the video output module; and a detect thread operation is carried out between the intelligent video engine module and the neural network acceleration engine module.
The technical scheme has the advantages that the processing efficiency of video data can be improved and the real-time property of object identification can be guaranteed by parallel thread operation.
In a VitoVo thread operation, frame data is extracted from the extended video data and put into a frame data linked list; in a detect thread operation, frame data is taken out of the frame data linked list in sequence, whether the frame data is to be recognized is determined, a flag bit is defined according to that determination, and the flag bit and the frame number are stored into a flag bit linked list.
In the VitoVo thread operation, the object recognition result and frame number of the frame data are extracted in sequence from a recognition result linked list; the object recognition result and frame number are obtained by the neural network acceleration engine module in the detect thread and stored into the recognition result linked list.
The technical scheme has the advantages that the flag bit links the corresponding threads, so that the processing order of the video images in those threads is guaranteed and omission of video images is avoided.
The second aspect of the invention provides a multi-module cooperative object identification method based on deep learning.
A recognition method of a multi-module cooperative object recognition system based on deep learning comprises the following steps:
receiving a starting command and initializing a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module;
receiving real-time video data by using a video input module and storing the real-time video data into a specified memory area;
calling original video data in a memory area by using a video processing subsystem module and decomposing the original video data into basic video data and extended video data;
converting image frame data in the current extended video data into frame data in an image format matched with the neural network model by using the intelligent video engine module;
acquiring the format-converted frame data by using the neural network acceleration engine module, and obtaining, through the neural network model, the category of the object and the four-point coordinate position information of its contour;
acquiring basic video data by using a video graphics subsystem module, and drawing a contour frame for identifying an object in the basic video data based on the category of the object and the position information of four-point coordinates of the contour;
and outputting the video image data with the outline frame of the identified object by using a video output module.
As one embodiment, the recognition method of the deep learning based multi-module cooperative object recognition system further comprises a detect (recognition) thread and a VitoVo thread, and the two threads are executed in parallel.
Compared with the prior art, the invention has the beneficial effects that:
the invention realizes real-time recognition of objects through multi-module cooperation, solves the problem that camera images currently need to be uploaded to a server for recognition processing, and avoids the time delay caused by network latency or limited network bandwidth;
the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module operate as parallel threads, and the intelligent video engine module converts the input image format into the image format required by the model, which reduces CPU (central processing unit) utilization, shortens the image processing time, and guarantees the real-time performance of object recognition in the image.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and together with the description serve to explain the invention rather than to limit it.
Fig. 1 is a schematic structural diagram of a deep learning-based multi-module cooperative object recognition system according to an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Interpretation of terms:
VitoVo thread: the thread running from the video input to the video output.
detect thread: the thread performing object recognition and detection.
Example one
Referring to fig. 1, the deep learning-based multi-module cooperative object recognition system of the present embodiment includes a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module, and a video output module, which are integrated into a whole and cooperate with each other.
In specific implementation, in order to ensure the accuracy of the post-data processing, the video input module, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module are all started after receiving a start command and are all initialized.
During an initialization operation, the initialization of the neural network acceleration engine module includes loading a trained neural network model in a particular format. Before loading, format conversion needs to be carried out on a neural network model trained in a computer in advance, and the neural network model is converted into a specific format which can be loaded by a neural network acceleration engine module, so that the efficiency of video image data processing is improved.
As a specific embodiment, the training process of the neural network model is as follows:
collecting pictures containing the object to be detected in different scenes, unifying the pictures, and labeling the category of the object to be detected to form a sample set; the sample set is divided into a training set and a test set;
selecting one sample (Ai, Bi) of the training set, where Ai is the input data and Bi is the label;
sending the input data into the network and calculating the actual output Y of the network; at this point the weights in the network are random;
calculating the error D = Bi - Y (the difference between the label Bi and the actual output Y);
adjusting the weight matrix W according to the error D;
the above process is repeated for each sample until the error does not exceed the specified range over the entire training set.
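As a non-limiting illustration, the training procedure above is the classic supervised error-correction loop. The following minimal Python sketch assumes a toy single-layer network with an illustrative learning rate and tolerance; the actual embodiment trains a yolov3 model under the Caffe framework rather than this hand-rolled update.

import numpy as np

rng = np.random.default_rng(0)

# Toy training set: Ai is the input data, Bi is the one-hot label (illustrative).
train_set = [(rng.normal(size=8), np.eye(3)[i % 3]) for i in range(30)]

W = rng.normal(scale=0.1, size=(3, 8))   # weights start out random
lr, tolerance = 0.05, 0.1                # illustrative values, not from the patent

for epoch in range(1000):
    worst_error = 0.0
    for Ai, Bi in train_set:             # select one sample (Ai, Bi)
        Y = W @ Ai                       # actual output Y of the network
        D = Bi - Y                       # error D = Bi - Y
        W += lr * np.outer(D, Ai)        # adjust weight matrix W according to D
        worst_error = max(worst_error, float(np.abs(D).max()))
    if worst_error <= tolerance:         # stop once the whole training set
        break                            # stays within the specified range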
In the present embodiment, the scenes include, but are not limited to, daytime, night, rain, snow, and fog.
In this embodiment, the yolov3 neural network model is used and is trained under the Caffe framework; Caffe is the deep learning framework supported by the front-end chip, and the trained model is converted into a format that the neural network acceleration engine module can support.
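As a non-limiting illustration, one way to sanity-check such a Caffe-format model on a host machine, before handing it to the accelerator vendor's conversion toolchain, is to load it with OpenCV's dnn module; the file names and the 416x416 input size below are assumptions, not artifacts named in the patent.

import cv2
import numpy as np

# Hypothetical file names for a Caffe-format yolov3 model.
net = cv2.dnn.readNetFromCaffe("yolov3.prototxt", "yolov3.caffemodel")
blob = cv2.dnn.blobFromImage(np.zeros((416, 416, 3), dtype=np.uint8),
                             scalefactor=1 / 255.0, size=(416, 416), swapRB=True)
net.setInput(blob)
print("forward pass output shape:", net.forward().shape)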
It should be noted here that the neural network model may also be other existing network structures, and those skilled in the art can specifically select the neural network model according to actual situations, and the details are not described here.
Specifically, the video input module is used for receiving real-time video data and storing the real-time video data into a specified memory area.
For example: the video input module receives real-time video data shot by a camera through an MIPI (Mobile Industry Processor Interface), processes the received original video image data and realizes the acquisition of the video data; the video input module stores the received data into a designated memory area.
Specifically, the video processing subsystem module is configured to retrieve original video data of the memory area and decompose the original video data into basic video data and extended video data.
Wherein the base video data maintains the resolution of the original video data. The resolution of the extended video data is matched to a neural network model within the neural network acceleration engine module.
In this way, consistency with the original data is guaranteed, so the object contour frame identified later can be mapped back onto the original video data more accurately; meanwhile, the resolution of the extended video data is matched to the neural network model and provides the data basis for it.
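A minimal sketch of this decomposition step follows, assuming the frame is already available as a NumPy array and assuming a 416x416 model input size (the usual yolov3 input, not a value stated in the patent).

import cv2

def split_frame(frame, model_size=(416, 416)):
    base = frame                                          # base data: original resolution kept
    extended = cv2.resize(frame, model_size,              # extended data: resolution matched
                          interpolation=cv2.INTER_LINEAR) # to the neural network model
    return base, extended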
Specifically, the intelligent video engine module is used for converting image frame data in the current extended video data into frame data in an image format matched with the neural network model.
In this embodiment, the image frame data in the current extended video data is in YUV format, and the image format matched with the neural network model is RGB.
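In the patented system this conversion is performed by the intelligent video engine module itself; purely as an illustration, the same YUV-to-RGB step can be approximated in software with OpenCV, here assuming a flat uint8 NV12 buffer (the exact YUV variant delivered by the camera is an assumption).

import cv2

def yuv_to_rgb(nv12_buffer, width, height):
    # An NV12 frame occupies height * 3 / 2 rows of `width` bytes.
    yuv = nv12_buffer.reshape((height * 3 // 2, width))
    return cv2.cvtColor(yuv, cv2.COLOR_YUV2RGB_NV12)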
Specifically, the neural network acceleration engine module is used for acquiring the format-converted frame data and obtaining, through neural network model recognition, the category of the object and the four-point coordinate position information of its contour.
Specifically, the video graphics subsystem module is used for acquiring basic video data, and then drawing a contour frame for identifying the object in the basic video data based on the category of the object and the position information of four-point coordinates of the contour.
Specifically, the video output module is used for outputting video image data with a contour frame of the identified object.
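As a non-limiting illustration of the drawing step performed by the video graphics subsystem module, the following sketch draws the four-point contour frame and the category onto the base (full-resolution) frame; the result layout (a category string plus four (x, y) points) is an assumption consistent with the description above.

import cv2
import numpy as np

def draw_result(base_frame, category, contour_points):
    # Draw the closed four-point contour frame of the recognized object.
    pts = np.array(contour_points, dtype=np.int32).reshape((-1, 1, 2))
    cv2.polylines(base_frame, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
    # Label the contour frame with the object category.
    x, y = contour_points[0]
    cv2.putText(base_frame, category, (int(x), max(int(y) - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return base_frame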
In specific implementation, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module perform multi-thread parallel operation; the VitoVo thread operation is carried out among the video processing subsystem module, the neural network acceleration engine module, the video graphics subsystem module and the video output module; and a detect thread operation is carried out between the intelligent video engine module and the neural network acceleration engine module. The parallel thread operation can improve the processing efficiency of video data and ensure the real-time property of object identification.
In the VitoVo thread:
after the video processing subsystem module decomposes the video data collected by the video input module into basic video data and extended video data, extracting frame data from the extended video frame data, and putting the frame data into a frame data linked list;
sequentially taking out the flag bits and frame numbers of the frame data from the flag bit linked list, where the flag bits of the frame data were stored into the flag bit linked list in the detect thread;
the flag bit indicates whether the frame data is used for object recognition; because the object recognition frame rate of the neural network acceleration engine module is lower than the sampling frame rate of the video input module, this embodiment adopts a frame-extraction recognition mode and uses a flag bit to mark whether a frame is used for object recognition;
sequentially extracting the object recognition result and frame number of the frame data from the recognition result linked list, where the object recognition result and frame number were obtained by the neural network acceleration engine module in the detect thread and stored into the recognition result linked list; the recognition result comprises the object category and the four-point coordinate position information of the contour;
the video graphics subsystem module acquires basic video data, and draws a contour frame for identifying an object in the basic video data according to contour four-point coordinate position information acquired by the neural network acceleration engine module;
the video output module outputs video image data with the outline frame of the identified object.
In the detect (recognition) thread:
frame data is taken out of the frame data linked list in sequence, whether the frame data is to be recognized is determined, a flag bit is defined according to that determination, and the flag bit and frame number are stored into the flag bit linked list;
the intelligent video engine module converts the frame data from the input image format into the image format required by the model;
the neural network acceleration engine module acquires the format-converted frame data and obtains, through the neural network model, the object category and the four-point coordinate position information of the contour; the object category, the contour four-point coordinate position information and the frame number are stored into the recognition result linked list.
In this embodiment, the flag bit links the detect thread with the VitoVo thread, which guarantees the processing order of the video images in the corresponding threads and avoids omission of video images.
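The two-thread cooperation described above can be sketched as follows, using Python queues in place of the linked lists, a simple every-Nth-frame rule for the flag bit, and stub callables in place of the hardware modules; all names and the frame-skipping rule are illustrative assumptions rather than the patented firmware.

import queue
import threading

frame_list = queue.Queue()    # frame data linked list (extended frames)
flag_list = queue.Queue()     # flag bit linked list (frame number, flag)
result_list = queue.Queue()   # recognition result linked list

def detect_thread(recognize, every_n=3):
    # Take frames in order, decide whether each one is recognized (flag bit),
    # and store recognition results; recognition runs slower than sampling.
    while True:
        frame_no, extended = frame_list.get()
        recognized = (frame_no % every_n == 0)       # frame-extraction rule
        flag_list.put((frame_no, recognized))
        if recognized:
            category, contour = recognize(extended)  # format conversion + NN inference
            result_list.put((frame_no, category, contour))

def vitovo_thread(capture, split, draw, output):
    # Decompose each captured frame, queue the extended copy for detection,
    # and overlay the latest available result on the base copy before output.
    last = None
    frame_no = 0
    while True:
        base, extended = split(capture())
        frame_list.put((frame_no, extended))
        _, recognized = flag_list.get()              # keeps frames in order
        if recognized:
            last = result_list.get()
        if last is not None:
            _, category, contour = last
            base = draw(base, category, contour)
        output(base)
        frame_no += 1

# The two threads would be started with, for example:
# threading.Thread(target=detect_thread, args=(recognize_fn,), daemon=True).start()
# threading.Thread(target=vitovo_thread, args=(cap, split, draw, out), daemon=True).start()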
In this embodiment, real-time recognition of objects is realized through the cooperation of multiple modules, which solves the problem that camera images currently need to be uploaded to a server for recognition processing and avoids the time delay caused by network latency or limited network bandwidth.
The recognition method of the deep learning-based multi-module collaborative object recognition system comprises the following steps:
s101: receiving a starting command and initializing a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module;
s102: receiving real-time video data by using a video input module and storing the real-time video data into a specified memory area;
s103: calling original video data in a memory area by using a video processing subsystem module and decomposing the original video data into basic video data and extended video data;
s104: converting image frame data in the current extended video data into frame data in an image format matched with the neural network model by using the intelligent video engine module;
s105: acquiring the format-converted frame data by using the neural network acceleration engine module, and obtaining, through the neural network model, the category of the object and the four-point coordinate position information of its contour;
s106: acquiring basic video data by using a video graphics subsystem module, and drawing a contour frame for identifying an object in the basic video data based on the category of the object and the position information of four-point coordinates of the contour;
s107: and outputting the video image data with the outline frame of the identified object by using a video output module.
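For clarity, steps S101 to S107 can be summarized in one compact sequential sketch, with each module reduced to an object exposing hypothetical methods (init, read, split, convert, infer, draw and write are illustrative names, not the actual interfaces); in the patented system these steps run on cooperating hardware modules and parallel threads rather than in a single Python loop.

def recognize_stream(modules):
    vi, vpss, ive, nnie, vgs, vo = modules
    for m in modules:                             # S101: initialize every module
        m.init()
    while True:
        raw = vi.read()                           # S102: receive real-time video data
        base, extended = vpss.split(raw)          # S103: base + extended video data
        rgb = ive.convert(extended)               # S104: convert to the model's image format
        category, contour = nnie.infer(rgb)       # S105: category + contour four-point coordinates
        framed = vgs.draw(base, category, contour)  # S106: draw the contour frame
        vo.write(framed)                          # S107: output the annotated video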
In some embodiments, the recognition method of the deep learning based multi-module cooperative object recognition system further comprises a detect (recognition) thread and a VitoVo thread, which are executed in parallel.
In this embodiment, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module operate as parallel threads, and the intelligent video engine module converts the input image format into the image format required by the model, which reduces CPU (central processing unit) utilization, shortens the image processing time, and guarantees the real-time performance of object recognition in the image.
In addition, in this embodiment, the input image may be either a still image or a moving image (picture or video), so the system can perform not only object recognition in moving images but also still-image recognition.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A multi-module cooperative object recognition system based on deep learning is characterized by comprising a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module which are integrated into a whole and cooperate with one another;
the video input module is used for receiving real-time video data and storing the real-time video data into a specified memory area; the video input module receives real-time video data shot by a camera through an MIPI (Mobile industry processor interface), processes the received original video image data and realizes the acquisition of the video data;
the video processing subsystem module is used for calling original video data of the memory area and decomposing the original video data into basic video data and extended video data; wherein the base video data maintains the resolution of the original video data; the resolution of the extended video data is matched with a neural network model in a neural network acceleration engine module;
the intelligent video engine module is used for converting image frame data in the current extended video data into frame data in an image format matched with the neural network model;
the neural network acceleration engine module is used for acquiring frame data after format conversion and obtaining, through a neural network model, the category of the object and the four-point coordinate position information of its contour; the neural network acceleration engine module carries out format conversion on a neural network model trained in a computer in an initialization operation process and converts the neural network model into a specific format which can be loaded by the neural network acceleration engine module;
the video graphics subsystem module is used for acquiring basic video data and then drawing a contour frame for identifying the object in the basic video data based on the category of the object and the position information of four-point coordinates of the contour;
and the video output module is used for outputting the video image data with the outline frame of the identified object.
2. The deep learning-based multi-module cooperative object recognition system of claim 1, wherein the video input module, the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module are all started after receiving a start command and all perform initialization operations at the same time.
3. The deep learning based multi-module cooperative object recognition system of claim 1, wherein the video processing subsystem module, the intelligent video engine module, the neural network acceleration engine module, the video graphics subsystem module and the video output module perform multi-thread parallel operations therebetween; the VitoVo thread operation is carried out among the video processing subsystem module, the neural network acceleration engine module, the video graphics subsystem module and the video output module; and a detect thread operation is carried out between the intelligent video engine module and the neural network acceleration engine module.
4. The deep learning based multi-module cooperative object recognition system of claim 3, wherein in a VitoVo thread operation, frame data is extracted from the extended video data and placed into a frame data linked list; in a detect thread operation, frame data is taken out of the frame data linked list in sequence, whether the frame data is to be recognized is determined, a flag bit is defined according to that determination, and the flag bit and the frame number are stored into a flag bit linked list.
5. The deep learning-based multi-module cooperative object recognition system as claimed in claim 3, wherein in the VitoVo thread operation, the object recognition result and frame number of the frame data are sequentially extracted from the recognition result linked list, the object recognition result and frame number having been obtained by the neural network acceleration engine module in the detect thread and stored in the recognition result linked list.
6. A recognition method of the deep learning based multi-module cooperative object recognition system according to any one of claims 1 to 5, comprising:
receiving a starting command and initializing a video input module, a video processing subsystem module, an intelligent video engine module, a neural network acceleration engine module, a video graphics subsystem module and a video output module;
receiving real-time video data by using a video input module and storing the real-time video data into a specified memory area; real-time video data shot by a camera is received through an MIPI (Mobile industry processor interface), and the received original video image data is processed to realize the collection of the video data;
calling original video data in a memory area by using a video processing subsystem module and decomposing the original video data into basic video data and extended video data; wherein the base video data maintains the resolution of the original video data; the resolution of the extended video data is matched with a neural network model in a neural network acceleration engine module;
converting image frame data in the current extended video data into frame data in an image format matched with the neural network model by using an intelligent video engine model;
acquiring frame data after format conversion by using a neural network acceleration engine module, and obtaining, through a neural network model, the category of the object and the four-point coordinate position information of its contour; in the initialization operation process, format conversion is carried out on a neural network model trained in a computer, and the neural network model is converted into a specific format which can be loaded by the neural network acceleration engine module; acquiring basic video data by using a video graphics subsystem module, and drawing a contour frame identifying the object in the basic video data based on the category of the object and the four-point coordinate position information of the contour;
and outputting the video image data with the outline frame of the identified object by using a video output module.
7. The recognition method of claim 6, further comprising a detect (recognition) thread and a VitoVo thread, the two threads executing in parallel.
CN202011641665.2A 2020-12-31 2020-12-31 Multi-module cooperative object recognition system and method based on deep learning Active CN112712124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011641665.2A CN112712124B (en) 2020-12-31 2020-12-31 Multi-module cooperative object recognition system and method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011641665.2A CN112712124B (en) 2020-12-31 2020-12-31 Multi-module cooperative object recognition system and method based on deep learning

Publications (2)

Publication Number Publication Date
CN112712124A CN112712124A (en) 2021-04-27
CN112712124B true CN112712124B (en) 2021-12-10

Family

ID=75548036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011641665.2A Active CN112712124B (en) 2020-12-31 2020-12-31 Multi-module cooperative object recognition system and method based on deep learning

Country Status (1)

Country Link
CN (1) CN112712124B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009058978A1 (en) * 2007-10-30 2009-05-07 Schlumberger Technology Corporation Chromatography data processing method and system
CN106355573A (en) * 2016-08-24 2017-01-25 北京小米移动软件有限公司 Target object positioning method and device in pictures
CN106778773A (en) * 2016-11-23 2017-05-31 北京小米移动软件有限公司 The localization method and device of object in picture
CN107368886A (en) * 2017-02-23 2017-11-21 奥瞳系统科技有限公司 Based on the nerve network system for reusing small-scale convolutional neural networks module
CN108038544A (en) * 2017-12-04 2018-05-15 华南师范大学 Neutral net deep learning method and system based on big data and deep learning
CN108319968A (en) * 2017-12-27 2018-07-24 中国农业大学 A kind of recognition methods of fruits and vegetables image classification and system based on Model Fusion
CN109165662A (en) * 2018-07-03 2019-01-08 哈尔滨工业大学(威海) Alimentary canal inner wall lesion type intelligent identification Method and device based on deep learning
CN109376637A (en) * 2018-10-15 2019-02-22 齐鲁工业大学 Passenger number statistical system based on video monitoring image processing
CN110021014A (en) * 2019-03-29 2019-07-16 无锡祥生医疗科技股份有限公司 Nerve fiber recognition methods, system and storage medium neural network based
CN110287875A (en) * 2019-06-25 2019-09-27 腾讯科技(深圳)有限公司 Detection method, device, electronic equipment and the storage medium of video object
CN111339990A (en) * 2020-03-13 2020-06-26 乐鑫信息科技(上海)股份有限公司 Face recognition system and method based on dynamic update of face features
CN111553213A (en) * 2020-04-17 2020-08-18 大连理工大学 Real-time distributed identity-aware pedestrian attribute identification method in mobile edge cloud
CN111709306A (en) * 2020-05-22 2020-09-25 江南大学 Double-current network behavior identification method based on multilevel space-time feature fusion enhancement

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101425137A (en) * 2008-11-10 2009-05-06 北方工业大学 Face Image Fusion Method Based on Laplacian Pyramid
CN103955691A (en) * 2014-05-08 2014-07-30 中南大学 Multi-resolution LBP textural feature extracting method
CN105551036B (en) * 2015-12-10 2019-10-08 中国科学院深圳先进技术研究院 A kind of training method and device of deep learning network
US10074038B2 (en) * 2016-11-23 2018-09-11 General Electric Company Deep learning medical systems and methods for image reconstruction and quality evaluation
CN107016344A (en) * 2017-03-08 2017-08-04 上海极链网络科技有限公司 Brand identity system and its implementation in video
WO2018230294A1 (en) * 2017-06-15 2018-12-20 シャープ株式会社 Video processing device, display device, video processing method, and control program
CN108898086B (en) * 2018-06-20 2023-05-26 腾讯科技(深圳)有限公司 Video image processing method and device, computer readable medium and electronic equipment
CN109726751B (en) * 2018-12-21 2020-11-27 北京工业大学 Method for recognizing electroencephalogram based on deep convolutional neural network
CN110619747A (en) * 2019-09-27 2019-12-27 山东奥邦交通设施工程有限公司 Intelligent monitoring method and system for highway road

Also Published As

Publication number Publication date
CN112712124A (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN111784685B (en) Power transmission line defect image identification method based on cloud edge cooperative detection
CN109711407B (en) License plate recognition method and related device
CN112183482A (en) Dangerous driving behavior recognition method, device and system and readable storage medium
CN110929795B (en) Method for quickly identifying and positioning welding spot of high-speed wire welding machine
CN110689539A (en) Workpiece surface defect detection method based on deep learning
CN110807775A (en) Traditional Chinese medicine tongue image segmentation device and method based on artificial intelligence and storage medium
CN106845434B (en) Image type machine room water leakage monitoring method based on support vector machine
CN109871789A (en) Vehicle checking method under a kind of complex environment based on lightweight neural network
CN113052295B (en) Training method of neural network, object detection method, device and equipment
CN111414916A (en) Method and device for extracting and generating text content in image and readable storage medium
CN112950642A (en) Point cloud instance segmentation model training method and device, electronic equipment and medium
CN116524195B (en) Semantic segmentation method, semantic segmentation device, electronic equipment and storage medium
CN110599453A (en) Panel defect detection method and device based on image fusion and equipment terminal
CN108900895B (en) Method and device for shielding target area of video stream
CN112686314B (en) Target detection method and device based on long-distance shooting scene and storage medium
CN112560779B (en) Method and equipment for identifying overflow of feeding port and feeding control system of stirring station
CN108877030A (en) Image processing method, device, terminal and computer readable storage medium
CN112712124B (en) Multi-module cooperative object recognition system and method based on deep learning
CN115063348A (en) Part surface defect detection method, device, equipment and medium
CN110427920B (en) Real-time pedestrian analysis method oriented to monitoring environment
CN114037646A (en) Intelligent image detection method, system, readable medium and equipment based on Internet of things
CN112016515A (en) File cabinet vacancy detection method and device
CN115424095B (en) Quality analysis method and device based on waste materials
CN114419451B (en) Method and device for identifying inside and outside of elevator, electronic equipment and storage medium
CN112802338B (en) Highway real-time early warning method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Xi Zhaoming

Inventor after: Yang Zhe

Inventor after: Shao Qiang

Inventor after: Liang Zhao

Inventor after: Cai Da

Inventor after: Zhang Hui

Inventor after: Ma Lin

Inventor before: Xi Zhaoming

Inventor before: Yang Zhe

Inventor before: Shao Qiang

Inventor before: Liang Zhao

Inventor before: Cai Da

Inventor before: Zhang Hui

Inventor before: Ma Lin