CN110704711A - Object automatic identification system for lifetime learning - Google Patents

Info

Publication number
CN110704711A
Authority
CN
China
Prior art keywords
pictures
data
data set
lifetime
crawler
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910856599.1A
Other languages
Chinese (zh)
Inventor
仲国强
李涛
刘文雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN201910856599.1A
Publication of CN110704711A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an automatic object identification system oriented to lifetime learning, which proceeds in four steps: acquiring a data set; identifying objects; filtering with picture verification codes; and voice recognition. The invention has the advantages of lifetime learning, automatic object identification, and high accuracy.

Description

Object automatic identification system for lifetime learning
Technical Field
The invention belongs to the technical field of automatic object identification, and relates to an automatic object identification system for lifetime learning.
Background
As an important branch of artificial intelligence, object automatic identification has been applied to many fields such as security protection, unmanned driving and the like, and has great development potential. In the object recognition task, the size of the data set and the capabilities of the model determine the overall system performance. Under the condition of enough data quantity, the deep learning method and the machine learning method can achieve higher accuracy.
Disclosure of Invention
The invention aims to provide an object automatic identification system for lifetime learning, and has the advantages of lifetime learning, automatic object identification and high accuracy.
The technical scheme adopted by the invention proceeds according to the following steps:
step 1: acquiring a data set;
step 2: identifying an object;
step 3: picture verification code;
step 4: voice recognition.
Further, in step 1 the data set is acquired by collecting pictures with a focused crawler. The focused crawler can acquire data without interruption, and because the crawled data is diverse, the deep learning model used in the invention can learn any kind of picture.
The focused crawler works as follows:
1) start running from the initial URL;
2) fetch a webpage;
3) extract new URLs and put them into the URL queue;
4) evaluate the webpage and the URLs according to an analysis algorithm;
5) if the stopping condition is met, end; otherwise, go to the next step;
6) select a URL according to the search strategy and jump to step 3.
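The crawler loop described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the names `focused_crawl`, `fetch`, `extract_links`, `is_relevant` and `max_pages` are hypothetical, the search strategy is assumed to be a plain first-in-first-out queue, and the stopping condition is assumed to be a page budget. Pages judged irrelevant are not expanded, which is one simple way to keep the crawl focused.

```python
from collections import deque


def focused_crawl(seed_url, fetch, extract_links, is_relevant, max_pages=100):
    """Crawl outward from seed_url and return the URLs of relevant pages.

    fetch(url) returns a page, extract_links(page) returns its outgoing URLs,
    and is_relevant(url, page) is the analysis step (all caller-supplied).
    """
    queue = deque([seed_url])   # URL queue
    seen = {seed_url}           # avoid queueing the same URL twice
    kept = []
    fetched = 0
    # Stopping condition (assumed): queue exhausted or page budget used up.
    while queue and fetched < max_pages:
        url = queue.popleft()   # search strategy (assumed): FIFO
        page = fetch(url)       # acquire the webpage
        fetched += 1
        if not is_relevant(url, page):
            continue            # off-topic pages are not expanded
        kept.append(url)
        for link in extract_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept
```

Passing `fetch` and `extract_links` as parameters keeps the sketch independent of any particular HTTP or HTML library, so it can be exercised on an in-memory link graph before being pointed at real pages.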
Further, step 2: identifying an object;
The deep learning model employed is a deep residual network (ResNet). Increasing the depth of a network generally improves its performance, which suggests that more layers are always better. This is not the case: once the number of layers reaches a certain point, performance saturates, and adding further depth degrades it. The degradation is not caused by overfitting, because both the training accuracy and the test accuracy decrease, which indicates that very deep networks become difficult to train. ResNet addresses this degradation problem. The invention uses ResNet-50, which has a 50-layer network structure.
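The idea behind a residual network is that each block learns a correction F(x) that is added to its input x through a shortcut connection. The toy sketch below illustrates this with small fully connected layers in plain Python. It is an assumption-laden simplification: the real ResNet-50 uses convolutional bottleneck blocks with batch normalization, and the helper names `relu`, `linear` and `residual_block` are hypothetical.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a weight matrix (a list of rows) by the vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(v, w1, w2):
    """Compute relu(F(v) + v), where F(v) = w2 * relu(w1 * v).

    The '+ v' term is the shortcut connection. If the learned weights are
    all zero, F(v) vanishes and the block passes relu(v) through unchanged,
    so stacking more blocks need not make the network worse.
    """
    f = linear(w2, relu(linear(w1, v)))
    return relu([a + b for a, b in zip(f, v)])
```

With all-zero weights the block reduces to the identity on non-negative inputs, which is the property that makes very deep stacks easier to train than plain networks of the same depth.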
Further, step 3: a picture verification code;
The data crawled by the web crawler is filtered by means of a picture verification code. When pictures of a specific category a are crawled, there is no guarantee that every picture obtained belongs to category a. Pictures that do not belong to category a are called wrong pictures; they introduce noise during model training and reduce the performance of the model, so a way to eliminate them is needed. The invention filters data with a picture verification code because a user clicking on a verification code is itself a form of labeling pictures, and the invention completes the labeling through the user's operation. Specifically, taking 'camera' as an example, 6 pictures are selected each time as the verification code: some are taken from the 'non-camera' categories in the auxiliary training set, and the rest are sampled from the crawler data set. In the user operation stage, if the user correctly selects the pictures from the auxiliary training set, the verification passes; if the user also selects a crawled picture, that picture is considered with high probability not to belong to the camera class and can be deleted directly from the crawler data set. If the user does not select all of the auxiliary training set pictures (i.e. all of the known 'non-camera' pictures), the verification code is refreshed.
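The verification-code filtering rule can be sketched as two small Python functions. This is an illustrative sketch under stated assumptions: the function names `build_challenge` and `process_selection` are hypothetical, and the 3/3 split between auxiliary and crawled pictures is an assumption, since the text only says that part of the 6 pictures comes from each source.

```python
import random


def build_challenge(aux_non_target, crawled, n_total=6, n_aux=3, rng=random):
    """Mix known non-target pictures with unverified crawled pictures.

    Returns the shuffled challenge pictures and the set of auxiliary
    (known non-target) pictures the user is expected to select.
    """
    aux = rng.sample(aux_non_target, n_aux)
    unverified = rng.sample(crawled, n_total - n_aux)
    pictures = aux + unverified
    rng.shuffle(pictures)
    return pictures, set(aux)


def process_selection(selected, aux, crawled_dataset):
    """Apply the user's answer to 'select all non-target pictures'.

    Returns True if verification passes and False if the challenge should be
    refreshed. On a pass, any crawled pictures the user also selected are
    removed from crawled_dataset (a set, modified in place).
    """
    selected = set(selected)
    if not aux <= selected:
        return False  # a known non-target picture was missed: refresh
    crawled_dataset.difference_update(selected - aux)
    return True
```

Because the auxiliary pictures have known labels, they act as a check on the user: only answers that get all of them right are trusted enough to delete crawled pictures.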
Through the picture verification code, the invention uses the user's operations to filter data and delete wrong pictures from the crawler data set. Because data is acquired and labeled by users in real time, the training set grows over time, which improves the accuracy of the model, greatly improves its generalization to new samples, and thereby achieves the purpose of lifelong learning.
Further, step 4: performing voice recognition;
With voice recognition filtering, the user rejects wrong pictures in the data set by voice, and the data set is purified. Speech recognition is performed by calling the Baidu speech recognition REST API.
Drawings
FIG. 1 is a data acquisition interface;
FIG. 2 is a model training interface;
FIG. 3 is a captcha interface.
Detailed Description
The present invention will be described in detail with reference to the following embodiments.
1. Test protocol
First, 200 pictures are crawled, and the erroneous data among them is removed so that they can serve as the test set. Then 1200 more pictures are crawled as the training set. Specifically, one model is trained using 200 pictures and another using all 1200 pictures, and both models are evaluated on the test set to obtain their accuracy.
2. Test environment set-up
(1) The whole testing process
Test environment: Windows system
Runtime environment: Python 3.5
(2) Object identification process
Test environment: Windows system equipped with a GPU
Runtime environment: TensorFlow, Keras
(3) Speech recognition process
Test environment: Windows system
Required installations: Baidu AIP SDK, PyAudio
3. Test procedure
The system mainly comprises a data acquisition module, a model training module, a verification code module and a voice recognition module. The test results of each module are as follows.
(1) Data acquisition phase
In fig. 1, the item name is entered and the label of the item is filled in according to the prompt. The crawler parameters are then set by selecting a starting page number (each page contains 60 pictures) and a stopping page number. After the confirm button is clicked, crawling begins; the crawled data is stored in a preset folder, and the display box at the bottom of the page shows the crawling progress.
(2) Model training phase
In fig. 2, after the name and the label of the item are entered according to the prompt, batch_size and epochs are selected. After the train button is clicked, the model is trained on the data in the folder obtained in the data acquisition stage. The display box below shows the training progress.
(3) Verification code phase
Fig. 3 shows the verification code pictures presented to the user. The user ticks the check boxes of the chosen pictures and then clicks the OK button to submit. Alternatively, the user can click the voice control and select the pictures that match the prompt by voice.
(4) Speech control phase
In the voice control stage, the user clicks the voice control and then speaks a command, such as 'select pictures 3 and 5'. If the spoken command selects the correct pictures, a verification-passed dialog box pops up.
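Once the speech recognizer has returned a transcript, turning it into a picture selection is a small parsing step. The sketch below assumes the transcript contains the picture numbers as Arabic digits and that pictures are numbered from 1 to 6; the function names `parse_selection` and `verify_by_voice` are hypothetical, and a real transcript from the speech service may need extra handling (for example, numerals written out as words).

```python
import re


def parse_selection(transcript, n_pictures=6):
    """Extract the 1-based picture numbers mentioned in a recognized command.

    Numbers outside the range 1..n_pictures are ignored, and duplicates
    are collapsed. The result is sorted in ascending order.
    """
    numbers = [int(tok) for tok in re.findall(r"\d+", transcript)]
    return sorted({n for n in numbers if 1 <= n <= n_pictures})


def verify_by_voice(transcript, expected, n_pictures=6):
    """Return True if the spoken selection matches the expected pictures."""
    return set(parse_selection(transcript, n_pictures)) == set(expected)
```

For the command quoted in the text, `parse_selection("select 3, 5")` yields `[3, 5]`, which can then be compared with the expected answer or passed to the same filtering logic used for clicks.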
4. Analysis of results
Table 1 shows the test accuracy of the model trained on 200 samples, and Table 2 shows the test accuracy of the model trained on 1200 samples. The results show that adding training data improves the performance of the model. The deep learning model used in the invention achieves high test accuracy.
Table 1. Accuracy after learning 200 samples
Object      Camera    Mobile phone    Tape recorder    Sound equipment    Adhesive tape
Accuracy    87.5%     89.75%          88.75%           90%                96.75%
Table 2. Accuracy after learning 1200 samples
Object      Camera    Mobile phone    Tape recorder    Sound equipment    Adhesive tape
Accuracy    97.25%    97.75%          97.75%           98.5%              99.05%
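The effect of the larger training set can be made concrete with a little arithmetic on the two tables. The short Python sketch below computes the per-object gain and the mean accuracy for each table; the variable names are hypothetical, the object labels are shorthand for the table columns, and the numbers are copied from the tables.

```python
objects = ["camera", "mobile phone", "tape recorder", "sound equipment", "adhesive tape"]
acc_200 = [87.5, 89.75, 88.75, 90.0, 96.75]     # Table 1, in percent
acc_1200 = [97.25, 97.75, 97.75, 98.5, 99.05]   # Table 2, in percent

# Gain in percentage points for each object when going from 200 to 1200 samples.
gains = {
    name: round(after - before, 2)
    for name, before, after in zip(objects, acc_200, acc_1200)
}

# Mean accuracy over the five objects for each training-set size.
mean_200 = round(sum(acc_200) / len(acc_200), 2)
mean_1200 = round(sum(acc_1200) / len(acc_1200), 2)
mean_gain = round(mean_1200 - mean_200, 2)
```

The mean accuracy rises from 90.55% to 98.06%, a gain of about 7.5 percentage points, with the largest improvement on the camera class (9.75 points) and the smallest on adhesive tape (2.3 points).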
The invention uses a deep residual network for object recognition. To obtain training data, the invention crawls data from the Internet. The data set obtained by the crawler contains wrong pictures, which the invention removes by means of verification codes. The filtered data is then used to train the deep residual network. The whole process is iterative: data can be continuously crawled and filtered and the deep model continuously trained, so that lifetime learning of the system is achieved. In addition, the invention adds a voice control feedback module so that the user can flexibly choose the verification mode. In general, the invention not only realizes lifetime learning for the automatic object identification system, but also uses crowdsourcing to label data and continuously expand the data set, which effectively addresses the high cost in manpower, material and financial resources of manually labeled data sets, and has broad application prospects.
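The iterative crawl, filter and train cycle summarized above can be sketched as a short driver loop. This is only an outline under stated assumptions: `lifelong_learning`, `crawl`, `verify` and `train` are hypothetical names, the three stages are passed in as caller-supplied functions, and the loop runs for a fixed number of rounds, whereas the system described is meant to keep running.

```python
def lifelong_learning(crawl, verify, train, rounds):
    """Run the crawl -> filter -> train cycle for a fixed number of rounds.

    crawl() returns a batch of candidate pictures, verify(picture) returns
    True for pictures that survive the verification-code filter, and
    train(dataset) returns a model trained on everything kept so far.
    """
    dataset = []
    model = None
    for _ in range(rounds):
        candidates = crawl()
        # Keep only the pictures that users did not flag as wrong.
        dataset.extend(p for p in candidates if verify(p))
        # Retrain on the accumulated, filtered data set.
        model = train(dataset)
    return model, dataset
```

Retraining from the full accumulated data set each round is the simplest choice; an incremental update of the existing model would be an alternative that this sketch does not attempt.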
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiments according to the technical spirit of the present invention are within the scope of the present invention.

Claims (5)

1. An automatic object identification system for lifetime learning, characterized by comprising the following steps:
step 1: acquiring a data set;
step 2: identifying an object;
step 3: picture verification code;
step 4: voice recognition.
2. The automatic object identification system for lifetime learning according to claim 1, wherein: in step 1, the data set is acquired by collecting pictures with a focused crawler, which can acquire data without interruption;
the focused crawler works as follows:
1) start running from the initial URL;
2) fetch a webpage;
3) extract new URLs and put them into the URL queue;
4) evaluate the webpage and the URLs according to an analysis algorithm;
5) if the stopping condition is met, end; otherwise, go to the next step;
6) select a URL according to the search strategy and jump to step 3.
3. The automatic object identification system for lifetime learning according to claim 1, wherein: the deep learning model adopted for object identification in step 2 is a deep residual network, and ResNet-50 with a 50-layer network structure is used.
4. The automatic object identification system for lifetime learning according to claim 1, wherein: in step 3, the data crawled by the web crawler is filtered by means of a picture verification code; 6 pictures are selected each time as the verification code, some of which are 'non-camera' pictures selected from the auxiliary training set, and the rest of which are pictures sampled from the crawler data set.
5. The automatic object identification system for lifetime learning according to claim 1, wherein: in step 4, filtering is performed through voice recognition: the user rejects wrong pictures in the data set by voice, and the data set is purified; the voice recognition is performed by calling the Baidu speech recognition REST API.
CN201910856599.1A 2019-09-11 2019-09-11 Object automatic identification system for lifetime learning Pending CN110704711A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910856599.1A CN110704711A (en) 2019-09-11 2019-09-11 Object automatic identification system for lifetime learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910856599.1A CN110704711A (en) 2019-09-11 2019-09-11 Object automatic identification system for lifetime learning

Publications (1)

Publication Number Publication Date
CN110704711A true CN110704711A (en) 2020-01-17

Family

ID=69195936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910856599.1A Pending CN110704711A (en) 2019-09-11 2019-09-11 Object automatic identification system for lifetime learning

Country Status (1)

Country Link
CN (1) CN110704711A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101382836A (en) * 2008-09-05 2009-03-11 浙江大学 Electronic painting creative method based on multi-medium user interaction
CN101520798A (en) * 2009-03-06 2009-09-02 苏州锐创通信有限责任公司 Webpage classification technology based on vertical search and focused crawler
CN107657269A (en) * 2017-08-24 2018-02-02 百度在线网络技术(北京)有限公司 A kind of method and apparatus for being used to train picture purification model
CN107679183A (en) * 2017-09-29 2018-02-09 百度在线网络技术(北京)有限公司 Grader training data acquisition methods and device, server and storage medium
CN109558774A (en) * 2017-09-27 2019-04-02 中国海洋大学 Object automatic recognition system based on depth residual error network and support vector machines
CN111090768A (en) * 2019-12-17 2020-05-01 杭州深绘智能科技有限公司 Similar image retrieval system and method based on deep convolutional neural network
CN111324797A (en) * 2020-02-20 2020-06-23 民生科技有限责任公司 Method and device for acquiring data accurately at high speed

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周立柱、林玲: "聚焦爬虫技术研究综述" *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination