CN113220963A - Machine intelligent learning method based on Internet big data - Google Patents

Machine intelligent learning method based on Internet big data

Publication number
CN113220963A
Authority
CN
China
Prior art keywords
data
machine
learning
intelligent
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011237579.5A
Other languages
Chinese (zh)
Inventor
谭旭
曹维
张倩
庄穆妮
赵学华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Information Technology
Original Assignee
Shenzhen Institute of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Information Technology
Priority to CN202011237579.5A
Publication of CN113220963A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of machine learning and discloses an intelligent machine learning method based on Internet big data, comprising the following steps: S1, data acquisition; S2, data processing; S3, data storage; S4, data learning; S5, data storage; S6, prediction result; S7, voice and image processing. By intercepting, cleaning, and filtering the collected data, the invention prevents viruses and advertisements in the big data from entering the machine learning content, and by summarizing and integrating the filtered data it transmits the data into the machine more quickly. The machine can also learn various data contents, polite expressions, and expressions through unsupervised and supervised learning, which improves its learning and imitation abilities, enables human-computer interaction, and increases commercial value.

Description

Machine intelligent learning method based on Internet big data
Technical Field
The invention relates to the technical field of machine learning, in particular to an intelligent machine learning method based on internet big data.
Background
Internet big data refers to data sets that cannot be captured, managed, and processed by conventional software tools within an acceptable time. It is a massive, fast-growing, and diversified information asset that yields stronger decision-making power, insight, and process-optimization capability only under new processing modes. Big data is the basis of artificial intelligence, and converting it into knowledge or productivity depends on machine learning. Machine learning is therefore the core of artificial intelligence and the fundamental way to give a machine human-like intelligence.
However, in current machine learning processes, junk content in big data such as advertisements and news is often brought into the learning content and occupies the machine's internal storage. Because the memory of a typical machine is fixed, the memory space cannot be used efficiently, content has to be deleted manually, and new data cannot be learned effectively. In addition, the polite expressions and expressions that the machine acquires during learning are simple, so the resulting value is low. Those skilled in the art therefore provide a machine intelligent learning method based on Internet big data to solve the problems described above.
Disclosure of Invention
The invention aims to provide an internet big data based machine intelligent learning method to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme: an intelligent machine learning method based on internet big data comprises the following steps:
S1, data acquisition: collect the data generated in external Internet big data and send the collected data through a cloud platform into the machine, where a machine learning module performs classification and learning processing;
S2, data processing: preprocess the collected data to remove content such as advertisements and viruses, extract basic features from the filtered data, and then access the filtered data;
S3, data storage: the information extracted in S2 undergoes multi-layer complex feature extraction and is then stored, and different data are classified and stored in the machine's internal memory so that the machine can learn from them conveniently;
S4, data learning: divide the data into appropriate categories according to the classification information. Supervised learning builds a model or rule from data that carry a label value or target value and uses that model or rule to identify or predict new data. Unsupervised learning is the opposite: the target value is not specified or cannot be known in advance, and similar data are divided into the same group;
S5, data storage: the learned data are saved to the machine's internal memory card to avoid data loss;
S6, prediction result: an intelligent module in the machine predicts whether the stored data are accurate; inaccurate data are deleted and accurate data are kept;
S7, voice and image processing: the collected, processed, and stored data undergo image and sound processing; an intelligent control module provides voice playback and image projection and is also used for computation and intelligent dialogue processing on the data.
As a still further scheme of the invention: the supervised learning in S4 trains a prediction model from the input data and target values, and the unsupervised learning groups (clusters) the input data.
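The distinction between the two learning modes can be illustrated with a minimal sketch. The patent does not name any algorithm, so the nearest-centroid classifier and the k-means grouping below are stand-ins chosen for brevity, and all function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples, labels):
    """Supervised: average the feature vectors of each labeled class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in grouped.items()}


def predict(centroids, x):
    """Assign a new sample to the class whose centroid is nearest."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


def kmeans(samples, k, iterations=10):
    """Unsupervised: group samples into k clusters without any target values."""
    centers = list(samples[:k])  # naive initialisation: the first k samples
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: dist(centers[i], x))
            clusters[nearest].append(x)
        # Recompute each center; keep the old one if a cluster is empty.
        centers = [
            tuple(sum(c) / len(xs) for c in zip(*xs)) if xs else centers[i]
            for i, xs in enumerate(clusters)
        ]
    return clusters
```

The supervised path needs target values (`labels`) to build its model, whereas `kmeans` receives only the samples and places similar ones in the same group.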
As a still further scheme of the invention: the data processing in S2 includes the following steps:
(1) Data cleaning: all collected big data are transmitted to a temporary storage module in the machine, and an intelligent interception module in the machine cleans all data in the temporary storage module, intercepting and discarding data that carry viruses and other harmful data;
(2) Data filtering: the cleaned data are filtered again, and an intelligent module intercepts and discards advertisements, news, and similar content;
(3) Data integration: the cleaned and filtered data are summarized and integrated into a whole, which facilitates data transmission.
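The three sub-steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the patent does not say how harmful data or advertisements are detected, so the keyword lists, the `Record` type, and the function names are all hypothetical placeholders for whatever detectors a real system would use.

```python
from dataclasses import dataclass

# Hypothetical marker lists; a real system would rely on malware scanning
# and a trained content classifier rather than keyword matching.
HARMFUL_MARKERS = ("<script", "malware")
AD_MARKERS = ("buy now", "limited offer")


@dataclass(frozen=True)
class Record:
    source: str
    text: str


def clean(records):
    """Step (1): discard records that carry harmful markers."""
    return [r for r in records if not any(m in r.text.lower() for m in HARMFUL_MARKERS)]


def filter_ads(records):
    """Step (2): discard records that look like advertisements."""
    return [r for r in records if not any(m in r.text.lower() for m in AD_MARKERS)]


def integrate(records):
    """Step (3): merge the surviving records into one de-duplicated batch."""
    seen, batch = set(), []
    for r in records:
        if r.text not in seen:
            seen.add(r.text)
            batch.append(r.text)
    return batch


def preprocess(records):
    """Run cleaning, filtering, and integration in order."""
    return integrate(filter_ads(clean(records)))
```

Keeping each stage as a separate function mirrors the order in the text: cleaning runs first on the temporary store, filtering runs on what survives, and only the integrated batch is passed on.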
As a still further scheme of the invention: the data learning in S4 may include polite-expression training, word content information, expression training, similar-sentence repetition information, dialogue duration information, big data training, and the like.
As a still further scheme of the invention: the polite expressions comprise at least twenty expressions such as 'hello', 'thank you', 'bad breath' and the like, and the expression training covers at least ten expressions such as 'joy, anger, sadness, happiness' and the like.
As a still further scheme of the invention: S7 provides a query module and a human-computer interaction module, so that the stored data can be queried and quickly processed and retrieved, and the retrieved information is presented as voice and images through the human-computer interaction module.
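A query module of this kind could be sketched with a simple inverted index. The patent does not describe the retrieval mechanism, so the `QueryModule` class and its whitespace tokenization are assumptions made purely for illustration.

```python
from collections import defaultdict


class QueryModule:
    """Hypothetical in-memory store with word-level lookup."""

    def __init__(self):
        self._docs = []
        self._index = defaultdict(set)  # word -> ids of documents containing it

    def add(self, text):
        """Store a text and index every whitespace-separated word."""
        doc_id = len(self._docs)
        self._docs.append(text)
        for word in text.lower().split():
            self._index[word].add(doc_id)
        return doc_id

    def search(self, query):
        """Return stored texts that contain every word of the query."""
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self._index.get(w, set()) for w in words))
        return [self._docs[i] for i in sorted(ids)]
```

The retrieved texts would then be handed to the human-computer interaction module for voice or image output, which is outside the scope of this sketch.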
As a still further scheme of the invention: in the data storage of S3, data are stored once every 3 s and compressed after each storage, so that a large amount of data can be stored.
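A minimal sketch of interval-based storage with compression after each write follows. It assumes that "3S" means three seconds and uses `zlib` as a stand-in compressor, since the patent names no compression scheme; `CompressedStore` and `store_periodically` are hypothetical names.

```python
import time
import zlib

STORE_INTERVAL_SECONDS = 3  # assumption: "every 3S" read as every 3 seconds


class CompressedStore:
    """Keeps each stored batch as a zlib-compressed blob."""

    def __init__(self):
        self._blobs = []

    def store(self, text):
        """Compress one batch, keep it, and return the compressed size."""
        blob = zlib.compress(text.encode("utf-8"))
        self._blobs.append(blob)
        return len(blob)

    def load(self, index):
        """Decompress and return a previously stored batch."""
        return zlib.decompress(self._blobs[index]).decode("utf-8")


def store_periodically(store, batches, interval=STORE_INTERVAL_SECONDS, sleep=time.sleep):
    """Store batches one at a time, waiting `interval` seconds between writes."""
    for i, batch in enumerate(batches):
        if i:
            sleep(interval)
        store.store(batch)
```

The `sleep` parameter is injectable so the pacing can be replaced in tests or swapped for a scheduler; the wait between writes is the gap in which compression can complete.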
As a still further scheme of the invention: in the data storage of S5, stored data are retained for at least three months, and a prompt is issued twelve days before automatic deletion.
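The retention rule can be sketched as a date check. The sketch assumes that "three months" is taken as 90 days and that deletion happens as soon as that period ends; neither detail is fixed by the patent, and `retention_status` is a hypothetical name.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=90)     # assumption: "three months" taken as 90 days
WARNING_LEAD = timedelta(days=12)  # prompt twelve days before deletion


def retention_status(stored_on, today):
    """Return 'keep', 'warn', or 'delete' for a record stored on `stored_on`."""
    delete_on = stored_on + RETENTION
    if today >= delete_on:
        return "delete"
    if today >= delete_on - WARNING_LEAD:
        return "warn"
    return "keep"
```

For a record stored on 1 January 2021, this sketch starts warning on 20 March 2021 and marks the record for deletion on 1 April 2021.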
As a still further scheme of the invention: in the prediction result of S6, the intelligent module connects to the outside wirelessly, and the data are compared against the intelligent module to make the judgment.
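The keep-or-delete decision of S6 might look like the sketch below. It assumes the intent is to discard records judged inaccurate and keep the rest, and it models the comparison with the intelligent module as a lookup against a reference dictionary; both the reference data and the function names are hypothetical.

```python
def make_reference_checker(reference):
    """Build an accuracy check that compares records against reference data.

    Assumption: a record is a (key, value) pair and counts as accurate only
    if the reference holds the same value for that key.
    """
    def is_accurate(record):
        key, value = record
        return reference.get(key) == value
    return is_accurate


def prune(records, is_accurate):
    """Split records into those to keep and those to delete."""
    kept = [r for r in records if is_accurate(r)]
    removed = [r for r in records if not is_accurate(r)]
    return kept, removed
```

In this sketch a record whose key is missing from the reference is treated as inaccurate; a real system would need an explicit policy for data that cannot be verified.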
Compared with the prior art, the invention has the following beneficial effects. By intercepting, cleaning, and filtering the collected data, it prevents viruses and advertisements in the big data from entering the machine learning content, and the filtered data can be transmitted into the machine more quickly. Through unsupervised and supervised learning the machine can learn various data contents, polite expressions, and expressions, which improves its learning and imitation abilities, enables human-computer interaction, and increases commercial value. Stored data are kept for more than three months so that they are fully retained, and a prompt is given before automatic deletion, which keeps the data safe while freeing space for the machine to learn new content. Data are stored once every 3 s, which leaves a time gap for compression, and compression after each storage allows more data to be stored.
Drawings
FIG. 1 is a schematic structural diagram of a machine intelligent learning method based on Internet big data;
FIG. 2 is a flow chart of data cleaning in a machine intelligent learning method based on Internet big data.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-2, in an embodiment of the present invention, a machine intelligent learning method based on Internet big data includes the following steps:
S1, data acquisition: collect the data generated in external Internet big data and send the collected data through a cloud platform into the machine, where a machine learning module performs classification and learning processing. The external data are transmitted and collected over the Internet, the mobile Internet, communication networks, the Internet of Things, and similar channels, and the cloud platform is mainly used for analyzing and processing the big data;
S2, data processing: preprocess the collected data to remove content such as advertisements and viruses, extract basic features from the filtered data, and then access the filtered data;
The data processing in S2 includes the following steps:
(1) Data cleaning: all collected big data are transmitted to a temporary storage module in the machine, and an intelligent interception module in the machine cleans all data in the temporary storage module, intercepting and discarding data that carry viruses and other harmful data;
(2) Data filtering: the cleaned data are filtered again, and an intelligent module intercepts and discards advertisements, news, and similar content;
(3) Data integration: the cleaned and filtered data are summarized and integrated into a whole, which facilitates data transmission. Intercepting, cleaning, and filtering the collected data prevents viruses and advertisements in the big data from entering the machine learning content, and summarizing and integrating the filtered data allows it to be transmitted into the machine more quickly.
S3, data storage: the information extracted in S2 undergoes multi-layer complex feature extraction and is then stored, and different data are classified and stored in the machine's own memory so that the machine can learn from them conveniently. Stored data are kept for more than three months so that they are fully retained, and a prompt is given before automatic deletion, which keeps the data safe while freeing space for the machine to learn new content;
S4, data learning: divide the data into appropriate categories according to the classification information. Supervised learning builds a model or rule from data that carry a label value or target value and uses that model or rule to identify or predict new data. Unsupervised learning is the opposite: the target value is not specified or cannot be known in advance, and similar data are divided into the same group;
The supervised learning in S4 trains a prediction model from the input data and target values, and the unsupervised learning groups (clusters) the input data. The data learning in S4 may include polite-expression training, word content information, expression training, similar-sentence repetition information, dialogue duration information, big data training, and the like. The polite expressions comprise at least twenty expressions such as "hello", "thank you", "bad breath" and the like, and the expression training covers at least ten expressions such as "happiness, anger, grief, joy" and the like. Through unsupervised and supervised learning the machine can learn various data contents, polite expressions, and expressions, which improves its learning and imitation abilities, enables human-computer interaction, and increases commercial value.
S5, data storage: the learned data are saved to the machine's internal memory card to avoid data loss;
In the data storage of S5, stored data are retained for at least three months, and a prompt is issued twelve days before automatic deletion. Data are stored once every 3 s, which leaves a time gap for compression, and the data are compressed after each storage so that more data can be stored.
S6, prediction result: an intelligent module in the machine predicts whether the stored data are accurate; inaccurate data are deleted and accurate data are kept;
In the prediction result of S6, the intelligent module connects to the outside wirelessly, and the data are compared against the intelligent module to make the judgment.
S7, voice and image processing: the collected and stored data undergo image and sound processing; an intelligent control module provides voice playback and image projection and is also used for computation and intelligent dialogue processing on the data;
S7 provides a query module and a human-computer interaction module, so that the stored data can be queried and quickly processed and retrieved, and the retrieved information is presented as voice and images through the human-computer interaction module. This arrangement displays what the machine has learned, and a voice receiving module in the machine allows a person to converse with the machine, so that the machine's learning capability can be checked effectively.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A machine intelligent learning method based on Internet big data is characterized by comprising the following steps:
S1, data acquisition: collect the data generated in external Internet big data and send the collected data through a cloud platform into the machine, where a machine learning module performs classification and learning processing;
S2, data processing: preprocess the collected data to remove content such as advertisements and viruses, extract basic features from the filtered data, and then access the filtered data;
S3, data storage: the information extracted in S2 undergoes multi-layer complex feature extraction and is then stored, and different data are classified and stored in the machine's internal memory so that the machine can learn from them conveniently;
S4, data learning: divide the data into appropriate categories according to the classification information. Supervised learning builds a model or rule from data that carry a label value or target value and uses that model or rule to identify or predict new data. Unsupervised learning is the opposite: the target value is not specified or cannot be known in advance, and similar data are divided into the same group;
S5, data storage: the learned data are saved to the machine's internal memory card to avoid data loss;
S6, prediction result: an intelligent module in the machine predicts whether the stored data are accurate; inaccurate data are deleted and accurate data are kept;
S7, voice and image processing: the collected, processed, and stored data undergo image and sound processing; an intelligent control module provides voice playback and image projection and is also used for computation and intelligent dialogue processing on the data.
2. The machine intelligent learning method based on Internet big data according to claim 1, wherein the supervised learning in S4 trains a prediction model from the input data and target values, and the unsupervised learning groups (clusters) the input data.
3. The machine intelligent learning method based on Internet big data according to claim 1, wherein the data processing in S2 includes the following steps:
(1) Data cleaning: all collected big data are transmitted to a temporary storage module in the machine, and an intelligent interception module in the machine cleans all data in the temporary storage module, intercepting and discarding data that carry viruses and other harmful data;
(2) Data filtering: the cleaned data are filtered again, and an intelligent module intercepts and discards advertisements, news, and similar content;
(3) Data integration: the cleaned and filtered data are summarized and integrated into a whole, which facilitates data transmission.
4. The machine intelligent learning method based on Internet big data according to claim 1, wherein the data learning in S4 may include polite-expression training, word content information, expression training, similar-sentence repetition information, dialogue duration information, big data training, and the like.
5. The machine intelligent learning method based on Internet big data according to claim 4, wherein the polite expressions comprise at least twenty expressions such as "hello", "thank you", "cheerful" and the like, and the expression training covers at least ten expressions such as "joy, anger, sadness, happiness" and the like.
6. The machine intelligent learning method based on Internet big data according to claim 1, wherein S7 provides a query module and a human-computer interaction module, so that the stored data can be queried and quickly processed and retrieved, and the retrieved information is presented as voice and images through the human-computer interaction module.
7. The machine intelligent learning method based on Internet big data according to claim 1, wherein in the data storage of S3, data are stored once every 3 s and compressed after each storage, so that more data can be stored.
8. The machine intelligent learning method based on Internet big data according to claim 1, wherein in the data storage of S5, stored data are retained for at least three months, and a prompt is issued twelve days before automatic deletion.
9. The machine intelligent learning method based on Internet big data according to claim 1, wherein in the prediction result of S6, the intelligent module connects to the outside wirelessly, and the data are compared against the intelligent module to make the judgment.
CN202011237579.5A 2020-11-09 2020-11-09 Machine intelligent learning method based on Internet big data Pending CN113220963A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011237579.5A CN113220963A (en) 2020-11-09 2020-11-09 Machine intelligent learning method based on Internet big data

Publications (1)

Publication Number Publication Date
CN113220963A (en) 2021-08-06

Family

ID=77085769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011237579.5A Pending CN113220963A (en) 2020-11-09 2020-11-09 Machine intelligent learning method based on Internet big data

Country Status (1)

Country Link
CN (1) CN113220963A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1760901A (en) * 2005-11-03 2006-04-19 上海交通大学 System for filtering E-mails
CN107240395A (en) * 2017-06-16 2017-10-10 百度在线网络技术(北京)有限公司 A kind of acoustic training model method and apparatus, computer equipment, storage medium
CN109710826A (en) * 2018-11-29 2019-05-03 淮河水利委员会水文局(信息中心) A kind of internet information artificial intelligence acquisition method and its system
CN110311857A (en) * 2019-06-28 2019-10-08 温州易思网络科技有限公司 A kind of college association online interaction platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈刚等, 北京:中国传媒大学出版社 *

Similar Documents

Publication Publication Date Title
CN111382623B (en) Live broadcast auditing method, device, server and storage medium
CN111738011A (en) Illegal text recognition method and device, storage medium and electronic device
CN112165462A (en) Attack prediction method and device based on portrait, electronic equipment and storage medium
CN109634994A (en) A kind of the matching method for pushing and computer equipment and storage medium of resume and position
CN111177310A (en) Intelligent scene conversation method and device for power service robot
CN103813279A (en) Junk short message detecting method and device
WO2021036439A1 (en) Method for responding to complaint, and device
CN111460162B (en) Text classification method and device, terminal equipment and computer readable storage medium
CN109657063A (en) A kind of processing method and storage medium of magnanimity environment-protection artificial reported event data
CN110189751A (en) Method of speech processing and equipment
CN109446299B (en) Method and system for searching e-mail content based on event recognition
CN112507167A (en) Method and device for identifying video collection, electronic equipment and storage medium
CN107145568A (en) A kind of quick media event clustering system and method
WO2024055603A1 (en) Method and apparatus for identifying text from minor
CN110059189B (en) Game platform message classification system and method
CN112784011A (en) Emotional problem processing method, device and medium based on CNN and LSTM
CN113220963A (en) Machine intelligent learning method based on Internet big data
CN112492606A (en) Classification and identification method and device for spam messages, computer equipment and storage medium
CN111464687A (en) Strange call request processing method and device
CN116881408A (en) Visual question-answering fraud prevention method and system based on OCR and NLP
CN115169293A (en) Text steganalysis method, system, device and storage medium
CN115905572A (en) Social robot detection method and storage medium for twitter users
CN112966509B (en) Text quality evaluation method and device, storage medium and computer equipment
CN112307209B (en) Short text classification method and system based on character vector
CN113011875A (en) Text processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210806