CN108121508A - Multi-source heterogeneous data collecting system and processing method based on education big data - Google Patents

Multi-source heterogeneous data collecting system and processing method based on education big data

Info

Publication number
CN108121508A
Authority
CN
China
Prior art keywords
data
collecting device
source heterogeneous
collecting system
cleaning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711369499.3A
Other languages
Chinese (zh)
Inventor
刘三女牙
黄涛
张�浩
杨华利
张文君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Normal University
Central China Normal University
Original Assignee
Huazhong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong Normal University
Priority to CN201711369499.3A
Publication of CN108121508A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Educational Technology (AREA)
  • Educational Administration (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a multi-source heterogeneous data collecting system and processing method based on education big data, and belongs to the technical field of data processing. The multi-source heterogeneous data collecting system includes at least one set of collecting devices, a memory, a processor, and a data summarization module. Each set of collecting devices is applied in a campus and is used to collect behavioral data of students and/or teachers during teaching. The data summarization module is stored in the memory and includes one or more software function modules executed by the processor; it collects, cleans, and classifies the data acquired by the at least one set of collecting devices. While cleaning and classifying the data acquired by each set of collecting devices, the data summarization module organizes data of varied structure and disordered content into unified data according to a given format and filters out redundant information, which ensures data quality at the source and improves the efficiency and reliability of subsequent analysis.

Description

Multi-source heterogeneous data collecting system and processing method based on education big data
Technical field
The invention belongs to the technical field of data processing, and in particular relates to a multi-source heterogeneous data collecting system and processing method based on education big data.
Background art
With the development of science and technology, Internet of Things technology has become one of today's hot topics, and many well-known companies worldwide have invested in Internet of Things research. At the same time, digital campus construction is the foundation of education informatization, and the demands of informatization drive the development and renewal of related technologies. Among these, big data is currently one of the most popular technologies and capabilities: the ability to find meaningful associations in massive, multi-dimensional, varied data, to uncover how things change, and to accurately predict how they will develop. Education big data arises directly from educational activities of all kinds, and obtaining these data and mining the relationships among them is the most important foundation. With the rapid development of electronics and wireless communication, concepts such as the "smart classroom" and the "smart city" have emerged one after another and have become a trend in technological development. The classroom, as an important place of education, is the place that students and teachers care about most and one of the most important places for data collection. Educational data collection commonly relies on four major classes of technology: Internet of Things sensing, video recording, image recognition, and platform-based collection. The "smart classroom" is one example of these four classes of technology.
However, current "smart classroom" solutions still have a number of shortcomings. The main one is that the acquired data are not comprehensive enough to accurately cover every aspect of teaching. In addition, after obtaining data, the collecting devices pass them directly to downstream equipment for simple analysis, which merely presents them as charts and/or text based on basic associations among the data, so the functionality is rather limited.
Summary of the invention
In view of this, an object of the present invention is to provide a multi-source heterogeneous data collecting system and processing method based on education big data that effectively address the above problems.
The embodiments of the present invention are realized as follows:
In a first aspect, an embodiment of the present invention provides a multi-source heterogeneous data collecting system based on education big data, including: at least one set of collecting devices, a memory, a processor, and a data summarization module. Each set of collecting devices is applied in a classroom and is used to collect behavioral data of students and/or teachers during teaching. The data summarization module is stored in the memory and includes one or more software function modules executed by the processor; it is used to collect, clean, and classify the data acquired by the at least one set of collecting devices.
In a preferred embodiment of the present invention, each set of collecting devices includes at least one of the following, applied in a classroom: a camera, a touch screen, an electronic whiteboard, a microphone array, and a mobile terminal.
In a preferred embodiment of the present invention, the data acquired by the at least one set of collecting devices are at least one of voice data, image data, and text data.
In a preferred embodiment of the present invention, the data summarization module includes: a collecting submodule, for collecting the data acquired by the at least one set of collecting devices and manually imported data; a cleaning submodule, for cleaning the acquired data according to a preset standard format and filtering out redundant information; a classification submodule, for classifying the cleaned data to obtain classified data; and a saving submodule, for storing the classified data in the database in the memory that corresponds to the classified data.
In a preferred embodiment of the present invention, the classification submodule includes: a recognition unit, for identifying the type of the cleaned data; and a classification unit, for tagging the identified type with a classification label to obtain classified data.
In a preferred embodiment of the present invention, the data summarization module further includes: a judging submodule, for judging whether the format of the cleaned data is consistent with the preset standard format.
In a preferred embodiment of the present invention, the databases stored in the memory include: a Hadoop database, a MySQL database, and a NoSQL database.
In a second aspect, an embodiment of the present invention further provides a processing method based on education big data, applied to a multi-source heterogeneous data collecting system based on education big data. The multi-source heterogeneous data collecting system includes at least one set of collecting devices, each set being applied in a classroom. The method includes: collecting the data acquired by the at least one set of collecting devices and manually imported data; cleaning the collected data according to a preset standard format and filtering out redundant information; classifying the cleaned data to obtain classified data; and storing the classified data in the database corresponding to the classified data.
In a preferred embodiment of the present invention, classifying the cleaned data to obtain classified data includes: identifying the type of the cleaned data; and tagging the identified type with a classification label to obtain classified data.
In a preferred embodiment of the present invention, after the collected data are cleaned according to the preset standard format and redundant information is filtered out, the method further includes: judging whether the format of the cleaned data is consistent with the preset standard format.
In the multi-source heterogeneous data collecting system and processing method based on education big data provided by the embodiments of the present invention, the multi-source heterogeneous data collecting system includes: at least one set of collecting devices, a memory, and a data summarization module. Each set of collecting devices is applied in a classroom and is used to collect behavioral data of students and/or teachers during teaching. The data acquired by each set of collecting devices are collected, cleaned, and classified by the data summarization module and then stored under unified management, so that they can be analyzed further later. During cleaning and classification, data of varied structure and disordered content are organized into unified data according to a given format and redundant information is filtered out, which ensures data quality at the source and improves the efficiency and reliability of subsequent analysis.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be understood by practicing the embodiments of the present invention. The objects and other advantages of the present invention can be realized and obtained by the structures particularly pointed out in the written description, the claims, and the drawings.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort. The above and other objects, features, and advantages of the present invention will become clearer from the drawings. The same reference numerals indicate the same parts throughout the drawings. The drawings are deliberately not drawn to scale; the emphasis is on showing the gist of the present invention.
Fig. 1 shows a structural diagram of a multi-source heterogeneous data collecting system provided by an embodiment of the present invention.
Fig. 2 shows a flowchart of a processing method provided by the first embodiment of the present invention.
Fig. 3 shows a flowchart of step S103 in Fig. 2 provided by an embodiment of the present invention.
Fig. 4 shows a flowchart of a processing method provided by the second embodiment of the present invention.
Fig. 5 shows a module diagram of a data summarization module provided by an embodiment of the present invention.
Fig. 6 shows a module diagram of a classification submodule provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. The components of the embodiments generally described and shown in the drawings can be arranged and designed in a variety of configurations.
Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be defined or explained again in subsequent drawings.
In the description of the present invention, it should be noted that the terms "first", "second", "third", and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
First embodiment
As shown in Fig. 1, an embodiment of the present invention provides a multi-source heterogeneous data collecting system 100 based on education big data. The multi-source heterogeneous data collecting system 100 includes: at least one set of collecting devices 110, a memory 120, a memory controller 130, a processor 140, and a data summarization module 150.
Each set of collecting devices 110 consists of multiple components applied in a classroom, for example: a camera, an electronic whiteboard, a laser pointer, a projector, a touch screen, a microphone array, a mobile terminal, and similar instruments. Mobile terminals include, but are not limited to, smartphones, personal computers, laptops, tablet computers, and smart wristbands. It should be understood that these examples do not limit the present invention: any sensor, peripheral, or other instrument that can be connected to the network of the Internet-of-Things-based multi-source heterogeneous data collecting system 100 should be understood as falling within the scope of protection of the present invention.
Each set of collecting devices 110 is used to collect behavioral data of students and/or teachers during teaching. To make the collected data comprehensive, that is, to cover every aspect of student and teacher behavior, each set of collecting devices 110 is made up of multiple components built into one network system.
It should be understood that the reference to a classroom above does not limit the present invention. The classroom, as an important place of education, is where the most student and teacher behavior is recorded, so it serves only as a representative location; that is, the data collected by the collecting devices 110 are not limited to data collected in a classroom.
The memory 120, the memory controller 130, and the processor 140 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements can be electrically connected through one or more communication buses or signal lines. The data summarization module 150 includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or built into the operating system (OS) of the multi-source heterogeneous data collecting system 100. The processor 140 is used to execute the executable modules stored in the memory 120, such as the software function modules or computer programs included in the data summarization module 150.
The memory 120 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and the like. The memory 120 stores a program, and the processor 140 executes the program after receiving an execution instruction. The method performed by the multi-source heterogeneous data collecting system 100, as defined by the flows disclosed in any of the embodiments below, may be applied to, or implemented by, the processor 140.
The processor 140 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor 140 may be any conventional processor.
Second embodiment
Referring to Fig. 2, an embodiment of the present invention provides a processing method applied to the above multi-source heterogeneous data collecting system 100. The steps it includes are described below with reference to Fig. 2.
Step S101: Collect data.
The data may come from the at least one set of collecting devices described above or may be manually imported. Manually imported data include student data, teacher data, and school data. Student data include: basic student information, campus card information, detailed student information, class information, student personality information, practice experience, teacher evaluation information, detailed family information, learning experience, student records, examination information, campus card consumption information, learning behavior data, and the like. Teacher data include: basic faculty and staff information, detailed faculty and staff information, and class information. School data include: school leadership information, campus facility information, school organization information, basic school information, school building information, and the like.
Data collection can follow an approach that combines the xAPI specification with traditional collection methods (for example, manual import). xAPI is a new technical specification for storing and accessing learning experiences. Learners can study at any time and place through learning materials, computers, mobile terminals, and social platforms, and xAPI can collect, track, and record data about these learning events.
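As an illustration of what an xAPI record of a learning event looks like, the sketch below builds one statement in the actor, verb, object form used by the xAPI specification. The helper name `make_statement`, the learner, and the `example.edu` identifiers are hypothetical and are not taken from the patent; the patent does not describe how statements are built or where they are sent.

```python
import json
from datetime import datetime, timezone


def make_statement(learner_email, learner_name, verb_id, verb_name,
                   activity_id, activity_name):
    """Build a minimal xAPI-style statement (hypothetical helper)."""
    return {
        "actor": {
            "objectType": "Agent",
            "name": learner_name,
            "mbox": f"mailto:{learner_email}",
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {
            "objectType": "Activity",
            "id": activity_id,
            "definition": {"name": {"en-US": activity_name}},
        },
        # ISO 8601 timestamp of when the learning event happened.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Illustrative values only; the identifiers below are placeholders.
statement = make_statement(
    "student@example.edu", "Example Student",
    "http://adlnet.gov/expapi/verbs/completed", "completed",
    "http://example.edu/activities/quiz-1", "Quiz 1",
)
print(json.dumps(statement, indent=2))
```

A statement like this is semi-structured data, which is one reason the later classification and storage steps distinguish it from fixed-schema records.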
Traditional data collection requires the attributes to be stored to be defined in advance and collects structured data directly. It is usually used to collect static attributes, such as basic information about students, faculty and staff, schools, and families.
Step S102: Clean the collected data according to a preset standard format and filter out redundant information.
The data acquired by the collecting devices and the manually imported data are the most basic raw data: their structures vary and they carry a great deal of redundant information. The acquired data therefore need to be cleaned, so that data of varied structure and disordered content are converted into data in a unified standard format, with redundant information filtered out during cleaning. The preset standard format can be set according to actual needs; for example, it may be a structured, unstructured, or semi-structured standard format.
Data cleaning can include missing value cleaning, format and content cleaning, and logic error cleaning. Missing values are the most common data problem, and there are many ways to handle them. The following steps may be used here. First, determine the extent of missing values: calculate the missing ratio of each field, then set a strategy for each field according to its missing ratio and importance. Next, delete the fields that are not needed. Finally, fill in the missing content. Some missing values can be filled by inference from domain knowledge and experience, or by a rule set according to need, such as substituting the mean or using another method. If a large amount of data is missing or the error rate is high, the data can be requested again or related data can be obtained through other channels.
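The missing value steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the 0.5 drop threshold, the `required` parameter standing in for "field importance", and the sample records are all assumptions.

```python
from statistics import mean


def missing_ratios(records, fields):
    """Step 1: fraction of records in which each field is missing (None)."""
    total = len(records) or 1
    return {f: sum(1 for r in records if r.get(f) is None) / total
            for f in fields}


def clean_missing(records, fields, drop_threshold=0.5, required=()):
    """Drop sparse, unimportant fields, then fill numeric gaps with the mean.

    drop_threshold and required are illustrative policy knobs.
    """
    ratios = missing_ratios(records, fields)
    # Step 2: keep a field if it is important or not too sparse.
    kept = [f for f in fields if f in required or ratios[f] <= drop_threshold]
    cleaned = [{f: r.get(f) for f in kept} for r in records]
    # Step 3: fill remaining gaps in numeric fields with the field mean.
    for f in kept:
        present = [r[f] for r in cleaned if r[f] is not None]
        if present and all(isinstance(v, (int, float)) for v in present):
            fill = mean(present)
            for r in cleaned:
                if r[f] is None:
                    r[f] = fill
    return cleaned, ratios


records = [
    {"student_id": "s1", "score": 80, "nickname": None},
    {"student_id": "s2", "score": None, "nickname": None},
    {"student_id": "s3", "score": 90, "nickname": "Li"},
]
cleaned, ratios = clean_missing(
    records, ["student_id", "score", "nickname"], required=("student_id",))
print(cleaned)
```

In this sample, `nickname` is missing in two of three records and is dropped, while the single missing `score` is filled with the mean of the other two.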
Log information is an important data source, and its format and content are usually consistent with the metadata description. Data collected by hand or filled in by users, however, may deviate in format and content. For the collected data, therefore, data of the same kind are converted to a consistent format, characters that should not appear in the content are removed, and content in a field that does not match that field is removed.
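A small sketch of format and content cleaning is given below, assuming two hypothetical fields: a date that users may type in several layouts and a student ID that should contain only letters and digits. The accepted date layouts and the ID rule are assumptions chosen for illustration; the patent does not specify them.

```python
import re
from datetime import datetime

# Input layouts accepted for dates (an assumption for this sketch).
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def normalize_date(text):
    """Convert a date written in any accepted layout to YYYY-MM-DD.

    Returns None when the content does not match the field at all,
    so the caller can treat it as a missing value.
    """
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def clean_student_id(text):
    """Remove characters that should not appear in an ID field."""
    return re.sub(r"[^0-9A-Za-z]", "", text)


print(normalize_date("2017/12/18"))
print(clean_student_id(" 2017-00 12# "))
```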
Logic error cleaning removes problems that can be found through simple logical reasoning. For example, if a person's name contains a space, the system may treat it as two people, so a simple parsing program removes such duplicates. Unreasonable values are also removed: an age of several hundred or even several thousand years is an obvious error and can be deleted or handled as a missing value.
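The two logic checks described above can be sketched as follows. The record fields, the `max_age` bound of 120, and the choice to key duplicates on name plus student ID are assumptions made for this example. Stripping all whitespace from a name suits names written without spaces; it would not be appropriate where spaces are a legitimate part of a name.

```python
def clean_logic(records, max_age=120):
    """Remove duplicates caused by stray spaces in names and blank out
    implausible ages so they can be handled as missing values."""
    seen = set()
    result = []
    for rec in records:
        # Collapse whitespace so the same name is not counted as two people.
        name = "".join(rec["name"].split())
        age = rec.get("age")
        if age is not None and not (0 < age <= max_age):
            age = None  # obvious error: treat as a missing value
        key = (name, rec.get("student_id"))
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        result.append({**rec, "name": name, "age": age})
    return result


records = [
    {"name": "张 三", "student_id": "s1", "age": 15},
    {"name": "张三", "student_id": "s1", "age": 15},
    {"name": "李四", "student_id": "s2", "age": 300},
]
deduplicated = clean_logic(records)
print(deduplicated)
```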
Step S103: Classify the cleaned data to obtain classified data.
After the cleaned data are obtained, they are classified to obtain classified data. As one implementation, this process is described with the flowchart in Fig. 3.
Step S201: Identify the type of the cleaned data.
After cleaning, the data need to be classified so that they can be managed in a unified way. To classify them, the category of the cleaned data must first be identified, that is, which type the data belong to. Because cleaning follows the preset standard format, in this embodiment the cleaned data include three kinds: structured, unstructured, and semi-structured. Each kind has different attributes, so the types can be identified accordingly.
Step S202: Tag the identified type with a classification label to obtain classified data.
After the type of the data is identified, the data are tagged with a classification label to obtain classified data. For example, when the data are identified as the structured type, they are tagged with a label denoting the structured type; when they are identified as the unstructured type, they are tagged with a label denoting the unstructured type; and when they are identified as the semi-structured type, they are tagged with a label denoting the semi-structured type.
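Steps S201 and S202 can be sketched as an identify-then-tag pair. The patent does not say how each type is recognized, so the rule used here is purely an assumption: a flat record of scalar fields counts as structured, nested records or JSON text count as semi-structured, and everything else (free text, raw bytes such as audio or images) counts as unstructured.

```python
import json

STRUCTURED = "structured"
SEMI_STRUCTURED = "semi-structured"
UNSTRUCTURED = "unstructured"


def identify_type(item):
    """Step S201: identify the type of a cleaned data item (heuristic)."""
    if isinstance(item, dict):
        flat = all(not isinstance(v, (dict, list)) for v in item.values())
        return STRUCTURED if flat else SEMI_STRUCTURED
    if isinstance(item, str):
        try:
            parsed = json.loads(item)
        except ValueError:
            return UNSTRUCTURED  # plain text that is not JSON
        if isinstance(parsed, (dict, list)):
            return SEMI_STRUCTURED
        return UNSTRUCTURED
    return UNSTRUCTURED  # bytes and anything else


def tag(item):
    """Step S202: attach the classification label to the data item."""
    return {"label": identify_type(item), "payload": item}


print(tag({"student_id": "s1", "score": 85})["label"])
print(tag('{"verb": {"id": "completed"}}')["label"])
print(tag(b"\x00\x01")["label"])
```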
Step S104: Store the classified data in the database corresponding to the classified data.
After the classified data are obtained, they are stored in the corresponding database. The databases include a Hadoop database, a MySQL database, and a NoSQL database. Data of the unstructured type are stored in the Hadoop database, data of the structured type in the MySQL database, and data of the semi-structured type in the NoSQL database.
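The mapping from classification label to database can be expressed as a small routing table. In the sketch below the three databases are replaced by in-memory lists so the example runs on its own; the `InMemoryStores` class and the `label`/`payload` item shape are hypothetical, and a real deployment would call the respective database clients instead.

```python
from collections import defaultdict

# Label-to-store mapping as described in step S104.
ROUTES = {
    "unstructured": "hadoop",
    "structured": "mysql",
    "semi-structured": "nosql",
}


class InMemoryStores:
    """Stand-in for the three databases (illustrative only)."""

    def __init__(self):
        self.stores = defaultdict(list)

    def save(self, tagged):
        """Store a tagged item in the store that matches its label."""
        target = ROUTES[tagged["label"]]
        self.stores[target].append(tagged["payload"])
        return target


stores = InMemoryStores()
targets = [
    stores.save({"label": "structured", "payload": {"student_id": "s1"}}),
    stores.save({"label": "semi-structured", "payload": {"actor": {"name": "a"}}}),
    stores.save({"label": "unstructured", "payload": b"\x00\x01"}),
]
print(targets)
```

Keeping the mapping in one table means a new label or store can be added without touching the saving logic.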
Third embodiment
As another implementation, referring to Fig. 4, an embodiment of the present invention provides a processing method applied to the above multi-source heterogeneous data collecting system 100. The steps it includes are described below with reference to Fig. 4.
Step S301: Collect data.
This step is the same as step S101; see step S101 for details.
Step S302: Clean the collected data according to a preset standard format and filter out redundant information.
This step is the same as step S102; see step S102 for details.
Step S303: Judge whether the format of the cleaned data is consistent with the preset standard format.
To guard against incomplete cleaning of the collected data, the cleaned data need to be verified by judging whether their format is consistent with the preset standard format. If the format is consistent, cleaning is complete and step S304 is performed. If it is not, cleaning is incomplete and must continue: step S302 is performed again on the inconsistent data, and this repeats until the format is consistent.
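The clean-then-verify loop between steps S302 and S303 can be sketched as follows. The schema representation (field name mapped to a Python type), the two cleaning passes, and the `max_rounds` cap are assumptions added for this example; the description only says that cleaning repeats until the format is consistent, and the cap is there so that unfixable data cannot loop forever.

```python
def conforms(record, schema):
    """Step S303: does the record match the preset standard format?"""
    return (set(record) == set(schema)
            and all(isinstance(record[k], t) for k, t in schema.items()))


def drop_unknown_fields(record, schema):
    """Cleaning pass: remove fields the format does not define."""
    return {k: v for k, v in record.items() if k in schema}


def coerce_types(record, schema):
    """Cleaning pass: convert each value to the type the format expects."""
    return {k: schema[k](v) for k, v in record.items()}


def clean_until_valid(record, schema, passes, max_rounds=5):
    """Repeat S302 and S303 until the record conforms or rounds run out."""
    for round_no in range(max_rounds):
        clean = passes[round_no % len(passes)]
        record = clean(record, schema)      # step S302
        if conforms(record, schema):        # step S303
            return record, round_no + 1     # ready for step S304
    raise ValueError("record still does not match the preset format")


schema = {"student_id": str, "score": int}
raw = {"student_id": 17, "score": "88", "extra": "x"}
cleaned, rounds = clean_until_valid(
    raw, schema, [drop_unknown_fields, coerce_types])
print(cleaned, rounds)
```

Here the first round removes the undefined field but the check still fails on types, so a second round runs before the record is passed on.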
Step S304: Classify the cleaned data to obtain classified data.
This step is the same as step S103; see step S103 for details.
Step S305: Store the classified data in the database corresponding to the classified data.
This step is the same as step S104; see step S104 for details.
Fourth embodiment
An embodiment of the present invention further provides a data summarization module 150, as shown in Fig. 5. The data summarization module 150 includes: a collecting submodule 151, a cleaning submodule 152, a judging submodule 153, a classification submodule 154, and a saving submodule 155.
The collecting submodule 151 is used to collect the data acquired by the at least one set of collecting devices and manually imported data.
The cleaning submodule 152 is used to clean the acquired data according to a preset standard format and filter out redundant information.
The judging submodule 153 is used to judge whether the format of the cleaned data is consistent with the preset standard format.
The classification submodule 154 is used to classify the cleaned data to obtain classified data. Further, as shown in Fig. 6, the classification submodule 154 includes a recognition unit 1541 and a classification unit 1542.
The recognition unit 1541 is used to identify the type of the cleaned data.
The classification unit 1542 is used to tag the identified type with a classification label to obtain classified data.
The saving submodule 155 is used to store the classified data in the database in the memory that corresponds to the classified data.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be cross-referenced.
The data summarization module 150 provided by the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiments. For brevity, where the apparatus embodiment does not mention something, refer to the corresponding content in the foregoing method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the architecture, functions and operations that may be implemented by apparatuses, methods and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each box in a block diagram and/or flowchart, and any combination of such boxes, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the function modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. For those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent substitution, improvement or the like made within the spirit and principles of the invention shall fall within the protection scope of the present invention.

Claims (10)

1. A multi-source heterogeneous data collecting system based on education big data, characterized by comprising:
a memory;
a processor;
at least one set of collecting devices, each set of collecting devices being applied in one classroom and configured to collect behavioral data of students and/or teachers during teaching;
and a data summarization module, which is stored in the memory and comprises one or more software function modules executed by the processor, the data summarization module being configured to collect, clean and classify the data acquired by the at least one set of collecting devices.
2. The multi-source heterogeneous data collecting system according to claim 1, characterized in that each set of collecting devices comprises at least one of a camera, a touch screen, an electronic whiteboard, a microphone array and a mobile terminal applied in the classroom.
3. The multi-source heterogeneous data collecting system according to claim 1, characterized in that the data acquired by the at least one set of collecting devices is at least one of audio data, image data and text data.
4. The multi-source heterogeneous data collecting system according to claim 1, characterized in that the data summarization module comprises:
a collection submodule, configured to collect the data acquired by the at least one set of collecting devices and the manually imported data;
a cleaning submodule, configured to clean the acquired data according to a preset standard format and to filter out redundant information;
a classification submodule, configured to classify the cleaned data to obtain classified data;
a saving submodule, configured to store the classified data in the database in the memory that corresponds to the classified data.
5. The multi-source heterogeneous data collecting system according to claim 4, characterized in that the classification submodule comprises:
a recognition unit, configured to identify the type of the cleaned data;
a classification unit, configured to stamp the identified type with a classification label to obtain classified data.
6. The multi-source heterogeneous data collecting system according to claim 4, characterized in that the data summarization module further comprises: a judging submodule, configured to judge whether the format of the cleaned data is consistent with the preset standard format.
7. The multi-source heterogeneous data collecting system according to claim 6, characterized in that the databases stored in the memory comprise: a Hadoop database, a MySQL database and a NoSQL database.
8. A processing method based on education big data, characterized in that it is applied to a multi-source heterogeneous data collecting system based on education big data, the multi-source heterogeneous data collecting system comprising at least one set of collecting devices, each set of collecting devices being applied in one classroom, the method comprising:
collecting data;
cleaning the collected data according to a preset standard format and filtering out redundant information;
classifying the cleaned data to obtain classified data;
storing the classified data in the database corresponding to the classified data.
9. The method according to claim 8, characterized in that classifying the cleaned data to obtain classified data comprises:
identifying the type of the cleaned data;
stamping the identified type with a classification label to obtain classified data.
10. The method according to claim 8, characterized in that, after cleaning the collected data according to the preset standard format and filtering out redundant information, the method further comprises:
judging whether the format of the cleaned data is consistent with the preset standard format.
CN201711369499.3A 2017-12-15 2017-12-15 Multi-source heterogeneous data collecting system and processing method based on education big data Pending CN108121508A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711369499.3A CN108121508A (en) 2017-12-15 2017-12-15 Multi-source heterogeneous data collecting system and processing method based on education big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711369499.3A CN108121508A (en) 2017-12-15 2017-12-15 Multi-source heterogeneous data collecting system and processing method based on education big data

Publications (1)

Publication Number Publication Date
CN108121508A true CN108121508A (en) 2018-06-05

Family

ID=62229382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711369499.3A Pending CN108121508A (en) 2017-12-15 2017-12-15 Multi-source heterogeneous data collecting system and processing method based on education big data

Country Status (1)

Country Link
CN (1) CN108121508A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948052A (en) * 2019-03-08 2019-06-28 上海七牛信息技术有限公司 A kind of internet information filtering auditing system, method and device
CN110995639A (en) * 2019-08-30 2020-04-10 深圳精匠云创科技有限公司 Data transmission method
CN111488015A (en) * 2020-03-19 2020-08-04 成都理工大学 Temperature and humidity control method based on ARM11 platform
CN111784309A (en) * 2020-07-17 2020-10-16 了信信息科技(上海)有限公司 Data management platform and method for medicine research and development field
CN111949850A (en) * 2020-08-14 2020-11-17 北京锐安科技有限公司 Multi-source data acquisition method, device, equipment and storage medium
CN112905580A (en) * 2021-03-19 2021-06-04 贵州航天云网科技有限公司 Multi-source heterogeneous data fusion system and method based on industrial big data
CN113407604A (en) * 2021-05-21 2021-09-17 上汽通用五菱汽车股份有限公司 Data integration method, system and computer readable storage medium
CN113449326A (en) * 2021-08-30 2021-09-28 北京博睿天扬科技有限公司 Industrial big data analysis system based on multi-source heterogeneous data processing
CN117407381A (en) * 2023-09-26 2024-01-16 陕西小保当矿业有限公司 Real-time processing method and device for big data of mine industry

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956015A (en) * 2016-04-22 2016-09-21 四川中软科技有限公司 Service platform integration method based on big data
CN106296498A (en) * 2015-05-21 2017-01-04 中兴通讯股份有限公司 Data processing method and device
CN106447561A (en) * 2016-10-08 2017-02-22 华中师范大学 Dynamic visualization method and system based on big education data
CN106446255A (en) * 2016-10-18 2017-02-22 安徽天达网络科技有限公司 Data processing method based on cloud server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180605