CN108121508A - Multi-source heterogeneous data collecting system and processing method based on education big data - Google Patents
- Publication number
- CN108121508A (application CN201711369499.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- collecting device
- source heterogeneous
- collecting system
- cleaning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
Abstract
The present invention relates to a multi-source heterogeneous data collecting system and processing method based on education big data, and belongs to the technical field of data processing. The multi-source heterogeneous data collecting system includes at least one set of collecting devices, a memory, a processor and a data summarization module. Each set of collecting devices is deployed in one classroom and is used to gather behavioral data of students and/or teachers during teaching. The data summarization module is stored in the memory and includes one or more software function modules executed by the processor; it collects, cleans and classifies the data gathered by the at least one set of collecting devices. While cleaning and classifying the data gathered by each set of collecting devices, the data summarization module organizes data of varied structure and disordered content into unified data according to a given format and filters out redundant information, which ensures data quality at the source and improves the efficiency and reliability of subsequent analysis.
Description
Technical field
The invention belongs to the technical field of data processing, and in particular relates to a multi-source heterogeneous data collecting system and processing method based on education big data.
Background
With the development of science and technology, the Internet of Things (IoT) has become one of today's hot topics, and many well-known companies worldwide have invested in IoT research. At the same time, digital campus construction is the foundation of education informatization, and the demands of informatization drive the development and renewal of related technologies. Among these, big data is currently one of the most popular technologies and capabilities: it is the ability to find meaningful associations in massive, multi-dimensional, heterogeneous data, to uncover the patterns by which things change, and to accurately predict development trends. Education big data is generated directly by educational activities; how to obtain these data and mine the relationships among them is the most important foundation.
With the rapid development of electronics and wireless communication, concepts such as the "smart classroom" and the "smart city" have emerged, and they represent a trend in technological development. In the "smart classroom", the classroom is an important place of education: it is the place students and teachers care about most and one of the most important places for data acquisition. Common educational data acquisition relies mainly on four classes of technology: IoT sensing, video recording, image recognition, and platform acquisition. The "smart classroom" is one example of the application of these four classes.
However, current "smart classroom" solutions still have a number of shortcomings. Chiefly, the acquired data are not comprehensive enough to cover every aspect of teaching accurately. In addition, after obtaining data, the collecting device transmits them directly to downstream equipment for simple analysis, which merely presents basic associations among the data as charts and/or text, so the functionality is rather limited.
Summary of the invention
In view of this, the object of the present invention is to provide a multi-source heterogeneous data collecting system and processing method based on education big data that effectively address the above problems.
The embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a multi-source heterogeneous data collecting system based on education big data, including at least one set of collecting devices, a memory, a processor and a data summarization module. Each set of collecting devices is deployed in one classroom and is used to gather behavioral data of students and/or teachers during teaching. The data summarization module is stored in the memory and includes one or more software function modules executed by the processor; it is used to collect, clean and classify the data gathered by the at least one set of collecting devices.
In a preferred embodiment of the present invention, each set of collecting devices includes at least one of the following deployed in the classroom: a camera, a touch screen, an electronic whiteboard, a microphone array and a mobile terminal.
In a preferred embodiment of the present invention, the data gathered by the at least one set of collecting devices are at least one of voice data, image data and text data.
In a preferred embodiment of the present invention, the data summarization module includes: a collection submodule, for collecting the data gathered by the at least one set of collecting devices and manually imported data; a cleaning submodule, for cleaning the acquired data according to a preset standard format and filtering out redundant information; a classification submodule, for classifying the cleaned data to obtain classified data; and a saving submodule, for storing the classified data in the database in the memory that corresponds to the classified data.
In a preferred embodiment of the present invention, the classification submodule includes: a recognition unit, for identifying the type of the cleaned data; and a classification unit, for attaching a classification label according to the identified type to obtain classified data.
In a preferred embodiment of the present invention, the data summarization module further includes a judging submodule, for judging whether the format of the cleaned data is consistent with the preset standard format.
In a preferred embodiment of the present invention, the databases stored in the memory include a Hadoop database, a MySQL database and a NoSQL database.
In a second aspect, an embodiment of the present invention further provides a processing method based on education big data, applied to a multi-source heterogeneous data collecting system based on education big data. The multi-source heterogeneous data collecting system includes at least one set of collecting devices, each set being deployed in one classroom. The method includes: collecting the data gathered by the at least one set of collecting devices and manually imported data; cleaning the collected data according to a preset standard format and filtering out redundant information; classifying the cleaned data to obtain classified data; and storing the classified data in the database corresponding to the classified data.
In a preferred embodiment of the present invention, classifying the cleaned data to obtain classified data includes: identifying the type of the cleaned data; and attaching a classification label according to the identified type to obtain classified data.
In a preferred embodiment of the present invention, after the collected data are cleaned according to the preset standard format and redundant information is filtered out, the method further includes: judging whether the format of the cleaned data is consistent with the preset standard format.
In the multi-source heterogeneous data collecting system and processing method based on education big data provided by the embodiments of the present invention, the multi-source heterogeneous data collecting system includes at least one set of collecting devices, a memory and a data summarization module. Each set of collecting devices is deployed in one classroom and is used to gather behavioral data of students and/or teachers during teaching. The data gathered by each set of collecting devices are collected, cleaned and classified by the data summarization module and then stored under unified management, so that they can be analyzed further later. During cleaning and classification, data of varied structure and disordered content are organized into unified data according to a given format and redundant information is filtered out, which ensures data quality at the source and improves the efficiency and reliability of subsequent analysis.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be understood by practicing the embodiments of the present invention. The objects and other advantages of the present invention can be realized and obtained by the structures specifically pointed out in the written description, the claims and the drawings.
Description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort. The above and other objects, features and advantages of the present invention will become clearer from the drawings. Identical reference numerals denote identical parts throughout the drawings. The drawings are deliberately not drawn to scale; the emphasis is on showing the gist of the present invention.
Fig. 1 shows a structural diagram of a multi-source heterogeneous data collecting system provided by an embodiment of the present invention.
Fig. 2 shows a flowchart of a processing method provided by the first embodiment of the present invention.
Fig. 3 shows a flowchart of step S103 in Fig. 2 provided by an embodiment of the present invention.
Fig. 4 shows a flowchart of a processing method provided by the second embodiment of the present invention.
Fig. 5 shows a module diagram of a data summarization module provided by an embodiment of the present invention.
Fig. 6 shows a module diagram of a classification submodule provided by an embodiment of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. The components of the embodiments generally described and shown in the drawings may be arranged and designed in a variety of configurations.
Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
In the description of the present invention, it should be noted that the terms "first", "second", "third" and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
First embodiment
As shown in Fig. 1, an embodiment of the present invention provides a multi-source heterogeneous data collecting system 100 based on education big data. The multi-source heterogeneous data collecting system 100 includes at least one set of collecting devices 110, a memory 120, a storage controller 130, a processor 140 and a data summarization module 150.
Each set of collecting devices 110 consists of multiple components deployed in a classroom, including, for example, a camera, an electronic whiteboard, a laser pointer, a projector, a touch screen, a microphone array and a mobile terminal. Mobile terminals include, but are not limited to, smartphones, PCs, laptops, tablet computers and smart bracelets. It should be understood that the examples above are not to be construed as limiting the present invention: any sensor, peripheral or other instrument that can access the network system of the IoT-based multi-source heterogeneous data collecting system 100 should be understood as falling within the scope of the present invention.
Each set of collecting devices 110 is used to gather behavioral data of students and/or teachers during teaching. To make the gathered data comprehensive, that is, to cover every aspect of student and teacher behavior, each set of collecting devices 110 consists of multiple components built into one network system.
It should be understood that "deployed in a classroom" above is not to be construed as limiting the present invention. As an important place of education, the classroom is where the most student and teacher behavior is recorded, so it serves only as a representative location; that is, the data gathered by the collecting devices 110 are not limited to data gathered in classrooms.
The memory 120, the storage controller 130 and the processor 140 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected through one or more communication buses or signal lines. The data summarization module 150 includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or built into the operating system (OS) of the multi-source heterogeneous data collecting system 100. The processor 140 executes the executable modules stored in the memory 120, such as the software function modules or computer programs included in the data summarization module 150.
The memory 120 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory 120 stores a program, and the processor 140 executes the program after receiving an execution instruction. The method performed by the multi-source heterogeneous data collecting system 100, as defined by the flow disclosed in any of the embodiments of the present invention described below, may be applied to the processor 140 or implemented by the processor 140.
The processor 140 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor 140 may be any conventional processor.
Second embodiment
Referring to Fig. 2, an embodiment of the present invention provides a processing method applied to the above multi-source heterogeneous data collecting system 100. The steps it comprises are described below with reference to Fig. 2.
Step S101: Collect data.
The data may come from the at least one set of collecting devices described above, or from manual import. Manually imported data include student data, teacher data and school data. Student data include basic student information, campus card information, detailed student information, class information, student personality information, learning history, teacher evaluation information, detailed family information, learning experience, student records, examination information, campus card spending information, learning behavior data and the like. Teacher data include basic faculty and staff information, detailed faculty and staff information, and class information. School data include school leadership information, campus facility information, school organization information, basic school information, school building information and the like.
When acquiring data, the approach may combine the xAPI specification with acquisition methods in the traditional sense (for example, manual import). xAPI (Experience API) is a technical specification for storing and accessing learning records. Learners can learn at any time and place through data, computers, mobile terminals and social platforms, and xAPI can collect, track and record data about these learning events.
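To make the xAPI idea concrete, the following is a minimal sketch of one learning event expressed as an xAPI-style statement (actor, verb, object). The helper name `make_statement`, the e-mail address and the activity URL are hypothetical illustrations and do not come from the patent; the verb identifier follows the ADL verb naming convention.

```python
import json
from datetime import datetime, timezone


def make_statement(learner_name, learner_email, verb, activity_id, activity_name):
    """Build a minimal xAPI-style statement as a plain dictionary (illustrative only)."""
    return {
        "actor": {
            "objectType": "Agent",
            "name": learner_name,
            "mbox": f"mailto:{learner_email}",
        },
        "verb": {
            "id": f"http://adlnet.gov/expapi/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {
            "objectType": "Activity",
            "id": activity_id,
            "definition": {"name": {"en-US": activity_name}},
        },
        # Record when the learning event happened, in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical learner and activity, used only for illustration.
statement = make_statement(
    "Example Student",
    "student@example.com",
    "completed",
    "http://example.com/activities/quiz-1",
    "Quiz 1",
)
print(json.dumps(statement, indent=2))
```

In a real deployment such statements would typically be sent to a learning record store; that step is omitted here because the patent does not describe it.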
Traditional data acquisition requires the information attributes to be stored to be defined in advance, and it directly collects structured data. It is commonly used to acquire static attributes, such as basic information about students, faculty and staff, schools and families.
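As a rough illustration of this traditional, schema-first acquisition, the sketch below imports a manually prepared CSV file against a predefined attribute list. The field names in `STUDENT_FIELDS` and the function `import_students` are assumptions made for the example; the patent does not specify a schema or file format.

```python
import csv
import io

# Hypothetical predefined attributes for a student record.
STUDENT_FIELDS = ("student_id", "name", "class_name")


def import_students(csv_text, fields=STUDENT_FIELDS):
    """Read CSV text and keep only the predefined attributes of each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(fields) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Columns that are not part of the predefined schema are ignored.
    return [{f: row[f] for f in fields} for row in reader]


rows = import_students(
    "student_id,name,class_name,extra\n"
    "1,Li Lei,Class 1,x\n"
)
print(rows)
```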
Step S102: Clean the collected data according to the preset standard format and filter out redundant information.
Data obtained through the collecting devices and manually imported data are raw data in their most basic form: their structures vary and they carry much redundant information. The acquired data therefore need to be cleaned, turning data of varied structure and disordered content into data in a unified standard format, with redundant information filtered out during cleaning. The preset standard format can be set according to actual needs; for example, it may be a structured, unstructured or semi-structured standard format.
Cleaning may include missing-value cleaning, format and content cleaning, and logic-error cleaning. Missing values are the most common data problem, and there are many ways to handle them. The following steps may be used here. First, determine the scope of missing values: calculate the proportion of missing values for each field, then set a strategy for each field according to its missing proportion and importance. Next, delete fields that are not needed. Finally, fill in missing content: some missing values can be filled by inference from domain knowledge and experience, or by setting a standard as required, such as using the mean or another method. Of course, when the amount of missing data is large or the error rate is high, the data may be re-acquired or related data obtained from other channels.
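The three missing-value steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the 0.5 drop threshold, the `important` parameter and mean filling for numeric fields are assumptions chosen for the example.

```python
from statistics import mean


def missing_ratios(records, fields):
    """Return, for each field, the fraction of records where it is None or absent."""
    total = len(records) or 1  # avoid division by zero on empty input
    return {f: sum(1 for r in records if r.get(f) is None) / total for f in fields}


def clean_missing(records, fields, drop_threshold=0.5, important=()):
    """Drop sparse, unimportant fields, then fill numeric gaps with the field mean."""
    # Step 1: compute the missing ratio of every field.
    ratios = missing_ratios(records, fields)
    # Step 2: keep a field if it is complete enough or explicitly marked important.
    kept = [f for f in fields if ratios[f] <= drop_threshold or f in important]
    # Step 3: fill missing numeric values with the mean of the observed values.
    means = {}
    for f in kept:
        numbers = [r[f] for r in records if isinstance(r.get(f), (int, float))]
        if numbers:
            means[f] = mean(numbers)
    cleaned = [
        {f: (means.get(f) if r.get(f) is None else r[f]) for f in kept}
        for r in records
    ]
    return cleaned, ratios


records = [
    {"name": "A", "score": 80, "nickname": None},
    {"name": "B", "score": None, "nickname": None},
    {"name": "C", "score": 90, "nickname": "cc"},
]
cleaned, ratios = clean_missing(records, ["name", "score", "nickname"])
print(cleaned)
```

Here `nickname` is missing in two of three records and is dropped, while the missing `score` is filled with the mean of the two observed scores. Non-numeric gaps are left as `None`, since filling them requires domain knowledge.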
Log information is an important data source, and its format and content are usually consistent with the metadata description. Data collected manually or filled in by users, however, may deviate in format and content. For collected data, therefore, data of the same kind are converted to a consistent format, characters that should not appear in the content are removed, and content in a field that does not match that field is removed.
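A minimal sketch of format and content cleaning follows, assuming two hypothetical fields: a date that should be in ISO form and a name that should contain no punctuation. The function names and the chosen target formats are illustrative assumptions.

```python
import re
import unicodedata


def normalize_date(text):
    """Convert dates such as 2017/12/19 or 2017.12.19 to the form 2017-12-19.

    Returns None when the content does not look like a date at all, so the
    caller can discard content that does not belong in a date field.
    Month and day ranges are not validated in this sketch.
    """
    match = re.fullmatch(r"\s*(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})\s*", text)
    if not match:
        return None
    year, month, day = (int(group) for group in match.groups())
    return f"{year:04d}-{month:02d}-{day:02d}"


def normalize_name(text):
    """Unify character width, drop punctuation and collapse repeated spaces."""
    text = unicodedata.normalize("NFKC", text)  # e.g. full-width to half-width
    text = re.sub(r"[^\w\s]", "", text)  # remove characters that should not appear
    return " ".join(text.split())


print(normalize_date("2017/12/19"))
print(normalize_name("  Li*  Lei! "))
```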
Logic-error cleaning mainly removes problems that can be found by simple logical checks. For example, if a person's name contains a space in the middle, the system may treat it as two people, so a simple parser is used to deduplicate such entries. It also removes unreasonable values, such as a person's age of several hundred or even several thousand years; such obvious errors can be deleted or handled as missing values.
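The two logic checks described above can be sketched as follows. The record layout (`student_id`, `name`, `age`), the age limit of 120 and the choice to deduplicate on the pair of ID and name are assumptions made for illustration.

```python
def clean_logic_errors(records, max_age=120):
    """Remove in-name spaces, drop duplicates and blank out impossible ages."""
    seen = set()
    cleaned = []
    for record in records:
        row = dict(record)  # work on a copy so the input is not modified
        # A space inside a name would otherwise make one person look like two.
        row["name"] = "".join(row["name"].split())
        age = row.get("age")
        if age is not None and not 0 <= age <= max_age:
            row["age"] = None  # treat an unreasonable value as missing
        key = (row["student_id"], row["name"])
        if key in seen:
            continue  # duplicate of a record that was already kept
        seen.add(key)
        cleaned.append(row)
    return cleaned


records = [
    {"student_id": 1, "name": "Li Lei", "age": 15},
    {"student_id": 1, "name": "LiLei", "age": 15},
    {"student_id": 2, "name": "Han Meimei", "age": 300},
]
result = clean_logic_errors(records)
print(result)
```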
Step S103: Classify the cleaned data to obtain classified data.
After the cleaned data are obtained, they are classified to obtain classified data. As one implementation, this process is described with reference to the flowchart in Fig. 3.
Step S201: Identify the type of the cleaned data.
After cleaning, the data need to be classified so that they can be managed in a unified way. To classify the cleaned data, their type must first be identified, that is, which type each item belongs to. Because cleaning follows the preset standard format, the cleaned data in this embodiment comprise three kinds: structured, unstructured and semi-structured. Since each kind has different attributes, the type can be identified accordingly.
Step S202: Attach a classification label according to the identified type to obtain classified data.
After the type of the data is identified, a classification label is attached to the data to obtain classified data. For example: when the data are identified as structured, a label denoting the structured type is attached; when identified as unstructured, a label denoting the unstructured type is attached; when identified as semi-structured, a label denoting the semi-structured type is attached.
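Steps S201 and S202 can be illustrated with the sketch below. The patent does not state how the type is recognized, so the heuristics here are assumptions: a flat dictionary of scalar values counts as structured, nested or JSON-encoded content as semi-structured, and everything else (free text, raw bytes such as audio or images) as unstructured.

```python
import json

SCALARS = (str, int, float, bool, type(None))


def identify_type(item):
    """Guess whether an item is structured, semi-structured or unstructured."""
    if isinstance(item, dict):
        # A flat record with scalar values maps directly onto a table row.
        if all(isinstance(value, SCALARS) for value in item.values()):
            return "structured"
        return "semi-structured"
    if isinstance(item, str):
        try:
            parsed = json.loads(item)
        except ValueError:
            return "unstructured"  # plain free text
        if isinstance(parsed, (dict, list)):
            return "semi-structured"
        return "unstructured"
    return "unstructured"  # e.g. raw bytes from audio or images


def tag(item):
    """Attach a classification label to an item (step S202)."""
    return {"label": identify_type(item), "data": item}


for sample in ({"id": 1, "name": "A"}, '{"a": [1, 2]}', b"\x00\x01"):
    print(tag(sample)["label"])
```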
Step S104: Store the classified data in the database corresponding to the classified data.
After the classified data are obtained, they are stored in the corresponding database. The databases include a Hadoop database, a MySQL database and a NoSQL database. Unstructured data are stored in the Hadoop database, structured data in the MySQL database, and semi-structured data in the NoSQL database.
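The routing rule in step S104 can be sketched as a simple lookup table. The `InMemoryStore` class is a stand-in invented for this example so that the code runs without any database; real Hadoop, MySQL or NoSQL clients would replace it, and the `label`/`data` record layout is likewise an assumption.

```python
# Mapping from classification label to target database, as described in the text.
ROUTES = {
    "unstructured": "hadoop",
    "structured": "mysql",
    "semi-structured": "nosql",
}


class InMemoryStore:
    """Hypothetical stand-in for a real database client."""

    def __init__(self):
        self.items = []

    def save(self, item):
        self.items.append(item)


def store(classified, stores):
    """Send each labeled item to the store that matches its label."""
    for entry in classified:
        stores[ROUTES[entry["label"]]].save(entry["data"])


stores = {name: InMemoryStore() for name in ("hadoop", "mysql", "nosql")}
store(
    [
        {"label": "structured", "data": {"id": 1}},
        {"label": "unstructured", "data": b"audio-bytes"},
        {"label": "semi-structured", "data": {"log": {"event": "login"}}},
    ],
    stores,
)
print({name: len(s.items) for name, s in stores.items()})
```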
Third embodiment
As another implementation, referring to Fig. 4, an embodiment of the present invention provides a processing method applied to the above multi-source heterogeneous data collecting system 100. The steps it comprises are described below with reference to Fig. 4.
Step S301: Collect data.
This step is the same as step S101; for details, refer to step S101.
Step S302: Clean the collected data according to the preset standard format and filter out redundant information.
This step is the same as step S102; for details, refer to step S102.
Step S303: Judge whether the format of the cleaned data is consistent with the preset standard format.
To avoid incomplete cleaning of the collected data, the cleaned data need to be verified by judging whether their format is consistent with the preset standard format. If the format is consistent, cleaning is complete and step S304 is performed. If not, cleaning is incomplete and must continue: step S302 is performed again to re-clean the inconsistent data until the format is consistent.
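The clean-then-verify loop of steps S302 and S303 can be sketched as below. The text loops until the format is consistent; the `max_rounds` bound is an added assumption so that data which can never conform do not cause an endless loop. The toy `strip_brackets` and `is_number` functions are hypothetical placeholders for a real cleaner and format check.

```python
def clean_until_conformant(records, clean, conforms, max_rounds=5):
    """Re-clean only the records that still fail the format check."""
    pending = list(records)
    accepted = []
    for _ in range(max_rounds):
        if not pending:
            break
        cleaned = [clean(record) for record in pending]
        accepted.extend(record for record in cleaned if conforms(record))
        pending = [record for record in cleaned if not conforms(record)]
    return accepted, pending


def strip_brackets(text):
    """Toy cleaner: remove one pair of surrounding brackets per call."""
    if text.startswith("[") and text.endswith("]"):
        return text[1:-1]
    return text


def is_number(text):
    """Toy format check: the preset standard format is 'digits only'."""
    return text.isdigit()


accepted, rejected = clean_until_conformant(["[[7]]", "8", "x"], strip_brackets, is_number)
print(accepted, rejected)
```

Records left in `rejected` after the last round could be re-acquired or reviewed manually.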
Step S304: Classify the cleaned data to obtain classified data.
This step is the same as step S103; for details, refer to step S103.
Step S305: Store the classified data in the database corresponding to the classified data.
This step is the same as step S104; for details, refer to step S104.
Fourth embodiment
An embodiment of the present invention further provides a data summarization module 150, as shown in Fig. 5. The data summarization module 150 includes: a collection submodule 151, a cleaning submodule 152, a judging submodule 153, a classification submodule 154 and a saving submodule 155.
The collection submodule 151 is configured to collect the data acquired by the at least one set of collecting devices and manually imported data.
The cleaning submodule 152 is configured to clean the acquired data according to a preset standard format and filter out redundant information.
The judging submodule 153 is configured to judge whether the format of the cleaned data is consistent with the preset standard format.
The classification submodule 154 is configured to classify the cleaned data to obtain classified data. Further, as shown in Fig. 6, the classification submodule 154 includes: a recognition unit 1541 and a classification unit 1542.
The recognition unit 1541 is configured to identify the type of the cleaned data.
The classification unit 1542 is configured to attach a classification label according to the identified type, obtaining classified data.
The saving submodule 155 is configured to store the classified data in the database in the memory that corresponds to the classified data.
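The split of the classification submodule 154 into a recognition unit and a classification unit can be sketched as two small functions. The recognition heuristic below is purely an assumption for illustration (a `dict` is treated as structured, a string holding a JSON object or array as semi-structured, anything else as unstructured); the specification does not state how types are identified, and the names `recognize` and `tag` are hypothetical.

```python
import json


def recognize(payload):
    """Recognition unit 1541 (sketch): guess the type of cleaned data."""
    if isinstance(payload, dict):
        return "structured"            # assumed: row-like record
    if isinstance(payload, str):
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            return "unstructured"      # plain text
        if isinstance(parsed, (dict, list)):
            return "semi-structured"   # JSON document
    return "unstructured"              # bytes, audio, images, etc.


def tag(payload):
    """Classification unit 1542 (sketch): attach the recognized type as a label."""
    return {"label": recognize(payload), "data": payload}
```

The label produced by `tag` is what a saving step would use to pick the corresponding database.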
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may refer to one another.
The data summarization module 150 provided by this embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiments. For brevity, for anything not mentioned in the device embodiment, refer to the corresponding content in the foregoing method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architecture, functions and operations of apparatuses, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the function modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented as software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. For those skilled in the art, the present invention may be modified and varied in various ways. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (10)
1. A multi-source heterogeneous data collecting system based on education big data, characterized by comprising:
a memory;
a processor;
at least one set of collecting devices, each set of collecting devices being applied in one classroom and configured to collect behavioral data of students and/or teachers during teaching;
and a data summarization module, which is stored in the memory and includes one or more software function modules executed by the processor, the data summarization module being configured to collect, clean and classify the data acquired by the at least one set of collecting devices.
2. The multi-source heterogeneous data collecting system according to claim 1, characterized in that each set of collecting devices includes at least one of the following, applied in a classroom: a camera, a touch screen, an electronic whiteboard, a microphone array and a mobile terminal.
3. The multi-source heterogeneous data collecting system according to claim 1, characterized in that the data acquired by the at least one set of collecting devices is at least one of voice data, image data and text data.
4. The multi-source heterogeneous data collecting system according to claim 1, characterized in that the data summarization module includes:
a collection submodule, configured to collect the data acquired by the at least one set of collecting devices and manually imported data;
a cleaning submodule, configured to clean the acquired data according to a preset standard format and filter out redundant information;
a classification submodule, configured to classify the cleaned data to obtain classified data;
a saving submodule, configured to store the classified data in the database in the memory that corresponds to the classified data.
5. The multi-source heterogeneous data collecting system according to claim 4, characterized in that the classification submodule includes:
a recognition unit, configured to identify the type of the cleaned data;
a classification unit, configured to attach a classification label according to the identified type, obtaining classified data.
6. The multi-source heterogeneous data collecting system according to claim 4, characterized in that the data summarization module further includes: a judging submodule, configured to judge whether the format of the cleaned data is consistent with the preset standard format.
7. The multi-source heterogeneous data collecting system according to claim 6, characterized in that the databases stored in the memory include: a Hadoop database, a MySQL database and a NoSQL database.
8. A processing method based on education big data, characterized in that it is applied to a multi-source heterogeneous data collecting system based on education big data, the multi-source heterogeneous data collecting system including at least one set of collecting devices, each set of collecting devices being applied in one classroom, the method comprising:
collecting data;
cleaning the collected data according to a preset standard format and filtering out redundant information;
classifying the cleaned data to obtain classified data;
storing the classified data in the database corresponding to the classified data.
9. The method according to claim 8, characterized in that classifying the cleaned data to obtain classified data comprises:
identifying the type of the cleaned data;
attaching a classification label according to the identified type, obtaining classified data.
10. The method according to claim 8, characterized in that, after cleaning the collected data according to the preset standard format and filtering out redundant information, the method further comprises:
judging whether the format of the cleaned data is consistent with the preset standard format.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711369499.3A CN108121508A (en) | 2017-12-15 | 2017-12-15 | Multi-source heterogeneous data collecting system and processing method based on education big data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108121508A (en) | 2018-06-05 |
Family
ID=62229382
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201711369499.3A | Pending | CN108121508A (en) | 2017-12-15 | 2017-12-15 | Multi-source heterogeneous data collecting system and processing method based on education big data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108121508A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105956015A (en) * | 2016-04-22 | 2016-09-21 | 四川中软科技有限公司 | Service platform integration method based on big data |
CN106296498A (en) * | 2015-05-21 | 2017-01-04 | 中兴通讯股份有限公司 | Data processing method and device |
CN106447561A (en) * | 2016-10-08 | 2017-02-22 | 华中师范大学 | Dynamic visualization method and system based on big education data |
CN106446255A (en) * | 2016-10-18 | 2017-02-22 | 安徽天达网络科技有限公司 | Data processing method based on cloud server |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948052A (en) * | 2019-03-08 | 2019-06-28 | 上海七牛信息技术有限公司 | A kind of internet information filtering auditing system, method and device |
CN110995639A (en) * | 2019-08-30 | 2020-04-10 | 深圳精匠云创科技有限公司 | Data transmission method |
CN111488015A (en) * | 2020-03-19 | 2020-08-04 | 成都理工大学 | Temperature and humidity control method based on ARM11 platform |
CN111784309A (en) * | 2020-07-17 | 2020-10-16 | 了信信息科技(上海)有限公司 | Data management platform and method for medicine research and development field |
CN111949850A (en) * | 2020-08-14 | 2020-11-17 | 北京锐安科技有限公司 | Multi-source data acquisition method, device, equipment and storage medium |
CN111949850B (en) * | 2020-08-14 | 2024-03-22 | 北京锐安科技有限公司 | Multi-source data acquisition method, device, equipment and storage medium |
CN112905580A (en) * | 2021-03-19 | 2021-06-04 | 贵州航天云网科技有限公司 | Multi-source heterogeneous data fusion system and method based on industrial big data |
CN112905580B (en) * | 2021-03-19 | 2022-03-18 | 贵州航天云网科技有限公司 | Multi-source heterogeneous data fusion system and method based on industrial big data |
CN113407604A (en) * | 2021-05-21 | 2021-09-17 | 上汽通用五菱汽车股份有限公司 | Data integration method, system and computer readable storage medium |
CN113449326A (en) * | 2021-08-30 | 2021-09-28 | 北京博睿天扬科技有限公司 | Industrial big data analysis system based on multi-source heterogeneous data processing |
CN117407381A (en) * | 2023-09-26 | 2024-01-16 | 陕西小保当矿业有限公司 | Real-time processing method and device for big data of mine industry |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180605 |