CN108694194A - Method and apparatus for constructing a data object
- Publication number
- CN108694194A, CN201710227914.5A
- Authority
- CN
- China
- Prior art keywords
- data
- parsing
- entry
- note
- item
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a method for constructing a data object, characterized in that the method comprises: obtaining structured source data; parsing each valid data entry in the structured source data according to a predefined data parsing policy, so as to obtain, for each data entry, a corresponding data set that conforms to the data parsing policy; applying a predefined annotation of the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and constructing the data object corresponding to the data entry from the extracted data items.
Description
Technical field
The present invention relates to the field of computers, and in particular to a method, an apparatus, an electronic device, and a readable storage medium for constructing a data object.
Background
In the course of implementing the present invention, the inventors found that the prior art has at least the following problems.
Existing schemes for parsing structured source data read the source data through an IO stream; each time a line of data is parsed, an entity object is constructed and the data is then saved to a database. Code written this way has very poor reusability, because a complete parsing procedure has to be written for every file content format. For example, "student" source data contains the three attributes "name", "student number", and "age", whereas "school" source data contains the two attributes "name" and "address". Parsing these two kinds of source data therefore requires two different parsing procedures, one for each set of attributes. If hundreds of differently formatted sources have to be parsed, writing a separate parser for each file is a heavy workload and is error-prone.
Besides a separate parser for each source, the way each source is handled after parsing must also be considered. For example, parsed "student" data may need to be stored in a database, while parsed "school" data may need to be stored on disk. So in addition to the parser, a specific post-processing routine has to be written for each kind of parsed source data.
Furthermore, with the development and spread of cloud storage technology, more and more enterprises and individuals keep their data in remote data stores such as cloud storage. Source data in cloud storage is currently read mainly through Java IO streams. When hundreds of differently formatted sources have to be parsed, reading one line at a time and constructing an object for that line is clearly very inefficient.
Summary of the invention
In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device, and a readable storage medium for constructing a data object. They can batch-process source data held in cloud storage, in other remote data stores, or in local files; parse the source data dynamically through reflection-based annotations to generate the final objects; and then perform subsequent operations on the generated objects through a callback function. The whole file-parsing logic described above can be packaged into a common code library for users. A user only needs to predefine the format of the source data content, the object to be returned after parsing, and the callback function, for example one that saves the object to a database.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for constructing a data object is provided.
The method for constructing a data object according to this aspect comprises: obtaining structured source data; parsing each valid data entry in the structured source data according to a predefined data parsing policy, so as to obtain, for each data entry, a corresponding data set that conforms to the data parsing policy; applying a predefined annotation of the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and constructing the data object corresponding to the data entry from the extracted data items.
Optionally, an operation is performed on the constructed data object using a callback function.
Optionally, the data parsing policy includes position information indicating the position of valid data entries in the one or more source data, and separator information indicating the separator between data items within a valid data entry.
Optionally, the step of parsing each valid data entry in the structured source data according to the data parsing policy further comprises: determining one or more valid data entries in the one or more structured source data according to the position information; and, for each valid data entry, determining at least one data item from the valid data entry according to the separator information, so as to obtain a parsed data set that includes the determined at least one data item.
Optionally, the annotation of the data object includes the attributes of the data object to be constructed and the position, in the parsed data set, of the data item corresponding to each attribute of the data object to be constructed.
Optionally, the step of applying the annotation of the data object to extract data items from the data set further comprises: extracting the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
Optionally, the step of applying the annotation of the data object to extract data items from the data set comprises: assigning the extracted data items to the attributes of the data object to be constructed in the order defined in the annotation of the data object, wherein each data object to be constructed corresponds to one valid data entry.
Optionally, the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting, and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting, and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
Optionally, the data object is a Java object, and the annotation of the data object is a Java object annotation based on the Java reflection mechanism.
According to another aspect of the embodiments of the present invention, an apparatus for constructing a data object is also provided.
The apparatus for constructing a data object according to this aspect comprises: a data acquisition module for obtaining structured source data; and a parsing engine module, which includes: a policy parsing submodule for parsing each valid data entry in the structured source data according to a predefined data parsing policy, so as to obtain, for each data entry, a corresponding data set that conforms to the data parsing policy; a data item extraction submodule for applying a predefined annotation of the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and an object construction submodule for constructing the data object corresponding to the data entry from the extracted data items.
Optionally, the apparatus further comprises a callback function module for performing an operation on the constructed data object using a callback function.
Optionally, the data parsing policy includes position information indicating the position of valid data entries in the one or more source data, and separator information indicating the separator between data items within a valid data entry.
Optionally, the policy parsing submodule is further configured to: determine one or more valid data entries in the one or more structured source data according to the position information; and, for each valid data entry, determine at least one data item from the valid data entry according to the separator information, so as to obtain a parsed data set that includes the determined at least one data item.
Optionally, the annotation of the data object includes the attributes of the data object to be constructed and the position, in the parsed data set, of the data item corresponding to each attribute of the data object to be constructed.
Optionally, the data item extraction submodule is further configured to: extract the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
Optionally, the object construction submodule is further configured to: assign the extracted data items to the attributes of the data object to be constructed in the order defined in the annotation of the data object, so as to construct the data object, wherein each data object to be constructed corresponds to one valid data entry.
Optionally, the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting, and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting, and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
Optionally, the object is a Java object, and the annotation of the data object is a Java object annotation based on the Java reflection mechanism.
According to another aspect of the embodiments of the present invention, an electronic device for constructing a data object is also provided.
The electronic device for constructing a data object according to this aspect is characterized by comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for constructing a data object.
According to another aspect of the embodiments of the present invention, a computer-readable medium is also provided.
The computer-readable medium according to this aspect has a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method for constructing a data object.
One embodiment of the above invention has the following advantages or beneficial effects. Because source data is parsed and data objects are constructed by means of reflection-based object annotations, the technical problem that differently formatted source data requires different parsers is overcome, so that different source data can be handled by the same parsing logic and code reuse is improved. Because a parallel processing approach such as a thread pool is used, the inefficiency of line-by-line parsing is overcome and parsing efficiency is improved. In addition, because a callback function mechanism is used, subsequent processing of the generated objects is made more convenient.
Further effects of the above optional, non-conventional implementations are described below in conjunction with the specific embodiments.
Brief description of the drawings
The accompanying drawings are provided for a better understanding of the present invention and do not constitute an undue limitation on it. In the drawings:
Fig. 1 is a flow chart of a method for constructing a data object according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of parallel processing of source data using a thread pool according to an embodiment of the present invention;
Fig. 3 is a flow chart of parsing one or more source data using a parsing engine according to an embodiment of the present invention;
Fig. 4 is a flow chart of performing subsequent operations using a callback function according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an apparatus for constructing a data object according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present application.
Detailed description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. The description includes various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.
Fig. 1 is a flow chart of a method for constructing a data object according to an embodiment of the present invention.
In one embodiment, the source data to be parsed is data stored in files in cloud storage. Although in this embodiment the source data to be parsed is downloaded from a remote data store, the parsing process according to embodiments of the present invention is equally applicable to parsing local data.
In step S11 of Fig. 1, source data can first be downloaded in batches from a remote data store such as cloud storage. In this embodiment, the source data to be parsed is file data stored in cloud storage, and in step S11 the files are downloaded from cloud storage in batches. Unlike the prior art, in which files are downloaded and parsed one at a time, the files in cloud storage are downloaded in batches to await processing. In one embodiment, the downloaded files can be saved to the server's disk, and the path of each file on disk is returned and saved in an ordered file queue. In other embodiments, the downloaded files can be saved in a cache for subsequent processing. Alternatively, if the source data to be parsed is a local file, it does not need to be fetched from a remote data store and step S11 need not be performed.
In step S12 of Fig. 1, the one or more source data can be processed in parallel. Preferably, the process of constructing data objects for one or more source data includes step S12, so that the one or more source data are processed in parallel. In one embodiment, the one or more source data to be parsed are file data in cloud storage. After the files have been batch-downloaded from cloud storage to the local disk, the path or another identifier of each file can be saved in a file queue. In another embodiment, the files are batch-downloaded from cloud storage into a cache, and the path or another identifier of each file is recorded in, for example, a file list. Parallel processing of the files can be implemented with various parallel computing approaches such as a thread pool, a process pool, or distributed computing. In another embodiment, the one or more source data to be processed are not files but source data in another form, such as a data table, which can be processed in a similar way. A schematic diagram of an embodiment that implements parallel processing with a thread pool is shown in Fig. 2.
Fig. 2 is a schematic diagram of parallel processing of source data, here files, using a thread pool according to an embodiment of the present invention. In one embodiment, after the files have been batch-downloaded from cloud storage to the local disk, a background task periodically takes file paths from the ordered file queue and distributes them to the threads of the thread pool. As shown in the figure, thread 0 obtains the information of file 0, thread 1 obtains the information of file 1, and so on. In this way the downloaded files are processed concurrently by thread 0 through thread n of the thread pool.
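The thread-pool arrangement of Fig. 2 can be illustrated with a minimal Java sketch. The class name ParallelFileProcessor, the method names, and the placeholder processFile body are assumptions made for illustration only; the source does not specify how the pool is created or how results are collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: names are illustrative and do not come from the patent.
class ParallelFileProcessor {

    // Placeholder for "parse one file"; a real engine would parse the file at this path.
    static String processFile(String path) {
        return "parsed:" + path;
    }

    // Submits one task per queued file path and returns the results in queue order.
    static List<String> processAll(List<String> filePaths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String path : filePaths) {
                Callable<String> task = () -> processFile(path);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // waits until that file's task has finished
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("file processing failed", e);
        } finally {
            pool.shutdown(); // no further tasks are accepted after this point
        }
    }
}
```

Each file is handled by whichever pool thread is free, so files of different formats do not block one another; the fixed pool size is a design choice of this sketch, not something the source prescribes.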
In one embodiment, the downloaded files can be in various formats, for example as shown in Table 1 and Table 2 below.
Table 1: personal information data
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
Table 2: product information
Name, price, color, specification
Brand A mobile phone, 2999, white, 5.5-inch screen
Brand B mobile phone, 5999, gold, 5.5-inch screen
Personal information and product information differ in attributes, types, units, formats, and so on. In this embodiment, parallel processing is implemented with, for example, a thread pool, so that thread 0 through thread n can each parse files of these different formats independently of one another.
Returning to Fig. 1, in step S13 the one or more source data are parsed using the parsing engine. The specific parsing process is shown in Fig. 3.
Fig. 3 is a flow chart of parsing one or more source data, here the downloaded data, using the parsing engine according to an embodiment of the present invention.
In step S31 of Fig. 3, the parsing policy is obtained first. The parsing engine parses each line of data according to the parsing policy; each line of data can be referred to as a data entry. In one embodiment, the parsing policy mainly includes the following: (1) a definition of the invalid data to skip, for example how many lines at the start of the file are skipped, that is, left unparsed; (2) a definition of the separator, that is, the symbol by which the text of the file content is delimited. A csv file, for example, is a comma-separated file whose content is delimited by commas.
In one embodiment, the parsing policy is defined by the user and passed to the parsing engine. In another embodiment, the parsing policy is predefined in the program, and the user follows the specification of the parsing policy when parsing file data.
In an exemplary embodiment, the file source data to be parsed is as follows:
-----*&&&&))^^_
jklfds^&* %&%P }
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
In this embodiment, the first and second lines are useless information and need not be parsed, so the parsing policy defines "skip the first 2 lines". For example, the specified format is "skip_lines:2", which means that the first 2 lines are skipped. In addition, the separator in this example file source data is a comma, so the parsing policy also needs to define "comma-separated". For example, the specified format is "separator:,", which means that the file content is separated by commas. After the parsing engine obtains the parsing policy, it performs the corresponding parsing actions according to the policy specification: for "skip_lines:2" the program automatically skips the first 2 lines, and for "separator:," the program automatically splits the data of each line at the commas.
In step S32 of Fig. 3, each line of data is read using the line parser. In one embodiment, after the parsing engine obtains the parsing policy, it calls the line parser. The role of the line parser is to read each line of data in the file and hand it to the parsing engine, which parses that line according to the parsing policy.
Continuing the example above, the file contains:
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
The line parser reads each line of data as follows:
The first line is: (Name, age, education, occupation)
The second line is: (Zhang San, 25, undergraduate, teacher)
The third line is: (Li Si, 21, training, driver)
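The interplay of the parsing policy ("skip_lines", "separator") and the line parser described above can be sketched in Java as follows. The class names ParsePolicy and LineParsingSketch and their method signatures are hypothetical; the source names the two policy keys but does not show how the engine represents or applies them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical policy holder; the fields mirror the "skip_lines" and "separator" keys.
class ParsePolicy {
    final int skipLines;    // number of leading lines that are not parsed
    final String separator; // literal separator between data items

    ParsePolicy(int skipLines, String separator) {
        this.skipLines = skipLines;
        this.separator = separator;
    }
}

// Hypothetical line parser plus the policy-driven split performed by the engine.
class LineParsingSketch {

    // Line parser: reads every line of the source into a list.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Applies the policy: skips the leading lines, then splits each remaining
    // data entry at the separator into a data set of trimmed data items.
    static List<List<String>> parse(Reader source, ParsePolicy policy) {
        List<List<String>> dataSets = new ArrayList<>();
        List<String> lines = readLines(source);
        for (int i = policy.skipLines; i < lines.size(); i++) {
            List<String> items = new ArrayList<>();
            // Pattern.quote treats the separator as literal text, not as a regex.
            for (String item : lines.get(i).split(Pattern.quote(policy.separator), -1)) {
                items.add(item.trim());
            }
            dataSets.add(items);
        }
        return dataSets;
    }
}
```

With new ParsePolicy(2, ",") the two noise lines of the example file are dropped and every remaining line becomes one list of data items; the header line is kept as an ordinary entry, as in the walk-through above.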
In step S33 of Fig. 3, the parsing engine parses each line of data according to the user-defined annotation of the data object, converting each line into data items that correspond to the attributes of the object to be constructed. In one embodiment, the annotation of the data object is defined by labelling each field of the object with the ordinal position at which it appears in the text. In one embodiment, the object to be constructed is a Java object, and the annotation of the data object is then an object annotation based on the Java reflection mechanism. The object annotation lets the user define the attributes of the constructed object and thereby extract the valid object attribute fields.
The term "annotation" refers to meta-information attached to code, which tools can parse and use at compile time or run time for purposes of explanation or configuration. An annotation does not directly affect the semantics of the code, but it can be treated like a program tool or class library and can in turn affect the semantics of the running program. For example, a Java annotation (Annotation) is comparable to a label: adding an annotation to a program is equivalent to attaching a label to it. The javac compiler, development tools, and other programs can use reflection to find out which labels, if any, are present on a class and its various elements, and perform the corresponding operations according to those labels.
In step S34 of Fig. 3, the parsed data items are assigned to the attributes of the object to be constructed, so as to construct the object.
In one embodiment, for example, each line of data in the file source data is to be converted into a student (Student) object, and the Student object has the two attributes name (name) and age (age). The file source data likewise contains the students' name and age data, separated by a comma separator. To convert the content of the file source data into the name and age of Student objects, an ordinal annotation must be added to each attribute of the Student object. @Column(order=0) indicates that the value of the attribute corresponds to the first column of the file. The content of the file source data is as follows:
Name, age
Zhang San, 19
Li Si, 15
Wang Wu, 21
The parsing engine interprets the @Column(order=0) annotation through the reflection mechanism. The basic process is:
First, the parsing engine obtains the second line of data (Zhang San, 19) from the file source data through the line parser, and then obtains all attributes (name and age) of the Student object;
Then, the parsing engine obtains the annotation @Column(order=0) on the attribute name and reads from the annotation that order is 0, which means that the data of the first column, "Zhang San", is to be taken from the data (Zhang San, 19) and assigned to name;
Next, it obtains the annotation @Column(order=1) on the attribute age and reads from the annotation that order is 1, which means that the data of the second column, "19", is to be taken from the data (Zhang San, 19) and assigned to age;
A Student object is thus generated, whose name is Zhang San and whose age is 19.
In one embodiment, where the parsing process is implemented in program code written in the Java language, the Java reflection mechanism is used. Continuing the Student example, the annotated Java class is as follows:
class Student {
    // value taken from the first column of each data entry
    @Column(order = 0)
    String name;
    // value taken from the second column of each data entry
    @Column(order = 1)
    int age;
}
With Java reflection used in this way, file source data of whatever format can be converted into the Java objects the user wants.
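The reflection step can be made concrete with a small Java sketch. The source shows only the annotated Student class; the declaration of @Column (assumed here to have runtime retention and a single order element), the ObjectBuilder class, and its build method are hypothetical, and only String and int fields are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.List;

// Assumed declaration of the @Column annotation; RUNTIME retention is required
// so that the annotation is still visible to reflection when the program runs.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column {
    int order(); // zero-based position of the data item in the parsed data set
}

class Student {
    @Column(order = 0)
    String name;
    @Column(order = 1)
    int age;
}

// Hypothetical builder: fills annotated fields from one parsed data set.
class ObjectBuilder {
    static <T> T build(Class<T> type, List<String> items) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            T obj = ctor.newInstance();
            for (Field field : type.getDeclaredFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column == null) {
                    continue; // fields without the annotation are left untouched
                }
                String raw = items.get(column.order()).trim();
                field.setAccessible(true);
                if (field.getType() == int.class) {
                    field.setInt(obj, Integer.parseInt(raw));
                } else {
                    field.set(obj, raw); // this sketch supports String fields only here
                }
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot construct " + type.getName(), e);
        }
    }
}
```

Calling ObjectBuilder.build(Student.class, items) with the data set (Zhang San, 19) yields a Student whose name is "Zhang San" and whose age is 19. The same builder works for any class whose fields carry @Column, which is the reuse the description is after.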
Returning to Fig. 1, in step S14 subsequent operations are performed on the constructed objects using a callback function. This step is shown in Fig. 4.
Fig. 4 is a flow chart of performing subsequent operations using a callback function according to an embodiment of the present invention.
In step S41 of Fig. 4, after parsing is complete, the user-defined callback function is called. The callback function is a function defined by the user that is called automatically once the file source data has been parsed.
In one embodiment, the callback function interface is defined by the parsing engine, while the implementation of the callback function is defined by the user. The main purpose of the callback function is to hide the parsing of the file source data from the user while exposing the processing of the parsed data. The user can define the callback function flexibly to meet a variety of needs. After the parsing engine has finished parsing, the object corresponding to each line of data is obtained. In one embodiment, the objects can be stored in an array.
Continuing the Student example, parsing yields three Student objects, which can be stored in an object array:
(Zhang San, 19)
(Li Si, 15)
(Wang Wu, 21)
In step S42 of Fig. 4, subsequent operations are performed on the objects obtained by parsing. As described above, user-defined logic can be executed through the callback function. For example, in one embodiment the user needs to save the parsed data to a database after parsing, so the user writes the database save operation in the callback function. In another embodiment, the user needs to store the parsed data in a cache, so the user only needs to write the cache-storing logic in the callback function.
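The division of labour between engine and user can be sketched in Java as follows. The interface name ParseCallback, its onParsed method, and the CallbackDemo stand-in for the parsing engine are assumptions for illustration; the source states only that the engine defines the callback interface and the user supplies the implementation.

```java
import java.util.List;

// Hypothetical callback interface defined by the parsing engine.
interface ParseCallback<T> {
    // Invoked once parsing is complete, with all constructed objects.
    void onParsed(List<T> objects);
}

class CallbackDemo {
    // Stand-in for the end of the parsing engine's work: it hands the
    // constructed objects to whatever callback the user supplied.
    static <T> void finishParsing(List<T> constructed, ParseCallback<T> callback) {
        callback.onParsed(constructed);
    }
}
```

A user who wants to persist the objects would pass a callback whose body writes to a database or a cache, for example CallbackDemo.finishParsing(students, list -> repository.saveAll(list)), where repository is a hypothetical user-side object. The engine code stays the same in either case.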
Fig. 5 is a schematic diagram of the modules of an apparatus for constructing a data object according to an embodiment of the present invention.
The data acquisition module 51 is used to download one or more data in batches from a remote data store, for example from cloud storage, and to implement parallel processing of the downloaded data. In one embodiment, parallel processing is implemented with a thread pool. In other embodiments, parallel processing can be implemented with a process pool, a distributed system, or similar means.
The parsing engine module 52 is used to carry out the process of parsing the downloaded data according to the object annotation. The parsing engine module 52 includes a policy parsing submodule 521, a line parser submodule 522, a data item extraction submodule 523, and an object construction submodule 524.
The policy parsing submodule 521 is used to obtain the parsing policy and perform parsing. In one embodiment, the parsing policy is defined by the user and passed to the parsing engine. In another embodiment, the parsing policy is predefined in the program, and the user follows the specification of the parsing policy when parsing file source data.
The line parser submodule 522 is used to read line data. In one embodiment, after the parsing engine obtains the parsing policy, it calls the line parser. The role of the line parser is to read each line of data in the file source data and hand it to the policy parsing submodule 521, which parses that line according to the parsing policy.
The data item extraction submodule 523 is used to parse each line of data read by the line parser and, using the annotation of the data object, extract the data items corresponding to the attributes of the object to be constructed, for use as the fields of the object.
The object construction submodule 524 is used to construct the object from the data items extracted by the data item extraction submodule 523.
The callback function module 53 is used, after parsing is complete, to call the user-defined callback function to perform subsequent operations on the parsed objects. The callback function is a function defined by the user that is called automatically once the file source data has been parsed.
Referring now to Fig. 6, a schematic structural diagram is shown of a computer system 600 suitable for implementing a terminal device of an embodiment of the present application. The terminal device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a loudspeaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the embodiments disclosed in the present invention, the process described above with reference to the flow charts may be implemented as a computer software program. For example, the embodiments disclosed in the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flow charts. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the system of the present application are performed.
It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and such a medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including, but not limited to, wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules involved in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor, which may be described, for example, as: a processor comprising a data acquisition module, a parsing engine module, and a callback function module. The names of these modules do not, under certain circumstances, limit the modules themselves; for example, the data acquisition module may also be described as "a module for obtaining structured source data".
In another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain structured source data; parse each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy; apply a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and construct a data object corresponding to the data entry from the extracted data items.
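As an illustration only, the following Java sketch shows one way the four steps listed above could fit together. The annotation name `Column`, the classes `ParseStrategy` and `User`, the method `build`, and the sample data are hypothetical and do not appear in the application. The sketch assumes a line-oriented source in which valid entries begin at a known line and data items are separated by a literal delimiter.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class AnnotationParseSketch {

    /** Hypothetical annotation: binds an attribute to a position in the parsed data set. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        int index();
    }

    /** Hypothetical parsing strategy: location of valid entries plus delimiter information. */
    static final class ParseStrategy {
        final int firstValidLine; // 0-based index of the first valid data entry
        final String delimiter;   // literal separator between data items

        ParseStrategy(int firstValidLine, String delimiter) {
            this.firstValidLine = firstValidLine;
            this.delimiter = delimiter;
        }
    }

    /** Example data object; the annotations define its attributes. */
    static final class User {
        @Column(index = 0) String name;
        @Column(index = 1) int age;
    }

    /** Parses every valid entry and constructs one object of the given type per entry. */
    static <T> List<T> build(List<String> lines, ParseStrategy strategy, Class<T> type) {
        List<T> result = new ArrayList<>();
        try {
            for (int i = strategy.firstValidLine; i < lines.size(); i++) {
                // Step 2: split the entry into its data set according to the strategy.
                String[] items = lines.get(i).split(Pattern.quote(strategy.delimiter), -1);
                T obj = type.getDeclaredConstructor().newInstance();
                // Step 3: use the annotations to pick data items out of the data set.
                for (Field field : type.getDeclaredFields()) {
                    Column column = field.getAnnotation(Column.class);
                    if (column == null) {
                        continue; // attribute not described by the annotation
                    }
                    field.setAccessible(true);
                    String raw = items[column.index()].trim();
                    // Step 4: assign the item to the attribute (only int and String here).
                    if (field.getType() == int.class) {
                        field.setInt(obj, Integer.parseInt(raw));
                    } else {
                        field.set(obj, raw);
                    }
                }
                result.add(obj);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot construct " + type.getName(), e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Step 1: the structured source data, here an in-memory stand-in for a file.
        List<String> lines = Arrays.asList("name,age", "Alice,30", "Bob,25");
        List<User> users = build(lines, new ParseStrategy(1, ","), User.class);
        for (User user : users) {
            System.out.println(user.name + " " + user.age); // Alice 30, then Bob 25
        }
    }
}
```

Because the mapping lives in the annotations, the same `build` method could in principle serve any annotated class, which is the reuse benefit the application describes. A fuller version would need type conversion for more field types and handling for entries with too few items.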
The technical solution according to the embodiments of the present invention can, based on the reflection mechanism, use a thread pool to batch-process the data in files stored in, for example, cloud storage, files in other remote data stores, and local files; dynamically parse the data files through annotations to generate the final parsed objects; and finally process the parsed objects through a callback function and save them to a database. This improves efficiency and increases code reuse.
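The thread-pool and callback arrangement described above might look roughly like the following Java sketch, which relies only on the standard `java.util.concurrent` API. The class name `BatchParseSketch` and the method `processAll` are hypothetical. In-memory lists stand in for the cloud, remote, or local files, and the callback stands in for the step that saves each object to a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

class BatchParseSketch {

    /**
     * Parses each source on a pool thread and hands every constructed object
     * to the callback. Returns the number of objects handled.
     * The callback must be thread-safe because sources are processed in parallel.
     */
    static <T> int processAll(List<List<String>> sources,
                              Function<String, T> entryParser,
                              Consumer<T> callback,
                              int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> source : sources) {
                // One task per source, so different sources can run in parallel.
                Callable<Integer> task = () -> {
                    int handled = 0;
                    for (String entry : source) {
                        callback.accept(entryParser.apply(entry));
                        handled++;
                    }
                    return handled;
                };
                pending.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : pending) {
                total += future.get(); // wait for each source to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> sources = Arrays.asList(
                Arrays.asList("a,1", "b,2"),
                Arrays.asList("c,3"));
        Queue<String> saved = new ConcurrentLinkedQueue<>(); // stand-in for a database
        Function<String, String> parser = entry -> entry.split(",")[0];
        Consumer<String> callback = saved::add;
        int count = processAll(sources, parser, callback, 2);
        System.out.println(count + " objects handled"); // 3 objects handled
    }
}
```

Passing the callback as a parameter keeps the parsing code independent of what is done with each object afterwards, so persistence, validation, or logging can be swapped in without touching the parser.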
The above product can perform the method provided by the embodiments of the present invention, and has the functional modules and beneficial effects corresponding to performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present invention.
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions can occur. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (20)
1. A method of constructing a data object, characterized in that the method comprises:
obtaining structured source data;
parsing each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy;
applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation on the data object defining one or more attributes of the data object; and
constructing a data object corresponding to the data entry from the extracted data items.
2. The method of claim 1, characterized in that the method further comprises:
performing an operation on the constructed data object using a callback function.
3. The method of claim 1, characterized in that the data parsing strategy includes location information indicating the position of valid data entries in one or more pieces of the source data, and delimiter information indicating the separator between data items within a valid data entry.
4. The method of claim 3, characterized in that the step of parsing each valid data entry in the structured source data according to the data parsing strategy comprises:
determining, according to the location information, one or more valid data entries in the one or more pieces of structured source data;
for each valid data entry:
determining, according to the delimiter information, at least one data item from the valid data entry, to obtain a parsed data set comprising the determined at least one data item.
5. The method of claim 1, characterized in that the annotation on the data object includes the attributes of the data object to be constructed and the positions, in the parsed data set, of the data items corresponding to the attributes of the data object to be constructed.
6. The method of claim 5, characterized in that the step of extracting data items from the data set using the annotation on the data object further comprises:
extracting the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to an attribute of the data object to be constructed.
7. The method of claim 6, characterized in that the step of extracting data items from the data set using the annotation on the data object comprises:
assigning, in the order defined in the annotation on the data object, the extracted data items to the attributes of the data object to be constructed, wherein each data object to be constructed corresponds to a respective valid data entry.
8. The method of claim 1, characterized in that the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting, and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting, and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
9. The method of claim 1, characterized in that the data object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
10. A device for constructing a data object, characterized in that the device comprises:
a data acquisition module, configured to obtain structured source data;
a parsing engine module, comprising:
a strategy parsing submodule, configured to parse each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy;
a data item extraction submodule, configured to apply a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation on the data object defining one or more attributes of the data object; and
an object construction submodule, configured to construct a data object corresponding to the data entry from the extracted data items.
11. The device of claim 10, characterized in that the device further comprises:
a callback function module, configured to perform an operation on the constructed data object using a callback function.
12. The device of claim 10, characterized in that the data parsing strategy includes location information indicating the position of valid data entries in one or more pieces of the source data, and delimiter information indicating the separator between data items within a valid data entry.
13. The device of claim 12, characterized in that the strategy parsing submodule is further configured to:
determine, according to the location information, one or more valid data entries in the one or more pieces of structured source data;
for each valid data entry:
determine, according to the delimiter information, at least one data item from the valid data entry, to obtain a parsed data set comprising the determined at least one data item.
14. The device of claim 10, characterized in that the annotation on the data object includes the attributes of the data object to be constructed and the positions, in the parsed data set, of the data items corresponding to the attributes of the data object to be constructed.
15. The device of claim 14, characterized in that the data item extraction submodule is further configured to:
extract the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to an attribute of the data object to be constructed.
16. The device of claim 15, characterized in that the object construction submodule is further configured to:
assign, in the order defined in the annotation on the data object, the extracted data items to the attributes of the data object to be constructed so as to construct the data object, wherein each data object to be constructed corresponds to a respective valid data entry.
17. The device of claim 10, characterized in that the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting, and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting, and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
18. The device of claim 10, characterized in that the data object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
19. An electronic device for constructing a data object, characterized by comprising:
one or more processors;
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any one of claims 1-9.
20. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710227914.5A CN108694194A (en) | 2017-04-10 | 2017-04-10 | A kind of method and apparatus of construction data object |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108694194A (en) | 2018-10-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181023 |