CN108694194A - Method and apparatus for constructing a data object - Google Patents

Method and apparatus for constructing a data object

Info

Publication number
CN108694194A
Authority
CN
China
Prior art keywords
data
parsing
entry
annotation
item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710227914.5A
Other languages
Chinese (zh)
Inventor
廖耀华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710227914.5A
Publication of CN108694194A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method for constructing a data object, the method comprising: obtaining structured source data; parsing each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy; applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and constructing the data object corresponding to the data entry from the extracted data items.

Description

Method and apparatus for constructing a data object
Technical field
The present invention relates to the field of computers, and in particular to a method, an apparatus, an electronic device and a readable storage medium for constructing a data object.
Background
In the course of implementing the present invention, the inventor found at least the following problems in the prior art:
Existing schemes for parsing structured source data read the source data through an IO stream, construct an entity object for each line of data parsed, and then save the data to a database. Code written this way is very hard to reuse: every file content format needs its own complete parsing flow. For example, "student" source data has three attributes, "name", "student number" and "age", whereas "school" source data has two attributes, "name" and "address". Parsing both kinds of source data therefore requires two different parsing flows, one per set of attributes. When hundreds of source data formats have to be parsed, writing a separate parser for each file is laborious and error-prone.
Besides writing a separate parser for each kind of source data, the way the parsed data is handled must also be considered. For example, parsed "student" data may need to be stored in a database while parsed "school" data needs to be stored on disk. So in addition to the parsers, specific handling programs must be written for the different parsed source data.
Furthermore, as cloud storage technology develops and spreads, more and more enterprises and individuals keep their data in remote data stores such as cloud storage. At present, source data in cloud storage is mainly read through Java IO streams. When hundreds of source data formats have to be parsed, reading one line at a time and building an object for that line is clearly very inefficient.
Summary of the invention
In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device and a readable storage medium for constructing a data object. They can batch-process source data held in cloud storage, in other remote data stores, or in local files; parse the source data dynamically using reflection-based annotations to generate the final parsed objects; and finally perform subsequent operations on the generated objects through a callback function. The whole file parsing logic can be packaged into a common code library. A user of the library only needs to predefine the format of the source data content, the object to be returned after parsing, and the callback function, for example one that saves the object to a database.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for constructing a data object is provided.
The method for constructing a data object according to one aspect of the embodiments of the present invention includes: obtaining structured source data; parsing each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy; applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and constructing the data object corresponding to the data entry from the extracted data items.
Optionally, a callback function is used to perform an operation on the constructed data object.
Optionally, the data parsing strategy includes location information indicating where the valid data entries are located in the one or more source data, and delimiter information indicating the delimiter between data items within a valid data entry.
Optionally, the step of parsing each valid data entry in the structured source data according to the data parsing strategy further includes: determining, according to the location information, one or more valid data entries in the one or more structured source data; and, for each valid data entry, determining at least one data item in the valid data entry according to the delimiter information, to obtain a parsed data set that includes the determined at least one data item.
Optionally, the annotation on the data object includes the attributes of the data object to be constructed and the position, within the parsed data set, of the data item corresponding to each attribute of the data object to be constructed.
Optionally, the step of using the annotation on the data object to extract data items from the data set further includes: extracting the data item from the parsed data set according to the position, within the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
Optionally, the step of using the annotation on the data object to extract data items from the data set includes: assigning the extracted data items to the attributes of the data object to be constructed in the order defined by the annotation on the data object, where each data object to be constructed corresponds to one valid data entry.
Optionally, the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
Optionally, the data object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
According to another aspect of the embodiments of the present invention, an apparatus for constructing a data object is also provided.
The apparatus for constructing a data object according to this aspect of the embodiments of the present invention includes: a data acquisition module for obtaining structured source data; and a parsing engine module, which includes: a strategy parsing submodule for parsing each valid data entry in the structured source data according to a predefined data parsing strategy, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy; a data item extraction submodule for applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation defining one or more attributes of the data object; and an object construction submodule for constructing the data object corresponding to the data entry from the extracted data items.
Optionally, the apparatus further includes a callback function module for using a callback function to perform an operation on the constructed data object.
Optionally, the data parsing strategy includes location information indicating where the valid data entries are located in the one or more source data, and delimiter information indicating the delimiter between data items within a valid data entry.
Optionally, the strategy parsing submodule is further configured to: determine, according to the location information, one or more valid data entries in the one or more structured source data; and, for each valid data entry, determine at least one data item in the valid data entry according to the delimiter information, to obtain a parsed data set that includes the determined at least one data item.
Optionally, the annotation on the data object includes the attributes of the data object to be constructed and the position, within the parsed data set, of the data item corresponding to each attribute of the data object to be constructed.
Optionally, the data item extraction submodule is further configured to: extract the data item from the parsed data set according to the position, within the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
Optionally, the object construction submodule is further configured to: assign the extracted data items to the attributes of the data object to be constructed in the order defined by the annotation on the data object, so as to construct the data object, where each data object to be constructed corresponds to one valid data entry.
Optionally, the structured source data is one of a plurality of structured source data; the obtaining, parsing, extracting and constructing steps are performed separately for each of the plurality of structured source data; and the obtaining, parsing, extracting and constructing steps performed for at least two of the plurality of structured source data are executed in parallel.
Optionally, the object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
According to another aspect of the embodiments of the present invention, an electronic device for constructing a data object is also provided.
The electronic device for constructing a data object according to this aspect of the embodiments of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for constructing a data object.
According to another aspect of the embodiments of the present invention, a computer-readable medium is also provided.
The computer-readable medium according to this aspect of the embodiments of the present invention has a computer program stored thereon which, when executed by a processor, implements the method for constructing a data object.
An embodiment of the above invention has the following advantages or beneficial effects: because source data is parsed and data objects are constructed using reflection-based object annotations, the technical problem that source data in different formats requires different parsers is overcome, so that different source data is handled by the same parsing logic and code reuse improves; because a parallel processing approach such as a thread pool is used, the low efficiency of line-by-line parsing is overcome and parsing efficiency improves; and because a callback function mechanism is used, subsequent processing of the generated objects is more convenient.
Further effects of the above non-conventional optional modes are described below in conjunction with the specific embodiments.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method for constructing a data object according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of parallel processing of source data implemented with a thread pool according to an embodiment of the present invention;
Fig. 3 is a flowchart of parsing the one or more source data with the parsing engine according to an embodiment of the present invention;
Fig. 4 is a flowchart of performing subsequent operations using a callback function according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the apparatus for constructing a data object according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing the terminal device or server of the embodiments of the present application.
Detailed description
Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.
Fig. 1 is a flowchart of the method for constructing a data object according to an embodiment of the present invention.
In one embodiment, the source data to be parsed is data in files held in cloud storage. Although in this embodiment the source data is downloaded from a remote data store, the parsing process according to embodiments of the present invention applies equally to local data.
In step S11 of Fig. 1, source data may first be downloaded in batches from a remote data store such as cloud storage. In this embodiment, the source data to be parsed is file data held in cloud storage, and in step S11 the files are downloaded from cloud storage in a batch. Unlike the prior art, which downloads and parses files one at a time, the files in cloud storage are downloaded in a batch for later processing. In one embodiment, the downloaded files may be saved to the server's disk, and the path of each file on disk is returned and saved in an ordered file queue. In other embodiments, the downloaded files may be saved in a cache for subsequent processing. If the source data to be parsed is a local file, nothing needs to be fetched from a remote data store and step S11 need not be performed.
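The batch download of step S11 can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not name a cloud storage API, so a plain Map of file names to bytes stands in for the remote store, and the class and method names (BatchDownloadSketch, downloadAll, newTempDir) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of step S11: save every "remote" file to disk
// and record its local path in an ordered queue for later parsing.
class BatchDownloadSketch {

    // The Map is an assumed stand-in for the remote store (name -> content).
    static Queue<Path> downloadAll(Map<String, byte[]> remoteFiles, Path targetDir) {
        Queue<Path> fileQueue = new ConcurrentLinkedQueue<>();
        for (Map.Entry<String, byte[]> remote : remoteFiles.entrySet()) {
            try {
                Path local = targetDir.resolve(remote.getKey());
                Files.write(local, remote.getValue()); // save to local disk
                fileQueue.add(local);                   // keep the path, in order
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return fileQueue;
    }

    // Helper that creates a scratch directory to download into.
    static Path newTempDir() {
        try {
            return Files.createTempDirectory("source-data");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A thread-safe queue is used here on the assumption that worker threads will later take paths from it concurrently.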
In step S12 of Fig. 1, the one or more source data may be processed in parallel. Preferably, the process of constructing data objects from one or more source data includes step S12 so that the source data is processed in parallel. In one embodiment, the one or more source data to be parsed are file data in cloud storage. After the files are downloaded in a batch from cloud storage to the local disk, the path or another identifier of each file may be saved in a file queue. In another embodiment, the files are downloaded in a batch from cloud storage into a cache, and the path or another identifier of each file is recorded in, for example, a file list. Parallel processing of the files may be implemented with various parallel computing approaches such as a thread pool, a process pool or distributed computing. In another embodiment, the one or more source data to be processed are not files but source data in another form, such as data tables, and can be handled in a similar way. Fig. 2 shows a schematic diagram of an embodiment that implements the parallel processing with a thread pool.
Fig. 2 is a schematic diagram of parallel processing of source data implemented with a thread pool according to an embodiment of the present invention. In one embodiment, after the files are downloaded in a batch from cloud storage to the local disk, a background task periodically obtains file paths from the ordered file queue and distributes them to the threads of the thread pool. As shown in the figure, thread 0 obtains the information of file 0, thread 1 obtains the information of file 1, and so on. The downloaded files 0 to n are processed concurrently in threads 0 to n of the thread pool.
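A minimal sketch of the thread-pool approach of Fig. 2, using the standard java.util.concurrent executor API. The names ParallelFileSketch, parseFile and processAll are assumptions made for illustration, and parseFile is only a placeholder for the real per-file parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each queued file path is handed to a pool thread.
class ParallelFileSketch {

    // Placeholder for the real per-file parsing; it only tags the path.
    static String parseFile(String path) {
        return "parsed:" + path;
    }

    // Submits one task per path and collects the results in submission order.
    static List<String> processAll(List<String> paths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String path : paths) {
                futures.add(pool.submit(() -> parseFile(path)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // waits for that file to finish
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel parsing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each task touches only its own file, the tasks need no shared state, which matches the description of files being handled independently in separate threads.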
In one embodiment, the downloaded files may be in a variety of formats, as shown in Table 1 and Table 2 below.
For example, personal information data (Table 1):
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
For example, product information (Table 2):
Name, price, color, specification
Brand A mobile phone, 2999, white, 5.5-inch screen
Brand B mobile phone, 5999, gold, 5.5-inch screen
Personal information and product information differ in attributes, types, units, formats and so on. In this embodiment, parallel processing is implemented with, for example, a thread pool, so files of these different formats can be parsed independently of one another in threads 0 to n.
Returning to Fig. 1, in step S13 the one or more source data are parsed by the parsing engine. The parsing process is detailed in Fig. 3.
Fig. 3 is a flowchart of parsing the one or more source data with the parsing engine according to an embodiment of the present invention, that is, the flow by which the downloaded data is parsed.
In step S31 of Fig. 3, the parsing strategy is obtained first. The parsing engine parses each line of data according to the parsing strategy; each line of data may be called a data entry. In one embodiment, the parsing strategy mainly covers the following: (1) Definition of the invalid data to skip, for example how many leading lines of the file are skipped, that is, left unparsed. (2) Definition of the delimiter, that is, which symbol separates the fields of the file content. For example, a csv file is a comma-separated file, so its content is split on commas.
In one embodiment, the parsing strategy is defined by the user and passed to the parsing engine. In another embodiment, the parsing strategy is predefined in the program, and the user follows its specification when parsing file data.
In an exemplary embodiment, the file source data to be parsed is as follows:
-----*&&&&))^^_
jklfds^&* %&%P }
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
In this embodiment, the first and second lines are useless information and need not be parsed, so the parsing strategy defines "skip the first 2 lines". For example, the specified format is "skip_lines:2", meaning that the first 2 lines are skipped. In addition, the delimiter in this example file is the comma, so the parsing strategy also defines "comma-separated". For example, the specified format is "separator:,", meaning that the file content is separated by commas. Once the parsing engine has obtained the parsing strategy, it performs the corresponding parsing actions according to the strategy's specification: for "skip_lines:2" the program automatically skips the first 2 lines, and for "separator:," it automatically splits each line on commas.
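The two strategy settings described above can be sketched as a small class. This is an assumed illustration, not the patent's implementation: the class name ParseStrategySketch and its fields are hypothetical, and only the "skip_lines" and "separator" ideas come from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of a parsing strategy with the two settings from the
// text: how many leading lines to skip and which delimiter splits a line.
class ParseStrategySketch {
    final int skipLines;     // corresponds to "skip_lines:2"
    final String separator;  // corresponds to "separator:,"

    ParseStrategySketch(int skipLines, String separator) {
        this.skipLines = skipLines;
        this.separator = separator;
    }

    // Turns raw lines into data sets: one list of trimmed items per valid line.
    List<List<String>> parse(List<String> lines) {
        List<List<String>> dataSets = new ArrayList<>();
        for (int i = skipLines; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines (an assumption of this sketch)
            }
            List<String> items = new ArrayList<>();
            // Pattern.quote treats the delimiter literally, not as a regex.
            for (String item : line.split(Pattern.quote(separator), -1)) {
                items.add(item.trim());
            }
            dataSets.add(items);
        }
        return dataSets;
    }
}
```

With skipLines set to 2 and the separator set to a comma, the example file yields three data sets, the first of which is still the header line, as in the line-by-line listing of the example.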
In step S32 of Fig. 3, each line of data is read by the line parser. In one embodiment, after the parsing engine obtains the parsing strategy, it calls the line parser. The line parser reads each line of the file and hands it to the parsing engine, which parses the line according to the parsing strategy.
Continuing the example above, in this file:
Name, age, education, occupation
Zhang San, 25, undergraduate, teacher
Li Si, 21, training, driver
each line of data is read by the line parser:
The first line is: (Name, age, education, occupation)
The second line is: (Zhang San, 25, undergraduate, teacher)
The third line is: (Li Si, 21, training, driver)
In step S33 of Fig. 3, the parsing engine parses each line of data according to the user-defined annotation on the data object, converting the line into the data items that correspond to the attributes of the object to be built. In one embodiment, the annotation on the data object is defined by labeling each field of the object with the ordinal position at which it appears in the text. In one embodiment, the object to be built is a Java object, and the annotation on the data object is an object annotation based on the Java reflection mechanism. Object annotations let the user define the attributes of the object to be constructed, so that the valid object attribute fields can be extracted.
The term "annotation" refers to metadata attached to code that tools can parse and use at compile time or run time, serving to explain or configure. An annotation does not directly affect the semantics of the code, but it can be treated as a tool or class library of the program and can in turn influence the semantics of the running program. For example, a Java annotation (Annotation) is like a tag: adding an annotation to a program amounts to tagging it. The javac compiler, development tools and other programs can use reflection to find out which tags are present on classes and their elements, and perform the corresponding operations according to those tags.
In step S34 of Fig. 3, the parsed data items are assigned to the attributes of the object to be constructed, thereby constructing the object.
In one embodiment, for example, each line of the file source data is to be converted into a student (Student) object, which has two attributes, name (name) and age (age). The file source data likewise contains students' names and ages, separated by commas. To convert the content of the file source data into the name and age of Student objects, an ordinal annotation must be added to each attribute of the Student object. @Column(order=0) indicates that the value of the attribute corresponds to the 1st column of the file. The content of the file source data is as follows:
Name, age
Zhang San, 19
Li Si, 15
Wang Wu, 21
The parsing engine interprets the @Column(order=0) annotation through the reflection mechanism. The basic process is:
First, the parsing engine obtains the second line of data (Zhang San, 19) from the file source data through the line parser, and then obtains all attributes (name and age) of the Student object;
Next, the parsing engine obtains the annotation @Column(order=0) on the attribute name and reads order as 0 from the annotation, meaning that the first-column value "Zhang San" is to be taken from the data (Zhang San, 19) and assigned to name;
Next, it obtains the annotation @Column(order=1) on the attribute age and reads order as 1 from the annotation, meaning that the second-column value "19" is to be taken from the data (Zhang San, 19) and assigned to age;
A Student object is thus generated, whose name is Zhang San and whose age is 19.
In one embodiment, where the parsing process is implemented in program code written in Java, the Java reflection mechanism is used. Continuing the Student example, the annotated Java class is as follows:
class Student {
    @Column(order = 0)  // value comes from the 1st column of the line
    String name;

    @Column(order = 1)  // value comes from the 2nd column of the line
    int age;
}
With Java reflection used in this way, file source data of any format can be converted into the Java objects the user wants.
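The reflection steps walked through for the Student example can be sketched as below. The patent shows only the use of @Column, so the annotation's declaration, the class ReflectiveBuilderSketch and its build method are assumptions made for this sketch; Student is repeated so the block compiles on its own, and only String and int fields are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Assumed declaration of the @Column annotation used in the text.
@Retention(RetentionPolicy.RUNTIME) // must be visible to reflection at run time
@Target(ElementType.FIELD)
@interface Column {
    int order(); // zero-based column position of the field's value
}

class Student {
    @Column(order = 0) String name;
    @Column(order = 1) int age;
}

// Hypothetical builder: fills annotated fields from one parsed data set.
class ReflectiveBuilderSketch {
    static <T> T build(Class<T> type, String[] items) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column == null) {
                    continue; // fields without the annotation are left alone
                }
                String raw = items[column.order()].trim();
                field.setAccessible(true);
                if (field.getType() == int.class) {
                    field.setInt(target, Integer.parseInt(raw));
                } else {
                    field.set(target, raw); // this sketch assumes a String field
                }
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot build " + type.getName(), e);
        }
    }
}
```

Calling ReflectiveBuilderSketch.build(Student.class, new String[] {"Zhang San", "19"}) follows the same path as the description: order 0 selects the name, order 1 selects the age. The same builder would work for any other class whose fields carry the annotation, which is the code reuse the text is after.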
Returning to Fig. 1, in step S14 subsequent operations are performed on the constructed objects using a callback function. This step is detailed in Fig. 4.
Fig. 4 is a flowchart of performing subsequent operations using a callback function according to an embodiment of the present invention.
In step S41 of Fig. 4, the user-defined callback function is called once parsing is complete. The callback function is a user-defined function that is called automatically after the file source data has been parsed.
In one embodiment, the callback function interface is defined by the parsing engine, while its implementation is defined by the user. The main purpose of the callback function is to hide the parsing of the file source data from the user while leaving the handling of the data visible. Users can define callback functions flexibly to meet their various needs. When the parsing engine finishes parsing, it has obtained the object corresponding to each line of data. In one embodiment, the objects may be stored in an array.
Continuing the Student example, parsing yields three Student objects, which can be stored in an object array:
(Zhang San, 19)
(Li Si, 15)
(Wang Wu, 21)
In step S42 of Fig. 4, subsequent operations are performed on the objects obtained by parsing. As described above, user-defined logic can be executed through the callback function. For example, in one embodiment the user needs to save the parsed data to a database, and so writes the database save operation in the callback function. In another embodiment the user needs to store the parsed data in a cache, and only has to write the cache storage logic in the callback function.
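The split between an engine-defined callback interface and a user-defined implementation might look like the following sketch. The names ParseCallback, onParsed and CallbackEngineSketch are hypothetical, and the per-line conversion is passed in as a plain function rather than the annotation-driven parsing described earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical callback interface: declared by the engine, implemented by the user.
interface ParseCallback<T> {
    void onParsed(List<T> objects);
}

class CallbackEngineSketch {
    // Converts every entry to an object, then hands the whole batch to the
    // callback. What happens next (database, cache, ...) is up to the user.
    static <T> void parse(List<String> entries,
                          Function<String, T> toObject,
                          ParseCallback<T> callback) {
        List<T> objects = new ArrayList<>();
        for (String entry : entries) {
            objects.add(toObject.apply(entry));
        }
        callback.onParsed(objects); // called automatically once parsing is done
    }
}
```

A user who wants the objects in a cache would pass a callback that writes to the cache; one who wants them in a database would pass a callback that performs the save, with no change to the engine code.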
Fig. 5 is a schematic diagram of the apparatus for constructing a data object according to an embodiment of the present invention, showing its modules.
The data acquisition module 51 downloads one or more data in a batch from a remote data store, for example from cloud storage, and implements parallel processing of the downloaded data. In one embodiment, parallel processing is implemented with a thread pool. In other embodiments, it may be implemented with a process pool, a distributed system or the like.
The parsing engine module 52 implements the process of parsing the downloaded data according to the object annotations. The parsing engine module 52 includes a strategy parsing submodule 521, a line parser submodule 522, a data item extraction submodule 523 and an object construction submodule 524.
The strategy parsing submodule 521 obtains the parsing strategy and performs the parsing. In one embodiment, the parsing strategy is defined by the user and passed to the parsing engine. In another embodiment, the parsing strategy is predefined in the program, and the user follows its specification when parsing file source data.
The line parser submodule 522 reads lines of data. In one embodiment, after the parsing engine obtains the parsing strategy, it calls the line parser. The line parser reads each line of the file source data and provides it to the strategy parsing submodule 521, which parses the line according to the parsing strategy.
The data item extraction submodule 523 parses each line of data read by the line parser and uses the annotation on the data object to extract the data items corresponding to the attributes of the object to be constructed, for use as the fields of the object.
The object construction submodule 524 constructs the object from the data items extracted by the data item extraction submodule 523.
The callback function module 53 calls the user-defined callback function once parsing is complete, to perform subsequent operations on the parsed objects. The callback function is a user-defined function that is called automatically after the file source data has been parsed.
Below with reference to Fig. 6, it illustrates the computer systems 600 suitable for the terminal device for realizing the embodiment of the present application Structural schematic diagram.Terminal device shown in Fig. 6 is only an example, to the function of the embodiment of the present application and should not use model Shroud carrys out any restrictions.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse and the like; an output section 607 including, for example, a cathode ray tube (CRT) or liquid crystal display (LCD) and a loudspeaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the embodiments disclosed in the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments disclosed in the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the system of the present application are performed.
It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted over any suitable medium, including but not limited to: wireless, electric wire, optical cable, RF and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to the various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, and that module, program segment, or portion of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams or flowcharts, and combinations of boxes in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, it may be described as: a processor including a data acquisition module, a parsing engine module and a callback function module. The names of these modules do not, under certain conditions, constitute a limitation on the modules themselves; for example, the data acquisition module may also be described as "a module for obtaining structured source data".
In another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain structured source data; execute parsing, according to a predefined data parsing strategy, on each valid data entry in the structured source data, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy; apply a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation on the data object defining one or more attributes of the data object; and construct a data object corresponding to the data entry from the extracted data items.
The technical solution according to the embodiments of the present invention can, based on the reflection mechanism and using a thread pool, batch-process data in files stored in, for example, cloud storage, in files in other remote data stores, and in local files; dynamically parse the file data through annotations to generate the final parsed objects; and finally process the parsed objects through a callback function and save them to a database. This improves efficiency and increases the code reuse rate.
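Batch processing with a thread pool can be sketched in Java as follows. `BatchParser`, `parseAll` and the fixed pool size are assumptions for illustration; the application does not specify how the pool is configured or how results are collected. The sketch uses the standard `java.util.concurrent.ExecutorService` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class BatchParser {
    /** Parses every source on a fixed-size thread pool and returns results in input order. */
    static <S, R> List<R> parseAll(List<S> sources, Function<S, R> parseOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (S source : sources) {
                Callable<R> task = () -> parseOne.apply(source); // one task per source
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that source has been parsed
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while parsing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a parse task failed", e.getCause());
        } finally {
            pool.shutdown(); // release the worker threads whether or not parsing succeeded
        }
    }
}
```

Collecting the futures in submission order keeps the results aligned with the input sources even though the individual files may finish parsing in any order.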
The above product can execute the method provided by the embodiments of the present invention and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present invention.
The above specific embodiments do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations and substitutions can occur. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (20)

1. A method of constructing a data object, characterized in that the method comprises:
obtaining structured source data;
executing parsing, according to a predefined data parsing strategy, on each valid data entry in the structured source data, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy;
applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation on the data object defining one or more attributes of the data object;
constructing a data object corresponding to the data entry from the extracted data items.
2. The method of claim 1, characterized in that the method further comprises:
performing an operation on the constructed data object using a callback function.
3. The method of claim 1, characterized in that the data parsing strategy includes location information indicating the position of valid data entries in the one or more source data, and delimiter information indicating the separator between data items in the valid data entries.
4. The method of claim 3, characterized in that the step of executing parsing on each valid data entry in the structured source data according to the data parsing strategy comprises:
determining, according to the location information, one or more valid data entries in the one or more structured source data;
for each valid data entry:
determining, according to the delimiter information, at least one data item from the valid data entry, to obtain a parsed data set that includes the determined at least one data item.
5. The method of claim 1, characterized in that the annotation on the data object includes the attributes of the data object to be constructed and the positions, in the parsed data set, of the data items corresponding to the attributes of the data object to be constructed.
6. The method of claim 5, characterized in that the step of extracting data items from the data set using the annotation on the data object further comprises:
extracting the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
7. The method of claim 6, characterized in that the step of extracting data items from the data set using the annotation on the data object comprises:
assigning the extracted data items to the attributes of the data object to be constructed according to the order defined in the annotation on the data object, wherein each data object to be constructed corresponds to a respective valid data entry.
8. The method of claim 1, characterized in that the structured source data is one of a plurality of structured source data, the obtaining, parsing, extracting and constructing steps are executed separately for each of the plurality of structured source data, and the obtaining, parsing, extracting and constructing steps executed for at least two of the plurality of structured source data are executed in parallel.
9. The method of claim 1, characterized in that the data object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
10. An apparatus for constructing a data object, characterized in that the apparatus comprises:
a data acquisition module, for obtaining structured source data;
a parsing engine module, including:
a strategy parsing submodule, for executing parsing, according to a predefined data parsing strategy, on each valid data entry in the structured source data, to obtain a data set that corresponds to the data entry and conforms to the data parsing strategy;
a data item extraction submodule, for applying a predefined annotation on the data object to the data set to extract data items from the data set, the predefined annotation on the data object defining one or more attributes of the data object;
an object construction submodule, for constructing a data object corresponding to the data entry from the extracted data items.
11. The apparatus of claim 10, characterized in that the apparatus further comprises:
a callback function module, for performing an operation on the constructed data object using a callback function.
12. The apparatus of claim 10, characterized in that the data parsing strategy includes location information indicating the position of valid data entries in the one or more source data, and delimiter information indicating the separator between data items in the valid data entries.
13. The apparatus of claim 12, characterized in that the strategy parsing submodule is further used for:
determining, according to the location information, one or more valid data entries in the one or more structured source data;
for each valid data entry:
determining, according to the delimiter information, at least one data item from the valid data entry, to obtain a parsed data set that includes the determined at least one data item.
14. The apparatus of claim 10, characterized in that the annotation on the data object includes the attributes of the data object to be constructed and the positions, in the parsed data set, of the data items corresponding to the attributes of the data object to be constructed.
15. The apparatus of claim 14, characterized in that the data item extraction submodule is further used for:
extracting the data item from the parsed data set according to the position, in the parsed data set, of the data item corresponding to the attribute of the data object to be constructed.
16. The apparatus of claim 15, characterized in that the object construction submodule is further used for:
assigning the extracted data items to the attributes of the data object to be constructed, according to the order defined in the annotation on the data object, in order to construct the data object, wherein each data object to be constructed corresponds to a respective valid data entry.
17. The apparatus of claim 10, characterized in that the structured source data is one of a plurality of structured source data, the obtaining, parsing, extracting and constructing steps are executed separately for each of the plurality of structured source data, and the obtaining, parsing, extracting and constructing steps executed for at least two of the plurality of structured source data are executed in parallel.
18. The apparatus of claim 10, characterized in that the data object is a Java object, and the annotation on the data object is a Java object annotation based on the Java reflection mechanism.
19. An electronic device for constructing a data object, characterized in that it comprises:
one or more processors;
a storage device, for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any one of claims 1-9.
20. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-9.
CN201710227914.5A 2017-04-10 2017-04-10 A kind of method and apparatus of construction data object Pending CN108694194A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710227914.5A CN108694194A (en) 2017-04-10 2017-04-10 A kind of method and apparatus of construction data object

Publications (1)

Publication Number Publication Date
CN108694194A true CN108694194A (en) 2018-10-23

Family

ID=63842370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710227914.5A Pending CN108694194A (en) 2017-04-10 2017-04-10 A kind of method and apparatus of construction data object

Country Status (1)

Country Link
CN (1) CN108694194A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080082571A1 (en) * 2006-09-29 2008-04-03 Miyuki Sakai System and Method for Transforming Tabular Form Date into Structured Document
CN102611836A (en) * 2012-02-06 2012-07-25 上海理工大学 High-speed image collecting method based on Labview
CN103745010A (en) * 2014-01-28 2014-04-23 北京京东尚科信息技术有限公司 Method and device for determining object attribute value based on CSV (Comma Separated Values) file
CN104182484A (en) * 2014-08-07 2014-12-03 北京京东尚科信息技术有限公司 Method and device for realizing mapping of HBase data and Java domain objects
CN105491135A (en) * 2015-12-11 2016-04-13 小米科技有限责任公司 Data connection establishing method and device
CN106202082A (en) * 2015-04-30 2016-12-07 阿里巴巴集团控股有限公司 The method and device of built-up foundation data buffer storage
CN106302442A (en) * 2016-08-12 2017-01-04 广州慧睿思通信息科技有限公司 A kind of network communication packet analytic method based on Java language
CN106354481A (en) * 2015-07-13 2017-01-25 阿里巴巴集团控股有限公司 Method and equipment for uniform mapping of HTTP requests

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
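Spector (US): "Data Manipulation with R", 31 July 2011 *
LEONGFENG: "Dynamically parsing XML of different entity types using the dom4j package and reflection", 《HTTPS://BLOG.CSDN.NET/》 *
UUUUTAOSSIENUUUU: "Callback functions", 《HTTPS://BLOG.CSDN.NET/》 *
Zhang Xiaoming: "Computer Network Programming Technology", 31 October 2009 *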

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018824A (en) * 2018-11-28 2019-07-16 阿里巴巴集团控股有限公司 A kind of method and apparatus that domain object is converted into view object
CN112181804A (en) * 2020-08-31 2021-01-05 五八到家有限公司 Parameter checking method, equipment and storage medium
CN112181804B (en) * 2020-08-31 2023-09-08 五八到家有限公司 Parameter verification method, device and storage medium
CN112051999A (en) * 2020-09-03 2020-12-08 中国银行股份有限公司 Method and device for generating configured download file
CN112051999B (en) * 2020-09-03 2024-04-19 中国银行股份有限公司 Configurable download file generation method and device
CN113835707A (en) * 2021-09-30 2021-12-24 唯品会(广州)软件有限公司 Number making method, device, equipment and readable storage medium
CN113835707B (en) * 2021-09-30 2024-01-19 唯品会(广州)软件有限公司 Counting method, counting device, counting equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181023