Background Art
A DOI is a Digital Object Unique Identifier. DOI is a mechanism for identifying digital resources; the objects it covers include videos, reports, books, and the like. It provides both a mechanism for naming resources and a protocol for resolving an identifier to a specific address, and it is regarded as the optimal technology for storing and using "big data" samples in a cloud computing context.
Specifically, a DOI can take the form of a QR code, a bar code, a character code, a domain name, and so on. Uniqueness of the digital object is the characteristic feature of DOI, which makes it the "identity card" number of the digital age. A DOI consists of a prefix and a suffix separated by "/", and the prefix is further divided into two parts by ".". The prefix is assigned by the International DOI Foundation. The suffix is specified by the resource publisher; it distinguishes a single digital object and is unique.
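The prefix/suffix structure described above can be illustrated with a minimal Python sketch. The function name `split_doi`, the field names `directory` and `registrant`, and the sample string used below are assumptions made for illustration only; they are not taken from this document.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # first part of the prefix, before "." (name is an assumption)
    registrant: str  # second part of the prefix, after "." (name is an assumption)
    suffix: str      # part after "/", chosen by the resource publisher


def split_doi(doi: str) -> DoiParts:
    """Split a DOI string into the two prefix parts and the suffix."""
    # The prefix and suffix are separated by the first "/".
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError(f"missing '/' or empty suffix: {doi!r}")
    # The prefix itself is divided into two parts by the first ".".
    directory, dot, registrant = prefix.partition(".")
    if not dot or not directory or not registrant:
        raise ValueError(f"prefix must have two parts separated by '.': {doi!r}")
    return DoiParts(directory, registrant, suffix)
```

For example, `split_doi("10.1000/xyz123")` yields the parts `10`, `1000`, and `xyz123`.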
In addition, the identifier resolution system that is currently relatively mature, accepted by industry, and already in practical use is the Handle System (hereinafter "the Handle system"). It was originally developed by CNRI with funding from the U.S. DARPA, under the direction of Dr. Robert Kahn, one of the pioneers of Internet technology and a creator of the TCP/IP protocol. The Handle system attracted wide attention and acceptance after its release, and its specifications have been published by the IETF as RFC documents. The Handle system is a general-purpose distributed name service system. It includes a set of open protocols, a unique namespace, and a reference implementation of the protocols, and it provides network-based registration and resolution of unique identifiers in an efficient, extensible, and reliable manner.
The Handle system has the following notable features: 1) a secure and efficient resolution and administration mechanism, with functions such as authentication/authorization, data confidentiality, service verification, and privacy protection; 2) distributed service and management of identifiers and their attributes, independent of the physical operating environment. At present, the Handle system can provide an efficient, extensible, and open unique identifier system for applications such as digital libraries and digital publishing. The DOI identifiers described above are a proper subset of Handle identifiers. The Handle identifier is a technology analogous to IP and is another identifier standard from an inventor of TCP/IP.
Biological data here may include various types of data, such as biomedical laboratory equipment, hair, and blood samples. In the field of biological big data, the volume of biomedical data is very large and the relationships within it are disordered, which makes it hard to manage. The Digital Object Unique Identifier is the optimal technology for storing and using "big data" samples in a cloud computing context.
Several approaches to big data currently exist.
Chinese patent ZL200510112526 discloses a method of generating an identifier (ID), comprising: A. configuring the value range of the ID and dividing it into intervals according to the availability of IDs; B. numbering the intervals, using a random function to produce a random number within the range of interval numbers, and selecting the interval whose number equals that random number; then determining how many IDs the selected interval contains: if the interval contains only one ID, that ID is taken as the newly generated ID; if it contains more than one ID, a random function produces a random number within the interval's ID range, the ID corresponding to that random number is taken, and the intervals are updated. The method can produce IDs that are all-numeric, unique, random, and of configurable length. However, it only considers identifier generation for general use and does not take industry characteristics into account.
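One possible reading of the interval-based steps A and B of the cited patent is sketched below in Python. This is an illustrative interpretation, not the patented implementation; the function name `take_random_id` and the representation of intervals as inclusive `[low, high]` pairs are assumptions.

```python
import random


def take_random_id(intervals, rng=None):
    """Draw one unused ID and update the list of available intervals in place.

    `intervals` is a list of inclusive [low, high] ranges of still-available IDs
    (this representation is an assumption made for the sketch).
    """
    rng = rng or random
    if not intervals:
        raise ValueError("no IDs left")
    # Step B: pick an interval number at random.
    index = rng.randrange(len(intervals))
    low, high = intervals[index]
    if low == high:
        # Only one ID in the interval: take it and drop the interval.
        new_id = low
        del intervals[index]
    else:
        # More than one ID: pick a random ID inside the interval ...
        new_id = rng.randint(low, high)
        # ... and update the intervals so the ID cannot be drawn again.
        remaining = []
        if new_id > low:
            remaining.append([low, new_id - 1])
        if new_id < high:
            remaining.append([new_id + 1, high])
        intervals[index:index + 1] = remaining
    return new_id
```

Because every drawn ID is removed from the available intervals, repeated calls never return the same ID twice.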
In addition, Chinese patent application 201410487306.4 discloses a method and device for processing a DOI in interactive information, intended to make it more efficient for a user to obtain the information identified by a DOI. The method includes: a server obtains the Digital Object Unique Identifier (DOI) in the interactive information; the DOI is resolved to obtain the information it identifies; and that information is pushed to the client so that the client displays it in the interaction interface.
Chinese patent application 201410838339.9 discloses an information display method and device. The method includes: monitoring for a specified operation on a DOI and, when the specified operation is detected for the DOI, displaying the profile information corresponding to the DOI. It mainly provides an identifier that automatically triggers an operation (for example, playback) according to the corresponding resource (for example, a music file).
Chinese patent application 201410785058.1 discloses an information display method and device. The method includes: monitoring the display positions of a Digital Object Unique Identifier (DOI) and of other information in a page and, when the display position of the other information is found to overlap the display position of the DOI, hiding the other information. In that application, while the DOI is being scanned, JavaScript monitors the display position of the DOI in the page; if other information in the page is found to block the DOI, the other information is hidden so that the DOI is displayed clearly. In other words, if the DOI is blocked by another resource (for example, a file) when displayed, it is automatically brought to the front so that it can be displayed conveniently.
An identifier needs to satisfy at least the following requirements:
1) it is easy to print;
2) it is uniform and general;
3) the state at the time of collection can be traced.
Because each of the above technical solutions has its own shortcomings, a specification for identifiers used to manage biological data needs to be proposed, together with a supporting generation system.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an identifier for managing biological data, and a method that standardizes how identifiers are generated for biological data.
To solve the above technical problem, a method of generating an identifier for biological data comprises the following steps:
collecting biological data content and generating an identifier according to set rules, the identifier being configured to include at least:
a first mark for producing an internationally uniform code,
a second mark for recording the collection source,
a third mark for distinguishing the biological data category, and
a fourth mark for recording the collection time and producing a unique identification;
synchronizing the identifier to a database.
Preferably, the identifier is configured as:
<international Handle identifier>/<collecting institution>.<nature of the source institution of the collected biological data>.<collected biological data category>.<organism name>.<biological tissue name>.<timestamp>.<three-digit serial number>
Preferably, the first mark further includes an internationally uniform code in the form of a Handle or DOI identifier.
Preferably, the second mark further includes the collecting institution and/or the nature of the source institution of the collected biological data. This is based on the SPREC (Standard PREanalytical Code) principle. SPREC likewise defines an identifier for state, and here the collection state can be written into the identifier itself. The source of the collected biological data can be obtained from the identifier, and the data can be traced to its origin.
Preferably, the third mark further includes an organism name that uses uniform nomenclature and/or a biological tissue name that uses a custom tissue name.
Preferably, the fourth mark further includes a timestamp for recording the collection time and/or a serial number serving as the unique identification.
Preferably, the biological data content is collected by manual entry or by an embedded chip.
Preferably, the method further includes sending to a specified mailbox according to the SMTP, POP3, or HTTP protocol.
Preferably, the port that collects the biological data content communicates with the database using a socket communication protocol.
Based on the above invention, a system for generating an identifier for biological data is also provided, comprising: a terminal, a client server, and a service-end server.
The terminal is configured to collect biological data content, and an identifier is generated in the client server according to set rules,
wherein the identifier is configured to include at least:
a first mark for producing an internationally uniform code,
a second mark for recording the collection source,
a third mark for distinguishing the biological data category, and
a fourth mark for recording the collection time and producing a unique identification.
The service-end server is configured to receive the identifier.
Beneficial effects of the present invention:
1) The identifier is configured to include at least a first mark for producing an internationally uniform code, a second mark for recording the collection source, a third mark for distinguishing the biological data category, and a fourth mark for recording the collection time and producing a unique identification. An identifier generated in this way is unique. It not only allows the collection process to be traced but also uniquely identifies the collected content, so the state of the collection process is recorded. For large volumes of biological data, management by the method of the present invention gives fast access, much faster than, for example, domain-name-based DNS. The identifier of the present invention is easy to print, uniform, and general, and the unique identifier also allows the state at the time of collection to be traced, which facilitates standardized collection and later processing of big data.
2) In addition, using the Handle prefix allows network-wide access through the global Handle database: http://hdl.handle.net/.
3) Collection by embedded chip, using mobile embedded hardware, makes the collection process more convenient and the collection mode more portable. Furthermore, generating the identifier on the chip can prevent manual entry errors and tampering.
Detailed Description
The principles of the present disclosure are now described with reference to some example embodiments. It should be understood that these embodiments are described only for illustration and to help those skilled in the art understand and practice the disclosure, and do not suggest any limitation on its scope. The disclosure described herein can be implemented in various ways other than those described below.
As used herein, the term "including" and its variants are to be read as open-ended terms meaning "including but not limited to". The term "based on" is to be read as "based at least in part on". The term "one embodiment" is to be read as "at least one embodiment". The term "another embodiment" is to be read as "at least one other embodiment".
In this application, socket communication means that two programs on a network exchange data over a bidirectional communication connection, each end of which is called a socket. Establishing a network connection requires at least one pair of port numbers (sockets). First, the server listens: the server-side socket does not target any specific client socket but waits for connections and monitors the network state in real time. Second, the client requests: the client socket issues a connection request whose target is the server-side socket. To do so, the client socket must first describe the server socket it wants to connect to, giving the address and port number of the server-side socket, and then send the connection request to it. Finally, the connection is confirmed: when the server-side socket hears, that is, receives, a connection request from a client socket, it responds to the request by creating a new thread and sends a description of the server-side socket to the client. Once the client confirms this description, the connection is established. The server-side socket remains in the listening state and continues to accept connection requests from other client sockets.
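The listen, request, and confirm sequence described above can be sketched with Python's standard `socket` module. This is a minimal illustration under stated assumptions: the function names, the loopback address, the `ACK` reply, and the sample payload are hypothetical, and the sketch serves a single connection in one helper thread rather than creating a new thread per client.

```python
import socket
import threading


def serve_once(server_sock, received):
    """Accept one client, read everything it sends, then acknowledge."""
    conn, _addr = server_sock.accept()      # connection confirmed here
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:                   # client finished sending
                break
            chunks.append(chunk)
        received.append(b"".join(chunks).decode("utf-8"))
        conn.sendall(b"ACK")                # hypothetical reply


def send_identifier(host, port, identifier):
    """Client side: connect to the server socket and send one identifier."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(identifier.encode("utf-8"))
        client.shutdown(socket.SHUT_WR)     # signal end of the request
        return client.recv(1024).decode("utf-8")


def demo():
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        server.listen()                     # server listens first
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server, received))
        worker.start()
        reply = send_identifier("127.0.0.1", port, "example-identifier")
        worker.join()
    return reply, received
```

A production server would keep calling `accept()` in a loop and hand each connection to its own thread, as the paragraph above describes.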
A long connection in this application means creating and maintaining a stable, reliable connection between a client and a server. A common approach is to add an endless loop to the server program and monitor changes in the data inside the loop. When new data is found, it is output to the browser immediately and the connection is closed; after receiving the data, the browser issues a new request and enters the next cycle. This is the long-polling mode. Alternatively, a hidden iframe is embedded in the page and its src attribute is set to a request for a long connection, or an xhr request is used, so that the server can continuously push data to the client.
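The long-polling cycle described above can be illustrated without any HTTP machinery. The Python sketch below models only the waiting logic: a request blocks until new data appears or a timeout expires, and the caller then issues the next request. The class name `IdentifierFeed` and its methods are assumptions made for illustration.

```python
import threading


class IdentifierFeed:
    """In-memory stand-in for the server side of a long-polling endpoint."""

    def __init__(self):
        self._items = []
        self._cond = threading.Condition()

    def publish(self, identifier):
        """Record new data and wake up any waiting poll requests."""
        with self._cond:
            self._items.append(identifier)
            self._cond.notify_all()

    def poll(self, last_seen, timeout):
        """Block until there is data beyond position `last_seen`, or time out.

        Returns the new items and the position to pass to the next poll.
        """
        with self._cond:
            self._cond.wait_for(lambda: len(self._items) > last_seen,
                                timeout=timeout)
            new_items = self._items[last_seen:]
            return new_items, last_seen + len(new_items)
```

A client would call `poll` in a loop, passing back the position returned by the previous call; an empty result simply means the timeout expired and the next cycle should begin.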
Referring to Fig. 1, a schematic flow chart of the method of the present invention, the embodiment includes the following steps.
Step S100: collect biological data content and generate an identifier according to set rules, the identifier being configured to include at least the following.
Step S101: a first mark for producing an internationally uniform code. Preferably, in this embodiment, the first mark further includes an internationally uniform code in the form of a Handle or DOI identifier.
Step S102: a second mark for recording the collection source. Preferably, in this embodiment, the second mark further includes the collecting institution and/or the nature of the source institution of the collected biological data; based on the SPREC principle, the source of the collected biological data can be obtained from the identifier and the data can be traced to its origin.
Step S103: a third mark for distinguishing the biological data category. Preferably, in this embodiment, the third mark further includes an organism name that uses uniform nomenclature and/or a biological tissue name that uses a custom tissue name.
Step S104: a fourth mark for recording the collection time and producing a unique identification. Preferably, in this embodiment, the fourth mark further includes a timestamp for recording the collection time and/or a serial number serving as the unique identification.
Step S105: synchronize the identifier to a database.
The Handle is an international standard based on RFC 3650. An example Handle is 200.500.11926. The Handle system is a general-purpose distributed name service system that includes a set of open protocols, a unique identifier namespace, and a reference implementation of the protocols, and it provides network-based registration and resolution of unique identifiers in an efficient, extensible, and reliable manner.
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: whether the collected biological data involves administrative examination and approval.
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: the nature of the institution from which the biological data is collected.
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: the customer resource category.
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to:
Table 1
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: internally reserved code, internal test code, government agency, public institution, state-owned enterprise, private listed enterprise, private unlisted company, natural person.
Table 2
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: internally reserved code, internal test code, direct collection, shared collection, commissioned collection, and so on.
Table 3
In certain embodiments, <nature of the source institution of the collected biological data> includes, but is not limited to: internally reserved code, internal test code, single-item biological data, multi-item biological data, and so on.
Table 4
In certain embodiments, the organism name uses binomial nomenclature compatible with the Linnaean system, with spaces replaced by "_".
In certain embodiments, the biological tissue name uses a custom tissue name, for example a custom character string made up of the 26 lowercase English letters and the digits 0-9.
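The two naming rules above can be expressed as small helper functions. The following Python sketch is illustrative only; the function names `organism_field` and `is_valid_tissue_name` and the sample values are assumptions.

```python
import re

# Custom tissue names: lowercase English letters and digits 0-9 only.
_TISSUE_NAME = re.compile(r"[a-z0-9]+")


def organism_field(binomial_name: str) -> str:
    """Turn a binomial name into an identifier field by replacing spaces with '_'."""
    return "_".join(binomial_name.split())


def is_valid_tissue_name(name: str) -> bool:
    """Check that a custom tissue name uses only a-z and 0-9 and is not empty."""
    return _TISSUE_NAME.fullmatch(name) is not None
```

For example, `organism_field("Homo sapiens")` gives `Homo_sapiens`, and `is_valid_tissue_name("Blood")` is false because of the uppercase letter.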
In certain embodiments, <timestamp>.<three-digit serial number> records the collection time and provides the unique identification. For example, the timestamp has the form YYYY-MM-DD-HH-MM-SS-NN, where YYYY is the year, the first MM the month, DD the day, HH the hour, the second MM the minute, SS the second, and NN the millisecond.
Preferably, in the above steps, the identifier is set to: <international Handle identifier>/<collecting institution>.<nature of the source institution of the collected biological data>.<collected biological data category>.<organism name>.<biological tissue name>.<timestamp>.<three-digit serial number>.
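Assembling an identifier in this layout can be sketched as follows. This is a minimal Python illustration, not the implementation of the invention. The function and parameter names are assumptions, and so are the sample field values in the usage note, since the code tables are not reproduced here. One further assumption is labeled in the code: the two-character NN field is filled with hundredths of a second, because two digits cannot hold a full millisecond value.

```python
from datetime import datetime


def format_timestamp(collected_at: datetime) -> str:
    """Format a collection time as YYYY-MM-DD-HH-MM-SS-NN."""
    # ASSUMPTION: NN is rendered as two digits (hundredths of a second),
    # since the described layout reserves only two characters for it.
    fraction = collected_at.microsecond // 10000
    return f"{collected_at:%Y-%m-%d-%H-%M-%S}-{fraction:02d}"


def build_identifier(handle_prefix, institution, institution_nature,
                     data_category, organism, tissue, collected_at, serial):
    """Join the fields as <handle>/<f1>.<f2>...<timestamp>.<serial>."""
    if not 0 <= serial <= 999:
        raise ValueError("serial number must fit in three digits")
    fields = [institution, institution_nature, data_category, organism,
              tissue, format_timestamp(collected_at), f"{serial:03d}"]
    for field in fields:
        # "." and "/" are the separators, so they may not appear inside a field.
        if not field or "." in field or "/" in field:
            raise ValueError(f"invalid field: {field!r}")
    return f"{handle_prefix}/" + ".".join(fields)
```

With the example Handle `200.500.11926` and hypothetical field values `inst01`, `02`, `01`, `Homo_sapiens`, and `blood`, a sample collected at 09:30:15.12 on 1 March 2016 with serial number 7 would yield `200.500.11926/inst01.02.01.Homo_sapiens.blood.2016-03-01-09-30-15-12.007`.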
Preferably, in the above steps, the biological data content is collected by manual entry or by an embedded chip.
Preferably, in the above steps, the method further includes sending to a specified mailbox according to the SMTP, POP3, or HTTP protocol; sending to a specified mailbox increases the efficiency of data synchronization.
Preferably, in the above steps, the port that collects the biological data content communicates with the database using a socket communication protocol.
Fig. 2 is a schematic diagram of the structure of the identifier in Fig. 1. Preferably, in the above steps, the identifier is set to: <international Handle identifier>/<collecting institution>.<nature of the source institution of the collected biological data>.<collected biological data category>.<organism name>.<biological tissue name>.<timestamp>.<three-digit serial number>.
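Because the fields sit at fixed positions, the collection source and time can be read back from an identifier, which is what makes tracing possible. The Python sketch below shows one way to split an identifier into its parts; the class name, field names, and the sample value in the usage note are assumptions made for illustration.

```python
from typing import NamedTuple


class BioIdentifier(NamedTuple):
    handle: str               # international Handle identifier
    institution: str          # collecting institution
    institution_nature: str   # nature of the source institution
    data_category: str        # collected biological data category
    organism: str             # organism name, spaces written as "_"
    tissue: str               # biological tissue name
    timestamp: str            # YYYY-MM-DD-HH-MM-SS-NN
    serial: str               # three-digit serial number


def parse_identifier(identifier: str) -> BioIdentifier:
    """Split <handle>/<seven dot-separated fields> into named parts."""
    # The Handle itself contains dots, so split on "/" first.
    handle, slash, local_part = identifier.partition("/")
    if not slash or not handle:
        raise ValueError("identifier must contain '<handle>/'")
    fields = local_part.split(".")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields after '/', got {len(fields)}")
    return BioIdentifier(handle, *fields)
```

For instance, parsing `200.500.11926/inst01.02.01.Homo_sapiens.blood.2016-03-01-09-30-15-12.007` (a made-up value) gives `inst01` as the collecting institution and `007` as the serial number.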
Fig. 3 is a schematic diagram of a preferred embodiment of Fig. 1. A method of generating an identifier for biological data includes the following steps:
Step S10: manual entry.
Step S11: collection by embedded chip. As those skilled in the art will understand, the embedded chip can acquire data from biomedical laboratory equipment, hair, blood samples, and the like by means of different biochips, and the data is uploaded to a host computer after processing by the embedded chip. For example, the embedded mobile collection device may be a general-purpose handheld scanner that obtains digital information about a biological sample by scanning the biological resource sample, which makes collection more portable. As a further example, biological data may be analyzed and collected with a blood sample analyzer, a skin and hair microscopic analyzer, and so on.
Step S100: collect biological data content and generate an identifier according to set rules, the identifier being configured to include at least:
Step S101: a first mark for producing an internationally uniform code,
Step S102: a second mark for recording the collection source,
Step S103: a third mark for distinguishing the biological data category,
Step S104: a fourth mark for recording the collection time and producing a unique identification.
Step S1041: send to a specified mailbox according to the SMTP, POP3, or HTTP protocol.
Step S1042: the port that collects the biological data content communicates with the database using a socket communication protocol.
The identifier is configured as:
<international Handle identifier>/<collecting institution>.<nature of the source institution of the collected biological data>.<collected biological data category>.<organism name>.<biological tissue name>.<timestamp>.<three-digit serial number>.
For collection by embedded chip in step S11, the data can be generated by specific software running on the operating system of a Raspberry Pi. The Raspberry Pi (abbreviated RPi, or RasPi/RPI) is a credit-card-sized microcomputer designed for teaching students computer programming; its system is based on Linux. For example, it is an ARM-based microcomputer mainboard that uses an SD/MicroSD card as its storage disk and has 1, 2, or 4 USB ports and one 10/100 Ethernet port around the board (the Model A has no Ethernet port), so a keyboard, mouse, and network cable can be connected. It also has an analog video TV output and an HDMI high-definition video output. All of these components are integrated on a mainboard only slightly larger than a credit card. It has all the basic functions of a PC: connected to just a television and a keyboard, it can run spreadsheets, word processing, games, high-definition video playback, and so on. The Raspberry Pi Model B provides only the computer board, without memory, power supply, keyboard, case, or cables.
The UI of the specific software comprises a collecting-institution metadata panel and a collection remarks panel. The metadata panel covers collecting-institution attributes, the collection process, collection services, and so on. Collecting-institution attributes include, but are not limited to: the institution code, the automatically generated collection timestamp, the collection serial number, and so on. The collection process includes, but is not limited to: the collection nature, the customer source category, the customer service category, the customer resource category, and so on. Collection services include, but are not limited to: the service nature, the service item, and so on. Collection remarks include, but are not limited to: the collection text, the address of collection attachments, and so on.
Step S105: synchronize the identifier to the database.
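Step S105 does not name a particular database. As a minimal sketch, the synchronization can be illustrated with Python's built-in `sqlite3` module; the table name, column names, and function names below are assumptions, and the primary-key constraint is one way to make the database reject a duplicate identifier.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small identifier store; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS identifiers ("
        " identifier TEXT PRIMARY KEY,"
        " synced_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def sync_identifier(conn, identifier):
    """Store one identifier; return True if it was new, False if already present."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO identifiers (identifier) VALUES (?)",
            (identifier,),
        )
    return cursor.rowcount == 1
```

Calling `sync_identifier` twice with the same identifier stores it once and reports the second call as a duplicate.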
In step S1041, sending to a specified mailbox according to the SMTP, POP3, or HTTP protocol can improve user engagement. Much as a website has status updates, most users will not frequently refresh to check whether a new biological data identifier has been generated. Mail both gives users timely feedback and allows the data to be recorded promptly.
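The mail notification of step S1041 can be sketched with Python's standard `email` and `smtplib` modules. Only the SMTP path is shown. The function names, subject line, addresses, and host are placeholders chosen for illustration, and `send_notification` needs a reachable SMTP server to run.

```python
import smtplib
from email.message import EmailMessage


def build_notification(identifier, sender, recipient):
    """Compose a plain-text mail announcing a newly generated identifier."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "New biological data identifier"  # placeholder subject
    message.set_content(f"Identifier generated: {identifier}\n")
    return message


def send_notification(message, host, port=25):
    """Send the mail through an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(message)
```

Separating message construction from sending keeps the first function testable without a mail server.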
Fig. 4 is a schematic diagram of the system structure of the present invention. A system for generating an identifier for biological data includes a terminal, a client server 100, and a service-end server 200. The terminal is configured to collect biological data content, and the client server 100 generates an identifier according to set rules, wherein the identifier is configured to include at least: a first mark for producing an internationally uniform code, a second mark for recording the collection source, a third mark for distinguishing the biological data category, and a fourth mark for recording the collection time and producing a unique identification. The service-end server 200 is configured to receive the identifier. The biological data content is collected by manual entry or by an embedded chip. The client server 100 and the service-end server 200 use a long connection.
In certain embodiments, the system also includes a mail server, which sends to a specified mailbox according to the SMTP, POP3, or HTTP protocol.
In certain embodiments, the port that collects the biological data content communicates with the database using a socket communication protocol.
It should be understood that each part of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits with logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and so on.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
In general, the various embodiments of the present disclosure can be implemented in hardware or special-purpose circuits, software, logic, or any combination thereof. Some aspects can be implemented in hardware, while other aspects can be implemented in firmware or software that may be executed by a controller, microprocessor, or other computing device. Although various aspects of the disclosure are shown and described as block diagrams or flow charts, or using some other pictorial representation, it should be understood that the blocks, devices, systems, techniques, or methods described herein can be implemented, as non-limiting examples, in hardware, software, firmware, special-purpose circuits or logic, general-purpose hardware or controllers or other computing devices, or some combination thereof.
In addition, although operations are described in a particular order, this should not be understood as requiring that the operations be performed in the order shown or in sequential order, or that all the operations shown be performed, to achieve the desired result. In some cases, multitasking or parallel processing can be advantageous. Similarly, although the discussion above contains details of some specific implementations, these should not be construed as limiting the scope of the disclosure; the descriptions of features apply only to specific embodiments. Some features described in separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in a single embodiment can also be implemented separately in multiple embodiments or in any suitable sub-combination.
Although the disclosure has been described in terms of specific structural features and/or method actions, it should be understood that the disclosure defined in the appended claims is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are disclosed only as example forms of implementing the claims.