CN107832345A

CN107832345A - Method for unique numeric identification of base station data

Info

Publication number: CN107832345A
Application number: CN201710960854.8A
Authority: CN
Inventors: 万景琨; 高山岳
Original assignee: Qianxun Position Network Co Ltd
Current assignee: Qianxun Position Network Co Ltd
Priority date: 2017-10-16
Filing date: 2017-10-16
Publication date: 2018-03-23

Abstract

The invention provides a method for unique numeric identification of base station data, comprising the following steps. Step 1: take as input the name or abbreviation of the base station, together with any other field that can serve as a unique identifier; a string-type field must have a fixed length, and its characters must come from an enumerable, restricted character set and must not repeat. Step 2: convert each fixed-length string field into a unique number. Step 3: repeat Step 2 until all string fields have been converted into unique numbers, then concatenate the converted numbers. Step 4: the numeric identifier is reversible; when a lookup needs human-readable output, the unique identifier can be converted back into the base station name and the time of occurrence. The invention is easy to implement and effectively improves retrieval efficiency. It targets the data sets that surveying and mapping base stations of all kinds upload on a schedule, reducing the traffic consumed by repeated uploads and the space wasted in data storage, while also reducing the time spent on retrieval.

Description

Method for unique numeric identification of base station data

Technical field

The present invention relates to the technical field of software development, and in particular to big-data storage, retrieval, and cleaning.

Background art

In recent years, as technology has developed, demand for precise positioning services of all kinds has grown increasingly urgent, and the volume of data that conventional base stations upload on a schedule has exploded. In big-data cleaning and retrieval, the requirements on uniquely identifying massive numbers of base stations are now comparatively high. A conventional string identifier consumes CPU resources during retrieval and yields low retrieval efficiency; even when the string identifier is indexed in a traditional database, the cost of the index rises sharply as the data volume explodes. Some systems go further and represent base station data with a purely numeric identifier, but because such an identifier is not reversible, it cannot express the information the base station originally carried. Retrieval must then query other relational tables to obtain the useful information, which raises retrieval cost and lowers retrieval efficiency.

Summary of the invention

The technical problem solved by the present invention is to turn the conventional base station data identifier into a unique and reversible numeric identifier: a single encoding pass embeds the information in the identifier, and decoding likewise recovers the basic base station information that took part in the unique identification.

The technical solution adopted by the present invention is as follows:

A method for unique numeric identification of base station data, comprising the following steps:

Step 1: take as input the name or abbreviation of the physical base station, together with any other field that can serve as a unique identifier. A string-type field must have a fixed length, and the characters that make up the string must come from an enumerable, restricted character set and must not repeat.

Step 2: convert each fixed-length string-type field into a unique numeric identifier.

Step 3: repeat Step 2 until all string fields have been converted into unique numeric identifiers, then concatenate all of the converted numeric identifiers.

Step 4: the numeric identifier is reversible. When a lookup needs human-readable output, the unique identifier can be converted back into the base station name and the broadcast time.

The beneficial effects of the present invention are as follows:

1. Retrieval efficiency is improved. With conventional database retrieval, fetching a numeric identifier still requires an additional lookup; the present invention retrieves all of the main identifying information in a single lookup.

2. Storage efficiency is improved. Encoding to a specific length reduces the storage space occupied by string-type unique identifiers.

Brief description of the drawings

Fig. 1 is a schematic flow chart of the present invention.

Detailed description of the embodiments

The present invention is further described below with reference to the drawing and an embodiment. Fig. 1 is a schematic flow chart of the method for unique numeric identification of base station data, which rests on the following definitions:

Definition 1: Determining the input data. Any fixed-length string field or numeric field defined for a base station can serve as input.

Definition 2: Converting a string field into a unique numeric identifier. For a fixed-length string, the character set can generally be assumed to be enumerable; the description below assumes that the field is a fixed-length identifier whose characters are drawn from a restricted set (a-z or A-Z). Converting a fixed-length string over a restricted character set into a unique numeric identifier amounts to treating the string as one of the permutations of its characters at that fixed length and using a number to index each permutation. The conversion uses the Cantor expansion, a special kind of hash function that is typically used to compress and store states that are arrangements of a set of elements:

$$X = a_n (n-1)! + a_{n-1} (n-2)! + \cdots + a_i (i-1)! + \cdots + a_2 \cdot 1! + a_1 \cdot 0!$$

where $a_i$ is the rank (counting from 0) of the corresponding character among the elements that have not yet appeared, and $n$ is the length of the fixed-length string.

For example:

For the base station code "SFGA" of fixed length 4, the conversion formula is:

$$X(\text{SFGA}) = a_4 \cdot 3! + a_3 \cdot 2! + a_2 \cdot 1! + a_1 \cdot 0!$$

$a_4$ corresponds to "S": determine how many elements of the restricted array [S, F, G, A] are smaller than it. Comparing ASCII values shows that three characters are smaller than S (it is element number 3, counting from 0), so $a_4 = 3$. $a_3$ corresponds to "F"; one character (A) is smaller than F, so $a_3 = 1$. $a_2$ corresponds to "G"; two characters are smaller than G, but one of them, F, has already appeared, so $a_2 = 1$. $a_1$ corresponds to "A"; no remaining character is smaller, so $a_1 = 0$.

$$X(\text{SFGA}) = 3 \cdot 3! + 1 \cdot 2! + 1 \cdot 1! + 0 \cdot 0! = 18 + 2 + 1 + 0 = 21$$
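The encoding worked through above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's own implementation: the function name `cantor_encode` is an assumption, and the rank is computed relative to the string's own characters, as in the [S, F, G, A] example.

```python
from math import factorial


def cantor_encode(s: str) -> int:
    """Return the 0-based Cantor rank of s among permutations of its characters.

    Hypothetical helper for illustration; characters must not repeat.
    """
    if len(set(s)) != len(s):
        raise ValueError("characters must not repeat")
    n = len(s)
    rank = 0
    for i, ch in enumerate(s):
        # Characters to the right of position i are exactly the ones that
        # have not appeared yet; count how many of them are smaller than ch.
        smaller_unused = sum(1 for later in s[i + 1:] if later < ch)
        rank += smaller_unused * factorial(n - 1 - i)
    return rank


print(cantor_encode("SFGA"))  # 21
```

The nested count makes this quadratic in the string length, which is negligible for abbreviations of 4 to 8 characters.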

Definition 3: The unique numeric identifier is reversible. Given the number 21 and the character set [S, F, G, A], the string is recovered as follows:

1) The encoding above counts from 0, so 21 is itself the number of permutations that precede the target string and is used directly. (If the identifier were instead defined as a 1-based rank, 1 would be subtracted first.)

2) Divide 21 by 3!: the quotient is 3 and the remainder is 3. Three characters are smaller than the first character, so the first character is S.

3) Divide the remainder 3 by 2!: the quotient is 1 and the remainder is 1. One of the remaining characters is smaller than the second character, so the second character is F.

4) Divide the remainder 1 by 1!: the quotient is 1 and the remainder is 0. Likewise, one of the remaining characters is smaller than the third character; of the remaining characters A and G, it can only be G.

5) The last remaining character can only be A.

So the string is SFGA.
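The decoding procedure can be sketched the same way. The sketch below treats the identifier as a 0-based rank, matching the encoding example in which X("SFGA") = 21, so it divides the number directly without subtracting 1. The name `cantor_decode` and the requirement that the caller supply the character set are assumptions made for illustration.

```python
from math import factorial


def cantor_decode(rank: int, charset: str) -> str:
    """Rebuild the permutation of charset that has the given 0-based rank.

    Hypothetical helper for illustration; charset must not contain repeats.
    """
    remaining = sorted(charset)
    n = len(remaining)
    if len(set(remaining)) != n:
        raise ValueError("characters must not repeat")
    if not 0 <= rank < factorial(n):
        raise ValueError("rank out of range for this character set")
    out = []
    for i in range(n - 1, -1, -1):
        # The quotient says how many unused characters are smaller than the
        # next character; the remainder carries over to the next position.
        index, rank = divmod(rank, factorial(i))
        out.append(remaining.pop(index))
    return "".join(out)


print(cantor_decode(21, "SFGA"))  # SFGA
```

Note that decoding needs the character set as well as the number: the rank alone only selects one ordering of a known set of characters.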

Definition 4: Concatenating the unique numeric identifiers of multiple strings. Multiple numeric identifiers can be concatenated into a new unique identifier by giving each field a fixed-width encoding. For example, suppose there are two string fields, the base station time and the base station code. Both can be converted into numeric identifiers according to Definition 2, and the maximum number of digits can be estimated from the restricted character set. For instance, with the enumerable array [S, F, G, A] for the base station abbreviation field, the largest numeric identifier can be bounded by 999, so the first three digits of the identifier represent the abbreviation field; similarly, the following five digits can be reserved for the time field of the base station.
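One way to realize the fixed-width concatenation is to zero-pad each numeric field to its reserved width and join the digit strings. The sketch below uses the 3-digit and 5-digit widths from the example; the function names and the sample time value 86399 are assumptions for illustration only.

```python
def splice_fields(values, widths):
    """Concatenate non-negative integers as zero-padded decimal fields."""
    if len(values) != len(widths):
        raise ValueError("one width per value is required")
    parts = []
    for value, width in zip(values, widths):
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{value} does not fit in {width} digits")
        parts.append(str(value).zfill(width))
    return "".join(parts)


def split_fields(identifier, widths):
    """Cut a concatenated identifier back into its integer fields."""
    if len(identifier) != sum(widths):
        raise ValueError("identifier length does not match the field widths")
    values, pos = [], 0
    for width in widths:
        values.append(int(identifier[pos:pos + width]))
        pos += width
    return values


WIDTHS = [3, 5]  # abbreviation field, then time field (widths from the text)
ident = splice_fields([21, 86399], WIDTHS)
print(ident)                        # 02186399
print(split_fields(ident, WIDTHS))  # [21, 86399]
```

Fixed widths are what keep the concatenation reversible: without padding, the boundary between the two fields could not be recovered from the digits alone.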

Take surveying and mapping physical base stations as an embodiment. Each base station broadcasts its status data every second, and a record is usually identified uniquely by the base station abbreviation together with the broadcast timestamp. The abbreviation is usually 4 to 8 characters from a restricted character set; encoding it with the method of Definition 2 maps every base station abbreviation to a 3- to 5-digit number. The timestamp is usually limited to 13 digits (a numeric timestamp accurate to the millisecond). Concatenating the two keeps the unique identifier of a record within 18 digits, which saves the space that storing the string would otherwise require. Following the rule of Definition 3, the field contents that originally took part in the encoding can be decoded from the numeric identifier, which saves the time of an index lookup.
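As a rough illustration of the storage claim, the sketch below assumes a 5-digit abbreviation field followed by the 13-digit millisecond timestamp and checks that the resulting 18-digit identifier fits in a signed 64-bit integer (8 bytes). The function names, the field layout, and the sample timestamp are assumptions; the source does not specify a storage type.

```python
import struct

ABBR_DIGITS = 5   # assumed width reserved for the abbreviation number
TIME_DIGITS = 13  # millisecond timestamp width, as stated in the text


def make_record_id(abbr_number: int, timestamp_ms: int) -> int:
    """Combine the two fields into one integer of at most 18 decimal digits."""
    if not 0 <= abbr_number < 10 ** ABBR_DIGITS:
        raise ValueError("abbreviation number exceeds its reserved width")
    if not 0 <= timestamp_ms < 10 ** TIME_DIGITS:
        raise ValueError("timestamp exceeds its reserved width")
    return abbr_number * 10 ** TIME_DIGITS + timestamp_ms


def split_record_id(record_id: int):
    """Recover (abbr_number, timestamp_ms) from a combined identifier."""
    return divmod(record_id, 10 ** TIME_DIGITS)


record_id = make_record_id(21, 1508112000000)
print(record_id)                   # 211508112000000
print(split_record_id(record_id))  # (21, 1508112000000)

# Any 18-digit decimal number is below 2**63, so it packs into 8 bytes.
packed = struct.pack(">q", record_id)
print(len(packed))                 # 8
```

By comparison, storing an 8-character abbreviation and a 13-digit timestamp as text takes 21 bytes in a single-byte encoding.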

The main advantages of the present invention include:

1. The method reduces storage space. If it is implemented in a language that supports tail recursion, space reuse can be improved further, lowering the space complexity of both the implementation and the storage.

2. The invention improves runtime efficiency. As described above, a single encoding improves the efficiency of searching for and locating data and lowers CPU usage; a single decoding yields the contents that took part in the encoding, avoiding unnecessary lookups.

3. The invention is easy to implement, which lowers implementation complexity and maintenance cost. The method encodes and decodes with the well-known Cantor expansion, so the barrier to implementation is low and the method is easy to test and maintain.

Although the present invention is disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Any person skilled in the art may, without departing from the spirit and scope of the invention, use the methods and technical content disclosed above to make possible variations and modifications to the technical solution. Therefore, any simple modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the invention, without departing from the content of the technical solution, falls within the scope of protection of the technical solution of the present invention.

Claims

1. A method for unique numeric identification of base station data, characterized in that it comprises the following steps:

Step 1: take a field of the base station as input;

Step 2: convert the field into a unique numeric identifier;

Step 3: repeat Step 2 until all fields have been converted into unique numeric identifiers, and concatenate all of the converted numeric identifiers;

Step 4: the numeric identifier is reversible.

2. The method for unique numeric identification of base station data according to claim 1, characterized in that the field includes a string-type field or a numeric-type field of the base station.

3. The method for unique numeric identification of base station data according to claim 2, characterized in that the string-type field includes the name or abbreviation of the base station.

4. The method for unique numeric identification of base station data according to claim 2, characterized in that the length of the string-type field is fixed, and the characters that make up the string are enumerable and do not repeat.

5. The method for unique numeric identification of base station data according to claim 4, characterized in that the string-type field is converted into a unique numeric identifier by the Cantor expansion.

6. The method for unique numeric identification of base station data according to claim 1, characterized in that, in Step 2, the base station abbreviation field and the broadcast timestamp field are converted into unique numeric identifiers.

7. The method for unique numeric identification of base station data according to claim 1, characterized in that, in Step 4, the reversibility of the numeric identifier means that the unique numeric identifier is converted back into the base station name and the broadcast time.