Background
In recent years, "big data", "machine learning" and "cloud computing" have become buzzwords in both industry and everyday life. Ubiquitous social and business activities continuously generate data of many kinds, and this data, combined with efficient and rapidly developing machine learning techniques, is driving the rapid development of a new generation of information and communication technology.
Meanwhile, as geographic information systems, satellite positioning, and communication technology mature, personal positioning and trajectory recording have become simple and fast. Location-based services are spreading rapidly, attracting growing attention, and delivering substantial economic and social benefits. In modern urban development in particular, smart cities depend on location service information everywhere, and high-quality, diversified location services make work and daily life more convenient.
Improving the quality of location services depends on efficient, high-precision positioning technology. At present, positioning technologies such as GPS, WLAN, radio frequency, and Bluetooth are applied to varying degrees in different fields, but because of inherent technical limitations, hardware cost, and similar factors, none of them suits every scenario or can be deployed universally.
The urban road identification and marking method based on drive test data takes the drive test data collected in a target area as its basic experimental data and, under complex real-world conditions, divides the target area by road; that is, each urban road is identified and marked by a group of trajectory data. Once the area containing the target has been determined, the method can combine the base station signaling data uploaded by the target to further locate the road the target is on, so that positioning is refined from an area to a line.
Data processing relies mainly on a road feature extraction algorithm, which is based on the Hough transform. The Hough transform is a classical image processing method used mainly to separate and classify geometric shapes that share certain characteristics in an image. Compared with other approaches, it is more robust to noise when searching for straight lines and circles.
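The idea behind the Hough transform for straight lines can be illustrated with a minimal sketch. It assumes the common normal-form parameterization r = x·cos(θ) + y·sin(θ), which the text does not spell out; the function name `line_distance` and the sample points are hypothetical. Points lying on the same straight line produce the same r for the angle of that line's normal, which is what allows them to be grouped.

```python
import math

def line_distance(x, y, theta_deg):
    """Signed distance r from the origin to the line through (x, y)
    whose normal makes the angle theta_deg with the x-axis.
    Assumes the normal form r = x*cos(theta) + y*sin(theta)."""
    theta = math.radians(theta_deg)
    return x * math.cos(theta) + y * math.sin(theta)

# Hypothetical sample: three points on the line y = x + 2, one point off it.
points = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0), (3.0, 0.0)]

# The normal of y = x + 2 points at 135 degrees, so the three collinear
# points share one r value while the fourth point gets a different one.
for x, y in points:
    print((x, y), round(line_distance(x, y, 135.0), 6))
```

In this sketch the three collinear points all map to r = √2, so a count of points per (θ, r) pair reveals the line.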
Detailed description
The invention is further described below with reference to the accompanying drawings.
As shown in Fig. 1, professional drive test equipment is used to collect base station signaling data in the positioning target area, with the collection trajectory following the layout of the urban roads. Each triangular mark in the figure represents one sampling point. Each sampling point contains China Mobile, China Unicom, and China Telecom network data as well as Wi-Fi data. For China Telecom 4G, the parameter information is shown in Fig. 2.
The drive test data for the collection area is stored in no particular order, so the idea of the Hough transform is applied to it to extract the road features of the area. The data extraction flow is shown in Fig. 3.
The operation steps are as follows:
Step one: determine the sample data for the large area.
Step two: taking each data point as a base point, discretize the 180° angle range into 180 values, that is, with a precision of 1°, to obtain 180 straight lines in different directions through the point, and for each angle θ calculate the distance r of the corresponding straight line from the origin.
Step three: for each (θ, r), count the number k of sampling points that satisfy (θ, r ± f), where f is a distance threshold. If k is not less than m, where m is a predefined count threshold, the group of points is considered to lie on one road, and one piece of road information is determined.
Steps one to three are repeated until all sampling points in the target area have been identified and classified, which yields the grouping information for every road in the current area. The calculation depends mainly on the choice of two thresholds, f and m. The threshold f bounds the difference between the origin distances of the straight lines drawn through the sampling points at a given angle: if the distance corresponding to a point differs from r by less than f, the point is considered to lie on the straight line determined by that angle and r, that is, to belong to the same road. The choice of f must, on the one hand, take full account of the spacing between urban roads (laterally parallel roads, intersecting roads, and so on) under the urban road construction standards of the current area. On the other hand, it must also account for the width of the central green median of a single urban road.
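The grouping procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: it assumes the points are already in a planar coordinate system, uses the normal form r = x·cos(θ) + y·sin(θ), and greedily takes the largest group at each pass before removing it (the greedy choice and the function name `extract_roads` are assumptions introduced for illustration).

```python
import math

def extract_roads(points, f, m, angle_step_deg=1):
    """Group points into roads with a Hough-style vote.

    points: list of (x, y) tuples in planar coordinates (assumption).
    f: distance threshold; m: minimum number of points for a road.
    Returns a list of (theta_deg, r, point_indices) tuples.
    """
    if f <= 0 or m < 1:
        raise ValueError("f must be positive and m must be at least 1")
    remaining = list(range(len(points)))
    roads = []
    while remaining:
        best = None
        # Step two: for every angle, compute r for every remaining point.
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            c, s = math.cos(theta), math.sin(theta)
            dist = {i: points[i][0] * c + points[i][1] * s for i in remaining}
            # Step three: count the points whose r lies within f of the base point's r.
            for base in remaining:
                group = [i for i in remaining if abs(dist[i] - dist[base]) < f]
                if best is None or len(group) > len(best[2]):
                    best = (theta_deg, dist[base], group)
        # Stop when even the largest group has fewer than m points.
        if len(best[2]) < m:
            break
        roads.append(best)
        taken = set(best[2])
        remaining = [i for i in remaining if i not in taken]
    return roads

# Hypothetical data: five points on a horizontal road, five on a vertical one.
points = [(float(x), 0.0) for x in range(5)] + [(10.0, float(y)) for y in range(1, 6)]
for theta_deg, r, group in extract_roads(points, f=0.1, m=4):
    print(theta_deg, round(r, 3), group)
```

With a 1° angle step and a narrow f, the reported angle may differ by a degree or two from the true road direction, since several neighbouring angles can capture the same group of points. The sketch rescans every angle for each extracted road, so it is suited to illustrating the logic rather than to processing large data sets.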
Urban roads may appear to intersect in a complicated way, but they follow definite construction standards. From the road network densities for large and medium-sized cities recommended by China's urban road traffic planning and design standard (GB 50220-95), the grid spacing of roads of each class can be calculated under the assumption of a square road grid. The resulting urban road spacing is shown in Fig. 4.
Drive test data is usually collected on the sidewalks and motor vehicle lanes on both sides of a road. Under different urban road traffic regulations, the lateral distances among these four parts of one road differ, as do the lateral distances between parts of different roads. When choosing f, the sidewalk and motor lane data of the current road must be kept distinct from the data of other roads, while the sidewalk and motor lane data of the current road must fall into the same category. With reference to Fig. 5, an f is chosen that is suitable in both the longitude and latitude directions. The conversion between degrees and metres is as follows: along a meridian, 1° of latitude corresponds to about 111 km, so 100 m corresponds to about 0.0009° of latitude, a value that does not depend on the latitude; along a parallel, 1° of longitude corresponds to about 111·cos(A) km, where A is the latitude of the parallel.
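The conversion rule can be checked with a short sketch. It uses the approximate figure of 111 km per degree given in the text; the function name `metres_to_degrees` is hypothetical and introduced only for illustration.

```python
import math

KM_PER_DEGREE = 111.0  # approximate length of 1 degree of latitude, as stated in the text

def metres_to_degrees(metres, latitude_deg):
    """Return (delta_lat_deg, delta_lon_deg) spanned by `metres`
    at the given latitude, using the 111 km and 111*cos(A) km rules."""
    km = metres / 1000.0
    delta_lat = km / KM_PER_DEGREE
    delta_lon = km / (KM_PER_DEGREE * math.cos(math.radians(latitude_deg)))
    return delta_lat, delta_lon

# 100 m at a hypothetical latitude of 30 degrees.
delta_lat, delta_lon = metres_to_degrees(100.0, 30.0)
print(round(delta_lat, 6), round(delta_lon, 6))
```

Because the longitude span of a fixed distance grows with latitude, a threshold f expressed in degrees would need to be chosen with the latitude of the target area in mind.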
The value m is a threshold on the accumulated point count: for a given (θ, r), if the number of points satisfying the condition is not less than m, those points are considered to lie on one straight line, and an urban road is determined. The value of m is not fixed; it is adjusted dynamically according to the experimental data.
Figs. 6 and 7 are schematic diagrams of the extraction results for longitudinal and transverse roads, respectively. Apart from a small loss of data on non-straight sections, the road feature extraction algorithm extracts urban road data comprehensively according to the road distribution. Compared with the original road data, the algorithm achieves an average data extraction rate of 95% across different roads, which satisfies the application requirements well.
After the sampled data has been grouped by urban road, the target area can be represented by the identified roads. Because a mobile terminal travels along urban roads most of the time, grouping the area data by road plays a key role in determining which road the terminal is currently on. The road information can also help in inferring the behavior of the target.