WO2020134299A1 - Indoor and outdoor label distinguishing method, training method and device of classifier and medium - Google Patents

Publication number
WO2020134299A1
Authority
WO
WIPO (PCT)
Prior art keywords
random forest
training
outdoor
indoor
classification model
Prior art date
Application number
PCT/CN2019/109438
Other languages
French (fr)
Chinese (zh)
Inventor
钟勇才
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2020134299A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • This document relates to the field of communications, and in particular to a method for distinguishing indoor and outdoor marks, a training method for a classifier, a device, and a medium.
  • LBS: Location Based Service.
  • Some mobile services take place indoors and some outdoors. Determining accurately, for a given room, whether a mobile service user is indoors or outdoors is critical. For example, distinguishing indoor from outdoor users helps operators accurately identify deep-coverage problems and plan site additions accordingly.
  • If indoor coverage is insufficient, an indoor distribution site is added; if outdoor coverage is insufficient, an outdoor site is added. For elderly people or children who need care, the indoor/outdoor distinction can be used to judge whether they are inside a room or area. Likewise, a company network can be made accessible inside the company, with company information becoming inaccessible once the user leaves the office building.
  • Indoor/outdoor discrimination for mobile services demands both real-time performance and high accuracy.
  • In some cases, determining whether a mobile user is indoors or outdoors suffers from low efficiency, a high misjudgment rate, and no guarantee of real-time performance.
  • The technical problem to be solved in this document is to provide a method for distinguishing indoor and outdoor marks, a training method for a classifier, a device, and a medium, so as to at least solve the problem of a high misjudgment rate in determining the indoor/outdoor mark of users.
  • A method for distinguishing indoor and outdoor marks of users in the embodiments of this document includes: collecting measurement report data of a target user; inputting the measurement report data of the target user into a random forest classifier used to classify the indoor/outdoor mark of users; and determining the indoor/outdoor mark of the target user according to the classification calculation of the random forest classifier.
  • A training method for a random forest classifier in the embodiments of this document includes: extracting a training data set from the collected measurement report data of sample users in a target area and the actual indoor/outdoor mark corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search; and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
  • A communication node device in the embodiments of this document includes a memory and a processor. The memory stores an indoor/outdoor marking program for users, and the processor executes the program to implement the steps of the distinguishing method above.
  • A training device for a random forest classifier in the embodiments of this document includes a memory and a processor. The memory stores a training program for the random forest classifier, and the processor executes the program to implement the steps of the training method above.
  • A computer-readable storage medium in the embodiments of this document stores an indoor/outdoor marking program for users, and the program may be executed by at least one processor to implement the steps of the distinguishing method above.
  • A computer-readable storage medium in the embodiments of this document stores a training program for a random forest classifier, and the program may be executed by at least one processor to implement the steps of the training method above.
  • FIG. 1 is a flowchart of a method for distinguishing indoor and outdoor marks of users in an embodiment of this document;
  • FIG. 2 is a flowchart of an optional method for distinguishing indoor and outdoor marks of users in an embodiment of this document;
  • FIG. 3 is a prediction effect diagram of indoor and outdoor marks of a target user in the embodiment of this document.
  • This embodiment provides a method for distinguishing indoor and outdoor marks of users.
  • The method includes: S101, collecting measurement report (MR) data of a target user; S102, inputting the measurement report data of the target user into a random forest classifier used to classify the indoor/outdoor mark POSITIONMARK_REAL of users; S103, determining the indoor/outdoor mark of the target user according to the classification calculation of the random forest classifier.
  • The target user is the user to be located, and "user" generally means a mobile user.
  • The MR records wireless measurement information for the mobile user during service, including the serving cell ID (identification), RSRP (measured power value), RSRQ (LTE reference signal received quality), TA_CALC (time delay), AOA (angle of incidence), STARTTIME (start time), ENDTIME (end time), and IMSI (International Mobile Subscriber Identity).
  • The MR data of the target user collected in this embodiment includes AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (time difference, ENDTIME minus STARTTIME).
  • The indoor/outdoor mark indicates whether a user is indoors or outdoors; it may equally be described as an indoor-or-outdoor mark or an indoor-outdoor mark.
  • The method in this embodiment can be applied on the communication node side, for example the base station side. During determination, the base station can collect the MR data of the target user in real time, so the MR data in this embodiment can also be described as real-time MR data. Because the determination is carried out by the classification calculation of the random forest classifier, the determination process is also a prediction process.
  • In this embodiment, the collected MR data of the target user is input into a random forest classifier for classification calculation, and the indoor/outdoor mark of the target user is determined from that calculation. This effectively reduces the misjudgment rate, and because the judgment is based on MR data, it also effectively guarantees the real-time performance of the determination process.
  • In some implementations, before the measurement report data of the target user is input into the random forest classifier used to classify the indoor/outdoor mark of users, the method includes: collecting measurement report data of sample users in a target area, together with the indoor or outdoor tag corresponding to each measurement report; extracting a training data set from the collected measurement report data of the sample users and the actual indoor/outdoor mark corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search (GRIDSEARCHCV); and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
  • The target area may be a designated area, and the model parameters may include the number of decision trees N_ESTIMATORS and the calculation criterion CRITERION. The random forest classification model may be implemented in Python code, and in this embodiment it may be referred to simply as the model.
  • To improve the prediction accuracy of the indoor/outdoor mark, the collected measurement report data of the sample users in the target area, together with the indoor or outdoor tag corresponding to each measurement report, can be treated as raw data and preprocessed to remove abnormal data before training.
  • Features in the training data set such as AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (time difference, ENDTIME minus STARTTIME) are extracted as the independent variable X, and the corresponding POSITIONMARK_REAL (indoor/outdoor mark) is set as the dependent variable Y, so that X determines Y. That is, each piece of training data is set as an independent variable and its actual indoor/outdoor mark as the dependent variable determined by it. The task can be viewed as a 0-1 classification problem, which reduces the complexity of training the random forest classifier and improves the prediction accuracy of the indoor/outdoor mark.
  • The trained random forest classification model can also be validated by prediction on a test data set, and this prediction validation safeguards the accuracy of the predicted indoor/outdoor marks.
  • Data preprocessing is performed on the raw data to remove abnormal data, feature values are extracted from the cleaned data to obtain a data set, and the data set is divided into a training data set and a test data set.
  • The test data set is repeatedly input into the trained random forest model for cross-validated prediction until a comparatively better model is found, which becomes the final random forest classification model.
  • Using the random forest classification model corresponding to the optimal model parameters as the random forest classifier may include: extracting a test data set from the sample measurement report data; inputting the test data set into the random forest classification model corresponding to the optimal model parameters for prediction validation; and determining the minimum mean square error between the validation results and the actual indoor/outdoor marks of the test data set. When the mean square error is not greater than a preset threshold, the random forest classification model corresponding to the optimal model parameters is used as the random forest classifier; when the mean square error is greater than the threshold, the optimal model parameters are searched for again by grid search.
  • This embodiment provides a specific method for distinguishing indoor and outdoor marks of users.
  • The method is divided into two stages: an offline stage, used mainly to train the random forest classifier, and an online stage, used mainly for real-time prediction for the target user. The steps include:
  • Step 201: Collect MR data of the sample users in the target area.
  • The MR data records wireless measurement information for the user during service, including the serving cell ID, TA_CALC, RSRP, RSRQ, TA, AOA, MRTIME, STARTTIME, ENDTIME, and IMSI, together with the POSITIONMARK_REAL indoor/outdoor mark corresponding to each measurement.
  • Step 202: Process abnormal data.
  • Step 203: Select the MR data corresponding to the feature values.
  • Feature values such as AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (time difference, ENDTIME minus STARTTIME) are selected as the independent variable X.
  • The corresponding POSITIONMARK_REAL (indoor/outdoor mark) is set as the dependent variable Y.
  • The random forest classification model offers better accuracy and generalization.
  • Step 204: Train the model and optimize the model parameters.
  • Step 205: Measure the accuracy of the model.
  • Cross-validated prediction determines the minimum mean square error between the predicted values and the true values of the test set data.
  • The smaller the error, the better the model; the larger the error, the worse the model.
  • The model with the highest accuracy is selected and saved. When the mean square error is not greater than a preset threshold, the corresponding random forest classification model is used as the random forest classifier; when the mean square error is greater than the threshold, the optimal model parameters of the random forest classification model are searched for again by grid search.
  • Step 206: Collect real-time MR data of the target user.
  • The indicators collected are AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (time difference, ENDTIME minus STARTTIME); these indicators are then used to predict the indoor/outdoor mark of mobile users.
  • Step 207: Preprocess the real-time MR data.
  • Step 208: Input the real-time MR data into the random forest classifier for prediction.
  • The processed real-time MR data is input into the previously trained random forest classifier, which then computes the classification.
  • Step 209: Obtain the indoor/outdoor marking results corresponding to the real-time MR data of the target users.
  • This embodiment effectively improves the prediction accuracy of the indoor/outdoor mark of users and effectively guarantees the real-time performance of the determination process.
  • This embodiment provides a training method for a random forest classifier.
  • The method includes: extracting a training data set from the collected measurement report data of sample users in the target area and the actual indoor/outdoor mark corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search; and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
  • The training process of the random forest classifier in this embodiment is the same as the training process of the first embodiment.
  • This embodiment provides a communication node device. The device includes a memory and a processor; the memory stores an indoor/outdoor marking program for users, and the processor executes the program to implement the steps of the method according to either of the first and second embodiments.
  • The communication node device may be a base station or the like.
  • This embodiment provides a training device for a random forest classifier. The device includes a memory and a processor; the memory stores a training program for the random forest classifier, and the processor executes the program to implement the steps of the method of the third embodiment.
  • This embodiment provides a computer-readable storage medium. The storage medium stores an indoor/outdoor marking program for users, and the program may be executed by at least one processor to implement the steps of the method according to either of the first and second embodiments.
  • This embodiment provides a computer-readable storage medium. The storage medium stores a training program for a random forest classifier, and the program may be executed by at least one processor to implement the steps of the method of the third embodiment.
  • The collected MR data of the target user is input into a random forest classifier for classification calculation, and the indoor/outdoor mark of the target user is determined from that calculation. This effectively reduces the misjudgment rate in determining the indoor/outdoor mark of users, and because the judgment is based on MR data, it effectively guarantees the real-time performance of the determination process.
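The acceptance check described in Step 205 above (keep the model when the mean square error on the test set does not exceed a preset threshold, otherwise search again) might be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names and the threshold values are hypothetical, and the labels are assumed to be encoded as 0 and 1.

```python
def mean_squared_error(y_true, y_pred):
    """Mean square error between true and predicted indoor/outdoor marks."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accept_model(y_true, y_pred, threshold):
    """Return True when the error is not greater than the preset threshold."""
    return mean_squared_error(y_true, y_pred) <= threshold


# Synthetic test-set marks (assumption: 1 = indoor, 0 = outdoor).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]

mse = mean_squared_error(y_true, y_pred)
if accept_model(y_true, y_pred, threshold=0.25):  # hypothetical threshold
    print(f"MSE {mse:.2f}: keep this model as the classifier")
else:
    print(f"MSE {mse:.2f}: rerun the grid search")
```

With 0/1 labels each squared difference is either 0 or 1, so the mean square error here equals the fraction of misclassified test records.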

Abstract

An indoor and outdoor label distinguishing method, a training method and device of a classifier, and a medium. The distinguishing method comprises: collecting measurement report data of a target user (S101); inputting the measurement report data of the target user into a random forest classifier for classifying user indoor and outdoor labels (S102); and determining the indoor and outdoor label of the target user according to the classification calculation of the random forest classifier (S103).

Description

Method for distinguishing indoor and outdoor marks, training method for classifier, device, and medium
This document claims priority to Chinese patent application CN201811595402.5, entitled "Method for distinguishing indoor and outdoor marks, training method for classifier, device, and medium", filed on December 25, 2018, the entire contents of which are incorporated herein by reference.
Technical field
This document relates to the field of communications, and in particular to a method for distinguishing indoor and outdoor marks, a training method for a classifier, a device, and a medium.
Background
In the mobile Internet era, smart terminals have changed people's lifestyles and habits. People routinely use location-based services (LBS, Location Based Service) to find shopping malls, hospitals, and banks, and even to make friends. Some of these mobile services take place indoors and some outdoors. Determining accurately, for a given room, whether a mobile service user is indoors or outdoors is critical. For example, distinguishing indoor from outdoor users helps operators accurately identify deep-coverage problems and plan site additions accordingly: if indoor coverage is insufficient, an indoor distribution site is added; if outdoor coverage is insufficient, an outdoor site is added. For elderly people or children who need care, the indoor/outdoor distinction can be used to judge whether they are inside a room or area. Likewise, a company network can be made accessible inside the company, with company information becoming inaccessible once the user leaves the office building.
Analysis of these applications shows that indoor/outdoor discrimination for mobile services demands both real-time performance and high accuracy. In some cases, however, determining whether a mobile user is indoors or outdoors suffers from low efficiency, a high misjudgment rate, and no guarantee of real-time performance.
Summary of the invention
To overcome the above defects, the technical problem to be solved in this document is to provide a method for distinguishing indoor and outdoor marks, a training method for a classifier, a device, and a medium, so as to at least solve the problem of a high misjudgment rate in determining the indoor/outdoor mark of users.
To solve the above technical problem, a method for distinguishing indoor and outdoor marks of users in the embodiments of this document includes: collecting measurement report data of a target user; inputting the measurement report data of the target user into a random forest classifier used to classify the indoor/outdoor mark of users; and determining the indoor/outdoor mark of the target user according to the classification calculation of the random forest classifier.
To solve the above technical problem, a training method for a random forest classifier in the embodiments of this document includes: extracting a training data set from the collected measurement report data of sample users in a target area and the actual indoor/outdoor mark corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search; and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
To solve the above technical problem, a communication node device in the embodiments of this document includes a memory and a processor. The memory stores an indoor/outdoor marking program for users, and the processor executes the program to implement the steps of the distinguishing method above.
To solve the above technical problem, a training device for a random forest classifier in the embodiments of this document includes a memory and a processor. The memory stores a training program for the random forest classifier, and the processor executes the program to implement the steps of the training method above.
To solve the above technical problem, a computer-readable storage medium in the embodiments of this document stores an indoor/outdoor marking program for users, and the program may be executed by at least one processor to implement the steps of the distinguishing method above.
To solve the above technical problem, a computer-readable storage medium in the embodiments of this document stores a training program for a random forest classifier, and the program may be executed by at least one processor to implement the steps of the training method above.
The foregoing is only an overview of the technical solution of this document. So that the technical means can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages are easier to understand, specific implementations are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be taken as limiting this document. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a flowchart of a method for distinguishing indoor and outdoor marks of users in an embodiment of this document;
FIG. 2 is a flowchart of an optional method for distinguishing indoor and outdoor marks of users in an embodiment of this document;
FIG. 3 is a diagram of the prediction results for the indoor/outdoor mark of target users in an embodiment of this document.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to aid the description and have no specific meaning in themselves. "Module", "component", and "unit" may therefore be used interchangeably.
Prefixes such as "first" and "second" used to distinguish elements likewise serve only to aid the description and have no specific meaning in themselves.
Embodiment 1
This embodiment provides a method for distinguishing indoor and outdoor marks of users. As shown in FIG. 1, the method includes: S101, collecting measurement report (MR, Measurement Report) data of a target user; S102, inputting the measurement report data of the target user into a random forest classifier used to classify the indoor/outdoor mark POSITIONMARK_REAL of users; S103, determining the indoor/outdoor mark of the target user according to the classification calculation of the random forest classifier.
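The three steps S101 to S103 could look roughly like the following sketch, which assumes scikit-learn's `RandomForestClassifier` as the classifier. The helper `predict_position_mark`, the synthetic training data, and the encoding 1 = indoor, 0 = outdoor are all illustrative assumptions; the source does not give an implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Feature order follows the five MR fields named in the text.
FEATURES = ["AOA", "TA_CALC", "RSRP", "TADLTVALUE", "TIME_DIFFERENCE"]


def predict_position_mark(classifier, mr_record):
    """S102 + S103: classify one MR record and return its indoor/outdoor mark."""
    row = [[float(mr_record[name]) for name in FEATURES]]
    return int(classifier.predict(row)[0])


# Stand-in for a previously trained classifier, fitted on synthetic data.
# Assumption: mark 1 means indoor and 0 means outdoor.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, len(FEATURES)))
y_train = (X_train[:, 2] < 0).astype(int)
classifier = RandomForestClassifier(n_estimators=20, random_state=0)
classifier.fit(X_train, y_train)

# S101: one collected MR record (values are synthetic, not real measurements).
mr_record = {"AOA": 0.1, "TA_CALC": -0.4, "RSRP": -1.5,
             "TADLTVALUE": 0.2, "TIME_DIFFERENCE": 0.7}
mark = predict_position_mark(classifier, mr_record)
print("Predicted POSITIONMARK_REAL:", mark)
```

In a deployment the classifier would be trained offline on labelled MR data and only the `predict_position_mark` call would run per record.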
Here, the target user is the user to be located, and "user" generally means a mobile user. The MR records wireless measurement information for the mobile user during service, including the serving cell ID (identification), RSRP (measured power value), RSRQ (LTE reference signal received quality), TA_CALC (time delay), AOA (angle of incidence), STARTTIME (start time), ENDTIME (end time), and IMSI (International Mobile Subscriber Identity). The MR data of the target user collected in this embodiment includes AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (time difference, ENDTIME minus STARTTIME). The indoor/outdoor mark indicates whether a user is indoors or outdoors; it may equally be described as an indoor-or-outdoor mark or an indoor-outdoor mark.
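As an illustration of how the five fields might be assembled from one raw MR record, the sketch below derives TIME_DIFFERENCE from STARTTIME and ENDTIME. The timestamp format, the unit (seconds), and the function names are assumptions made for the example; the source does not specify them.

```python
from datetime import datetime

FEATURE_NAMES = ("AOA", "TA_CALC", "RSRP", "TADLTVALUE", "TIME_DIFFERENCE")


def time_difference_seconds(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """TIME_DIFFERENCE = ENDTIME - STARTTIME, expressed here in seconds."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


def to_feature_vector(record):
    """Build the five-element feature vector from one raw MR record."""
    values = dict(record)
    values["TIME_DIFFERENCE"] = time_difference_seconds(
        record["STARTTIME"], record["ENDTIME"])
    return [float(values[name]) for name in FEATURE_NAMES]


# Synthetic record; the field values are illustrative only.
record = {"AOA": 125.0, "TA_CALC": 3.0, "RSRP": -97.0, "TADLTVALUE": 1.5,
          "STARTTIME": "2018-12-25 10:00:00", "ENDTIME": "2018-12-25 10:00:05"}
vector = to_feature_vector(record)
print(vector)
```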
The method in this embodiment can be applied on the communication node side, for example the base station side. During determination, the base station can collect the MR data of the target user in real time, so the MR data in this embodiment can also be described as real-time MR data. Because the determination is carried out by the classification calculation of the random forest classifier, the determination process is also a prediction process.
In this embodiment, the collected MR data of the target user is input into a random forest classifier for classification calculation, and the indoor/outdoor mark of the target user is determined from that calculation. This effectively reduces the misjudgment rate, and because the judgment is based on MR data, it also effectively guarantees the real-time performance of the determination process.
Building on the above embodiment, several specific and optional implementations are given below to refine and optimize it, so that the solution of this embodiment is more convenient and accurate to implement. Provided they do not conflict, the following implementations can be combined with one another in any way.
To effectively guarantee real-time performance in determining the indoor/outdoor mark of users, in some implementations, before the measurement report data of the target user is input into the random forest classifier used to classify the indoor/outdoor mark of users, the method includes: collecting measurement report data of sample users in a target area, together with the indoor or outdoor tag corresponding to each measurement report; extracting a training data set from the collected measurement report data of the sample users and the actual indoor/outdoor mark corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search (GRIDSEARCHCV); and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
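The training procedure above maps naturally onto scikit-learn's `GridSearchCV` wrapped around a `RandomForestClassifier`, which is presumably what GRIDSEARCHCV refers to. The sketch below is one possible reading under that assumption: the candidate parameter values, the fold count, the accuracy scoring, and the synthetic data are not from the source.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def train_indoor_outdoor_classifier(X, y, param_grid=None, cv=3):
    """Grid-search a random forest and return the best model and parameters."""
    if param_grid is None:
        # Candidate values are illustrative assumptions.
        param_grid = {"n_estimators": [10, 50],
                      "criterion": ["gini", "entropy"]}
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid=param_grid,
        cv=cv,
        scoring="accuracy",
    )
    search.fit(X, y)
    # best_estimator_ is refitted on the full training set by default.
    return search.best_estimator_, search.best_params_


# Synthetic training set: 5 features per record, labels 0 (outdoor) / 1 (indoor).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 2] > 0).astype(int)

classifier, best_params = train_indoor_outdoor_classifier(X, y)
print("Best parameters:", best_params)
```

The returned `classifier` plays the role of the random forest classifier that later receives real-time MR data.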
Here, the target area may be a designated area, and the model parameters may include the number of decision trees, N_ESTIMATORS, and the criterion parameter, CRITERION (the function used to evaluate splits). The random forest classification model may be implemented in Python and is referred to below simply as the model. Before the training data set and the actual indoor/outdoor label of each piece of training data are input into the preset random forest classification model for training, the collected measurement report data of the sample users in the target area and the indoor or outdoor label of each piece of measurement report data may be taken as raw data and preprocessed to remove abnormal data, which improves the prediction accuracy of the user's indoor/outdoor label.
In the prediction process, features such as AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (the time difference endtime - starttime) are extracted from the training data set as the independent variable X, and the corresponding POSITIONMARK_REAL (indoor/outdoor label) is set as the dependent variable Y, so that X determines Y. In other words, each piece of training data in the training data set is set as an independent variable, and its actual indoor/outdoor label is set as the dependent variable determined by that independent variable. The task can then be treated as a 0-1 classification problem, which reduces the complexity of training the random forest classifier and improves the prediction accuracy of the user's indoor/outdoor label.
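The mapping from an MR record to the pair (X, Y) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dict-based record layout, numeric timestamps, the helper names `to_sample` and `build_xy`, and the encoding 1 = indoor, 0 = outdoor are all assumptions; only the field names come from the text.

```python
# Hypothetical record layout: one dict per MR record, keyed by the field names
# used in the text. Timestamps are assumed to be numeric seconds.
FEATURES = ("AOA", "TA_CALC", "RSRP", "TADLTVALUE", "TIME_DIFFERENCE")
LABEL = "POSITIONMARK_REAL"


def to_sample(record):
    """Return (x, y): the feature row X and the 0-1 label Y for one MR record."""
    row = dict(record)
    # TIME_DIFFERENCE is derived as endtime - starttime, as described above.
    row["TIME_DIFFERENCE"] = row["ENDTIME"] - row["STARTTIME"]
    x = [float(row[name]) for name in FEATURES]
    y = int(row[LABEL])  # assumed encoding: 1 = indoor, 0 = outdoor
    return x, y


def build_xy(records):
    """Split a list of MR records into the feature matrix X and label vector Y."""
    samples = [to_sample(r) for r in records]
    X = [x for x, _ in samples]
    Y = [y for _, y in samples]
    return X, Y


# Made-up values, for illustration only.
records = [
    {"AOA": 120, "TA_CALC": 3, "RSRP": -95, "TADLTVALUE": 1,
     "STARTTIME": 100, "ENDTIME": 160, "POSITIONMARK_REAL": 1},
    {"AOA": 45, "TA_CALC": 7, "RSRP": -80, "TADLTVALUE": 2,
     "STARTTIME": 200, "ENDTIME": 230, "POSITIONMARK_REAL": 0},
]
X, Y = build_xy(records)
```

Keeping the feature order in a single tuple means the same order can be reused when real-time records are prepared for prediction.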
In the prediction process, the trained random forest classification model may also be validated against a test data set, which helps guarantee the prediction accuracy of the resulting indoor/outdoor labels. That is, the raw data is preprocessed to remove abnormal data, feature values are extracted from the cleaned raw data to obtain a data set, and the data set is divided into a training data set and a test data set. The test data set is repeatedly input into the trained random forest model for cross-prediction validation until a relatively good model is found, which becomes the final random forest classification model. Accordingly, using the random forest classification model corresponding to the optimal model parameters as the random forest classifier may include: extracting a test data set from the sample measurement report data; inputting the test data set into the random forest classification model corresponding to the optimal model parameters for prediction validation; determining the minimum mean square error between the prediction validation results and the actual indoor/outdoor labels of the test data set; when the mean square error is not greater than a preset threshold, using the random forest classification model corresponding to the optimal model parameters as the random forest classifier; and when the mean square error is greater than the threshold, searching again for the optimal model parameters of the random forest classification model by grid search.
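The accept-or-retry decision described above can be sketched in plain Python. The function names, the toy labels, and the threshold value 0.1 are hypothetical; the source specifies only that a preset threshold is compared with the mean square error.

```python
def mean_squared_error(y_true, y_pred):
    """Mean squared error between two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def accept_model(y_true, y_pred, threshold):
    """Return True to keep the model, False to trigger another grid search."""
    return mean_squared_error(y_true, y_pred) <= threshold


# Toy labels (1 = indoor, 0 = outdoor is an assumed encoding).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 0, 1, 1]  # two of the eight labels are wrong

mse = mean_squared_error(actual, predicted)   # 2 / 8 = 0.25
keep = accept_model(actual, predicted, 0.1)   # 0.25 > 0.1, so search again
```

With these toy values the model is rejected, which in the described flow would lead to a new grid search over the model parameters.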
Embodiment 2
Based on Embodiment 1, this embodiment provides a specific method for distinguishing users' indoor/outdoor labels. As shown in FIG. 2, the method is divided into two stages: an offline stage, which is mainly used to train the random forest classifier, and an online stage, which is mainly used for real-time prediction for target users. The method includes:
Step 201: Collect MR data of sample users in the target area.
Select a designated area and collect, on the base station side, 12,000 pieces of MR data reported by users. The MR data records wireless measurement information gathered during the user's service session, such as serving cell ID, TA_CALC, RSRP, RSRQ, TA, AOA, MRTIME, STARTTIME, ENDTIME, and IMSI, together with the POSITIONMARK_REAL indoor/outdoor label corresponding to each measurement record.
Step 202: Handle abnormal data.
In the 12,000 collected pieces of MR data, replace abnormal or null values in each field with 0, and apply orthogonal normalization to the entire data matrix. Randomly select 75% of the data set as the training set and 25% as the test set, and save them to two separate files.
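A minimal sketch of this step, using only the standard library, is given below. The source does not define "orthogonal normalization", so min-max scaling is used here purely as a stand-in; the helper names, the rule for what counts as abnormal (null, non-numeric, NaN, or infinite), and the seed are likewise assumptions.

```python
import math
import random


def clean_value(value):
    """Replace a null or non-numeric (abnormal) field value with 0."""
    if value is None:
        return 0.0
    try:
        number = float(value)
    except (TypeError, ValueError):
        return 0.0
    return 0.0 if math.isnan(number) or math.isinf(number) else number


def min_max_scale(rows):
    """Scale each column to [0, 1]; a constant column maps to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def split_train_test(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and split it into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up feature rows containing a null, a non-numeric value, and a NaN.
raw = [[120, -95, None], [45, "n/a", 7], [float("nan"), -80, 3], [90, -110, 5]]
cleaned = [[clean_value(v) for v in row] for row in raw]
scaled = min_max_scale(cleaned)
train, test = split_train_test(scaled)
```

Applied to 12,000 rows, `split_train_test` yields 9,000 training rows and 3,000 test rows. The rows here hold features only; the 0-1 labels would be carried alongside without scaling.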
Step 203: Select the MR data corresponding to the feature values.
MR records contain many indicator items, which heavily affects the computation and accuracy of the model. To improve both, feature values such as AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (the time difference endtime - starttime) are selected as the independent variable X, and the corresponding POSITIONMARK_REAL (indoor/outdoor label) is set as the dependent variable Y. The problem thus becomes a mathematical one in which the variable X determines the indoor/outdoor label Y, and it can be treated as a 0-1 classification problem. In this embodiment, the random forest classification model offers better accuracy and generalization. A random forest classification model is built in Python, and the training data set is input into the RANDOMFORESTCLASSIFIER model to start training.
Step 204: Train the model and optimize the model parameters.
Input the training data set into the RANDOMFORESTCLASSIFIER model, then use GRIDSEARCHCV grid search to find the optimal number of decision trees, N_ESTIMATORS, and the optimal criterion, CRITERION, for the random forest classification algorithm. Input the test data set into the trained model for cross-validation. If the error is small enough, select the model; otherwise, keep adjusting the model parameters until the error on the test data is sufficiently small.
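The grid search over N_ESTIMATORS and CRITERION can be sketched with scikit-learn, which provides classes named `RandomForestClassifier` and `GridSearchCV`. The source does not state that scikit-learn is the library used, nor does it list candidate values, so the synthetic data, the grid values, the 3-fold setting, and the accuracy scoring below are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the MR feature matrix: 5 features, binary label.
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate values are illustrative; the source does not list them.
param_grid = {
    "n_estimators": [10, 50, 100],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold cross-validation on the training set
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_      # chosen n_estimators and criterion
model = search.best_estimator_         # refit on the full training set
test_accuracy = model.score(X_test, y_test)
```

The held-out test set is scored only after the search finishes, so the reported accuracy is not influenced by the parameter selection.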
Step 205: Measure the accuracy of the model.
Input the test set data into the trained random forest classification model for cross-prediction validation, and compute the minimum mean square error between the predicted and true values of the test set data. The smaller the error, the better the model. Record the accuracy of each model on the prediction data set, select the model with the highest accuracy, and save it. When the mean square error is not greater than a preset threshold, use the random forest classification model corresponding to the optimal model parameters as the random forest classifier; when the mean square error is greater than the threshold, search again for the optimal model parameters of the random forest classification model by grid search.
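This step refers to both the mean square error and the accuracy. For 0-1 labels the two are directly related, as the short derivation below shows. The symbols $n$, $y_i$, $\hat{y}_i$, and $\tau$ (the preset threshold) are introduced here for illustration and do not appear in the source.

```latex
% For y_i, \hat{y}_i \in \{0, 1\}, the squared difference is 1 exactly when
% the prediction is wrong and 0 otherwise.
\[
\mathrm{MSE}
  = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}[\,y_i \neq \hat{y}_i\,]
  = 1 - \mathrm{accuracy}
\]
% Hence the acceptance condition MSE <= \tau is the same as
\[
\mathrm{accuracy} \geq 1 - \tau .
\]
```

Under this reading, choosing the model with the highest accuracy and choosing the one with the smallest mean square error select the same model.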
Step 206: Collect real-time MR data of target users.
Randomly select target users in an area and collect real-time MR data for some of them on the base station side, including at least the indicators AOA (angle of incidence), TA_CALC (time delay), RSRP (measured power value), TADLTVALUE (downlink time delay), and TIME_DIFFERENCE (the time difference endtime - starttime). These indicators are then used to predict the mobile users' indoor/outdoor labels.
Step 207: Preprocess the real-time MR data.
The real-time data may contain abnormal or null values; replace them with 0, and select the indicators used by the trained model as feature values. Applying orthogonal normalization to the feature value data can effectively avoid overfitting.
Step 208: Input the real-time MR data into the random forest classifier for prediction.
As shown in FIG. 3, the processed real-time MR data is input into the previously trained random forest classifier, which computes a prediction for it.
Step 209: Obtain the indoor or outdoor labeling results corresponding to the real-time MR data of these target users.
This embodiment effectively improves the prediction accuracy of users' indoor/outdoor labels and ensures that the labels can be determined in real time.
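The online stage (steps 206 to 209) can be sketched as follows, again assuming scikit-learn. The toy training rows stand in for the classifier produced by the offline stage; their values, the helper name `label_users`, and the mapping of 1 to "indoor" and 0 to "outdoor" are all hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier

# Assumed label encoding; the source only says the task is 0-1 classification.
LABEL_NAMES = {0: "outdoor", 1: "indoor"}

# Toy training rows in the order AOA, TA_CALC, RSRP, TADLTVALUE, TIME_DIFFERENCE.
# The values are made up for illustration and carry no physical meaning.
X_train = [
    [120, 3, -105, 1, 60], [110, 2, -110, 1, 75], [130, 4, -100, 2, 90],
    [45, 7, -80, 3, 20], [50, 8, -75, 3, 15], [40, 9, -70, 4, 30],
]
y_train = [1, 1, 1, 0, 0, 0]

classifier = RandomForestClassifier(n_estimators=20, random_state=0)
classifier.fit(X_train, y_train)


def label_users(model, realtime_rows):
    """Predict an indoor/outdoor label for each preprocessed real-time MR row."""
    return [LABEL_NAMES[int(code)] for code in model.predict(realtime_rows)]


realtime = [[115, 3, -108, 1, 70], [48, 8, -78, 3, 18]]
labels = label_users(classifier, realtime)
```

In a deployment following the described flow, the classifier would be trained once offline and reused for every batch of real-time rows, which must be preprocessed the same way as the training data.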
Embodiment 3
This embodiment provides a training method for a random forest classifier. The method includes: extracting a training data set from the collected measurement report data of sample users in a target area and the actual indoor/outdoor label corresponding to each piece of training data; inputting the training data set into a preset random forest classification model for training; during training, searching for the optimal model parameters of the random forest classification model by grid search; and using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
The training process of the random forest classifier in this embodiment is the same as that in Embodiment 1. For implementation details, refer to Embodiment 1; the corresponding technical effects also apply.
Embodiment 4
This embodiment provides a communication node device. The device includes a memory and a processor; the memory stores a user indoor/outdoor labeling program, and the processor executes the computer program to implement the steps of the method according to either Embodiment 1 or Embodiment 2. The communication node device may be a base station or the like.
Embodiment 5
This embodiment provides a training device for a random forest classifier. The device includes a memory and a processor; the memory stores a training program for the random forest classifier, and the processor executes the computer program to implement the steps of the method according to Embodiment 3.
Embodiment 6
This embodiment provides a computer-readable storage medium. The storage medium stores a user indoor/outdoor labeling program, and the computer program can be executed by at least one processor to implement the steps of the method according to either Embodiment 1 or Embodiment 2.
Embodiment 7
This embodiment provides a computer-readable storage medium. The storage medium stores a training program for a random forest classifier, and the computer program can be executed by at least one processor to implement the steps of the method according to Embodiment 3.
Note that for the specific implementation of Embodiments 3 to 7, reference may be made to Embodiment 1; the corresponding technical effects also apply.
Note that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The numbering of the above embodiments is for description only and does not indicate that one embodiment is better than another.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better implementation. On this understanding, the technical solution of this document, in essence or in the part that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to execute the methods described in the embodiments of this document.
The beneficial effects of the embodiments are as follows: in each of the above embodiments, the collected MR data of the target user is input to a random forest classifier for classification, and the target user's indoor/outdoor label is determined from the classifier's output. This reduces the misjudgment rate when determining the user's indoor/outdoor label, and because the decision is based on MR data, the label can be determined in real time.
The embodiments of this document have been described above with reference to the drawings, but this document is not limited to the specific implementations described, which are illustrative rather than restrictive. In light of this document, those of ordinary skill in the art may devise many other forms without departing from its purpose and the scope protected by the claims, and all of these fall within the protection of this document.

Claims (10)

  1. A method for distinguishing a user's indoor/outdoor label, wherein the method includes:
    collecting measurement report data of a target user;
    inputting the measurement report data of the target user into a random forest classifier used to classify users' indoor/outdoor labels; and
    determining the indoor/outdoor label of the target user according to the classification calculation of the random forest classifier.
  2. The method according to claim 1, wherein, before the inputting of the measurement report data of the target user into the random forest classifier used to classify users' indoor/outdoor labels, the method includes:
    extracting a training data set from collected measurement report data of sample users in a target area and the actual indoor/outdoor label corresponding to each piece of training data;
    inputting the training data set into a preset random forest classification model for training;
    during training, searching for the optimal model parameters of the random forest classification model by grid search; and
    using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
  3. The method according to claim 2, wherein, before the inputting of the training data set into the preset random forest classification model for training, the method includes:
    setting each piece of training data in the training data set as an independent variable, and setting the actual indoor/outdoor label corresponding to each piece of training data as a dependent variable determined by the independent variable.
  4. The method according to claim 2, wherein the using of the random forest classification model corresponding to the optimal model parameters as the random forest classifier includes:
    extracting a test data set from the sample measurement report data;
    inputting the test data set into the random forest classification model corresponding to the optimal model parameters for prediction validation;
    determining the minimum mean square error between the prediction validation results and the actual indoor/outdoor labels set for the test data set;
    when the mean square error is not greater than a preset threshold, using the random forest classification model corresponding to the optimal model parameters as the random forest classifier; and
    when the mean square error is greater than the threshold, searching again for the optimal model parameters of the random forest classification model by grid search.
  5. The method according to any one of claims 1-4, wherein the measurement report data includes an angle of incidence, a time delay, a measured power value, a downlink time delay, and a time difference.
  6. A training method for a random forest classifier, wherein the method includes:
    extracting a training data set from collected measurement report data of sample users in a target area and the actual indoor/outdoor label corresponding to each piece of training data;
    inputting the training data set into a preset random forest classification model for training;
    during training, searching for the optimal model parameters of the random forest classification model by grid search; and
    using the random forest classification model corresponding to the optimal model parameters as the random forest classifier.
  7. A communication node device, wherein the device includes a memory and a processor, the memory stores a user indoor/outdoor labeling program, and the processor executes the computer program to implement the steps of the method according to any one of claims 1-5.
  8. A training device for a random forest classifier, wherein the device includes a memory and a processor, the memory stores a training program for the random forest classifier, and the processor executes the computer program to implement the steps of the method according to claim 6.
  9. A computer-readable storage medium, wherein the storage medium stores a user indoor/outdoor labeling program, and the computer program can be executed by at least one processor to implement the steps of the method according to any one of claims 1-5.
  10. A computer-readable storage medium, wherein the storage medium stores a training program for a random forest classifier, and the computer program can be executed by at least one processor to implement the steps of the method according to claim 6.
PCT/CN2019/109438 2018-12-25 2019-09-30 Indoor and outdoor label distinguishing method, training method and device of classifier and medium WO2020134299A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811595402.5A CN111368862A (en) 2018-12-25 2018-12-25 Method for distinguishing indoor and outdoor marks, training method and device of classifier and medium
CN201811595402.5 2018-12-25

Publications (1)

Publication Number Publication Date
WO2020134299A1 true WO2020134299A1 (en) 2020-07-02

Family

ID=71128575

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/109438 WO2020134299A1 (en) 2018-12-25 2019-09-30 Indoor and outdoor label distinguishing method, training method and device of classifier and medium

Country Status (2)

Country Link
CN (1) CN111368862A (en)
WO (1) WO2020134299A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113398569A (en) * 2021-06-15 2021-09-17 网易(杭州)网络有限公司 Card set classification processing method, card set classification training method, card set searching method and card set searching equipment
CN113993068A (en) * 2021-10-18 2022-01-28 郑州大学 Positioning and direction finding system and method and BLE positioning equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181055A (en) * 2020-09-28 2021-01-05 广东小天才科技有限公司 Indoor and outdoor state judgment method, wearable device and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101695152A (en) * 2009-10-12 2010-04-14 中国科学院计算技术研究所 Indoor positioning method and system thereof
WO2010148769A1 (en) * 2009-11-11 2010-12-29 中兴通讯股份有限公司 User terminal location method and equipment thereof and user terminal navigation mehtid and equipment thereof
CN104239034A (en) * 2014-08-19 2014-12-24 北京奇虎科技有限公司 Occasion identification method and occasion identification device for intelligent electronic device as well as information notification method and information notification device
CN105025440A (en) * 2015-07-09 2015-11-04 深圳天珑无线科技有限公司 Indoor/outdoor scene detection method and device
CN108151743A (en) * 2017-12-13 2018-06-12 联想(北京)有限公司 Indoor and outdoor location recognition method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108616900B (en) * 2016-12-12 2021-06-11 中国移动通信有限公司研究院 Method for distinguishing indoor and outdoor measurement reports and network equipment
CN109034177B (en) * 2018-05-24 2022-07-29 东南大学 Indoor and outdoor identification method for mobile intelligent terminal

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113398569A (en) * 2021-06-15 2021-09-17 网易(杭州)网络有限公司 Card set classification processing method, card set classification training method, card set searching method and card set searching equipment
CN113398569B (en) * 2021-06-15 2024-02-02 网易(杭州)网络有限公司 Card group classification processing, model training and card group searching method and equipment
CN113993068A (en) * 2021-10-18 2022-01-28 郑州大学 Positioning and direction finding system and method and BLE positioning equipment
CN113993068B (en) * 2021-10-18 2024-01-30 郑州大学 Positioning and direction finding system, method and BLE positioning equipment

Also Published As

Publication number Publication date
CN111368862A (en) 2020-07-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19906016

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 19/11/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19906016

Country of ref document: EP

Kind code of ref document: A1