CN116032741A - Equipment identification method and device, electronic equipment and computer storage medium
- Publication number
- CN116032741A (application number CN202111253826.5A)
- Authority
- CN
- China
- Prior art keywords
- device information
- information
- model
- character
- matching sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Description
Technical Field
The present application relates to the field of communication technologies, and in particular to a device identification method, a device identification apparatus, an electronic device, and a computer storage medium.
Background
With the rapid development of the Internet and the Internet of Things, networks have become an indispensable part of daily life, and the growing number of smart terminals poses a challenge for managing the devices attached to a gateway.
In recent years, with the rapid development of triple-play network convergence and broadband access, smart gateways have entered people's daily lives and become the key to smart living. How to tap the potential value of smart gateways is a question worth studying. The smart gateway is the "heart" of the smart home: with it, users can easily control the smart devices in their homes. Identifying which devices a home user has associated with a smart gateway has therefore become an urgent problem.
Summary of the Invention
The present application provides a device identification method, a device identification apparatus, an electronic device, and a computer storage medium that can accurately identify the devices associated with a smart gateway, improving both identification efficiency and accuracy.
The technical solutions of the present application are implemented as follows:
In a first aspect, an embodiment of the present application provides a device identification method, the method including:
acquiring device information to be identified;
inputting the device information to be identified into a preset recognition model, wherein the preset recognition model includes a single-character matching sub-model and a string matching sub-model;
identifying the device information to be identified with the single-character matching sub-model to determine a first identification result, and identifying the device information to be identified with the string matching sub-model to determine a second identification result; and
determining a device identification result according to the first identification result and the second identification result.
In a second aspect, an embodiment of the present application provides a device identification apparatus, the apparatus including an acquiring unit, an identifying unit, and a determining unit, wherein:
the acquiring unit is configured to acquire device information to be identified;
the identifying unit is configured to input the device information to be identified into a preset recognition model, wherein the preset recognition model includes a single-character matching sub-model and a string matching sub-model; and to identify the device information to be identified with the single-character matching sub-model to determine a first identification result, and with the string matching sub-model to determine a second identification result; and
the determining unit is configured to determine a device identification result according to the first identification result and the second identification result.
In a third aspect, an embodiment of the present application provides an electronic device, the electronic device including a memory and a processor, wherein:
the memory is configured to store a computer program capable of running on the processor; and
the processor is configured to execute the device identification method of the first aspect when running the computer program.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program which, when executed by at least one processor, implements the device identification method of the first aspect.
In the device identification method, apparatus, electronic device, and computer storage medium provided by the embodiments of the present application, device information to be identified is acquired and input into a preset recognition model that includes a single-character matching sub-model and a string matching sub-model; the single-character matching sub-model identifies the device information to determine a first identification result, and the string matching sub-model identifies the same device information to determine a second identification result; a device identification result is then determined according to the first and second identification results. Because the two sub-models identify the device information in combination, that is, the device identification result is determined jointly from the results of both sub-models, the accuracy of device information identification is improved, and identification efficiency is improved as well.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a device identification method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of training a preset recognition model provided by an embodiment of the present application;
FIG. 3 is a detailed schematic flowchart of a device identification method provided by an embodiment of the present application;
FIG. 4 is a detailed schematic flowchart of another device identification method provided by an embodiment of the present application;
FIG. 5 is a detailed schematic flowchart of yet another device identification method provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of a training framework for a preset recognition model provided by an embodiment of the present application;
FIG. 7 is a detailed schematic flowchart of a further device identification method provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a device identification apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of another electronic device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present application and do not limit it. For ease of description, the drawings show only the parts relevant to the present application.
Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field to which the present application belongs. The terms used herein are for the purpose of describing the embodiments of the present application only and are not intended to limit the present application.
The following description refers to "some embodiments", which describe a subset of all possible embodiments. It should be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they may be combined with one another where no conflict arises.
It should be noted that the terms "first/second/third" in the embodiments of the present application merely distinguish similar objects and do not imply a particular ordering of those objects. Where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments described herein can be implemented in an order other than the one illustrated or described.
At present, methods for identifying the devices attached to a smart gateway are very limited. Most solutions create a device model database and then perform string matching between the device information reported by the smart gateway and the keywords in that database; only an exact match yields the device's model, type, and other information. This approach is suitable only when the data volume is small and the device types are relatively fixed, and it has at least the following drawbacks: (1) when the keywords in the reported device information are incomplete, or their order changes even slightly, identification quality drops sharply; (2) it depends too heavily on the device model database, so a new device that is not in the database is discarded as invalid device information; (3) device information that fails to match cannot be analyzed further, so the reported device information is poorly utilized; (4) the device model database cannot be updated automatically from the reported device information, so the set of identifiable device models can only be screened and extended manually.
In other words, the attached-device information reported by smart gateways is large in volume, covers many device types (for example, smart TVs, smartphones, smart air conditioners, and smart cameras), and has no uniform format (for example, Honor_7A-6036f0010, T1-821w-ed7e90066d182, 15557ac28e_bbk-H8A-6173d), which makes it difficult to accurately identify the model and other information of a device attached to a smart gateway.
On this basis, an embodiment of the present application provides a device identification method whose basic idea is as follows: acquire device information to be identified; input it into a preset recognition model that includes a single-character matching sub-model and a string matching sub-model; identify the device information with the single-character matching sub-model to determine a first identification result, and with the string matching sub-model to determine a second identification result; and determine a device identification result according to the first and second identification results. Because the two sub-models identify the device information in combination, that is, the device identification result is determined jointly from the results of both sub-models, the accuracy of device information identification is improved, and identification efficiency is improved as well.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
In an embodiment of the present application, refer to FIG. 1, which shows a schematic flowchart of a device identification method provided by an embodiment of the present application. As shown in FIG. 1, the method may include:
S101. Acquire device information to be identified.
It should be noted that the device identification method provided by the embodiments of the present application may be applied to a device identification apparatus, or to an electronic device in which that apparatus is integrated. The electronic device may be, for example, a computer, a smartphone, a tablet computer, a notebook computer, a palmtop computer, a personal digital assistant (PDA), a navigation device, or a server; the embodiments of the present application do not specifically limit this.
It should also be noted that the method is mainly used to accurately identify the information of devices attached to a smart gateway in a home, so the device information to be identified is usually device information reported by a smart gateway. The device information to be identified may also be obtained in other ways; the embodiments of the present application do not specifically limit this.
In addition, the device information to be identified is usually a series of messages that may contain device-related information, for example: wang-njaf-oppo-k9, 201-phicomm fwr706, netcore-mg-1200ac, dh-nvr2108hs-8p-s1-kdgj, hikvision ezviz cs-n1p-204, and so on.
S102. Input the device information to be identified into a preset recognition model, wherein the preset recognition model includes a single-character matching sub-model and a string matching sub-model.
S103. Identify the device information to be identified with the single-character matching sub-model to determine a first identification result, and identify the device information to be identified with the string matching sub-model to determine a second identification result.
It should be noted that the embodiments of the present application use the preset recognition model to identify the device information and obtain a device identification result, thereby determining the device label corresponding to that device information, such as the device model, device brand, and device type.
Specifically, the preset recognition model may include a single-character matching sub-model and a string matching sub-model. After the device information to be identified is input into the preset recognition model, each sub-model identifies it separately, yielding the first identification result and the second identification result, and the final device identification result is then determined from these two results. The single-character matching sub-model and the string matching sub-model may be convolutional neural networks or other neural network structures; the embodiments of the present application do not specifically limit this.
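By way of illustration only, the two-branch flow of S102 and S103 and the final combination step can be sketched as follows. The passage above does not state how the two results are combined, so the fusion rule here (agreement reinforces, otherwise the higher-scoring result wins) is an assumption, and the names `Recognition`, `identify`, `toy_char_model`, and `toy_string_model` are hypothetical stand-ins for trained sub-models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recognition:
    label: Optional[str]  # e.g. a device model; None when nothing was recognized
    score: float          # confidence in [0, 1]


SubModel = Callable[[str], Recognition]


def identify(device_info: str, char_model: SubModel, string_model: SubModel) -> Recognition:
    """Run both sub-models on the same input and fuse their results."""
    first = char_model(device_info)     # first identification result
    second = string_model(device_info)  # second identification result
    # Assumed fusion rule (not taken from the source): if both sub-models
    # agree, keep the label with the larger score; otherwise keep whichever
    # result has the higher score.
    if first.label is not None and first.label == second.label:
        return Recognition(first.label, max(first.score, second.score))
    return first if first.score >= second.score else second


# Toy stand-ins for the trained sub-models, for demonstration only.
def toy_char_model(info: str) -> Recognition:
    return Recognition("oppo-k9", 0.6) if "k9" in info else Recognition(None, 0.0)


def toy_string_model(info: str) -> Recognition:
    return Recognition("oppo-k9", 0.9) if "oppo-k9" in info else Recognition(None, 0.0)


result = identify("wang-njaf-oppo-k9", toy_char_model, toy_string_model)
print(result.label, result.score)
```

In a real system the two callables would wrap the trained networks, and the fusion rule would be replaced by whatever combination strategy the embodiment actually specifies.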
Further, regarding the training of the preset recognition model, refer to FIG. 2, which shows a schematic flowchart of training a preset recognition model provided by an embodiment of the present application. As shown in FIG. 2, the process may include:
S201. Obtain a sample training set from a preset device model database.
It should be noted that, to determine the preset recognition model, a sample training set for model training is obtained first. The sample training set is drawn from the preset device model database; that is, it consists of device information obtained from that database together with the device label corresponding to each piece of device information. The sample training set includes at least one piece of device information and the device label corresponding to each piece, where a device label may include at least the device model, device brand, and device type corresponding to the device information.
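As a minimal sketch of the data layout described above, each training sample can be represented as a piece of device information paired with its device label. The class and function names, and the label values in the example database, are hypothetical illustrations rather than data from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DeviceLabel:
    model: str        # device model
    brand: str        # device brand
    device_type: str  # device type


@dataclass(frozen=True)
class TrainingSample:
    device_info: str   # raw device information string
    label: DeviceLabel


def build_training_set(model_db: Dict[str, DeviceLabel]) -> List[TrainingSample]:
    """Pair every device information entry in the database with its label."""
    return [TrainingSample(info, label) for info, label in model_db.items()]


# Hypothetical database content, for illustration only.
example_db = {
    "netcore-mg-1200ac": DeviceLabel("mg-1200ac", "netcore", "router"),
    "wang-njaf-oppo-k9": DeviceLabel("k9", "oppo", "smartphone"),
}
training_set = build_training_set(example_db)
print(len(training_set))
```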
Further, regarding the preset device model database, in some embodiments the method may also include:
acquiring an original device information set, the original device information set including at least one piece of device information;
performing data cleaning and data statistics on the device information in the original device information set to obtain an intermediate device information set and the number of occurrences of each piece of device information in the intermediate device information set;
comparing the number of occurrences of each piece of device information in the intermediate device information set with a preset count threshold, and obtaining the device information whose number of occurrences is greater than the preset count threshold;
constructing a total device information database from the device information whose number of occurrences is greater than the preset count threshold; and
crawling on the basis of the total device information database to obtain the preset device model database.
It should be noted that the original device information set includes at least one piece of device information. Because the embodiments of the present application are mainly used to accurately identify devices attached to smart gateways, the original device information set may be obtained by collecting the device information reported by all smart gateways, which can be gathered through a gateway plug-in.
After the original device information set is obtained, it may contain invalid device information, duplicate device information, and the like. Data cleaning and data statistics can therefore be performed on it: invalid and duplicate device information is removed, only valid device information is retained, and the number of occurrences of each piece of valid device information is counted, yielding the intermediate device information set and the number of occurrences of each piece of device information in it.
In some specific embodiments, performing data cleaning and data statistics on the device information in the original device information set to obtain the intermediate device information set and the number of occurrences of each piece of device information in it may include:
removing invalid device information from the original device information set by using a preset expression, to obtain at least one piece of valid device information;
counting the number of occurrences of each piece of valid device information, and deduplicating the at least one piece of valid device information to obtain at least one piece of target device information; and
constructing the intermediate device information set from the at least one piece of target device information, and determining the number of occurrences of each piece of device information in the intermediate device information set.
It should be noted that removing invalid device information with the preset expression usually means removing entries whose device information is empty, anonymous, or the like. In addition, because device information normally consists of Chinese characters, letters, and digits, other special characters usually indicate that the device information is invalid, so such entries can also be removed. As technology develops, device information will change as well, and valid device information may come to contain special characters; whether device information containing special characters is invalid can therefore also be decided according to the actual scenario.
It should also be noted that the preset expression used to remove invalid device information may be a regular expression.
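The cleaning rule described above can be sketched with a regular expression as follows. This is only an illustration under stated assumptions: the pattern additionally allows hyphens, underscores, and spaces because they appear in the sample entries quoted earlier, and the `PLACEHOLDERS` set of "empty or anonymous" markers is hypothetical.

```python
import re
from typing import Iterable, List

# Assumed validity rule: Chinese characters, letters, and digits, plus the
# separators seen in the sample entries (hyphen, underscore, space).
VALID_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_\- ]+")

# Hypothetical markers for empty or anonymous device information.
PLACEHOLDERS = {"", "anonymous", "unknown", "null"}


def is_valid(device_info: str) -> bool:
    """Return True when the entry is non-empty, named, and free of other special characters."""
    text = device_info.strip()
    if text.lower() in PLACEHOLDERS:
        return False
    return VALID_PATTERN.fullmatch(text) is not None


def remove_invalid(raw: Iterable[str]) -> List[str]:
    """Keep only valid entries, with surrounding whitespace trimmed."""
    return [item.strip() for item in raw if is_valid(item)]


cleaned = remove_invalid(
    ["Honor_7A-6036f0010", "", "anonymous", "@@##", "201-phicomm fwr706"]
)
print(cleaned)
```

As the text notes, the set of permitted characters is scenario-dependent, so the pattern should be treated as a tunable parameter rather than a fixed rule.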
After the invalid device information has been removed from the original device information set, at least one piece of valid device information remains. The number of occurrences of each piece of valid device information is counted, and the valid device information is deduplicated: among multiple identical pieces of valid device information, only one is kept. This yields at least one piece of target device information together with its number of occurrences. The counting can be done with a preset statistical tool such as MapReduce, a programming model for parallel computation over large-scale data sets.
In this way, the embodiments of the present application obtain at least one piece of cleaned and deduplicated target device information, construct the intermediate device information set from it, and determine the number of occurrences of each piece of device information in that set.
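The counting and deduplication step can be sketched in a MapReduce-like style as follows. This is a single-process illustration of the map and reduce phases, not a use of an actual MapReduce framework; the function names and the sample chunks are hypothetical.

```python
from collections import Counter
from functools import reduce
from typing import Dict, Iterable, List


def map_phase(chunk: List[str]) -> Counter:
    """Map step: count occurrences within one chunk of valid device information."""
    return Counter(chunk)


def reduce_phase(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts."""
    return left + right


def count_and_dedupe(chunks: Iterable[List[str]]) -> Dict[str, int]:
    """Return each distinct device information entry with its total occurrence count."""
    totals = reduce(reduce_phase, map(map_phase, chunks), Counter())
    return dict(totals)


# Hypothetical chunks, e.g. one per gateway report batch.
counts = count_and_dedupe([
    ["netcore-mg-1200ac", "wang-njaf-oppo-k9", "netcore-mg-1200ac"],
    ["wang-njaf-oppo-k9", "hikvision ezviz cs-n1p-204"],
])
print(counts)
```

The keys of the returned dictionary form the deduplicated target device information, and the values are the occurrence counts used in the subsequent threshold comparison.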
The number of occurrences of each piece of device information in the intermediate device information set is then compared with the preset count threshold, the device information whose number of occurrences is greater than the threshold is obtained, and the total device information database is constructed from that device information.
It should also be noted that, for device information reported by smart gateways, the more often a piece of device information appears, the more frequently the corresponding device is used. The embodiments of the present application retain the device information in the intermediate device information set whose number of occurrences is greater than the preset count threshold. The threshold can be set according to actual usage requirements; the embodiments of the present application do not specifically limit this.
It should also be noted that the embodiments of the present application mainly identify and analyze devices attached to smart gateways in homes, and a device whose information appears only once or twice is very unlikely to belong to the household. Too much device information would also make the initial training of the preset recognition model harder and degrade identification accuracy. The total device information database is therefore built only from valid device information whose number of occurrences is greater than the preset count threshold. If the application scenario is simple and involves little device information, all device information may be retained; because device information is usually massive, however, filtering out low-frequency entries is generally necessary.
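The threshold comparison described above reduces to a simple filter, sketched below. The function name and the threshold value of 2 are hypothetical; the strict "greater than" comparison follows the text.

```python
from typing import Dict


def build_total_database(counts: Dict[str, int], min_count: int) -> Dict[str, int]:
    """Keep only device information that occurs strictly more than min_count times."""
    return {info: n for info, n in counts.items() if n > min_count}


# Hypothetical occurrence counts and threshold.
occurrences = {"netcore-mg-1200ac": 5, "wang-njaf-oppo-k9": 2, "guest-phone": 1}
total_db = build_total_database(occurrences, min_count=2)
print(total_db)
```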
After the total device information database is determined, it is crawled to obtain the preset device model database. Specifically, in some embodiments, crawling the total device information database to obtain the preset device model database may include:
using a web crawler to obtain the seed uniform resource locator (URL) of each piece of device information in the total device information database, and using the web crawler to obtain the web page information corresponding to the seed URL of each piece of device information;
determining, from the web page information corresponding to the seed URL of each piece of device information, the device label corresponding to that device information;
constructing the preset device model database from each piece of device information and its corresponding device label.
That is, for any piece of device information in the total device information database, a web crawler first automatically obtains the seed Uniform Resource Locator (URL) related to that device information from the query results of a search engine, where the query results are those returned when the device information is searched for in the search engine.
Then, for each captured seed URL, web crawler technology is used again to obtain the corresponding web page information, and the corresponding device labels, such as device model, device brand, and device type, are filtered out of the web page information.
Finally, the preset device model database is constructed from each piece of device information and its corresponding device labels.
Further, in some embodiments, the method may also include:
if a seed URL cannot be obtained for a piece of device information, storing that device information in the invalid device information database; and/or
if a device label cannot be determined for a piece of device information, storing that device information in the invalid device information database.
It should be noted that, for some device information in the total device information database, the search engine may return no related information, so that no seed URL can be obtained; for other device information, a seed URL is obtained but no device label can be extracted when the page is crawled. In either case, the device label is temporarily unavailable on the web, possibly because the device information is invalid or the corresponding device has not yet been fully released, and the device information is stored in the invalid device information database.
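The crawl-and-route flow described above might be organized as in the sketch below. The application does not specify a search engine or a page parser, so both are passed in as hypothetical callables (`find_seed_urls`, `extract_labels`), and the lookup tables are made-up stand-ins so that the example runs without network access.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Labels = Dict[str, str]  # assumed shape: {"model": ..., "brand": ..., "type": ...}


def build_model_database(
    device_infos: Iterable[str],
    find_seed_urls: Callable[[str], List[str]],
    extract_labels: Callable[[str], Optional[Labels]],
) -> Tuple[Dict[str, Labels], List[str]]:
    """Route each device information to the model database or the invalid database."""
    model_db: Dict[str, Labels] = {}
    invalid_db: List[str] = []
    for info in device_infos:
        labels = None
        # No seed URL means the loop body never runs and labels stays None.
        for url in find_seed_urls(info):
            labels = extract_labels(url)
            if labels:
                break
        if labels:
            model_db[info] = labels
        else:
            invalid_db.append(info)
    return model_db, invalid_db


# Made-up stand-ins for the search engine and the page parser.
FAKE_SEARCH = {"yunuo-oppo-k9": ["https://example.com/oppo-k9"]}
FAKE_PAGES = {
    "https://example.com/oppo-k9": {"model": "K9", "brand": "OPPO", "type": "phone"}
}

model_db, invalid_db = build_model_database(
    ["yunuo-oppo-k9", "zzz-unknown"],
    find_seed_urls=lambda info: FAKE_SEARCH.get(info, []),
    extract_labels=lambda url: FAKE_PAGES.get(url),
)
print(model_db, invalid_db)
```

Injecting the two callables keeps the routing logic testable; a real crawler and parser could be substituted without changing it.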
After the preset device model database is constructed, a sample training set can be obtained from it for model training.
S202. Train the single-character matching sub-network with the sample training set to obtain the single-character matching sub-model, and train the string matching sub-network with the sample training set to obtain the string matching sub-model.
It should be noted that training the single-character matching sub-network and the string matching sub-network separately with the sample training set yields the single-character matching sub-model and the string matching sub-model, respectively. Both sub-networks are typically convolutional neural networks.
Further, for the training of the single-character matching sub-model and the string matching sub-model, in some embodiments, the method may also include:
creating a device label database from the device labels corresponding to the at least one piece of device information;
Correspondingly, training the single-character matching sub-network with the sample training set to obtain the single-character matching sub-model may include:
performing single-character segmentation on the device information in the sample training set to determine a single-character vocabulary;
performing single-character-level vector conversion on the single-character vocabulary and the sample training set to obtain a single-character training vector set;
inputting the single-character training vector set into the single-character matching sub-network and performing supervised iterative training with the device label database to obtain the single-character matching sub-model;
Training the string matching sub-network with the sample training set to obtain the string matching sub-model may include:
performing string segmentation on the device information in the sample training set to determine a string vocabulary;
performing string-level vector conversion on the string vocabulary and the sample training set to obtain a string training vector set;
inputting the string training vector set into the string matching sub-network and performing supervised iterative training with the device label database to obtain the string matching sub-model.
It should be noted that the device label database may be created by extracting all device labels from the sample training set, deduplicating them, and building the database from the deduplicated labels. That is, the device label database keeps exactly one entry for each distinct device label.
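As a small illustration of the deduplication described above, the sketch below collects each distinct label once and assigns it a class index. The mapping to class indices, the function name `build_label_database`, and the sample pairs are assumptions for the example; the application only states that duplicates are removed.

```python
def build_label_database(samples):
    """Collect each distinct device label once, in first-seen order."""
    label_db = {}
    for _info, label in samples:
        # setdefault leaves an existing label untouched, so duplicates collapse.
        label_db.setdefault(label, len(label_db))
    return label_db


# Hypothetical (device information, device label) training pairs.
samples = [
    ("yunuo-oppo-k9", "OPPO K9"),
    ("oppo k9 5g", "OPPO K9"),
    ("huawei-p40", "HUAWEI P40"),
]
label_db = build_label_database(samples)
print(label_db)
```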
It should also be noted that this embodiment uses the device label database to perform supervised iterative training on the single-character matching sub-network and the string matching sub-network separately, obtaining the single-character matching sub-model and the string matching sub-model.
Specifically, the device information in the sample training set is segmented into single characters and into strings, yielding a single-character vocabulary and a string vocabulary, respectively. Single-character-level vector conversion is then applied to the single-character vocabulary and the sample training set to obtain the single-character training vector set, and string-level vector conversion is applied to the string vocabulary and the sample training set to obtain the string training vector set.
For example, for the device information yunuo-oppo-k9, single-character segmentation yields y u n u o o p p o k 9, eleven characters in total; string segmentation yields yunuo oppo k9, three words in total.
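The two segmentations in this example, followed by one possible vector conversion, can be sketched as below. The application does not say how the vector conversion is performed; mapping tokens to vocabulary indices with padding is an assumption, as are the helper names and the `<pad>` and `<unk>` entries.

```python
import re


def split_chars(info):
    """Single-character segmentation: keep letters, digits, and Chinese characters."""
    return [ch for ch in info if ch.isalnum()]


def split_strings(info):
    """String segmentation: split on anything that is not a letter, digit, or Chinese character."""
    return [tok for tok in re.split(r"[^0-9A-Za-z\u4e00-\u9fff]+", info) if tok]


def build_vocab(token_lists):
    """Assign an index to every distinct token; 0 and 1 are reserved (assumption)."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, length):
    """Convert tokens to a fixed-length index vector, truncating or padding as needed."""
    ids = [vocab.get(t, vocab["<unk>"]) for t in tokens][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))


info = "yunuo-oppo-k9"
chars = split_chars(info)    # 11 single characters
words = split_strings(info)  # ["yunuo", "oppo", "k9"]
string_vocab = build_vocab([words])
print(len(chars), words, encode(words, string_vocab, length=5))
```

Running the two splitters on the same device information reproduces the eleven characters and three words given in the example.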
The single-character training vector set is input into the single-character matching sub-network for key feature extraction and classification, and supervised iterative training with the device label database yields the single-character matching sub-model. Likewise, the string training vector set is input into the string matching sub-network for key feature extraction and classification, and supervised iterative training with the device label database yields the string matching sub-model.
It should also be noted that, when the two sub-models are trained iteratively, the single-character matching sub-network and the string matching sub-network may each use its own independent loss function. The two loss functions may be of the same kind; for example, both may be cross-entropy loss functions. After the iterative classification training is complete, the single-character matching sub-model and the string matching sub-model are obtained.
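To make "two independent loss functions of the same kind" concrete, the sketch below computes a cross-entropy loss separately for each sub-network on one training sample. The logits are made-up numbers, and the plain-Python softmax is only illustrative; a real implementation would normally rely on a deep learning framework.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy loss for one sample whose true class is target_index."""
    return -math.log(softmax(logits)[target_index])


# Hypothetical outputs of the two sub-networks for the same training sample.
char_loss = cross_entropy([2.0, 0.5, 0.1], target_index=0)
string_loss = cross_entropy([0.3, 1.7, 0.2], target_index=1)
# Each loss drives only its own sub-network; the two are not summed here.
print(char_loss, string_loss)
```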
S203. Determine the preset recognition model from the single-character matching sub-model and the string matching sub-model.
It should be noted that combining the single-character matching sub-model and the string matching sub-model yields the preset recognition model.
That is, this embodiment can obtain the preset recognition model through steps S201 to S203 above. The single-character matching sub-model and the string matching sub-model in the preset recognition model can then each identify the device information to be identified, producing the first recognition result and the second recognition result, respectively.
For the first recognition result, in some embodiments, identifying the device information to be identified with the single-character matching sub-model to determine the first recognition result may include:
performing feature extraction and classification on the device information to be identified with the single-character matching sub-model to determine the first probability that the device information to be identified belongs to each device label;
selecting the maximum of the first probabilities and determining the device label corresponding to that maximum as the first recognition result.
For the second recognition result, in some embodiments, identifying the device information to be identified with the string matching sub-model to determine the second recognition result may include:
performing feature extraction and classification on the device information to be identified with the string matching sub-model to determine the second probability that the device information to be identified belongs to each device label;
selecting the maximum of the second probabilities and determining the device label corresponding to that maximum as the second recognition result.
It should be noted that the single-character matching sub-model and the string matching sub-model each perform key feature extraction and classification on the device information to be identified. The single-character matching sub-model gives the first probability that the device information belongs to each device label, and the device label with the largest first probability is determined as the first recognition result. The string matching sub-model gives the second probability that the device information belongs to each device label, and the device label with the largest second probability is determined as the second recognition result. The first recognition result and the second recognition result are thus obtained.
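Selecting the label with the largest probability can be sketched as follows. The probability tables are made-up values standing in for the outputs of the two sub-models, and `top_label` is a hypothetical helper name.

```python
def top_label(probabilities):
    """Return the device label with the highest probability and that probability."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical per-label probabilities from the two sub-models.
first_probs = {"OPPO K9": 0.81, "HUAWEI P40": 0.12, "Xiaomi TV": 0.07}
second_probs = {"OPPO K9": 0.40, "HUAWEI P40": 0.55, "Xiaomi TV": 0.05}

first_result, first_cls_prob = top_label(first_probs)
second_result, second_cls_prob = top_label(second_probs)
print(first_result, first_cls_prob, second_result, second_cls_prob)
```

The returned probability is kept alongside the label because it is needed later, as the classification probability, when the two results disagree.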
In addition, in some embodiments, after the device information to be identified is obtained, the method may further include:
matching the device information to be identified against the device information in the invalid device information database;
if the match succeeds, determining that the device information to be identified is invalid device information;
if the match fails, performing the step of inputting the device information to be identified into the preset recognition model.
It should be noted that the device information to be identified may itself be invalid. Therefore, after obtaining the device information to be identified, this embodiment may first match it against the device information in the invalid device information database. If the match succeeds, the device information to be identified is determined to be invalid device information and does not need to be passed through the preset recognition model. If the match fails, the step of inputting the device information to be identified into the preset recognition model is performed. This also reduces the computational load on the device identification apparatus and avoids wasting resources on futile identification.
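The pre-check against the invalid device information database can be sketched as a guard in front of the model. Exact string membership is assumed as the matching rule here, since the application does not define how the match is performed; `identify`, `fake_recognize`, and the `"invalid"` marker are hypothetical names.

```python
def identify(info, invalid_db, recognize):
    """Skip the recognition model when the device information is known to be invalid."""
    if info in invalid_db:  # assumed matching rule: exact membership
        return "invalid"
    return recognize(info)


calls = []


def fake_recognize(info):
    """Stand-in for the preset recognition model; records that it was invoked."""
    calls.append(info)
    return "OPPO K9"


invalid_db = {"zzz-unknown"}
print(identify("zzz-unknown", invalid_db, fake_recognize))    # model is not called
print(identify("yunuo-oppo-k9", invalid_db, fake_recognize))  # model is called once
```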
It should also be noted that, in this embodiment, if the device information to be identified is invalid device information, it may also be used to update the invalid device information database.
S104. Determine the device recognition result from the first recognition result and the second recognition result.
It should be noted that the final device recognition result can be determined from the first recognition result and the second recognition result. Specifically, in some embodiments, determining the device recognition result from the first recognition result and the second recognition result may include:
if the first recognition result equals the second recognition result, determining the first recognition result as the device recognition result;
if the first recognition result does not equal the second recognition result, determining the first classification probability corresponding to the first recognition result and the second classification probability corresponding to the second recognition result, and determining the device recognition result from the comparison of the first classification probability with the first judgment threshold and the comparison of the second classification probability with the second judgment threshold.
It should be noted that, when the device recognition result is determined, if the first recognition result and the second recognition result are the same, the first recognition result (or, equivalently, the second recognition result) is directly determined as the final device recognition result.
If the first recognition result and the second recognition result differ, the decision is made by combining the first classification probability corresponding to the first recognition result (that is, the maximum of the first probabilities), the second classification probability corresponding to the second recognition result (that is, the maximum of the second probabilities), the first judgment threshold, and the second judgment threshold.
Here, the first judgment threshold is the optimal threshold for judging whether the recognition result of the single-character matching sub-model is reliable, and the second judgment threshold is the optimal threshold for judging whether the recognition result of the string matching sub-model is reliable. In this embodiment, the same test set may be input into the single-character matching sub-model and the string matching sub-model to obtain the first judgment threshold and the second judgment threshold, respectively.
Specifically, in some embodiments, determining the device recognition result from the comparison of the first classification probability with the first judgment threshold and the comparison of the second classification probability with the second judgment threshold may include:
if the first classification probability is greater than or equal to the first judgment threshold and the second classification probability is less than the second judgment threshold, determining that the device recognition result is the first recognition result;
if the first classification probability is less than the first judgment threshold and the second classification probability is greater than or equal to the second judgment threshold, determining that the device recognition result is the second recognition result;
if the first classification probability is greater than or equal to the first judgment threshold and the second classification probability is greater than or equal to the second judgment threshold, or the first classification probability is less than the first judgment threshold and the second classification probability is less than the second judgment threshold, determining that the device recognition result is unrecognized.
It should be noted that, when the first recognition result and the second recognition result differ, exactly one sub-model must be reliable for a result to be accepted. If only the first classification probability reaches its threshold, the first recognition result is taken as the device recognition result; if only the second classification probability reaches its threshold, the second recognition result is taken. If both probabilities reach their thresholds (two conflicting but confident results) or neither does (no confident result), the device recognition result is determined to be unrecognized. The final device recognition result of the device information to be identified is thus obtained.
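The decision rules above reduce to a small function. In the sketch below, the function name `fuse`, the `UNRECOGNIZED` marker, and the threshold values 0.6 and 0.7 are illustrative assumptions; the branching itself follows the rules as stated.

```python
UNRECOGNIZED = "unrecognized"


def fuse(first, p1, second, p2, t1, t2):
    """Combine the two sub-model results into the final device recognition result.

    first, second: labels from the single-character and string sub-models.
    p1, p2: their classification probabilities; t1, t2: the judgment thresholds.
    """
    if first == second:
        return first
    first_ok = p1 >= t1
    second_ok = p2 >= t2
    if first_ok and not second_ok:
        return first
    if second_ok and not first_ok:
        return second
    # Both confident but conflicting, or neither confident.
    return UNRECOGNIZED


# Hypothetical thresholds for the two sub-models.
T1, T2 = 0.6, 0.7
print(fuse("OPPO K9", 0.81, "HUAWEI P40", 0.55, T1, T2))  # only the first is reliable
print(fuse("OPPO K9", 0.81, "HUAWEI P40", 0.90, T1, T2))  # conflicting, both reliable
```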
Further, in some embodiments, after the device recognition result is determined, the method may further include:
if the device recognition result is unrecognized, adding the device information to be identified to the preset device model database to obtain a new preset device model database;
performing update training on the preset recognition model with the new preset device model database.
It should be noted that, if the device recognition result is unrecognized, the device information to be identified may be new device information. In that case, it is added to the preset device model database to obtain a new preset device model database, which is then used for update training of the preset recognition model.
It should also be noted that, for the update training, some of the steps described in S201 to S203 above can be reused: web crawler technology obtains the device label of the device information to be identified, the preset device model database is updated, and the single-character vocabulary and the string vocabulary are updated. The single-character matching sub-model and the string matching sub-model are then retrained, which realizes iterative updating of the preset recognition model.
It should also be noted that, if the seed URL or the device label cannot be obtained while crawling for the device information to be identified, the device information may be added to the invalid device information database, as described above.
In addition, in the foregoing embodiments, device information in the invalid device information database may be information whose device label is only temporarily unobtainable from the web; as more information appears online, it may become valid. For example, while a product is still in the testing phase, its device-related information has not been published and a web crawler cannot obtain its device label; once the product is officially released and the information is published online, the device information may become valid.
Therefore, in some embodiments, the method may also include:
judging whether the device information in the invalid device information database is valid;
if the device information in the invalid device information database is judged to be valid, performing update training on the preset recognition model with that device information.
It should be noted that this embodiment may, at regular intervals or at a preset time, use web crawler technology to try to obtain the device labels of the device information in the invalid device information database. If a label is obtained, the device information has become valid: the preset recognition model can be update-trained with it in the manner described above, and the device information is deleted from the invalid device information database. This avoids misjudgments caused by information that is not updated in time.
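One pass of this periodic re-check might look like the sketch below. The label lookup is passed in as a hypothetical callable `fetch_labels` (a real system would crawl the web), scheduling is left out, and the sample label data is made up for the example.

```python
def recheck_invalid(invalid_db, model_db, fetch_labels):
    """Move device information whose labels have become available into model_db.

    Returns the recovered entries so the caller can decide whether to retrain.
    """
    recovered = []
    for info in sorted(invalid_db):  # iterate over a copy; the set is mutated below
        labels = fetch_labels(info)
        if labels:
            model_db[info] = labels
            invalid_db.remove(info)
            recovered.append(info)
    return recovered


# Hypothetical state: one device has been released since the last pass.
invalid_db = {"new-phone-x1", "zzz-unknown"}
model_db = {}
now_public = {"new-phone-x1": {"model": "X1", "brand": "ExampleBrand", "type": "phone"}}

recovered = recheck_invalid(invalid_db, model_db, fetch_labels=now_public.get)
print(recovered, invalid_db)
```

A non-empty return value would be the trigger for update training of the preset recognition model.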
This embodiment provides an identification method: device information to be identified is obtained and input into a preset recognition model that includes a single-character matching sub-model and a string matching sub-model; the single-character matching sub-model identifies the device information to determine a first recognition result, and the string matching sub-model identifies it to determine a second recognition result; the device recognition result is then determined from the first recognition result and the second recognition result. Because the two sub-models identify the device information in combination and their results jointly determine the device recognition result, both the accuracy and the efficiency of device information identification are improved. Device information can be identified accurately even when its keywords are incomplete or their order is changed. In addition, during training and updating of the preset recognition model, web crawler technology automatically obtains the device labels corresponding to the device information, which enables automatic and accurate label acquisition and automatic updating of the preset recognition model, so that new device information not yet in the preset device model database can also be identified effectively.
In another embodiment of the present application, refer to FIG. 3, which shows a detailed flowchart of a device identification method provided by an embodiment of the present application. As shown in FIG. 3, the detailed flow may include:
S301. Clean and count the device information reported by smart gateways.
It should be noted that, in this embodiment, the device information reported by smart gateways may serve as the raw data for training the preset recognition model. Specifically, all device information reported by smart gateways is first cleaned and counted in preparation for model training.
Further, for the specific implementation of step S301, refer to FIG. 4, which shows a detailed flowchart of another device identification method provided by an embodiment of the present application. As shown in FIG. 4, the detailed flow may include:
S301a. Store the device information reported by smart gateways in the total device information database.
It should be noted that this step may create a total device information database (also called the total database of device-reported information) based on the Hadoop Distributed File System (HDFS) and store all device information reported by smart gateways in it. HDFS is used in this embodiment because of its high reliability, high scalability, and high throughput, but this is not a specific limitation.
S301b. Use regular expressions to remove invalid device information from the total device information database.
It should be noted that this step may use regular expressions to remove special characters other than Chinese characters, letters, and digits from the total device information database, as well as invalid device information such as empty or anonymous entries.
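A regular-expression cleaning step of this kind might be sketched as below. Several details are assumptions not fixed by the application: special characters are replaced by a single space rather than deleted (so that word boundaries survive for later string segmentation), Chinese characters are approximated by the range U+4E00 to U+9FFF, and the placeholder set used to detect anonymous entries is made up.

```python
import re

# Anything that is not a digit, an ASCII letter, or a common Chinese character.
_SPECIAL = re.compile(r"[^0-9A-Za-z\u4e00-\u9fff]+")
# Hypothetical markers for empty or anonymous device information.
_PLACEHOLDERS = {"", "anonymous", "unknown", "null"}


def clean(info):
    """Return the cleaned device information, or None if it is invalid."""
    cleaned = _SPECIAL.sub(" ", info).strip()
    if cleaned.lower() in _PLACEHOLDERS:
        return None
    return cleaned


for raw in ["yunuo-oppo-k9!!", "小米_电视#4A", "***", "Anonymous"]:
    print(repr(raw), "->", repr(clean(raw)))
```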
S301c. Use a preset statistical tool to count the occurrences of each distinct piece of device information and deduplicate.
It should be noted that this step may use a preset statistical tool (for example, MapReduce) to count the occurrences of each distinct piece of device information in the total device information database (that is, the valid device information retained after invalid device information is removed) and to deduplicate the device information in the database.
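The counting and deduplication can be pictured with a plain-Python imitation of the map and reduce phases. This is not Hadoop MapReduce code; the partitions, function names, and sample data are assumptions used only to show the shape of the computation.

```python
from collections import defaultdict
from itertools import chain


def map_phase(records):
    """Emit a (device information, 1) pair for every record in one partition."""
    return [(record, 1) for record in records]


def reduce_phase(pairs):
    """Sum the emitted counts per key; the keys form the deduplicated set."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Hypothetical partitions of cleaned device information from different gateways.
partitions = [
    ["yunuo oppo k9", "huawei p40"],
    ["yunuo oppo k9"],
    ["xiaomi tv 4a", "yunuo oppo k9"],
]
mapped = chain.from_iterable(map_phase(p) for p in partitions)
counts = reduce_phase(mapped)
print(counts)
```

Because each distinct piece of device information appears once as a key, counting and deduplication fall out of the same pass.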
S301d. Filter out valid, high-frequency device information according to the occurrence counts.
S301e. Update the total device information database.
It should be noted that, based on the occurrence counts and the preset count threshold, the total device information database is updated with the device information whose number of occurrences exceeds the threshold, so that only valid, high-frequency device information remains in the database.
S302. Use a web crawler to construct or update the preset device model database.
It should be noted that the preset device model database (also called the device model database) may include four kinds of data: the device information from the total device information database and its corresponding device model, device brand, and device type.
Further, for the specific implementation of step S302, refer to FIG. 5, which shows a detailed flowchart of yet another device identification method provided by an embodiment of the present application. As shown in FIG. 5, the detailed flow may include:
S302a、网络爬虫技术自动获取设备信息总数据库中各设备信息的种子URL。S302a. The web crawler technology automatically obtains the seed URL of each device information in the general device information database.
需要说明的是,本步骤利用网络爬虫技术自动地从搜索引擎的查询结果中获取设备信息的相关种子URL。It should be noted that, in this step, the web crawler technology is used to automatically obtain the relevant seed URL of the device information from the query results of the search engine.
S302b、抓取种子URL所对应的网页信息。S302b. Grab webpage information corresponding to the seed URL.
需要说明的是,本步骤对已抓取的种子URL再次利用网络爬虫技术获得对应的网页信息。It should be noted that, in this step, the web crawler technology is used again to obtain the corresponding web page information for the crawled seed URL.
S302c、从网页信息中筛选出设备型号、设备品牌、设备类型三种信息。S302c. Filter out three types of information, namely device model, device brand, and device type, from the web page information.
需要说明的是,从网页信息中筛选出设备型号、设备品牌、设备类型等三种信息(即设备标签,也称作设备型号信息)。It should be noted that three types of information (ie, device label, also referred to as device model information) such as device model, device brand, and device type are screened out from the webpage information.
S302d、如果设备标签抓取成功,则将设备信息及其对应的设备型号、设备品牌、设备类型添加进设备型号数据库。S302d. If the device label is captured successfully, add the device information and its corresponding device model, device brand, and device type into the device model database.
S302e、如果设备标签抓取不成功,则将设备信息添加进无效设备信息数据库。S302e. If the capture of the device label is unsuccessful, add the device information into the invalid device information database.
需要说明的是,对于能够成功抓取到设备标签的设备信息,就将设备信息以及对应的设备标签(即设备信息对应的设备型号、设备品牌、设备类型)添加进设备型号数据库。It should be noted that, for the device information for which the device tag can be successfully captured, the device information and the corresponding device tag (that is, the device model, device brand, and device type corresponding to the device information) are added to the device model database.
对于不能成功抓取到设备标签的设备信息,就将设备信息添加进无效设备信息数据库。For the device information for which the device label cannot be successfully captured, the device information is added to the invalid device information database.
对设备信息总数据库的每一个设备信息均采取以上步骤进行网络爬虫,完成设备型号数据库的构建,将网络爬虫无法获取到设备型号、设备品牌及设备类型的设备信息保存到无效设备信息数据库。For each piece of equipment information in the general equipment information database, the above steps are taken to crawl the network to complete the construction of the equipment model database, and save the equipment information that the network crawler cannot obtain to the equipment model, equipment brand and equipment type to the invalid equipment information database.
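The control flow of S302a to S302e can be sketched as follows. This is a minimal illustration, not the application's implementation: the search, fetch, and extraction steps are passed in as callables because the text does not specify them, and the names `build_model_database`, `find_seed_urls`, `fetch_page`, `extract_label`, and the offline stand-in data are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

Label = Tuple[str, str, str]  # (device model, device brand, device type)

def build_model_database(
    device_infos: Iterable[str],
    find_seed_urls: Callable[[str], List[str]],
    fetch_page: Callable[[str], str],
    extract_label: Callable[[str], Optional[Label]],
) -> Tuple[Dict[str, Label], Set[str]]:
    """Return (device model database, invalid device information database)."""
    model_db: Dict[str, Label] = {}
    invalid_db: Set[str] = set()
    for info in device_infos:
        label = None
        for url in find_seed_urls(info):   # S302a: seed URLs from search results
            page = fetch_page(url)         # S302b: fetch the page behind the URL
            label = extract_label(page)    # S302c: model, brand, type
            if label is not None:
                break
        if label is not None:
            model_db[info] = label         # S302d: label captured
        else:
            invalid_db.add(info)           # S302e: no label, or no seed URL
    return model_db, invalid_db

# Offline stand-ins so the sketch runs without network access.
FAKE_SEARCH = {"MiBox3": ["https://example.com/mibox3"]}
FAKE_PAGES = {"https://example.com/mibox3": "MiBox 3|Xiaomi|set-top box"}

def fake_extract(page: str) -> Optional[Label]:
    parts = page.split("|")
    return (parts[0], parts[1], parts[2]) if len(parts) == 3 else None

model_db, invalid_db = build_model_database(
    ["MiBox3", "android1f2e"],
    find_seed_urls=lambda info: FAKE_SEARCH.get(info, []),
    fetch_page=lambda url: FAKE_PAGES.get(url, ""),
    extract_label=fake_extract,
)
```

Injecting the three callables keeps the branching logic testable without committing to any particular search engine or page parser.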
S303. Train the preset recognition model based on a convolutional neural network.
It should be noted that, once the device model database has been obtained, model training can be performed on it to obtain the preset recognition model based on a convolutional neural network (also called the precise device identification model).
It should be noted that, for step S303, the network structure and training process can be seen in FIG. 6, which shows a schematic diagram of a training framework for the preset recognition model provided by an embodiment of the present application. As shown in FIG. 6, the framework may include a training vector set acquisition part 601, a sub-network training part 602, and a model combination part 603.
In the training vector set acquisition part 601, the main function is to apply single-character segmentation and string segmentation to the device information in the device model database to obtain a single-character vocabulary (also called the letter vocabulary) and a string vocabulary (also called the word vocabulary), and then to perform single-character-level vector conversion and string-level vector conversion on the device model database using the single-character vocabulary and the string vocabulary respectively, obtaining training vector set M1 and training vector set N1, where M1 denotes the single-character training vector set and N1 denotes the string training vector set.
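The two segmentation paths in part 601 can be sketched as below. The text does not say how string segmentation chooses its boundaries or how vectors are encoded, so this sketch makes two assumptions: a "string" is a maximal run of letters, of digits, or of Chinese characters, and vector conversion maps each token to an integer index padded to a fixed length. All function names are hypothetical.

```python
import re

def split_chars(info):
    """Single-character segmentation: one token per character."""
    return list(info)

def split_words(info):
    """String segmentation (assumed rule): runs of letters, digits, or Chinese characters."""
    return re.findall(r"[A-Za-z]+|[0-9]+|[\u4e00-\u9fff]+", info)

def build_vocab(samples, splitter):
    """Build a token-to-index vocabulary; 0 is padding, 1 is unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for sample in samples:
        for token in splitter(sample):
            vocab.setdefault(token, len(vocab))
    return vocab

def to_vector(info, vocab, splitter, length):
    """Convert device info to a fixed-length list of vocabulary indices."""
    ids = [vocab.get(token, vocab["<unk>"]) for token in splitter(info)][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))

samples = ["MiBox3", "HUAWEIMate40"]
letter_vocab = build_vocab(samples, split_chars)  # single-character vocabulary
word_vocab = build_vocab(samples, split_words)    # string vocabulary

m1 = [to_vector(s, letter_vocab, split_chars, 12) for s in samples]  # M1
n1 = [to_vector(s, word_vocab, split_words, 4) for s in samples]     # N1
```

Rebuilding both vocabularies from the current sample set is what would let them be created and updated automatically when new device information is added.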
The sub-network training part 602 includes a single-character matching sub-network and a string matching sub-network. The single-character matching sub-network includes convolutional layer C1, convolutional layer C2, convolutional layer C3, fully connected layer F1, a dropout layer, activation layer R1, fully connected layer F2, and cross-entropy loss layer S1. The string matching sub-network includes convolutional layer D1, convolutional layer D2, convolutional layer D3, fully connected layer T1, a dropout layer, activation layer K1, fully connected layer T2, and cross-entropy loss layer S2. The main function of part 602 is to input M1 and N1 into the single-character matching sub-network and the string matching sub-network respectively to extract key features of the device information. The two sub-networks are trained by iterative supervised training with two independent cross-entropy loss functions and a unified set of device labels (taken from the device label database). After the iterative device-label classification training is complete, the single-character matching sub-model (also called the device identification letter model) and the string matching sub-model (also called the device identification word model) are obtained.
In the model combination part 603, the main function is to combine the single-character matching sub-model and the string matching sub-model to obtain the preset recognition model.
It should be noted that classification training based on the convolutional neural network mainly uses a network structure in which the single-character matching sub-network and the string matching sub-network run in parallel; see FIG. 6 for the structure. The sample training set used to train the model is the device model database built by the web crawler. The device model, device brand, and device type corresponding to each piece of device information in the device model database are extracted and deduplicated together, and the deduplicated device models, brands, and types are used to build the device label database. Single-character segmentation and string segmentation are then used to automatically create and update the single-character vocabulary and the string vocabulary from the sample training set. Single-character-level vector conversion and string-level vector conversion are applied to the device model database using the two vocabularies respectively, yielding training vector sets M1 and N1. M1 and N1 are input into the single-character matching sub-network and the string matching sub-network respectively to extract key features of the device information. The two sub-networks are trained by iterative supervised training with two independent cross-entropy loss functions and a unified set of device labels. After the iterative classification training is complete, the single-character matching sub-model and the string matching sub-model are obtained, and the combination of the two sub-models serves as the preset recognition model.
S304. Use the preset recognition model.
It should be noted that the same test set is first input into both sub-models to obtain the optimal recognition threshold of each (that is, the first judgment threshold and the second judgment threshold in the foregoing embodiments). The device information to be identified is then input into the single-character matching sub-model and the string matching sub-model for key feature extraction and classification, and the probabilities that the device information belongs to each device label are sorted in descending order. In each sub-model, the device label with the highest classification probability is taken as that sub-model's recognition result. The classification probabilities of the two recognition results are compared with the respective optimal thresholds, and the device identification result (also called the precise device identification result) is output according to the set threshold decision rule.
The device identification result is determined according to the following threshold decision rule; see formula (1):
In formula (1), T_L denotes the first judgment threshold and T_W denotes the second judgment threshold; R_L denotes the first recognition result of the device information to be identified in the single-character matching sub-model, and P_L is the first classification probability that the information belongs to R_L; R_W denotes the second recognition result of the device information to be identified in the string matching sub-model, and P_W is the second classification probability that the information belongs to R_W; R denotes the device identification result.
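Formula (1) itself does not survive in this text. The following is a reconstruction, not the original figure, assembled from the symbols defined above and from the decision rule stated later for the determining unit 803 (equal results are accepted directly; otherwise exactly one sub-model must reach its threshold):

```latex
R =
\begin{cases}
R_L, & R_L = R_W \\
R_L, & R_L \neq R_W,\ P_L \geq T_L,\ P_W < T_W \\
R_W, & R_L \neq R_W,\ P_L < T_L,\ P_W \geq T_W \\
\text{unrecognized}, & \text{otherwise}
\end{cases}
\tag{1}
```

The "otherwise" branch covers the two remaining cases in which the results differ: both probabilities reach their thresholds, or neither does.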
S305. Automatically identify newly added devices attached to the smart gateway and update.
It should be noted that, for the specific implementation of step S305, refer to FIG. 7, which shows a detailed flowchart of still another device identification method provided by an embodiment of the present application. As shown in FIG. 7, the detailed flow may include:
S305a. The smart gateway reports the device information to be identified.
S305b. Determine whether the device information to be identified exists in the invalid device information database.
It should be noted that if the result is yes, step S305i is executed; if the result is no, step S305c is executed.
S305c. Input the device information to be identified into the preset recognition model for identification.
S305d. Determine whether the identification result is "unrecognized".
It should be noted that if the result is yes, step S305e is executed; if the result is no, step S305j is executed.
S305e. Output the device information to be identified.
S305f. The web crawler updates the device model database.
S305g. Update the single-character vocabulary and the string vocabulary.
S305h. Iteratively train the preset recognition model based on the convolutional neural network.
It should be noted that if the web crawler can successfully obtain the device label for the device information to be identified, the device information and its corresponding device label are used to update the device model database, the single-character vocabulary and the string vocabulary are then updated, and finally the preset recognition model based on the convolutional neural network is iteratively trained, completing the iterative update of the preset recognition model.
S305i. Output that the device information to be identified is invalid device information.
It should be noted that if the device information to be identified exists in the invalid device information database, it is directly determined to be invalid device information.
S305j. Output the identified device model, device brand, and device type.
It should be noted that, during identification, the device information to be identified reported by the smart gateway is first matched against the device information in the invalid device information database. If the match succeeds, identification ends with the output that the reported information is invalid device information, and the invalid device information database may also be updated. If the match fails, the process of step S304 is carried out and the preset recognition model is used to identify the device information. When the result obtained from the preset recognition model is "unrecognized", steps S302 and S303 are carried out in turn: the unrecognized new device information is added to the device model database, the device label is crawled automatically, and the preset recognition model is updated.
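The branching in S305a to S305j can be summarized in a short sketch. The model, the crawler, and the retraining step are passed in as callables because their internals are described elsewhere; the function name `handle_report`, the status strings, and the choice to add uncrawlable information to the invalid database (mirroring S302e) are assumptions made for illustration.

```python
def handle_report(info, invalid_db, model_db, recognize, crawl_label, retrain):
    """Process one piece of reported device information (S305a).

    recognize(info)   -> label tuple, or None when the model says "unrecognized"
    crawl_label(info) -> label tuple, or None when crawling fails
    retrain(model_db) -> rebuilds vocabularies and retrains the model
    """
    if info in invalid_db:          # S305b, then S305i
        return ("invalid", None)
    label = recognize(info)         # S305c
    if label is not None:           # S305d, then S305j
        return ("recognized", label)
    crawled = crawl_label(info)     # S305e and S305f
    if crawled is None:
        invalid_db.add(info)        # assumed: same handling as S302e
    else:
        model_db[info] = crawled
        retrain(model_db)           # S305g and S305h
    return ("unrecognized", None)
```

A caller would supply the trained model's prediction function as `recognize` and the crawler pipeline as `crawl_label`; the next report of the same device information would then be recognized or rejected without crawling again.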
In summary, the embodiments of the present application provide a device identification method whose implementation can be divided into five steps: S301, clean and count the device information reported by smart gateways; S302, build/update the device model database with a web crawler; S303, train the preset recognition model based on a convolutional neural network; S304, use the preset recognition model; and S305, automatically identify newly added devices attached to the smart gateway and update.
Step S301 is summarized as follows: create a general device information database based on the Hadoop Distributed File System, and store the device information reported by all smart gateways in it.
Use regular expressions to remove from the general device information database all special characters other than Chinese characters, letters, and digits, as well as invalid device information such as entries that are empty or anonymous.
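The cleaning step can be sketched with Python's `re` module. The text does not give the actual expression, so the character class below (ASCII letters and digits plus the CJK Unified Ideographs range U+4E00 to U+9FFF) and the set of placeholder strings treated as anonymous are assumptions; note that this version also removes whitespace, since it is neither a letter, a digit, nor a Chinese character.

```python
import re

# Anything that is not a digit, an ASCII letter, or a common Chinese character.
_SPECIAL = re.compile(r"[^0-9A-Za-z\u4e00-\u9fff]")

# Hypothetical placeholders that mark anonymous device information.
_INVALID = {"", "anonymous", "unknown"}

def clean(info):
    """Strip special characters; return None for empty or anonymous entries."""
    cleaned = _SPECIAL.sub("", info)
    return None if cleaned.lower() in _INVALID else cleaned

raw = ["Mi-Box_3!", "华为 Mate40", "   ", "Anonymous"]
valid = [c for c in (clean(r) for r in raw) if c is not None]
```

Only the first two raw entries survive cleaning; the blank entry and the anonymous placeholder are dropped.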
Use MapReduce to count the number of occurrences of each distinct piece of device information in the general device information database, and deduplicate the device information.
Based on these counts and a preset count threshold, update the original general device information database with the device information whose number of occurrences is greater than the threshold, retaining only valid, high-frequency device information. The flowchart of this process is shown in FIG. 4.
Step S302 is summarized as follows: the device model database includes four kinds of data, namely the device information in the general device information database and its corresponding device model, device brand, and device type.
First, web crawler technology is used to automatically obtain the seed URLs related to the device information from the query results of a search engine.
Then web crawler technology is applied again to the obtained seed URLs to fetch the corresponding web page information.
Finally, three kinds of information (device model, device brand, and device type) are extracted from the web page information.
The above steps are applied to every piece of valid device information reported by smart gateways to complete the construction of the device model database. Reported device information for which the crawler cannot obtain a device model, brand, and type is saved to the invalid device information database. This process is shown in detail in FIG. 5.
Step S303 is summarized as follows: device model classification training based on the convolutional neural network mainly uses a network structure in which the single-character matching sub-network and the string matching sub-network run in parallel, as shown in Figure X. The sample training set used for training is the device model database built by the web crawler. The device model, device brand, and device type corresponding to each piece of device information in the device model database are extracted and deduplicated together, and the deduplicated device models, brands, and types are used to build the device label database. Single-character segmentation and string segmentation are then used to automatically create and update the single-character vocabulary and the string vocabulary from the training data set. Single-character-level vector conversion and string-level vector conversion are applied to the device model database using the two vocabularies respectively, yielding training vector sets M1 and N1. M1 and N1 are input into the single-character matching sub-network and the string matching sub-network respectively to extract key features of the device information. The two sub-networks are trained by iterative supervised training with two independent cross-entropy loss functions and a unified set of device labels. After the iterative device model classification training is complete, the single-character matching sub-model and the string matching sub-model are obtained, and the combination of the two sub-models serves as the preset recognition model. This process is shown in detail in FIG. 6.
Step S304 is summarized as follows: the same test set is first input into both sub-models to obtain the optimal device identification threshold of each. The device information to be identified is then input into the single-character matching sub-model and the string matching sub-model for key feature extraction and classification, and the probabilities that the device information belongs to each device label are sorted in descending order. In each model, the label with the highest classification probability is taken as that model's recognition result. The classification probabilities of the two recognition results are compared with the respective thresholds, and the precise device identification result is output according to the set threshold decision rule. For the threshold decision rule used to determine the device identification result, see formula (1).
Step S305 is summarized as follows: the newly added device information to be identified is first matched against the information in the invalid device information database. If the match succeeds, identification ends with the output that the reported information is invalid device information. If the match fails, step S304 is carried out. When the result obtained from the preset recognition model is "unrecognized", steps S302 and S303 are carried out in turn: the unrecognized device information is added to the device model database, the device label is crawled automatically, and the preset recognition model is updated. This process is shown in detail in FIG. 7.
This embodiment provides a device identification method and elaborates on the specific implementation of the foregoing embodiments. Compared with the methods provided in the related art for identifying devices attached to a smart gateway, the embodiments of the present application combine big data analysis technology, web crawling, and a convolutional neural network classification algorithm to achieve precise identification of multi-dimensional information about attached devices, such as type, brand, and model. Web crawler technology is used to automatically capture the device model, device brand, and device type corresponding to the device information reported by smart gateways in the general device information database, and to build the preset device model database and the invalid device information database. Single-character segmentation and string segmentation are used to build the single-character vocabulary and the string vocabulary respectively; single-character-level vector conversion and string-level vector conversion are applied to the device model database using these vocabularies to obtain the corresponding training vector sets. The two training vector sets are input into the corresponding single-character matching sub-network and string matching sub-network to extract key features of the device information. The two sub-networks are trained by iterative supervised training with two independent cross-entropy loss functions and a unified set of device labels, finally yielding the single-character matching sub-model and the string matching sub-model, and the combination of the two sub-models serves as the preset recognition model. When a smart gateway reports new device information to be identified, the preset recognition model performs key feature extraction and classification on it. When the preset recognition model cannot classify the information precisely, the web crawler is started automatically to update the device model database, and the updated database is used to train the preset recognition model again, yielding an updated precise device identification model. This not only achieves automated, precise identification of devices attached to a smart gateway, but also, while improving the identification accuracy for such devices, achieves automated identification of new device labels and updating of the device model database.
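In yet another embodiment of the present application, refer to FIG. 8, which shows a schematic structural diagram of a device identification apparatus 80 provided by an embodiment of the present application. As shown in FIG. 8, the device identification apparatus 80 may include an acquisition unit 801, an identification unit 802, and a determining unit 803, wherein:
the acquisition unit 801 is configured to acquire device information to be identified;
the identification unit 802 is configured to input the device information to be identified into a preset recognition model, wherein the preset recognition model includes a single-character matching sub-model and a string matching sub-model; to identify the device information to be identified using the single-character matching sub-model and determine a first recognition result; and to identify the device information to be identified using the string matching sub-model and determine a second recognition result;
the determining unit 803 is configured to determine a device identification result according to the first recognition result and the second recognition result.
In some embodiments, the determining unit 803 is further configured to match the device information to be identified against the device information in the invalid device information database; if the match succeeds, to determine that the device information to be identified is invalid device information; and if the match fails, to perform the step of inputting the device information to be identified into the preset recognition model.
In some embodiments, referring to FIG. 8, the device identification apparatus 80 may further include a training unit 804 configured to obtain a sample training set from the preset device model database, wherein the sample training set includes at least one piece of device information and the device label corresponding to each piece, and the device label includes at least a device model, a device brand, and a device type; to train the single-character matching sub-network with the sample training set to obtain the single-character matching sub-model, and to train the string matching sub-network with the sample training set to obtain the string matching sub-model; and to determine the preset recognition model according to the single-character matching sub-model and the string matching sub-model.
In some embodiments, the training unit 804 is further configured to obtain an original device information set that includes at least one piece of device information; to perform data cleaning and data statistics on the device information in the original device information set to obtain an intermediate device information set and the number of occurrences of each piece of device information in it; to compare the number of occurrences of each piece of device information in the intermediate device information set with a preset count threshold and obtain the device information whose number of occurrences is greater than the preset count threshold; to build the general device information database from that device information; and to crawl the general device information database to obtain the preset device model database.
In some embodiments, the training unit 804 is specifically configured to remove invalid device information from the original device information set using a preset expression to obtain at least one piece of valid device information; to count the number of occurrences of each piece of valid device information and deduplicate the valid device information to obtain at least one piece of target device information; and to build the intermediate device information set from the target device information and determine the number of occurrences of each piece of device information in the intermediate device information set.
In some embodiments, the training unit 804 is further specifically configured to use a web crawler to obtain the seed URL of each piece of device information in the general device information database and to fetch the web page information corresponding to each seed URL; to determine the device label corresponding to each piece of device information from the web page information corresponding to its seed URL; and to build the preset device model database from each piece of device information and its corresponding device label.
In some embodiments, the training unit 804 is further configured to store a piece of device information in the invalid device information database if its seed URL cannot be obtained; and/or to store a piece of device information in the invalid device information database if its device label cannot be determined.
In some embodiments, the training unit 804 is further specifically configured to create a device label database from the device labels corresponding to the at least one piece of device information; to perform single-character segmentation on the device information in the sample training set to determine a single-character vocabulary, perform single-character-level vector conversion on the sample training set using the single-character vocabulary to obtain a single-character training vector set, input the single-character training vector set into the single-character matching sub-network, and perform supervised iterative training using the device label database to obtain the single-character matching sub-model; and to perform string segmentation on the device information in the sample training set to determine a string vocabulary, perform string-level vector conversion on the sample training set using the string vocabulary to obtain a string training vector set, input the string training vector set into the string matching sub-network, and perform supervised iterative training using the device label database to obtain the string matching sub-model.
In some embodiments, the determining unit 803 is specifically configured to perform feature extraction and classification on the device information to be identified using the single-character matching sub-model and determine the first probabilities that the device information belongs to each device label; and to select the maximum of the first probabilities and determine the device label corresponding to that maximum as the first recognition result.
In some embodiments, the determining unit 803 is specifically configured to perform feature extraction and classification on the device information to be identified using the string matching sub-model and determine the second probabilities that the device information belongs to each device label; and to select the maximum of the second probabilities and determine the device label corresponding to that maximum as the second recognition result.
In some embodiments, the determining unit 803 is specifically configured to determine the first recognition result as the device identification result if the first recognition result equals the second recognition result; and, if the first recognition result does not equal the second recognition result, to determine the first classification probability corresponding to the first recognition result and the second classification probability corresponding to the second recognition result, and to determine the device identification result according to the comparison of the first classification probability with the first judgment threshold and the comparison of the second classification probability with the second judgment threshold.
In some embodiments, the determining unit 803 is specifically configured to determine that the device identification result is the first recognition result if the first classification probability is greater than or equal to the first judgment threshold and the second classification probability is less than the second judgment threshold; to determine that the device identification result is the second recognition result if the first classification probability is less than the first judgment threshold and the second classification probability is greater than or equal to the second judgment threshold; and to determine that the device identification result is "unrecognized" if both classification probabilities are greater than or equal to their respective thresholds, or both are less than their respective thresholds.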
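The top-1 selection and the threshold rule described for the determining unit 803 can be written out as a short sketch. The function names `top1` and `combine`, the use of `None` to stand for "unrecognized", and the probability dictionaries are illustrative assumptions; the branch conditions follow the rule stated above.

```python
def top1(probs):
    """Return (label, probability) for the most probable device label."""
    label = max(probs, key=probs.get)
    return label, probs[label]

def combine(letter_probs, word_probs, t_l, t_w):
    """Combine the two sub-model outputs; None stands for 'unrecognized'."""
    r_l, p_l = top1(letter_probs)  # first recognition result and probability
    r_w, p_w = top1(word_probs)    # second recognition result and probability
    if r_l == r_w:
        return r_l                 # both sub-models agree
    if p_l >= t_l and p_w < t_w:
        return r_l                 # only the single-character model is confident
    if p_l < t_l and p_w >= t_w:
        return r_w                 # only the string model is confident
    return None                    # both confident but different, or neither

# Hypothetical probabilities over two device labels "A" and "B".
result = combine({"A": 0.9, "B": 0.1}, {"A": 0.4, "B": 0.6}, t_l=0.8, t_w=0.7)
```

In this usage the two sub-models disagree, only the single-character sub-model reaches its threshold, and so its label is returned.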
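In some embodiments, referring to FIG. 8, the device identification apparatus 80 may further include an updating unit 805 configured to add the device information to be identified to the preset device model database if the device identification result is "unrecognized", obtaining a new preset device model database; and to update and retrain the preset recognition model using the new preset device model database.
In some embodiments, the updating unit 805 is further configured to determine whether device information in the invalid device information database is valid; and, if it is determined to be valid, to update and retrain the preset recognition model using that device information.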
It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Therefore, this embodiment provides a computer storage medium storing a computer program which, when executed by at least one processor, implements the device identification method of any one of the foregoing embodiments.
Based on the composition of the device identification apparatus 80 and the computer storage medium described above, refer to FIG. 9, which shows a schematic structural diagram of an electronic device 90 provided by an embodiment of the present application. As shown in FIG. 9, the electronic device 90 may include a communication interface 901, a memory 902, and a processor 903, with the components coupled together through a bus system 904. It can be understood that the bus system 904 is used to implement connection and communication among these components. In addition to a data bus, the bus system 904 includes a power bus, a control bus, and a status signal bus. For clarity, however, the various buses are all labeled as the bus system 904 in FIG. 9. The communication interface 901 is configured to receive and send signals in the process of exchanging information with other external network elements;
the memory 902 is configured to store a computer program that can run on the processor 903;
the processor 903 is configured to perform the following when running the computer program:
acquiring device information to be identified;
inputting the device information to be identified into a preset recognition model, wherein the preset recognition model includes a single-character matching sub-model and a string matching sub-model;
identifying the device information to be identified using the single-character matching sub-model to determine a first recognition result, and identifying the device information to be identified using the string matching sub-model to determine a second recognition result;
determining a device identification result according to the first recognition result and the second recognition result.
It can be understood that the memory 902 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory 902 of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
The processor 903 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 903 or by instructions in the form of software. The processor 903 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 902; the processor 903 reads the information in the memory 902 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof.
For a software implementation, the techniques described herein may be implemented through modules (for example, procedures and functions) that perform the functions described herein. Software code may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Optionally, as another embodiment, the processor 903 is further configured to perform the device identification method of any one of the foregoing embodiments when running the computer program.
基于上述设备识别装置80的组成以及硬件结构示意图,参见图10,其示出了本申请实施例提供的另一种电子设备90的组成结构示意图。如图10所示,该电子设备90至少包括前述实施例中任一项所述的设备识别装置80。Based on the composition and hardware structure schematic diagram of the
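In the electronic device 90, the device information to be identified is recognized by a single-character matching sub-model and a string matching sub-model in combination, and the identification result is determined jointly from the recognition results of the two sub-models. This effectively improves the accuracy of device information identification and also improves identification efficiency.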
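The joint decision described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the method claimed in this application: the names `KNOWN_DEVICES`, `match_single_char`, `match_string`, and `identify`, the toy overlap scores standing in for the two sub-models, and the fusion rule (averaging the two scores per label) are all hypothetical.

```python
from collections import Counter
from typing import Dict

# Hypothetical reference data: label -> known device-information string.
KNOWN_DEVICES: Dict[str, str] = {
    "router": "tp-link archer ax50",
    "camera": "hikvision ds-2cd2043",
}


def match_single_char(info: str) -> Dict[str, float]:
    """Toy stand-in for the single-character sub-model: character-overlap score."""
    chars = Counter(info.lower())
    scores = {}
    for label, ref in KNOWN_DEVICES.items():
        ref_chars = Counter(ref)
        # Multiset intersection counts characters shared by both strings.
        overlap = sum((chars & ref_chars).values())
        total = max(sum(chars.values()), sum(ref_chars.values()), 1)
        scores[label] = overlap / total
    return scores


def match_string(info: str) -> Dict[str, float]:
    """Toy stand-in for the string sub-model: Jaccard score over whitespace tokens."""
    tokens = set(info.lower().split())
    scores = {}
    for label, ref in KNOWN_DEVICES.items():
        ref_tokens = set(ref.split())
        union = tokens | ref_tokens
        scores[label] = len(tokens & ref_tokens) / len(union) if union else 0.0
    return scores


def identify(info: str) -> str:
    """Combine both sub-model outputs (assumed rule: mean score) and pick the best label."""
    char_scores = match_single_char(info)
    str_scores = match_string(info)
    combined = {
        label: (char_scores[label] + str_scores[label]) / 2
        for label in KNOWN_DEVICES
    }
    return max(combined, key=combined.get)
```

Calling `identify("TP-Link Archer AX50 router")` selects the `router` label because both toy scorers favor it. In a real system the two scorers would be trained sub-models and the fusion rule would follow the embodiments described earlier in the application.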
The above are merely preferred embodiments of this application and are not intended to limit its scope of protection.
It should be noted that, in this application, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that comprises that element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
The methods disclosed in the several method embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are merely specific implementations of this application, but the scope of protection of this application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be determined by the scope of protection of the claims.
Claims (16)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111253826.5A CN116032741A (en) | 2021-10-27 | 2021-10-27 | Equipment identification method and device, electronic equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116032741A true CN116032741A (en) | 2023-04-28 |
Family
ID=86076651
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111253826.5A Pending CN116032741A (en) | 2021-10-27 | 2021-10-27 | Equipment identification method and device, electronic equipment and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116032741A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116934195A (en) * | 2023-09-14 | 2023-10-24 | 海信集团控股股份有限公司 | Commodity information checking method and device, electronic equipment and storage medium |
CN117194947A (en) * | 2023-08-16 | 2023-12-08 | 惠州市庆展科技有限公司 | Smart home equipment characteristic determining method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106874909A (en) * | 2017-01-18 | 2017-06-20 | 深圳怡化电脑股份有限公司 | A kind of recognition methods of image character and its device |
CN110188618A (en) * | 2019-05-07 | 2019-08-30 | 南京理工大学 | A method for identifying the speed limit value of a speed limit traffic sign |
CN110765973A (en) * | 2019-10-31 | 2020-02-07 | 上海掌门科技有限公司 | Account type identification method and device |
CN113176830A (en) * | 2021-04-30 | 2021-07-27 | 北京百度网讯科技有限公司 | Recognition model training method, recognition device, electronic equipment and storage medium |
CN113221705A (en) * | 2021-04-30 | 2021-08-06 | 平安科技(深圳)有限公司 | Automatic classification method, device, equipment and storage medium of electronic documents |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||