Disclosure of Invention
In view of the above, the present invention provides a smart city monitoring method, apparatus and information processing device to address the above problem.
In a first aspect of the embodiments of the present invention, a smart city monitoring method is provided, which is applied to a smart city monitoring system, where the smart city monitoring system includes an information processing device and a collection device that communicate with each other, and the collection device is disposed in different areas of a building and is used to collect monitoring information in the different areas of the building, and the method includes:
the collection device monitors a target area of a building in real time, acquires monitoring information of the target area, and uploads the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to a matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the collection device, where the first parameter structured information is used for representing a communication parameter logic distribution of the information processing device, and the second parameter structured information is used for representing a communication parameter logic distribution of the collection device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area;
the information processing equipment receives monitoring information corresponding to the target area uploaded by the acquisition equipment through the information transmission channel, and performs information category identification on the monitoring information to obtain first category information used for representing the image information and second category information used for representing the voice information, which are included in the monitoring information; classifying the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information in the monitoring information;
the information processing equipment extracts a first information feature of the image information and a second information feature of the voice information, performs feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and performs feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result; the first preset feature database is an image feature database, and the second preset feature database is a voice feature database;
the information processing device acquires a second confidence degree of the second recognition result obtained by taking the first recognition result as a reference and a first confidence degree of the first recognition result obtained by taking the second recognition result as a reference; obtaining a weighting coefficient for weighting the first recognition result and the second recognition result according to the first confidence degree and the second confidence degree; and carrying out weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determining the third recognition result pair.
Optionally, the extracting the first information feature of the image information and the second information feature of the voice information includes:
determining image coding information of the image information, dividing the image coding information according to coding segmentation marks in the image coding information to obtain a plurality of continuous coding information segments, determining a matching coefficient between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment;
determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information;
extracting a spectrogram of the voice information, separating a voiceprint curve in the voice information from the spectrogram and acquiring a voiceprint characteristic corresponding to the voiceprint curve;
obtaining text information corresponding to the voice information according to the voice information, performing word segmentation processing on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information;
and determining a first associated weight of the voiceprint features relative to the subject features and a second associated weight of the subject features relative to the voiceprint features, and performing weighted summation on the voiceprint features and the subject features based on the first associated weight and the second associated weight to obtain second information features of the voice information.
Optionally, the performing feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result includes:
acquiring a first target feature with the minimum cosine distance between the first target feature and the first information feature in the first preset feature database;
determining a behavior category corresponding to the first target feature from a preset first mapping relation list, where the behavior category is a behavior category of a person appearing in the image information;
generating the first recognition result based on the behavior category.
Optionally, the performing feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result includes:
determining at least part of second target features of which the similarity values with the second information features in the second preset feature database are smaller than or equal to a set threshold;
and determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
Optionally, the weighting and summing the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result includes:
respectively listing the first identification result and the second identification result in a numerical code form to obtain a first numerical code list corresponding to the first identification result and a second numerical code list corresponding to the second identification result;
pairing each first list unit in the first numerical code list with each second list unit in the second numerical code list to obtain at least part of list unit groups; each list unit group includes a first list unit and a second list unit;
and generating a third numerical value coding list according to at least part of the list unit group, weighting the third numerical value coding list based on the weighting coefficient to obtain a fourth numerical value coding list, and converting the fourth numerical value coding list into a third identification result based on first conversion logic which lists the first identification result in a numerical value coding form to obtain the first numerical value coding list corresponding to the first identification result or based on second conversion logic which lists the second identification result in a numerical value coding form to obtain the second numerical value coding list corresponding to the second identification result.
In a second aspect of the embodiments of the present invention, a smart city monitoring system is provided, which includes an information processing device and an acquisition device that communicate with each other, where the acquisition device is disposed in different areas of a building;
the acquisition device is configured to monitor a target area of a building in real time, acquire monitoring information of the target area, and upload the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to a matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the acquisition device, where the first parameter structured information is used for representing a communication parameter logic distribution of the information processing device, and the second parameter structured information is used for representing a communication parameter logic distribution of the acquisition device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area;
the information processing device is configured to receive, through the information transmission channel, the monitoring information corresponding to the target area uploaded by the acquisition device, and perform information category identification on the monitoring information to obtain first category information used for representing the image information and second category information used for representing the voice information, which are included in the monitoring information; classify the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information in the monitoring information; extract a first information feature of the image information and a second information feature of the voice information, perform feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and perform feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result, where the first preset feature database is an image feature database, and the second preset feature database is a voice feature database; acquire a second confidence degree of the second recognition result obtained by taking the first recognition result as a reference and a first confidence degree of the first recognition result obtained by taking the second recognition result as a reference; obtain, according to the first confidence degree and the second confidence degree, a weighting coefficient for weighting the first recognition result and the second recognition result; perform weighted summation on the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determine an early warning level corresponding to the third recognition result; and judge that the target area is abnormal when the early warning level exceeds a set level.
Optionally, the information processing apparatus is specifically configured to:
determining image coding information of the image information, dividing the image coding information according to coding segmentation marks in the image coding information to obtain a plurality of continuous coding information segments, determining a matching coefficient between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment;
determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information;
extracting a spectrogram of the voice information, separating a voiceprint curve in the voice information from the spectrogram and acquiring a voiceprint characteristic corresponding to the voiceprint curve;
obtaining text information corresponding to the voice information according to the voice information, performing word segmentation processing on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information;
and determining a first associated weight of the voiceprint features relative to the subject features and a second associated weight of the subject features relative to the voiceprint features, and performing weighted summation on the voiceprint features and the subject features based on the first associated weight and the second associated weight to obtain second information features of the voice information.
Optionally, the information processing apparatus is specifically configured to:
acquiring a first target feature with the minimum cosine distance between the first target feature and the first information feature in the first preset feature database;
determining a behavior category corresponding to the first target feature from a preset first mapping relation list, where the behavior category is a behavior category of a person appearing in the image information;
generating the first recognition result based on the behavior category.
Optionally, the information processing apparatus is specifically configured to:
determining at least part of second target features of which the similarity values with the second information features in the second preset feature database are smaller than or equal to a set threshold;
and determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
Optionally, the information processing apparatus is specifically configured to:
respectively listing the first identification result and the second identification result in a numerical code form to obtain a first numerical code list corresponding to the first identification result and a second numerical code list corresponding to the second identification result;
pairing each first list unit in the first numerical code list with each second list unit in the second numerical code list to obtain at least part of list unit groups; each list unit group includes a first list unit and a second list unit;
and generating a third numerical value coding list according to at least part of the list unit group, weighting the third numerical value coding list based on the weighting coefficient to obtain a fourth numerical value coding list, and converting the fourth numerical value coding list into a third identification result based on first conversion logic which lists the first identification result in a numerical value coding form to obtain the first numerical value coding list corresponding to the first identification result or based on second conversion logic which lists the second identification result in a numerical value coding form to obtain the second numerical value coding list corresponding to the second identification result.
Advantageous effects
According to the smart city monitoring method and system provided by the embodiment of the invention, firstly, the acquisition equipment uploads the acquired monitoring information of the target area to the information processing equipment through the preset information transmission channel, so that the transmission timeliness and accuracy of the monitoring information can be improved.
Secondly, the information processing equipment performs feature extraction on the image information and the voice information in the monitoring information and performs feature recognition to obtain a first recognition result and a second recognition result.
And finally, determining a weighting coefficient based on the first confidence coefficient and the second confidence coefficient obtained according to the first recognition result and the second recognition result, realizing the weighted summation of the first recognition result and the second recognition result to obtain a third recognition result, and then judging that the target area is abnormal when the early warning level of the third recognition result exceeds the set level.
Therefore, the monitoring information can be deeply mined from the angle of association of the image information and the voice information, and a more comprehensive and reliable monitoring analysis result is obtained, so that the safety of the building is ensured.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In order to address the problem that monitoring information is insufficiently mined in the building monitoring process, the embodiments of the present invention provide a smart city monitoring method and a smart city monitoring system, which can deeply mine the monitoring information so as to obtain a more comprehensive and reliable monitoring analysis result and ensure the safety of a building.
Referring to fig. 1, a schematic diagram of an architecture of a smart city monitoring system 100 according to an embodiment of the present invention is shown, in which the smart city monitoring system 100 includes an information processing device 200 and a collection device 300 that are in communication with each other. In this embodiment, the collection device 300 may be a camera or a microphone, which is not limited herein. The collection device 300 may be disposed in different areas of a building for collecting monitoring information within the different areas of the building.
Referring to fig. 2, a flowchart of a smart city monitoring method according to an embodiment of the present invention is shown, where the method is applied to the smart city monitoring system 100 in fig. 1. Further, the method can be specifically realized by the contents described in the following steps.
Step S21, the acquisition device monitors the target area of the building in real time, acquires the monitoring information of the target area, and uploads the monitoring information to the information processing device in real time through a preset information transmission channel; the information transmission channel is generated by the information processing device according to the matching degree between first parameter structured information obtained by the information processing device and second parameter structured information of the acquisition device, where the first parameter structured information is used for representing the communication parameter logic distribution of the information processing device, and the second parameter structured information is used for representing the communication parameter logic distribution of the acquisition device; the monitoring information includes image information corresponding to the target area and voice information corresponding to the target area.
Step S22, the information processing device receives the monitoring information corresponding to the target area uploaded by the acquisition device through the information transmission channel, and performs information category identification on the monitoring information to obtain first category information used for representing the image information and second category information used for representing the voice information, which are included in the monitoring information; classifying the monitoring information according to the first category information and the second category information to obtain the image information corresponding to the first category information and the voice information corresponding to the second category information in the monitoring information.
Step S23, the information processing apparatus extracts a first information feature of the image information and a second information feature of the voice information, performs feature recognition on the first information feature based on a first preset feature database to obtain a first recognition result, and performs feature recognition on the second information feature based on a second preset feature database to obtain a second recognition result; the first preset feature database is an image feature database, and the second preset feature database is a voice feature database;
step S24, the information processing apparatus acquires a second confidence degree of the second recognition result obtained with the first recognition result as a reference and a first confidence degree of the first recognition result obtained with the second recognition result as a reference; obtaining a weighting coefficient for weighting the first recognition result and the second recognition result according to the first confidence degree and the second confidence degree; weighting and summing the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result, and determining an early warning grade corresponding to the third recognition result; and judging that the target area is abnormal when the early warning level exceeds a set level.
It can be understood that, based on the descriptions of the above steps S21-S24, first, the collection device uploads the collected monitoring information of the target area to the information processing device through the preset information transmission channel, so that the timeliness and accuracy of the transmission of the monitoring information can be improved. Secondly, the information processing device performs feature extraction on the image information and the voice information in the monitoring information and performs feature recognition to obtain a first recognition result and a second recognition result. Finally, a weighting coefficient is determined based on the first confidence degree and the second confidence degree obtained according to the first recognition result and the second recognition result, the weighted summation of the first recognition result and the second recognition result is realized to obtain a third recognition result, and it is then judged that the target area is abnormal when the early warning level of the third recognition result exceeds the set level. Therefore, the monitoring information can be deeply mined from the angle of association of the image information and the voice information, and a more comprehensive and reliable monitoring analysis result is obtained, so that the safety of the building is ensured.
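As an illustrative, non-limiting sketch, the confidence-based fusion in step S24 may be implemented as follows in Python. The function and variable names are hypothetical, and deriving the weighting coefficients by normalizing the two cross-referenced confidence degrees is an assumption; the embodiment does not prescribe a specific normalization:

```python
def fuse_recognition_results(first_result, second_result,
                             first_conf, second_conf):
    """Weight two recognition scores by their cross-referenced confidences.

    first_conf: confidence of the first (image) result, evaluated with
                the second (voice) result as reference.
    second_conf: confidence of the second (voice) result, evaluated with
                 the first (image) result as reference.
    """
    total = first_conf + second_conf
    if total == 0:
        raise ValueError("at least one confidence must be positive")
    # Weighting coefficients obtained from the two confidence degrees.
    w1, w2 = first_conf / total, second_conf / total
    # Weighted summation yields the third recognition result.
    return w1 * first_result + w2 * second_result

# Hypothetical numeric recognition scores and confidences for illustration.
third = fuse_recognition_results(0.9, 0.4, first_conf=0.8, second_conf=0.2)
SET_LEVEL = 0.7
is_abnormal = third > SET_LEVEL  # early-warning check of step S24
```

In this sketch a higher fused score stands in for a higher early warning level; in practice the third recognition result would be mapped to a discrete early warning grade before comparison with the set level.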
In an alternative embodiment, in order to accurately determine the first information feature and the second information feature, in step S23, the extracting the first information feature of the image information and the second information feature of the voice information may specifically include what is described in the following steps.
Step S231, determining image coding information of the image information, dividing the image coding information according to the coding division identifiers in the image coding information to obtain a plurality of continuous coding information segments, determining a matching coefficient between every two coding information segments, and performing relevance correction on each coding information segment according to all the determined matching coefficients to obtain a target coding information segment corresponding to each coding information segment.
Step S232, determining the code character distribution corresponding to each target code information segment, listing the character structure topology of each code character distribution, determining the character distribution characteristics corresponding to each code character distribution according to the character structure topology, and integrating all the character distribution characteristics to obtain the first information characteristics of the image information.
Step S233, extracting the spectrogram of the voice information, separating the voiceprint curve in the voice information from the spectrogram, and acquiring the voiceprint characteristics corresponding to the voiceprint curve.
Step S234, obtaining text information corresponding to the voice information according to the voice information, performing word segmentation processing on the text information to obtain a plurality of keywords, determining topic information corresponding to the text information according to semantic connection relations among the keywords, and extracting topic features of the topic information.
Step S235, determining a first association weight of the voiceprint feature with respect to the subject feature and a second association weight of the subject feature with respect to the voiceprint feature, and performing weighted summation on the voiceprint feature and the subject feature based on the first association weight and the second association weight to obtain a second information feature of the speech information.
It can be understood that the first information characteristic and the second information characteristic can be accurately determined through the contents described in the above steps S231 to S235.
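The weighted fusion of step S235 can be sketched as follows. This is a minimal illustration only: using the cosine similarity between the voiceprint feature and the topic feature to derive the two mutual association weights is an assumption introduced here, not something the embodiment specifies, and both feature vectors are assumed to be nonzero and of equal dimension:

```python
import numpy as np

def fuse_voice_features(voiceprint_feat, topic_feat):
    """Fuse voiceprint and topic features into the second information feature."""
    v = np.asarray(voiceprint_feat, dtype=float)
    t = np.asarray(topic_feat, dtype=float)
    # Proxy for the mutual association: cosine similarity of the two vectors.
    sim = float(v @ t) / (np.linalg.norm(v) * np.linalg.norm(t))
    w_v = 0.5 * (1.0 + sim)  # first association weight (voiceprint vs. topic)
    w_t = 1.0 - w_v          # second association weight (topic vs. voiceprint)
    # Weighted summation produces the second information feature.
    return w_v * v + w_t * t
```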
In a specific implementation, in step S23, the performing the feature recognition on the first information feature based on the first preset feature database to obtain a first recognition result may specifically include the method described in the following steps.
(11) And acquiring a first target feature with the minimum cosine distance between the first target feature and the first information feature in the first preset feature database.
(12) And determining a behavior category corresponding to the first target feature from a preset first mapping relation list, where the behavior category is a behavior category of a person appearing in the image information.
(13) Generating the first recognition result based on the behavior category.
In this embodiment, the first recognition result including the behavior class can be accurately determined through the contents described in the above steps (11) to (13).
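Steps (11)-(13) above can be sketched as a nearest-neighbor lookup by cosine distance. The database contents and the mapping from feature identifiers to behavior categories below are hypothetical placeholders for illustration:

```python
import numpy as np

def cosine_distance(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def recognize_behavior(info_feature, feature_db, behavior_map):
    """feature_db: {feature_id: vector}; behavior_map: {feature_id: category}."""
    # (11) first target feature: minimum cosine distance to the input feature
    target_id = min(feature_db,
                    key=lambda fid: cosine_distance(feature_db[fid], info_feature))
    # (12) behavior category from the first mapping relation list
    # (13) the first recognition result carries that behavior category
    return {"feature_id": target_id, "behavior": behavior_map[target_id]}
```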
On the basis of the above, in step S23, the performing the feature recognition on the second information feature based on the second preset feature database to obtain the second recognition result may specifically include the method described in the following steps.
(21) And determining at least part of second target features of which the similarity values with the second information features in the second preset feature database are smaller than or equal to a set threshold.
(22) And determining a semantic result corresponding to each second target feature according to a preset second mapping relation list, and fusing all the semantic results to obtain the second recognition result.
In this embodiment, the second recognition result can be accurately determined through the contents described in the above steps (21) to (22).
In a specific implementation, in step S24, the weighting and summing the first recognition result and the second recognition result based on the weighting coefficient to obtain a third recognition result may specifically include the method described in the following steps.
Step S241, respectively listing the first identification result and the second identification result in a form of numerical codes to obtain a first numerical code list corresponding to the first identification result and a second numerical code list corresponding to the second identification result.
Step S242, pairing each first list unit in the first numerical code list and each second list unit in the second numerical code list to obtain at least part of a list unit group; each list unit group comprises a first list unit and a second list unit.
Step S243, generating a third numerical code list according to at least a part of the list unit group, weighting the third numerical code list based on the weighting coefficient to obtain a fourth numerical code list, and converting the fourth numerical code list into a third identification result based on a first conversion logic that lists the first identification result in a numerical code form to obtain the first numerical code list corresponding to the first identification result or based on a second conversion logic that lists the second identification result in a numerical code form to obtain the second numerical code list corresponding to the second identification result.
It is understood that the third recognition result can be accurately obtained based on the contents described in the above steps S241 to S243.
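Steps S241-S243 can be sketched as below. The numerical coding used here (character ordinals) and the truncating pairing of list units are invented for illustration only; the embodiment leaves the concrete conversion logic open, and the sketch decodes the fourth list back with the first conversion logic:

```python
def encode(result):
    """First/second conversion logic: list a result in numerical code form."""
    return [ord(c) for c in result]

def decode(codes):
    """Inverse of the first conversion logic."""
    return "".join(chr(round(c)) for c in codes)

def weighted_fuse(first_result, second_result, w1, w2):
    # S241: first and second numerical code lists
    first_list, second_list = encode(first_result), encode(second_result)
    # S242: pair each first list unit with a second list unit
    unit_groups = list(zip(first_list, second_list))
    # S243: third list from the unit groups, weighted into the fourth list
    fourth_list = [w1 * a + w2 * b for a, b in unit_groups]
    # convert the fourth numerical code list into the third recognition result
    return decode(fourth_list)
```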
Referring to fig. 3, a block diagram of an information processing apparatus 200 according to an embodiment of the invention is shown. The information processing apparatus 200 in the embodiment of the present invention has data storage, transmission, and processing functions, and as shown in fig. 3, the information processing apparatus 200 includes: memory 211, processor 212, network module 213 and smart city monitoring device 201.
The memory 211, the processor 212 and the network module 213 are electrically connected directly or indirectly to enable data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 211 stores the smart city monitoring device 201, which includes at least one software function module that can be stored in the memory 211 in the form of software or firmware, and the processor 212 executes various functional applications and data processing by running the software programs and modules stored in the memory 211, such as the smart city monitoring device 201 in the embodiment of the present invention, so as to implement the smart city monitoring method in the embodiment of the present invention.
The memory 211 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 211 is used for storing a program, and the processor 212 executes the program after receiving an execution instruction.
The processor 212 may be an integrated circuit chip having data processing capability. The processor 212 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like, and may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The network module 213 is configured to establish a communication connection between the information processing apparatus 200 and another communication terminal apparatus through a network, so as to implement transceiving operation of network signals and data. The network signal may include a wireless signal or a wired signal.
It is to be understood that the configuration shown in fig. 3 is merely illustrative, and the information processing apparatus 200 may further include more or fewer components than those shown in fig. 3, or have a configuration different from that shown in fig. 3. The components shown in fig. 3 may be implemented in hardware, software, or a combination thereof.
An embodiment of the present invention further provides a computer-readable storage medium including a computer program. When the computer program runs, it controls the information processing device 200 in which the readable storage medium is located to execute the steps corresponding to the information processing device 200 in the smart city monitoring method shown in fig. 2.
In summary, according to the smart city monitoring method and system provided by the embodiment of the invention, firstly, the acquisition device uploads the acquired monitoring information of the target area to the information processing device through the preset information transmission channel, so that the transmission timeliness and accuracy of the monitoring information can be improved.
Secondly, the information processing equipment performs feature extraction on the image information and the voice information in the monitoring information and performs feature recognition to obtain a first recognition result and a second recognition result.
And finally, a weighting coefficient is determined based on the first confidence coefficient and the second confidence coefficient obtained according to the first recognition result and the second recognition result, a weighted summation of the first recognition result and the second recognition result is performed to obtain a third recognition result, and when the early warning level of the third recognition result exceeds the set level, the target area is judged to be abnormal.
Therefore, the monitoring information can be deeply mined from the perspective of the association between the image information and the voice information, and a more comprehensive and reliable monitoring analysis result can be obtained, thereby ensuring the safety of the building.
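The overall decision flow summarized above can be sketched as follows. This is a simplified illustration under assumed names: the way the weighting coefficient is derived from the two confidence coefficients, the score scale, and the set level are all hypothetical choices, not the patented formula.

```python
# Hypothetical sketch of the fusion and early-warning decision:
# confidence-derived weighting, weighted summation, threshold check.

def weighting_coefficient(first_confidence, second_confidence):
    """Derive the weighting coefficient as the normalized share of the
    image branch's confidence (an assumption for illustration)."""
    total = first_confidence + second_confidence
    return first_confidence / total if total > 0 else 0.5

def early_warning_level(first_score, second_score, first_conf, second_conf):
    """Weighted summation of the two recognition scores to obtain the
    early warning level of the third recognition result."""
    w = weighting_coefficient(first_conf, second_conf)
    return w * first_score + (1 - w) * second_score

def is_abnormal(level, set_level=0.8):
    """Judge the target area abnormal when the level exceeds the set level."""
    return level > set_level

# Example: the image branch is more confident, so it dominates the fusion.
level = early_warning_level(0.9, 0.7, first_conf=0.8, second_conf=0.2)
```

With these toy numbers the fused level lands above the set level, so the target area would be flagged as abnormal; a lower fused level would pass without a warning.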
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, or the portions thereof that substantially contribute over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, an information processing device 200, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program codes. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.