CN110569448A - Content labeling method and system - Google Patents


Info

Publication number
CN110569448A
CN110569448A (application CN201810468776.4A)
Authority
CN
China
Prior art keywords
technology
cloud server
lbs
voice
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810468776.4A
Other languages
Chinese (zh)
Inventor
张运军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Double Monkey Technology Co Ltd
Original Assignee
Shenzhen Double Monkey Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Double Monkey Technology Co Ltd filed Critical Shenzhen Double Monkey Technology Co Ltd
Priority to CN201810468776.4A priority Critical patent/CN110569448A/en
Publication of CN110569448A publication Critical patent/CN110569448A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/003Navigation within 3D models or images

Abstract

A content labeling system comprising: a terminal device and a cloud server, connected through a network. The terminal device includes an array microphone, a VR physical camera, and a 3D interactive interface. The cloud server includes speech recognition, NLP technology, the neural network technologies CNN and LSTM for training content models, and LBS accurate-positioning search technology. A user needs only professional equipment to upload images, text, and voice, which are processed and permanently stored in the cloud; the next time someone stands at the same position, VR technology retrieves from the cloud and displays the information everyone left there before. This discourages uncivilized tourism (such as carving on scenic buildings): by fusing the virtual content with the real environment captured by the camera, the system overlays the real and the virtual, so that messages, voice notes, and photos remain at the same position.

Description

Content labeling method and system
Technical Field
Embodiments of the invention relate to the field of information technology, and in particular to a content labeling method and system.
Background
In recent years, LBS, NLP, search-engine, and artificial-intelligence technologies have been widely applied across many areas of life. A user uploads content (images, text, and voice) to the cloud using professional equipment; the cloud stores the content indexed by LBS, trains content models with the neural network technologies CNN and LSTM to form a big-data warehouse, saves the processed content in a data cluster, and provides various interfaces for third-party access.
The method mainly addresses the following problem: when people visit a tourist attraction, they usually learn about it only through a guide's explanation or the attraction's written introductions. Such one-way information output does not always give visitors a full understanding of the attraction. In addition, some tourists like to carve symbols or inscriptions such as "X was here" into the buildings of a scenic spot, which causes great damage to those buildings.
Disclosure of the Invention
The technical problem mainly solved by embodiments of the invention is to provide a content labeling method and system. Based on LBS positioning, image recognition, NLP semantic analysis, search-engine technology, and VR virtual/real-scene fitting, images, text, and voice are uploaded to a cloud server. Neural network models are trained to process the images and voice, and semantic analysis builds a large database organized by position, semantics, and association chains. Finally, search-engine and deep neural network technologies match the content to be labeled against the trained models, and the content's semantics are analyzed to retrieve associated content and match it with the labeled content.
In order to solve the technical problem, the invention adopts the following technical scheme: a content labeling system is provided, comprising a terminal device and a cloud server, connected through a network.
The terminal device includes: an array microphone, a VR physical camera, and a 3D interactive interface.
The cloud server includes: speech recognition, NLP technology, the neural network technologies CNN and LSTM for training content models, and LBS accurate-positioning search technology.
An annotation method, comprising: first, using the professional terminal, the user records voice through the array microphone, writes text, or captures images, together with LBS information (three-dimensional coordinates X, Y, Z); the cloud server converts the voice to text with speech recognition, analyzes the text with NLP technology, then trains the content model with LSTM and stores the data in the database cluster.
Second, the user takes a photo with the professional terminal's VR physical camera, attaches the LBS information, and reports it to the cloud server; the cloud server uses the neural network technology CNN together with LBS accurate-positioning search to query all information left at the same orientation (three-dimensional coordinates X, Y, Z) and in the same scene.
Finally, the data returned by the cloud is displayed on the professional terminal through the 3D interactive interface, so that the user can see how many text messages, voice messages, and images have been left at the location.
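The steps above amount to a geo-indexed store of user annotations. A minimal sketch of that flow follows; the `Annotation` schema, the in-memory store standing in for the database cluster, and the 5-meter query radius are illustrative assumptions, not details from the patent:

```python
import math
from dataclasses import dataclass

@dataclass
class Annotation:
    """One piece of user content tied to an LBS coordinate (hypothetical schema)."""
    x: float
    y: float
    z: float
    kind: str      # "text", "voice", or "image"
    payload: str   # recognized text, or a storage key for an audio/image blob

class AnnotationStore:
    """In-memory stand-in for the cloud database cluster."""
    def __init__(self):
        self._items: list[Annotation] = []

    def upload(self, ann: Annotation) -> None:
        self._items.append(ann)

    def query_nearby(self, x: float, y: float, z: float, radius: float = 5.0):
        """Return all annotations left within `radius` of the given position."""
        return [a for a in self._items
                if math.dist((a.x, a.y, a.z), (x, y, z)) <= radius]

store = AnnotationStore()
store.upload(Annotation(10.0, 20.0, 1.5, "text", "Beautiful view from here"))
store.upload(Annotation(10.2, 20.1, 1.5, "voice", "audio/clip-001"))
store.upload(Annotation(500.0, 40.0, 0.0, "text", "Far away"))

# Someone returning to (10, 20, 1.5) retrieves what was left there before.
nearby = store.query_nearby(10.0, 20.0, 1.5)
```

In the patent's pipeline the query would also be constrained by scene matching (CNN image recognition), not position alone.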
The user needs only professional equipment to upload images, text, and voice, which are processed and permanently stored in the cloud. The next time someone uses professional equipment at the same position, VR technology retrieves from the cloud and displays the information everyone left there before, discouraging uncivilized tourism and letting the "I was here" message be preserved permanently. Meanwhile, through image recognition, speech recognition, text semantic analysis, and positioning technology, the visitor's content is deeply analyzed and matched to closely related content; by fusing the virtual content with the real environment captured by the camera, the real and the virtual are overlaid so that messages, voice notes, and photos remain at the same position.
Drawings
Fig. 1 is a block diagram of a content tagging system according to an embodiment of the present invention.
Detailed Description
In order to facilitate an understanding of the invention, the invention is described in more detail below with reference to the accompanying drawings and detailed description. It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for descriptive purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
As shown in Fig. 1, a content labeling system includes a terminal device and a cloud server, connected through a network.
The terminal device includes: an array microphone, a VR physical camera, and a 3D interactive interface.
The cloud server includes: speech recognition, NLP technology, the neural network technologies CNN and LSTM for training content models, and LBS accurate-positioning search technology.
A user holds the terminal device and, using the VR camera, uploads the current LBS three-dimensional coordinate (X, Y, Z) together with an image to the cloud server. The server performs image recognition against the coordinate and the image to confirm the position and orientation, uses a neural network algorithm to match content in the database cluster, and returns a result set of images, text, and voice to the terminal. The terminal then uses VR technology to overlay the results onto the real scene captured by the camera, at the same place and orientation, and plays the associated voice. In this way all the sounds, images, and text left by visitors are permanently retained and shared with anyone who wants to see them.
The key to NLP is enabling computers to "understand" natural language, so natural language processing is also called natural language understanding. NLP analysis techniques are roughly divided into three levels: lexical analysis, syntactic analysis, and semantic analysis.
1) Lexical analysis
Lexical analysis includes word segmentation, part-of-speech tagging, named entity recognition, and word sense disambiguation.
Word segmentation and part-of-speech tagging are straightforward to understand.
Named entity recognition identifies named entities such as person names, place names, and organization names in sentences; each named entity consists of one or more words.
Word sense disambiguation determines the true meaning of ambiguous words from the context of the sentence.
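As a toy illustration of the lexical-analysis level, the sketch below does forward-maximum-matching word segmentation and dictionary-based named entity lookup. The vocabulary, the entity gazetteer, and the approach itself are simplified assumptions; production systems use trained statistical or neural models:

```python
def fmm_segment(sentence, vocab, max_len=4):
    """Forward maximum matching: greedily take the longest dictionary word;
    fall back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(sentence):
        for j in range(min(max_len, len(sentence) - i), 0, -1):
            if sentence[i:i + j] in vocab or j == 1:
                words.append(sentence[i:i + j])
                i += j
                break
    return words

VOCAB = {"张运军", "深圳", "旅游", "景点", "在", "留言"}
ENTITY_TYPES = {"张运军": "PERSON", "深圳": "LOCATION"}  # toy gazetteer

# "Zhang Yunjun is touring in Shenzhen"
tokens = fmm_segment("张运军在深圳旅游", VOCAB)
entities = [(t, ENTITY_TYPES[t]) for t in tokens if t in ENTITY_TYPES]
```

Part-of-speech tagging and word sense disambiguation would be further passes over `tokens`, each conditioning on sentence context.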
2) Syntactic analysis
Syntactic analysis transforms an input sentence from a linear sequence into a tree structure, so that collocation and modification relations among the words can be captured; this is a key step in NLP.
There are currently two mainstream approaches to syntactic analysis: phrase-structure grammar and dependency grammar. Dependency grammar has become a hotspot in syntactic-analysis research.
Dependency grammar has a simple representation that is easy to understand and annotate, and it readily expresses semantic relations between words; for example, sentence components can stand in agent, temporal, and similar relations. These relations are conveniently applied to semantic analysis, information extraction, and related tasks, and dependencies also allow more efficient decoding algorithms.
The syntactic structure obtained by syntactic analysis supports higher-level semantic analysis and applications such as machine translation, question answering, text mining, and information retrieval.
3) Semantic analysis
The ultimate goal of semantic analysis is to understand the true semantics a sentence expresses. How best to represent semantics remains an open question. Semantic role labeling is a relatively mature shallow semantic-analysis technique: given a predicate in a sentence, the task is to label the predicate's arguments, such as agent, patient, time, and location. Semantic role labeling is generally performed on top of syntactic analysis, so the syntactic structure is crucial to its performance.
The neural network technology CNN (convolutional neural network) is a feedforward network whose artificial neurons respond to cells within a limited receptive field, giving excellent performance on large-scale image processing. It includes convolutional layers and pooling layers.
The basic structure of a CNN includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, from which local features are extracted; once a local feature is extracted, its positional relation to other features is also determined. The other is the feature mapping layer: each computation layer of the network consists of multiple feature maps, each map is a plane, and all neurons in a plane share equal weights. The feature mapping structure uses a sigmoid activation with a small influence-function kernel, giving the feature maps shift invariance. Because neurons in one map share weights, the number of free parameters of the network is reduced. Each convolutional layer is followed by a computation layer for local averaging and secondary feature extraction, which reduces the feature resolution.
CNNs are used primarily to recognize two-dimensional patterns that are invariant to displacement, scaling, and other forms of distortion. Since the feature detection layers learn implicitly from training data, explicit feature extraction is avoided. Moreover, because neurons on the same feature map share weights, the network can learn in parallel, a major advantage of convolutional networks over fully interconnected ones. With their special structure of shared local weights, CNNs are uniquely suited to speech recognition and image processing; their layout is closer to that of real biological neural networks, weight sharing reduces network complexity, and the ability to feed multi-dimensional input images directly into the network avoids the complexity of data reconstruction during feature extraction and classification.
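The convolution and pooling operations described above can be sketched in a few lines. This toy example (pure Python, no framework, illustrative only) convolves a small image containing a vertical edge with a hand-written edge-detector kernel, then max-pools the resulting feature map:

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    h = len(image) - kh + 1
    w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(w)] for i in range(h)]

def max_pool(feat, size=2):
    """Non-overlapping max pooling: keep the strongest response per window,
    reducing the feature resolution as the text describes."""
    return [[max(feat[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(feat[0]) - size + 1, size)]
            for i in range(0, len(feat) - size + 1, size)]

# A 4x4 "image" with a vertical edge between columns 1 and 2,
# and a 2x2 vertical-edge-detector kernel.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
edge_kernel = [[-1, 1],
               [-1, 1]]

fmap = conv2d(img, edge_kernel)  # 3x3 feature map, strong only at the edge
pooled = max_pool(fmap)          # downsampled response
```

The shared `edge_kernel` is applied at every position, which is exactly the weight sharing that gives the feature map its shift invariance.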
The LSTM (long short-term memory) network used to train the content model is a recurrent neural network suited to processing and predicting events separated by relatively long intervals and delays in a time series.
LSTM has found many applications. LSTM-based systems can learn tasks such as language translation, robot control, image analysis, document summarization, speech recognition, image recognition, handwriting recognition, chatbot control, prediction of diseases, click-through rates, and stock prices, music synthesis, and so on.
The LBS accurate-positioning search technology is a location-based service: a value-added service that obtains a mobile user's position information (geographic or geodetic coordinates) through a telecom operator's radio communication network (such as a GSM or CDMA network) or an external positioning method (such as GPS), and, supported by a Geographic Information System (GIS) platform, provides corresponding services to the user.
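One plausible building block of such an LBS search is filtering stored content by great-circle distance from the user's reported position. Below is a standard haversine implementation; the 50-meter radius and the sample coordinates are application-level assumptions, not values from the patent:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def within_radius(user, points, radius_m=50.0):
    """Keep only the points within `radius_m` of the user's position."""
    return [p for p in points
            if haversine_m(user[0], user[1], p[0], p[1]) <= radius_m]

pts = [(22.53430, 113.97360),   # same spot as the user
       (22.53431, 113.97361),   # a couple of meters away
       (23.0, 114.5)]           # tens of kilometers away
near = within_radius((22.53430, 113.97360), pts)
```

A real deployment would combine this coarse geographic filter with the orientation and scene matching described elsewhere in the document.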
A labeling method is implemented on the content labeling system described above and specifically comprises the following steps. First, using the professional terminal, the user records voice through the array microphone, writes text, or captures images, together with LBS information (three-dimensional coordinates X, Y, Z); the cloud server converts the voice to text with speech recognition, analyzes the text with NLP technology, then trains the content model with LSTM and stores the data in the database cluster.
Second, the user takes a photo with the professional terminal's VR physical camera, attaches the LBS information, and reports it to the cloud server; the cloud server uses the neural network technology CNN together with LBS accurate-positioning search to query all information left at the same orientation (three-dimensional coordinates X, Y, Z) and in the same scene.
Finally, the data returned by the cloud is presented on the professional terminal through the 3D interactive interface; the user can see how many text messages, voice messages, and images have been left, so that the "I was here" message is truly realized.
The user needs only professional equipment to upload images, text, and voice, which are processed and permanently stored in the cloud. The next time someone uses professional equipment at the same position, VR technology retrieves from the cloud and displays the information everyone left there before, discouraging uncivilized tourism and preserving the "I was here" message permanently. Meanwhile, through image recognition, speech recognition, text semantic analysis, and positioning technology, the visitor's content is deeply analyzed and matched to closely related content; by fusing the virtual content with the real environment captured by the camera, the real and the virtual are overlaid so that messages, voice notes, and photos remain at the same position.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method as described above.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general hardware platform, and certainly can also be implemented by hardware. It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a computer readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
It should be noted that the description and the accompanying drawings illustrate preferred embodiments of the invention, but the invention may be embodied in many different forms and is not limited to the embodiments described in this specification; those embodiments are not additional limitations on the invention but are provided so that the disclosure may be understood more thoroughly. Furthermore, the technical features above may be combined with one another to form various embodiments not listed here, all of which are regarded as within the scope described in this specification. Further, modifications and variations will occur to those skilled in the art in light of the foregoing description, and all such modifications and variations are intended to fall within the scope of the appended claims.

Claims (3)

1. A content tagging system, comprising: the system comprises terminal equipment and a cloud server; the terminal equipment is connected with the cloud server through a network;
The terminal device includes: an array microphone, a VR physical camera, and a 3D interactive interface;
The cloud server includes: speech recognition, NLP technology, the neural network technologies CNN and LSTM for training content models, and LBS accurate-positioning search technology.
2. A labeling method, comprising: first, a professional terminal records voice through an array microphone, writes text, or takes images, attaching LBS information; a cloud server converts the voice to text with speech recognition, analyzes the text with NLP technology, then trains a content model with LSTM and stores the data in a database cluster;
Second, a photo is taken with the professional terminal's VR physical camera and the LBS information is attached and reported to the cloud server; the cloud queries all information left at the same orientation and in the same scene through the neural network technology CNN and LBS accurate-positioning search;
Finally, the data returned by the cloud is displayed on the professional terminal through the 3D interactive interface.
3. The method of claim 2, wherein the content presented by the 3D interactive interface is a text message, a voice message, or an image.
CN201810468776.4A 2018-05-16 2018-05-16 Content labeling method and system Withdrawn CN110569448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810468776.4A CN110569448A (en) 2018-05-16 2018-05-16 Content labeling method and system


Publications (1)

Publication Number Publication Date
CN110569448A true CN110569448A (en) 2019-12-13

Family

ID=68771807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810468776.4A Withdrawn CN110569448A (en) 2018-05-16 2018-05-16 Content labeling method and system

Country Status (1)

Country Link
CN (1) CN110569448A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20191213