KR20110124568A - Robot system having voice and image recognition function, and recognition method thereof - Google Patents
Robot system having voice and image recognition function, and recognition method thereof
- Publication number
- KR20110124568A (application KR1020100044027A)
- Authority
- KR
- South Korea
- Prior art keywords
- image
- recognition
- voice
- robot
- phoneme
- Prior art date
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J13/00—Controls for manipulators
- B25J13/08—Controls for manipulators by means of sensing devices, e.g. viewing or touching devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
- B25J19/02—Sensing devices
- B25J19/026—Acoustical sensing devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
- B25J19/02—Sensing devices
- B25J19/04—Viewing devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Physics & Mathematics (AREA)
- Image Analysis (AREA)
- Human Computer Interaction (AREA)
- Manipulator (AREA)
- Acoustics & Sound (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
Abstract
The present invention relates to a robot system, and more particularly to a robot system having a voice and image recognition function and a recognition method thereof.
The robot system of the present invention comprises: an input/output device that acquires surrounding sounds, extracts optimized phonemes, matches the extracted phonemes against a plurality of pre-stored words to recognize the user's voice, and captures the user's image, extracting a predetermined pattern from the captured image and comparing it with pre-stored image data to recognize the image; a drive control device that controls driving of the robot's driver according to the recognition result input from the input/output device; a management operation device that sets the input/output device as the basic input/output device, supports the drive control device, and efficiently provides and manages the system environment and required information; and a hardware control device that controls the movement of the driver according to the control method of the drive control device.
Therefore, the present invention can reduce the unit cost of the robot by embedding a robot-specific management operation device capable of voice and image recognition in the system, and, by using a fast Fourier transform function to separate the wavelength of the acquired sound into frequency bands, makes noise removal easy and accurate phoneme analysis possible.
Description
The present invention relates to a robot system, and more particularly, to a robot system having a voice and image recognition function and a recognition method thereof.
Recently, with the development of network and communication technology, various robots using them have been developed and are gradually coming into use. Most of these have been industrial robots, such as manipulators and transport robots, aimed at automating and unmanning production operations in factories.
More recently, development has progressed on practical robots that support life as human partners, that is, that support human activities in various scenes of daily life in and beyond the residential environment. Unlike industrial robots, such practical robots have the ability to learn by themselves how to adapt to different humans and environments within the human living environment. In particular, an autonomous mobile robot whose appearance is close to a human's can perform motions close to human motions and carry out various motions oriented more toward entertainment.
Some mobile robots are equipped with a small camera corresponding to an eye, a sound-collecting microphone corresponding to an ear, and the like. Such a mobile robot can recognize its surrounding environment by performing image processing on the acquired image, or recognize language from the ambient sound it picks up.
However, although a conventional robot can recognize a user's voice well in a quiet space, voice recognition fails in an open space because of the many sources of noise there. Accordingly, current systems often fall back on a remote controller or a touch sensor instead of voice recognition.
In addition, such mobile robots have not yet made a significant contribution to society. The biggest reason is that no robot-specific operating system has been developed, so a robot usually runs a general PC operating system.
However, the input devices of a general PC operating system are the mouse and keyboard, which differ from the image and voice recognition inputs a robot uses. Developing an operating system dedicated to the robot therefore increases cost, and the price of the robot ends up high.
An object of the present invention is to provide a robot system capable of voice and image recognition, and a recognition method thereof, in which a voice is acquired, optimized phonemes are extracted from the voice and matched against the phonemes of pre-stored words, and noise is removed during the recognition process so that accurate voice recognition is achieved.
Another object of the present invention is to provide such a robot system and recognition method that reduce the unit cost of the robot by embedding a robot operating system capable of voice and image recognition in the system.
The robot system having a voice and image recognition function according to the present invention includes: an input/output device that acquires surrounding sounds, extracts optimized phonemes, matches the extracted phonemes against a plurality of pre-stored words to recognize the user's voice, and captures the user's image, recognizing the image by extracting a predetermined pattern from the captured image; a drive control device that controls driving of the driver of the robot according to the recognition result input from the input/output device; a management operation device that sets the input/output device as the basic input/output device, supports the drive control device, and efficiently provides and manages the system environment and required information; and a hardware control device that controls the movement of the driver according to the control method of the drive control device.
In this case, the input/output device may include: a voice recognition storage unit in which the correspondence between words and their phonemes is stored as a dictionary for voice recognition; an image recognition storage unit in which a dictionary for recognizing a user's image is stored; a sound acquisition unit for acquiring surrounding sounds; an imaging unit for capturing an image of the user; a voice recognition unit that extracts optimized phonemes from the sound acquired by the sound acquisition unit and recognizes the user's voice by matching the phonemes against the phonemes of the plurality of words stored in the voice recognition storage unit; and an image recognition unit that extracts a predetermined pattern from the image captured by the imaging unit and recognizes the image by comparing it with the image data stored in the image recognition storage unit.
Here, the voice recognition unit separates the wavelength of the sound acquired by the sound acquisition unit into frequencies using a fast Fourier transform function, extracts optimized phonemes from the separated frequency domain, and recognizes the user's voice by matching the extracted phonemes against the phonemes of pre-stored words using a consonant-based maximum matching method.
The image recognition unit may recognize a face using any one of a knowledge-based determination method, a template combination determination method, a shape vector determination method, and a maximum flow matching method using a sector template.
According to an embodiment of the present invention, a voice recognition method of a robot system having a voice and image recognition function includes: (a) acquiring surrounding sounds and separating the wavelength of the acquired sound into band-specific frequencies; (b) converting the band-specific frequencies to a Z plane to detect an effective region; and (c) recognizing the user's voice by matching a phoneme extracted from the effective region against a phoneme of a pre-stored word.
In this case, in step (a), the sound wavelength is separated into band-specific frequencies using a fast Fourier transform function, and in step (c) the user's voice is recognized by matching the extracted phoneme against the phoneme of a pre-stored word using a consonant-based maximum flow matching method.
The image recognition method of the robot system having a voice and image recognition function according to the present invention includes the steps of: (a) capturing a subject and determining whether face recognition or shape recognition is to be performed on the captured image; (b) if face recognition is determined, recognizing the face by comparing previously stored data with the captured image; and (c) if shape recognition is determined, converting the captured image into digital data and recognizing the shape by representing the converted data as figures.
At this time, in step (b), the face is recognized by one of a knowledge-based determination method, a template combination determination method, a shape vector determination method, and a maximum flow matching method using a sector template.
The present invention can reduce the unit cost of the robot by embedding a robot-dedicated management operation device capable of voice and image recognition in the system.
In addition, according to the present invention, by using a fast Fourier transform function to separate the wavelength of the acquired sound into frequency bands in the voice recognition method, noise removal is easy and accurate phoneme analysis is possible.
In addition, by matching the phonemes extracted in the voice recognition method against the phonemes of pre-stored words using a consonant-based maximum flow matching method, the present invention can recognize speech faster than the conventional Hidden Markov Model method.
In addition, because the image recognition method recognizes faces using a partial image template method, the present invention can recognize images quickly in various forms, and face recognition at a medium distance is also possible.
FIG. 1 is a diagram showing the configuration of a robot control system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a voice recognition method of the robot control system according to an embodiment of the present invention.
FIG. 3 is a diagram showing the wavelength of sound as frequency blocks.
FIG. 4 is a diagram illustrating an effective area detected by separating the wavelength of sound into frequency bands and then switching to a Z plane.
FIG. 5 is a diagram illustrating the phoneme distribution of the 'ga' sound in the effective area.
FIG. 6 is a diagram illustrating the phoneme distributions of the 'ka' and 'ga' sounds in the effective area.
FIG. 7 is a diagram illustrating phoneme matching by the consonant-based maximum flow matching method.
FIG. 8 is a diagram showing an image recognition method of the robot system according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating cells of an image quantized by the knowledge-based face recognition method.
FIG. 10 is a diagram illustrating a method of recognizing a face using a color histogram as a knowledge-based face recognition method.
FIG. 11 is a diagram illustrating a template image list of a face.
FIG. 12 is a diagram illustrating a face recognition method using a shape vector.
FIG. 13 is a diagram illustrating the maximum flow matching method using a sector template.
FIG. 14 is a diagram illustrating sector division for an occluded face and a rotated face.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, the embodiments of the present invention may be modified in various forms, and the scope of the present invention should not be construed as limited to the embodiments described below. The embodiments are provided to explain the present invention more fully to those skilled in the art.
FIG. 1 is a diagram showing the configuration of a robot control system according to an embodiment of the present invention.
Referring to FIG. 1, the robot control system according to an exemplary embodiment of the present invention includes an input/output device 110, a drive control device 120, a management operation device 130, and a hardware control device 140.
The input/output device 110 acquires surrounding sounds, extracts optimized phonemes, matches the extracted phonemes against a plurality of pre-stored words to recognize the user's voice, and captures the user's image, recognizing the image by extracting a predetermined pattern from the captured image and comparing it with pre-stored image data. To this end it includes a sound acquisition unit 111, a voice recognition storage unit 112, an imaging unit 113, an image recognition storage unit 114, a voice recognition unit 115, and an image recognition unit 116.
The drive control device 120 controls driving of the driver of the robot according to the recognition result input from the input/output device 110.
The management operation device 130 sets the input/output device 110 as the basic input/output device, supports the drive control device 120, and efficiently provides and manages the system environment and required information.
The hardware control device 140 controls the movement of the driver according to the control method of the drive control device 120.
Next, the voice recognition method of the robot control system according to the exemplary embodiment of the present invention configured as described above will be described.
FIG. 2 is a diagram showing a voice recognition method of the robot control system according to an embodiment of the present invention.
Referring to FIG. 2, in the voice recognition method of the robot control system according to an exemplary embodiment of the present disclosure, when surrounding sounds are acquired by the sound acquisition unit 111 (S210), the wavelength of the acquired sound is separated into band-specific frequencies (S220). Here, the acquired sound wavelength is separated into frequencies for each band using a fast Fourier transform function. The regions separated by frequency band are then converted to the Z plane in order to extract optimized phonemes (S230). Thereafter, the effective region is detected and optimized phonemes are extracted from it (S240). The extracted phonemes are matched against the phonemes of pre-stored words using a consonant-based maximum matching method (S250), and the user's voice is thereby recognized (S260).
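The S210 to S260 flow above can be sketched in code. The following is a hedged illustration, not the patent's implementation: `band_energies`, `extract_consonant`, `recognize`, and the band-to-word lexicon are all invented stand-ins for the FFT, Z-plane, and matching steps.

```python
from typing import Dict, List, Optional

def band_energies(samples: List[float], n_bands: int = 4) -> List[float]:
    # Stand-in for S220: the patent separates the wavelength into
    # band-specific frequencies with an FFT; here we simply sum absolute
    # amplitudes over equal-width chunks of the signal.
    width = max(1, len(samples) // n_bands)
    return [sum(abs(s) for s in samples[i * width:(i + 1) * width])
            for i in range(n_bands)]

def extract_consonant(energies: List[float]) -> int:
    # Stand-in for S230-S240: pick the dominant band as a proxy for the
    # phoneme extracted from the valid region of the Z plane.
    return max(range(len(energies)), key=lambda i: energies[i])

def recognize(samples: List[float], lexicon: Dict[int, str]) -> Optional[str]:
    # Stand-in for S250-S260: map the extracted feature to a stored word.
    return lexicon.get(extract_consonant(band_energies(samples)))
```

Under these stand-in definitions, a signal whose energy sits in the second quarter would be mapped to whatever word the illustrative lexicon stores under index 1.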
The voice recognition method of the robot control system according to an embodiment of the present invention is described in more detail below with reference to FIGS. 3 to 7.
FIG. 3 is a diagram showing the wavelength of sound as frequency blocks; FIG. 4 is a diagram showing the effective area detected by separating the wavelength of sound into frequency bands and then switching to the Z plane; FIG. 5 is a diagram illustrating the phoneme distribution of the 'ga' sound in the effective area; FIG. 6 is a diagram illustrating the phoneme distributions of the 'ka' and 'ga' sounds in the effective area; and FIG. 7 is a diagram illustrating phoneme matching using the consonant-based maximum flow matching method.
Referring to FIG. 3, the wavelength of the acquired sound is separated into frequencies for each band using a frequency transform function. The frequency transform function used here is the fast Fourier transform (FFT) function described above, which computes the discrete Fourier transform X(k) = Σ_{n=0}^{N-1} x(n)·e^(-j2πkn/N).
Using the fast Fourier transform function, the wavelength of sound is divided into a high-frequency region and a low-frequency region, and the values present in each region can be determined. In addition, when the fast Fourier transform is computed over complex numbers, the values of the real part and the imaginary part are separated; accurate phoneme analysis is possible only by analyzing the combination of these two values. Moreover, the noise in the acquired sound can be removed using the fast Fourier transform function: one block is selected, and the average frequency of each band of the selected block is compared with that of the previous block; if an average value corresponding to noise exists, it is determined to be noise and the corresponding frequency is removed.
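The block-comparison noise removal described above can be sketched as follows. This is an assumed reading of the scheme, not the patent's code: a naive DFT stands in for a real FFT, and the band count, tolerance `tol`, and function names are illustrative choices.

```python
import cmath

def dft(block):
    """Naive discrete Fourier transform (stand-in for the patent's FFT);
    returns only the positive-frequency coefficients."""
    n = len(block)
    return [sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n // 2)]

def band_means(spectrum, n_bands=4):
    """Mean magnitude of each frequency band."""
    width = max(1, len(spectrum) // n_bands)
    return [sum(abs(c) for c in spectrum[i * width:(i + 1) * width]) / width
            for i in range(n_bands)]

def denoise(prev_block, cur_block, n_bands=4, tol=0.1):
    """Zero out bands whose mean magnitude is unchanged from the previous
    block -- treated as stationary noise, per the comparison above."""
    prev_m = band_means(dft(prev_block), n_bands)
    cur = dft(cur_block)
    cur_m = band_means(cur, n_bands)
    width = max(1, len(cur) // n_bands)
    for i, (p, c) in enumerate(zip(prev_m, cur_m)):
        if abs(p - c) <= tol * max(p, c, 1e-9):  # same level => noise
            for k in range(i * width, (i + 1) * width):
                cur[k] = 0j
    return cur
```

With a stationary hum present in both blocks and a voice component only in the current block, the hum's band is removed while the voice band survives.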
In this way, the real part and the imaginary part obtained by separating the frequency bands using the fast Fourier transform function are switched to the Z plane (the Z transform of a sequence x(n) is X(z) = Σ x(n)·z^(-n)). When switched to the Z plane, as shown in FIG. 4, the effective area appears in unit-circle form.
In the voice frequency, a region of specific frequencies carries the valid values for vowels and consonants. The effective region detected using the Z-plane conversion is represented as an elliptical region in the Z plane. Taking the 'ga' sound as an example of the phoneme distribution in the effective area, as shown in FIG. 5, the vowel 'ㅏ' forms a long distribution region.
Referring to FIG. 6, the vowel 'ㅏ' of the 'ka' and 'ga' sounds is distributed into the same region, while the consonants 'ㅋ' and 'ㄱ' are distributed into different regions; it is thus the consonant that distinguishes the two sounds.
To analyze the desired phonemes, the commonly used Hidden Markov Model could be applied. The Hidden Markov Model recognizes a phoneme in a chained way: it first checks the 'B' sound, then uses additional information to check 'ㅋ', and then checks the vowels. However, such a chained method is very complicated to connect, its execution complexity is very large, and phoneme recognition takes a long time. Accordingly, the robot control system of the present invention analyzes phonemes using a consonant-based maximum flow matching method.
Referring to FIG. 7, the consonant-based maximum flow matching method extracts consonants by analyzing the consonant frequencies in one block of input data after switching to the Z plane. The extracted consonants are then matched against the consonants of the pre-stored words, and the word with the maximum number of matching consonants is selected, so that the user's voice is recognized.
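The maximum-matching selection step can be sketched as below. Romanised consonant symbols replace Hangul jamo, and the lexicon contents and function names are invented for illustration.

```python
from typing import Dict, List

def match_score(extracted: List[str], word_consonants: List[str]) -> int:
    # Number of positions where the extracted consonant sequence agrees
    # with a stored word's consonant skeleton.
    return sum(1 for a, b in zip(extracted, word_consonants) if a == b)

def recognize_word(extracted: List[str],
                   lexicon: Dict[str, List[str]]) -> str:
    # Select the stored word with the maximum consonant match -- the
    # selection rule described above, with no HMM-style chaining.
    return max(lexicon, key=lambda w: match_score(extracted, lexicon[w]))
```

For example, an extracted sequence sharing both consonants with one stored word and only one with another selects the former.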
Next, an image recognition method of the robot system according to an exemplary embodiment will be described.
FIG. 8 is a diagram showing an image recognition method of the robot system according to an embodiment of the present invention.
Referring to FIG. 8, in the image recognition method of the robot system according to an exemplary embodiment of the present disclosure, after the user's image is captured by the imaging unit 113 (S810), it is determined whether face recognition or shape recognition is to be performed (S820).
If it is determined that face recognition is to be performed, the stored data is compared with the captured image data (S830). The comparison method is any one of a knowledge-based face recognition method, a template face recognition method, a face recognition method using a shape vector, and a maximum flow matching method using a sector template. The detailed process of recognizing a face with each of these four methods is described later.
When the captured image has been compared with the pre-stored data by any one of the above methods, the robot recognizes the user's image (S840).
On the other hand, if it is determined that shape recognition is to be performed, the captured image data is converted into digital data (S850).
After that, the converted data is represented as figures (S860), the shape of the image is recognized from the expressed figures (S870), and the frequency value and color histogram information of the recognized shape are transferred (S880); the robot thereby recognizes the user's image (S840).
First, face recognition is performed by one of four methods: a knowledge-based determination method, a partial-image template combination determination method, a shape vector determination method, and a maximum flow matching method in which the face region is divided into sectors.
Among these four methods, the knowledge-based determination method is considered first. It recognizes a face using basic knowledge about face shapes; for example, prior knowledge such as "human skin is generally of such-and-such a color" is used to recognize a face.
FIG. 9 is a diagram illustrating a cell of an image quantized by a knowledge-based face recognition method, and FIG. 10 is a diagram illustrating a face recognition method using a color histogram as a knowledge-based face recognition method.
Referring to FIG. 9, the captured image is quantized for face recognition, and the cells of the quantized image are leveled: (a) is the original image, (b) is quantization level 2, (c) is quantization level 3, and (d) is quantization level 4. Although four levels are shown in FIG. 9, the number of levels may be larger or smaller. Quantization level 4, shown in (d), is used to obtain the overall color distribution and the approximate shape coordinates of the image and to identify the face. Then, as the level is reduced, the face is recognized from an increasingly detailed image. At this time, the size of the face must guarantee the validity area; that is, face recognition is possible on the assumption that an image of at least 200×200 pixels within a 320×240 area is a face.
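The coarse-to-fine quantization can be sketched as below. How the patent's "level" maps to a number of grey or colour buckets is not specified, so the bucket counts in `schedule` are assumptions made only for the illustration.

```python
from typing import List

def quantize(pixels: List[int], buckets: int) -> List[int]:
    # Map 0-255 values to `buckets` coarse bins: fewer buckets give the
    # coarse pass used first, more buckets the finer passes that follow.
    step = 256 // buckets
    return [min(p // step, buckets - 1) for p in pixels]

def coarse_to_fine(pixels: List[int],
                   schedule=(4, 8, 16)) -> List[List[int]]:
    # Run the coarse pass first to locate the rough colour distribution,
    # then progressively finer passes, mirroring the scheme above.
    return [quantize(pixels, b) for b in schedule]
```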
The knowledge-based face recognition method can also recognize an image in a more advanced manner, namely by dynamically setting the size of the face using the color frequency change rate of the image.
Referring to FIG. 10, when the color is viewed at quantization level 4, the area of the face can be accurately detected, and the face can then be recognized from the color change rate values of the original image based on this information.
FIG. 11 is a diagram illustrating a template image list of a face.
The partial image template determination method determines the position of a face by making the basic forms of an image into several templates and then matching and reading the corresponding templates. As illustrated in FIG. 11, it divides the face to be recognized into partial regions; the templates are divided into low-frequency region templates, high-frequency region templates, rotational templates, partial image templates, and so on.
When the face of the image is detected, the low-frequency region templates are matched first, because they contain much common information and can therefore be compared very quickly. Valid candidate images found this way are gradually compared against the high-frequency region templates, and then against the templates of the partial images, to recognize a face candidate. Note that this template-based face recognition cannot read a rotated face unless a template of the rotated face is available.
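The coarse-to-fine template matching can be sketched as follows. The flat pixel-list templates, the difference score, and the pruning threshold are assumptions for illustration, not details from the patent.

```python
from typing import List, Optional

def diff_score(region: List[int], template: List[int]) -> int:
    # Sum of absolute pixel differences; lower means a better match.
    return sum(abs(a - b) for a, b in zip(region, template))

def coarse_then_fine(region: List[int],
                     low_templates: List[List[int]],
                     high_templates: List[List[int]],
                     threshold: int) -> Optional[int]:
    # Cheap low-frequency templates prune the candidates first; only
    # survivors are compared against the costlier high-frequency set.
    candidates = [i for i, t in enumerate(low_templates)
                  if diff_score(region, t) <= threshold]
    if not candidates:
        return None  # no template matched, e.g. an unmodelled rotated face
    return min(candidates,
               key=lambda i: diff_score(region, high_templates[i]))
```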
The partial image template determination method has the advantages that image recognition is quick, various shapes can be recognized, and face recognition at a medium distance is also possible.
FIG. 12 is a diagram illustrating a face recognition method using a shape vector.
Referring to FIG. 12, the face recognition method using a shape vector recognizes a face from the directionality of the face shape. Before this method is used, the face must be found and the positions of its eyes, nose, and mouth must be known; it may therefore be applied after detection with the template-based face recognition method described above. The shape vector method can recognize changes in the user's face and can even recognize facial expressions.
FIG. 13 is a diagram illustrating the maximum flow matching method using a sector template, and FIG. 14 is a diagram illustrating sector division for an occluded face and a rotated face.
Referring to FIG. 13, in the maximum flow matching method using a sector template, the face region of the image captured by the imaging unit 113 is divided into sectors, and each sector is compared with the corresponding sector of the template to find the maximum matching flow.
With this method, face recognition can be performed even when the face is occluded, as shown in (a) of FIG. 14, or rotated, as shown in (b); here the circle denotes recognition. Because the image is divided into sectors and maximum flow matching is used, the matching result is the same even when the face is rotated.
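The occlusion tolerance of sector-wise matching can be illustrated as follows: a face is accepted when enough sectors match well, so one occluded or rotated sector cannot veto recognition. The per-sector score, `min_good`, and `good` thresholds are illustrative assumptions, not the patent's maximum-flow computation.

```python
from typing import List

def sector_scores(face_sectors: List[List[int]],
                  template_sectors: List[List[int]]) -> List[float]:
    # Per-sector similarity in [0, 1]; sectors are flat 0-255 pixel lists.
    scores = []
    for f, t in zip(face_sectors, template_sectors):
        diff = sum(abs(a - b) for a, b in zip(f, t))
        scores.append(1.0 - diff / (255.0 * len(f)))
    return scores

def sector_match(face_sectors: List[List[int]],
                 template_sectors: List[List[int]],
                 min_good: int = 3, good: float = 0.9) -> bool:
    # Accept when at least `min_good` sectors score above `good`, so a
    # single occluded sector does not block recognition.
    scores = sector_scores(face_sectors, template_sectors)
    return sum(s >= good for s in scores) >= min_good
```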
Finally, in the shape recognition method, the robot must first convert the image into digital data in order to recognize the shape; shape labeling is used for this conversion. The digital data includes shape data, pattern data, and color data.
The labeled shapes are then expressed as a series of figures (lines, rectangles, circles, ellipses, etc.). When the image is expressed as a series of figures, the robot can recognize the shape very easily. After the robot recognizes the shape, the frequency pattern value and color histogram information of the shape are transmitted to the application level, and the robot recognizes the user.
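The digitize-then-label pipeline can be sketched as below. The binarisation threshold and the crude two-way "rectangle or blob" labeller are assumptions; a labeller matching the description above would also distinguish lines, circles, and ellipses.

```python
from collections import Counter
from typing import List

def to_digital(pixels: List[int], threshold: int = 128) -> List[int]:
    # Binarise the image: the first step of converting to digital data.
    return [1 if p >= threshold else 0 for p in pixels]

def label_shape(bits: List[int], width: int) -> str:
    # Crude shape label: rows of foreground pixels that all share the
    # same horizontal run are reported as a rectangle, else as a blob.
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    filled = [r for r in rows if any(r)]
    runs = {(r.index(1), len(r) - r[::-1].index(1)) for r in filled}
    return "rectangle" if len(runs) == 1 else "blob"

def color_histogram(pixels: List[int]) -> Counter:
    # The histogram handed up to the application level after recognition.
    return Counter(pixels)
```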
The robot system according to an exemplary embodiment of the present invention acquires a voice, extracts optimized phonemes from it, and matches them against the phonemes of pre-stored words, enabling accurate voice recognition.
In addition, the robot system according to an embodiment of the present invention can reduce the unit cost of the robot by embedding the robot-dedicated management operation device in the system.
In addition, by separating the wavelength of the acquired sound into frequency bands using a fast Fourier transform function, noise removal is easy and accurate phoneme analysis is possible.
In addition, by matching the phonemes extracted in the voice recognition method against the phonemes of pre-stored words using a consonant-based maximum flow matching method, the robot system can recognize speech faster than the conventional Hidden Markov Model method.
In addition, because the image recognition method recognizes faces using the partial image template method, images can be recognized quickly in various forms, and face recognition at a medium distance is also possible.
110: input/output device 120: drive control device
130: management operation device 140: hardware control device
111: sound acquisition unit 112: voice recognition storage unit
113: imaging unit 114: image recognition storage unit
115: voice recognition unit 116: image recognition unit
Claims (9)
A robot control system comprising:
an input/output device that acquires surrounding sounds, extracts optimized phonemes, matches the extracted phonemes against a plurality of pre-stored words to recognize the user's voice, and captures the user's image, extracting a predetermined pattern from the captured image and comparing it with pre-stored image data to recognize the image;
a drive control device for controlling driving of the driver of the robot according to the recognition result input from the input/output device;
a management operation device configured to set the input/output device as the basic input/output device, to support the drive control device, and to efficiently provide and manage the system environment and required information; and
a hardware control device for controlling the movement of the driver in accordance with a control method of the drive control device.
A robot system having a voice and image recognition function, comprising:
a voice recognition storage unit in which the correspondence between words and the phonemes of the words is stored as a dictionary for voice recognition;
an image recognition storage unit in which a dictionary for recognizing a user's image is stored;
a sound acquisition unit for acquiring ambient sounds;
an imaging unit for capturing an image of the user;
a voice recognition unit for extracting optimized phonemes from the sound acquired by the sound acquisition unit and recognizing the user's voice by matching the phonemes against the phonemes of the plurality of words stored in the voice recognition storage unit; and
an image recognition unit for recognizing an image by extracting a predetermined pattern from the image captured by the imaging unit and comparing it with the image data stored in the image recognition storage unit.
The robot system having a voice and image recognition function, wherein the wavelength of the sound acquired by the sound acquisition unit is separated into frequencies using a fast Fourier transform function, optimized phonemes are extracted from the separated frequency domain, and the user's voice is recognized by matching the extracted phonemes against the phonemes of the pre-stored words using a consonant-based maximum matching method.
A robot system having a voice and image recognition function, wherein a face is recognized by any one of a knowledge-based determination method, a template combination determination method, a shape vector determination method, and a maximum flow matching method using a sector template.
(a) dividing the wavelength of a sound, obtained by acquiring surrounding sounds, into band-specific frequencies;
(b) converting the band-specific frequencies to the Z plane to detect an effective region; and
(c) recognizing a user's voice by matching a phoneme extracted from the effective region against the phonemes of pre-stored words,
A speech recognition method of a robot system comprising the above steps.
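Step (c), matching extracted phonemes against a pre-stored word dictionary, can be illustrated with a greedy maximum-matching sketch. The lexicon and phoneme alphabet here are hypothetical, and the patent's consonant-based weighting is not reproduced:

```python
def max_match(phonemes, dictionary):
    """Greedy maximum matching: repeatedly take the longest dictionary
    word whose phoneme sequence is a prefix of the remaining input.

    `dictionary` maps word -> tuple of phonemes (hypothetical format).
    """
    words = []
    i = 0
    # Try longer phoneme sequences first so the longest word wins.
    seqs = sorted(dictionary.items(), key=lambda kv: -len(kv[1]))
    while i < len(phonemes):
        for word, seq in seqs:
            if tuple(phonemes[i:i + len(seq)]) == tuple(seq):
                words.append(word)
                i += len(seq)
                break
        else:
            i += 1  # skip a phoneme no dictionary entry matches
    return words

lexicon = {"robot": ("r", "o", "b", "o", "t"), "go": ("g", "o")}
result = max_match(["r", "o", "b", "o", "t", "g", "o"], lexicon)
```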
A speech recognition method of a robot system, characterized in that the sound wavelength is separated into band-specific frequencies using a fast Fourier transform function.
A speech recognition method of a robot system, characterized in that the user's voice is recognized by matching a phoneme extracted by a consonant-based maximum flow matching method against the phonemes of pre-stored words.
(a) photographing a subject and determining whether face recognition or shape recognition is to be performed on the captured image;
(b) if face recognition is determined, recognizing a face by comparing previously stored data with the captured image; and
(c) if shape recognition is determined, converting the photographed image into digital data and recognizing the shape by representing the converted data as a figure,
An image recognition method of a robot system comprising the above steps.
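The branch between face recognition (step (b)) and shape recognition (step (c)) can be sketched as follows. The feature vectors, the squared-Euclidean comparison, and the thresholding are illustrative stand-ins, not the patent's determination methods:

```python
def classify_face(features, stored_faces):
    """Step (b): compare captured features with previously stored data,
    returning the label of the closest stored face."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(stored_faces, key=lambda name: sq_dist(features, stored_faces[name]))

def recognize(features, mode, stored_faces):
    """Step (a): dispatch the captured image to face or shape recognition."""
    if mode == "face":
        return ("face", classify_face(features, stored_faces))
    # Step (c): digitize the image and represent it as a simple figure
    # (here a binary silhouette via thresholding, an illustrative stand-in).
    return ("shape", [1 if v > 0.5 else 0 for v in features])

faces = {"alice": [0.1, 0.9], "bob": [0.8, 0.2]}
face_result = recognize([0.15, 0.85], "face", faces)
shape_result = recognize([0.7, 0.3], "shape", faces)
```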
An image recognition method of a robot system, characterized in that a face is recognized by any one of a knowledge-based determination method, a template combination determination method, a shape vector determination method, and a maximum flow matching method using a sector template.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20100044027A KR101171047B1 (en) | 2010-05-11 | 2010-05-11 | Robot system having voice and image recognition function, and recognition method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20100044027A KR101171047B1 (en) | 2010-05-11 | 2010-05-11 | Robot system having voice and image recognition function, and recognition method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20110124568A true KR20110124568A (en) | 2011-11-17 |
KR101171047B1 KR101171047B1 (en) | 2012-08-03 |
Family
ID=45394298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR20100044027A KR101171047B1 (en) | 2010-05-11 | 2010-05-11 | Robot system having voice and image recognition function, and recognition method thereof |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101171047B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102447647B1 (en) * | 2022-05-20 | 2022-09-27 | 주식회사 패스트레인 | Method for thumbnail instance exposure adaptive to estimated user type, and device implementing thereof |
- 2010-05-11: KR application KR20100044027A, patent KR101171047B1, active IP Right Grant
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108942925A (en) * | 2018-06-25 | 2018-12-07 | 珠海格力智能装备有限公司 | Robot control method and device |
WO2020251074A1 (en) * | 2019-06-12 | 2020-12-17 | 엘지전자 주식회사 | Artificial intelligence robot for providing voice recognition function and operation method thereof |
US11810575B2 (en) | 2019-06-12 | 2023-11-07 | Lg Electronics Inc. | Artificial intelligence robot for providing voice recognition function and method of operating the same |
Also Published As
Publication number | Publication date |
---|---|
KR101171047B1 (en) | 2012-08-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107799126B (en) | Voice endpoint detection method and device based on supervised machine learning | |
CN112088402B (en) | Federated neural network for speaker recognition | |
KR102379954B1 (en) | Image processing apparatus and method | |
CN108346427A (en) | Voice recognition method, device, equipment and storage medium | |
CN111048113B (en) | Sound direction positioning processing method, device, system, computer equipment and storage medium | |
CN104361276A (en) | Multi-mode biometric authentication method and multi-mode biometric authentication system | |
US11825278B2 (en) | Device and method for auto audio and video focusing | |
CN111386531A (en) | Multi-mode emotion recognition apparatus and method using artificial intelligence, and storage medium | |
KR20210052036A (en) | Apparatus with convolutional neural network for obtaining multiple intent and method therof | |
KR20080050994A (en) | System and method for integrating gesture and voice | |
CN109558788B (en) | Silence voice input identification method, computing device and computer readable medium | |
KR102290186B1 (en) | Method of processing video for determining emotion of a person | |
CN112507311A (en) | High-security identity verification method based on multi-mode feature fusion | |
JP2019200671A (en) | Learning device, learning method, program, data generation method, and identification device | |
CN111326152A (en) | Voice control method and device | |
KR20210044475A (en) | Apparatus and method for determining object indicated by pronoun | |
CN116312512A (en) | Multi-person scene-oriented audiovisual fusion wake-up word recognition method and device | |
KR101171047B1 (en) | Robot system having voice and image recognition function, and recognition method thereof | |
KR20210066774A (en) | Method and Apparatus for Distinguishing User based on Multimodal | |
US10917721B1 (en) | Device and method of performing automatic audio focusing on multiple objects | |
CN114239610A (en) | Multi-language speech recognition and translation method and related system | |
Ivanko et al. | A novel task-oriented approach toward automated lip-reading system implementation | |
KR102564570B1 (en) | System and method for analyzing multimodal emotion | |
US11218803B2 (en) | Device and method of performing automatic audio focusing on multiple objects | |
US20220262363A1 (en) | Speech processing device, speech processing method, and recording medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| A201 | Request for examination | |
| E902 | Notification of reason for refusal | |
| E701 | Decision to grant or registration of patent right | |
| GRNT | Written decision to grant | |
| FPAY | Annual fee payment | Payment date: 20150522; Year of fee payment: 4 |
| FPAY | Annual fee payment | Payment date: 20160428; Year of fee payment: 5 |
| FPAY | Annual fee payment | Payment date: 20170518; Year of fee payment: 6 |
| FPAY | Annual fee payment | Payment date: 20180723; Year of fee payment: 7 |
| FPAY | Annual fee payment | Payment date: 20190716; Year of fee payment: 8 |