CN113226928A - Unmanned mobile object and information processing method

Info

Publication number
CN113226928A
Authority
CN
China
Prior art keywords
sound
unmanned mobile
speaker
person
unmanned
Prior art date
Legal status
Pending
Application number
CN201980085549.6A
Other languages
Chinese (zh)
Inventor
久原俊介
S·W·约翰
小西一畅
浅井胜彦
井上和夫
Current Assignee
Panasonic Intellectual Property Management Co Ltd
Original Assignee
Panasonic Intellectual Property Management Co Ltd
Priority date
Filing date
Publication date
Application filed by Panasonic Intellectual Property Management Co Ltd
Publication of CN113226928A


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30: Control circuits for electronic adaptation of the sound field
    • H04S7/302: Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303: Tracking of listener position or orientation
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/12: Target-seeking control
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B64: AIRCRAFT; AVIATION; COSMONAUTICS
    • B64U: UNMANNED AERIAL VEHICLES [UAV]; EQUIPMENT THEREFOR
    • B64U2101/00: UAVs specially adapted for particular uses or applications
    • B64U2201/00: UAVs characterised by their flight controls
    • B64U2201/20: Remote controls
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/20: Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/323: Arrangements for obtaining desired directional characteristic only, for loudspeakers
    • H04R1/326: Arrangements for obtaining desired directional characteristic only, for microphones
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15: Aspects of sound capture and related signal processing for recording or reproduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Manipulator (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Traffic Control Systems (AREA)

Abstract

An unmanned mobile body (100) includes: a directional speaker (107) that outputs sound in a directivity direction; and a processor (150) that obtains one or more pieces of sensing data. The processor (150) determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates the positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body (100) at which the first object and the second object are included in a range in which sound is transmitted by the directional speaker (107) with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body (100) to the first position.

Description

Unmanned mobile object and information processing method
Technical Field
The present disclosure relates to an unmanned mobile body and the like.
Background
Patent document 1 proposes a sound emission control device that controls the state of sound emitted to the outside of a moving body. Patent document 1 discloses that a direction corresponding to the recognized position of a subject is set as the sound emission direction.
(Prior art document)
(patent document)
Patent document 1: japanese patent laid-open No. 2005-319952
However, when sound is to be output to or collected from a plurality of objects, it may be difficult to cover the plurality of objects collectively if only the direction of the sound is controlled.
Disclosure of Invention
Accordingly, an object of the present disclosure is to provide an unmanned mobile body capable of outputting sound to, or collecting sound from, a plurality of objects collectively.
For example, an unmanned mobile body according to an aspect of the present disclosure includes: a directional speaker that outputs sound in a directivity direction; and a processor that obtains one or more pieces of sensing data. The processor determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is transmitted by the directional speaker with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
For example, an unmanned mobile body according to an aspect of the present disclosure includes: a directional microphone that collects sound from a directivity direction; and a processor that obtains one or more pieces of sensing data including data obtained from the directional microphone. The processor determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is collected by the directional microphone with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
These general or specific aspects may be realized by a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.
An unmanned mobile body or the like according to one aspect of the present disclosure can output sound to, or collect sound from, a plurality of objects collectively.
Drawings
Fig. 1 is a block diagram showing a basic configuration example of an unmanned mobile unit according to embodiment 1.
Fig. 2 is a flowchart showing a basic operation example of the unmanned mobile unit according to embodiment 1.
Fig. 3 is a conceptual diagram illustrating a specific operation example of the unmanned mobile unit according to embodiment 1.
Fig. 4 is a block diagram showing a specific configuration example of the unmanned mobile unit according to embodiment 1.
Fig. 5 is a flowchart showing a specific operation example of the unmanned mobile unit according to embodiment 1.
Fig. 6 is a conceptual diagram illustrating attenuation of sound pressure in embodiment 1.
Fig. 7 is a data diagram showing a relationship between sound pressure of a sound source and sound pressure of a place far from the sound source in embodiment 1.
Fig. 8 is a conceptual diagram illustrating the positional relationship among the speaker, the related person, and the unmanned mobile object according to embodiment 1.
Fig. 9 is a data diagram showing the relationship between the distance between the speaker and the person concerned, the sound pressure of the sound emitted by the unmanned mobile object, and the sound output range in embodiment 1.
Fig. 10 is a data diagram showing a relationship between sound pressure of a sound source and a range of sound transmission of sound pressure in a predetermined range in embodiment 1.
Fig. 11 is a conceptual diagram illustrating an example in which the distance between the speaker and the relevant person is 3m in embodiment 1.
Fig. 12 is a conceptual diagram illustrating an example in which the distance between the speaker and the relevant person is 10m in embodiment 1.
Fig. 13 is a conceptual diagram illustrating an example of a related person in contact with the speaker according to embodiment 1.
Fig. 14 is a conceptual diagram illustrating an example of a related person in contact with the speaker via an object according to embodiment 1.
Fig. 15 is a conceptual diagram illustrating an example of a related person conversing with the speaker according to embodiment 1.
Fig. 16 is a conceptual diagram illustrating an example of a related person at a small distance from the speaker according to embodiment 1.
Fig. 17 is a conceptual diagram illustrating an example of a related person wearing the same clothing as the speaker in embodiment 1.
Fig. 18 is a conceptual diagram illustrating an example of a related person present in a predetermined area together with the speaker according to embodiment 1.
Fig. 19 is a conceptual diagram illustrating an example of a related person approaching the speaker according to embodiment 1.
Fig. 20 is a conceptual diagram illustrating an example of a related person within the range of the speaker's voice according to embodiment 1.
Fig. 21 is a conceptual diagram illustrating an example of movement in embodiment 1 so that a related person within the range of the speaker's voice is included in the sound output range.
Fig. 22 is a conceptual diagram showing an example of a related person engaged in a conversation different from that of the speaker according to embodiment 1.
Fig. 23 is a conceptual diagram showing an example of a person related to sound output and sound collection in embodiment 1.
Fig. 24 is a conceptual diagram illustrating an example of the sound output position on a straight line passing through the position of the speaker and the position of the related person according to embodiment 1.
Fig. 25 is a conceptual diagram illustrating an example of a voice output position near a speaker according to embodiment 1.
Fig. 26 is a conceptual diagram illustrating an example of a sound output position near an elderly person in embodiment 1.
Fig. 27 is a conceptual diagram illustrating an example of correcting the sound output position to the front side with the person concerned as the center in embodiment 1.
Fig. 28 is a conceptual diagram illustrating an example of a sound output position determined with reference to the speaker so that the related person is included in the sound output range in embodiment 1.
Fig. 29 is a conceptual diagram illustrating an example of the front side sound output positions of the speaker and the related person in embodiment 1.
Fig. 30 is a conceptual diagram illustrating an example of a sound output position on a straight line in an oblique direction with respect to a horizontal plane in embodiment 1.
Fig. 31 is a conceptual diagram illustrating an example of a sound output position on a horizontal straight line in embodiment 1.
Fig. 32 is a conceptual diagram illustrating an example of a sound output position at the same height as the speaker and the related person in embodiment 1.
Fig. 33 is a conceptual diagram illustrating an example of a sound output position higher than the speaker and the related person in embodiment 1.
Fig. 34 is a conceptual diagram illustrating an example of the height of the sound output position in embodiment 1.
Fig. 35 is a conceptual diagram illustrating an example of the sound output position for excluding the non-relevant person from the sound output range in embodiment 1.
Fig. 36 is a conceptual diagram illustrating an example of the positional relationship on the horizontal plane of the non-relevant person, the speaker, and the unmanned mobile object in embodiment 1.
Fig. 37 is a conceptual diagram illustrating an example of the positional relationship on the vertical plane of the non-relevant person, the speaker, and the unmanned mobile object in embodiment 1.
Fig. 38 is a conceptual diagram illustrating an example of a voice output position for excluding another person from the voice output range in embodiment 1.
Fig. 39 is a conceptual diagram illustrating an example in which the unmanned mobile object of embodiment 1 moves to the sound output position.
Fig. 40 is a conceptual diagram illustrating an example in which the unmanned mobile object of embodiment 1 starts voice output and then moves to a voice output position.
Fig. 41 is a conceptual diagram illustrating an example in which the unmanned mobile body of embodiment 1 moves to the sound output position via the front side.
Fig. 42 is a conceptual diagram illustrating an example in which the unmanned mobile object of embodiment 1 changes the sound output range.
Fig. 43 is a conceptual diagram illustrating an example of selective operation of movement and change of the audio output range in embodiment 1.
Fig. 44 is a conceptual diagram illustrating an example of a case where the related person leaves the sound output range in embodiment 1.
Fig. 45 is a conceptual diagram illustrating an example when another person enters the sound output range in embodiment 1.
Fig. 46 is a block diagram showing a basic configuration example of the unmanned mobile unit according to embodiment 2.
Fig. 47 is a flowchart showing a basic operation example of the unmanned mobile unit according to embodiment 2.
Fig. 48 is a conceptual diagram illustrating a specific operation example of the unmanned mobile unit according to embodiment 2.
Fig. 49 is a block diagram showing a specific configuration example of the unmanned mobile unit according to embodiment 2.
Fig. 50 is a flowchart showing a specific operation example of the unmanned mobile unit according to embodiment 2.
Fig. 51 is a conceptual diagram illustrating an example of the sound pickup position on a straight line passing through the position of the speaker and the position of the related person according to embodiment 2.
Fig. 52 is a conceptual diagram illustrating an example of the sound pickup position near the speaker in embodiment 2.
Fig. 53 is a conceptual diagram illustrating an example of the sound pickup position near the elderly person in embodiment 2.
Fig. 54 is a conceptual diagram illustrating an example of correcting the sound pickup position to the front side with the person concerned as the center in embodiment 2.
Fig. 55 is a conceptual diagram illustrating an example of a sound collection position determined with reference to the speaker so that the related person is included in the sound collection range in embodiment 2.
Fig. 56 is a conceptual diagram illustrating an example of sound pickup positions on the front side of the speaker and the related person in embodiment 2.
Fig. 57 is a conceptual diagram illustrating an example of the sound pickup position on a straight line in an oblique direction with respect to the horizontal plane in embodiment 2.
Fig. 58 is a conceptual diagram illustrating an example of the sound pickup position on a horizontal straight line in embodiment 2.
Fig. 59 is a conceptual diagram illustrating an example of a sound collection position at the same height as the speaker and the related person in embodiment 2.
Fig. 60 is a conceptual diagram illustrating an example of a sound pickup position higher than the speaker and the related person in embodiment 2.
Fig. 61 is a conceptual diagram illustrating an example of the height of the sound collection position in embodiment 2.
Fig. 62 is a conceptual diagram illustrating an example of the sound pickup position for excluding non-relevant persons from the sound pickup range in embodiment 2.
Fig. 63 is a conceptual diagram illustrating an example of the positional relationship on the horizontal plane of the non-relevant person, the speaker, and the unmanned mobile object in embodiment 2.
Fig. 64 is a conceptual diagram illustrating an example of the positional relationship on the vertical plane of the non-relevant person, the speaker, and the unmanned mobile object in embodiment 2.
Fig. 65 is a conceptual diagram illustrating an example of the sound pickup position for excluding another person from the sound pickup range in embodiment 2.
Fig. 66 is a conceptual diagram illustrating an example of the sound pickup position determined from the voice uttered by the speaker and the voice uttered by the relevant person according to embodiment 2.
Fig. 67 is a conceptual diagram illustrating an example of the unmanned mobile unit moving to the sound collection position in embodiment 2.
Fig. 68 is a conceptual diagram illustrating an example in which the unmanned mobile unit according to embodiment 2 moves to the sound pickup position via the front side.
Fig. 69 is a conceptual diagram illustrating an example in which the unmanned moving object changes the sound collection range according to embodiment 2.
Fig. 70 is a conceptual diagram illustrating an example of selective operation of movement and change of the sound collection range according to embodiment 2.
Fig. 71 is a conceptual diagram illustrating an example of a case where the related person leaves the sound collection range in embodiment 2.
Fig. 72 is a conceptual diagram illustrating an example of when another person enters the sound collection range in embodiment 2.
Fig. 73 is a conceptual diagram illustrating an example of when a group enters the sound collection range in embodiment 2.
Fig. 74 is a conceptual diagram illustrating an example of when the related person enters the sound collection range in embodiment 2.
Fig. 75 is a block diagram showing a basic configuration example of an unmanned mobile unit according to embodiment 3.
Fig. 76 is a flowchart showing a basic operation example of the unmanned mobile unit according to embodiment 3.
Fig. 77 is a conceptual diagram illustrating an example of the sound output range and the sound collection range in embodiment 3.
Fig. 78 is a conceptual diagram illustrating an example of collecting sound from a range in which the sound output range and the sound collection range do not overlap in embodiment 3.
Fig. 79 is a conceptual diagram illustrating an example of adjusting a range in which the sound output range and the sound collection range do not overlap in embodiment 3.
Fig. 80 is a block diagram showing a specific configuration example of the unmanned mobile unit according to embodiment 3.
Fig. 81 is a flowchart showing a specific operation example of the unmanned mobile unit according to embodiment 3.
Detailed Description
(Underlying knowledge forming the basis of the present disclosure)
An unmanned mobile body that carries a microphone and a speaker and converses with a person is being studied. Such an unmanned mobile body may be a robot, or an unmanned flying body also called an unmanned aerial vehicle. The unmanned mobile body may itself recognize the contents of the conversation, for example by means of artificial intelligence (AI) mounted on it, and converse with the person. Alternatively, a remote operator, a remote administrator, or the like may converse, via the unmanned mobile body, with a person other than themselves.
Furthermore, an unmanned flying body flies by rotating its propellers at high speed, so its flight sound is loud. Therefore, for example, when an unmanned flying body converses with a person, it emits sound at a large volume to compensate for the flight sound. Accordingly, the person conversing with the unmanned flying body can recognize the sound it emits. On the other hand, people present near that person are annoyed by the loud sound emitted from the unmanned flying body.
In order to suppress such adverse effects, rather than further increasing the sound volume, a directional speaker is used so that the sound emitted from the unmanned flying body is transmitted only to the person conversing with it. In this case, the directional speaker is pointed toward that person. Accordingly, the person conversing with the unmanned flying body can hear the sound it emits.
Similarly, a directional microphone is used so that the loud flight sound is not collected as noise and only the voice of the person conversing with the unmanned flying body is collected. In this case, the directional microphone is pointed toward that person. Accordingly, the unmanned flying body can recognize the voice of the person conversing with it.
However, the person conversing with the unmanned flying body is not necessarily alone. For example, a speaker conversing with the unmanned flying body may do so together with a person related to the speaker, such as an acquaintance, a family member, or a work colleague. In such a case, if the unmanned flying body emits sound only toward the speaker, it is difficult for the related person present near the speaker to hear that sound. Likewise, if the unmanned flying body points its microphone only at the speaker, it is difficult to appropriately collect the voice of the related person present near the speaker.
As a result, it is difficult for the related person to join the conversation between the unmanned flying body and the speaker, and it is difficult to establish a conversation among the unmanned flying body, the speaker, and the related person. Such adverse effects arise not only with unmanned flying bodies but also with other unmanned mobile bodies.
Thus, for example, an unmanned mobile body according to an aspect of the present disclosure includes: a directional speaker that outputs sound in a directivity direction; and a processor that obtains one or more pieces of sensing data. The processor determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is transmitted by the directional speaker with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
Accordingly, the unmanned mobile body can appropriately output sound to the first object and the second object. That is, the unmanned mobile body can output sound to a plurality of objects collectively.
For example, an unmanned mobile body according to an aspect of the present disclosure includes: a directional microphone that collects sound from a directivity direction; and a processor that obtains one or more pieces of sensing data including data obtained from the directional microphone. The processor determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is collected by the directional microphone with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
Accordingly, the unmanned mobile body can appropriately collect sound from the first object and the second object. That is, the unmanned mobile body can collect sound from a plurality of objects collectively.
For example, the processor adjusts the range according to the positional relationship, and determines the first position according to the adjusted range.
Accordingly, the unmanned mobile body can appropriately adjust the range of sound output or sound collection according to the positional relationship, and can appropriately include a plurality of objects within the adjusted range.
The first position is, for example, a position on a front side of the first object and the second object.
Accordingly, the unmanned mobile body can move to an appropriate position for performing a conversation with a plurality of objects.
For example, the processor obtains the body information of the first subject and the body information of the second subject from at least one of the one or more sensing data, and determines the first position from the body information of the first subject and the body information of the second subject.
Accordingly, the unmanned mobile body can move to an appropriate position with respect to the body information of the first object and the body information of the second object.
For example, the processor estimates an age of at least one of the first object and the second object based on at least one of the one or more sensing data, and determines the first position based on the age of at least one of the first object and the second object.
Accordingly, the unmanned mobile body can move to an appropriate position according to age, and can appropriately output or collect sound to a plurality of subjects.
And, for example, the processor determines the first position such that a third object unrelated to the first object and the second object is not included in the range.
Accordingly, the unmanned mobile object can suppress sound output or sound collection to the third object having no relevance.
For example, the processor detects a position of an obstacle based on at least one of the one or more sensing data, and determines the first position based on the position of the obstacle.
Accordingly, the unmanned mobile body can appropriately determine the positions for outputting or collecting sound to a plurality of objects according to the position of the obstacle. Further, the unmanned mobile body can suppress, for example, sound output or sound collection to a third object that is not related, by using an obstacle.
For example, the processor may be configured to, when it is determined that the second object is present while the first object is being subjected to sound output or sound collection, move the unmanned mobile object to the first position in a state where the first object is included in the range.
Accordingly, the unmanned mobile object can move to an appropriate position for performing a conversation with the first object and the second object while continuing the conversation with the first object.
For example, the processor may move the unmanned mobile object to the first position through a front side of the first object when determining that the second object is present while the first object is being subjected to sound output or sound collection.
Accordingly, the unmanned mobile body can move to an appropriate position for performing a conversation with the first object and the second object via an appropriate region for performing a conversation with the first object.
For example, the processor may be configured to, when it is determined that the second object is present while the first object is being subjected to sound output or sound collection, move the unmanned mobile body to the first position while maintaining the quality of sound output or sound collection for the first object at a constant level.
Accordingly, the unmanned mobile object can move to an appropriate position for performing a conversation with the first object and the second object while continuing the conversation with the first object.
For example, the second object is an object related to the first object, and the processor obtains at least one of information indicating a relationship with the first object and information indicating a relationship with the unmanned moving body from at least one of the one or more sensing data, and determines whether or not an object existing in the periphery of the first object is related to the first object from at least one of information indicating a relationship with the first object and information indicating a relationship with the unmanned moving body, thereby determining whether or not the second object is present in the periphery of the first object.
Accordingly, the unmanned mobile body can appropriately determine whether or not the second object related to the first object exists in the periphery of the first object.
For example, the processor detects how frequently the first object utters sound and how frequently the second object utters sound, based on at least one of the one or more pieces of sensing data, and determines, as the first position, a position closer to whichever of the first object and the second object utters sound more frequently.
Accordingly, the unmanned mobile body can move to the vicinity of the object that utters sound frequently. Therefore, the unmanned mobile body can appropriately collect sound from the object that utters sound frequently.
For example, the processor detects the volume of the first object and the volume of the second object, based on at least one of the one or more pieces of sensing data, and determines, as the first position, a position closer to whichever of the first object and the second object has the smaller volume.
Accordingly, the unmanned mobile body can move to the vicinity of the object with the smaller volume. Therefore, the unmanned mobile body can appropriately collect sound from a quiet object.
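As a rough illustration of these frequency- and volume-based aspects, the first position can be biased toward the quieter object. The sketch below is purely illustrative; the linear weighting rule and all names are assumptions, not the patent's disclosed method.

```python
from typing import Tuple

Point = Tuple[float, float]

def biased_position(p1: Point, p2: Point, vol1: float, vol2: float) -> Point:
    """Return a point on the segment p1-p2, closer to the quieter object.

    t grows toward 1 (i.e., toward p2) as vol2 shrinks relative to vol1;
    the specific linear weighting is an assumption for illustration.
    """
    t = vol1 / ((vol1 + vol2) or 1.0)
    return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))

# Example: the second object speaks at half the first object's volume, so the
# returned point lies two-thirds of the way toward the quieter second object.
print(biased_position((0.0, 0.0), (3.0, 0.0), 60.0, 30.0))  # (2.0, 0.0)
```

With the weights swapped (utterance counts in place of volumes), the same interpolation would bias the position toward the more frequent talker, as in the frequency-based aspect.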
For example, the unmanned mobile body further includes a directional microphone, and the range is a range in which sound is collected by the directional microphone with a quality equal to or greater than a predetermined quality.
Accordingly, the unmanned moving object can appropriately output sound to the first object and the second object, and can appropriately collect sound from the first object and the second object.
The processor controls the timing of the movement of the unmanned mobile body, for example, in accordance with a conversation between the first object and the unmanned mobile body.
Accordingly, the unmanned mobile body can move at an appropriate timing corresponding to the session.
For example, the processor may move the unmanned mobile body to the first position while sound emitted from the first object is being collected.
Accordingly, the unmanned mobile body can move while the first object is presumed to be uttering sound and the unmanned mobile body itself is not outputting sound. Therefore, the unmanned mobile body can avoid the second object entering the sound output range partway through an output, so the entire content of the output can be conveyed to the second object.
For example, when the sound emitted from the first object ends while the unmanned mobile body is moving, the processor may start the sound output from the directional speaker after the movement of the unmanned mobile body is completed.
Accordingly, the unmanned mobile body can start sound output after moving to a position appropriate for outputting sound to the first object and the second object. Therefore, the unmanned mobile body can avoid the second object entering the sound output range partway through an output, so the entire content of the output can be conveyed to the second object.
For example, the processor may move the unmanned mobile body while neither sound output to the first object nor sound collection from the first object is being performed.
Accordingly, the unmanned mobile body can avoid interrupting sound partway through, and can output or collect each sound as a whole. Further, the unmanned mobile body can suppress noise caused by its own movement from being mixed in.
The one or more sensing data may include, for example, image data generated by an image sensor, and the processor may obtain the positional relationship from the image data generated by the image sensor.
Accordingly, the unmanned mobile object can appropriately obtain the positional relationship between the first object and the second object from the image data.
The one or more sensing data may include, for example, ranging data generated by a ranging sensor, and the processor may obtain the positional relationship based on the ranging data generated by the ranging sensor.
Accordingly, the unmanned mobile body can appropriately obtain the positional relationship between the first object and the second object based on the distance measurement data.
And, for example, the positional relationship includes at least one of a distance and a position associated with the first object and the second object.
Accordingly, the unmanned mobile object can move to an appropriate position according to the distance or position associated with the first object and the second object.
For example, an information processing method according to one aspect of the present disclosure obtains one or more pieces of sensing data; determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is transmitted by a directional speaker provided in the unmanned mobile body with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
By performing this information processing method, sound can be appropriately output to the first object and the second object. That is, sound can be output to a plurality of objects collectively.
For example, a program according to an aspect of the present disclosure causes a computer to execute the information processing method.
Accordingly, by executing the program, sound can be appropriately output to the first object and the second object. That is, sound can be output to a plurality of objects collectively.
For example, an information processing method according to one aspect of the present disclosure obtains one or more pieces of sensing data; determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data; when it is determined that the second object is present, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data; determines, based on the positional relationship, a first position of the unmanned mobile body at which the first object and the second object are included in a range in which sound is collected by a directional microphone provided in the unmanned mobile body with a quality equal to or greater than a predetermined quality; and moves the unmanned mobile body to the first position.
By performing this information processing method, sound can be appropriately collected from the first object and the second object. That is, sound can be collected from a plurality of objects collectively.
For example, a program according to an aspect of the present disclosure causes a computer to execute the information processing method.
Accordingly, by executing the program, sound can be appropriately collected from the first object and the second object. That is, sound can be collected from a plurality of objects collectively.
Further, these general and specific aspects may be realized by a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.
Hereinafter, embodiments will be described in detail with reference to the drawings. The embodiments described below each show a general or specific example of the present disclosure. The numerical values, shapes, materials, constituent elements, arrangement positions and connection forms of the constituent elements, steps, order of the steps, and the like shown in the following embodiments are merely examples and are not intended to be limiting. Among the constituent elements in the following embodiments, those not recited in the independent claim representing the broadest concept are described as optional constituent elements.
In the following description, ordinal numbers such as first, second, and third may be assigned to elements. These ordinal numbers are assigned to elements in order to identify them and do not necessarily correspond to a meaningful order. These ordinal numbers may be replaced, newly assigned, or removed as appropriate.
In the following description, the sound pressure may be replaced with a sound pressure level or a sound volume, or the sound volume may be replaced with a sound pressure or a sound pressure level. Also, the conversation may be replaced with a communication.
(embodiment mode 1)
Fig. 1 is a block diagram showing a basic configuration example of an unmanned mobile body according to the present embodiment. Fig. 1 shows an unmanned mobile body 100 provided with a directional speaker 107 and a processor 150.
The unmanned mobile body 100 is a device that moves. For example, the unmanned mobile body 100 autonomously moves or stays still. When it receives an operation, the unmanned mobile body 100 may move in accordance with that operation. The unmanned mobile body 100 is typically an unmanned flying body, but is not limited to one; it may be a device that travels on a surface. The unmanned mobile body 100 may include a moving mechanism, such as a motor and an actuator, for moving through the air or along a surface.
The unmanned mobile body 100 may further include one or more sensors. For example, the unmanned moving object 100 may include an image sensor, a distance measurement sensor, a microphone as an audio sensor, a human detection sensor, and a position detector as a position sensor.
Directional speaker 107 is a speaker that outputs sound in a direction of directivity. The directivity direction of directional speaker 107 may be adjusted, and the sound pressure of the sound emitted from directional speaker 107 may be adjusted. The directivity direction of directional speaker 107 may be expressed as a sound output direction.
The processor 150 is configured by a circuit that performs information processing. For example, the processor 150 may control the movement of the unmanned mobile unit 100. Specifically, the processor 150 may control the movement of the unmanned mobile unit 100 by controlling the operation of a moving mechanism such as a motor and an actuator for moving in the air or on a surface.
Processor 150 may adjust the directivity direction of directional speaker 107, or the sound pressure of the sound emitted from directional speaker 107, by transmitting a control signal to directional speaker 107. The processor 150 may also adjust the orientation of the unmanned mobile body 100 itself in order to adjust the directivity direction of directional speaker 107.
Fig. 2 is a flowchart showing a basic operation example of the unmanned mobile body 100 shown in Fig. 1. The operations shown in Fig. 2 are performed mainly by the processor 150 of the unmanned mobile body 100.
First, the processor 150 obtains one or more pieces of sensing data (S101). The processor 150 may obtain the one or more pieces of sensing data from one or more sensors inside the unmanned mobile body 100, from one or more sensors outside the unmanned mobile body 100, or from a combination of both.
For example, an image sensor, a distance measuring sensor, a microphone, a human detection sensor, a position detector, or the like may be used for one or more sensors outside the unmanned mobile body 100.
The processor 150 determines whether or not a second object is present around the first object based on at least one of the acquired one or more sensing data (S102). For example, the first object is a speaker and the second object is a person associated with the speaker. However, each of the first object and the second object may be not only a human but also an animal or a device. The periphery of the first object is a predetermined range with respect to the first object.
If it is determined that the second object exists in the periphery of the first object, the processor 150 calculates the positional relationship between the first object and the second object from at least one of the one or more sensing data (S103). That is, the processor 150 derives the positional relationship between the first object and the second object from at least one of the one or more sensing data.
For example, the positional relationship includes at least one of a position and a distance related to the first object and the second object. The positional relationship may include the respective positions of the first object and the second object, and may also include the distance between the first object and the second object.
Specifically, the processor 150 may calculate the position of the first object, the position of the second object, the distance between the first object and the second object, and the like using image data obtained from the image sensor. Further, the processor 150 may calculate a distance between the unmanned mobile body 100 and the first object, a distance between the unmanned mobile body 100 and the second object, a distance between the first object and the second object, and the like using the ranging data obtained from the ranging sensor.
The processor 150 then determines the first position based on the calculated positional relationship. The first position is a position of the unmanned mobile body 100 at which the first object and the second object are included in a range in which sound is transmitted by the directional speaker 107 with a quality equal to or greater than a predetermined quality. The processor 150 then moves the unmanned mobile body 100 to the determined first position (S104).
Accordingly, the unmanned mobile body 100 can appropriately output sound to the first object and the second object. That is, the unmanned mobile body 100 can output sound to a plurality of objects collectively.
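The flow of steps S102 to S104 can be sketched compactly. The following Python fragment is a minimal sketch under simplifying assumptions: sensing has already been reduced to 2D positions, "periphery" is a fixed radius, and the position planner and the movement are passed in as callbacks. None of the names or the 5 m threshold come from the patent.

```python
import math
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]

NEARBY_RADIUS = 5.0  # assumed radius for "the periphery of the first object"

def dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])

def decide_and_move(first: Point,
                    detected: List[Point],
                    plan_position: Callable[[Point, Point], Point],
                    move_to: Callable[[Point], None]) -> Optional[Point]:
    # S102: is a second object present in the periphery of the first object?
    nearby = [p for p in detected if 0.0 < dist(first, p) <= NEARBY_RADIUS]
    if not nearby:
        return None  # no second object; nothing to do
    second = min(nearby, key=lambda p: dist(first, p))

    # S103: positional relationship -- here both positions and their distance
    separation = dist(first, second)
    assert separation > 0.0

    # S104: determine the first position and move there; plan_position must
    # keep both objects inside the speaker's adequate-quality range
    target = plan_position(first, second)
    move_to(target)
    return target

# Usage with a trivial planner that hovers 3 m beside the pair's midpoint:
midpoint_planner = lambda a, b: ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2 + 3.0)
decide_and_move((0.0, 0.0), [(2.0, 1.0)], midpoint_planner, print)
```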
For example, the second object is an object related to the first object. The processor 150 may determine whether or not an object existing in the periphery of the first object is related to the first object based on at least one of the one or more sensing data. Further, based on this, the processor 150 may determine whether or not the second object exists in the periphery of the first object.
At this time, the processor 150 may obtain at least one of information showing a relation with the first object and information showing a relation with the unmanned mobile body 100 from at least one of the one or more sensing data. The processor 150 may determine whether or not an object existing in the periphery of the first object is related to the first object, based on at least one of the information indicating the relation with the first object and the information indicating the relation with the unmanned mobile object 100.
Specifically, the processor 150 may determine that the object existing in the periphery of the first object is related to the first object when the object existing in the periphery of the first object satisfies one or more of the plurality of conditions.
For example, the plurality of conditions may include "contact with the first object", "conversation with the first object", "existing at a distance equal to or less than a threshold value with respect to the first object", "existing in a predetermined area together with the first object", "associated with the first object", "close to the first object", "existing in a range of voice transmission of the first object", "uttering voice to the unmanned mobile body 100 in a conversation between the first object and the unmanned mobile body 100", and "watching the unmanned mobile body 100 in a conversation between the first object and the unmanned mobile body 100", or the like.
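One hedged way to picture this determination: turn each listed condition into a boolean observation and treat the person as related when any one of them holds. The field names and the any-one-suffices rule below are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Observations:
    # One flag per condition listed above, as extracted from sensing data
    # (image recognition, distance measurement, voice activity, and so on).
    touching_first_object: bool = False
    conversing_with_first_object: bool = False
    within_threshold_distance: bool = False
    in_same_area_as_first_object: bool = False
    associated_with_first_object: bool = False
    approaching_first_object: bool = False
    within_voice_range_of_first_object: bool = False
    spoke_to_drone_during_conversation: bool = False
    gazing_at_drone_during_conversation: bool = False

def is_related(obs: Observations) -> bool:
    # Treat the object as related when one or more conditions hold.
    return any(vars(obs).values())

# Example: a person standing within the distance threshold counts as related.
print(is_related(Observations(within_threshold_distance=True)))  # True
```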
Fig. 3 is a conceptual diagram illustrating a specific operation example of the unmanned mobile body 100 shown in Fig. 1. In this example, the unmanned mobile body 100 is an unmanned flying body, also called an unmanned aerial vehicle. The speaker corresponds to the first object, and the related person corresponds to the second object.
For example, the unmanned mobile body 100 outputs sound to the speaker in the vicinity of the speaker. The unmanned mobile body 100 then determines whether or not a person is present around the speaker.
For example, the unmanned mobile vehicle 100 senses the vicinity of the speaker using a sensor provided in the unmanned mobile vehicle 100, and determines whether or not a person is present in the vicinity of the speaker based on the result. Specifically, as the sensor provided in the unmanned mobile object 100, an image sensor can be used. When the unmanned mobile unit 100 determines that the person existing around the speaker is the person related to the speaker, it determines that the person related to the speaker exists around the speaker.
When the unmanned mobile body 100 determines that the related person is present in the vicinity of the speaker, it determines the sound output position so that the speaker and the related person are included in the sound output range within which the sound emitted by the unmanned mobile body 100 is transmitted. The sound output range may be determined according to the directivity direction of the directional speaker 107.
Then, the unmanned mobile object 100 moves to the determined sound output position and outputs sound. Accordingly, the unmanned mobile object 100 can transmit the voice to the speaker and the related person included in the voice output range.
Fig. 4 is a block diagram showing a specific configuration example of the unmanned mobile unit 100 shown in fig. 3. The unmanned mobile body 100 shown in fig. 4 includes a GPS receiver 101, a gyro sensor 102, an acceleration sensor 103, a human detection sensor 104, a distance measurement sensor 105, an image sensor 106, a directional speaker 107, a directional microphone 108, a drive unit 109, a communication unit 110, a control unit 120, a storage unit 130, and a power supply unit 141.
The GPS receiver 101 is a receiver that forms part of a GPS (Global Positioning System) for measuring position, and obtains a position by receiving signals. For example, the GPS receiver 101 obtains the position of the unmanned mobile body 100. That is, the GPS receiver 101 operates as a sensor that detects the position of the unmanned mobile body 100.
The gyro sensor 102 is a sensor that detects the posture of the unmanned mobile body 100, that is, the angle or inclination of the unmanned mobile body 100. The acceleration sensor 103 is a sensor that detects the acceleration of the unmanned mobile body 100. The human detection sensor 104 is a sensor that detects a human in the periphery of the unmanned mobile body 100. The human detection sensor 104 may be an infrared sensor.
The distance measurement sensor 105 is a sensor that measures the distance between the unmanned mobile body 100 and the object, and generates distance measurement data. The image sensor 106 is a sensor for performing imaging, and generates an image by imaging. The image sensor 106 may also be a camera.
Directional speaker 107 is a speaker that outputs sound in the direction of directivity as described above. The directivity direction of directional speaker 107 may be adjusted, and the sound pressure of the sound emitted from directional speaker 107 may be adjusted. The directional microphone 108 is a microphone that collects sound from a directional direction. The directivity direction of the directional microphone 108 may be adjusted, or the sound collection sensitivity of the directional microphone 108 may be adjusted. The pointing direction of the directional microphone 108 may also be expressed as a sound pickup direction.
The driving unit 109 is a motor, an actuator, and the like that move the unmanned mobile body 100. The communication unit 110 is a communicator that communicates with a device outside the unmanned mobile body 100. The communication unit 110 may receive an operation signal for moving the unmanned mobile vehicle 100. The communication unit 110 may transmit and receive the contents of the session.
The control unit 120 corresponds to the processor 150 shown in Fig. 1 and is configured by a circuit that performs information processing. Specifically, in this example, the control unit 120 includes a human detection unit 121, a related person determination unit 122, a sound output range determining unit 123, a sound output position determining unit 124, a sound output control unit 125, and a movement control unit 126. That is, the processor 150 may also serve these roles.
The human detection unit 121 detects a person present in the periphery of the unmanned mobile body 100, based on sensing data obtained from the human detection sensor 104 or another sensor.
The related person determination unit 122 determines whether or not the person detected by the human detection unit 121 is a related person of the speaker. The sound output range determining unit 123 determines the sound output range based on the positional relationship between the speaker and the related person. The sound output position determining unit 124 determines the sound output position based on the determined sound output range. The sound output control unit 125 transmits a control signal to directional speaker 107 to control the sound output from directional speaker 107.
The movement control unit 126 transmits a control signal to the drive unit 109 to control the movement of the unmanned mobile unit 100. In this example, the movement control unit 126 controls the flight of the unmanned aerial vehicle 100.
The storage unit 130 is a memory for storing information and stores a control program 131 and sound pressure/sound output range correspondence information 132. The control program 131 is a program for the information processing performed by the control unit 120. The sound pressure/sound output range correspondence information 132 is information showing the correspondence between the sound pressure of the sound emitted from directional speaker 107 and the sound output range within which the sound is transmitted with a quality equal to or greater than the predetermined quality.
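The sound pressure/sound output range correspondence information 132 can be pictured as a small lookup table mapping source sound pressure to the reach of adequate-quality sound. A minimal sketch with made-up values (the patent discloses no concrete numbers):

```python
import bisect

# (source sound pressure in dB, reach in metres with at least the
# predetermined quality) -- illustrative values only.
SPL_TO_RANGE = [(70.0, 2.0), (80.0, 5.0), (90.0, 12.0), (100.0, 30.0)]

def range_for_spl(spl_db: float) -> float:
    """Look up the sound output range for a given source sound pressure,
    using the largest table entry that does not exceed spl_db."""
    keys = [spl for spl, _ in SPL_TO_RANGE]
    i = bisect.bisect_right(keys, spl_db) - 1
    if i < 0:
        return 0.0  # too quiet to reach anyone with the required quality
    return SPL_TO_RANGE[i][1]

print(range_for_spl(85.0))  # 5.0 -> an 85 dB source reaches about 5 m
```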
The power supply unit 141 is a circuit that supplies power to a plurality of components included in the unmanned mobile unit 100. The power supply section 141 includes, for example, a power source.
Fig. 5 is a flowchart showing a specific operation example of the unmanned mobile body 100 shown in fig. 3. For example, the plurality of components of the unmanned mobile body 100 shown in fig. 4 perform the operations shown in fig. 5 in an interlocking manner.
First, the unmanned mobile body 100 moves to a conversation position for conversing with the speaker (S111). For example, the conversation position is a position at which the voice uttered by the speaker reaches the unmanned mobile body 100 and the sound emitted by the unmanned mobile body 100 reaches the speaker. The speaker may be predetermined, or the unmanned mobile body 100 may determine the speaker during flight.
For example, in the unmanned mobile body 100, the human detector 121 detects a speaker based on sensing data obtained from the human detection sensor 104, the image sensor 106, or the like. The movement control unit 126 moves the unmanned mobile object 100 to a conversation position within a predetermined range with respect to the speaker via the drive unit 109.
Then, the unmanned mobile body 100 starts the conversation (S112). That is, the unmanned mobile body 100 starts at least one of sound output and sound collection. For example, the sound output control unit 125 causes directional speaker 107 to start sound output. The control unit 120 may start sound collection by the directional microphone 108.
Then, the unmanned mobile object 100 senses the periphery of the speaker (S113). For example, the person detection unit 121 detects a person around the speaker by sensing the surroundings of the speaker with the human detection sensor 104, the image sensor 106, or the like. Any sensor capable of detecting a person can be used for this detection. The periphery of the speaker corresponds to, for example, an area within a predetermined range with respect to the speaker.
Then, the unmanned mobile 100 determines whether or not a person other than the speaker is detected (S114). For example, the human detection unit 121 determines whether or not a human other than the speaker is detected around the speaker. When a person other than the speaker is not detected (no in S114), the unmanned mobile object 100 repeats sensing of the surroundings of the speaker (S113).
When a person other than the speaker is detected (yes in S114), the unmanned mobile object 100 determines whether or not the detected person is a person related to the speaker (S115). For example, the related person determination unit 122 may determine whether the detected person is a related person based on whether the distance between the speaker and the detected person is within a threshold, or based on another determination criterion such as grouping. This determination will be described later.
When the detected person is not the relevant person (no in S115), the unmanned mobile object 100 repeats sensing of the vicinity of the speaker (S113).
When the detected person is the related person (yes in S115), the unmanned mobile object 100 measures the separation distance between the speaker and the related person (S116). For example, the sound output range determination unit 123 may measure the separation distance by calculating the distance between the position of the speaker and the position of the related person, both detected from the sensing data.
Then, the unmanned mobile 100 determines the audio output range based on the distance between the speaker and the relevant person (S117). For example, the audio output range determining unit 123 determines the audio output range based on the measured distance. In this case, the audio output range determining unit 123 increases the audio output range as the distance measured increases.
The sound output range is, for example, a range determined relative to the position of the unmanned mobile object 100, and is the range within which the directional speaker 107 transmits sound with a quality equal to or higher than a predetermined level. The predetermined quality may correspond to a sound pressure within a predetermined range, or to a signal-to-noise ratio (S/N ratio) within a predetermined range. The determination of the sound output range will be described later.
Then, the unmanned mobile object 100 determines the voice output position based on the position of the speaker, the position of the relevant person, and the voice output range (S118). For example, the voice output position determination unit 124 determines the voice output position so as to include the detected position of the speaker and the detected position of the person concerned within the determined voice output range. The determination of the sound output position will be described later.
Then, the unmanned mobile object 100 moves to the audio output position (S119). For example, the movement control unit 126 controls the operation of the driving unit 109 to move the unmanned mobile vehicle 100 to the audio output position. Sound output control unit 125 may control the sound output of directional speaker 107 so that sound is transmitted to the sound output range with a quality equal to or higher than a predetermined quality.
Accordingly, the unmanned mobile object 100 can appropriately output a voice to the speaker and the related person.
In the above example, the unmanned mobile vehicle 100 performs the processing (S113 to S119) for moving to the audio output position after the conversation with the speaker is started (after S112), but the processing for moving to the audio output position may be performed before the conversation with the speaker is started.
In the above example, when the detected person is not the relevant person (no in S115), the unmanned mobile object 100 repeats sensing of the vicinity of the speaker (S113). However, the unmanned mobile object 100 may correct the sound output position so as not to output sound to a person (third object) other than the relevant person. That is, the sound output position determination unit 124 of the unmanned mobile object 100 may correct the sound output position so that a person other than the relevant person is not included in the sound output range.
The sound output position determination unit 124 may correct the sound output position so that a person other than the person concerned deviates from the sound output direction. This can suppress the possibility of a person other than the person concerned entering the sound output range when the person moves.
When the sound output range is fixed, that is, when the sound pressure of the sound emitted from the directional speaker 107 is fixed, the unmanned mobile unit 100 may determine whether or not the distance between the speaker and the person concerned is within the sound output range. Further, when the distance is within the audio output range, the unmanned mobile object 100 may determine the audio output position and move to the determined audio output position. The unmanned mobile object 100 may not move when the distance is not within the sound output range.
Fig. 6 is a conceptual diagram illustrating attenuation of sound pressure. It can be estimated that, for a single (point) sound source, the sound pressure attenuates by 6 dB each time the distance from the sound source doubles. In the example of fig. 6, the sound pressure is 74 dB at a position 1 m from the sound source, 68 dB at 2 m, 62 dB at 4 m, and 56 dB at 8 m.
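As a reference, the attenuation characteristic described above can be written out as a short calculation. The following Python sketch is illustrative only: the function name and the convention of referencing the source level at 1 m are assumptions, not part of the embodiment.

```python
import math

def sound_pressure_at(distance_m: float, level_at_1m_db: float) -> float:
    """Sound pressure level at distance_m from a single (point) source,
    assuming inverse-distance decay: -6 dB per doubling of distance."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)

for d in (1.0, 2.0, 4.0, 8.0):
    print(f"{d:4.0f} m: {sound_pressure_at(d, 74.0):.0f} dB")  # 74, 68, 62, 56
```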
Fig. 7 is a data diagram showing a relationship between sound pressure of a sound source and sound pressure of a place far from the sound source. In this example, when the sound pressure of the sound source is 56dB, the sound pressure of a location 2m away from the sound source is 50dB, the sound pressure of a location 4m away from the sound source is 44dB, and the sound pressure of a location 8m away from the sound source is 38 dB.
When the sound pressure of the sound source is 62dB, the sound pressure at a location 2m away from the sound source is 56dB, the sound pressure at a location 4m away from the sound source is 50dB, and the sound pressure at a location 8m away from the sound source is 44 dB. When the sound pressure of the sound source is 68dB, the sound pressure at a location 2m away from the sound source is 62dB, the sound pressure at a location 4m away from the sound source is 56dB, and the sound pressure at a location 8m away from the sound source is 50 dB.
Therefore, when the sound pressure of the sound source is 56dB, the sound is transmitted at a sound pressure of 50dB or more in a range of 2m from the sound source. When the sound pressure of the sound source is 62dB, the sound is transmitted at a sound pressure of 50dB or more in a range of 4m from the sound source. When the sound pressure of the sound source is 68dB, the sound is transmitted at a sound pressure of 50dB or more in a range of 8m from the sound source. By using such characteristics, the sound pressure of the sound emitted from the unmanned mobile body 100 and the sound output range extending from the unmanned mobile body 100 in the sound output direction are determined.
Fig. 8 is a conceptual diagram showing the positional relationship of the speaker, the related person, and the unmanned mobile object 100. For example, the sound pressure of the sound transmitted from the unmanned mobile object 100 to the speaker and the related person may be set to 50 dB or more. Further, the distance between the unmanned mobile object 100 and whichever of the speaker and the related person is closer to it may be set to 0.5 m or more. For example, on such premises, the sound pressure of the sound emitted from the unmanned mobile object 100 and the sound output range extending from the unmanned mobile object 100 in the sound output direction are determined.
Fig. 9 is a data diagram showing the relationship between the distance between the speaker and the related person, the sound pressure of the sound emitted from the unmanned mobile object 100, and the sound output range according to the present embodiment. For example, on the premises described with reference to figs. 6 to 8, when the sound pressure of the sound emitted by the unmanned mobile object 100 is 56 dB, the sound output range extending from the unmanned mobile object 100 in the sound output direction, that is, the range in which the sound is transmitted at a sound pressure of 50 dB or more, is the range of the directivity width × 2 m. The directivity width is the width over which the sound spreads in the direction perpendicular to the sound output direction.
When the sound pressure of the sound emitted from the unmanned mobile object 100 is 62dB, the sound output range extending from the unmanned mobile object 100 in the sound output direction is a range of the directivity width × 4 m. When the sound pressure of the sound emitted from the unmanned mobile object 100 is 68dB, the sound output range extending from the unmanned mobile object 100 in the sound output direction is a range of the directivity width × 8 m.
The unmanned mobile object 100 outputs sound at a position separated from the speaker and the related person by at least 0.5 m. Therefore, when the distance between the speaker and the related person is in the range of 0 m to 1.5 m, both of them can be included within 2 m of the unmanned mobile object 100. In this case, the unmanned mobile object 100 can therefore set the sound pressure of the emitted sound to 56 dB and set the sound output range extending in the sound output direction to the directivity width × 2 m.
Similarly, when the distance between the speaker and the related person is in the range of 1.5 m to 3.5 m, the unmanned mobile object 100 can set the sound pressure of the emitted sound to 62 dB and set the sound output range extending in the sound output direction to the directivity width × 4 m. Likewise, when the distance between the speaker and the related person is in the range of 3.5 m to 7.5 m, the unmanned mobile object 100 can set the sound pressure to 68 dB and set the sound output range to the directivity width × 8 m.
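The correspondence described above can be pictured as a simple lookup over the sound pressure/sound output range correspondence information 132. The following Python sketch is a minimal illustration assuming the 0.5 m minimum standoff of fig. 8 and the three entries of fig. 9; the table layout and all names are hypothetical.

```python
MIN_STANDOFF_M = 0.5  # minimum distance to the nearer person (fig. 8)
# (source sound pressure in dB, reach in m of the >= 50 dB output range)
PRESSURE_RANGE_TABLE = [(56, 2.0), (62, 4.0), (68, 8.0)]  # fig. 9

def select_pressure_and_range(separation_m: float):
    """Pick the smallest source sound pressure whose output range can cover
    both the speaker and the related person."""
    required_reach = MIN_STANDOFF_M + separation_m
    for pressure_db, reach_m in PRESSURE_RANGE_TABLE:
        if reach_m >= required_reach:
            return pressure_db, reach_m
    return None  # separation too large for the available sound pressures

print(select_pressure_and_range(1.0))  # (56, 2.0): separation 0 m to 1.5 m
print(select_pressure_and_range(3.0))  # (62, 4.0): separation 1.5 m to 3.5 m
print(select_pressure_and_range(7.0))  # (68, 8.0): separation 3.5 m to 7.5 m
```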
Fig. 10 is a data diagram showing a relationship between sound pressure of a sound source and a range of sound transmission of sound pressure in a predetermined range. In this example, specifically, the sound pressure in the prescribed range is 46 to 54 dB.
In this example, when the sound pressure of the sound source is 60dB, the sound is transmitted to a position 2m away from the sound source at a sound pressure of 54dB, to a position 4m away from the sound source at a sound pressure of 48dB, to a position 8m away from the sound source at a sound pressure of 42dB, and to a position 16m away from the sound source at a sound pressure of 36 dB. When the sound pressure of the sound source is 70dB, the sound is transmitted to a position 2m away from the sound source at a sound pressure of 64dB, to a position 4m away from the sound source at a sound pressure of 58dB, to a position 8m away from the sound source at a sound pressure of 52dB, and to a position 16m away from the sound source at a sound pressure of 46 dB.
Therefore, the positions at which sound is transmitted at a sound pressure of 46 to 54 dB are approximately 2 to 5 m from the sound source when the sound pressure of the source is 60 dB, and approximately 6 to 16 m from the source when the sound pressure of the source is 70 dB. That is, the range in which sound is transmitted at a sound pressure of 46 to 54 dB has a length of about 3 m when the sound pressure of the source is 60 dB, and about 10 m when it is 70 dB.
For example, the sound output position is determined so that the speaker and the related person are included in the range where the sound is transmitted at the sound pressure in the predetermined range.
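Under the same inverse-distance model as in fig. 6, the near and far boundaries of the 46 to 54 dB band can be computed directly. The following sketch is an illustration under stated assumptions (source level referenced at 1 m); note that fig. 10 rounds the resulting bands to roughly 2 to 5 m and 6 to 16 m.

```python
import math

def band_limits(source_db: float, lo_db: float = 46.0, hi_db: float = 54.0):
    """Near and far distances between which the sound arrives at a sound
    pressure inside [lo_db, hi_db], for a source level referenced at 1 m."""
    near_m = 10.0 ** ((source_db - hi_db) / 20.0)
    far_m = 10.0 ** ((source_db - lo_db) / 20.0)
    return near_m, far_m

for src_db in (60.0, 70.0):
    near_m, far_m = band_limits(src_db)
    print(f"{src_db:.0f} dB source: {near_m:.1f} m to {far_m:.1f} m "
          f"(band length {far_m - near_m:.1f} m)")
# 60 dB source: 2.0 m to 5.0 m (band length 3.0 m)
# 70 dB source: 6.3 m to 15.8 m (band length 9.5 m)
```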
Fig. 11 is a conceptual diagram showing an example in which the separation distance between the speaker and the relevant person is 3 m. Specifically, an example of the voice output position determined from fig. 10 is shown in the case where the distance between the speaker and the relevant person is 3 m. When the distance between the speaker and the person concerned is 3m, the unmanned mobile unit 100 makes a sound at 60dB, so that the speaker and the person concerned who are 3m apart from each other can be included in a range where a sound is transmitted at a sound pressure of 46 to 54 dB.
Then, the unmanned mobile 100 moves to a sound output position where the speaker and the person concerned can be included in a range where the sound is transmitted at a sound pressure of 46 to 54dB, and emits the sound at 60 dB. Accordingly, the unmanned mobile 100 can transmit sound at a sound pressure of 46 to 54dB to the speaker and the person concerned.
Fig. 12 is a conceptual diagram showing an example in which the separation distance between the speaker and the relevant person is 10 m. Specifically, an example of the voice output position determined from fig. 10 is shown in the case where the distance between the speaker and the relevant person is 10 m. In the case where the separation distance between the speaker and the person concerned is 10m, the unmanned mobile unit 100 makes a sound at 70dB, so that the speaker and the person concerned, which are 10m apart from each other, can be included in a range where a sound is transmitted at a sound pressure of 46 to 54 dB.
Then, the unmanned mobile 100 moves to a sound output position where the speaker and the person concerned can be included in a range where the sound is transmitted at a sound pressure of 46 to 54dB, and emits the sound at 70 dB. Accordingly, the unmanned mobile 100 can transmit sound at a sound pressure of 46 to 54dB to the speaker and the person concerned.
Note that even when the unmanned mobile object 100 emits sound at 70 dB, a speaker and a related person who are 3 m apart can be included in the range where the sound is transmitted at a sound pressure of 46 to 54 dB. However, a higher sound pressure also increases power consumption. Therefore, when the speaker and the related person are 3 m apart, the unmanned mobile object 100 emits the sound at 60 dB.
That is, the unmanned mobile object 100 emits sound at the minimum sound pressure at which both the speaker and the related person can be included in the range where the sound is transmitted with a quality equal to or higher than the predetermined level. The unmanned mobile object 100 then determines the sound output position based on this minimum sound pressure, and moves to the determined sound output position. This reduces power consumption. In addition, because the sound output range becomes smaller, the possibility that a non-related person is included in the sound output range also decreases.
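The minimum-sound-pressure selection can be sketched as a search over candidate source levels, smallest first. The band lengths below are read off fig. 10 as rounded in the description (about 3 m at 60 dB and about 10 m at 70 dB); the candidate set and names are assumptions.

```python
BAND_LENGTH_TABLE = [(60.0, 3.0), (70.0, 10.0)]  # (source dB, band length m)

def minimum_source_level(separation_m: float):
    """Smallest candidate level whose 46-54 dB band spans the separation."""
    for source_db, length_m in BAND_LENGTH_TABLE:
        if length_m >= separation_m:
            return source_db  # lowest level that still covers both persons
    return None  # no candidate level spans this separation

print(minimum_source_level(3.0))   # 60.0 (fig. 11)
print(minimum_source_level(10.0))  # 70.0 (fig. 12)
```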
The unmanned mobile object 100 may also determine the sound output position so that the current sound pressure of the sound transmitted to the speaker is maintained while the sound newly reaches the related person. This can reduce any sense of discomfort felt by the speaker. In this case, in order to maintain the current sound pressure of the sound transmitted to the speaker, the minimum sound pressure need not be used; that is, a sound pressure greater than the minimum sound pressure may be used.
Although the noise generated by the unmanned mobile object 100 is not considered in figs. 6 to 12, it may be taken into account. For example, a higher sound pressure may be used to compensate for that noise.
The relationship between the sound pressure and the sound output range as described above may be stored in the storage unit 130 as the sound pressure/sound output range correspondence information 132.
Next, examples of the criteria for determining whether or not a detected person is a related person will be described with reference to figs. 13 to 23. Basically, the unmanned mobile object 100 determines the related person by performing image recognition processing on an image captured around the speaker. The determination of the related person may be performed before the conversation or during the conversation. For example, the related person determination unit 122 of the unmanned mobile object 100 determines the related person based on the following criteria.
Fig. 13 is a conceptual diagram showing an example of a related person who is in contact with the speaker. The unmanned mobile object 100 may determine a person who is in contact with the speaker as the related person. The unmanned mobile object 100 may make this determination when the person has been in contact with the speaker for a predetermined time or longer. Accordingly, the unmanned mobile object 100 can suppress determination errors that occur when a person merely touches the speaker by accident.
Further, fig. 13 shows an example in which the parent is the speaker and the child is the related person among the parents and children who are holding hands, but the speaker and the related person may be reversed.
Fig. 14 is a conceptual diagram showing an example of a person concerned who comes into contact with a speaker via an object. The unmanned mobile 100 may determine, as the relevant person, not only a person who directly contacts the speaker but also a person who contacts the speaker via an object. In the example of fig. 14, the person is in contact with the speaker via a wheelchair. In this case, the unmanned mobile 100 may determine a person who contacts the speaker via the wheelchair as the relevant person.
Further, as in the example of fig. 13, the unmanned mobile object 100 may determine a person as the related person when the person has been in contact with the speaker via the object for a predetermined time or longer. Although fig. 14 shows an example in which the person in the wheelchair is the speaker and the person pushing the wheelchair is the related person, the speaker and the related person may be reversed.
Fig. 15 is a conceptual diagram showing an example of a related person who is in conversation with the speaker. The unmanned mobile object 100 may determine a person having a conversation with the speaker as the related person. For example, the unmanned mobile object 100 may determine that a person is a related person when the image recognition processing detects that the person is opening his or her mouth toward the speaker.
Further, fig. 15 shows an example in which a person who opens the mouth to a speaker is the person concerned, but the speaker and the person concerned may be reversed.
Fig. 16 is a conceptual diagram showing an example of a related person whose distance to the speaker is short. The unmanned mobile object 100 may determine a person at a short distance from the speaker as the related person. For example, the unmanned mobile object 100 detects the position of the speaker and the position of a person other than the speaker, and calculates the distance between them from these positions. When the calculated distance is equal to or less than a threshold, the unmanned mobile object 100 determines that the person is a related person.
Further, when the state in which the distance between the speaker and a person other than the speaker is equal to or less than the threshold has continued for a predetermined time, the unmanned mobile object 100 may determine the person as the related person. Accordingly, the unmanned mobile object 100 can suppress determination errors caused by a person only temporarily coming close to the speaker.
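The distance-plus-dwell-time criterion of fig. 16 can be sketched as a small stateful check. The threshold, dwell time, 2-D positions, and all names below are illustrative assumptions.

```python
import math
import time

DISTANCE_THRESHOLD_M = 2.0  # assumed threshold for "close to the speaker"
DWELL_TIME_S = 5.0          # assumed predetermined time

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

class RelatedPersonJudge:
    """Treats a person as related once they have stayed within the threshold
    distance of the speaker for the dwell time."""

    def __init__(self):
        self.near_since = None  # when the person first came within range

    def update(self, speaker_pos, person_pos, now=None) -> bool:
        now = time.monotonic() if now is None else now
        if distance(speaker_pos, person_pos) <= DISTANCE_THRESHOLD_M:
            if self.near_since is None:
                self.near_since = now
            return now - self.near_since >= DWELL_TIME_S
        self.near_since = None  # proximity was only temporary; reset
        return False

judge = RelatedPersonJudge()
print(judge.update((0.0, 0.0), (1.0, 1.0), now=0.0))  # False: just arrived
print(judge.update((0.0, 0.0), (1.2, 0.8), now=6.0))  # True: stayed close
```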
Fig. 17 is a conceptual diagram showing an example of a related person wearing the same clothing as the speaker. The unmanned mobile object 100 may determine a person wearing the same clothing as the speaker as the related person; specifically, it may determine a person wearing the same uniform as the speaker as the related person. For example, the unmanned mobile object 100 may perform image recognition processing to determine whether the clothing of the speaker and the clothing of a person other than the speaker are the same, and, if so, determine that person as the related person.
In addition, when the clothing of the speaker and that of a person other than the speaker are the same, and that clothing differs from the clothing of the other surrounding people, the unmanned mobile object 100 may determine the person wearing the same clothing as the speaker as the related person.
Conversely, when many people wear the same clothing, the unmanned mobile object 100 may determine that those people are not related persons. More specifically, when most of the surrounding people wear business suits and ties, there is a possibility that these people are unrelated to the speaker. Therefore, by determining that a large number of people wearing the same clothing are not related persons, the unmanned mobile object 100 can suppress determination errors.
Fig. 18 is a conceptual diagram showing an example of a related person who is present in a predetermined area together with the speaker. The unmanned mobile object 100 may determine a person present in the predetermined area together with the speaker as the related person. Here, the predetermined area may be a pre-registered location that the speaker uses together with the related person. Specifically, as shown in fig. 18, the predetermined area may be a place where a bench is installed. The predetermined area may also be the surroundings of a single table, a conference room, or a vehicle, such as a boat, that only a small number of people can board.
Fig. 19 is a conceptual diagram showing an example of a person concerned who is close to a speaker. The unmanned mobile 100 may determine a person approaching the speaker as the relevant person.
For example, the unmanned mobile 100 may detect the position of the speaker and the positions of persons other than the speaker as needed, detect a person near the speaker, and determine the person near the speaker as a relevant person. The person close to the speaker is highly likely to be a person related to the speaker, and can be estimated as intending to listen to the sound emitted from the unmanned mobile body 100. Therefore, the unmanned mobile unit 100 can appropriately output a voice to a person who is close to the speaker by determining the person as a relevant person.
Further, the unmanned mobile 100 may determine that a person other than the speaker is a related person when the person approaches within a predetermined range with respect to the speaker. Further, the unmanned mobile 100 may determine that a person other than the speaker is a related person when a predetermined time has elapsed since the person approaches a state within a predetermined range with respect to the speaker.
Fig. 20 is a conceptual diagram illustrating an example of related persons existing in the range of speech transmission of a speaker. The unmanned mobile 100 may determine a person existing in the range of the speaker's voice transmission as the relevant person. For example, the unmanned mobile unit 100 estimates the range of transmission of a voice uttered by a speaker from the sound pressure of the voice uttered by the speaker. Then, the unmanned mobile unit 100 determines a person existing in the estimated range as a relevant person. In this example, the threshold value corresponding to the example described with reference to fig. 16 is determined according to the sound pressure of the voice uttered by the speaker.
Fig. 21 is a conceptual diagram illustrating an example of movement in which a person related to speech transmission of a speaker is included in a voice output range.
During a conversation, the unmanned mobile object 100 responds to what the speaker says. If a person other than the speaker hears only the response of the unmanned mobile object 100 without having heard the speaker's utterance, it is difficult for that person to understand the meaning of the response. Therefore, the unmanned mobile object 100 determines a person within the range of the speaker's voice transmission as the related person, and moves so that such a person is included in the sound output range.
This can prevent a person other than the speaker from being confused by hearing only one side of the exchange.
Fig. 22 is a conceptual diagram showing an example of a person involved in a conversation with the unmanned mobile 100, which is different from a speaker. When a person other than the speaker speaks into the unmanned mobile 100 while the unmanned mobile 100 is in conversation with the speaker, the person speaking into the unmanned mobile 100 may be determined as the relevant person. For example, when the unmanned mobile unit 100 detects a voice or speech from a direction different from the speaker while a conversation is being conducted with the speaker using the directional microphone 108, it determines a person present in the direction as a relevant person.
For example, when a speech different from the speech of the speaker is detected while the unmanned mobile object 100 is in conversation with the speaker, the person who uttered the speech may be determined as the relevant person. In this case, a non-directional microphone may also be used. Further, since it can be estimated that the relevant person is located near the speaker, the unmanned mobile object 100 may determine the person who uttered the voice as the relevant person when a voice different from the voice of the speaker is detected from the same direction as the speaker while the conversation is being performed with the speaker by the directional microphone 108.
For example, when a person other than the speaker speaks to the unmanned mobile object 100 about content that follows the context of the conversation while the unmanned mobile object 100 is conversing with the speaker, that person may be determined as the related person. Conversely, when a person other than the speaker speaks to the unmanned mobile object 100 about content that does not follow the context of the conversation, the unmanned mobile object 100 may determine that the person is not a related person.
Any one of the plurality of criteria described with reference to fig. 13 to 22 may be used, or a combination of any two or more of these criteria may be used. When each of the plurality of persons is determined as the relevant person, the unmanned mobile unit 100 may select the relevant person from the plurality of persons determined as the relevant person. That is, the unmanned mobile object 100 may select a final relevant person to transmit sound from among a plurality of relevant persons.
For example, the unmanned mobile unit 100 may select, as the final relevant person, a relevant person closest to the speaker from among the plurality of relevant persons.
For example, the unmanned mobile body 100 may select one or more relevant persons so that the number of relevant persons entering the sound output range is the largest. More specifically, for example, the unmanned mobile object 100 may select one or more relevant persons so that the number of relevant persons present on a straight line passing through the position of the speaker is the largest. Accordingly, the unmanned mobile body 100 can appropriately output sound to a larger number of persons concerned.
For example, the unmanned mobile object 100 may select, from among the plurality of persons each determined to be a related person, a person determined with high accuracy as the final related person.
Specifically, for example, the unmanned mobile object 100 may select the final related person based on an accuracy level predetermined for each determination criterion. Here, the accuracy level may be predetermined to be high for the criterion of whether the person is in contact with the speaker (figs. 13 and 14), the criterion of whether the person is in conversation with the speaker (fig. 15), and the like. The accuracy level may be predetermined to be medium for, for example, the criterion of whether the person wears the same clothing as the speaker (fig. 17) and the criterion of whether the person is present in a predetermined area together with the speaker (fig. 18).
Further, the unmanned mobile object 100 may select, as the final relevant person, a person determined as the relevant person with a higher accuracy of the determination criterion, from among a plurality of persons determined as the relevant persons, respectively.
Alternatively, the unmanned mobile object 100 may select, as the final related person, the person who satisfies the largest number of the plurality of determination criteria. For example, when the condition of being close to the speaker (fig. 16), the condition of having been close to the speaker for a predetermined time or longer (fig. 16), and the condition of wearing the same clothing as the speaker (fig. 17) are all satisfied, the number of satisfied conditions is 3. The final related person may be selected based on the number counted in this way.
Alternatively, the unmanned mobile unit 100 may evaluate the number of satisfied conditions by weighting the number of satisfied conditions by an accuracy level predetermined for each determination criterion.
Alternatively, the unmanned mobile object 100 may select, as the final related person, a person to whom sound can be transmitted by moving only within a predetermined range, such as the region on the front side of the speaker, from among the plurality of persons each determined to be a related person. Accordingly, the speaker can continue the conversation appropriately. Since this keeps the unmanned mobile object 100 from moving widely, the speaker can continue the conversation without being distracted by its movement.
Alternatively, the unmanned mobile object 100 may select, from among the plurality of persons each determined to be a related person, a person for whom sound output and sound collection can be appropriately performed as the final related person. That is, the unmanned mobile object 100 may select a person suitable for sound output and sound collection; in other words, a related person appropriate for a conversation with the unmanned mobile object 100.
Fig. 23 is a conceptual diagram showing an example of a person who is suitable for sound output and sound collection. The unmanned mobile 100 selects, as the final relevant person, a person existing in the sound output range where sound is transmitted from the directional speaker 107 and existing in the sound collection range where sound is collected by the directional microphone 108. Here, the sound collection range may be determined based on a sound pressure of a human voice set to be an average in advance. The sound pickup range can also be expressed as a human voice pickup range.
For example, the overlapping range of the sound output range and the sound collection range is determined as the conversation range. The unmanned mobile object 100 selects a person included in the conversation range as the final related person.
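One way to picture the conversation range is to model the sound output range and the sound collection range as sectors anchored at the unmanned mobile object 100 and take their overlap. The sector parameters and all names in the following sketch are assumptions, not values from the embodiment.

```python
import math

def in_sector(point, apex, direction_rad, half_angle_rad, reach_m) -> bool:
    """True if `point` lies inside the sector anchored at `apex`."""
    dx, dy = point[0] - apex[0], point[1] - apex[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > reach_m:
        return dist == 0
    angle = abs(math.atan2(dy, dx) - direction_rad)
    angle = min(angle, 2 * math.pi - angle)  # wrap to [0, pi]
    return angle <= half_angle_rad

def in_conversation_range(person, drone, direction_rad,
                          output_reach_m=4.0, pickup_reach_m=3.0,
                          half_angle_rad=math.radians(15)) -> bool:
    # The person must be inside both the output sector and the pickup sector.
    return (in_sector(person, drone, direction_rad, half_angle_rad, output_reach_m)
            and in_sector(person, drone, direction_rad, half_angle_rad, pickup_reach_m))

print(in_conversation_range((2.0, 0.2), (0.0, 0.0), 0.0))  # True
print(in_conversation_range((3.5, 0.0), (0.0, 0.0), 0.0))  # False: outside pickup
```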
Further, as the unmanned mobile unit 100 moves, the sound output range and the sound collection range change, and the conversation range also changes. Therefore, the unmanned mobile 100 may simulate the movement of the unmanned mobile 100 and select, as the final relevant person, a relevant person that can enter the conversation area together with the speaker from among the plurality of relevant persons.
A plurality of the screening methods described above may be combined as appropriate. Although fig. 23 shows an example of screening among a plurality of related persons, the same idea may also be applied to the determination of related persons. That is, the unmanned mobile object 100 may determine a person present at a position suitable for conversation as the related person. Further, another screening method may be used to finally determine the related person from among a plurality of related-person candidates.
Although not shown in fig. 13 to 23, the unmanned moving object 100 may determine whether or not a person other than the speaker is a person related to the speaker by face recognition. For example, the speaker and the face of the person related to the speaker may be managed in advance in association with each other. Further, the unmanned mobile 100 may determine a person other than the speaker as the relevant person when the face of the person matches the face associated with the speaker as the face of the relevant person of the speaker. The unmanned mobile object 100 may use other features such as physical characteristics, in addition to the face.
Further, when the person other than the speaker is facing the unmanned mobile vehicle 100 while the unmanned mobile vehicle 100 is in conversation with the speaker, it can be estimated that the person other than the speaker is interested in the conversation between the unmanned mobile vehicle 100 and the speaker. Therefore, in this case, the unmanned mobile unit 100 may determine the person as the relevant person.
The unmanned mobile unit 100 determines the position of the unmanned mobile unit 100 as a voice output position so that the speaker and the related person can be included in the voice output range. For example, the audio output position determination unit 124 of the unmanned mobile object 100 determines the audio output position. Hereinafter, a more specific method for determining the audio output position will be described with reference to fig. 24 to 38.
Fig. 24 is a conceptual diagram showing an example of a sound output position on a straight line passing through the position of the speaker and the position of the related person. In this example, the unmanned mobile object 100 determines, as the sound output position, a position that is on a straight line passing through the position of the speaker and the position of the related person, and whose associated sound output range includes the positions of both the speaker and the related person. Accordingly, the unmanned mobile object 100 can appropriately output sound to the speaker and the related person along the sound output direction.
Fig. 25 is a conceptual diagram illustrating an example of a sound output position close to a speaker. For example, the unmanned mobile object 100 outputs a voice from a voice output position outside the speaker and the relevant person toward the speaker and the relevant person on a straight line passing through the position of the speaker and the position of the relevant person. In the example of fig. 25, the unmanned mobile object 100 outputs a voice at a voice output position on the speaker side. That is, the unmanned mobile object 100 determines a position close to the speaker as the voice output position.
It can be estimated that the predetermined speaker converses with the unmanned mobile object 100 more often than the related person does. Further, if the related person is positioned between the unmanned mobile object 100 and the speaker, the related person may interfere with the conversation between the unmanned mobile object 100 and the speaker. Therefore, by determining a position close to the speaker as the sound output position, the more frequent conversation can be carried out smoothly.
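The placement of figs. 24 and 25 can be sketched as simple 2-D geometry: extend the line from the related person through the speaker and place the unmanned mobile object 100 at the minimum standoff beyond the speaker. The standoff, reach, and names below are assumptions.

```python
import math

def on_line_output_position(speaker, related, standoff_m=0.5, reach_m=4.0):
    """speaker/related are (x, y); returns a position on the speaker's side
    of the line, or None if the related person falls outside the reach."""
    dx, dy = speaker[0] - related[0], speaker[1] - related[1]
    separation = math.hypot(dx, dy)
    ux, uy = dx / separation, dy / separation  # unit vector related -> speaker
    if standoff_m + separation > reach_m:
        return None  # related person would fall outside the output range
    # Extend beyond the speaker by the standoff distance.
    return (speaker[0] + ux * standoff_m, speaker[1] + uy * standoff_m)

print(on_line_output_position((0.0, 0.0), (3.0, 0.0)))  # (-0.5, 0.0)
```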
Fig. 26 is a conceptual diagram illustrating an example of a sound output position near an elderly person. The unmanned mobile 100 may determine a position close to the elderly as the sound output position instead of the position close to the speaker. For example, when the speaker is not determined in advance, the unmanned mobile object 100 determines a position close to the elderly as the sound output position. The unmanned mobile 100 may estimate the age by face recognition.
Hearing can be estimated as decreasing with age. The unmanned mobile body 100 determines a position close to the elderly person as a sound output position, and can transmit sound of a higher sound pressure to the elderly person. Therefore, the unmanned mobile unit 100 can compensate for the reduced hearing.
In the case of a parent and child, a position close to the older person, that is, a position close to the parent, is determined as the sound output position. This also keeps the child away from the unmanned mobile object 100.
Further, the unmanned mobile unit 100 may determine that a person whose estimated age is equal to or greater than a predetermined age is an elderly person. Further, when it is determined that one of the speaker and the related person is an elderly person, the unmanned moving object 100 may determine a position close to the elderly person as the sound output position. When both the speaker and the relevant person are determined to be elderly persons, the unmanned moving object 100 may determine a position at an equal distance from both the speaker and the relevant person as the sound output position, or may determine the sound output position based on other conditions.
Fig. 27 is a conceptual diagram illustrating correction of the sound output position toward the front side along a circle centered on the related person. As shown in the upper side of fig. 27, the unmanned mobile object 100 can transmit sound to the speaker and the related person even from a position to their side. On the other hand, when the unmanned mobile object 100 is on the front side of the speaker and the related person rather than to their side, they can converse with the unmanned mobile object 100 more easily.
That is, by being positioned on the front side of the speaker and the related person, the unmanned mobile object 100 can provide them with a smooth conversation. Therefore, the unmanned mobile object 100 may correct the sound output position toward the front side of the speaker and the related person.
Specifically, as shown in the lower side of fig. 27, the unmanned mobile object 100 may correct the sound output position to the front side of the speaker and the relevant person along a circle estimated with the relevant person as the center. Accordingly, the unmanned mobile object 100 can correct the sound output position without changing the distance to the relevant person.
The unmanned mobile object 100 may instead correct the sound output position along a circle centered on the speaker. Accordingly, the unmanned mobile object 100 can correct the sound output position without changing the distance to the speaker. However, by using a circle centered on whichever of the speaker and the related person is farther from it, the unmanned mobile object 100 can suppress the variation in its distance to each of the speaker and the related person.
The unmanned mobile object 100 may correct the sound output position along the circle, move toward the front direction of at least one of the speaker and the related person, and direct the sound output direction toward at least one of them. The unmanned mobile object 100 may correct the sound output position to a position in that front direction for performing such an operation.
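The correction of fig. 27 amounts to rotating the sound output position about the assumed circle center while keeping the radius. A minimal sketch, with the 2-D model and names as assumptions:

```python
import math

def correct_along_circle(position, centre, angle_rad):
    """Rotate `position` about `centre` by `angle_rad` (positive = CCW);
    the distance to the centre is preserved."""
    dx, dy = position[0] - centre[0], position[1] - centre[1]
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return (centre[0] + dx * cos_a - dy * sin_a,
            centre[1] + dx * sin_a + dy * cos_a)

# Move a quarter turn from the lateral position toward the front side.
corrected = correct_along_circle((2.0, 0.0), (0.0, 0.0), math.pi / 2)
print(corrected)  # approximately (0.0, 2.0); distance to the centre stays 2 m
```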
Fig. 28 is a conceptual diagram showing an example of a voice output position determined by a speaker so that the relevant person is included in the voice output range. For example, as shown in the upper side of fig. 28, the unmanned mobile 100 exists on the front side of the speaker during conversation with the speaker. Then, as shown in the lower side of fig. 28, the unmanned mobile object 100 may move along a circle estimated by the speaker as the center so that the person is included in the sound output range. In this case, the unmanned mobile object 100 may determine the voice output position along a circle estimated by the speaker as the center.
Accordingly, the unmanned mobile object 100 can move to a position where it can transmit voice to the speaker and the person concerned without changing the distance to the speaker.
Fig. 29 is a conceptual diagram illustrating an example of a sound output position on the front side of the speaker and the related person. When the distance between the speaker and the related person in the lateral direction, perpendicular to their front direction, is within the directivity width, the unmanned mobile object 100 may determine a position on the front side of the speaker and the related person as the sound output position.
Accordingly, the unmanned mobile unit 100 can output a voice to the speaker and the related person from the front side positions of the speaker and the related person. That is, the unmanned mobile 100 can perform a conversation from the front side positions of the speaker and the relevant person. Therefore, the unmanned mobile 100 can provide a smooth conversation to the speaker and the related person.
Further, it can be estimated that the front side positions of the speaker and the related person are more suitable for conversation than the lateral positions of the speaker and the related person. Therefore, the unmanned mobile unit 100 may preferentially determine the front side positions of the speaker and the relevant person as the sound output positions.
Fig. 30 is a conceptual diagram illustrating an example of sound output positions on a straight line in an oblique direction with respect to a horizontal plane. For example, the unmanned mobile object 100 may obtain the body information of the speaker and the body information of the person through image recognition processing, face recognition processing, or the like. The unmanned mobile 100 may determine the voice output position based on the body information of the speaker and the body information of the person concerned. The body information may be height or face height.
Specifically, when the height of the face of the speaker is far from the height of the face of the person concerned, the unmanned mobile object 100 determines, as the sound output position, a position on a straight line passing through the position of the face of the speaker and the position of the face of the person concerned, that is, a position where the speaker and the person concerned are included in the sound output range. In this case, the unmanned mobile body 100 outputs sound along a sound output direction inclined with respect to the horizontal plane.
Accordingly, the unmanned mobile object 100 can appropriately output the voice to the speaker and the related person along the voice output direction.
Examples of the case where the height of the face of the speaker is far from the height of the face of the person concerned include a case where the speaker and the person concerned are a parent and a child, and a case where the speaker and the person concerned are a person who takes a wheelchair and a person who pushes a wheelchair. Also, fig. 30 shows an example in which a parent is a speaker and a child is a related person, but the speaker and the related person may be reversed.
For oblique sound output, two cases can be considered: outputting sound from a low position toward a high position, and outputting sound from a high position toward a low position. When sound is output from low to high, the flight altitude is low, which makes flight difficult and raises the possibility of contact with a person; the unmanned mobile object 100 also comes close to a small child. Therefore, sound output may be performed from high to low. This can suppress, for example, the possibility of a collision.
Fig. 31 is a conceptual diagram illustrating an example of a sound output position on a horizontal straight line. When the face of the speaker and the face of the related person both fit within the directivity width of the sound output, the unmanned mobile object 100 may determine, as the sound output position, a position at which both faces are included in the sound output range and from which sound is output in the horizontal direction. Conversely, only when the height of the speaker's face and the height of the related person's face differ beyond a predetermined range may the unmanned mobile object 100 determine a position for oblique sound output as the sound output position, as shown in fig. 30.
In other words, when the difference between the height of the speaker's face and the height of the related person's face is within the predetermined range, the unmanned mobile object 100 need not change its height, which simplifies the processing. However, by raising its height, the unmanned mobile object 100 can reduce the possibility of a collision or the like and provide a smoother conversation.
Fig. 32 is a conceptual diagram showing an example of the sound output position at the same height as the speaker and the related person. As described above, the unmanned mobile object 100 may determine a position for outputting sound in the horizontal direction as the sound output position. Accordingly, the process can be simplified.
However, in this case, the unmanned mobile object 100 may come into contact with a person. Further, a person far from the unmanned mobile object 100 must converse with it over a person near it, which makes conversation difficult. Specifically, in the example of fig. 32, the related person converses with the unmanned mobile object 100 over the speaker, and the conversation is therefore difficult.
Fig. 33 is a conceptual diagram showing an example of a higher sound output position than a speaker and a related person. The unmanned mobile 100 may preferentially determine a position higher than the speaker and the relevant person as the sound output position. Accordingly, the unmanned mobile body 100 can suppress the possibility of collision or the like. Further, the unmanned mobile 100 can provide a smooth conversation to a person close to the unmanned mobile 100 as well as to a person far from the unmanned mobile 100.
Fig. 34 is a conceptual diagram illustrating an example of the height of the sound output position. If the sound output position is too high, the angle at which the speaker and the related person look up at the unmanned mobile object 100 becomes too large. The speaker and the related person must then converse while looking up at the unmanned mobile object 100, which makes smooth conversation difficult.
Therefore, an upper limit may be set on the height of the sound output position or on the angle between the sound output direction and the horizontal plane. For example, the upper limit of the height of the sound output position may be set according to the distance between the unmanned mobile object 100 and whichever of the speaker and the related person is closer to it: the shorter this distance, the lower the upper limit is set. Accordingly, the angle at which the speaker and the related person look up at the unmanned mobile object 100 can be kept small.
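The height cap can be sketched as simple trigonometry: given a maximum look-up angle, the permissible height grows with the horizontal distance to the nearer person. The maximum angle, the assumed eye height, and the names below are illustrative assumptions.

```python
import math

MAX_LOOKUP_ANGLE_DEG = 30.0  # assumed cap on the look-up angle

def max_output_height(horizontal_distance_m: float,
                      person_height_m: float = 1.6) -> float:
    """Highest permissible sound output position above ground, given the
    horizontal distance to the nearer of the speaker and the related person."""
    rise = horizontal_distance_m * math.tan(math.radians(MAX_LOOKUP_ANGLE_DEG))
    return person_height_m + rise  # the closer the person, the lower the cap

print(f"{max_output_height(1.0):.2f} m")  # ~2.18 m at 1 m distance
print(f"{max_output_height(3.0):.2f} m")  # ~3.33 m at 3 m distance
```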
Fig. 35 is a conceptual diagram showing an example of a sound output position for excluding a non-relevant person from a sound output range. When it is determined that a person other than the speaker is not the relevant person, the unmanned mobile object 100 may determine the voice output position so that the person determined not to be the relevant person is not included in the voice output range. That is, the unmanned mobile object 100 may determine the voice output position so that the non-relevant person is not included in the voice output range when it is determined that the person other than the speaker is the non-relevant person.
For example, the unmanned mobile vehicle 100 determines the sound output position so that the distance between the unmanned mobile vehicle 100 and the person not concerned increases, and moves to the sound output position. Accordingly, the unmanned mobile body 100 can transmit the sound to the non-relevant person with difficulty.
For example, the unmanned mobile object 100 determines the voice output position in a range where voice is transmitted to a speaker without being transmitted to a person other than the relevant person. That is, the unmanned mobile object 100 determines the voice output position so that the irrelevant person is not included in the voice output range and the speaker is included in the voice output range. Accordingly, the unmanned mobile 100 can output a voice to a speaker without outputting a voice to a person other than the person concerned.
The unmanned mobile 100 may determine the voice output position so that the person other than the relevant person is not included in the voice output range and the speaker and the relevant person are included in the voice output range. Accordingly, the unmanned mobile 100 can output the voice to the speaker and the relevant person without outputting the voice to the non-relevant person.
Fig. 36 is a conceptual diagram illustrating an example of the positional relationship, on the horizontal plane, of the non-related person, the speaker, and the unmanned mobile object 100. For example, the unmanned mobile object 100 may determine a position far from the non-related person as the sound output position, as in the upper example of fig. 36. However, in the upper example of fig. 36, the non-related person lies in the sound output direction, and therefore the sound may still reach the non-related person.
Therefore, the unmanned mobile object 100 may determine the sound output position so that the person other than the relevant person deviates from the sound output direction, as in the lower example of fig. 36. Specifically, the unmanned mobile object 100 may determine, as the sound output position, a position that is not included on a straight line passing through the position of the speaker and the position of the person concerned. Accordingly, the unmanned mobile body 100 can suppress the possibility of sound transmission to the non-relevant person.
In the lower example of fig. 36, the unmanned mobile vehicle 100 may determine the voice output position so that the person other than the relevant person is not included in the voice output range and the speaker and the relevant person are included in the voice output range.
Fig. 37 is a conceptual diagram illustrating an example of the positional relationship on the vertical plane of the non-relevant person, the speaker, and the unmanned mobile object. When the voice is output in the horizontal direction from the unmanned mobile object 100, the possibility that the non-relevant person enters the voice output range or the voice output direction is high, and the voice may be transmitted to the non-relevant person. Therefore, the unmanned mobile 100 may output the voice to the speaker from above the speaker. Accordingly, the unmanned mobile body 100 can suppress the possibility that the non-relevant person enters the sound output range or the sound output direction, and can suppress the possibility that the sound is transmitted to the non-relevant person.
The unmanned mobile 100 may determine the height of the voice output position so that the non-related person is not included in the voice output range and the speaker and the related person are included in the voice output range.
Fig. 38 is a conceptual diagram illustrating an example of a sound output position for excluding other persons from the sound output range. In order to exclude other persons from the sound output range, a position at which the speaker is sandwiched between the unmanned mobile object 100 and an obstacle may be determined as the sound output position. The unmanned mobile object 100 may then move to that sound output position and output sound to the speaker. Accordingly, the unmanned mobile object 100 can suppress the possibility of the sound reaching other persons.
Here, the obstacle is, for example, a physical environment that hinders other people from entering the sound output range. The obstacle may be a physical environment that hinders expansion of the sound output range, or may be a physical environment that a person cannot pass through. Specifically, the obstacle may be a wall, a building, or a cliff.
The unmanned mobile body 100 may detect the position of an obstacle by image recognition processing, or may detect the position of an obstacle by an obstacle detection sensor not shown in the figure.
The unmanned mobile body 100 may specify the position of the obstacle from map information including the position of the obstacle. The map information may be stored in advance in the storage unit 130 of the unmanned mobile vehicle 100, or may be input from an external device to the unmanned mobile vehicle 100 by the communication unit 110 of the unmanned mobile vehicle 100. Further, the unmanned mobile body 100 may detect the position of the unmanned mobile body 100 and detect the position of the obstacle from the map information.
For example, in the upper side of fig. 38, there is no obstacle on the opposite side across the speaker from the unmanned mobile body 100, and therefore there is a possibility that another person enters the sound output range. On the other hand, in the lower side of fig. 38, since an obstacle such as a wall exists on the opposite side from the unmanned mobile object 100 across the speaker, it is possible to suppress the possibility of another person entering the sound output range.
In addition, the related person may be taken into account as well as the speaker. Specifically, a position at which the speaker and the related person are sandwiched between the unmanned mobile object 100 and an obstacle is determined as the sound output position. Accordingly, the unmanned mobile object 100 can output sound to the speaker and the related person without the sound reaching other persons.
As for the method of determining the sound output position, any one of the plurality of determination methods described with reference to fig. 24 to 38 may be employed, or a combination of two or more of these determination methods may be employed. Next, a plurality of examples of the movement of the unmanned mobile body 100 and the like will be described.
Fig. 39 is a conceptual diagram illustrating an example in which the unmanned mobile body 100 moves to the sound output position. For example, when the unmanned mobile body 100 moves to the sound output position while outputting sound to the speaker, it moves in such a way that the speaker does not leave the sound output range during the movement. Accordingly, the unmanned mobile body 100 can continue to transmit sound to the speaker.
Specifically, in this case, the unmanned mobile body 100 moves to the sound output position while keeping the directional speaker 107 directed toward the speaker. The unmanned mobile body 100 moves within a predetermined distance from the speaker, where the predetermined distance corresponds to the length of the sound output range in the sound output direction. The unmanned mobile body 100 may create a movement path that stays within the predetermined distance from the speaker and move to the sound output position along the created path. Accordingly, the unmanned mobile body 100 can move to the sound output position without the speaker leaving the sound output range during the movement.
The unmanned mobile body 100 may change the sound pressure of the sound it emits according to the distance between the unmanned mobile body 100 and the speaker, so that the sound pressure of the sound reaching the speaker is kept constant while the unmanned mobile body 100 moves. For example, when the unmanned mobile body 100 moves away from the speaker, it may increase the sound pressure of the emitted sound while moving. Conversely, when the unmanned mobile body 100 approaches the speaker, it may reduce the sound pressure of the emitted sound while moving.
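For a concrete illustration of this adjustment, the required emission level can be derived from a distance-attenuation model. The sketch below assumes free-field inverse-distance attenuation (about 6 dB of loss per doubling of distance); the patent does not specify an attenuation model, so the formula and the numeric values are illustrative assumptions.

```python
import math

def required_output_spl(target_spl_db: float, distance_m: float,
                        ref_distance_m: float = 1.0) -> float:
    """Level the directional speaker must produce at the reference distance
    so that a listener at distance_m still receives target_spl_db, under
    free-field inverse-distance attenuation."""
    return target_spl_db + 20.0 * math.log10(distance_m / ref_distance_m)

# Keeping 60 dB at the speaker's position while the distance changes:
for d in (1.0, 2.0, 4.0):
    print(f"{d:.0f} m -> emit {required_output_spl(60.0, d):.1f} dB at 1 m")
# 1 m -> 60.0 dB, 2 m -> 66.0 dB, 4 m -> 72.0 dB
```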
Fig. 40 is a conceptual diagram illustrating an example in which the unmanned mobile body 100 starts sound output and then moves to a sound output position. For example, the unmanned mobile body 100 moves to the sound output position while outputting sound to the speaker. That is, the unmanned mobile body 100 starts sound output and then moves to the sound output position. In this case, the related person enters the sound output range partway through the sound output, and it is therefore difficult for the related person to grasp the beginning of the output content.
Therefore, the unmanned mobile body 100 may control the timing of its movement according to the conversation between the unmanned mobile body 100 and the speaker.
Specifically, the unmanned mobile body 100 may move to the sound output position while the speaker is speaking to the unmanned mobile body 100. It can be estimated that the unmanned mobile body 100 does not output sound while the speaker is speaking to it. Therefore, the unmanned mobile body 100 can avoid moving to the sound output position during sound output, and can suppress the related person from entering the sound output range partway through the sound output.
For example, the unmanned mobile body 100 may determine whether or not the speaker is speaking to the unmanned mobile body 100 by image recognition processing, or by means of the directional microphone 108.
The unmanned mobile body 100 may also move to the sound output position while sound is being collected from the speaker. While sound is being collected from the speaker, it can be estimated that the speaker is speaking to the unmanned mobile body 100 and that the unmanned mobile body 100 is not outputting sound. Therefore, by moving to the sound output position while collecting sound from the speaker, the unmanned mobile body 100 can suppress the related person from entering the sound output range partway through the sound output.
The unmanned mobile body 100 may also control whether or not to move according to the condition of the sound collected by the directional microphone 108. Specifically, when the condition of the sound collected by the directional microphone 108 is poor, the unmanned mobile body 100 does not move. Accordingly, the unmanned mobile body 100 can prevent the condition of the collected sound from deteriorating further as it moves.
For example, when the conversation between the speaker and the unmanned mobile body 100 ends after the unmanned mobile body 100 has started moving to the sound output position but before it reaches the sound output position, the unmanned mobile body 100 continues moving to the sound output position. The unmanned mobile body 100 then outputs sound after reaching the sound output position. Accordingly, the unmanned mobile body 100 can suppress the related person from entering the sound output range partway through the sound output.
For example, when the moving distance is long, the unmanned mobile body 100 may move to the sound output position in stages. Specifically, the unmanned mobile body 100 may repeatedly move and stop on the way to the sound output position, and may output sound while stopped. Accordingly, the unmanned mobile body 100 can suppress the related person from entering the sound output range during any single sound output, and can also suppress delays in its responses to the speaker.
For example, the unmanned mobile body 100 may move to the sound output position while the conversation between the unmanned mobile body 100 and the speaker is temporarily interrupted. Accordingly, the unmanned mobile body 100 can suppress the related person from entering the sound output range partway through the sound output, and can suppress degradation of the sound output to, and sound collection from, the speaker.
The unmanned mobile body 100 may also move to the sound output position while neither sound output nor sound collection is being performed. Accordingly, the unmanned mobile body 100 can suppress the related person from entering the sound output range partway through the sound output, and can suppress degradation of sound output and sound collection.
For example, when the conversation between the unmanned mobile body 100 and the speaker has ended, neither sound output nor sound collection is performed, so the unmanned mobile body 100 may stop moving. The unmanned mobile body 100 may recognize the content of the conversation in order to distinguish whether the conversation between the unmanned mobile body 100 and the speaker is temporarily interrupted or has ended.
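The movement-timing rules described above can be summarized as a small policy function. The state names and the rule that poor sound collection blocks movement follow the description; the structure and identifiers are a hypothetical sketch, not the patent's implementation.

```python
from enum import Enum, auto

class ConversationState(Enum):
    DRONE_OUTPUTTING = auto()  # sound output in progress
    PERSON_SPEAKING = auto()   # sound being collected from the speaker
    PAUSED = auto()            # conversation temporarily interrupted
    ENDED = auto()             # conversation finished

def may_move_to_output_position(state: ConversationState,
                                collected_sound_ok: bool = True) -> bool:
    """Move while the speaker is talking or during a pause; do not move
    while outputting sound, after the conversation has ended, or when
    moving would worsen already-poor sound collection."""
    if state is ConversationState.DRONE_OUTPUTTING:
        return False
    if state is ConversationState.ENDED:
        return False
    if state is ConversationState.PERSON_SPEAKING and not collected_sound_ok:
        return False
    return True

print(may_move_to_output_position(ConversationState.PERSON_SPEAKING))   # True
print(may_move_to_output_position(ConversationState.DRONE_OUTPUTTING))  # False
```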
Fig. 41 is a conceptual diagram illustrating an example in which the unmanned mobile body 100 moves to the sound output position via the front side of the speaker. For example, the unmanned mobile body 100 moves to the sound output position through the front side of the speaker. The front side of the speaker corresponds to the speaker's visual field. When the unmanned mobile body 100 is outside the speaker's visual field, it is difficult for the speaker to converse with the unmanned mobile body 100. By moving to the sound output position through the front side of the speaker, the unmanned mobile body 100 can maintain a smooth conversation with the speaker while moving.
Specifically, the unmanned mobile body 100 may detect the front side of the speaker by image recognition processing and thereby specify the speaker's visual field. The unmanned mobile body 100 may create a movement path within the specified visual field and move to the sound output position along the created path.
In the above description, the unmanned mobile body 100 moves to the sound output position through the front side of the speaker, but it may instead move through the front side of both the speaker and the related person. Accordingly, the unmanned mobile body 100 can also provide a smooth conversation to the related person.
Fig. 42 is a conceptual diagram illustrating an example in which the unmanned mobile body 100 changes the sound output range. The unmanned mobile body 100 may adjust the sound output range so that the speaker and the related person are included in it. Specifically, the unmanned mobile body 100 may adjust the sound output range by adjusting the sound pressure of the sound emitted from the directional speaker 107.
As in the upper example of fig. 42, when the certainty that a person other than the speaker is a related person is moderate, the unmanned mobile body 100 moves to a sound output position at which both the speaker and the other person are in the sound output direction. The unmanned mobile body 100 then adjusts the sound pressure of the sound emitted from the directional speaker 107 so that the sound reaches the speaker but does not reach the other person. That is, the unmanned mobile body 100 reduces the sound pressure of the sound emitted from the directional speaker 107.
Further, as in the lower example of fig. 42, when the certainty that the person other than the speaker is a related person is high, the unmanned mobile body 100 adjusts the sound pressure of the sound emitted from the directional speaker 107 so that the sound also reaches that person. That is, the unmanned mobile body 100 increases the sound pressure of the sound emitted from the directional speaker 107.
Accordingly, even when the certainty that the person other than the speaker is a related person increases, the unmanned mobile body 100 can immediately output sound to that person without moving. Alternatively, the unmanned mobile body 100 may move in the sound output direction instead of increasing the sound pressure, which suppresses the increase in power consumption that raising the sound pressure would cause.
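With both persons on the output axis, the adjustment in fig. 42 amounts to choosing how far the sound should reach. A minimal sketch, again under the assumed inverse-distance model and with an assumed intelligibility threshold:

```python
import math

AUDIBLE_DB = 50.0  # assumed level at which the output speech is intelligible

def choose_output_level(dist_to_speaker_m: float, dist_to_other_m: float,
                        other_is_related: bool) -> float:
    """Source level (dB at 1 m) so that the sound reaches only the speaker
    while the other person's relatedness is uncertain, or also reaches the
    farther, related person once the certainty is high."""
    reach_m = dist_to_other_m if other_is_related else dist_to_speaker_m
    return AUDIBLE_DB + 20.0 * math.log10(reach_m)

print(round(choose_output_level(2.0, 5.0, False), 1))  # 56.0: speaker only
print(round(choose_output_level(2.0, 5.0, True), 1))   # 64.0: reaches both
```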
Fig. 43 is a conceptual diagram illustrating an example of selecting between moving and changing the sound output range. When both the speaker and the related person are positioned in the sound output direction, the unmanned mobile body 100 can select whether to expand the sound output range or to move in the sound output direction. That is, the unmanned mobile body 100 can include the speaker and the related person in the sound output range either by expanding the sound output range or by moving in the sound output direction.
However, when the unmanned mobile body 100 expands the sound output range, the sound pressure of the sound it emits increases, and power consumption can be estimated to increase accordingly. Therefore, the unmanned mobile body 100 may give priority to moving in the sound output direction over expanding the sound output range.
When the unmanned mobile body 100 is too close to the speaker, it may come into contact with the speaker. Furthermore, when the unmanned mobile body 100 is too close to the speaker, the sound reaching the speaker may be too loud. Therefore, the unmanned mobile body 100 may move in the sound output direction to the position closest to the speaker that these constraints allow. When the related person is still not included in the sound output range in this state, the unmanned mobile body 100 may then expand the sound output range. Accordingly, the unmanned mobile body 100 can appropriately output sound to the speaker and the related person.
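This selection can be expressed as a simple priority rule. The minimum approach distance below is an assumed constraint standing in for the contact and loudness concerns; the rule itself follows the order of preference described above.

```python
def plan_for_related_person(related_in_range: bool,
                            dist_to_speaker_m: float,
                            min_approach_m: float = 1.0) -> str:
    """Prefer moving along the output direction over expanding the output
    range (which raises sound pressure and power consumption); expand only
    once the drone is as close to the speaker as allowed."""
    if related_in_range:
        return "hold position"
    if dist_to_speaker_m > min_approach_m:
        return "move closer along the output direction"
    return "expand the sound output range"

print(plan_for_related_person(False, 3.0))  # move closer first
print(plan_for_related_person(False, 1.0))  # only then expand the range
```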
Fig. 44 is a conceptual diagram illustrating an example of a case where the related person leaves the sound output range. For example, when the related person leaves the sound output range, more specifically, when the related person leaves the sound output range of his or her own accord, it can be estimated that the related person has no intention of conversing with the unmanned mobile body 100.
Therefore, for example, in the above case, the unmanned mobile body 100 does not move to a sound output position for including the related person in the sound output range. Accordingly, the unmanned mobile body 100 can suppress the power consumption of an unnecessary movement and can avoid outputting unnecessary sound to the related person.
For example, when the related person leaves the sound output range while the unmanned mobile body 100 is outputting sound, it is possible that the related person has no intention of listening to the sound from the unmanned mobile body 100. Therefore, in this case, the unmanned mobile body 100 may skip the movement for including the related person in the sound output range.
However, the related person may move while still intending to converse with the unmanned mobile body 100. For example, when the related person remains near the sound output range for a predetermined time or longer, the related person may still intend to converse with the unmanned mobile body 100. In that case, the unmanned mobile body 100 may move so as to include the related person in the sound output range.
Here, the state in which the related person remains near the sound output range is, for example, a state in which the related person is not inside the sound output range but is within a predetermined margin around the sound output range.
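The "predetermined time" rule can be realized with a small timer, sketched below. The hold time and the margin zone around the sound output range are assumptions consistent with the description.

```python
class LingerDetector:
    """Report that the related person should be re-included only after they
    have stayed just outside the sound output range (inside the margin
    zone) for at least hold_s seconds."""

    def __init__(self, hold_s: float = 5.0):
        self.hold_s = hold_s  # assumed "predetermined time"
        self.since = None     # time at which the lingering started

    def update(self, in_margin_zone: bool, now_s: float) -> bool:
        if not in_margin_zone:
            self.since = None  # person left the margin zone; reset
            return False
        if self.since is None:
            self.since = now_s
        return now_s - self.since >= self.hold_s

det = LingerDetector()
print(det.update(True, 0.0))  # False: just entered the margin zone
print(det.update(True, 6.0))  # True: lingered long enough, move to include
```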
Fig. 45 is a conceptual diagram illustrating an example of a case where another person enters the sound output range. When a person other than the speaker enters the sound output range or the sound output direction while the unmanned mobile body 100 is outputting sound to the speaker, the unmanned mobile body 100 may move so that the other person leaves the sound output range or the sound output direction. For example, when the unmanned mobile body 100 detects, by image recognition processing, that another person has entered the sound output range or the sound output direction, it may change the sound output position so that the other person is excluded from the sound output range or the sound output direction, and move to the changed sound output position.
The unmanned mobile body 100 may also determine whether or not the other person is a related person, and change the sound output position so that the other person leaves the sound output range or the sound output direction only when determining that the other person is not a related person.
Further, when a person other than the speaker and the related person enters the sound output range or the sound output direction while sound is being output to the speaker and the related person, the unmanned mobile body 100 may move so that the other person leaves the sound output range or the sound output direction.
As described above, the unmanned mobile body 100 according to the present embodiment includes the directional speaker 107 and the processor 150. The directional speaker 107 outputs sound in its directivity direction. The processor 150 obtains one or more pieces of sensing data.
The processor 150 determines whether or not a second object is present around the first object based on at least one of the one or more pieces of sensing data. When it is determined that the second object is present, the processor 150 calculates the positional relationship between the first object and the second object from at least one of the one or more pieces of sensing data.
Then, based on the positional relationship, the processor 150 determines a first position of the unmanned mobile body 100 at which the first object and the second object are included in the range in which sound is transmitted by the directional speaker 107 with a quality equal to or higher than a predetermined quality, and moves the unmanned mobile body 100 to the first position.
Accordingly, the unmanned mobile body 100 can appropriately output sound to the first object and the second object. That is, the unmanned mobile body 100 can output sound to a plurality of objects collectively.
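The condition that both objects fall within the range in which sound is transmitted at the predetermined quality can be modeled, for illustration, as a two-dimensional sector test. The range length and half angle below are invented example values, not figures from the patent.

```python
import math

def in_output_range(drone_xy, aim_deg, target_xy,
                    length_m=5.0, half_angle_deg=15.0) -> bool:
    """True if target_xy lies inside the directional speaker's output range,
    modeled as a sector of the given length and half angle around the
    directivity direction aim_deg."""
    dx, dy = target_xy[0] - drone_xy[0], target_xy[1] - drone_xy[1]
    if math.hypot(dx, dy) > length_m:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    off_axis = abs((bearing - aim_deg + 180.0) % 360.0 - 180.0)
    return off_axis <= half_angle_deg

# A valid first position must satisfy the test for both objects:
drone, aim = (0.0, 0.0), 0.0
print(in_output_range(drone, aim, (2.0, 0.3)))  # first object: True
print(in_output_range(drone, aim, (4.0, 0.8)))  # second object: True
```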
In the above description, a variable sound output range is used, but a fixed sound output range may be used instead. That is, the sound pressure of the sound emitted from the unmanned mobile body 100 may be fixed. Furthermore, a non-directional speaker may be used instead of the directional speaker 107. Even with such a configuration, sound can be appropriately output to a plurality of objects by moving to an appropriate sound output position.
(Embodiment 2)
Embodiment 1 relates mainly to sound output, while the present embodiment relates mainly to sound collection. The configuration and operation described in Embodiment 1 can be applied to the present embodiment by replacing the sound output, the speaker, and the like of Embodiment 1 with sound collection, the microphone, and the like. The configuration and operation of the present embodiment are described specifically below.
Fig. 46 is a block diagram showing a basic configuration example of the unmanned mobile unit according to the present embodiment. Fig. 46 shows an unmanned mobile unit 200 including a directional microphone 208 and a processor 250.
The unmanned mobile body 200 is a device that moves. For example, the unmanned mobile body 200 moves autonomously or stands still. When receiving an operation, the unmanned mobile body 200 may move according to the operation. The unmanned mobile body 200 is typically an unmanned flying body, but is not limited to an unmanned flying body and may be a device that travels on a surface. The unmanned mobile body 200 may include a moving mechanism such as a motor and an actuator for moving through the air or along a surface.
The unmanned mobile unit 200 may further include one or more sensors. For example, the unmanned moving object 200 may include an image sensor, a distance measurement sensor, a directional microphone 208 or another microphone as an audio sensor, a human detection sensor, or a position detector as a position sensor.
The directional microphone 208 is a microphone that collects sound from its directivity direction. The directivity direction of the directional microphone 208 may be adjustable, and the sound collection sensitivity of the directional microphone 208 may also be adjustable. The directivity direction of the directional microphone 208 can also be expressed as the sound collection direction.
The processor 250 is constituted by a circuit that performs information processing. For example, the processor 250 may control the movement of the unmanned mobile body 200. Specifically, the processor 250 may control the movement of the unmanned mobile body 200 by controlling the operation of a moving mechanism such as a motor and an actuator for moving through the air or along a surface.
By transmitting a control signal to the directional microphone 208, the processor 250 may adjust the directivity direction of the directional microphone 208 and may also adjust its sound collection sensitivity. The processor 250 may also adjust the orientation of the unmanned mobile body 200 in order to adjust the directivity direction of the directional microphone 208.
Fig. 47 is a flowchart showing a basic operation example of the unmanned mobile unit 200 shown in fig. 46. Mainly, the processor 250 of the unmanned mobile unit 200 performs the operation shown in fig. 47.
First, the processor 250 obtains one or more pieces of sensing data (S201). The processor 250 may obtain the one or more pieces of sensing data from one or more sensors inside the unmanned mobile body 200, or from one or more sensors outside the unmanned mobile body 200. Further, the processor 250 may obtain a plurality of pieces of sensing data from both one or more sensors inside the unmanned mobile body 200 and one or more sensors outside the unmanned mobile body 200.
For example, an image sensor, a distance measuring sensor, a microphone, a human detection sensor, a position detector, or the like may be used as one or more sensors outside the unmanned moving body 200.
The processor 250 determines whether or not a second object exists around the first object based on at least one of the acquired one or more sensing data (S202). For example, the first object is a speaker and the second object is a person associated with the speaker. However, each of the first object and the second object may be not only a human but also an animal or a device.
If it is determined that the second object exists in the periphery of the first object, the processor 250 calculates the positional relationship between the first object and the second object from at least one of the one or more sensing data (S203). That is, the processor 250 derives the positional relationship of the first object and the second object from at least one of the one or more sensing data.
For example, the positional relationship includes at least one of a position and a distance related to the first object and the second object. The positional relationship may include the respective positions of the first object and the second object, and may also include the distance between the first object and the second object.
Specifically, the processor 250 may calculate the position of the first object, the position of the second object, the distance between the first object and the second object, and the like using image data obtained from the image sensor. Further, the processor 250 may calculate a distance between the unmanned mobile body 200 and the first object, a distance between the unmanned mobile body 200 and the second object, a distance between the first object and the second object, and the like, using the ranging data obtained from the ranging sensor.
The processor 250 then determines the first position based on the calculated positional relationship. The first position is a position of the unmanned mobile body 200 at which the first object and the second object are included in a range in which the directional microphone 208 collects sound with a quality equal to or higher than a predetermined quality. The processor 250 then moves the unmanned mobile body 200 to the determined first position (S204).
Accordingly, the unmanned mobile body 200 can appropriately collect sound from the first object and the second object. That is, the unmanned mobile body 200 can collect sound from a plurality of objects collectively.
For example, the second object is an object related to the first object. The processor 250 may determine whether or not an object existing in the periphery of the first object is related to the first object based on at least one of the one or more sensing data. Further, based on this, the processor 250 may determine whether or not the second object exists in the periphery of the first object.
At this time, the processor 250 may obtain at least one of information showing a relationship with the first object and information showing a relationship with the unmanned mobile body 200 from at least one of the one or more sensing data. Further, the processor 250 may determine whether or not an object existing in the periphery of the first object is related to the first object, based on at least one of the information indicating the relation with the first object and the information indicating the relation with the unmanned mobile body 200.
Specifically, the processor 250 may determine that the object existing in the periphery of the first object is related to the first object when the object existing in the periphery of the first object satisfies one or more of the plurality of conditions.
For example, the plurality of conditions may include "being in contact with the first object", "conversing with the first object", "being at a distance equal to or less than a threshold value from the first object", "being in a predetermined area together with the first object", "being associated with the first object", "being close to the first object", "being within the range over which the first object's voice carries", "speaking to the unmanned mobile body 200 during a conversation between the first object and the unmanned mobile body 200", and "gazing at the unmanned mobile body 200 during a conversation between the first object and the unmanned mobile body 200", or the like.
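For illustration, this condition list can be reduced to per-person cues with an any-of test. The field names and the 2 m threshold are hypothetical, and an implementation could equally require several conditions to hold simultaneously.

```python
from dataclasses import dataclass

@dataclass
class PersonCues:
    """Cues about one person, extracted from the sensing data."""
    touching_first: bool = False
    conversing_with_first: bool = False
    distance_to_first_m: float = float("inf")
    in_same_area_as_first: bool = False
    spoke_to_drone: bool = False
    gazed_at_drone: bool = False

def is_related_to_first(cues: PersonCues,
                        distance_threshold_m: float = 2.0) -> bool:
    """Judge the person related when any one of the listed conditions holds."""
    return any((
        cues.touching_first,
        cues.conversing_with_first,
        cues.distance_to_first_m <= distance_threshold_m,
        cues.in_same_area_as_first,
        cues.spoke_to_drone,
        cues.gazed_at_drone,
    ))

print(is_related_to_first(PersonCues(distance_to_first_m=1.2)))  # True
print(is_related_to_first(PersonCues(distance_to_first_m=8.0)))  # False
```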
Fig. 48 is a conceptual diagram illustrating a specific working example of the unmanned mobile unit 200 illustrated in fig. 46. In this example, the unmanned mobile body 200 is an unmanned flying body also called an unmanned aerial vehicle. The speaker corresponds to the first object and the related person corresponds to the second object.
For example, the unmanned mobile body 200 collects the speaker's voice in the vicinity of the speaker. The unmanned mobile body 200 then determines whether or not a person is present in the vicinity of the speaker.
For example, the unmanned mobile body 200 senses the vicinity of the speaker with a sensor provided in the unmanned mobile body 200, and determines whether or not a person is present in the vicinity of the speaker based on the result. Specifically, an image sensor can be used as the sensor provided in the unmanned mobile body 200. When the unmanned mobile body 200 determines that a person present in the vicinity of the speaker is related to the speaker, it determines that a related person is present in the vicinity of the speaker.
When the unmanned mobile body 200 determines that a related person is present in the vicinity of the speaker, it determines the sound collection position so that the speaker and the related person are included in the sound collection range in which the unmanned mobile body 200 collects voice. The sound collection range in which the unmanned mobile body 200 collects sound may be determined according to the directivity direction of the directional microphone 208.
Then, the unmanned mobile body 200 moves to the determined sound collection position and collects sound. Accordingly, the unmanned mobile body 200 can collect sound from the speaker and the related person included in the sound collection range.
Fig. 49 is a block diagram showing a specific configuration example of the unmanned mobile unit 200 shown in fig. 48. The unmanned mobile unit 200 shown in fig. 49 includes a GPS receiver 201, a gyro sensor 202, an acceleration sensor 203, a human detection sensor 204, a distance measurement sensor 205, an image sensor 206, a directional speaker 207, a directional microphone 208, a drive unit 209, a communication unit 210, a control unit 220, a storage unit 230, and a power supply unit 241.
The GPS receiver 201 is a receiver that forms part of the GPS (Global Positioning System) for position measurement, and receives signals to obtain a position. For example, the GPS receiver 201 obtains the position of the unmanned mobile body 200. That is, the GPS receiver 201 operates as a sensor that detects the position of the unmanned mobile body 200.
The gyro sensor 202 is a sensor that detects the posture of the unmanned mobile body 200, that is, the angle or inclination of the unmanned mobile body 200. The acceleration sensor 203 is a sensor that detects the acceleration of the unmanned mobile body 200. The human detection sensor 204 is a sensor that detects a human in the periphery of the unmanned mobile body 200. The human detection sensor 204 may be an infrared sensor.
The distance measurement sensor 205 is a sensor that measures the distance between the unmanned moving body 200 and the object, and generates distance measurement data. The image sensor 206 is a sensor for performing imaging, and generates an image by imaging. The image sensor 206 may also be a camera.
The directional speaker 207 is a speaker that outputs sound in its directivity direction. The directivity direction of the directional speaker 207 may be adjustable, and the sound pressure of the sound emitted from the directional speaker 207 may also be adjustable. The directivity direction of the directional speaker 207 can also be expressed as the sound output direction. The directional microphone 208 is a microphone that collects sound from its directivity direction. The directivity direction of the directional microphone 208 may be adjustable, and the sound collection sensitivity of the directional microphone 208 may also be adjustable.
The driving unit 209 is a motor, an actuator, or the like that moves the unmanned mobile body 200. The communication unit 210 is a communicator that communicates with devices outside the unmanned mobile body 200. The communication unit 210 may receive an operation signal for moving the unmanned mobile body 200, and may transmit and receive the contents of a conversation.
The control unit 220 corresponds to the processor 250 shown in fig. 46, and is constituted by a circuit that performs information processing. Specifically, in this example, the control unit 220 includes a human detection unit 221, a related person determination unit 222, a sound collection range determination unit 223, a sound collection position determination unit 224, a sound collection control unit 225, and a movement control unit 226. That is, the processor 250 can also fulfill these roles.
The human detection unit 221 detects a person present in the vicinity of the unmanned mobile body 200, based on sensing data obtained from the human detection sensor 204 or another sensor.
The related person determination unit 222 determines whether or not the person detected by the person detection unit 221 is a related person related to the speaker. The sound collection range determination unit 223 determines the sound collection range based on the positional relationship between the speaker and the person concerned. The sound collection position determination unit 224 determines a sound collection position based on the determined sound collection range. The sound collection control unit 225 transmits a control signal to the directional microphone 208 to control sound collection by the directional microphone 208.
The movement control unit 226 transmits a control signal to the driving unit 209 to control the movement of the unmanned mobile body 200. In this example, the movement control unit 226 controls the flight of the unmanned mobile body 200, which is an unmanned flying body.
The storage unit 230 is a memory for storing information, and stores a control program 231 and sound collection sensitivity/sound collection range correspondence information 232. The control program 231 is a program for the information processing performed by the control unit 220. The sound collection sensitivity/sound collection range correspondence information 232 is information showing the correspondence relationship between the sound collection sensitivity of the directional microphone 208 and the sound collection range in which sound is collected with a quality equal to or higher than a predetermined quality.
The power supply unit 241 is a circuit that supplies power to a plurality of components included in the unmanned mobile unit 200. For example, the power supply section 241 includes a power source.
Fig. 50 is a flowchart showing a specific operation example of the unmanned mobile unit 200 shown in fig. 48. For example, the plurality of components of the unmanned mobile unit 200 shown in fig. 49 perform the operation shown in fig. 50 in an interlocking manner.
First, the unmanned mobile body 200 moves to a conversation position for conversing with the speaker (S211). For example, the conversation position is a position to which the voice uttered by the speaker is transmitted and from which the sound emitted by the unmanned mobile body 200 reaches the speaker. The speaker may be determined in advance, or the unmanned mobile body 200 may determine the speaker during flight.
For example, in the unmanned mobile body 200, the human detection unit 221 detects the speaker from sensing data obtained from the human detection sensor 204, the image sensor 206, or the like. The movement control unit 226 then moves the unmanned mobile body 200, via the driving unit 209, to a conversation position within a predetermined range of the speaker.
Then, the unmanned mobile body 200 starts a conversation (S212). That is, the unmanned mobile body 200 starts at least one of sound output and sound collection. For example, the sound collection control unit 225 causes the directional microphone 208 to start collecting sound. Further, the control unit 220 may start sound output from the directional speaker 207.
Then, the unmanned mobile body 200 senses the periphery of the speaker (S213). For example, the human detection unit 221 detects a person around the speaker by sensing the speaker's surroundings with the human detection sensor 204, the image sensor 206, or the like. Any sensor capable of detecting a person can be used for this detection. The periphery of the speaker corresponds to, for example, an area within a predetermined range of the speaker.
Then, the unmanned mobile body 200 determines whether or not a person other than the speaker has been detected (S214). For example, the human detection unit 221 determines whether or not a person other than the speaker has been detected around the speaker. When no person other than the speaker is detected (no in S214), the unmanned mobile body 200 repeats the sensing of the speaker's surroundings (S213).
When a person other than the speaker is detected (yes in S214), the unmanned mobile body 200 determines whether or not the detected person is a related person of the speaker (S215). For example, the related person determination unit 222 may determine whether the detected person is a related person based on whether the distance between the speaker and the detected person is within a threshold value, or based on another criterion such as grouping. This determination is the same as that described in Embodiment 1.
When the detected person is not a related person (no in S215), the unmanned mobile body 200 repeats the sensing of the speaker's surroundings (S213).
If the detected person is a related person (yes in S215), the unmanned mobile body 200 measures the separation distance between the speaker and the related person (S216). For example, the sound collection range determination unit 223 may measure the separation distance by calculating the distance between the position of the speaker and the position of the related person, both detected from the sensing data.
Then, the unmanned mobile body 200 determines the sound collection range based on the separation distance between the speaker and the related person (S217). For example, the sound collection range determination unit 223 determines the sound collection range based on the measured separation distance; the larger the measured distance, the larger the sound collection range it determines.
The sound collection range is, for example, a range determined relatively with the unmanned mobile body 200 as a reference, and is a range in which sound is collected by the directional microphone 208 with a quality equal to or higher than a predetermined quality. The predetermined quality may correspond to a sound pressure within a predetermined range, or to a signal-to-noise ratio (S/N ratio) within a predetermined range.
Then, the unmanned mobile body 200 determines the sound collection position based on the position of the speaker, the position of the related person, and the sound collection range (S218). For example, the sound collection position determination unit 224 determines the sound collection position so that the detected position of the speaker and the detected position of the related person are included within the determined sound collection range. The determination of the sound collection position will be described later.
Then, the unmanned mobile body 200 moves to the sound collection position (S219). For example, the movement control unit 226 controls the operation of the driving unit 209 to move the unmanned mobile body 200 to the sound collection position. The sound collection control unit 225 may also control sound collection by the directional microphone 208 so that sound is collected from the sound collection range with a quality equal to or higher than the predetermined quality.
Accordingly, the unmanned mobile body 200 can appropriately collect sound from the speaker and the related person.
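Steps S216 to S218 can be illustrated with plane coordinates: measure the separation, size the collection range to it, and stand on the line through both people on the speaker's side. The numeric values and the range-sizing rule are assumptions; the patent's position rules are described from fig. 51 onward.

```python
import math

def plan_collection_position(speaker_xy, related_xy, max_range_m=5.0):
    """Return (drone position, chosen range), or None when the two people
    are too far apart for even the widest sound collection range."""
    sx, sy = speaker_xy
    rx, ry = related_xy
    sep = math.hypot(rx - sx, ry - sy)         # S216: separation distance
    range_m = min(max_range_m, sep + 1.0)      # S217: range grows with sep
    if sep > range_m:
        return None                            # cannot cover both people
    ux, uy = (sx - rx) / sep, (sy - ry) / sep  # unit vector related->speaker
    margin = (range_m - sep) / 2.0             # slack split on both ends
    pos = (sx + ux * margin, sy + uy * margin) # S218: on the speaker's side
    return pos, range_m

print(plan_collection_position((0.0, 0.0), (3.0, 0.0)))
# ((-0.5, 0.0), 4.0): 0.5 m outside the speaker, both within the 4 m range
```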
In the above example, the unmanned mobile body 200 performs the processing for moving to the sound collection position (S213 to S219) after the conversation with the speaker is started (after S212), but this processing may also be performed before the conversation with the speaker is started.
Also in the above example, when the detected person is not a related person (no in S215), the unmanned mobile body 200 repeats the sensing of the speaker's surroundings (S213). However, the unmanned mobile body 200 may instead correct the sound collection position so as not to collect sound from a person (a third object) other than the related person. That is, the sound collection position determination unit 224 of the unmanned mobile body 200 may correct the sound collection position so that a person other than the related person is not included in the sound collection range.
The sound collection position determination unit 224 may also correct the sound collection position so that a person other than the related person is excluded from the sound collection direction. This can prevent such a person from entering the sound collection range when that person moves.
The sound collection range can also be expressed as a voice collection range, and is, for example, a range in which a human voice can be collected at a sound pressure equal to or higher than a predetermined sound pressure. Specifically, the sound collection range is a range extending from the directional microphone 208 in the sound collection direction, within a predetermined distance (for example, 5 m) of the directional microphone 208. The predetermined distance depends on the sound collection sensitivity of the directional microphone 208: the higher the sound collection sensitivity, the longer the predetermined distance and the larger the sound collection range.
Therefore, for example, the sound collection control unit 225 can expand the sound collection range by increasing the sound collection sensitivity of the directional microphone 208, and can reduce the sound collection range by lowering the sound collection sensitivity. The sound collection control unit 225 may remove, with a noise removal filter, the noise that increases as the sound collection sensitivity is raised.
Further, the range in which a human voice can be collected at a sound pressure equal to or higher than the predetermined sound pressure also depends on the sound pressure of the voice itself, which varies from person to person. The sound collection range may therefore be defined according to the average sound pressure of human speech. The sound collection range is thus a reference range, and a human voice within the sound collection range is not necessarily collected at a sound pressure equal to or higher than the predetermined sound pressure.
The unmanned moving object 200 may recognize the attribute of the speaker or the related person, and determine the sound collection range based on the recognized attribute. For example, the unmanned mobile unit 200 may determine the sound collection range according to the sex, age, or the like.
Alternatively, the unmanned mobile body 200 may authenticate the speaker or the related person and determine the sound collection range for the authenticated person based on a sound pressure registered in advance. Alternatively, the unmanned mobile body 200 may store information on the speaker or the related person together with sound pressures as a history, estimate the sound pressure of the speaker or the related person from this history, and determine the sound collection range from the estimated sound pressure. Face information of the speaker or the related person may also be stored and used for the authentication.
The sound pressure of human voice may be measured experimentally and the sound collection range may be determined based on the result. In this case, the sound collection range may be determined for each sound collection sensitivity. The sound collection range may be determined based on the characteristics of the single sound source shown in fig. 6.
Further, the sound pressure of the voice uttered by the speaker may differ from the sound pressure of the voice uttered by the related person. In this case, the sound collection range may be determined based on the smaller of the two sound pressures, the larger of the two, or their average. The sound collection range may also be determined based only on the sound pressure of the speaker's voice, or only on that of the related person's voice.
When the sound collection range is fixed, that is, when the sound collection sensitivity of the directional microphone 208 is fixed, the unmanned mobile unit 200 may determine whether or not the distance between the speaker and the person concerned falls within the sound collection range. Further, the unmanned mobile unit 200 may determine the sound collection position and move to the determined sound collection position when the separation distance falls within the sound collection range. The unmanned mobile unit 200 may not move when the distance does not fall within the sound collection range.
For the sake of simplicity, the following description assumes that the sound collection range is defined by the average sound pressure of human speech without considering individual differences, and that the sound collection range can be adjusted by adjusting the sound collection sensitivity. However, the sound collection range may also be adjusted in consideration of individual differences.
The relationship between the sound collection sensitivity and the sound collection range may be stored in the storage unit 230 as the sound collection sensitivity/sound collection range correspondence information 232.
The sound collection range determination unit 223 determines the sound collection sensitivity and the sound collection range based on the separation distance between the speaker and the related person, so that the speaker and the related person are included in the sound collection range. This operation is performed in the same manner as the operation in Embodiment 1 in which the sound output range determination unit 123 determines the sound pressure and the sound output range based on the separation distance between the speaker and the related person, so that both are included in the sound output range.
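A minimal sketch of this determination, using a lookup table in the spirit of the sound collection sensitivity/sound collection range correspondence information 232 (the sensitivity levels and ranges below are invented for illustration):

```python
# Hypothetical correspondence: sensitivity level -> collection range (m).
SENSITIVITY_TO_RANGE_M = {1: 2.0, 2: 3.5, 3: 5.0, 4: 8.0}

def choose_sensitivity(separation_m: float, headroom_m: float = 0.5):
    """Pick the lowest sensitivity whose range covers both the speaker and
    the related person, with a little headroom so neither sits exactly on
    the range boundary. Returns None if no level is wide enough."""
    for level in sorted(SENSITIVITY_TO_RANGE_M):
        if SENSITIVITY_TO_RANGE_M[level] >= separation_m + headroom_m:
            return level, SENSITIVITY_TO_RANGE_M[level]
    return None

print(choose_sensitivity(3.0))   # (2, 3.5)
print(choose_sensitivity(10.0))  # None: even the widest range is too small
```

Choosing the lowest sufficient sensitivity also keeps the amplified noise, and hence the burden on the noise removal filter, as small as possible.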
In the present embodiment, the criteria for determining whether or not a person other than a speaker is a relevant person are the same as those described in embodiment 1 with reference to fig. 13 to 23, and therefore, the description thereof is omitted.
In the unmanned mobile body 200, a position of the unmanned mobile body 200 from which the speaker and the related person can be included in the sound collection range is determined as the sound collection position. For example, the sound collection position determination unit 224 of the unmanned mobile body 200 determines the sound collection position. More specific methods for determining the sound collection position are described below with reference to fig. 51 to 66.
Fig. 51 is a conceptual diagram illustrating an example of a sound collection position on the straight line passing through the position of the speaker and the position of the related person. In this example, the unmanned mobile body 200 determines, as the sound collection position, a position on the straight line passing through the position of the speaker and the position of the related person, such that the speaker and the related person are included in the sound collection range relatively determined with reference to that position. Accordingly, the unmanned mobile body 200 can appropriately collect sound from the speaker and the related person along the sound collection direction.
Fig. 52 is a conceptual diagram illustrating an example of a sound collection position close to the speaker. For example, on the straight line passing through the position of the speaker and the position of the related person, the unmanned mobile body 200 collects sound toward the speaker and the related person from a sound collection position outside the two of them. In the example of fig. 52, the unmanned mobile body 200 collects sound from the sound collection position on the speaker's side. That is, the unmanned mobile body 200 determines a position close to the speaker as the sound collection position.
It can be estimated that a predetermined speaker converses with the unmanned mobile body 200 more often than the related person does. Further, when the related person is between the unmanned mobile body 200 and the speaker, the related person may interfere with the conversation between the unmanned mobile body 200 and the speaker. Therefore, by determining a position close to the speaker as the sound collection position, the more frequent conversation can proceed smoothly.
Fig. 53 is a conceptual diagram illustrating an example of a sound collection position close to an elderly person. The unmanned mobile body 200 may determine a position close to an elderly person, rather than a position close to the speaker, as the sound collection position. For example, when the speaker is not determined in advance, the unmanned mobile body 200 determines a position close to the elderly person as the sound collection position. The unmanned mobile body 200 may estimate age by face recognition.
It can be estimated that the sound pressure of a person's voice decreases with age. By determining a position close to the elderly person as the sound collection position, the unmanned mobile body 200 can collect the voice that the elderly person utters at a low sound pressure, and can thereby compensate for the sound pressure that decreases with age.
In the case of a parent and a child, the position close to the older person, that is, the position close to the parent, is determined as the sound collection position. This also keeps the unmanned mobile body 200 away from the child.
Further, the unmanned mobile body 200 may determine that a person whose estimated age is equal to or greater than a predetermined age is an elderly person. When one of the speaker and the related person is determined to be an elderly person, the unmanned mobile body 200 may determine a position close to that person as the sound collection position. When both the speaker and the related person are determined to be elderly persons, the unmanned mobile body 200 may determine a position equidistant from both as the sound collection position, or may determine the sound collection position based on other conditions.
Fig. 54 is a conceptual diagram illustrating an example of correcting the sound collection position to the front side along a circle centered on the related person. As shown in the upper part of fig. 54, the unmanned mobile body 200 can collect sound from the speaker and the related person even when it is at a position to their side. On the other hand, when the unmanned mobile body 200 is on their front side rather than at a side position, the speaker and the related person can converse with the unmanned mobile body 200 more easily.
That is, by being positioned on the front side of the speaker and the related person, the unmanned mobile body 200 can provide them with a smooth conversation. Therefore, the unmanned mobile body 200 may correct the sound collection position to the front side of the speaker and the related person.
Specifically, as shown in the lower part of fig. 54, the unmanned mobile body 200 may correct the sound collection position to the front side of the speaker and the related person along a circle estimated with the related person as its center. Accordingly, the unmanned mobile body 200 can correct the sound collection position without changing its distance to the related person.
The unmanned mobile body 200 may instead correct the sound collection position along a circle estimated with the speaker as its center, in which case it can correct the sound collection position without changing its distance to the speaker. By using the circle centered on whichever of the speaker and the related person is farther from the unmanned mobile body 200, the unmanned mobile body 200 can suppress the variation in its distance to each of them.
The unmanned mobile body 200 may also move in the front direction of at least one of the speaker and the related person without correcting the sound collection position along a circle, and may direct the sound collection direction toward at least one of them. The unmanned mobile body 200 may correct the sound collection position to a position in the front direction for performing such an operation.
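The circle-based correction can be sketched as a rotation about the chosen center (here whichever person serves as the center, per the discussion above) that preserves the radius; the angular step size is an assumed tuning value.

```python
import math

def correct_along_circle(drone_xy, center_xy, front_dir_xy,
                         step_rad=math.radians(15.0)):
    """Rotate the drone one bounded step around center_xy toward the
    direction the person is facing, keeping the current distance."""
    dx, dy = drone_xy[0] - center_xy[0], drone_xy[1] - center_xy[1]
    radius = math.hypot(dx, dy)
    current = math.atan2(dy, dx)
    target = math.atan2(front_dir_xy[1], front_dir_xy[0])
    diff = (target - current + math.pi) % (2.0 * math.pi) - math.pi
    current += max(-step_rad, min(step_rad, diff))  # shortest, bounded turn
    return (center_xy[0] + radius * math.cos(current),
            center_xy[1] + radius * math.sin(current))

# Drone at the person's side (east), stepping toward their front (north):
print(correct_along_circle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0)))
# ~(0.966, 0.259): moved 15 degrees along the circle, radius unchanged
```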
Fig. 55 is a conceptual diagram illustrating an example of a sound collection position determined so that the related person is included in the sound collection range. For example, as shown in the upper part of fig. 55, the unmanned mobile body 200 is on the front side of the speaker, conversing with the speaker. Then, as shown in the lower part of fig. 55, the unmanned mobile body 200 may move along a circle estimated with the speaker as its center so that the related person is included in the sound collection range. In this case, the unmanned mobile body 200 may determine the sound collection position along the circle estimated with the speaker as its center.
Accordingly, the unmanned mobile body 200 can move to a position from which it can collect sound from the speaker and the related person without changing its distance to the speaker.
Fig. 56 is a conceptual diagram illustrating an example of a sound collection position on the front side of the speaker and the related person. When the distance between the speaker and the related person in the lateral direction, perpendicular to their front direction, is within the directivity width, the unmanned mobile body 200 may determine a position on their front side as the sound collection position.
Accordingly, the unmanned mobile body 200 can collect sound from the speaker and the related person from a position on their front side. That is, the unmanned mobile body 200 can conduct the conversation from the front side of the speaker and the related person, and can therefore provide them with a smooth conversation.
Further, it can be estimated that positions on the front side of the speaker and the related person are more suitable for conversation than positions to their side. Therefore, the unmanned mobile body 200 may preferentially determine a position on their front side as the sound collection position.
Fig. 57 is a conceptual diagram illustrating an example of a sound collection position on a straight line oblique to the horizontal plane. For example, the unmanned mobile body 200 may obtain body information of the speaker and body information of the related person through image recognition processing, face recognition processing, or the like, and may determine the sound collection position based on this body information. The body information may be, for example, height or face height.
Specifically, when the height of the speaker's face is far from the height of the related person's face, the unmanned mobile body 200 determines, as the sound collection position, a position on the straight line passing through the position of the speaker's face and the position of the related person's face, at which the speaker and the related person are included in the sound collection range. In this case, the unmanned mobile body 200 collects sound along a sound collection direction inclined with respect to the horizontal plane.
Accordingly, the unmanned mobile unit 200 can appropriately collect sound from the speaker and the relevant person along the sound collection direction.
Cases where the height of the speaker's face is far from the height of the related person's face include, for example, a parent and a child, or a person in a wheelchair and a person pushing the wheelchair. Fig. 57 shows an example in which the parent is the speaker and the child is the related person, but the roles may be reversed.
For sound collection in an oblique direction, both collecting sound from a low position toward a high position and collecting sound from a high position toward a low position are conceivable. When sound is collected from a low position toward a high position, the flight altitude is low, which makes flying difficult and creates a possibility of contact with a person; moreover, the unmanned mobile body 200 comes close to a small child. For these reasons, sound may be collected from a high position toward a low position, which can suppress the possibility of contact and the like.
On the other hand, when sound is collected from a high position toward a low position, the speaker and the related person utter their voices upward, toward the position of the unmanned mobile body 200. Their voices therefore diffuse and are difficult to collect. For this reason, sound may instead be collected from a low position toward a high position.
It is also possible to switch between collecting sound from a low position toward a high position and collecting sound from a high position toward a low position. For example, sound may be collected from a low position toward a high position in places with few people, and from a high position toward a low position in places with many people.
Fig. 58 is a conceptual diagram illustrating an example of a sound collection position on a horizontal straight line. When the speaker's face and the related person's face both fall within the directivity width of sound collection, the unmanned mobile body 200 may determine, as the sound collection position, a position from which the positions of both faces are included in the sound collection range and sound is collected in the horizontal direction. Conversely, when the difference between the height of the speaker's face and the height of the related person's face exceeds a predetermined range, the unmanned mobile body 200 may determine a position for collecting sound in an oblique direction as the sound collection position, as shown in fig. 57.
In other words, when the difference between the height of the speaker's face and the height of the related person's face is within the predetermined range, the unmanned mobile body 200 need not change its height, which simplifies the processing. However, by raising its height, the unmanned mobile body 200 can suppress the possibility of contact and the like, and can provide a smooth conversation.
Fig. 59 is a conceptual diagram illustrating an example of the sound pickup position at the same height as the speaker and the related person. As described above, the unmanned mobile unit 200 may determine a position for collecting sound in the horizontal direction as the sound collection position. Accordingly, the process can be simplified.
However, in this case, the unmanned mobile body 200 may come into contact with a person. Further, a person far from the unmanned mobile body 200 must converse with it over the head of a person near it, which makes conversation difficult. Specifically, in the example of fig. 59, the related person converses with the unmanned mobile body 200 over the speaker, so the conversation is difficult.
Fig. 60 is a conceptual diagram illustrating an example of a sound collection position higher than the speaker and the related person. The unmanned mobile body 200 may determine a position higher than the speaker and the related person as the sound collection position. This suppresses the possibility of collision and the like of the unmanned mobile body 200. Further, the unmanned mobile body 200 can provide a smooth conversation both to a person close to it and to a person far from it.
Fig. 61 is a conceptual diagram illustrating an example of the height of the sound collection position. If the sound collection position is too high, the angle at which the speaker and the related person look up at the unmanned mobile body 200 becomes too large. They must then converse while looking up at the unmanned mobile body 200, and a smooth conversation becomes difficult.
Accordingly, an upper limit may be set on the height of the sound collection position or on the angle between the sound collection direction and the horizontal plane. For example, the upper limit of the height of the sound collection position may be set according to the distance between the unmanned mobile body 200 and whichever of the speaker and the related person is closer to it: the shorter this distance, the lower the upper limit. Accordingly, the angle at which the speaker and the related person look up at the unmanned mobile body 200 can be kept small.
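As a purely illustrative sketch, such an upper limit can be derived geometrically from the horizontal distance to the nearer person and a maximum acceptable look-up angle. The function and the 30-degree default below are assumptions for illustration, not values specified in this embodiment.

```python
import math

def max_pickup_height(horizontal_distance_m: float,
                      face_height_m: float,
                      max_lookup_angle_deg: float = 30.0) -> float:
    """Upper limit of the sound collection height, chosen so that the
    person nearest to the unmanned mobile body need not look up at an
    angle larger than max_lookup_angle_deg."""
    angle = math.radians(max_lookup_angle_deg)
    # The allowed height above the face grows with horizontal distance,
    # so a nearer person forces a lower hovering height.
    return face_height_m + horizontal_distance_m * math.tan(angle)

# Example: nearest person 2 m away, face at 1.6 m, 30-degree limit.
print(round(max_pickup_height(2.0, 1.6), 2))  # about 2.75 m
```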
Fig. 62 is a conceptual diagram illustrating an example of a sound collection position for excluding a non-related person from the sound collection range. When the unmanned mobile body 200 determines that a person other than the speaker is not a related person, it may determine the sound collection position so that this non-related person is not included in the sound collection range.
For example, the unmanned mobile body 200 determines the sound collection position so that the distance between the unmanned mobile body 200 and the non-related person increases, and moves to that position. This makes it harder for the unmanned mobile body 200 to collect sound from the non-related person.
For example, the unmanned mobile body 200 determines a sound collection position at which sound is collected from the speaker but not from the non-related person. That is, it determines the sound collection position so that the non-related person is outside the sound collection range while the speaker is inside it. Accordingly, the unmanned mobile body 200 can collect sound from the speaker without collecting sound from the non-related person.
For example, the unmanned mobile body 200 may move away from the non-related person within the range in which sound can still be collected from the speaker at a sound pressure equal to or higher than a predetermined sound pressure. Specifically, based on the sound pressure of the sound collected from the speaker before the movement, the unmanned mobile body 200 may calculate the range in which sound can be collected from the speaker at the predetermined sound pressure or higher, and determine the position in that range farthest from the non-related person as the sound collection position. Accordingly, the unmanned mobile body 200 makes it difficult to collect the voice of the non-related person while maintaining appropriate sound collection from the speaker.
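One way to realize this is sketched below: on the circle around the speaker within which the predetermined sound pressure is still obtained, pick the point farthest from the non-related person. The function and the flat 2-D model are illustrative assumptions.

```python
import math

def pickup_position(speaker, non_related, audible_radius_m):
    """Point on the circle of radius audible_radius_m around the
    speaker that is farthest from the non-related person.
    Positions are (x, y) tuples on the horizontal plane."""
    sx, sy = speaker
    nx, ny = non_related
    dx, dy = sx - nx, sy - ny
    d = math.hypot(dx, dy)
    if d == 0:  # degenerate case: pick an arbitrary direction
        return (sx + audible_radius_m, sy)
    # Move away from the non-related person along the line joining them.
    return (sx + dx / d * audible_radius_m, sy + dy / d * audible_radius_m)

print(pickup_position((0.0, 0.0), (3.0, 0.0), 1.5))  # (-1.5, 0.0)
```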
The unmanned mobile unit 200 may determine the sound collection position so that the non-relevant person is not included in the sound collection range and the speaker and the relevant person are included in the sound collection range. Accordingly, the unmanned mobile unit 200 can collect sound of the speaker and the relevant person without collecting sound of the non-relevant person.
Fig. 63 is a conceptual diagram illustrating an example of the positional relationship on the horizontal plane of the non-related person, the speaker, and the unmanned mobile body 200. For example, as in the upper example of fig. 63, the unmanned mobile body 200 may determine, as the sound collection position, a position that is not far from the non-related person. However, in the upper example of fig. 63, the non-related person is in the sound collection direction, and sound may therefore be collected from the non-related person.
Therefore, as in the lower example of fig. 63, the unmanned mobile body 200 may determine the sound collection position so that the non-related person is out of the sound collection direction. Specifically, the unmanned mobile body 200 may determine, as the sound collection position, a position that is not on a straight line passing through the position of the speaker and the position of the non-related person. This makes it possible to suppress the possibility of collecting sound from the non-related person.
In the example on the lower side of fig. 63, the unmanned mobile unit 200 may determine the sound collection position so that the non-relevant person is not included in the sound collection range and the speaker and the relevant person are included in the sound collection range.
Fig. 64 is a conceptual diagram illustrating an example of the positional relationship on the vertical plane of the non-related person, the speaker, and the unmanned mobile body 200. When the unmanned mobile body 200 collects sound in the horizontal direction, a non-related person is likely to enter the sound collection range or the sound collection direction, and sound may then be collected from that person. Therefore, the unmanned mobile body 200 may collect sound from the speaker from above the speaker. Accordingly, the unmanned mobile body 200 can suppress the possibility of a non-related person entering the sound collection range or the sound collection direction, and thus the possibility of collecting sound from a non-related person.
The unmanned mobile unit 200 may determine the height of the sound collection position so that the non-relevant person is not included in the sound collection range and the speaker and the relevant person are included in the sound collection range.
Fig. 65 is a conceptual diagram illustrating an example of a sound collection position for excluding other persons from the sound collection range. In order to exclude other persons from the sound collection range, the unmanned mobile body 200 may determine, as the sound collection position, a position at which the speaker is sandwiched between the unmanned mobile body 200 and an obstacle. The unmanned mobile body 200 may then move to the sound collection position and collect sound from the speaker. This makes it possible to suppress the possibility of collecting sound from other persons.
Here, the obstacle is, for example, a physical feature that keeps other persons from entering the sound collection range. It may be a feature that blocks the expansion of the sound collection range, or one that a person cannot pass through. Specifically, the obstacle may be a wall, a building, or a cliff.
The unmanned mobile body 200 may detect the position of the obstacle by image recognition processing, or may detect the position of the obstacle by an obstacle detection sensor not shown in the figure.
The unmanned mobile object 200 may specify the position of the obstacle from map information including the position of the obstacle. The map information may be stored in advance in the storage unit 230 of the unmanned mobile unit 200, or may be input from an external device to the unmanned mobile unit 200 by the communication unit 220 of the unmanned mobile unit 200. Further, the unmanned mobile object 200 may detect the position of the unmanned mobile object 200 and detect the position of the obstacle from the map information.
For example, in the upper side of fig. 65, since there is no obstacle on the opposite side across the speaker from the unmanned mobile unit 200, there is a possibility that another person enters the sound pickup range. On the other hand, in the lower side of fig. 65, since an obstacle such as a wall exists on the opposite side from the unmanned mobile unit 200 across the speaker, it is possible to suppress the possibility of another person entering the sound pickup range.
In addition to the speaker, the related person may also be taken into account. Specifically, a position at which both the speaker and the related person are sandwiched between the unmanned mobile body 200 and the obstacle is determined as the sound collection position. Accordingly, the unmanned mobile body 200 can collect sound from the speaker and the related person without collecting sound from other persons.
Fig. 66 is a conceptual diagram illustrating an example of a sound pickup position determined from a voice uttered by a speaker and a voice uttered by the relevant person.
The unmanned mobile body 200 may determine, as the sound collection position, a position close to whichever of the speaker and the related person utters sound more frequently. Specifically, the unmanned mobile body 200 may obtain the utterance frequency of the speaker and the utterance frequency of the related person from the collected sound, and determine a position close to the one with the higher utterance frequency as the sound collection position. For example, when the number of times the speaker speaks to the unmanned mobile body 200 is larger than the number of times the related person speaks to it, a position near the speaker is determined as the sound collection position.
Accordingly, the unmanned mobile body 200 can more appropriately collect sound from whichever of the speaker and the related person utters sound more frequently.
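A minimal sketch of this selection, assuming the utterance counts have already been extracted from the collected sound (the function is hypothetical):

```python
def choose_pickup_target(speaker_utterances: int,
                         related_utterances: int) -> str:
    """Pick the person the unmanned mobile body should approach,
    based on how often each has spoken to it so far."""
    if speaker_utterances >= related_utterances:
        return "speaker"
    return "related_person"

print(choose_pickup_target(7, 3))  # speaker
```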
Further, the unmanned mobile body 200 may determine, as the sound collection position, a position close to whichever of the speaker and the related person has the smaller voice volume. Specifically, the unmanned mobile body 200 may obtain the volume of the voice of the speaker and the volume of the voice of the related person from the collected sound, and determine a position close to the one with the lower volume as the sound collection position. For example, when the volume of the related person is smaller than the volume of the speaker, a position close to the related person is determined as the sound collection position.
More specifically, the unmanned mobile unit 200 estimates the sound pressure of the voice uttered by the speaker and the sound pressure of the voice uttered by the person concerned as the sound volume based on the collected sounds. Then, the unmanned mobile unit 200 compares the volume estimated for the speaker with the volume estimated for the relevant person, and specifies the smaller one.
The unmanned mobile body 200 may estimate, as the volumes, the sound pressure of the voice uttered by the speaker and the sound pressure of the voice uttered by the related person, with reference to a table showing the relationship among the sound pressure of a voice uttered by a person, the distance between that person and the unmanned mobile body 200, and the sound pressure of the voice as collected by the unmanned mobile body 200. The table may be stored in advance in the storage unit 230.
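As an illustration, such a table can be approximated by the free-field distance law of roughly -6 dB per doubling of distance. The formula below is an assumption standing in for the stored table, which would be used instead when this model does not hold:

```python
import math

def estimate_source_db(collected_db: float,
                       distance_m: float,
                       reference_distance_m: float = 1.0) -> float:
    """Estimate the sound pressure level of a voice at its source from
    the level collected at the microphone, assuming spherical
    spreading (-6 dB per doubling of distance)."""
    return collected_db + 20.0 * math.log10(distance_m / reference_distance_m)

speaker_db = estimate_source_db(54.0, 2.0)  # speaker 2 m away
related_db = estimate_source_db(48.0, 3.0)  # related person 3 m away
# Approach whichever person speaks more quietly.
target = "speaker" if speaker_db < related_db else "related_person"
print(round(speaker_db, 1), round(related_db, 1), target)
```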
Further, by moving to a sound collection position near whichever of the speaker and the related person has the lower volume, the unmanned mobile body 200 can appropriately collect sound even from that person.
As for the method of determining the sound collection position, any one of a plurality of determination methods described with reference to fig. 51 to 66 may be employed, or a combination of two or more of these determination methods may be employed. Next, a plurality of examples of the movement of the unmanned mobile body 200 and the like will be described.
Fig. 67 is a conceptual diagram illustrating an example in which the unmanned mobile unit 200 moves to the sound pickup position. For example, when the unmanned mobile unit 200 moves to the sound pickup position while picking up sound from the speaker, the unmanned mobile unit moves to the sound pickup position so that the speaker does not deviate from the sound pickup range during the movement. Accordingly, the unmanned mobile unit 200 can continue to collect sound from the speaker.
Specifically, in this case, the unmanned mobile unit 200 moves to the sound pickup position while directing the directional microphone 208 toward the speaker. The unmanned mobile unit 200 moves within a predetermined distance from the speaker. The predetermined distance corresponds to the length of the sound collection range in the sound collection direction. The unmanned moving object 200 may create a movement path within a predetermined distance from the speaker and move to the sound collection position along the created movement path. Accordingly, the unmanned mobile unit 200 can move to the sound collection position so that the speaker does not deviate from the sound collection range during the movement.
The unmanned mobile body 200 may change the sound collection sensitivity according to the distance between the unmanned mobile body 200 and the speaker, so that the sound pressure of the sound collected from the speaker is kept constant during the movement. For example, the unmanned mobile body 200 may raise the sound collection sensitivity while moving away from the speaker, and conversely lower it while approaching the speaker.
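A sketch of this distance-dependent gain adjustment, again under the assumed -6 dB-per-doubling model:

```python
import math

def adjust_gain_db(current_gain_db: float,
                   old_distance_m: float,
                   new_distance_m: float) -> float:
    """Change the microphone gain so that the collected sound pressure
    from the speaker stays constant while the unmanned mobile body
    moves."""
    return current_gain_db + 20.0 * math.log10(new_distance_m / old_distance_m)

# Moving from 2 m to 4 m away: raise the gain by about 6 dB.
print(round(adjust_gain_db(0.0, 2.0, 4.0), 1))  # 6.0
```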
The unmanned mobile body 200 may also move during a pause in the conversation, so that the movement does not cut off mid-utterance the speech of a person in the sound collection range.
Fig. 68 is a conceptual diagram illustrating an example in which the unmanned mobile body 200 moves to the sound collection position through the front side of the speaker. For example, the unmanned mobile body 200 moves to the sound collection position through the front side of the speaker, which corresponds to the speaker's visual field. When the unmanned mobile body 200 is out of the speaker's visual field, it is difficult for the speaker to converse with it. By moving through the front side of the speaker, the unmanned mobile body 200 can provide a smooth conversation even while moving.
Specifically, the unmanned mobile body 200 may detect the front side of the speaker by image recognition processing and specify the speaker's visual field. The unmanned mobile body 200 may then create a movement path within the specified visual field and move to the sound collection position along it.
In the above description, the unmanned mobile body 200 moves to the sound collection position through the front side of the speaker, but it may instead move through the front side of both the speaker and the related person. Accordingly, the unmanned mobile body 200 can also provide a smooth conversation to the related person.
Fig. 69 is a conceptual diagram illustrating an example in which the unmanned mobile body 200 changes the sound pickup range. The unmanned mobile unit 200 may adjust the sound collection range so that the speaker and the related person are included in the sound collection range. Specifically, the unmanned mobile unit 200 may adjust the sound collection range by adjusting the sound collection sensitivity of the directional microphone 208.
Further, as in the upper example of fig. 69, when the confidence that a person other than the speaker is a related person is moderate, the unmanned mobile body 200 moves to a sound collection position at which both the speaker and the other person are in the sound collection direction, and adjusts the sound collection sensitivity of the directional microphone 208 so that sound is collected from the speaker but not from the other person. That is, the unmanned mobile body 200 lowers the sound collection sensitivity of the directional microphone 208.
Further, as in the lower example of fig. 69, when the confidence that the person other than the speaker is a related person is high, the unmanned mobile body 200 adjusts the sound collection sensitivity of the directional microphone 208 so that sound is also collected from that person. That is, the unmanned mobile body 200 raises the sound collection sensitivity of the directional microphone 208.
Accordingly, even when the confidence that the person other than the speaker is a related person later increases, the unmanned mobile body 200 can immediately collect sound from that person without moving. Alternatively, the unmanned mobile body 200 may move in the sound collection direction instead of raising the sound collection sensitivity. Accordingly, it can suppress the increase in power consumption caused by a higher sound collection sensitivity.
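A minimal sketch of this confidence-dependent control; the threshold and gain values are illustrative assumptions:

```python
def pickup_sensitivity(related_confidence: float,
                       low_gain_db: float = 0.0,
                       high_gain_db: float = 6.0,
                       threshold: float = 0.8) -> float:
    """Raise the microphone sensitivity only when the confidence that
    the nearby person is related to the speaker exceeds a threshold;
    otherwise keep it low so that only the speaker is collected."""
    if related_confidence >= threshold:
        return high_gain_db
    return low_gain_db

print(pickup_sensitivity(0.5))  # 0.0 -> collect from the speaker only
print(pickup_sensitivity(0.9))  # 6.0 -> collect from both persons
```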
Fig. 70 is a conceptual diagram illustrating an example of selectively moving or changing the sound collection range. When both the speaker and the related person are in the sound collection direction, the unmanned mobile body 200 can select whether to expand the sound collection range or to move in the sound collection direction. That is, the unmanned mobile body 200 can include the speaker and the related person in the sound collection range either by enlarging the sound collection range or by moving in the sound collection direction.
However, when the unmanned mobile body 200 expands the sound collection range, the sound collection sensitivity increases, and power consumption can therefore be expected to increase. The unmanned mobile body 200 may therefore give priority to moving in the sound collection direction over expanding the sound collection range.
When the unmanned mobile body 200 is too close to the speaker, it may come into contact with the speaker, and the voice collected from the speaker may be too loud. Therefore, the unmanned mobile body 200 may move as close to the speaker as possible along the sound collection direction within the allowable range. If, in this state, the related person is still not included in the sound collection range, the unmanned mobile body 200 may expand the sound collection range. Accordingly, the unmanned mobile body 200 can appropriately collect sound from the speaker and the related person.
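The selection logic might look like the following sketch, where the minimum safe distance is an assumed parameter:

```python
def plan_action(distance_to_speaker_m: float,
                related_in_range: bool,
                min_safe_distance_m: float = 1.0) -> str:
    """Prefer moving along the sound collection direction over widening
    the sound collection range, since widening raises the sensitivity
    and hence the power consumption; widen only once the unmanned
    mobile body is already as close to the speaker as safely possible."""
    if distance_to_speaker_m > min_safe_distance_m:
        return "move closer along the sound collection direction"
    if not related_in_range:
        return "expand the sound collection range"
    return "hold position"

print(plan_action(2.5, related_in_range=False))  # move closer ...
print(plan_action(1.0, related_in_range=False))  # expand ...
```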
Fig. 71 is a conceptual diagram illustrating an example of a case where the related person leaves the sound collection range. For example, when the related person moves out of the sound collection range, more specifically, when the related person leaves it of his or her own accord, it can be presumed that the related person has no intention of conversing with the unmanned mobile body 200.
Therefore, in the above case, for example, the unmanned mobile body 200 does not move to a sound collection position that would include the related person in the sound collection range. Accordingly, it can avoid the power consumption of an unnecessary movement and avoid unnecessarily collecting sound from the related person.
However, the related person may also move while still intending to converse with the unmanned mobile body 200. For example, when the related person stays near the sound collection range without moving far from it for a predetermined time or longer, the related person may still intend to converse with the unmanned mobile body 200. In that case, the unmanned mobile body 200 may move so as to include the related person in the sound collection range.
Here, the state of not being far from the sound collection range is, for example, a state in which the related person is not in the sound collection range but is within a predetermined range around it.
Fig. 72 is a conceptual diagram illustrating an example of another person entering the sound collection range. When a person different from the speaker enters the sound collection range or the sound collection direction while sound is being collected from the speaker, the unmanned mobile body 200 may move so that the other person leaves the sound collection range or the sound collection direction. For example, when it detects by image recognition processing that another person has entered the sound collection range or the sound collection direction, the unmanned mobile body 200 may change the sound collection position so that the other person is out of that range or direction, and move to the changed position.
Alternatively, the unmanned mobile body 200 may change the sound collection position, and move to the changed position, at the moment the directional microphone 208 detects that the other person's voice is being collected.
For example, when another person present in the sound collection range or the sound collection direction does not utter a voice, that person's voice is not collected. In such a case, the unmanned mobile body 200 need not move, since the other person has no influence. When the other person's voice is collected and does have an influence, the unmanned mobile body 200 may move so that the other person leaves the sound collection range or the sound collection direction.
Further, the unmanned mobile body 200 may determine whether or not the other person is a related person, and change the sound collection position so that the other person leaves the sound collection range or the sound collection direction only when it determines that the other person is not a related person.
Further, when a person different from both the speaker and the related person enters the sound collection range or the sound collection direction while sound is being collected from the speaker and the related person, the unmanned mobile body 200 may move so that the other person leaves the sound collection range or the sound collection direction.
Fig. 73 is a conceptual diagram illustrating an example of a group entering the sound collection range. A group of persons different from the speaker may hold a conversation within the group. Therefore, when such a group enters the sound collection range or the sound collection direction, the unmanned mobile body 200 may move so as not to collect the conversation within the group. That is, in this case, the unmanned mobile body 200 may move so that the group that has entered the sound collection range or the sound collection direction leaves that range or direction.
For example, when the group is detected to have entered the sound collection range or the sound collection direction by the image recognition processing, the unmanned moving body 200 may change the sound collection position so that the group is out of the sound collection range or the sound collection direction, and move to the changed sound collection position.
The unmanned moving object 200 may determine whether or not a plurality of persons different from the speaker constitute a group, based on a criterion of whether or not a person other than the speaker is a person related to the speaker. That is, the unmanned mobile unit 200 may determine whether or not a plurality of persons form a group related to each other, using the criteria described with reference to fig. 13 to 23 in embodiment 1.
Further, when a group of persons different from both the speaker and the related person enters the sound collection range or the sound collection direction while sound is being collected from the speaker and the related person, the unmanned mobile body 200 may move so that the group leaves the sound collection range or the sound collection direction.
Fig. 74 is a conceptual diagram illustrating an example of a person entering the sound collection range. In embodiment 1, when the unmanned mobile body 100 moves to the sound output position and a person enters the sound output range, the sound output by the unmanned mobile body 100 reaches that person, so the person can easily recognize that he or she has entered the sound output range. In the present embodiment, by contrast, when the unmanned mobile body 200 moves to the sound collection position and a person enters the sound collection range, it is difficult for the person to recognize this.
Therefore, when the unmanned mobile body 200 moves to the sound collection position and a person enters the sound collection range, the unmanned mobile body 200 may notify the person of this fact.
For example, the unmanned mobile body 200 may output a message such as "You have entered the sound collection range" from the directional speaker 207. That is, it may notify the person by voice. Alternatively, the unmanned mobile body 200 may be equipped with a notification LED and notify the person with the LED. The unmanned mobile body 200 may also notify the person by transmitting information indicating that he or she has entered the sound collection range to the person's portable terminal through the communication unit 210.
As described above, the unmanned mobile body 200 according to the present embodiment includes the directional microphone 208 and the processor 250. The directional microphone 208 collects sound from the direction of directivity. The processor 250 obtains one or more pieces of sensing data, including data obtained from the directional microphone 208.
Based on at least one of the one or more pieces of sensing data, the processor 250 determines whether or not a second object exists around a first object. When it is determined that the second object is present, the processor 250 calculates the positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data.
Based on the positional relationship, the processor 250 determines a first position of the unmanned mobile body 200 at which the first object and the second object are included in the range in which the directional microphone 208 collects sound with a quality equal to or greater than a predetermined quality, and moves the unmanned mobile body 200 to the first position.
Accordingly, the unmanned mobile body 200 can appropriately collect sound of the first object and the second object. That is, the unmanned mobile body 200 can collect sound integrally with respect to a plurality of objects.
In the above description, a variable sound collection range is used, but a fixed sound collection range may be used instead; that is, the sound collection sensitivity may be fixed. A non-directional microphone may also be used instead of the directional microphone 208. Even with such a configuration, moving to an appropriate sound collection position makes it possible to appropriately collect sound from a plurality of objects.
(embodiment mode 3)
Embodiment 1 relates mainly to audio output. Embodiment 2 relates mainly to sound pickup. The present embodiment relates to both audio output and sound collection. The configuration and operation described in embodiment 1 and the configuration and operation described in embodiment 2 can be applied to this embodiment as well. Hereinafter, the configuration and operation relating to both the audio output and the sound collection will be described.
Fig. 75 is a block diagram showing a basic configuration example of the unmanned mobile unit according to the present embodiment. Fig. 75 shows an unmanned mobile unit 300 including a directional speaker 307, a directional microphone 308, and a processor 350.
The unmanned mobile body 300 is a device that moves. For example, the unmanned mobile body 300 moves autonomously or stands still. When receiving an operation, the unmanned mobile body 300 may move according to the operation. The unmanned mobile body 300 is typically an unmanned flying object, but is not limited to this and may be a device that travels on a surface. The unmanned mobile body 300 may include a moving mechanism such as a motor or an actuator for moving through the air or along a surface.
The unmanned mobile body 300 may include one or more sensors. For example, the unmanned moving object 300 may include an image sensor, a distance measurement sensor, a directional microphone 308 or another microphone as an audio sensor, a human detection sensor, or a position detector as a position sensor.
Directional speaker 307 is a speaker that outputs sound in a direction of directivity. The directivity direction of directional speaker 307 may be adjusted, and the sound pressure of the sound emitted from directional speaker 307 may be adjusted. The directivity direction of directional speaker 307 may be expressed as a sound output direction.
The directional microphone 308 is a microphone that collects sound from a directional direction. The directivity direction of the directional microphone 308 may be adjusted, or the sound pickup sensitivity of the directional microphone 308 may be adjusted. The pointing direction of the directional microphone 308 may also be expressed as a sound pickup direction.
The processor 350 is configured by a circuit that performs information processing. For example, the processor 350 may control the movement of the unmanned mobile unit 300. Specifically, the processor 350 may control the movement of the unmanned mobile unit 300 by controlling the operation of a moving mechanism such as a motor and an actuator for moving in the air or on a surface.
Processor 350 may also adjust the directivity direction of directional speaker 307 and may also adjust the sound pressure of the sound emitted by directional speaker 307 by transmitting a control signal to directional speaker 307. Further, processor 350 may adjust the direction of unmanned mobile unit 300 to adjust the directivity of directional speaker 307.
The processor 350 may also adjust the directivity direction of the directional microphone 308 and adjust the sound collection sensitivity of the directional microphone 308 by transmitting a control signal to the directional microphone 308. The processor 350 may adjust the direction of the unmanned mobile unit 300 to adjust the directivity direction of the directional microphone 308.
Fig. 76 is a flowchart showing a basic operation example of the unmanned mobile unit shown in fig. 75. Mainly, the processor 350 of the unmanned mobile unit 300 performs the operation shown in fig. 76.
First, the processor 350 obtains one or more sensing data (S301). The processor 350 may obtain one or more pieces of sensed data from one or more sensors inside the unmanned mobile body 300, or may obtain one or more pieces of sensed data from one or more sensors outside the unmanned mobile body 300. The processor 350 may obtain a plurality of pieces of sensing data from one or more sensors inside the unmanned mobile body 300 and one or more sensors outside the unmanned mobile body 300.
For example, an image sensor, a distance measuring sensor, a microphone, a human detection sensor, a position detector, or the like may be used as one or more sensors outside the unmanned mobile body 300.
The processor 350 determines whether or not a second object exists around the first object based on at least one of the acquired one or more sensing data (S302). For example, the first object is a speaker and the second object is a person associated with the speaker. However, each of the first object and the second object may be not only a human but also an animal or a device.
If it is determined that the second object exists in the periphery of the first object, the processor 350 calculates the positional relationship between the first object and the second object from at least one of the one or more sensing data (S303). That is, the processor 350 derives the positional relationship of the first object and the second object from at least one of the one or more sensing data.
For example, the positional relationship includes at least one of a position and a distance related to the first object and the second object. The positional relationship may include the respective positions of the first object and the second object, and may also include the distance between the first object and the second object.
Specifically, the processor 350 may calculate the position of the first object, the position of the second object, the distance between the first object and the second object, and the like using image data obtained from the image sensor. Further, the processor 350 may calculate a distance between the unmanned mobile body 300 and the first object, a distance between the unmanned mobile body 300 and the second object, a distance between the first object and the second object, and the like, using the ranging data obtained from the ranging sensor.
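For illustration, the positional relationship might be derived as follows from 2-D positions already estimated from the image or ranging data; the (x, y) representation and the function are assumptions:

```python
import math

def positional_relationship(first_xy, second_xy, body_xy):
    """Distances used when deciding the first position, given 2-D
    positions of the first object, the second object, and the
    unmanned mobile body."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return {
        "first_to_second": dist(first_xy, second_xy),
        "body_to_first": dist(body_xy, first_xy),
        "body_to_second": dist(body_xy, second_xy),
    }

print(positional_relationship((0, 0), (2, 0), (1, 3)))
```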
The processor 350 determines the first position based on the calculated positional relationship. The first position is a position of the unmanned mobile body 300 at which the first object and the second object are included both in the range in which sound is transmitted by the directional speaker 307 with a quality equal to or greater than a predetermined quality and in the range in which sound is collected by the directional microphone 308 with a quality equal to or greater than a predetermined quality. Then, the processor 350 moves the unmanned mobile body 300 to the determined first position (S304).
Accordingly, the unmanned mobile body 300 can appropriately perform sound output and sound collection for the first object and the second object. That is, the unmanned mobile body 300 can output and collect sound integrally for a plurality of objects.
A more specific example will be described below with reference to fig. 77 to 81. In this example, the unmanned mobile body 300 is an unmanned flying object, also called an unmanned aerial vehicle. The first object corresponds to the speaker, and the second object corresponds to the related person.
Fig. 77 is a conceptual diagram illustrating an example of the sound output range and the sound pickup range. The sound output range of the present embodiment is determined in the same manner as the sound output range of embodiment 1, and the sound collection range of the present embodiment is determined in the same manner as the sound collection range of embodiment 2.
In order for the unmanned mobile unit 300 and the person to have a conversation with each other, the unmanned mobile unit 300 moves to a conversation position where sound is transmitted to the person by the directional speaker 307 and sound is collected from the person by the directional microphone 308. Specifically, the unmanned mobile object 300 determines the conversation position based on the overlapping range of the sound output range and the sound collection range.
For example, the unmanned mobile unit 300 determines the conversation position so that the speaker is included in the overlapping range of the voice output range and the sound collection range. For example, when a person related to the speaker is present in the vicinity of the speaker, the unmanned mobile unit 300 determines the conversation position so as to include the speaker and the person related to the speaker in the overlapping range of the voice output range and the sound collection range. This operation is performed in the same manner as the operation of determining the sound output position so that the speaker or the like is included in the sound output range in embodiment 1 and the operation of determining the sound collection position so that the speaker or the like is included in the sound collection range in embodiment 2.
Then, the unmanned mobile body 300 moves to a conversation position determined according to the overlapping range of the sound output range and the sound collection range.
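As a rough sketch, if the sound output range and the sound collection range are modelled as discs of given radii around the unmanned mobile body, a conversation position covering both persons can be computed as follows. The disc model and the function are illustrative assumptions, not the embodiment's actual range shapes:

```python
import math

def conversation_position(speaker_xy, related_xy,
                          output_range_m: float,
                          pickup_range_m: float):
    """Hover point from which both persons fall inside the overlap of
    the sound output range and the sound collection range."""
    usable = min(output_range_m, pickup_range_m)
    mx = (speaker_xy[0] + related_xy[0]) / 2.0
    my = (speaker_xy[1] + related_xy[1]) / 2.0
    half_gap = math.hypot(speaker_xy[0] - related_xy[0],
                          speaker_xy[1] - related_xy[1]) / 2.0
    if half_gap > usable:
        return None  # the two persons cannot both be covered
    # Offset from the midpoint so each person is at distance `usable`.
    offset = math.sqrt(usable ** 2 - half_gap ** 2)
    return (mx, my + offset)

print(conversation_position((0, 0), (2, 0), 3.0, 2.5))  # (1.0, ~2.29)
```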
Fig. 78 is a conceptual diagram illustrating an example of collecting sound from a range in which the sound output range and the sound collection range do not overlap. The sound output range and the sound pickup range may partially overlap each other. That is, a part of the sound output range may overlap a part of the sound collection range, and the other part of the sound output range may not overlap the other part of the sound collection range. In addition, a person may exist only in one of the sound output range and the sound collection range. For example, the unmanned mobile unit 300 may operate so as not to have a conversation with a person existing only in one of the sound output range and the sound collection range.
Specifically, as shown in fig. 78, a person may be present within the sound collection range but outside the sound output range. In this case, the unmanned mobile body 300 collects sound from the person, but cannot transmit sound to the person. Therefore, for example, when it determines that a collected sound was collected from outside the overlapping range of the sound output range and the sound collection range, the unmanned mobile body 300 may ignore the collected sound. That is, the unmanned mobile body 300 may skip the response processing for the voice collected from that person.
For example, the unmanned mobile object 300 may detect a person existing within the sound collection range and outside the sound output range by image recognition processing, voice recognition processing, or the like. Moreover, the unmanned mobile unit 300 may ignore the sound collected from the person.
The above-described operation may be performed for a person different from the speaker for whom it is unclear whether he or she is a related person. Alternatively, it may be performed for the speaker or the related person.
Fig. 79 is a conceptual diagram illustrating an example of adjusting the range in which the sound output range and the sound collection range do not overlap. To keep a person from being in only one of the sound output range and the sound collection range, the unmanned mobile body 300 may adjust the range in which the two do not overlap.
Specifically, the unmanned mobile body 300 may adjust at least one of its own direction, the directivity direction of directional speaker 307, and the directivity direction of directional microphone 308 so that a person does not enter only one of the sound output range and the sound collection range. Alternatively, the unmanned mobile body 300 may narrow at least one of the directivity width of directional speaker 307 and the directivity width of directional microphone 308 to the same end.
Fig. 80 is a block diagram showing a specific configuration example of the unmanned mobile unit 300 shown in fig. 75. The unmanned mobile unit 300 shown in fig. 80 includes a GPS receiver 301, a gyro sensor 302, an acceleration sensor 303, a human detection sensor 304, a distance measurement sensor 305, an image sensor 306, a directional speaker 307, a directional microphone 308, a drive unit 309, a communication unit 310, a control unit 320, a storage unit 330, and a power supply unit 341.
The GPS receiver 301 is a receiver that forms part of the GPS (Global Positioning System) for position measurement, and obtains a position by receiving signals. For example, the GPS receiver 301 obtains the position of the unmanned mobile body 300.
The gyro sensor 302 is a sensor that detects the posture of the unmanned mobile body 300, that is, the angle or inclination of the unmanned mobile body 300. The acceleration sensor 303 is a sensor that detects the acceleration of the unmanned mobile body 300. The human detection sensor 304 is a sensor that detects a human in the periphery of the unmanned mobile body 300. The human detection sensor 304 may be an infrared sensor.
The distance measurement sensor 305 is a sensor that measures the distance between the unmanned mobile body 300 and the object, and generates distance measurement data. The image sensor 306 is a sensor for performing imaging, and generates an image by imaging. The image sensor 306 may also be a camera.
Directional speaker 307 is a speaker that outputs sound in the direction of directivity as described above. The directivity direction of directional speaker 307 may be adjusted, and the sound pressure of the sound emitted from directional speaker 307 may be adjusted. The directional microphone 308 is a microphone that collects sound from a directional direction. The directivity direction of the directional microphone 308 may be adjusted, or the sound pickup sensitivity of the directional microphone 308 may be adjusted.
The driving unit 309 is a motor, an actuator, and the like that move the unmanned mobile unit 300. The communication unit 310 is a communicator that communicates with a device outside the unmanned mobile unit 300. The communication unit 310 may receive an operation signal for moving the unmanned mobile unit 300. The communication unit 310 may transmit and receive the contents of the session.
The control unit 320 corresponds to the processor 350 shown in fig. 75 and is configured by a circuit that performs information processing. Specifically, in this example, the control unit 320 includes a person detection unit 321, a related person determination unit 322, a range determination unit 323, a conversation position determination unit 324, a conversation control unit 325, and a movement control unit 326. That is, the processor 350 may serve these roles.
The human detection unit 321 detects a human present in the periphery of the unmanned mobile unit 300. The human detector 321 detects a human present in the periphery of the unmanned mobile unit 300 based on sensing data obtained from the human detection sensor 304 or another sensor.
The related person determination unit 322 determines whether or not the person detected by the person detection unit 321 is a related person related to the speaker. The range determination unit 323 determines the audio output range and the sound collection range based on the positional relationship between the speaker and the relevant person. The conversation position determining unit 324 determines a conversation position based on the audio output range and the sound collecting range. Session control unit 325 transmits a control signal to directional speaker 307 to control the sound output of directional speaker 307, and transmits a control signal to directional microphone 308 to control the sound collection by directional microphone 308.
The movement control unit 326 transmits a control signal to the drive unit 309 to control the movement of the unmanned mobile unit 300. In this example, the movement control unit 326 controls the flight of the unmanned aerial vehicle 300, which is an unmanned aerial vehicle.
The storage unit 330 is a memory for storing information, and stores a control program 331 and correspondence information 332. The control program 331 is a program for the information processing performed by the control unit 320. The correspondence information 332 shows the correspondence between the sound pressure of the sound emitted from directional speaker 307 and the sound output range in which the sound is transmitted with a quality equal to or higher than a predetermined quality, and the correspondence between the sound collection sensitivity of directional microphone 308 and the sound collection range in which sound is collected with a quality equal to or higher than a predetermined quality.
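A minimal sketch of how such correspondence information might be consulted; the table values are invented for illustration:

```python
# Hypothetical correspondence information: sound pressure at the
# directional speaker (dB) -> radius (m) of the sound output range.
OUTPUT_TABLE = {70: 2.0, 80: 4.0, 90: 8.0}

def required_output_db(target_range_m: float) -> int:
    """Smallest tabulated sound pressure whose sound output range
    covers the target distance."""
    for db in sorted(OUTPUT_TABLE):
        if OUTPUT_TABLE[db] >= target_range_m:
            return db
    raise ValueError("target beyond the largest tabulated range")

print(required_output_db(3.5))  # 80
# An analogous table would map the microphone's sound collection
# sensitivity to the radius of the sound collection range.
```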
The power supply unit 341 is a circuit that supplies power to a plurality of components included in the unmanned mobile unit 300. For example, the power supply portion 341 includes a power source.
Fig. 81 is a flowchart showing a specific operation example of the unmanned mobile unit 300 shown in fig. 80. For example, the plurality of components of the unmanned mobile unit 300 shown in fig. 80 perform the operations shown in fig. 81 in an interlocking manner.
First, the unmanned mobile body 300 moves to a conversation position for conversing with the speaker (S311). For example, the conversation position is a position at which the voice uttered by the speaker can be collected and from which the sound emitted by the unmanned mobile body 300 reaches the speaker. That is, the unmanned mobile body 300 moves to a conversation position at which the speaker is included in the overlapping range of the sound output range and the sound collection range. The speaker may be determined in advance, or the unmanned mobile body 300 may determine the speaker during flight.
For example, in the unmanned mobile body 300, the human detector 321 detects a speaker based on sensing data obtained from the human detection sensor 304, the image sensor 306, or the like. The movement control unit 326 moves the unmanned mobile unit 300 to a conversation position within a predetermined range with respect to the speaker via the drive unit 309.
Then, the unmanned mobile 300 starts a session (S312). That is, the unmanned mobile unit 300 starts at least one of sound output and sound collection. Specifically, session control unit 325 may start collecting sound with directional microphone 308 or start outputting sound with directional speaker 307.
Then, the unmanned mobile unit 300 senses the periphery of the speaker (S313). For example, the human detection unit 321 detects a human around the speaker by sensing the human around the speaker by the human detection sensor 304, the image sensor 306, or the like. For this detection, any sensor for detecting a person can be used. The periphery of the speaker corresponds to, for example, an area within a predetermined range with respect to the speaker.
Then, the unmanned mobile unit 300 determines whether or not a person other than the speaker is detected (S314). For example, the human detection unit 321 determines whether or not a human other than the speaker is detected around the speaker. When a person other than the speaker is not detected (no in S314), the unmanned mobile unit 300 repeats sensing of the surroundings of the speaker (S313).
When a person other than the speaker is detected (yes in S314), the unmanned mobile body 300 determines whether or not the detected person is a person related to the speaker (S315). For example, the related person determination unit 322 may determine whether the detected person is a related person based on whether the distance between the speaker and the detected person is within a threshold, or based on another determination criterion such as grouping. This determination is the same as that described in embodiment 1.
When the detected person is not the relevant person (no in S315), the unmanned mobile unit 300 repeats sensing of the vicinity of the speaker (S313).
If the detected person is the relevant person (yes in S315), the unmanned mobile unit 300 measures the separation distance between the speaker and the relevant person (S316). For example, the range determination unit 323 may calculate a distance between the position of the speaker detected from the sensing data and the position of the person concerned detected from the sensing data, and measure the distance between the speaker and the person concerned.
Then, the unmanned mobile unit 300 determines the voice output range and the sound pickup range according to the distance between the speaker and the relevant person (S317). For example, the range determination unit 323 determines the sound output range and the sound collection range based on the measured separation distance. In this case, the range determination unit 323 increases the sound output range and the sound collection range as the measured separation distance increases.
The sound output range is, for example, a range determined relative to the unmanned mobile body 300, in which sound is transmitted by the directional speaker 307 with a quality equal to or higher than a predetermined quality. The sound collection range is, for example, a range determined relative to the unmanned mobile body 300, in which sound is collected by the directional microphone 308 with a quality equal to or higher than a predetermined quality. The predetermined quality may correspond to a sound pressure within a predetermined range, or to a signal-to-noise ratio within a predetermined range.
Then, the unmanned mobile unit 300 determines a new conversation position based on the position of the speaker, the position of the relevant person, the sound output range, and the sound collection range (S318). For example, the conversation position determining unit 324 determines a new conversation position so as to include the detected position of the speaker and the detected position of the person concerned in the overlapping range of the voice output range and the sound collection range.
Then, the unmanned mobile body 300 moves to the new conversation position (S319). For example, the movement control unit 326 controls the operation of the driving unit 309 to move the unmanned mobile body 300 to the new conversation position. Further, the conversation control unit 325 may control the sound output of directional speaker 307 so that sound is transmitted to the sound output range with a quality equal to or higher than the predetermined quality, and may control the sound collection of directional microphone 308 so that sound is collected from the sound collection range with a quality equal to or higher than the predetermined quality.
Accordingly, the unmanned mobile unit 300 can appropriately perform sound output and sound collection for the speaker and the related person.
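The flow of steps S311 to S319 can be condensed into the following schematic sketch; the `drone` object and all of its methods are hypothetical stand-ins for the units described above, not an actual API:

```python
def conversation_loop(drone):
    """Schematic flow of steps S311 to S319."""
    drone.move_to(drone.initial_conversation_position())        # S311
    drone.start_conversation()                                  # S312
    while drone.in_conversation():
        person = drone.sense_around_speaker()                   # S313
        if person is None:                                      # S314: No
            continue
        if not drone.is_related(person):                        # S315: No
            continue
        gap = drone.separation_distance(person)                 # S316
        out_rng, in_rng = drone.decide_ranges(gap)              # S317
        pos = drone.new_conversation_position(out_rng, in_rng)  # S318
        drone.move_to(pos)                                      # S319
```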
In the above example, after the unmanned mobile unit 300 starts a conversation with the speaker (after S312), the unmanned mobile unit performs a process of moving to a new conversation position for the speaker and the relevant person (S313 to S319). However, the unmanned mobile unit 300 may perform processing for moving to a new conversation position for the speaker and the related person before starting the conversation with the speaker.
In the above example, when the detected person is not the relevant person (no in S315), the unmanned mobile unit 300 repeats sensing of the vicinity of the speaker (S313). However, the unmanned mobile unit 300 may correct the conversation position so that the voice output and the sound collection are not performed for a person other than the relevant person. That is, the conversation position determining unit 324 of the unmanned mobile unit 300 may correct the conversation position so that a person other than the relevant person is not included in the sound output range and the sound collection range.
The conversation position determining unit 324 may correct the conversation position so that a person other than the person concerned deviates from the sound output direction and the sound collecting direction. This can suppress the possibility of a person other than the person concerned entering the sound output range or the sound collection range when the person moves.
When the voice output range and the sound collection range are fixed, the unmanned mobile unit 300 may determine whether or not the distance between the speaker and the person concerned is within the overlapping range of the voice output range and the sound collection range. Further, when the distance is within the overlap range, the unmanned mobile object 300 may determine a new session position and move to the determined new session position. The unmanned mobile body 300 may not move when the distance is not within the overlapping range.
As described above, the configuration and operation described in embodiment 1 and the configuration and operation described in embodiment 2 can be applied to this embodiment as well.
The form of the unmanned moving object has been described above with reference to the embodiments and the like, but the form of the unmanned moving object is not limited to the embodiments and the like. Modifications that can be made to the embodiments and the like as would occur to those skilled in the art may be made, and a plurality of constituent elements of the embodiments and the like may be combined as desired. For example, in the embodiments and the like, the processing executed by a specific component may be executed by another component instead of the specific component. Further, the order of the plurality of processes may be changed, or a plurality of processes may be executed in parallel.
The conversation in the above description may be one-way or two-way. The unmanned mobile body controls the directivity directions of the directional speaker and the directional microphone so that they face the speaker and the related person.
In the above description, sound output and sound collection are performed for the speaker and the related person. However, when it is determined in advance that sound output and sound collection are to be performed for only one person, or when this is designated by switching the operation mode, sound output and sound collection may be performed for only one person. That is, sound output and sound collection may be performed for the speaker alone, without determining whether a related person is present.
Furthermore, the position may be determined so that voice output and sound collection are not performed for a person other than the speaker. Specifically, the position may be determined so that the voice is not output and collected by a person other than the speaker, as in the case of the non-related person in fig. 35 to 37 and 62 to 64.
The information processing method including the steps performed by the respective components of the unmanned mobile body may be executed by any device or system. That is, the information processing method may be executed by an unmanned mobile object, or may be executed by another apparatus or system.
For example, the information processing method may be executed by a computer including a processor, a memory, an input/output circuit, and the like. At this time, the program for causing the computer to execute the information processing method is executed by the computer, so that the information processing method can also be executed. Further, the program may be recorded on a non-transitory computer-readable recording medium.
For example, the program causes a computer to execute an information processing method that obtains one or more pieces of sensing data, determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data when it is determined that the second object is present, determines a first position of the unmanned mobile body based on the positional relationship, and moves the unmanned mobile body to the first position so that the first object and the second object are included in a range in which a directional speaker provided in the unmanned mobile body transmits sound with a quality equal to or greater than a predetermined quality.
For example, the program causes a computer to execute an information processing method that obtains one or more pieces of sensing data, determines whether or not a second object is present in the periphery of a first object based on at least one of the one or more pieces of sensing data, calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data when it is determined that the second object is present, determines a first position of the unmanned mobile body based on the positional relationship, and moves the unmanned mobile body to the first position so that the first object and the second object are included in a range in which a directional microphone provided in the unmanned mobile body collects sound with a quality equal to or greater than a predetermined quality.
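Read together, both program descriptions reduce to the same geometric step: stand on the front side of the two objects, far enough back that a directional beam of a given width covers both. The sketch below is an illustrative reading under assumed planar positions and a conical beam whose front direction is perpendicular to the line joining the two objects; none of the names are from the disclosure.

```python
import math

def first_position(p1, p2, half_angle_deg, front):
    # Midpoint of the two objects and their separation.
    mid = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
    separation = math.dist(p1, p2)
    # Stand-off so that a beam of the given half-angle, aimed at the
    # midpoint, spans at least half the separation on each side.
    standoff = (separation / 2) / math.tan(math.radians(half_angle_deg))
    # Step back from the midpoint along the (normalized) front direction.
    norm = math.hypot(front[0], front[1])
    ux, uy = front[0] / norm, front[1] / norm
    return (mid[0] + ux * standoff, mid[1] + uy * standoff)
```

For example, first_position((0, 0), (2, 0), 30, (0, 1)) places the body roughly 1.73 m in front of the midpoint of two persons standing 2 m apart.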
Further, the plurality of components of the unmanned mobile unit may be configured by dedicated hardware, may be configured by general-purpose hardware that executes the program and the like, or may be configured by a combination of these. The general-purpose hardware may be configured by a memory in which a program is stored, a general-purpose processor that reads out the program from the memory and executes the program, or the like. Here, the memory may be a semiconductor memory, a hard disk, or the like, and the general-purpose processor may be a CPU or the like.
The dedicated hardware may be constituted by a memory, a dedicated processor, or the like. For example, the dedicated processor may execute the information processing method described above with reference to the memory.
Each component of the unmanned mobile unit may be a circuit. These circuits may be configured as a single circuit as a whole, or may be different circuits. These circuits may be dedicated hardware or general-purpose hardware that executes the program or the like.
Hereinafter, a basic configuration of an unmanned mobile body according to an embodiment of the present disclosure, a representative modification, and the like are shown. These may be combined with each other, or may be combined with a part of the above-described embodiments and the like.
(1) For example, an unmanned mobile body (100, 200, 300) according to one aspect of the present disclosure includes a directional speaker (107, 207, 307) and a processor (150, 250, 350). The directional speaker (107, 207, 307) outputs sound in its directivity direction.
The processor (150, 250, 350) obtains one or more pieces of sensing data. Based on at least one of the one or more pieces of sensing data, the processor (150, 250, 350) determines whether or not a second object is present in the periphery of a first object. When determining that the second object is present, the processor (150, 250, 350) calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data.
The processor (150, 250, 350) determines a first position based on the positional relationship. Here, the first position is a position of the unmanned mobile body (100, 200, 300) at which the first object and the second object are included in a range in which the directional speaker (107, 207, 307) transmits sound with a quality equal to or greater than a predetermined quality. The processor (150, 250, 350) then moves the unmanned mobile body (100, 200, 300) to the first position.
Accordingly, the unmanned mobile object (100, 200, 300) can appropriately output sound to the first object and the second object. That is, the unmanned mobile body (100, 200, 300) can output sound integrally to a plurality of objects.
(2) For example, an unmanned mobile body (100, 200, 300) according to one aspect of the present disclosure includes a directional microphone (108, 208, 308) and a processor (150, 250, 350). The directional microphone (108, 208, 308) collects sound from its directivity direction.
The processor (150, 250, 350) obtains one or more pieces of sensing data including data obtained from the directional microphone (108, 208, 308). Based on at least one of the one or more pieces of sensing data, the processor (150, 250, 350) determines whether or not a second object is present in the periphery of a first object. When determining that the second object is present, the processor (150, 250, 350) calculates a positional relationship between the first object and the second object based on at least one of the one or more pieces of sensing data.
The processor (150, 250, 350) determines a first position based on the positional relationship. Here, the first position is a position of the unmanned mobile body (100, 200, 300) at which the first object and the second object are included in a range in which the directional microphone (108, 208, 308) collects sound with a quality equal to or greater than a predetermined quality. The processor (150, 250, 350) then moves the unmanned mobile body (100, 200, 300) to the first position.
Accordingly, the unmanned mobile body (100, 200, 300) can appropriately collect sound from the first object and the second object. That is, the unmanned mobile body (100, 200, 300) can collect sound of a plurality of objects integrally.
(3) For example, the processor (150, 250, 350) adjusts the range according to the positional relationship and determines the first position according to the adjusted range. Here, the range is at least one of the range in which the directional speaker (107, 207, 307) transmits sound with a quality equal to or higher than a predetermined quality and the range in which the directional microphone (108, 208, 308) collects sound with a quality equal to or higher than a predetermined quality.
Accordingly, the unmanned mobile body (100, 200, 300) can appropriately adjust the range of sound output or sound collection according to the positional relationship, and can appropriately include a plurality of objects within the adjusted range.
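One way to realize this adjustment is to trade beam width against reach: a narrower beam keeps the predetermined quality over a longer distance but covers a smaller lateral spread. A minimal sketch under that assumed trade-off; the table values are illustrative, not taken from the disclosure.

```python
import math

# Assumed reach (metres) at which the predetermined quality still holds,
# per beam half-angle (degrees). Purely illustrative values.
REACH_BY_HALF_ANGLE = {10: 8.0, 20: 5.0, 30: 3.0}

def adjust_range(separation):
    # Try the narrowest beam first: best quality, longest reach.
    for angle, reach in sorted(REACH_BY_HALF_ANGLE.items()):
        standoff = (separation / 2) / math.tan(math.radians(angle))
        if standoff <= reach:
            return angle, standoff  # adjusted range and matching stand-off
    return None  # no setting covers both objects from any allowed distance
```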
(4) For example, the first position is a position on the front side of the first object and the second object. Accordingly, the unmanned mobile body (100, 200, 300) can move to an appropriate position for a conversation with a plurality of objects.
(5) For example, the processor (150, 250, 350) obtains body information of the first object and body information of the second object from at least one of the one or more pieces of sensing data, and determines the first position based on the body information of the first object and the body information of the second object. Accordingly, the unmanned mobile body (100, 200, 300) can move to a position appropriate for the body information of the first object and the body information of the second object.
(6) For example, the processor (150, 250, 350) estimates an age of at least one of the first object and the second object based on at least one of the one or more pieces of sensing data, and determines the first position also based on the estimated age.
Accordingly, the unmanned mobile body (100, 200, 300) can move to a position close to an object whose capability is estimated to be low, and can appropriately perform sound output or sound collection for a plurality of objects.
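A simple realization is an age-weighted midpoint that pulls the first position toward the object presumed to have the weaker hearing or speech, e.g. a child or an elderly person. The weighting heuristic below is an assumption for illustration only.

```python
def biased_first_position(p1, age1, p2, age2):
    # Assumed heuristic: children and the elderly get double weight,
    # drawing the body closer to them.
    def weight(age):
        return 2.0 if age < 10 or age >= 70 else 1.0
    w1, w2 = weight(age1), weight(age2)
    total = w1 + w2
    return ((p1[0] * w1 + p2[0] * w2) / total,
            (p1[1] * w1 + p2[1] * w2) / total)
```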
(7) For example, the processor (150, 250, 350) determines the first position such that a third object unrelated to the first object and the second object is not included in the range. Accordingly, the unmanned mobile body (100, 200, 300) can suppress sound output to, or sound collection from, the unrelated third object.
(8) For example, the processor (150, 250, 350) detects a position of an obstacle based on at least one of the one or more pieces of sensing data and determines the first position based on the position of the obstacle. Accordingly, the unmanned mobile body (100, 200, 300) can appropriately determine, according to the position of the obstacle, a position for outputting sound to or collecting sound from a plurality of objects. Further, the unmanned mobile body (100, 200, 300) can use the obstacle to suppress, for example, sound output to or sound collection from an unrelated third object.
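Items (7) and (8) amount to rejecting candidate first positions from which the unrelated third object would fall inside the beam, and accepting ones where it falls outside it (or where an obstacle blocks it). A sketch of the directional part of that test, assuming planar positions:

```python
import math

def excludes_third(candidate, aim_point, third_pos, half_angle_deg):
    # Bearing from the candidate position to each point.
    def bearing(src, dst):
        return math.atan2(dst[1] - src[1], dst[0] - src[0])
    off = abs(bearing(candidate, third_pos) - bearing(candidate, aim_point))
    off = min(off, 2 * math.pi - off)  # wrap the difference into [0, pi]
    # True when the third object lies outside the beam cone.
    return off > math.radians(half_angle_deg)
```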
(9) For example, when determining that the second object is present while sound output or sound collection is being performed for the first object, the processor (150, 250, 350) moves the unmanned mobile body (100, 200, 300) to the first position while keeping the first object within the range. Accordingly, the unmanned mobile body (100, 200, 300) can move to a position appropriate for a conversation with the first object and the second object while continuing the conversation with the first object.
(10) For example, when determining that the second object is present while sound output or sound collection is being performed for the first object, the processor (150, 250, 350) moves the unmanned mobile body (100, 200, 300) to the first position via the front side of the first object. Accordingly, the unmanned mobile body (100, 200, 300) can reach a position appropriate for a conversation with the first object and the second object by way of a region appropriate for the conversation with the first object.
(11) For example, when determining that the second object is present while sound output or sound collection is being performed for the first object, the processor (150, 250, 350) moves the unmanned mobile body (100, 200, 300) to the first position while keeping the quality of sound output or sound collection for the first object constant.
Accordingly, the unmanned mobile body (100, 200, 300) can move to an appropriate position for performing a conversation with the first object and the second object while continuing the conversation with the first object.
(12) For example, the second object is an object related to the first object. The processor (150, 250, 350) obtains, from at least one of the one or more pieces of sensing data, at least one of information showing a relationship with the first object and information showing a relationship with the unmanned mobile body (100, 200, 300).
Based on at least one of the information showing a relationship with the first object and the information showing a relationship with the unmanned mobile body (100, 200, 300), the processor (150, 250, 350) determines whether or not an object present in the periphery of the first object is related to the first object, thereby determining whether or not the second object is present in the periphery of the first object.
Accordingly, the unmanned mobile body (100, 200, 300) can appropriately determine whether or not the second object related to the first object exists in the periphery of the first object.
(13) For example, the processor (150, 250, 350) detects the frequency of sound emission of the first object and the frequency of sound emission of the second object based on at least one of the one or more pieces of sensing data, and determines, as the first position, a position closer to whichever of the first object and the second object has the higher frequency of sound emission.
Accordingly, the unmanned mobile body (100, 200, 300) can move to the vicinity of the object whose sound emission frequency is higher, and can therefore appropriately collect sound from that object.
(14) For example, the processor (150, 250, 350) detects the sound volume of the first object and the sound volume of the second object based on at least one of the one or more pieces of sensing data, and determines, as the first position, a position closer to whichever of the first object and the second object has the smaller sound volume.
Accordingly, the unmanned mobile body (100, 200, 300) can move to the vicinity of the object whose sound volume is smaller, and can therefore appropriately collect sound from that object.
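Items (13) and (14) both reduce to placing the body on the segment between the two objects, biased toward the more frequently talking or quieter one. A minimal sketch; the 0.7 bias is an assumed value, not from the disclosure.

```python
def biased_toward(p_target, p_other, bias=0.7):
    # p_target: the quieter (or more frequent) talker to sit nearer to.
    return (p_target[0] * bias + p_other[0] * (1 - bias),
            p_target[1] * bias + p_other[1] * (1 - bias))
```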
(15) For example, the unmanned mobile body (100, 200, 300) including the directional speaker (107, 207, 307) and the processor (150, 250, 350) further includes a directional microphone (108, 208, 308). Further, the range in which the directional speaker (107, 207, 307) transmits sound with a quality equal to or higher than a predetermined quality is also the range in which the directional microphone (108, 208, 308) collects sound with a quality equal to or higher than a predetermined quality.
Accordingly, the unmanned mobile object (100, 200, 300) can appropriately output sound to the first object and the second object, and can appropriately collect sound from the first object and the second object.
(16) For example, the processor (150, 250, 350) controls the timing of movement of the unmanned mobile body (100, 200, 300) in accordance with a conversation of the first object with the unmanned mobile body (100, 200, 300). Accordingly, the unmanned mobile body (100, 200, 300) can move at an appropriate timing corresponding to the session.
(17) For example, the processor (150, 250, 350) moves the unmanned mobile body (100, 200, 300) to the first position while sound is being collected from the first object.
Accordingly, the unmanned mobile body (100, 200, 300) can move while the first object is estimated to be emitting sound and the unmanned mobile body (100, 200, 300) is not outputting sound. Therefore, the unmanned mobile body (100, 200, 300) can prevent the second object from entering the sound output range partway through an output, and can convey the entire content of the output to the second object.
(18) For example, when the sound emitted from the first object ends while the unmanned mobile body (100, 200, 300) is moving, the processor (150, 250, 350) causes the directional speaker (107, 207, 307) to start sound output after the movement of the unmanned mobile body (100, 200, 300) is completed.
Accordingly, the unmanned mobile body (100, 200, 300) can start sound output after moving to a position appropriate for sound output to the first object and the second object. Therefore, it can prevent the second object from entering the sound output range partway through an output, and can convey the entire content of the output to the second object.
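The timing rules in (16) to (18) can be pictured as a small state machine: the body moves while the first object is talking, and if the talking ends before the move does, the reply is held until arrival. A toy model with illustrative attribute names:

```python
class ConversationTimer:
    def __init__(self):
        self.moving = False
        self.pending_reply = None

    def on_speech_end(self, reply):
        # Speech from the first object has ended; reply now or defer.
        if self.moving:
            self.pending_reply = reply  # hold until the move completes
        else:
            self.output(reply)

    def on_move_complete(self):
        self.moving = False
        if self.pending_reply is not None:
            self.output(self.pending_reply)
            self.pending_reply = None

    def output(self, reply):
        print("directional speaker:", reply)  # stand-in for sound output
```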
(19) For example, the processor (150, 250, 350) moves the unmanned mobile body (100, 200, 300) while no sound output or sound collection is being performed for the first object. Accordingly, the unmanned mobile body (100, 200, 300) can avoid cutting a sound into pieces and can output or collect each sound as a whole. Further, the unmanned mobile body (100, 200, 300) can suppress the mixing of noise caused by its movement.
(20) For example, the one or more pieces of sensing data include image data generated by an image sensor. The processor (150, 250, 350) obtains the positional relationship between the first object and the second object from the image data generated by the image sensor. Accordingly, the unmanned mobile body (100, 200, 300) can appropriately obtain the positional relationship between the first object and the second object from the image data.
(21) For example, the one or more pieces of sensing data include ranging data generated by a ranging sensor. The processor (150, 250, 350) obtains the positional relationship between the first object and the second object from the ranging data generated by the ranging sensor. Accordingly, the unmanned mobile body (100, 200, 300) can appropriately obtain the positional relationship between the first object and the second object from the ranging data.
(22) For example, the positional relationship includes at least one of a distance and a position associated with the first object and the second object. Accordingly, the unmanned mobile body (100, 200, 300) can move to an appropriate position according to the distance or position associated with the first object and the second object.
(23) For example, in an information processing method according to one aspect of the present disclosure, one or more pieces of sensed data are obtained (S101). Then, it is determined whether or not a second object exists around the first object based on at least one of the one or more sensing data (S102). When it is determined that the second object is present, the positional relationship between the first object and the second object is calculated from at least one of the one or more sensing data (S103).
Then, a first position is determined based on the positional relationship. Here, the first position is a position of the unmanned mobile body (100, 200, 300) at which the first object and the second object are included in a range in which a directional speaker (107, 207, 307) provided in the unmanned mobile body (100, 200, 300) transmits sound with a quality equal to or greater than a predetermined quality. Then, the unmanned mobile body (100, 200, 300) is moved to the first position (S104).
By performing this information processing method, sound can be appropriately output to the first object and the second object. That is, sound can be output to a plurality of objects integrally.
(24) For example, a program according to an aspect of the present disclosure is a program for causing a computer to execute the information processing method described above. Accordingly, by executing the program, it is possible to appropriately output sound to the first object and the second object. That is, it is possible to output sound to a plurality of objects integrally.
(25) For example, in an information processing method according to one aspect of the present disclosure, one or more pieces of sensing data are obtained (S201). Whether or not a second object exists in the periphery of a first object is determined based on at least one of the one or more pieces of sensing data (S202). When it is determined that the second object is present, the positional relationship between the first object and the second object is calculated from at least one of the one or more pieces of sensing data (S203).
Then, a first position is determined based on the positional relationship. Here, the first position is a position of the unmanned mobile body (100, 200, 300) at which the first object and the second object are included in a range in which a directional microphone (108, 208, 308) provided in the unmanned mobile body (100, 200, 300) collects sound with a quality equal to or greater than a predetermined quality. Then, the unmanned mobile body (100, 200, 300) is moved to the first position (S204).
By performing this information processing method, sound can be appropriately collected from the first object and the second object. That is, sound can be collected from a plurality of objects integrally.
(26) For example, a program according to an aspect of the present disclosure is a program for causing a computer to execute the information processing method described above. Accordingly, by executing the program, sound can be appropriately collected from the first object and the second object. That is, sound can be collected from a plurality of objects integrally.
In the above embodiments, the position of the unmanned mobile body is determined based on the sound output range of the speaker or the sound collection range of the microphone, but another presentation device may be used as long as it presents information to the object directionally, as a speaker does. For example, a display may be used as the presentation device. That is, the present disclosure is applicable not only to sound but also to other information transmission media such as light.
The present disclosure can be used for an unmanned mobile object or the like that has a conversation with a speaker, and can be applied to a guidance system, a guard system, and the like.
Description of the symbols
100, 200, 300 unmanned mobile body
101, 201, 301 GPS receiver
102, 202, 302 gyro sensor
103, 203, 303 acceleration sensor
104, 204, 304 human detection sensor
105, 205, 305 ranging sensor
106, 206, 306 image sensor
107, 207, 307 directional speaker
108, 208, 308 directional microphone
109, 209, 309 driving part
110, 210, 310 communication section
120, 220, 320 control part
121, 221, 321 human detector
122, 222, 322 related personnel judgment part
123 sound output range determining section
124 sound output position determining part
125 sound output control part
126, 226, 326 movement control unit
130, 230, 330 storage unit
131, 231, 331 control program
132 sound pressure-sound output range correspondence information
141, 241, 341 power supply unit
150, 250, 350 processor
223 sound pickup range determining part
224 sound pickup position determining unit
225 sound pickup control unit
232 sound collection sensitivity-sound collection range correspondence information
323 range determination unit
324 conversation position determining part
325 conversation control part
332 correspondence information

Claims (20)

1. An unmanned mobile body,
the unmanned mobile body comprising:
a directional speaker that outputs sound in a directivity direction; and
a processor that obtains one or more pieces of sensing data,
the processor,
determining whether or not a second object exists in a periphery of a first object based on at least one of the one or more sensing data,
calculating a positional relationship between the first object and the second object from at least one of the one or more sensing data when it is determined that the second object is present,
determining a first position of the unmanned mobile body based on the positional relationship, and moving the unmanned mobile body to the first position so that the first object and the second object are included in a range in which sound is transmitted by the directional speaker with a quality equal to or greater than a predetermined quality.
2. An unmanned mobile body,
the unmanned mobile body comprising:
a directional microphone that collects sound from a pointing direction; and
a processor that obtains one or more pieces of sensing data including data obtained from the directional microphone,
the processor,
determining whether or not a second object exists in a periphery of a first object based on at least one of the one or more sensing data,
calculating a positional relationship between the first object and the second object from at least one of the one or more sensing data when it is determined that the second object is present,
determining a first position of the unmanned mobile body based on the positional relationship, and moving the unmanned mobile body to the first position so that the first object and the second object are included in a range in which sound is collected by the directional microphone with a quality equal to or greater than a predetermined quality.
3. The unmanned mobile body according to claim 1 or 2,
the processor adjusts the range according to the positional relationship, and determines the first position according to the adjusted range.
4. The unmanned mobile body according to any one of claims 1 to 3,
the first position is a position on a front side of the first object and the second object.
5. The unmanned mobile body according to any one of claims 1 to 4,
the processor,
obtaining body information of the first object and body information of the second object from at least one of the one or more sensing data,
and determining the first position according to the body information of the first object and the body information of the second object.
6. The unmanned mobile body according to any one of claims 1 to 5,
the processor,
estimating an age of at least one of the first object and the second object based on at least one of the one or more sensed data,
and determining the first position also based on the age of at least one of the first object and the second object.
7. The unmanned mobile body according to any one of claims 1 to 6,
the processor determines the first position such that a third object unrelated to the first object and the second object is not included in the range.
8. The unmanned moving body according to claim 7,
the processor detects a position of an obstacle based on at least one of the one or more sensing data, and determines the first position based on the position of the obstacle.
9. The unmanned mobile body according to any one of claims 1 to 8,
the processor, when determining that the second object is present while the first object is being subjected to sound output or sound collection, moves the unmanned mobile object to the first position in a state where the first object is included in the range.
10. The unmanned mobile body according to any one of claims 1 to 9,
the processor, when determining that the second object is present while the first object is being subjected to sound output or sound collection, moves the unmanned mobile body to the first position through a front side of the first object.
11. The unmanned mobile body according to any one of claims 1 to 10,
the processor, when determining that the second object is present while the first object is being subjected to sound output or sound collection, moves the unmanned mobile object to the first position while maintaining a constant quality of sound output or sound collection for the first object.
12. The unmanned mobile body according to any one of claims 1 to 11,
the second object is an object related to the first object,
the processor,
obtaining at least one of information showing an association with the first object and information showing an association with the unmanned mobile body from at least one of the one or more sensing data,
and determining whether or not an object existing in the periphery of the first object is related to the first object based on at least one of information showing a relationship with the first object and information showing a relationship with the unmanned mobile body, and determining whether or not the second object is present in the periphery of the first object.
13. The unmanned moving body according to claim 2,
the processor,
detecting a volume of the first object and a volume of the second object according to at least one of the one or more sensing data,
and determining, as the first position, a position closer to whichever of the first object and the second object has the smaller volume.
14. The unmanned moving body according to claim 1,
the unmanned mobile body further includes a directional microphone,
further, the range is a range in which sound is collected by the directional microphone with a quality equal to or higher than a predetermined quality.
15. The unmanned moving body according to claim 14,
the processor controls timing of movement of the unmanned mobile body in accordance with a session of the first object with the unmanned mobile body.
16. The unmanned moving body according to claim 15,
the processor moves the unmanned mobile body to the first position while sound is being collected from the first object.
17. The unmanned moving body according to claim 16,
when the sound emitted from the first object ends while the unmanned mobile body is moving, the processor causes the directional speaker to start sound output after the unmanned mobile body finishes moving.
18. The unmanned moving body according to claim 15,
the processor moves the unmanned mobile object while sound output or sound collection for the first object is not performed.
19. An information processing method, comprising:
obtaining one or more pieces of sensing data,
determining whether or not a second object exists in a periphery of a first object based on at least one of the one or more sensing data,
calculating a positional relationship between the first object and the second object from at least one of the one or more sensing data when it is determined that the second object is present,
and determining a first position of the unmanned mobile body based on the positional relationship, and moving the unmanned mobile body to the first position so that the first object and the second object are included in a range in which a directional speaker provided in the unmanned mobile body transmits sound with a quality equal to or greater than a predetermined quality.
20. An information processing method, comprising:
obtaining one or more pieces of sensing data,
determining whether or not a second object exists in a periphery of a first object based on at least one of the one or more sensing data,
calculating a positional relationship between the first object and the second object from at least one of the one or more sensing data when it is determined that the second object is present,
and determining a first position of the unmanned mobile body based on the positional relationship, and moving the unmanned mobile body to the first position so that the first object and the second object are included in a range in which a directional microphone provided in the unmanned mobile body collects sound with a quality equal to or greater than a predetermined quality.
CN201980085549.6A 2019-03-29 2019-10-30 Unmanned mobile object and information processing method Pending CN113226928A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2019-065940 2019-03-29
JP2019065940 2019-03-29
PCT/JP2019/042665 WO2020202621A1 (en) 2019-03-29 2019-10-30 Unmanned moving body and information processing method

Publications (1)

Publication Number Publication Date
CN113226928A true CN113226928A (en) 2021-08-06

Family

ID=72667756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980085549.6A Pending CN113226928A (en) 2019-03-29 2019-10-30 Unmanned mobile object and information processing method

Country Status (5)

Country Link
US (1) US20210311506A1 (en)
EP (1) EP3950498B1 (en)
JP (1) JP7426631B2 (en)
CN (1) CN113226928A (en)
WO (1) WO2020202621A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11646046B2 (en) 2021-01-29 2023-05-09 Qualcomm Incorporated Psychoacoustic enhancement based on audio source directivity
WO2024084953A1 (en) * 2022-10-20 2024-04-25 ソニーグループ株式会社 Information processing device, information processing method, and program

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1715717A1 (en) * 2004-02-10 2006-10-25 HONDA MOTOR CO., Ltd. Mobile body with superdirectivity speaker
US20090262604A1 (en) * 2006-08-30 2009-10-22 Junichi Funada Localization system, robot, localization method, and sound source localization program
JP2012076162A (en) * 2010-09-30 2012-04-19 Waseda Univ Conversation robot
US20160063987A1 (en) * 2014-08-29 2016-03-03 SZ DJI Technology Co., Ltd Unmanned aerial vehicle (uav) for collecting audio data
WO2017056380A1 (en) * 2015-09-30 2017-04-06 パナソニックIpマネジメント株式会社 Object detection device, object detection system and object detection method
US20170220036A1 (en) * 2016-01-28 2017-08-03 Qualcomm Incorporated Drone flight control
US20180330623A1 (en) * 2015-11-09 2018-11-15 Nec Solution Innovators, Ltd. Flight control device, flight control method, and computer-readable recording medium
JP2019036174A (en) * 2017-08-17 2019-03-07 ヤフー株式会社 Control apparatus, input/output device, control method and control program

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004034274A (en) 2002-07-08 2004-02-05 Mitsubishi Heavy Ind Ltd Conversation robot and its operation method
JP4347743B2 (en) 2004-05-11 2009-10-21 パイオニア株式会社 Sound generator, method thereof, program thereof, and recording medium recording the program
JP4204541B2 (en) * 2004-12-24 2009-01-07 株式会社東芝 Interactive robot, interactive robot speech recognition method, and interactive robot speech recognition program
JP4599522B2 (en) * 2006-02-21 2010-12-15 株式会社国際電気通信基礎技術研究所 Communication robot
US9355368B2 (en) * 2013-03-14 2016-05-31 Toyota Motor Engineering & Manufacturing North America, Inc. Computer-based method and system for providing active and automatic personal assistance using a robotic device/platform
JP6530212B2 (en) * 2015-03-24 2019-06-12 セコム株式会社 Autonomous mobile robot
JP6713637B2 (en) * 2016-03-28 2020-06-24 株式会社国際電気通信基礎技術研究所 Service provision robot system
US10979613B2 (en) * 2016-10-17 2021-04-13 Dolby Laboratories Licensing Corporation Audio capture for aerial devices
US11122380B2 (en) * 2017-09-08 2021-09-14 Sony Interactive Entertainment Inc. Personal robot enabled surround sound

Also Published As

Publication number Publication date
JP7426631B2 (en) 2024-02-02
JPWO2020202621A1 (en) 2020-10-08
WO2020202621A1 (en) 2020-10-08
EP3950498A4 (en) 2022-04-27
US20210311506A1 (en) 2021-10-07
EP3950498A1 (en) 2022-02-09
EP3950498B1 (en) 2024-05-15

Similar Documents

Publication Publication Date Title
CN108790674B (en) Control method and system for vehicle-mounted air conditioner
US7840308B2 (en) Robot device control based on environment and position of a movable robot
JP4204541B2 (en) Interactive robot, interactive robot speech recognition method, and interactive robot speech recognition program
EP3301948A1 (en) System and method for localization and acoustic voice interface
JP5366048B2 (en) Information provision system
US20110153197A1 (en) Insole type navigation apparatus and operation method thereof
KR20130103204A (en) Robot cleaner and controlling method of the same
JPWO2008126347A1 (en) Speech analysis apparatus, speech analysis method, speech analysis program, and system integrated circuit
CN113226928A (en) Unmanned mobile object and information processing method
JP2018036653A (en) Voice response device
US10062302B2 (en) Vision-assist systems for orientation and mobility training
KR102420611B1 (en) Elevator monitoring image transmission device
US11875571B2 (en) Smart hearing assistance in monitored property
KR20170111450A (en) Hearing aid apparatus, portable apparatus and controlling method thereof
CN111743740A (en) Blind guiding method and device, blind guiding equipment and storage medium
EP1257146B1 (en) Method and system of sound processing
CN112912309A (en) Unmanned aerial vehicle, information processing method, and program
JP2014033373A (en) Controller, program and control system
CN216014810U (en) Notification device and wearing device
WO2018053225A9 (en) Hearing device including image sensor
US20200262071A1 (en) Mobile robot for recognizing queue and operating method of mobile robot
CN113924249A (en) Unmanned aerial vehicle and information processing method
JPWO2020021861A1 (en) Information processing equipment, information processing system, information processing method and information processing program
JP7434635B1 (en) Information processing device, information processing method and program
JP2021103391A (en) Watching device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination