Raspberry Pi robotic server and motion, video, sensor controller.
def: a person whose work provides necessary support to the primary activities of an organization, institution, or industry.
Create a social robot which conforms to very simple human social rules and recognizes social cues such as eye contact, facial expressions, speaking and known objects.
The overall goal is to mimic simple human social interaction:
python src/AC3.py
to start the robot.

Having gone through this before on multiple projects, the overall goal is to create a set of hardware and software that is very easy to support, so there are no hidden elements that will break or be forgotten in the future. The decisions below are designed to make the system easily supportable, cost-effective, and understandable for new people.
The system contains full onboard processing, so no external computers are needed. It also has a full web interface that makes it easy to understand what is going on inside the system by going to the support URL.
One of the Raspberry Pis is for core processing and the other is dedicated to environment processing. A third may be necessary if the load of controlling the robot and doing vision processing is too much for two.
One of the biggest challenges in embedded systems is being able to understand and interact with them successfully. Therefore, I am going to expose the key elements in a password-protected web interface.
Here is the API documentation.
To change the password for the web server interface, run the command below in the /src/webserver directory.
python -c "import hashlib; import getpass; print(hashlib.sha512(getpass.getpass())).hexdigest()" > password.txt
The system will use two cameras to enable both full environment awareness and targeted vision. The reason is that for environmental awareness, background subtraction is the most important step: knowing which elements matter and which are just walls. If a camera is moving on servos, it is very difficult to guess which pixels correspond to foreground or background without 3D pixel data (maybe a future project :-) ). Therefore, by using a wide-angle static camera, standard background removal can be done to discard non-salient objects, color clustering can segment the image into elements, and those elements can then be clustered into people, objects, etc.
```
sudo apt-get install python-opencv libjpeg-dev
```
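As a rough illustration of the static-camera pipeline described above, here is a minimal sketch using OpenCV's MOG2 background subtractor. It assumes an OpenCV version that provides createBackgroundSubtractorMOG2 (OpenCV 3 or later) and a camera at index 0; it is not the project's actual vision code.

```python
import cv2

# Wide-angle static camera; index 0 is an assumption.
cap = cv2.VideoCapture(0)

# MOG2 learns the static background over time and flags foreground pixels.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Foreground mask: non-zero pixels are things that are not background (walls, etc.).
    mask = subtractor.apply(frame)
    mask = cv2.medianBlur(mask, 5)  # knock down speckle noise

    # Group foreground pixels into candidate objects.
    # [-2] keeps this working across OpenCV 3.x and 4.x return signatures.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for c in contours:
        if cv2.contourArea(c) > 500:  # ignore tiny blobs
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("salient objects", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```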
Robotic head with 5 DOF: Raspberry Pi robotic server and motion, video, and sensor controller.
NOTE: Even though the servo control signal is a 0 to 3.3 V PWM with 12-bit resolution (0 to 4095), commanding the full range on these servos will blow them up. The actual usable range on the RobotGeek servos is 150 to 600 counts, so we need to map our positions into that range correctly.
From here: https://www.raspberrypi.org/forums/viewtopic.php?t=32826
- 0.5 ms / 4.8 µs = 104, the number required by our program to position the servo at 0 degrees
- 1.5 ms / 4.8 µs = 312, the number required by our program to position the servo at 90 degrees
- 2.5 ms / 4.8 µs = 521, the number required by our program to position the servo at 180 degrees
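Putting the forum numbers and the RobotGeek safe range together, a degrees-to-count mapping might look like the sketch below. The endpoint constants and the function name are illustrative assumptions and should be tuned to the actual servos.

```python
# Linear mapping from servo angle (degrees) to a 12-bit PWM count.
# The forum numbers above give roughly 104 counts at 0 degrees and 521 at
# 180 degrees; the NOTE above says the safe range for the RobotGeek servos
# is 150 to 600 counts, so those endpoints are used here.
SERVO_MIN_COUNT = 150   # count at 0 degrees (RobotGeek safe minimum)
SERVO_MAX_COUNT = 600   # count at 180 degrees (RobotGeek safe maximum)

def degrees_to_count(degrees, min_count=SERVO_MIN_COUNT, max_count=SERVO_MAX_COUNT):
    """Clamp the angle to 0-180 and interpolate linearly into the count range."""
    degrees = max(0.0, min(180.0, float(degrees)))
    return int(round(min_count + (degrees / 180.0) * (max_count - min_count)))

# e.g. degrees_to_count(90) -> 375 with the RobotGeek endpoints
```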
The controller only updates at 50 Hz, and it seems that the actual position control of the servos is only accurate to about 0.5 degrees, which means the whole thing can jitter a LOT. To account for this, we need to adjust the interpolation algorithms.
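One possible adjustment (an assumption on my part, not the project's actual interpolation code) is to ease toward the target at the 50 Hz rate and add a deadband so the servo is only re-commanded when the change exceeds its ~0.5 degree resolution:

```python
DEADBAND_DEG = 0.5   # servo positioning is only accurate to ~0.5 degrees
EASING = 0.2         # fraction of the remaining error applied per 50 Hz tick

class SmoothedServo(object):
    """Eases toward a target angle and skips updates smaller than the deadband."""

    def __init__(self, start_deg=90.0):
        self.current = start_deg
        self.last_commanded = start_deg

    def update(self, target_deg):
        """Call once per 50 Hz tick; returns a PWM count only if the servo should move."""
        # Move a fraction of the way toward the target each tick (simple easing).
        self.current += (target_deg - self.current) * EASING

        # Only issue a new command if the change exceeds the servo's resolution.
        if abs(self.current - self.last_commanded) < DEADBAND_DEG:
            return None
        self.last_commanded = self.current
        return degrees_to_count(self.current)  # mapping sketch from above
```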
A few things I have seen online:
There are multiple
There are two ways that speech recognition can be implemented: local (Sphinx) or cloud-based (Amazon, Google). Cloud-based recognition will always be more accurate; however, there is a larger delay between speech and recognition. If local recognition is used, then a small vocabulary should be specified.
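For illustration, here is a minimal sketch of both options using the SpeechRecognition Python package (an assumption; it wraps PocketSphinx for local recognition and Google's web API for cloud recognition, and also needs pyaudio for the microphone). The keyword list below is just an example small vocabulary.

```python
import speech_recognition as sr

recognizer = sr.Recognizer()

# Small vocabulary for local recognition: (keyword, sensitivity 0.0-1.0) pairs.
KEYWORDS = [("hello", 0.7), ("stop", 0.8), ("look", 0.7)]

with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    audio = recognizer.listen(source)

# Local: fast and offline, but only reliable with a small vocabulary.
try:
    print("Sphinx heard:", recognizer.recognize_sphinx(audio, keyword_entries=KEYWORDS))
except sr.UnknownValueError:
    print("Sphinx could not understand the audio")

# Cloud: more accurate, but adds network latency between speech and recognition.
try:
    print("Google heard:", recognizer.recognize_google(audio))
except (sr.UnknownValueError, sr.RequestError) as err:
    print("Google recognition failed:", err)
```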