Always Watching - Raspberry Pi Face Tracker (Part 1)

This is part one of a two-part series of posts that experiment with bringing machine intelligence into the realm of DIY hardware. I use a Raspberry Pi, some easily available peripherals, and some Python code to build a robot that is capable of following a face around a room and sending personalised messages to specific individuals. The robot can also discover new individuals on the fly with real-time training.

I've been wanting to do a project involving a Raspberry Pi or an Arduino but hadn't found one compelling enough to invest the money or time. Many of the projects I see online are smart mirrors or Pi-holes, and neither seems particularly interesting to me. I then came across this post where the author combined deep learning with a Raspberry Pi, and I thought it was a great project. After identifying some improvements to this project as well as some ideas about where I could take it next, I began to source the necessary components.

Title's namesake

The Build

This is the list of components that are necessary for the core setup along with some links and what they are used for:

Raspberry Pi 4
I chose the 4GB model so that there is sufficient capacity to load and run simple ML models. I would advise against the Raspberry Pi 3 or the Raspberry Pi Zero for any ML application due to their limited memory.
Pi Camera
This is the official Raspberry Pi-compatible image capture module.
Pan Tilt Hat
This is an addition that allows you to control the angle of the camera. It consists of two servos (to control movement), a plastic fixture (to attach the servos and camera), a PCB (which manages connections between the servos, camera, and Raspberry Pi), and a light diffuser for a separately purchased LED strip.
Micro SD Card
The SD card acts as the hard drive for the Pi and stores the OS. I went for a 32GB option so there is enough space to store models and temporary images. I also picked a card with the New Out Of Box Software (NOOBS) pre-installed, both for easier setup and because it was cheaper. If you already have an SD card, it is simple to create a bootable disk.
Power Supply
This supplies power to the computer and some small peripherals through a USB-C input.
Jumper Cables
The cables allow you to connect different electrical components together. I only needed about 3 cables for the project but I will use more in later projects.
Keyboard and Mouse
Basic user input to talk to the Pi.

The following parts are optional or depend on your existing setup:

Raspberry Pi Case
This gives the Pi a protective housing so I'm not overly concerned about damaging the motherboard, and it also looks quite nice.
30cm Flex Cable
The camera module comes with its own flex cable, but it's too short for the full range of movement that the Pan Tilt Hat provides.
Adafruit NeoPixel
These are a set of programmable LEDs on a PCB that connects to the Pan Tilt Hat. Note that these need header pins soldered onto the board to connect to the hat.
Header Pins
These allow you to connect jumpers from the Pan Tilt Hat to the LED strip.
USB-C Switch
This switch lets me turn the Pi on and off without needing to plug and unplug it.
Standoffs
These are small screws and spacers that let you physically mount peripherals to the Pi. In this case I use them to attach the Pan Tilt Hat and stop it falling off when the servos jerk.
Micro-HDMI Adaptor
Connects a normal-sized HDMI cable to the Pi.

The Whole Kit


The Whole Kit Unboxed


Connecting all the parts is fairly straightforward, and apart from handling the delicate case, nothing can really go wrong. One thing I would say is that to connect the light strip to the Pi, header pins need to be soldered onto the board, and for a complete beginner this is a difficult task due to the small and tightly packed contact pads. Once everything was built, the Pi definitely had some strong Wall-E / Pixar Lamp vibes.

Assembled Raspberry Pi


Pi with Camera Module


Setting Up The Pi

The next step is powering up, establishing how best to interface with the Pi, and testing the components.

Installing an OS

Since I bought an SD card with the software already installed, shortly after booting I was greeted with the OS installation wizard. Following the wizard results in a fresh install of Raspbian, a Raspberry Pi-friendly Linux derivative.

Noobs Install Wizard (source)


Interfacing - SSH

I'm too lazy not to use my primary laptop to control the Pi, so I enable SSH through the command line with

    sudo raspi-config

This loads the configuration screen where you should go to Interfacing Options > SSH > Yes.

Main Configuration Screen

Interfacing Submenu


Then you can find the local IP of the Pi by typing into the command line

    hostname -I

And on your main laptop you can SSH into the Pi with

    ssh pi@LOCAL_RASPBERRY_PI_IP

Interfacing - Remote Desktop

Since I will also need to see the camera feed from the Pi, I enable remote desktop access by going into the config menu as above and enabling Interfacing Options > VNC > Yes. Then, on my Mac, I download VNC Viewer, log in to the Pi with its local IP, and control it like a normal computer.

Interfacing - Data Transfer

The theme of wanting to use my main machine as much as possible continues when I begin writing the software. When starting the project, I create a Git repo on my main machine and code as normal. When it comes to deploying this code onto the Pi, there are two solutions I use:

  1. When all code changes are made on my laptop, I use the rsync command to copy all files and folders apart from the virtual environment onto the Pi, with a command similar to rsync -r face_tracker pi@192.168.0.55:Desktop/ --exclude=venv.
  2. Sometimes, for the sake of speed, I will edit code straight from the Pi in which case I'll use Git as normal to synchronise code between the laptop and Pi.

Testing The Hardware

By this time it's clear that the networking chip works, the USB ports work, the SD card is in working order, and the device powers up nicely. The key things remaining to test are the servos, the camera, and the NeoPixel strip. There are two useful Python libraries here: pantilthat, the official module for the Pan Tilt Hat, and picamera, which controls the camera. Note that before using the camera, you need to enable the camera function in the config options as above by selecting Interfacing Options > Camera > Yes.
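
As a quick sanity check that the camera works, a minimal picamera snippet (the output path is an arbitrary choice of mine) captures a still image:

    from time import sleep

    from picamera import PiCamera

    camera = PiCamera()
    camera.start_preview()
    sleep(2)  # give the sensor time to adjust to the light level
    camera.capture('/home/pi/Desktop/test.jpg')
    camera.stop_preview()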

There are a few Python scripts that come with these libraries that let you test functionality, but it's fairly straightforward to understand the handful of functions that control the hardware. One thing to note is that I had to manually adjust the servo minimum and maximum pulse attributes (servo1_min, servo1_max, and so on) from their default values to avoid the motors jamming and quickly wearing the components out.

    import math
    import time

    from pantilthat import PanTilt

    def test_motion():
        """
        Test the full range of motion of the servos:
        start in the middle, sweep a u shape to the top right,
        go straight down, then trace a figure of eight back to the middle.
        """
        # servo pulse limits adjusted from the defaults to stop the motors jamming
        aimer = PanTilt(
            servo1_min=745,
            servo1_max=2200,
            servo2_min=580,
            servo2_max=1910
        )
        aimer.pan(0)
        aimer.tilt(0)
        time.sleep(1)

        # u shape to the top right
        for t in range(0, 90):
            x = t
            y = math.sin(math.radians(t)) * 90
            aimer.pan(x)
            aimer.tilt(y)
            time.sleep(0.01)

        # straight down
        for t in range(90, -90, -1):
            x = 90
            y = t
            aimer.pan(x)
            aimer.tilt(y)
            time.sleep(0.01)

        # figure of eight back towards the middle
        for t in range(-90, 90):
            x = -t
            y = math.sin(math.radians(t)) * 90
            aimer.pan(x)
            aimer.tilt(y)
            time.sleep(0.01)

        for t in range(90, -90, -1):
            x = -90
            y = t
            aimer.pan(x)
            aimer.tilt(y)
            time.sleep(0.01)

        for t in range(-90, 0):
            x = t
            y = t
            aimer.pan(x)
            aimer.tilt(y)
            time.sleep(0.01)
    import pantilthat

    def test_lights():
        """Cycle the NeoPixel strip through a set of colours until interrupted."""
        pantilthat.light_mode(pantilthat.WS2812)
        pantilthat.light_type(pantilthat.GRBW)

        # (r, g, b, w) values to cycle through
        colours = [
            (0, 0, 0, 0),        # off
            (100, 0, 0, 0),      # red
            (0, 100, 0, 0),      # green
            (0, 0, 100, 0),      # blue
            (100, 100, 0, 0),    # yellow
            (0, 100, 100, 0),    # cyan
            (100, 0, 100, 0),    # magenta
            (100, 100, 100, 0),  # white
        ]

        try:
            while True:
                for r, g, b, w in colours:
                    pantilthat.set_all(r, g, b, w)
                    pantilthat.show()
                    time.sleep(0.5)
        except KeyboardInterrupt:
            # switch everything off on exit
            pantilthat.clear()
            pantilthat.show()

Lights Work

Movements Are Good


Aiming the Robot

The first task on the software side of this project is to make the Pi identify a face in an image frame and point at it.

This task breaks down into four key components:

  1. Detecting the centre point of a face in a frame
  2. Calculating the vector of movement the camera needs in order to point at the face
  3. Adjusting the servo motors with the movement vector to point at the face
  4. Enabling all of the above to occur simultaneously

Our setup has two servos: one that moves the camera along a horizontal axis (panning) and one that moves it along a vertical axis (tilting). This means steps two and three each run twice simultaneously, once per axis.

1. Detecting Faces

Face detection is a well-studied task, and the first line of attack for this vision task would normally be a deep learning implementation. However, a low-powered computer like the Pi heavily constrains the resources available, and since deep models are computationally expensive, especially for a real-time task, I decided to start with a less accurate approach.

    import cv2

    ## load model
    # haar_path = path to the cascade file downloaded from
    # https://github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_frontalface_default.xml
    self.detector = cv2.CascadeClassifier(haar_path)

    ## make inference
    objects = self.detector.detectMultiScale(
        image,
        flags=cv2.CASCADE_SCALE_IMAGE,
    )

I decided to use the method described in a previous article of mine, namely OpenCV's CascadeClassifier with a pre-trained Haar cascade; this needs far fewer resources and is still accurate enough for my purpose.

A problem I faced, however, was that the model sometimes identified a face when one did not exist, and this would cause many problems when adjusting the servos. I fixed this problem by extracting only the most likely face using the confidence scores the detector outputs.

    objects = self.detector.detectMultiScale3(
        image, 
        scaleFactor=1.05,
        minNeighbors=9, 
        minSize=(30, 30),
        flags=cv2.CASCADE_SCALE_IMAGE,
        outputRejectLevels=True
    )
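
Under the hood, detectMultiScale3 returns the bounding boxes together with their reject levels and confidence weights. A minimal sketch of keeping only the most confident detection (the unpacking and variable names here are my own) could be:

    import numpy as np

    boxes, reject_levels, level_weights = objects

    if len(boxes) > 0:
        # keep the detection with the highest confidence weight
        best = int(np.argmax(level_weights))
        x, y, w, h = boxes[best]
        face_centre = (x + w // 2, y + h // 2)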

If we wrap this code in a while loop that feeds the camera input into this function, we can return the relative location of a face as a two-dimensional vector. This can then be used to point the camera.
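
A rough sketch of that loop, using picamera's array interface to hand frames to OpenCV (the resolution and the shared-value updates are illustrative choices of mine, not the exact code from the repo):

    import cv2
    import picamera
    import picamera.array

    def face_location_loop(detector, obj_coord_X, obj_coord_Y):
        """Continuously update the shared face coordinates from the camera feed."""
        with picamera.PiCamera(resolution=(320, 240)) as camera:
            with picamera.array.PiRGBArray(camera) as stream:
                for _ in camera.capture_continuous(stream, format='bgr',
                                                   use_video_port=True):
                    gray = cv2.cvtColor(stream.array, cv2.COLOR_BGR2GRAY)
                    boxes = detector.detectMultiScale(
                        gray, flags=cv2.CASCADE_SCALE_IMAGE)
                    if len(boxes) > 0:
                        # take the first detection and store its centre point
                        x, y, w, h = boxes[0]
                        obj_coord_X.value = x + w // 2
                        obj_coord_Y.value = y + h // 2
                    stream.truncate(0)  # reset the buffer for the next frame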

2. Calculating the Vector of Movement

Now that we know the relative location of a face in a frame, we need to figure out how to point at it. This is done using a Proportional-Integral-Derivative (PID) controller, which is a control feedback loop. A control feedback loop is a mechanism where a system takes continuous measurements of some physical object and makes continuous adjustments to keep a measured variable within some set of values. Here, the measured variable is the error between where the camera is pointing and where it should be pointing. Reducing this error by adjusting the servos on our camera module means we end up aiming at the face.

Functionally, a PID controller takes as input a history of errors; in our task, the error is the difference between the current position of the camera and the position of the face along a single axis. It then returns a single number representing the magnitude of the mechanical correction that needs to be made. The output is determined by three terms:

  • Proportional: if the current error is large, the correction to be made will be large. In our case: if the camera is pointing far away from the face, we move the servo a proportionally large amount.
  • Integral: sums the historical error and forces the system to a steady state of no error.
  • Derivative: a measure of the rate of change of the error, used to dampen movement by responding quickly to rapid changes.

The formula for a PID calculation is:

PID Formula
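
In standard continuous-time form, with e(t) the error at time t and u(t) the correction applied:

    u(t) = k_P \, e(t) + k_I \int_0^t e(\tau) \, d\tau + k_D \, \frac{d e(t)}{d t}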


where kP, kI, and kD represent constants that should be set for the application. The goal of setting these constants is to ensure the system responds in the way you want; i.e. it is quick to react, there is no oscillation, and it handles changes well. When choosing these values myself, I spent significant time tuning them to eventually achieve the smoothest movement for my robot. The recommended method of tuning is:

  1. Set kI and kD to 0.
  2. Increase kP until the output (the camera) starts to oscillate, then set kP to half this value.
  3. Increase kI until any residual error is quickly rectified. Note that too high a kI will cause instability.
  4. Increase kD so that when the load is disturbed (the face moves in the frame), the system quickly corrects itself.

I followed this method for the pan and tilt servos separately. I found it helpful to visualise the error term over time to determine more accurately whether I was seeing incremental improvement when adjusting the constants. I aimed to reach a steady state quickly, with minimal oscillations along the way.

Example Error Term Chart for Pan Process
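
For concreteness, here is a minimal sketch of a discrete PID update, a simplified stand-in for the controller class in the repo (the class and attribute names are my own):

    import time

    class PID:
        """Minimal discrete PID controller."""

        def __init__(self, kP, kI, kD):
            self.kP, self.kI, self.kD = kP, kI, kD
            self.prev_time = time.time()
            self.prev_error = 0.0
            self.integral = 0.0

        def update(self, error):
            now = time.time()
            dt = now - self.prev_time

            # accumulate the integral term and estimate the derivative
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt if dt > 0 else 0.0

            self.prev_time = now
            self.prev_error = error

            # weighted sum of the three terms
            return (self.kP * error
                    + self.kI * self.integral
                    + self.kD * derivative)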

3. Adjusting the Servos

Thankfully this part is simpler. The Python module pantilthat has pan and tilt methods that take in an angle and adjust the servos.

Code snippets can be seen in the build section of this post.
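
As a hypothetical example (my own simplified version of the set_servos process used in the next section), a loop that continuously drives the servos from shared angle values could look like:

    import time

    import pantilthat

    def set_servos(pan_angle, tilt_angle):
        """Continuously point the camera using the latest PID outputs."""
        while True:
            # pantilthat accepts angles between -90 and 90 degrees
            pantilthat.pan(max(-90, min(90, pan_angle.value)))
            pantilthat.tilt(max(-90, min(90, tilt_angle.value)))
            time.sleep(0.01)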

4. Multiprocessing

The above steps have been described sequentially, but they should be executed simultaneously. To make this happen, we can use the multiprocessing Python module to initialise several Python processes that execute different commands at the same time. Since the different processes will need to talk to each other, namely to hand each other the variables for the location of the face and the results of the two PID functions, we use multiprocessing's Manager to share state between them. The code for wrapping up these steps is:

    from multiprocessing import Manager, Process

    # PiFaceDetector is the project's own wrapper class around the detector, PIDs, and servos
    pi_face_detector = PiFaceDetector(rpi=False)

    with Manager() as manager:
        print('Start Manager')

        # set integer values for the object's (x, y)-coordinates
        obj_coord_X = manager.Value("i", pi_face_detector.frame_center_X)
        obj_coord_Y = manager.Value("i", pi_face_detector.frame_center_Y)

        # pan and tilt values will be managed by independent PIDs
        pan_angle = manager.Value("i", 0)
        tilt_angle = manager.Value("i", 0)

        # initialise tuning data holders used to draw graphs
        tuning_time_data = manager.list()
        tuning_error_data = manager.list()
        tuning_angle_data = manager.list()

        print('Define process')
        # begin recording
        process_start_camera = Process(
            target=pi_face_detector.start_camera,
            args=(obj_coord_X, obj_coord_Y))

        # calculate PID outputs
        process_panning = Process(
            target=pi_face_detector.pan_pid_process,
            args=(pan_angle, obj_coord_X, tuning_time_data, tuning_error_data, tuning_angle_data))

        process_tilting = Process(
            target=pi_face_detector.tilt_pid_process,
            args=(tilt_angle, obj_coord_Y, tuning_time_data, tuning_error_data, tuning_angle_data))

        # move camera
        process_set_servos = Process(
            target=pi_face_detector.set_servos,
            args=(pan_angle, tilt_angle))

        # store graph data
        process_save_pan_tuning_process = Process(
            target=pi_face_detector.save_pan_tuning_process,
            args=(tuning_time_data, tuning_error_data, tuning_angle_data))

        process_save_tilt_tuning_process = Process(
            target=pi_face_detector.save_tilt_tuning_process,
            args=(tuning_time_data, tuning_error_data, tuning_angle_data))

        print('Start process.')
        process_start_camera.start()
        process_panning.start()
        process_tilting.start()
        process_set_servos.start()
        process_save_pan_tuning_process.start()
        process_save_tilt_tuning_process.start()

        print('Join process.')
        process_start_camera.join()
        process_panning.join()
        process_tilting.join()
        process_set_servos.join()
        process_save_pan_tuning_process.join()
        process_save_tilt_tuning_process.join()

This is a recording of the Pi with all of the above in action. The movement could do with a bit more tuning, but it's a great feeling to have got this far.

Tracking

The code can be accessed from my GitHub. Part two will focus on having the Pi recognise certain faces, respond to these faces in unique ways, and learn new faces on the fly. Stay tuned!