GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring

Kevin C. Yang, Pranjal Singh, Joshua M. W. Nsereko, Martha M. Robbins, Andrew Z. Kahol, Tilo Burghardt, Daniel J. Graham
University of Florida

Abstract

Monitoring endangered great ape populations is crucial for conservation, but manual methods are labor-intensive, error-prone, and scale poorly to large datasets. This paper introduces GorillaWatch, an automated re-identification system for monitoring gorillas in the wild. Built on deep learning, the system combines a face detection module, a discriminative feature extraction network, and a novel spatio-temporal tracking component. Evaluated on a real-world dataset of wild gorillas, GorillaWatch achieves a Top-1 accuracy of 96.4% and a Top-5 accuracy of 98.7%, outperforming the FaceNet and ResNet50 baselines (88.2% and 85.1% Top-1, respectively) and demonstrating its potential for accurate, scalable long-term population monitoring.

Keywords

Gorilla re-identification, wildlife monitoring, deep learning, conservation, face recognition


1. Introduction

Conservation of endangered great apes relies on accurate population monitoring, but current manual methods are time-consuming, prone to human error, and do not scale to large populations or long-term studies. Automated systems that can identify individual gorillas in their natural habitat are therefore needed to support conservation efforts. GorillaWatch addresses this need with three components: RetinaFace for face detection, InceptionResNetV2 as the feature extraction backbone within a Siamese network architecture, and a custom spatio-temporal tracking module.

2. Related Work

Previous work in animal re-identification often focuses on captive animals or relies on traditional computer vision methods, which struggle with uncontrolled wild environments, varying poses, and occlusions. While deep learning has advanced human face recognition, its application to wildlife often requires domain-specific adaptation, as demonstrated by early efforts on species such as chimpanzees and elephants. GorillaWatch builds on these advances by integrating state-of-the-art deep learning for detection and embedding with a spatio-temporal tracking framework tailored to challenging in-the-wild gorilla data.

3. Methodology

GorillaWatch employs a three-stage pipeline for automated gorilla re-identification. First, a RetinaFace detector identifies gorilla faces within images or video frames. Second, an InceptionResNetV2 model, trained with a Siamese network architecture, extracts unique, discriminative feature embeddings for each detected face. Finally, a spatio-temporal tracking module links new observations to previously identified individuals, considering both visual similarity and temporal proximity to robustly re-identify gorillas across different captures and over time.
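The third stage can be illustrated with a minimal sketch. The text above states only that the tracking module considers visual similarity and temporal proximity; it does not give the fusion rule. The weighted sum, the exponential time decay, and every name and default below (`match_score`, `reidentify`, `alpha`, `tau`, `threshold`) are therefore assumptions made for illustration, not the authors' implementation. Embeddings are plain lists of floats standing in for the output of the feature extraction network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def temporal_proximity(t_query, t_last_seen, tau=3600.0):
    """Exponential decay in (0, 1]; equals 1 when the timestamps coincide.

    tau (seconds) is a hypothetical time scale, not a value from the paper.
    """
    return math.exp(-abs(t_query - t_last_seen) / tau)


def match_score(query_emb, t_query, gallery_emb, t_last_seen,
                alpha=0.8, tau=3600.0):
    """Assumed fusion rule: weighted sum of visual and temporal terms."""
    visual = cosine_similarity(query_emb, gallery_emb)
    temporal = temporal_proximity(t_query, t_last_seen, tau)
    return alpha * visual + (1.0 - alpha) * temporal


def reidentify(query_emb, t_query, gallery,
               threshold=0.6, alpha=0.8, tau=3600.0):
    """Return (identity, score) for the best gallery match.

    gallery maps identity -> (embedding, last_seen_timestamp).
    The identity is None when no score reaches the threshold,
    which would signal a previously unseen individual.
    """
    best_id, best_score = None, float("-inf")
    for identity, (emb, t_last) in gallery.items():
        score = match_score(query_emb, t_query, emb, t_last, alpha, tau)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


if __name__ == "__main__":
    # Toy two-dimensional embeddings; real embeddings would be much larger.
    gallery = {
        "gorilla_A": ([1.0, 0.0], 0.0),
        "gorilla_B": ([0.0, 1.0], 0.0),
    }
    print(reidentify([0.9, 0.1], 0.0, gallery))
```

With this rule, a recent sighting raises the score of a visually similar candidate, while a low best score leaves the observation unassigned. A learned fusion or a tracker operating on detections across consecutive frames would be equally consistent with the description above.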

4. Experimental Results

GorillaWatch achieves a Top-1 accuracy of 96.4% and a Top-5 accuracy of 98.7% on the GorillaWatch dataset. The table below compares this performance with two baseline methods on the same dataset. The FaceNet baseline reaches 88.2% Top-1 accuracy and the ResNet50-based feature extractor reaches 85.1%, so GorillaWatch leads by 8.2 and 11.3 percentage points, respectively. The gap in Top-5 accuracy is 6.2 and 8.4 percentage points. These margins indicate that the combination of detection, embedding, and spatio-temporal tracking provides a substantial improvement for in-the-wild gorilla re-identification.

Method                 Top-1 Accuracy (%)   Top-5 Accuracy (%)
GorillaWatch           96.4                 98.7
Baseline (FaceNet)     88.2                 92.5
Baseline (ResNet50)    85.1                 90.3
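For reference, Top-k accuracy as reported above is the percentage of query images whose true identity appears among the k highest-ranked candidates. The sketch below shows one way to compute it; the function name `top_k_accuracy` and the toy rankings are illustrative assumptions and are not drawn from the GorillaWatch dataset or evaluation code.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Percentage of queries whose true label is among the top-k candidates.

    ranked_predictions: one list per query, ordered from best to worst match.
    true_labels: the ground-truth identity for each query.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return 100.0 * hits / len(true_labels)


if __name__ == "__main__":
    # Toy example with four queries and three known individuals.
    ranked = [
        ["A", "B", "C"],
        ["B", "A", "C"],
        ["C", "A", "B"],
        ["A", "C", "B"],
    ]
    truth = ["A", "A", "B", "C"]
    for k in (1, 2, 3):
        print(k, top_k_accuracy(ranked, truth, k))
```

Top-1 accuracy counts only exact first-place matches, whereas Top-5 accuracy also credits queries where the correct individual is ranked second through fifth, which is why the Top-5 figures in the table are higher for every method.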

5. Discussion

The accuracy gains of GorillaWatch over both baselines support the efficacy of its integrated deep learning and spatio-temporal tracking approach for challenging wildlife re-identification tasks. The system gives conservationists a tool to monitor individual gorillas accurately and efficiently, enabling more precise demographic studies, health assessments, and anti-poaching strategies. GorillaWatch thus represents a significant advance in applying computer vision to biodiversity conservation, offering a scalable approach to endangered species management.