Attendance is a sticky problem, especially when the teacher is the one missing classes. During my elementary school days in Ethiopia, our principal had pretty much solved the problem. By widely accepted convention, teachers were allowed to whip their students if necessary. If you told this to your parents, the best you could expect was 'I am sure you deserved it.' The worst? They might ask what you did, and give you some more whipping for it. So, in those days, being late meant a painful morning, and most everybody was on time unless there was a serious problem.
Fast forward two decades to Rajasthan, India. A non-profit organization called Seva Mandir was having an interesting problem with attendance. Seva Mandir runs small 'schools' in over 100 small villages in northern India. The goal is to educate kids in grades 1-4 near their village until they are old enough to walk to the government school, perhaps in the next village. Since the number of kids is usually fewer than 30 or so, they often have only one or two teachers per 'school'. And their attendance problem was not with the kids; rather, it was the teachers who were not showing up. Quite often.
They had a clever solution for this. Equip every school with a camera, and require teachers to take a picture of themselves with the students: one in the morning and one in the afternoon. This went on every day of the week, and at the end of the month, someone from the school ferried the camera's memory card to the headquarters in Udaipur and grabbed a fresh one. At the headquarters, 4 people went through every picture taken in every one of their schools, and used the timestamps to make sure the teachers had a shot for every morning and afternoon. Their task was slightly simplified by some help from MIT, who built them software that classified pictures by date. Still, employees had to go through nearly 20,000 pictures in every pay period, and this was taking over 100 man-hours. In addition, for an even slightly motivated teacher, beating the system simply meant altering the EXIF data on the pictures.
When I got to Microsoft Research India in the winter of 2011, I heard about this organization through Saurabh Panjwani, who had done a field visit up north in the fall, and it struck me as an interesting problem. So, one of my projects while in Bangalore was to design a simpler system that would streamline attendance tracking. An attendance system has to do three things: verify identity, verify location and verify time.
Enter Hyke: an attendance tracking system designed for these environments. As you might have heard, mobile phones in India have taken off, and cellular coverage was available at over 80% of the Seva Mandir schools. Hyke uses mobile phones as the authentication platform and builds on open-source voice biometrics technology in combination with off-the-shelf location tagging tools. At a very basic level, when the teacher initiates the system for attendance recording, we first obtain a fresh location reading (either through GPS or a nearby RFID tag). This is followed by the generation of a one-time passcode, which is given to the teacher. At this point, the teacher calls an attendance hotline and reads out the passcode. Speaker recognition identifies the user, speech recognition verifies that the passcode is fresh, and the attendance is then recorded.
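To make the flow above concrete, here is a minimal sketch of how the three checks (identity, location, time) might be combined once the hotline call is over. This is not Hyke's actual code: the names `CallResult` and `verify_attendance`, the score threshold and the passcode lifetime are all assumptions made for illustration, and the speaker score and recognized digits are assumed to come from whatever recognition backend is in use.

```python
from dataclasses import dataclass


@dataclass
class CallResult:
    """What the hotline is assumed to extract from one call (hypothetical)."""
    speaker_score: float     # similarity between caller and enrolled teacher
    recognized_digits: str   # digits the speech recognizer heard


def verify_attendance(call, issued_passcode, passcode_age_s, location_ok,
                      score_threshold=0.5, max_age_s=300):
    """Return True only if identity, location and time all check out.

    score_threshold and max_age_s are illustrative values, not Hyke's.
    """
    # Identity: the caller's voice must match the enrolled teacher.
    identity_ok = call.speaker_score >= score_threshold
    # Time: the spoken digits must match a passcode that was issued recently.
    time_ok = (call.recognized_digits == issued_passcode
               and passcode_age_s <= max_age_s)
    # Location: assumed to be verified earlier, when the passcode was issued.
    return identity_ok and location_ok and time_ok


if __name__ == "__main__":
    call = CallResult(speaker_score=0.8, recognized_digits="4821")
    print(verify_attendance(call, "4821", passcode_age_s=60, location_ok=True))
```

A record is accepted only when all three checks pass, so a correct passcode read by the wrong voice, or by the right voice from the wrong place, is rejected.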
Hyke has several advantages over prior systems, and particularly targets environments such as Seva Mandir's. Besides reducing the cost of operation, it offers the possibility of tracking attendance without trusted administrative staff on-site, because both the location and the timestamp of each attendance record are generated automatically. Another advantage lies in its use of voice as a user biometric: voice is generally regarded as a less invasive and less privacy-sensitive biometric than fingerprints or pictures. Hyke uses widely available cellular networks with voice and SMS channels for communication.
Most of my time was spent working with Mistral, a voice biometric stack from Avignon University in France. Mistral is a set of tools, mostly written in C++, that lets you do feature vector comparisons on biometric data. It was rather hard to use, since it strongly assumes that its users are experts in the area. So I decided to build a wrapper around Mistral in Python that would make it easy for mere mortals like me to simply drop it into their projects. I also incorporated SPro, a tool for converting voice recordings into feature vectors that can be processed by libraries like Mistral. The Python wrapper is open-sourced and is available here.
Evaluating the biometric stack was an interesting problem in its own right. We needed a range of voices collected over the telephone, preferably from the target population. So, I built an IVR system for collecting voice data, allowing users to call a local number from their phone and record their voices. We then posted tasks on Amazon's Mechanical Turk, where we limited participation to workers located in India. We provided workers with the local phone number and a set of lines to read out to the IVR. The lines to be read consisted of randomly generated digits of various lengths. Since the Hyke system needs to verify identity based on text-independent, limited-vocabulary (digit) passcodes, our data collection also focused on this segment. Our experiments with Indian speakers using audio collected over the telephone show error rates below 5%, providing sufficient accuracy for most applications.
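As a rough illustration of the evaluation setup described above, the sketch below generates random digit prompts of varying length and computes two common error rates from verification scores. The function names, the prompt lengths, the threshold and the sample scores are all made up for the example. The post does not say which error metric was reported, so false accept and false reject rates are shown here only as one plausible way to measure it.

```python
import random


def make_prompts(count, min_len, max_len, seed=None):
    """Generate `count` random digit strings with lengths in [min_len, max_len]."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(count):
        length = rng.randint(min_len, max_len)  # inclusive on both ends
        prompts.append("".join(rng.choice("0123456789") for _ in range(length)))
    return prompts


def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at a given threshold.

    genuine_scores: scores for trials where the caller is the claimed speaker.
    impostor_scores: scores for trials where the caller is someone else.
    A trial is accepted when its score is >= threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


if __name__ == "__main__":
    print(make_prompts(3, min_len=4, max_len=8, seed=1))
    # Made-up scores, only to show the calculation.
    far, frr = error_rates([0.9, 0.8, 0.4, 0.7], [0.1, 0.6, 0.2, 0.3], 0.5)
    print(f"FAR={far:.2f} FRR={frr:.2f}")
```

Raising the threshold trades false accepts for false rejects, so a single "error rate" figure usually refers to one chosen operating point.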
Another interesting part of designing the system was thinking through the security implications of tracking attendance in the absence of onsite administrative staff, or principals intent on whipping latecomers. Some threats to the model include conference calling, pre-recorded voice samples, and replacing the location tagging with a separate component. There are several mechanisms built into Hyke to prevent these attacks. For example, passcodes have to be freshly generated by the server and delivered to the user over SMS, and location readings from the designated attendance phone are verified when these codes are generated. A paper describing this work was published at the 5th annual Networked Systems for Developing Regions workshop (NSDR '12), and you can read it in its entirety here.
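The passcode mechanism can be sketched roughly as follows. This is an assumption-laden illustration rather than Hyke's implementation: the class name `PasscodeIssuer`, the 6-digit length, the 120-second lifetime and the single-use rule are all choices made for the example, and the SMS delivery and location verification steps are represented only by a boolean flag and a return value.

```python
import secrets


class PasscodeIssuer:
    """Hypothetical server-side store of fresh, single-use passcodes."""

    def __init__(self, ttl_seconds=120, digits=6):
        self.ttl_seconds = ttl_seconds   # how long a code stays valid
        self.digits = digits
        self._pending = {}               # phone_id -> (passcode, issued_at)

    def issue(self, phone_id, location_verified, now):
        """Issue a code only if the attendance phone's location checked out."""
        if not location_verified:
            return None
        code = "".join(secrets.choice("0123456789") for _ in range(self.digits))
        self._pending[phone_id] = (code, now)  # replaces any older code
        return code                            # would be sent over SMS

    def redeem(self, phone_id, spoken_digits, now):
        """Accept the spoken digits once, and only while the code is fresh."""
        # pop() makes the code single-use: any attempt, right or wrong,
        # consumes it, so a recording of the call cannot be replayed later.
        entry = self._pending.pop(phone_id, None)
        if entry is None:
            return False
        code, issued_at = entry
        fresh = (now - issued_at) <= self.ttl_seconds
        return fresh and spoken_digits == code


if __name__ == "__main__":
    issuer = PasscodeIssuer()
    code = issuer.issue("school-17-phone", location_verified=True, now=0)
    print(issuer.redeem("school-17-phone", code, now=45))   # within lifetime
    print(issuer.redeem("school-17-phone", code, now=50))   # already used
```

Because the digits are random and short-lived, a voice sample recorded in advance cannot contain the right passcode, which is the point of requiring freshness.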