Keys to the kingdom: Proctortrack
These are my findings, and my take on a story that ConsumerReports.org covered here, relating to the Proctortrack breach and source code leak.
Huge thanks to Thomas Germain and Bill Fitzgerald at CR, and Erik!
Prologue
I occasionally look at @antiproprietary on Twitter to track leaks, and one day I noticed that the Proctortrack source code had been leaked. I wasn't really interested in it at the time, and I didn't have a Telegram account, so I let it pass.
Then, I heard of an incident where Proctortrack's front page was defaced and replaced with a rickroll, and emails with abusive language were sent out to a few students, in what appeared to be a security breach. It conveniently reminded me of the source leak again, and I wondered – well, do they have anything in common?
So I set out to try and get a copy of the code. I didn't have a Telegram account, but that wasn't a showstopper – someone in that group had linked their Telegram account to a public telemetry and advertising dashboard, and I got the sources from there.
Secrets
My first natural instinct was to look for secrets – and oh boy did I find them in spades. I think Patrick Jackson, CTO of Disconnect, put it best – Proctortrack's code was a ticking time bomb.
I ran a secret-scanning tool, TruffleHog3, and it came back with a report over 80 MB in size, containing credentials for everything from CloudFront (their CDN) to S3 (where all the information is stored) to LinkedIn (because you could link LinkedIn accounts, for some reason).
Quite literally, this source code alone possibly held the keys to the entire kingdom.
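For a sense of what a scan like this involves, here is a minimal sketch of the general idea: match known credential formats with a regex, and flag long quoted strings that look random. This is not TruffleHog3's actual implementation. The names `scan_text` and `shannon_entropy` and the entropy threshold of 4.0 are my own assumptions for illustration; the only outside fact it relies on is that long-term AWS access key IDs are 20 characters starting with `AKIA`.

```python
import math
import re
from collections import Counter

# Long-term AWS access key IDs are 20 characters and start with "AKIA".
AWS_KEY_ID_RE = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")
# NAME = "value" assignments whose quoted value is at least 20 characters long.
ASSIGNMENT_RE = re.compile(r"""(\w+)\s*=\s*["']([^"']{20,})["']""")


def shannon_entropy(text):
    """Bits of entropy per character; random-looking strings score higher."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_text(text, entropy_threshold=4.0):
    """Return (line number, kind, detail) tuples for suspicious lines.

    The threshold is an assumed value for illustration, not a tuned one.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID_RE.finditer(line):
            findings.append((lineno, "aws-access-key-id", match.group(1)))
        for match in ASSIGNMENT_RE.finditer(line):
            name, value = match.groups()
            if shannon_entropy(value) >= entropy_threshold:
                # Report the variable name only, so the report does not
                # repeat the secret itself.
                findings.append((lineno, "high-entropy-string", name))
    return findings
```

Run over every version of every file in a repository's history, even something this crude adds up quickly, which is part of why these reports get so large.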
In a config file that's in source control:
AWS_CLOUDFRONT_ID = "[REDACTED]"
AWS_CLOUDFRONT_KEY_ID = "[REDACTED]"
# ...
AWS_S3_ACCESS_KEY_ID = "[REDACTED]"
AWS_S3_SECRET_ACCESS_KEY = "[REDACTED]"
(all redactions above mine, and those are just a few)
I did not test any of these production credentials – doing so would probably be unethical and illegal – but I can assure you they exist and that they have every sign of being legitimate.
Those credentials and the information on hand would be enough to access every last bit of student data – from their biometrics to their exam recordings to their bedrooms.
Proctortrack claims the secrets were replaced before production, but as Jackson confirmed in the CR article, nothing in the code indicates that was the case, and it would not explain the breach that happened.
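For contrast, this is roughly what "replaced before production" tends to look like when it shows up in code: the config reads secrets from the environment at startup and refuses to run without them. This is a minimal sketch, not anything from the leaked source. `require_env`, `load_aws_settings` and `MissingSecretError` are hypothetical names of mine; the two variable names are reused from the config excerpt above.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_env(name):
    """Return the value of an environment variable, or fail loudly."""
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value


def load_aws_settings():
    """Build the S3 settings from the environment instead of source control."""
    return {
        "AWS_S3_ACCESS_KEY_ID": require_env("AWS_S3_ACCESS_KEY_ID"),
        "AWS_S3_SECRET_ACCESS_KEY": require_env("AWS_S3_SECRET_ACCESS_KEY"),
    }
```

With a pattern like this, the repository only ever contains the names of the secrets, never their values.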
Tracing the source's history, some of those credentials have existed unchanged for almost 6 years – which raises the question of whether any Proctortrack engineer could have accessed any student data too.
Code Smell
There's more wrong with the code than the secrets; what leaked was their entire history, not just the most current version. You find funny things, like “make tests pass” followed by tests simply being skipped or commented out wholesale, or fun things like:
def create_test(self):
    '''
    I am not even going to bother with this because our installation is broken and we can't create tests reliably
    '''
or else, commit logs like:
* Dockerfile
* fix Dockerfile
* fix Dockerfile
* fix Dockerfile
* facepalm
* facepalm again
Yikes. I guess I'm happy I'm still a student, not someone working on unmaintainable software (:
Bonus: Please do not include PII in version control
I managed to find a full name, a phone number, an address, and a date of birth sitting in the commit history. It's not in the latest snapshot, but it still leaked, and judging from information I found online, the address still appears to be valid. For obvious reasons I won't be saying whose it is or where to find it. If you make an ID verification endpoint, please do not put your own data in the comments as an example. Use some placeholder data instead.
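As a rough illustration of what placeholder data could look like, here is a sketch using values that cannot belong to a real person. Everything in it is made up by me: the `IdentityExample` type and its fields are hypothetical, not the shape of any real endpoint. It leans on two reserved ranges: `example.com` is set aside for documentation, and North American numbers 555-0100 through 555-0199 are set aside for fictional use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityExample:
    """Hypothetical shape of an ID verification payload, for docs and tests."""
    full_name: str
    phone: str
    email: str
    address: str
    date_of_birth: str


# Safe to paste into comments, docs and fixtures: none of this is a real person.
PLACEHOLDER_IDENTITY = IdentityExample(
    full_name="Jane Doe",
    phone="+1-202-555-0100",        # 555-0100..0199 is reserved for fiction
    email="jane.doe@example.com",   # example.com is reserved for documentation
    address="123 Example Street, Anytown",
    date_of_birth="1970-01-01",
)
```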
In addition, I found a .csv file that appears to be a list of prospective job applicants from years ago, still sitting in source control and complete with emails, names and phone numbers. Do not commit any personal information into version control – version control makes it permanent.
It bears repeating that once something makes it into version control, a permanent record is kept unless you rewrite the history and force-push, and even then every existing clone still has the old copy. You can find a lot of things that were meant to be hidden just by digging through VCS.
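To show how little digging it takes, here is a minimal sketch that asks git which commits, on any branch, added or removed a given string. It assumes `git` is installed and that `repo_path` points at a clone; the function names are hypothetical helpers of mine. The `-S` option of `git log` selects commits that change the number of occurrences of a string, and `--all` walks every ref rather than just the current branch.

```python
import subprocess


def build_history_search_command(term):
    """Build a git command listing commits that add or remove `term`."""
    return ["git", "log", "--all", "--oneline", f"-S{term}"]


def parse_oneline(output):
    """Extract abbreviated commit hashes from `git log --oneline` output."""
    return [line.split(" ", 1)[0] for line in output.splitlines() if line.strip()]


def commits_touching(term, repo_path="."):
    """Run the search in `repo_path` and return the matching commit hashes."""
    result = subprocess.run(
        build_history_search_command(term),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_oneline(result.stdout)
```

Something like `commits_touching("AWS_S3_SECRET_ACCESS_KEY", "/path/to/clone")` would list every commit where that name appeared or disappeared, including ones whose contents are long gone from the latest snapshot.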
Takeaways
When a data breach happens, it is most often students who pay the price. This kind of incompetence should never have been acceptable in the first place, much less in a situation where the dynamic of consent simply does not apply. Universities need to abandon these tools – they have issues of every kind, from flawed facial recognition to accessibility to students without reliable internet. Security issues are yet another reason why these tools are flawed.
The pandemic demands compassion from and for all of us. Choosing exam spyware is choosing violence against students.