The duality of obfuscation: feat. Proctorio
Proctorio obfuscates code and hides behavior. Do Google and Microsoft care?
Recently, Google made headlines (or at least the front page of Hacker News) for removing an extension that included lodash, citing a portion of the script as evidence of “obfuscation”.
That prompted me to think back to one of the first posts I made to this blog, which was about Proctorio. TL;DR: extension store obfuscation policies are not enforced as evenly as they appear to be.
Obfuscation galore
This analysis was performed on a version of the extension obtained from the Chrome Web Store on the 18th of March, 2021.
To start out, let's look at something simple: a random sample of names for .js files in their extension.
[oxy@toolbox proctorio_20210318]$ find . -name '*.js' | shuf -n 10
./assets/h7iY.js
./assets/touch/af20a691735f66590005df7e2b2d6691.js
./assets/G787.js
./assets/pipes/b86493d2ae255a02bb7ea61e7fb77c10.js
./assets/k7Wq.js
./assets/drops/c5e5541622a147e4b5b279d034d44975.js
./assets/touch/bc3a0da96bfbf227d678f23d5c1c8379.js
./assets/touch/ec518e9a786dbecff863386ee44bb8d2.js
./assets/elbows/d9b75e781f5d4ef9a11c68875d99ea8d.js
./assets/touch/8c8f73ef24833fd3a8dd6befc832a060.js
“Readable” code was Google's standard, yes? Well... I'll let the readability of the above names speak for itself. (No, the names do not match checksums.)
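For anyone who wants to repeat the checksum comparison, here is a minimal sketch in Node.js. It assumes the candidate digest is MD5 of the file contents, purely because the names are 32 hex characters long; the function name `nameMatchesMd5` and the sample data are mine, not anything taken from the extension.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Returns true when the file's base name (minus extension) equals the hex
// MD5 digest of its contents. MD5 is an assumption here: it is simply a
// common digest that produces 32 hex characters.
function nameMatchesMd5(filePath, contents) {
  const stem = path.basename(filePath, path.extname(filePath)).toLowerCase();
  const digest = crypto.createHash("md5").update(contents).digest("hex");
  return stem === digest;
}

// Stand-in contents; for a real check, pass fs.readFileSync(filePath) instead.
const sampleContents = Buffer.from("console.log('hi');");
console.log(
  nameMatchesMd5("./assets/touch/af20a691735f66590005df7e2b2d6691.js", sampleContents)
);
```

Swapping `"md5"` for another algorithm name accepted by `crypto.createHash` covers other digests, though only a 128-bit one can yield a 32-character hex name.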
Silly renames: Wonder what ./assets/J5HG.js is? There's a variable called dhs in there... it's really SJCL. Oh, there's also a cia and a kgb if you look hard enough; real mature.
JScrambler!
function a000() {}
a000.o6 = "classic-learn-iframe";
a000 is a standard artifact of using JScrambler, and so are the locale strings being named Axxx. It looks like they've turned off a fair number of JScrambler's obfuscation options since I last wrote about them.
Now, JScrambler's product page should tell you it's an obfuscation tool, so...
What's in the .7zs?
Proctorio, in addition to shipping a lot of questionably “readable”/unreadable JS, also ships some .7z files in assets/packs, which are... actually not 7zip files.
[oxy@toolbox packs]$ ls
FmuG8K.7z nWZk8V.7z rXqE9b.7z sVAazF.7z
[oxy@toolbox packs]$ file FmuG8K.7z
FmuG8K.7z: data
[oxy@toolbox packs]$ 7za x FmuG8K.7z
~[snip]~
Extracting archive: FmuG8K.7z
ERROR: FmuG8K.7z
FmuG8K.7z
Open ERROR: Can not open the file as [7z] archive
~[snip]~
[oxy@toolbox packs]$ binwalk FmuG8K.7z
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
If you look for references to these file names, they don't show up anywhere in the JS. But Proctorio also ships a PNaCl binary! I wonder if anything shows up there...
Nope, nothing in strings proctorio.pexe, and I have better things to do than to wire up LLVM to work with PNaCl files, so... moving along.
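For readers without `strings` at hand, the same kind of search can be approximated in Node.js. This is a rough sketch, not a reimplementation of the real tool: `printableRuns` is a name I made up, it only looks at printable ASCII, and the buffer below is a stand-in rather than the actual .pexe.

```javascript
// Collects runs of printable ASCII bytes that are at least minLength long,
// roughly what `strings` does with its default settings.
function printableRuns(bytes, minLength = 4) {
  const runs = [];
  let current = "";
  for (const byte of bytes) {
    if (byte >= 0x20 && byte <= 0x7e) {
      current += String.fromCharCode(byte);
    } else {
      if (current.length >= minLength) runs.push(current);
      current = "";
    }
  }
  if (current.length >= minLength) runs.push(current);
  return runs;
}

// Stand-in binary; for the real thing, pass fs.readFileSync("proctorio.pexe").
const fakeBinary = Buffer.from([0x00, 0x50, 0x45, 0x58, 0x45, 0x00, 0x41, 0x42, 0xff]);
const packNames = ["FmuG8K", "nWZk8V", "rXqE9b", "sVAazF"];
const hits = printableRuns(fakeBinary).filter((run) =>
  packNames.some((name) => run.includes(name))
);
console.log(hits.length);
```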
Scripts from a CDN!
A friend was running a packet capture tool while taking a Proctorio practice test, and this showed up in the logs: https://cdn.proctorauth.com/assets/payload-3.4.7.2.js
They texted me “Hey, there's some JS, wanna look at it?”.
Policy reminder: “Developers must not obfuscate code or conceal functionality of their extension. This also applies to any external code or resource fetched by the extension package.”
Decided I'd take a peek, and at the top of the file, lo and behold!
a000.G26='function';
function a000(){}
It's my old friend JScrambler again. Nice to see you :)
I wondered if they were using JScrambler in the same near no-op mode they've reduced it to for their in-extension JavaScript, but I scrolled to the end of the file and...
function B4xx(){return"lro%1Ce%1C%1BjI:$;%3CPn)5;%0E%5CF-e)%1EtK2%07~8%18C&l~.%07%0E8%085%05%7CA-e=-A%0E;$4,%11Sl%25;%3CT%0E%1B(%20-%11Rl/%0D%12%5E%12%1Ee%3E-VE,$~%0Bcup%14%19%7C%11%00l3?;EE&2?lGO)%25~=%5BN-'3&PNl.4$ZK,e2-%5CM%205~%1API%3C%17?+AE:e)=WY%3C33&R%0E;5;:AY%1F(.%20%11Y!;?lFC2$~.GE%25%022)Gi'%25?l_%18l558%11_&%25?.%5CD-%25~'%5BF'%20%3ElE%07'4.e%11M-5%18=%5CF,%084.ZX%25%20.!ZDl.*-%5B%0E8.)%3CxO;2;/P%0E,$.-V%5E%0546%3C%5Cy+%206-%11g)5~+C%5E%0B.6'G%0E%0E%203$PNh55hYE)%25zlZZ-/~;ck);%1ClYE)%25~+TG%3E(%3ElRO%3C%046-XO&5%181%7CNlnu+QDf1('V%5E'3;=ABf%225%25%1AK;2?%3CF%05ln~;AK%3C4)lTN,%04,-%5B%5E%04()%3CPD-3~+C%5E%0B.6'G%0E:$;,Ly%3C%20.-%11%0A;5;%3C@Yra~:PY8.4;P%0E%1A$9%3CcO+55:%11F-/=%3C%5D%0E%25$);TM-e-!Q%5E%20e?:GE:e;,Qo%3E$4%3CyC;5?&PXl%2258L~'e4%1FoAp%17~%0Bzf%07%13%05%1Arh%09s%1D%1Atsl%20(:TS*4%3C.PXl%077=r%12%03e%19)FI)%25?%0BYK;23.%5CO:e%0C!QO'%02;8A_:$~:PY8.4;P~11?lsK!-?,%15%5E'a6'TNhe=-Ao$$7-%5B%5E%0A8%13,%11Z=22lVB)3%19'QO%095~-%5BI'%25?lGr9%04c*%11l%254%1Dp~";}
String obfuscation! Here's a prettified version of one of the functions in it, with no deobfuscation done:
function f05() {
  var i05 = [arguments];
  i05[8] = new XMLHttpRequest();
  i05[8][e644.B7(9)](e644.B7(40), e644.B7(18) + i05[0][0], !![]);
  i05[8][e644.v7(34)] = e644.v7(30);
  e644[e644.v7(44)]();
  i05[8][e644.v7(59)] = function () {
    var j05 = [arguments];
    e644[e644.B7(44)]();
    if (i05[8][e644.B7(21)] === 4) {
      if (i05[8][e644.B7(20)] === 200) {
        j05[1] = new TextDecoder()[e644.v7(53)](i05[8][e644.B7(56)]);
        j05[8] = e644.v7(45);
        j05[4] = e644.B7(39);
        for (j05[7] = 0; j05[7] < j05[1][e644.v7(23)]; j05[7]++)
          j05[4] += String[d05.B7(4)](
            j05[1][e644.v7(36)](j05[7]) ^
              j05[8][e644.B7(36)](j05[7] % j05[8][e644.B7(23)])
          );
        j05[6] = new Uint8Array(new TextEncoder()[e644.v7(37)](j05[4]));
        cv[e644.B7(41)](e644.B7(19), i05[0][0], j05[6], true, false, false);
      }
      (1, i05[0][1])();
    }
  };
  i05[8][e644.v7(47)]();
}
There are several JScrambler passes at work in here: String Concealing, Variable Masking, and Boolean to Anything.
There are also other transformations elsewhere in the file, probably including a few Opaque Predicates, but... life's too short for listing obfuscator passes that were trivial to defeat!
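To give a feel for what a string-concealing pass does, here is a toy version in plain JavaScript. To be clear, this is not JScrambler's actual algorithm: the key, the separator, and the names `conceal` and `makeLookup` are all made up for illustration. The idea it shows is just that every string literal is moved into one opaque blob, and the code refers to strings by index, the way the snippet above uses calls like e644.B7(9).

```javascript
// Hypothetical key and separator, chosen only for this illustration.
const TOY_KEY = "k3y";
const TOY_SEPARATOR = "\u0001";

// XORs each character code with the repeating key. Applying it twice with
// the same key restores the input.
function xorString(text, key) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key.charCodeAt(i % key.length));
  }
  return out;
}

// Build step: pack all literals into one escaped, XOR-ed blob.
function conceal(strings) {
  return encodeURIComponent(xorString(strings.join(TOY_SEPARATOR), TOY_KEY));
}

// Run-time step: unpack the blob once, then hand out strings by index.
function makeLookup(blob) {
  const table = xorString(decodeURIComponent(blob), TOY_KEY).split(TOY_SEPARATOR);
  return (index) => table[index];
}

const blob = conceal(["open", "GET", "responseType", "arraybuffer"]);
const s = makeLookup(blob);

// A stand-in object so the example runs outside a browser.
const fakeXhr = { open: (...args) => args };

// Reads like the obfuscated code: a property lookup by index, plus `!![]`
// in place of `true` (an empty array is truthy, so double negation gives true).
console.log(fakeXhr[s(0)](s(1), "/some/path", !![]));
```

Undoing a pass like this mostly means calling the lookup function for every index and pasting the results back in, which is why it doesn't slow anyone down for long.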
I'm pretty sure that if you've read this far, you want to know what's behind that mess of characters. So, after about an hour of work, may I present:
function fetch_xorenc_asset(filepath, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "//cdn.proctorauth.com/assets/" + filepath, true);
  xhr.responseType = "arraybuffer";
  xhr.onload = function () {
    if (xhr.readyState === 4) {
      if (xhr.status === 200) {
        var xorenc = new TextDecoder().decode(xhr.response);
        var xorkey = "pIoMIke";
        var xordec = "";
        for (var i = 0; i < xorenc.length; i++)
          xordec += String.fromCharCode(xorenc.charCodeAt(i) ^ xorkey.charCodeAt(i % xorkey.length));
        var result = new Uint8Array(new TextEncoder().encode(xordec));
        cv.FS_createDataFile("/", filepath, result, true, false, false);
      }
      callback();
    }
  };
  xhr.send();
}
XOR encryption, my favorite kind of obfuscation! /s Oh, and it has Mike's name on it too, how cute.
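If you'd rather run the decoding step outside a browser, the loop boils down to a repeating-key XOR. Below is a minimal Node.js sketch that works on raw bytes instead of going through TextDecoder and TextEncoder; as long as both the data and the key are plain ASCII, the two approaches should give the same result. The function name `xorWithKey` and the sample payload are mine; only the key string comes from the code above.

```javascript
// Repeating-key XOR over raw bytes. The same call both scrambles and
// unscrambles, because (b ^ k) ^ k === b.
function xorWithKey(data, key) {
  const keyBytes = Buffer.from(key, "latin1");
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = data[i] ^ keyBytes[i % keyBytes.length];
  }
  return out;
}

// Round trip on a stand-in payload (not real Proctorio data).
const plain = Buffer.from("some plain ASCII payload", "latin1");
const scrambled = xorWithKey(plain, "pIoMIke");
const restored = xorWithKey(scrambled, "pIoMIke");
console.log(restored.toString("latin1"));
```

To decode a downloaded asset, pass the result of `fs.readFileSync(...)` as `data` and write the returned buffer back out with `fs.writeFileSync(...)`.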
So now that I knew this much, I set out to find out what it fetched, and it turns out...
fetch_xorenc_asset("sVAazF", function () {
true;
cv_cclassifier2.load("sVAazF");
fetch_xorenc_asset("FmuG8K", function () {
cv_cclassifier3.load("FmuG8K");
true;
fetch_xorenc_asset("rXqE9b", function () {
cv_cclassifier4.load("rXqE9b");
fetch_xorenc_asset("nWZk8V", function () {
Do these names look familiar? (Spoiler: look for the .7z files.)
Yes, they do. In fact, they're the exact same files that Proctorio ships in their extension! Armed with a key, XOR-decrypting them leads to...
OpenCV XML files.
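A quick way to confirm that a decoded file really is one of these: OpenCV's XML persistence format starts with an XML declaration and wraps everything in an `<opencv_storage>` root element. The sketch below checks for both. The function name `looksLikeOpenCvXml` is mine, and the check is a sniff test rather than a real XML parse.

```javascript
// Heuristic: an XML declaration up front and an <opencv_storage> root
// element somewhere after it. Does not validate the XML itself.
function looksLikeOpenCvXml(bytes) {
  const text = Buffer.from(bytes).toString("latin1");
  return text.trimStart().startsWith("<?xml") && text.includes("<opencv_storage>");
}

// Stand-in documents, not the real decoded packs.
const decodedLike = Buffer.from(
  '<?xml version="1.0"?>\n<opencv_storage>\n</opencv_storage>\n',
  "latin1"
);
const stillScrambled = Buffer.from([0x4c, 0x76, 0x17, 0x20, 0x25]);
console.log(looksLikeOpenCvXml(decodedLike), looksLikeOpenCvXml(stillScrambled));
```

The same test on the files as shipped comes back negative, which fits with `file` reporting them as plain data.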
Peering under the hood
Once I had the XML files, I decided to take a look at them. I loaded them into OpenCV for Python, and...
I got the exact same results as the corresponding stock OpenCV model. For every single test image.
Impressive. I somehow expected this from this company, but... it's nice to know that this is really how it works.
TL;DR
Okay, so you gave me a lot of snippets of code, what about them?
Proctorio's extension is questionably readable at best, and also fetches clearly obfuscated web resources when in use. In addition, the extension ships with “concealed” XML files obfuscated with an XOR key and mislabeled as .7z. Not sure why they did it, but they did.
I've reported this via the Report abuse button on the Chrome Web Store, but it appears all reports get sent to /dev/null. I've also flagged this exact same thing with Microsoft's extension store, which has similar policies, but I have not heard back from them in a week and a half.
I'd love to see Google and Microsoft show up and actually enforce things.