In my previous post, I explained the tricks attackers use to obfuscate JavaScript code and how to analyze such obfuscated code. Now I want to expand on that.
Points to remember:
1. JavaScript uses an object-oriented programming model. SpiderMonkey is a standalone JavaScript interpreter, so it can run obfuscated JavaScript, but it cannot interpret browser objects such as document.
Because of the object model, we can define or redefine objects such as document ourselves.
For example:
function decrypt(encoded_string){
....
....
....
document.write(t);
}
Now, you need to define the document object before running this script in SpiderMonkey.
You can also use Rhino, with the -f option selecting both the rules file that defines or redefines the objects and the file containing the JavaScript function:
$ rhino -f rule_file -f javascript_file
Put rule_file before javascript_file so that its definitions are loaded first and used by javascript_file.
Similarly, an attacker may redefine the print function as document.write so that the code is executed instead of being printed on screen.
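As a rough illustration, a rule file could look something like the sketch below. The names captured and emit are my own and not part of any tool, and the only assumption about the sample is that it calls document.write. The stub falls back to console.log when the shell has no print function.

```javascript
// rule_file.js (hypothetical): load this before the obfuscated sample.
var captured = [];   // every string the sample passes to document.write

// SpiderMonkey and Rhino shells provide print(); fall back to console.log elsewhere.
var emit = (typeof print === "function") ? print : console.log;

// Minimal stand-in for the browser's document object.
var document = {
  write: function (s) {
    captured.push(String(s));
    emit("[document.write] " + s);
  }
};
```

With this loaded first, the decrypted text is printed and kept in captured instead of being rendered by a browser.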
2. Attackers use various techniques to make analysis difficult:
For example: encoding, splitting the script into multiple script tags, inserting the script into an HTML element and reading it from JavaScript code, using one-line conditional statements, evaluating the key along with the malicious JavaScript code using the eval function, calling browser objects from JavaScript code, using the replace function, and using browser-based properties to compute the key.
Basic Concatenation technique:
For example:
fr="f"+"r"+"o"+"m"+"C";
fr+=(fr)?"ha"+"rCode":"";
Hence,
fr="fromCharCode"
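To see why the name is rebuilt this way, here is a small sketch that runs the two lines and then uses the result as a property key. The variable decoded and the chosen char codes are only for illustration.

```javascript
var fr = "f" + "r" + "o" + "m" + "C";
fr += (fr) ? "ha" + "rCode" : "";      // fr is non-empty, so the first branch is taken

// The rebuilt name is then used as a property key on String.
var decoded = String[fr](101, 118, 97, 108);
console.log(fr, decoded);              // fromCharCode eval
```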
Representing string as Char Codes:
Char codes can be written in different bases, i.e. base 10 (decimal), base 8 (octal), or base 16 (hexadecimal). The following function can be used to obfuscate strings:
String.fromCharCode(x)
This JavaScript function returns the character corresponding to the ASCII code x.
For example:
String.fromCharCode(0157,112,0145,114,97); corresponds to "opera"
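The same word can be checked in an interpreter with the sketch below. Note that the legacy octal form (0157) only parses in non-strict code, so this version uses the 0o prefix instead; the variable name word is mine.

```javascript
// 0o157 and 0o145 are octal, 0x70 is hexadecimal, the rest are decimal.
var word = String.fromCharCode(0o157, 0x70, 0o145, 114, 97);
console.log(word);   // opera
```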
Encoding Technique:
Encoding is used for simple obfuscation of JavaScript. Although encoded JavaScript is difficult for a human to read, the browser parses it properly and executes the code without any problems. For example, the character "K" can be represented by its decimal, octal, hexadecimal, or Unicode value:
K = 75 (decimal) = \113 = \x4B = %4B = %u004B = \u004B
Role of unescape() function:
The unescape function is commonly used in obfuscated JavaScript code. It decodes hexadecimal escape sequences into plain text. For example:
unescape("%4B") decodes into K
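A quick way to convince yourself that these forms are equivalent is to decode each one. The variable names are illustrative.

```javascript
var fromPercent = unescape("%4B");         // percent-encoded byte
var fromPercentU = unescape("%u004B");     // non-standard %uXXXX form, also accepted by unescape
var fromHexEscape = "\x4B";                // hexadecimal string escape
var fromUnicodeEscape = "\u004B";          // Unicode string escape

console.log(fromPercent, fromPercentU, fromHexEscape, fromUnicodeEscape);   // K K K K
```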
A special case of encoding is 8-bit ASCII encoding, which produces a file that a human cannot read. It simply sets the highest bit of every byte of the text to 1. The browser ignores this bit and decodes and executes the file properly, so you need a Python or Perl script to decode such a file.
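The decoding step itself is just a bit mask. To keep the examples in one language, here is the same idea sketched in JavaScript rather than Python or Perl; stripHighBit and the sample bytes are hypothetical.

```javascript
// Clear bit 7 of every byte to recover the original 7-bit ASCII text.
function stripHighBit(bytes) {
  var out = "";
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i] & 0x7F);
  }
  return out;
}

// "eval" with the high bit set on each byte: 0xE5 0xF6 0xE1 0xEC
var sample = [0xE5, 0xF6, 0xE1, 0xEC];
console.log(stripHighBit(sample));   // eval
```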
Using Array notation to call methods:
For example, one can write document.write(encoded_data) as follows:
document["write"](encoded_data);
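Bracket notation matters because the method name becomes an ordinary string that can itself be obfuscated. A minimal sketch, using a stand-in object named doc (my own name) instead of the real document:

```javascript
var out = [];
var doc = { write: function (s) { out.push(s); } };   // stand-in for document

var encoded_data = "payload";
var method = "wr" + "ite";      // the method name built by concatenation

doc["write"](encoded_data);     // same call as doc.write(encoded_data)
doc[method](encoded_data);      // same call again, with the name hidden
console.log(out.length);        // 2
```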
Assigning array to single dimension variable:
For example:
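XWAAZDD=(10,20,30,"eval");
The parentheses hold a comma expression rather than an array, so XWAAZDD ends up with only the last value, "eval".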
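To confirm what the interpreter actually stores, the assignment can be run on its own:

```javascript
var XWAAZDD = (10, 20, 30, "eval");
console.log(typeof XWAAZDD, XWAAZDD);   // string eval
```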
Replace function technique:
var NMeZD='kIkeIvIaMlMI'.replace(/[kIYM]/g,'');
The variable NMeZD will hold the string "eval", which can then be used to look up the eval function and run the malicious JavaScript.
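Here is a small sketch of how that string is typically turned back into a callable. In a browser the lookup would go through window; I use globalThis as an assumption so it also runs in a standalone interpreter, and the evaluated expression is deliberately harmless.

```javascript
var NMeZD = 'kIkeIvIaMlMI'.replace(/[kIYM]/g, '');
console.log(NMeZD);                 // eval

// Look the function up by name on the global object, then call it.
var fn = globalThis[NMeZD];
var result = fn("1 + 2");
console.log(result);                // 3
```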
One line conditional statement:
The following is a one-line conditional statement:
if (x>y?never_execute_result:always_true_result)
Statements like the one above are used extensively in JavaScript obfuscation.
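A tiny sketch of the idea, with values for x and y that I picked so that the condition never holds and the second operand is always chosen:

```javascript
var x = 1, y = 2;

// x > y is never true here, so the first operand is dead code.
var picked = (x > y) ? "never_execute_result" : "always_true_result";
console.log(picked);   // always_true_result
```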
Splitting script into multiple script tags:
The following structure makes analysis more difficult for SpiderMonkey or any other automated analysis tool. You need to intervene manually, remove the script tags, and combine the JavaScript code to analyze it properly.
<SCRIPT type="text/javascript" src="http://someplace.com/progs/vbcalc">
</SCRIPT>
<BODY>
<SCRIPT type="text/javascript">
...some JavaScript...
</SCRIPT>
</BODY>
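The manual step can be partly scripted. The sketch below is a hypothetical helper (extractInlineScripts is my own name) that pulls the inline script bodies out of a page and joins them. It uses a simple regular expression, so it is only a starting point and does not fetch external src files.

```javascript
// Collect the bodies of all inline <script> blocks and join them into one file.
function extractInlineScripts(html) {
  var re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  var parts = [];
  var m;
  while ((m = re.exec(html)) !== null) {
    var body = m[1].trim();
    if (body.length > 0) {
      parts.push(body);        // skip empty blocks, e.g. those that only have src
    }
  }
  return parts.join("\n");
}

var page = '<SCRIPT type="text/javascript" src="x.js">\n</SCRIPT>' +
           '<BODY><SCRIPT type="text/javascript">var a=1;</SCRIPT>' +
           '<script>var b=2;</script></BODY>';
console.log(extractInlineScripts(page));
```

The combined output can then be passed to the interpreter after the rule file.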
By raising exception:
Obfuscated code can also control execution through exceptions, similar to the SEH technique used in malicious executables.
For example:
fr="f"+"r"+"o"+"m"+"C";
try
{
Boolean(false)[p].google; /* invalid code, throws an exception which triggers code in catch() block */
}
catch(vb)
{
e=zz['e'+'v'+'al']; /* Basic String Concatenation */
fr+=(fr)?"ha"+"rCode":"";
ch+=(fr)?"odeAt":"";
r="replace";
}
fr+=(fr)?"ha"+"rCode":"";
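To watch the catch block do the real work, the fragment can be made runnable with a few assumptions that are not in the original sample: zz is taken to be the global object, ch is taken to hold "charC" already, and reachedCatch is a flag I added.

```javascript
var zz = globalThis;              // assumption: zz refers to the global object
var ch = "charC";                 // assumption: ch was initialised earlier in the sample
var fr = "f" + "r" + "o" + "m" + "C";
var e, r;
var reachedCatch = false;

try {
  Boolean(false)[p].google;       // p is never declared, so this line throws
} catch (vb) {
  reachedCatch = true;
  e = zz['e' + 'v' + 'al'];       // picks up eval by a concatenated name
  fr += (fr) ? "ha" + "rCode" : "";
  ch += (fr) ? "odeAt" : "";
  r = "replace";
}

console.log(reachedCatch, fr, ch, r);   // true fromCharCode charCodeAt replace
```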
By calling browser objects from javascript code:
Most tools, such as SpiderMonkey, Malzilla, Rhino, and jsunpack-n, handle only pure JavaScript; they don't handle HTML tags. But most scripts read JavaScript from inside HTML elements through browser objects to make analysis of the code difficult. For example:
<html>
<script>
xxwsx=document.getElementById("getScript").value;
</script>
<body>
<div id="getScript" value="javascript_code">contents</div>
</body>
</html>
In the above case, we need to remove all the HTML tags from the script and replace that part with the JavaScript code present inside the HTML element, or define the objects in the rule set with the value set to javascript_code.
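The second option, defining the objects in the rule set, might look like the sketch below. The elements table is my own construct; it maps each id used by the sample to the value copied by hand from the HTML.

```javascript
// Rule-set stub (hypothetical): values copied by hand from the HTML elements.
var elements = {
  getScript: { value: "javascript_code" }
};

var document = {
  getElementById: function (id) {
    return elements.hasOwnProperty(id) ? elements[id] : null;
  }
};

// The sample's own line now works without a browser.
var xxwsx = document.getElementById("getScript").value;
console.log(xxwsx);   // javascript_code
```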
By using browser based properties in evaluating key:
Sometimes the JavaScript uses browser properties to compute the key that decrypts the encrypted function.
For example:
var key=navigator.userAgent.toLowerCase()
decrypt_fun(key,encrypted_code);
or
var key=document.lastModified
or
var key=document.location
In the above case, we first need to determine carefully what value the key would have in the victim's browser, and then replace document.location, navigator.userAgent.toLowerCase(), or whatever expression is used with that value.
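One way to supply the value without editing the sample's expression is to pass in a stub. In this sketch, fakeNavigator, computeKey, and the user agent string are all hypothetical; use the user agent of the browser the attack targets.

```javascript
// Hypothetical value: replace with the user agent of the targeted browser.
var fakeNavigator = {
  userAgent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
};

// Passing the stub as a parameter named navigator lets the sample's own
// expression run unchanged.
function computeKey(navigator) {
  return navigator.userAgent.toLowerCase();
}

var key = computeKey(fakeNavigator);
console.log(key);   // mozilla/4.0 (compatible; msie 6.0; windows nt 5.1)
```

If the key is wrong, the decrypted output is garbage, so a different user agent has to be tried.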
Anti-Deobfuscation Technique:
The techniques above are used for obfuscating JavaScript code. There are also anti-deobfuscation techniques, which are used to make deobfuscation difficult, for example using arguments.callee or using a checksum of the obfuscated code as a key.
Using arguments.callee:
arguments.callee allows a function to reference itself, and therefore its own body. If it is used as a key, the obfuscated JavaScript protects itself from modification. For instance, a function can use the text of its body as the key for deobfuscation. If the obfuscated JavaScript code is modified in any way, deobfuscation will produce an incorrect result.
var key=arguments.callee.toString()
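The effect is easy to reproduce. In the sketch below, the two functions differ only by a comment, and a toy checksum (my own stand-in for whatever the sample really uses) turns each function's text into a key. The functions are built with new Function because arguments.callee is not allowed in strict-mode code.

```javascript
// new Function creates non-strict functions, where arguments.callee is allowed.
var original = new Function("return arguments.callee.toString();");
var tampered = new Function("/* analyst edit */ return arguments.callee.toString();");

// Toy checksum (sum of char codes), standing in for the sample's real key derivation.
function checksum(text) {
  var sum = 0;
  for (var i = 0; i < text.length; i++) {
    sum = (sum + text.charCodeAt(i)) % 65536;
  }
  return sum;
}

var keyOriginal = checksum(original());
var keyTampered = checksum(tampered());
console.log(keyOriginal === keyTampered);   // false
```

Even an added comment changes the key, which is why any change, including beautifying the code or inserting print statements, can break the decryption.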
For practice, you can use the following link to analyze the obfuscated script:
http://blogs.ixiacom.com/default/assets/File/blogresources/JavaScript-obfuscation-code.txt
Note: The script may contain a few obfuscation techniques that are not described above.
Excellent write up on various obfuscation techniques! Do you know where I can learn more about the section titled "By calling browser objects from javascript code"?