Some notes on how to set up an Android Malware Analysis Lab using state-of-the-art tools along with useful tips and tricks.

It’s time to wake up and smell the Mutating Hash! Signature Based Malware Detection is Dead ― James Scott

Lab Setup

The following three tools are very useful for Android malware analysis. Setting up a dedicated lab environment is a must, unless you are willing to risk damaging your own devices.

Genymotion Emulator

Genymotion is an Android emulator which includes a complete set of sensors and features in order to interact with a virtual Android environment.

Create Android Environment

  1. Create a new VM with your favourite Android version and phone model
  2. Install GApps from the side Menu
  3. Disable Google Play Protect to avoid issues when running malware

Once you have completed all of the configurations and steps listed in this guide, remember to create a backup copy of the VM before analysing each new malware sample.

N.B. if you have an old Android device, you can use it as your test environment instead, but remember to take the appropriate precautions.

Android Debug Bridge

Android Debug Bridge (ADB) is a command-line tool used for debugging Android-based devices. The daemon on the Android device connects to the server on the host PC over USB or TCP, and the server in turn connects over TCP to the client used by the end user.

Connect to a Device

  1. $ adb devices
  2. $ adb connect <device_ip>:<port> (default port=5555)

Useful Commands

  • $ adb shell - Get an interactive shell
  • $ adb install <application>.apk - Install application (adb install does not accept .xapk bundles directly; unpack them first)
  • $ adb push <source> <dest> - Push file/folder to device
  • $ adb pull <source> <dest> - Pull file/folder from device
  • $ adb logcat | grep <string> - Grep a string from the logs
  • $ adb shell setprop wrap.<app_package_name> '"logwrapper strace -f -o /data/local/tmp/strace.<app_package_name>.log"' - Run strace for application
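When the same commands have to be repeated for many samples, it can help to build them from a script. The following is a minimal Python sketch, assuming adb is on the PATH; the helper names (adb_argv, strace_wrap_argv, run) and the package name used in the usage note are hypothetical and only serve as an illustration.

```python
import subprocess


def adb_argv(*args, serial=None):
    """Build an adb argument vector, optionally targeting one device with -s."""
    argv = ["adb"]
    if serial:
        argv += ["-s", serial]
    return argv + list(args)


def strace_wrap_argv(package, serial=None):
    """Build the setprop call that wraps an app in strace at its next launch."""
    log_path = f"/data/local/tmp/strace.{package}.log"
    wrapper = f"logwrapper strace -f -o {log_path}"
    # The inner double quotes are kept so that the device shell hands the whole
    # wrapper command to setprop as a single value.
    return adb_argv("shell", "setprop", f"wrap.{package}", f'"{wrapper}"', serial=serial)


def run(argv):
    """Run a command and return its standard output (needs adb on the PATH)."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout
```

For example, run(strace_wrap_argv("com.example.app")) would set the wrap property for the hypothetical package com.example.app on the only connected device.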

アマヤラ Lab

アマヤラ (Android Malware Analysis YARA) Lab is an open-source project that I created to provide a ready-to-use Jupyter Lab environment and help out with Android malware analysis using YARA rules - we will discuss YARA later in this article. This tool automatically analyses files with your YARA rules and stores the results in a JSON file. YARA rules are checked against both the APK file itself and its content (recursively).

アマヤラ Lab also gathers some information about the file(s) that you want to analyse from the VirusTotal and MalwareBazaar APIs, using your own API keys. Finally, the results include a link to Pithus, which is valid only if the file has already been uploaded there.
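To give an idea of what recursive scanning of an APK and its content means, here is a minimal Python sketch. It is not the project's actual implementation: plain byte patterns stand in for YARA rules so that the example runs with the standard library only, and the pattern names, the function names and the "!" separator used in the result keys are all assumptions made for illustration.

```python
import io
import json
import zipfile

# Hypothetical stand-ins for YARA rules: rule name -> byte pattern to look for.
PATTERNS = {
    "suspicious_url": b"http://malicious.example",
    "su_binary": b"/system/bin/su",
}


def match_patterns(data):
    """Return the sorted names of all patterns found in a blob of bytes."""
    return sorted(name for name, needle in PATTERNS.items() if needle in data)


def scan_bytes(name, data, results):
    """Scan one blob, then recurse into it if it is itself a ZIP archive."""
    hits = match_patterns(data)
    if hits:
        results[name] = hits
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for entry in archive.infolist():
                if not entry.is_dir():
                    scan_bytes(f"{name}!{entry.filename}", archive.read(entry), results)
    return results


def scan_file(path):
    """Scan a file on disk and return the results as a JSON string."""
    with open(path, "rb") as handle:
        return json.dumps(scan_bytes(path, handle.read(), {}), indent=2)
```

Since an APK is a ZIP archive, every entry is scanned on its own, and nested archives are opened in turn. A real setup would also cap the recursion depth and the unpacked size to stay safe against archive bombs.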

Usage

You can choose whether to use the Python script or the Jupyter Notebook (recommended) to perform your analyses.

In order to launch the lab, open your favourite Terminal and run Jupyter Lab:

jupyter-lab

You can then access the amayara_lab.ipynb notebook and follow its instructions.

N.B. only a test rule and a couple of JSON results from a local test were included in the files within this repository since I did not intend to upload malware samples. Therefore, you need to create a “files” folder and add the file(s) you want to analyse in there.

Android Malware Analysis

Malware analysis is the study or process of determining the functionality, origin and potential impact of a given malware sample such as a virus, worm, trojan horse, rootkit, or backdoor.

Basics of Android Applications

Android is an open-source mobile operating system developed by Google and based on the Linux kernel. Applications are typically written in Java and compiled into a slightly different format known as Dalvik bytecode. The apps then run in the Dalvik virtual machine (replaced by the Android Runtime, ART, since Android 5.0), which provides a layer of abstraction over the real hardware. This way, most applications can run on any hardware as long as the API of the operating system meets the requirements of the app. Besides Java, native code can be used: it has to be shipped with the application and compiled for every target platform, and it is mainly intended for computation-intensive tasks such as graphics rendering. Below the virtual machine lies the Linux kernel, which provides hardware abstraction and access control. The permissions requested by an application are enforced through Linux users and groups, so malware generally has to acquire the access rights it needs the official way.

Android applications are packaged in the APK format, a ZIP archive containing AndroidManifest.xml, resources such as media files, the actual code as classes.dex, and some other optional files. The manifest provides the Android system with important information, such as which class to use when starting the app and which permissions it needs. Only permissions listed in this file are granted to the application; if it tries to use any other, the call either fails or returns an empty result. When an application is installed, these permissions are shown to the user, who should review them carefully to prevent malicious apps from accessing important data or being installed in the first place. The code is contained in classes.dex, a collection of all compiled classes. Unlike the regular .jar format, all classes are packed into one file, which saves some space on the mobile device.
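Because an APK is a plain ZIP archive, its layout can be checked with a few lines of Python before any dedicated tool is involved. The sketch below is only an illustration: the function name summarize_apk and the keys of the returned dictionary are made up for this example.

```python
import zipfile


def summarize_apk(apk):
    """List the parts of an APK that matter most for a first look.

    `apk` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(apk) as archive:
        names = archive.namelist()
        dex_files = sorted(
            n for n in names if n.startswith("classes") and n.endswith(".dex")
        )
        return {
            "has_manifest": "AndroidManifest.xml" in names,
            "dex_files": dex_files,
            # Native code, when present, is shipped per ABI under lib/<abi>/.
            "native_abis": sorted(
                {n.split("/")[1] for n in names
                 if n.startswith("lib/") and n.count("/") >= 2}
            ),
            # DEX files start with the magic bytes "dex\n".
            "dex_magic_ok": all(archive.read(n)[:4] == b"dex\n" for n in dex_files),
        }
```

Note that the AndroidManifest.xml stored inside an APK is in a binary XML encoding, so reading the requested permissions requires a decoder such as the one built into jadx.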

Static Analysis

Static or Code Analysis is usually performed by dissecting the different resources of the binary file without executing it and studying each component. The binary file can also be disassembled (or reverse-engineered) using a disassembler such as IDA or Ghidra. The machine code can sometimes be translated into assembly code which can be read and understood by humans: the malware analyst can then read the assembly as it is correlated with specific functions and actions inside the program, then make sense of the assembly instructions and have a better visualization of what the program is doing and how it was originally designed. Viewing the assembly allows the malware analyst/reverse engineer to get a better understanding of what is supposed to happen versus what is really happening and start to map out hidden actions or unintended functionality. Some modern malware is authored using evasive techniques to defeat this type of analysis, for example by embedding syntactic code errors that will confuse disassemblers but that will still function during actual execution.

jadx-gui

Command line and GUI tools for producing Java source code from Android Dex and Apk files. More info at https://github.com/skylot/jadx.

Dynamic Analysis

Dynamic or Behavioral analysis is performed by observing the behaviour of the malware while it is actually running on a host system. This form of analysis is often performed in a sandbox environment to prevent the malware from actually infecting production systems; many such sandboxes are virtual systems that can easily be rolled back to a clean state after the analysis is complete. The malware may also be debugged while running using a debugger such as GDB or WinDbg to watch the behaviour and effects on the host system of the malware step by step while its instructions are being processed. Modern malware can exhibit a wide variety of evasive techniques designed to defeat dynamic analysis including testing for virtual environments or active debuggers, delaying execution of malicious payloads, or requiring some form of interactive user input.

FRIDA

Frida is a free dynamic instrumentation toolkit that enables software professionals to execute their own scripts in software that has traditionally been locked down, i.e., proprietary (such as Android applications). More info at https://frida.re/.

Install FRIDA

pip3 install frida frida-tools objection

Install frida-server

wget https://github.com/frida/frida/releases/<frida-server_arch>.xz
unxz <frida-server_arch>.xz
adb push <frida-server> /data/local/tmp/
adb shell "chmod 755 /data/local/tmp/<frida-server>"
adb shell "/data/local/tmp/<frida-server> &"

NOTE: for Genymotion, the Android architecture is typically x86; you can confirm it with adb shell getprop ro.product.cpu.abi.

You can check that Frida is properly working by running the following command:

frida-ps -Uai

Burp Suite

Automated, scalable Web vulnerability scanning. More info at https://portswigger.net/burp.

Certificate Installation

The first step is to get the Burp CA in the right format. Using Burp Suite, export the CA certificate in DER format and save it as cacert.der.

Android wants the certificate to be in PEM format, and to have the filename equal to the subject_hash_old value appended with “.0”. Use openssl to convert DER to PEM, then output the subject_hash_old and rename the file as follows:

openssl x509 -inform DER -in cacert.der -out cacert.pem
openssl x509 -inform PEM -subject_hash_old -in cacert.pem | head -1
mv cacert.pem <cert_hash>.0
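If you prefer to script the first conversion step, the DER to PEM conversion can also be done from Python. This is a minimal sketch based on the standard library helper ssl.DER_cert_to_PEM_cert; the function names der_to_pem and convert_file are made up for this illustration, and the subject_hash_old value still has to be obtained with openssl as shown above.

```python
import ssl


def der_to_pem(der_bytes):
    """Wrap DER-encoded certificate bytes in PEM armour (base64 plus header/footer)."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def convert_file(der_path="cacert.der", pem_path="cacert.pem"):
    """Read a DER certificate from disk and write its PEM form to pem_path."""
    with open(der_path, "rb") as source:
        pem = der_to_pem(source.read())
    with open(pem_path, "w") as target:
        target.write(pem)
    return pem_path
```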

We can use adb to copy the certificate over, but since it has to be copied to the “/system” filesystem, we have to remount it as writable. As root, this is easy with adb remount:

adb root
adb remount
adb push <cert_hash>.0 /sdcard/
adb shell "mv /sdcard/<cert_hash>.0 /system/etc/security/cacerts/"
adb shell "chmod 644 /system/etc/security/cacerts/<cert_hash>.0"
adb reboot

NOTE: the first two commands might not be necessary when Genymotion is used as the Android device.

After the device reboots, browse to Settings -> Security -> Trusted Credentials: the new “PortSwigger CA” should show as a system-trusted CA.

Proxy Setup (Burp side)

Now it’s possible to set up the proxy and start intercepting all app traffic with Burp. Go to Proxy -> Options -> Add and add a new proxy listener with the following parameters:

  • Bind to port: 8082 (or a port of your choice)
  • Bind to address: All interfaces (or the IP address of the Genymotion VM)
  • Check the “Support invisible proxying (enable only if needed)” box
  • Choose “Use custom protocols” and disable “TLSv1.3” from the list (leave all the others with default values)
  • Uncheck the “Support HTTP/2” box

Proxy Setup (Android side)

Finally, go to Wi-Fi settings -> Edit Wi-Fi -> Proxy -> Manual and set the Burp IP address and port as the proxy.

SSL Pinning Bypass

If you are unable to sniff the HTTPS traffic from an application, then disable the SSL pinning as follows:

objection -g <app_package_name> run android sslpinning disable

In addition, you can also install the Android-SSL-TrustKiller.apk to enforce SSL pinning bypass.

Malware Detection

Malware detection refers to the process of detecting the presence of malware on a host system or of distinguishing whether a specific program is malicious or benign.

YARA

YARA is a tool designed to help malware researchers identify and classify malware samples. It’s been called the pattern-matching Swiss Army knife for security researchers (and everyone else). It is multiplatform and can be used both from its command-line interface and from your own Python scripts. It supports Perl-style regular expressions and is generally used to examine suspected files or directories and match them against the strings defined in YARA rules. YARA rules are a way of identifying malware (or other files) by describing the characteristics to look for. More info at https://virustotal.github.io/yara/.

Syntax

Each rule has to start with the word rule, followed by the name or identifier. The identifier can contain any alphanumeric character and the underscore character, but the first character is not allowed to be a digit. There is a list of YARA keywords that are not allowed to be used as an identifier because they have a predefined meaning. At its most basic, the following is the syntax of a YARA ruleset:

/*
    This is a sample
*/
rule RuleName {
 meta:
  my_identifier_1 = "Some string data"
  my_identifier_2 = 24
  my_identifier_3 = true

 strings:
  $test_string1 = "Test" // Comment here
  $test_string2 = "Honeypot" wide ascii
  $test_string3 = { E1 D2 C3 B4 }
  $dummy_hex = { 9C 50 ?6 A1 ?? ?? ( 62 B4 | 56 ) 66 A9 [4-6] 58 0F 85 }
  $re1 = /md5: [0-9a-fA-F]{32}/

 condition:
  ($test_string1 or $test_string2 or $test_string3) and $dummy_hex or $re1
}

Condition

Rules are composed of several sections. The condition section is the only one that is required. This section specifies when the rule result is true for the object (file) that is under investigation. It contains a Boolean expression that determines the result. Conditions are by design Boolean expressions and can contain all the usual logical and relational operators. You can also include another rule as part of your conditions.

Strings

To give the condition section a meaning you will also need a strings section. The strings section is where you can define the strings that will be looked for in the file. There are several types of strings you can look for:

  • Hexadecimal, in combination with wild-cards, jumps, and alternatives;
  • Text strings, with modifiers: nocase, fullword, wide, and ascii;
  • Regular expressions, with the same modifiers as text strings.
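To make the meaning of wild-cards, jumps and alternatives in hexadecimal strings more concrete, the following Python sketch translates a small subset of that syntax into an equivalent regular expression over bytes. It only illustrates the semantics and says nothing about how YARA matches internally; the function names are hypothetical, and tokens are assumed to be separated by spaces, with jumps written without inner spaces, e.g. [4-6].

```python
import re


def _byte_regex(token):
    """Translate one hex token (e.g. 9C, ??, ?6, A?) into a regex for one byte."""
    if token == "??":
        return b"."  # full wild-card: any byte
    if "?" not in token:
        return re.escape(bytes([int(token, 16)]))
    high, low = token  # nibble wild-card: only one half of the byte is fixed
    candidates = [
        value for value in range(256)
        if high in ("?", f"{value >> 4:X}") and low in ("?", f"{value & 0xF:X}")
    ]
    return b"(?:" + b"|".join(re.escape(bytes([v])) for v in candidates) + b")"


def hex_pattern_to_regex(pattern):
    """Compile a YARA-like hex string (without the braces) into a bytes regex."""
    spaced = pattern.replace("(", " ( ").replace(")", " ) ").replace("|", " | ")
    parts = []
    for token in spaced.split():
        if token == "(":
            parts.append(b"(?:")      # start of a group of alternatives
        elif token in (")", "|"):
            parts.append(token.encode())
        elif token.startswith("["):   # jump: [n] or [n-m] arbitrary bytes
            low, dash, high = token.strip("[]").partition("-")
            upper = high if dash else low
            parts.append(b".{" + low.encode() + b"," + upper.encode() + b"}")
        else:
            parts.append(_byte_regex(token.upper()))
    return re.compile(b"".join(parts), re.DOTALL)
```

With the pattern 9C 50 ?6 A1 ?? ?? ( 62 B4 | 56 ) 66 A9 [4-6] 58 0F 85, the third byte may be anything ending in the nibble 6, the next two bytes after A1 are free, either 62 B4 or 56 must follow, and between 66 A9 and 58 0F 85 there must be four to six arbitrary bytes.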

Metadata

Metadata can be added to help identify the files that were picked up by a certain rule. The metadata identifiers are always followed by an equal sign and the assigned value, which can be a string, an integer, or a Boolean. Note that identifier/value pairs defined in the metadata section can’t be used in the condition section; their only purpose is to store additional information about the rule.

Detection

Once you have identified the signature of malware, i.e., significant strings, URLs, patterns, etc., and you have created a dedicated YARA rule, you can then analyse the application file by running the following:

yara <rules>.yar <target_app>.apk
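When many files have to be checked, the command above can be driven from a script. The following is a minimal Python sketch, assuming the yara binary is on the PATH and prints its default output of one "<rule> <file>" line per match; the function names parse_yara_output and scan are made up for this example.

```python
import subprocess


def parse_yara_output(output):
    """Turn yara CLI output lines ("<rule> <file>") into (rule, file) tuples."""
    matches = []
    for line in output.splitlines():
        # Rule names contain no spaces, so the first space ends the rule name.
        rule, _, target = line.strip().partition(" ")
        if rule and target:
            matches.append((rule, target))
    return matches


def scan(rules_path, target_path):
    """Run the yara CLI (must be on the PATH) and return the parsed matches."""
    completed = subprocess.run(
        ["yara", rules_path, target_path],
        capture_output=True, text=True, check=True,
    )
    return parse_yara_output(completed.stdout)
```

An empty list means that no rule matched the target file.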