Did GCC Optimize my code?

Often we want to know whether the compiler has optimized our code. There can be many reasons: plain curiosity, a debugger behaving strangely, wanting to see how good your hand-written code already is, and so on.

If you are using GCC, there is an easy way to do this:

  1. Compile without optimization: gcc -c -fdump-tree-optimized -O0 <your_c_file> -o /dev/null
  2. mv *.optimized unoptimized
  3. Compile with optimization: gcc -c -fdump-tree-optimized -O3 <your_c_file> -o /dev/null
  4. mv *.optimized after_optimization
  5. Check for differences: diff unoptimized after_optimization

Any difference you see is an indication that GCC has optimized your code.

Note that this is more precise than diffing the binary executables, because two binaries produced from the same source can differ for reasons other than optimization (register allocation, for example).
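
As a quick example, a small function like the one below (the file and function names are my own illustration, not from any particular project) usually looks very different in the two dumps, because at -O3 GCC is free to simplify or even completely fold the loop:

/* sum.c: something GCC can easily optimize */
int sum_to_hundred(void) {
    int total = 0;
    for (int i = 1; i <= 100; i++) {
        total += i;
    }
    return total;   /* at -O3 this typically becomes a plain constant (5050) */
}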

YASS: Yet Another SUDOKU Solver

Yes, this is a Sudoku solver. You can find the code here:

It expects 81 numbers as input, corresponding to the values of the 81 cells of the Sudoku grid. A missing value should be represented by 0.

I prefer to save the values in a file and redirect it to standard input. An example input file can be found here.

Usage:

Usage 1: ./sudokusolver

[Here it expects 81 numbers, and you need to provide them in row-major order of the Sudoku grid.]

Usage 2: ./sudokusolver < input_file

input_file contents:

5 3 0 0 7 0 0 0 0
6 0 0 1 9 5 0 0 0
0 9 8 0 0 0 0 6 0
8 0 0 0 6 0 0 0 3
4 0 0 8 0 3 0 0 1
7 0 0 0 2 0 0 0 6
0 6 0 0 0 0 2 8 0
0 0 0 4 1 9 0 0 5
0 0 0 0 8 0 0 7 9

Output:

Solved Successfully

Result:
Number of rounds:3, Elapsed Time:0.034000 sec (0.000034 msec)

Result Matrix:
5 3 4  6 7 8  9 1 2
6 7 2  1 9 5  3 4 8
1 9 8  3 4 2  5 6 7

8 5 9  7 6 1  4 2 3
4 2 6  8 5 3  7 9 1
7 1 3  9 2 4  8 5 6

9 6 1  5 3 7  2 8 4
2 8 7  4 1 9  6 3 5
3 4 5  2 8 6  1 7 9
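
For the curious, the heart of such a solver is a standard backtracking search. Below is a minimal, self-contained sketch in C; it is my own illustration, not the code linked above (which works in rounds and reports its own statistics, as the output shows):

#include <stdio.h>

/* Returns 1 if placing 'val' at (row, col) does not conflict with the
   existing entries in its row, column, or 3x3 box. */
static int can_place(int grid[9][9], int row, int col, int val) {
    for (int k = 0; k < 9; k++)
        if (grid[row][k] == val || grid[k][col] == val)
            return 0;
    int br = (row / 3) * 3, bc = (col / 3) * 3;
    for (int r = br; r < br + 3; r++)
        for (int c = bc; c < bc + 3; c++)
            if (grid[r][c] == val)
                return 0;
    return 1;
}

/* Classic backtracking: find the next empty cell, try 1..9, recurse. */
static int solve(int grid[9][9]) {
    for (int row = 0; row < 9; row++) {
        for (int col = 0; col < 9; col++) {
            if (grid[row][col] != 0)
                continue;
            for (int val = 1; val <= 9; val++) {
                if (can_place(grid, row, col, val)) {
                    grid[row][col] = val;
                    if (solve(grid))
                        return 1;
                    grid[row][col] = 0;   /* undo and try the next value */
                }
            }
            return 0;   /* no value fits in this cell: backtrack */
        }
    }
    return 1;   /* no empty cell left: solved */
}

int main(void) {
    int grid[9][9];
    /* Read the 81 values in row-major order; 0 marks an empty cell. */
    for (int i = 0; i < 81; i++)
        if (scanf("%d", &grid[i / 9][i % 9]) != 1)
            return 1;
    if (!solve(grid)) {
        printf("No solution\n");
        return 1;
    }
    for (int row = 0; row < 9; row++) {
        for (int col = 0; col < 9; col++)
            printf("%d ", grid[row][col]);
        printf("\n");
    }
    return 0;
}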

Enjoy 🙂

Linux Kernel Myths

On a recent code review project, where I was asked to review a Linux kernel, I came across a few doubts/myths about how the operating system kernel behaves in certain scenarios.

The following holds on Linux kernel 2.6:

  • A NULL pointer dereference (either a read or a write) in the kernel causes a kernel panic.
  • copy_from_user: when the src (user pointer) argument is NULL, the call fails.
  • copy_to_user: when the dst (user pointer) argument is NULL, the call fails (see the sketch below).
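
For the copy_from_user/copy_to_user cases, the practical consequence is simply that the return value must be checked. Here is a minimal sketch of a hypothetical handler (the structure and function names are my own, not from the reviewed kernel):

#include <linux/uaccess.h>   /* copy_from_user / copy_to_user */
#include <linux/errno.h>

struct my_args { int value; };   /* hypothetical argument structure */

/* 'arg' is a user-space pointer and may be NULL or otherwise invalid,
   so both copies can fail and must be checked. */
static long my_handler(void __user *arg)
{
    struct my_args args;

    if (copy_from_user(&args, arg, sizeof(args)))
        return -EFAULT;   /* returns non-zero, e.g. when arg is NULL */

    args.value += 1;

    if (copy_to_user(arg, &args, sizeof(args)))
        return -EFAULT;
    return 0;
}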

I will post more as and when I find them.

-Have fun

Android kernel debugging setup

  1. Downloading the Kernel Sources.

I am using the kernel sources for the emulator (goldfish).

git clone https://android.googlesource.com/kernel/goldfish.git -b android-goldfish-2.6.29

2. Enable kernel debugging.

  •  Set up the default configuration: make ARCH=arm goldfish_defconfig
  •   Enable debugging : make ARCH=arm menuconfig

When the menu pops up:

Choose the following options under ‘Kernel Hacking’:

[screenshot: KernelConfiguration]

Enable kernel module loading:

[screenshot: EnableKernelModules]

Enable force module loading and unloading (under ‘Enable kernel module loading’):

[screenshot: ForceLoadingAndUnloading]

  • Downloading the tool chain

For this you need an ARM cross-compiler toolchain; one way to get it is from AOSP:

repo init -u https://android.googlesource.com/platform/manifest -b android-4.3_r1

repo sync platform/prebuilts/gcc/linux-x86/arm/arm-eabi-4.6

  • Building the kernel:

make ARCH=arm CROSS_COMPILE=<path_to_downloaded_android_sources>/prebuilts/gcc/linux-x86/arm/arm-eabi-4.6/bin/arm-eabi-

  • Install the Android SDK (API level 10) by following the instructions at: http://developer.android.com/sdk/index.html
  • Make sure that the platform-tools and tools directories are in the system path. On Linux: export PATH=<sdk_install_dir>/tools:<sdk_install_dir>/platform-tools:$PATH
  • Create an AVD for android-2.3.3 or android-10 by following the instructions at: http://developer.android.com/tools/devices/index.html
  • Let the AVD name be: kernel-debug
  • Start the emulator: emulator -avd kernel-debug -kernel <pathToKernelSources>/arch/arm/boot/zImage -show-kernel -qemu -s

The above command starts the emulator with QEMU listening on localhost:1234 for a gdb connection.

  • Install gdb-multiarch: sudo apt-get install gdb-multiarch
  • Run: gdb-multiarch <pathToKernelSources>/vmlinux
  • At the gdb prompt: target remote localhost:1234

There you go: gdb should now be attached to the kernel, and from here on it is the usual gdb workflow. Here is a quick reference of GDB commands.
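
For example, you can break on a well-known kernel symbol and inspect the call stack once it hits (these are ordinary gdb commands, nothing Android-specific):

(gdb) break panic
(gdb) continue
(gdb) bt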

Getting method coverage in Android.

During my research on dynodroid, I wanted to know the effectiveness of my technique in terms of the method coverage (the number of methods inside the app that are triggered at least once) of the app under test.

For this I made a few modifications to the Dalvik profiling facility to get this information in a more convenient way. Note: the trace facility provided by Android is far too verbose; it collects a lot of data (timings, thread info, etc.) and stops once the trace reaches a certain size, so it is not suitable for my purpose (getting only the methods inside the app that are called at least once).

I have made few changes to the following files of the android sources:

  • dalvik/vm/Profile.h
  • dalvik/vm/Profile.cpp
  • dalvik/vm/Globals.h
  • dalvik/vm/oo/Object.h
  • dalvik/vm/Dvm.mk
  • dalvik/vm/native/dalvik_system_Zygote.cpp

The changed files (for Android 4.3), along with the app RandomMusicPlayer, are present at: https://github.com/Machiry/DalvikUtils/tree/master/DexMethodCoverage

Note: these changes disable the default tracing facility in Android; you will no longer be able to use it.

Usage:

This assumes you have built the Android image with the above changes and are running it.

1) You first need to tell Dalvik which app you are interested in. This information can be reliably provided by identifying the dex file contained in the app, and you can specify the dex file by its checksum: each dex file has a checksum in its header which uniquely identifies it. After installing the app on the device/emulator, you can get the odex (optimized dex) file from the /data/dalvik-cache folder of the device; the file name generally contains the app's package name.

For example, after installing the app RandomMusicPlayer (whose package name is com.example.android.musicplayer), I found the odex in the dalvik-cache as:

/data/dalvik-cache/data@app@com.example.android.musicplayer-1.apk@classes.dex

Next, get the odex file by using adb pull.

adb pull /data/dalvik-cache/data@app@com.example.android.musicplayer-1.apk@classes.dex .

sdk_install_folder/build-tools/18.0.1/dexdump -f data@app@com.example.android.musicplayer-1.apk@classes.dex | grep checksum | head -n 2 | tail -n 1

Usually, you will get output of the form : checksum            : 0dbf1e3d

Now copy only the checksum (i.e. 0dbf1e3d) to a file named toTrackDexFile.txt, and copy the package name of the app you want to track to a file named toTrackPackage.txt.

For the RandomMusicPlayer app:

toTrackDexFile.txt contains (this will most likely be different if you rebuild the app):

0dbf1e3d

toTrackPackage.txt contains (this should be the same for a given app):

com.example.android.musicplayer

After this:

  1. adb push toTrackDexFile.txt /sdcard/toTrackDexFile.txt
  2. adb push toTrackPackage.txt /sdcard/toTrackPackage.txt

in the same order.

Then you can start the app and use it; all the methods that are called will be logged to logcat. To get the filtered logs, use:

adb logcat | grep DVM_M3_Method

You will see all the methods in the corresponding dex file that were called at least once.

Note: each method appears in the log only once, irrespective of how many times it is actually called.

Have fun.

To check for ASLR

Most folks probably know what ASLR means; if not, check this. To check whether your *nix system has ASLR enabled, I have written a small piece of code that will tell you whether ASLR is enabled on the system it is run on.

The code is shown below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[]) {
    int i;
    if (argc < 2) {
        /* Parent: run a copy of ourselves (with a "dummy" argument) and
           read the stack address the child prints. */
        char command[1024];
        void* addr = 0;
        snprintf(command, sizeof(command), "%s dummy", argv[0]);
        FILE* fp = popen(command, "r");
        if (fp) {
            if (fscanf(fp, "%p", &addr) == EOF) {
                addr = 0;
            }
            pclose(fp);
        }
        /* If the child's stack address differs from ours, the stack was
           randomized between runs, i.e. ASLR is enabled. */
        if ((void*)&i != addr) {
            printf("Yes\n");
        } else {
            printf("No\n");
        }
    } else {
        /* Child: just print the address of a local (stack) variable. */
        printf("%p\n", (void*)&i);
    }
    return 0;
}
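
To try it out (the file name is my own choice): gcc -o aslr_check aslr_check.c and then ./aslr_check. It should print Yes on a system with ASLR enabled and No otherwise.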

I understand there are more comprehensive tools which do much more than this, but this is my effort to understand security features and how to test for their existence.

By the way, to disable ASLR system-wide:
echo 0 > /proc/sys/kernel/randomize_va_space
References:

http://www.win.tue.nl/~aeb/linux/hh/protection.html

Double free: It's not my fault!

I hope most programmers aware of secure coding have heard of the double-free vulnerability. If you haven't, I suggest you refer to this for a basic understanding of the issue; if you are really interested in the details, then refer to this.

In simple words: "undefined behaviour (depending on the heap implementation) occurs when you free an allocated block of memory more than once". If you look at this vulnerability a bit closely, it seems the free function does not validate the provided address; it assumes the address is a valid address of an allocated block. This forces the programmer to ensure that she frees the memory only once. Although this is easy to achieve in single-threaded programs, it becomes quite a daunting task in multi-threaded programs where the memory is managed by more than one thread.
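
For instance, the following deliberately buggy snippet (my own example) frees the same block twice; whether it crashes, corrupts the heap, or appears to work is entirely up to the allocator:

#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buf = malloc(64);
    if (!buf) return 1;
    strcpy(buf, "hello");
    free(buf);   /* first free: fine */
    free(buf);   /* second free: undefined behaviour (double free) */
    return 0;
}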

A programmer can write a piece of code similar to the one below to handle this:

#include <pthread.h>
#include <stdlib.h>

/* This lock ensures atomicity; to improve performance we could have a lock per chunk of memory. */
static pthread_mutex_t freelock = PTHREAD_MUTEX_INITIALIZER;

/* Some method (provided by the memory-management implementation) which can
   determine that the memory to be freed is a valid allocation. */
int validAllocation(void* tofree);

void freeMemory(void* tofree) {
    pthread_mutex_lock(&freelock);
    if (validAllocation(tofree)) {
        free(tofree);
    }
    pthread_mutex_unlock(&freelock);
}

This approach still requires the memory-management implementation to provide the validAllocation() method.

This method looks fine but is hard to get right with minimal impact on performance. Now, let's sit back and think about the actual cause of this vulnerability: free assumes that the memory location provided is valid, and this assumption is violated in a double-free condition. Is the assumption justified? Maybe with respect to performance, but certainly not from a pure security perspective; this looks like a typical case of improper input validation: CWE-20: Improper Input Validation. The ideal fix would be for free itself to validate that the provided address is a valid allocated block before releasing the memory.

Why are we not fixing this in free? Truthfully, I don't know; I guess most probably for performance reasons. This is quite evident from the number of SafeHeap implementations going around, all with some kind of performance bottleneck. Although this is understandable for other heap vulnerabilities (buffer overruns, etc.), the double-free vulnerability seems to need only a quick check on the provided address. Moreover, many efficient search algorithms are published every year; careful analysis of them could lead to an efficient lookup being included in free, thus fixing the double-free vulnerability.

As mentioned before, double frees occur easily in concurrent programs, and ultimately it is not the programmer's fault when the root cause (CWE-20) lies in the free method itself.

RPC Based WebProxy Server

In computer networks, a proxy server is a server (a computer system or an application) that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server, and the proxy server evaluates the request as a way to simplify and control its complexity. In this project we implemented a web proxy, which facilitates access to content on the Internet. The proxy server exposes a Remote Procedure Call interface through which clients can request URLs (web pages). Caching of web pages is enabled on the server, and five cache replacement policies are implemented to manage the cache. We present the details of our experimental data and the methods we used to obtain it, along with a detailed evaluation of the performance of the replacement policies.

We use Sun RPC to implement the RPC interface and libcurl to fetch the web pages. Sun RPC is a widely used protocol for RPC across different programming languages. It relies on an available service (RPC portmap) to handle binding and service location. ONC RPC also relies on the XDR standard (eXternal Data Representation) to define a common wire format for structured data sent between machines. It also defines an interface definition language that can be used with rpcgen to automatically generate stub code for services (as well as marshalling and unmarshalling code). libcurl is a powerful library for communicating with servers via HTTP (and FTP, LDAP, HTTPS, etc.). It supports HTTP GET/PUT/POST, form fields, cookies, etc.
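
To give a flavour of the fetching side, a minimal libcurl GET (a generic example under my own assumptions, not the project's actual code) looks roughly like this:

#include <stdio.h>
#include <curl/curl.h>

/* Fetch a single URL and let libcurl write the response body to stdout. */
int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        fprintf(stderr, "fetch failed: %s\n", curl_easy_strerror(res));
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}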

The code along with detailed evaluation is present at: https://github.com/Machiry/AOS/tree/master/RPCBasedWebProxy

Barrier Synchronization

As part of our Advanced Operating Systems course, we implemented barrier synchronization algorithms using OpenMP and MPI. We also evaluated the algorithms in various settings and explain the results in detail.

A barrier is a type of synchronization method: a barrier for a group of threads or processes means that any thread/process must stop at that point and cannot proceed until all other threads/processes reach the barrier [1]. There are several ways to achieve this in software using atomic operations and spinning on shared variables. Different barrier algorithms differ in their architectural assumptions and remote operations; depending on the signalling mechanisms used, performance varies with cache effects and the number of memory invalidations. As part of this project we implemented two barriers using OpenMP (1. Tournament, 2. Dissemination), two using OpenMPI (1. MCS, 2. Tournament), and a combined barrier with an OpenMPI MCS barrier on the outside and an OpenMP barrier on the inside.
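
To make the idea of "spinning on shared variables" concrete, here is a minimal sketch of the classic centralized sense-reversing barrier in C using C11 atomics; this is only the textbook baseline, not one of the algorithms implemented in the project:

#include <stdatomic.h>
#include <stdbool.h>

/* Centralized sense-reversing barrier. */
typedef struct {
    atomic_int  count;    /* threads still to arrive in this episode */
    atomic_bool sense;    /* global sense, flipped by the last arriver */
    int nthreads;
} barrier_t;

void barrier_init(barrier_t *b, int nthreads) {
    atomic_init(&b->count, nthreads);
    atomic_init(&b->sense, false);
    b->nthreads = nthreads;
}

/* Each thread keeps its own local_sense (initially true) and passes it in. */
void barrier_wait(barrier_t *b, bool *local_sense) {
    bool my_sense = *local_sense;
    if (atomic_fetch_sub(&b->count, 1) == 1) {
        /* Last thread to arrive: reset the count and release everyone. */
        atomic_store(&b->count, b->nthreads);
        atomic_store(&b->sense, my_sense);
    } else {
        /* Spin until the global sense matches our local sense. */
        while (atomic_load(&b->sense) != my_sense) { /* spin */ }
    }
    *local_sense = !my_sense;   /* flip for the next episode */
}

Each thread initializes its own local_sense to true and calls barrier_wait in a loop; the tournament, dissemination, and MCS variants replace this single shared counter with distributed flags to reduce contention.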

The code and detailed report of this project are present at: https://github.com/Machiry/AOS/tree/master/BarrierSynchronization