Sunday, July 10, 2011

The awesome Memory Analyzer Tool

After struggling mightily to get the Nexus One rooted (no working fastboot on Mac!), we've finally got to the point where we can start debugging the memory leak that triggered this whole thing. Let's take a look at a really nice link we had come across previously:

http://ttlnews.blogspot.com/2010/01/attacking-memory-problems-on-android.html


You might not run into them that quickly, but when you do, with a bit of bad luck, you've got a lot of work to do.
I've split the memory problems in two parts: out of memory (OOM) problems (this post) and stack overflow problems (a next post).

The maximum heap memory an process (app) gets in Android is 16M. That's not that much, a basic app takes already several MBs after you've only just started it.
So in the LogCat output you see something like when you run out of (heap) memory:


09-12 16:33:43.357: ERROR/dalvikvm-heap(157): 528640-byte external allocation too large for this process.
09-12 16:33:43.357: ERROR/(157): VM won't let us allocate 528640 bytes
09-12 16:33:43.357: DEBUG/AndroidRuntime(157): Shutting down VM
09-12 16:33:43.367: WARN/dalvikvm(157): threadid=3: thread exiting with uncaught exception (group=0x40010e28)
09-12 16:33:43.367: ERROR/AndroidRuntime(157): Uncaught handler: thread main exiting due to uncaught exception
09-12 16:33:43.387: ERROR/AndroidRuntime(157): java.lang.RuntimeException: can't alloc pixels
09-12 16:33:43.387: ERROR/AndroidRuntime(157): at android.graphics.Bitmap.nativeCreate(Native Method)
09-12 16:33:43.387: ERROR/AndroidRuntime(157): at android.graphics.Bitmap.createBitmap(Bitmap.java:343)


Actually, that's not what I had. But, let's keep looking.

With


adb shell procrank


you can get some information about your app's memory usage, but only a very rough idea.

// actually non- the command isn't found.


A bit more specific information you can find by executing


adb shell dumpsys meminfo



** MEMINFO in pid 1417 [com.jlptquiz.app] **
native dalvik other total
size: 6960 6727 N/A 13687
allocated: 6577 3027 N/A 9604
free: 310 3700 N/A 4010
(Pss): 1313 435 6075 7823
(shared dirty): 2188 1776 5296 9260
(priv dirty): 1216 188 4496 5900

Objects
Views: 0 ViewRoots: 0
AppContexts: 0 Activities: 0
Assets: 2 AssetManagers: 2
Local Binders: 8 Proxy Binders: 13
Death Recipients: 1
OpenSSL Sockets: 0

SQL
heap: 0 MEMORY_USED: 0
PAGECACHE_OVERFLOW: 0 MALLOC_SIZE: 50


Asset Allocations
zip:/mnt/asec/com.jlptquiz.app-1/pkg.apk:/resources.arsc: 12K



DDMS also gives some rough info, but only some quite high-level info about objects being freed and allocated.

// Ouch. DDMS not found either.

So still not enough info to really figure out what's going on: you'll see the memory usage will go over the 16MB limit at some point, at which the JVM will start complaining in the LogCat output it can't allocate the requested memorory.
But that you already knew :)

So you need something more detailed about what is when allocated and GC'ed. This is where the Eclipse Memory Analyzer Tool (MAT) comes into play.

Below I'll just give the steps you could apply to figure out where you're not freeing memory correctly and thus have a memory leak(s). All steps apply for a Windows + Eclipse environment, but should not be hard to apply to other setups.
The steps are based on input from several posts:


Analyzing the memory usage of your Android application

Immortal Objects - Or: How to Find Memory Leaks

Automated Heap Dump Analysis: Finding Memory Leaks with One Click

Eclipse Memory Analyzer - More Than Just a Heap Walker

Avoiding memory leaks. Original post with some questions/replies.
Very important is this mention: "When a Drawable is attached to a view, the view is set as a callback on the drawable.". So here's implicitly a reference back introduced! And you probably have images in your app, so for 99.99% certainty, you could have a memory leak here if you're not "setting the stored drawables' callbacks to null when the activity is destroyed".

Android memory usage with hprof

Java theory and practice: Plugging memory leaks with weak references

Make sure you've got your app running in the emulator via Eclipse. A real device should also work but I didn't try it. Probably sending the signal 10 needs to be done from w/in the code, see here. I'll assume you've already installed the MAT plugin.
Start a Windows cmd shell. Go to the directory where you installed the Android SDK and go into the 'tools' subdirectory.

First the directory /data/misc needs to be in mode 777 for the heap dump to be written. So execute:


adb shell chmod 777 /data/misc




Now you probably already have an idea which Activity is causing the OOM or might have a memory leak. To be able to easily spot it, open and close that Activity about 5 times. This makes sure it will stand out in the heap data. And, if you've opened/closed it 5 times, it should be fully GC'ed right? So if later on we see it still has 5 or 4 instances around, you know something is not cleaned up correctly!

Find the process id (PID) you want an heap dump for. That is your app, identified by the package path of your app. So execute to look it up:


adb shell ps


Take the first column (named PID). Send a signal 10 to that process via:


adb shell kill -10


In your LogCat output you should see the VM heap being dump with lines similar to:



01-31 15:29:34.112: INFO/dalvikvm(210): SIGUSR1 forcing GC and HPROF dump
01-31 15:29:34.112: INFO/dalvikvm(210): hprof: dumping VM heap to "/data/misc/heap-dump-tm1264948174-pid210.hprof-hptemp".
01-31 15:29:35.873: INFO/dalvikvm(210): hprof: dumping heap strings to "/data/misc/heap-dump-tm1264948174-pid210.hprof".
01-31 15:29:36.652: INFO/dalvikvm(210): hprof: heap dump completed, temp file removed


// Er. No.


I/ActivityManager( 93): Displayed com.jlptquiz.app/.StartActivity: +2s8ms
D/AlarmManagerService( 93): Kernel timezone updated to 240 minutes west of GMT
D/SystemClock( 183): Setting time of day to sec=1310338422
D/dalvikvm( 1005): GC_EXPLICIT freed 37K, 49% free 2794K/5379K, external 1712K/2137K, paused 54ms
W/InputManagerService( 93): Window already focused, ignoring focus gain of: com.android.internal.view.IInputMethodClient$Stub$Proxy@407c6680
D/AlarmManagerService( 93): Kernel timezone updated to 240 minutes west of GMT
D/SystemClock( 183): Setting time of day to sec=1310338471
D/dalvikvm( 183): GC_CONCURRENT freed 532K, 51% free 3097K/6215K, external 1625K/2137K, paused 6ms+3ms
D/dalvikvm( 318): GC_EXPLICIT freed 15K, 49% free 2755K/5379K, external 1625K/2137K, paused 63ms
I/ActivityManager( 93): Process com.jlptquiz.app (pid 1690) has died.
E/InputDispatcher( 93): channel '40534e30 com.jlptquiz.app/com.jlptquiz.app.StartActivity (server)' ~ Consumer closed input channel or an error occurred. events=0x8
E/InputDispatcher( 93): channel '40534e30 com.jlptquiz.app/com.jlptquiz.app.StartActivity (server)' ~ Channel is unrecoverably broken and will be disposed!
D/Zygote ( 67): Process 1690 terminated by signal (15)
I/WindowManager( 93): WIN DEATH: Window{40534e30 com.jlptquiz.app/com.jlptquiz.app.StartActivity paused=false}
V/RenderScript_jni( 190): surfaceCreated
V/RenderScript_jni( 190): surfaceChanged
W/InputManagerService( 93): Got RemoteException sending setActive(false) notification to pid 1690 uid 10057
D/SntpClient( 93): request time failed: java.net.SocketTimeoutException: Try again
W/InputManagerService( 93): Window already focused, ignoring focus gain of: com.android.internal.view.IInputMethodClient$Stub$Proxy@4080a2f8
D/SntpClient( 93): request time failed: java.net.SocketTimeoutException: Try again
D/SntpClient( 93): request time failed: java.net.SocketTimeoutException: Try again

Well, looking through this:

http://groups.google.com/group/android-developers/browse_thread/thread/6659aaf069e462d2/2f2327e87dfdf639

It seems that they have been trying to do some stuff with sending the dump to a file through DDMS - and sure enough (after I installed the MAT plugin) I saw an icon somewhere in DDMS that said something about an hprof dump file. And when I clicked it, lo and behold, it brought up a suspected leaks report in a debug window!

Not that I really get yet or that it's helpful, but this MAT tool looks to be very good. I do see my App object showing up in there, but the it's just a tiny slice of the 2.2 total 2.2 meg of heap? Memory available. It looks so great, I'm going to load up an image of it:





Theres a big hashtable in my app state, but it's only taking up 208k - 8.5% of what? The total ram? the 2.2 Meg of heap?

Ok, I ran it a few times - let's see if it's gotten bigger.

Yes - it has - and now it's the number one suspect. Check it out:

One instance of "com.jlptquiz.app.AppState" loaded by "dalvik.system.PathClassLoader @ 0x40513950" occupies 370,712 (13.67%) bytes. The memory is accumulated in one instance of "java.util.Hashtable$HashtableEntry[]" loaded by "".

Keywords
com.jlptquiz.app.AppState
java.util.Hashtable$HashtableEntry[]
dalvik.system.PathClassLoader @ 0x40513950



And it's 13.67%.

Well, it's sure starting too look like we're onto something. I remember writing in my bug report, something to the effect of making sure that it doesn't increase every time, but somehow I lost it. I think all I really need to do is clear it.

But, this is fun. I want to see it go up some more...

Ok, it's up to 17.23%.

Let's run like 100 times...

Ok, 15 - 20 times...

And it's up to 22%.

Ok, let's fix this. Here's the array that's causing all the problems:

public Hashtable> wordGroupingsBySuffix = new Hashtable>();

It's getting reloaded each time a question is asked, reason being that if the activity was lost, then this would have to be recreated, and I simply didn't get to checking to see if the array hashmap already had entries. Next time, make a huge TODO. But - why wouldn't it just overwrite the existing entries? Why is it adding more? Does it have something to do with the strings objects pointed to by the stored array aren't released, somehow?

Well, let's just check for the existence of entries in the map, and just leave it alone if it's already there.
if (appState.wordGroupingsBySuffix.isEmpty() == false) {
return;
}

Let's try it...

Ok, after the first 5 or so, it's not on the suspect's list. It has a retained heap of 44k and 1.88%. Let's do say 10 or 20 more and see what happens.

Ok, after about 20 runs - still not a suspect. What about the heap?

It's not even showing up in it anymore. Am I doing this right?

Ah, there it is. The dominator tree. And the numbers are 44,274 and 1.8%. Sounds familiar. And thus, at least one memory problem is solved. The Memory Analyzer Tool, which I believer was developed by a couple of whiz kids at SAP, is really spectacular in taking the pain out of solving memory bugs.

Heres the url for the Eclipse plug-in:

http://www.eclipse.org/mat/

After all that, I probably didn't even need root :)

That's a wrap for now.




No comments:

Post a Comment