Shortly after the new Android Runtime made its grand entrance, I ran a pretty exhaustive (and exhausting) series of performance benchmarks that showed ART wasn't really ready to blow us away. At the time, I opted to avoid the topic of battery life because it is so difficult to test accurately and with unbiased, meaningful results. As it turns out, that was dumb. Yup, so many of you have asked that I finally had no choice but to dive in and run a battery of tests on...well, the battery.

I've been running tests for over 6 weeks, covering virtually every angle I could think of and repeating several measurements to ensure the results were consistent and accurate. To be honest, the vast majority of results were boring and predictable (given the results of some other tests). Instead of boring readers with numbers and graphs, I wanted this post to be interesting, so the only things you’ll see here are the most important results and explanations. If you have questions about a given scenario, feel free to ask in the comments.

What I’m Testing For

Traditional methods for benchmarking battery life aren’t particularly useful for comparing runtimes. Frankly, traditional methods aren’t even that good for comparing phones. Simply maxing out the processor or running a video for several hours isn't going to demonstrate the advantages of ART. Rather, we should see lower demands on the hardware while doing normal activities, ultimately conserving battery life. There are 3 scenarios where a more efficient runtime should save power:

  • Shorter Wakelocks - While your phone or tablet sits idle, it wakes up fairly often to perform regular tasks. As long as the CPU is awake, it is consuming more power. If execution can be sped up, those background processes can finish their work more quickly and get the device back to sleep. Even if this only shaves off milliseconds, the total savings could be substantial.
  • Lower Clock Speed - Virtually all modern processors are capable of adjusting their clock speed on the fly. Much like a car, speed comes at the cost of power. An efficient runtime can allow the processor to get the same work done while running at lower speeds. Some of the biggest beneficiaries will be seemingly simple things like animations, scrolling list views, and app switching.
  • Better Use Of Hardware - The lower demands on the processor can enable even deeper optimizations. For example, it may be possible to target ARM’s big.LITTLE architecture, which pairs high performance cores for intense processing with low power cores for simple tasks. Activities that may have once required a moderate amount of power to complete could possibly operate on weaker cores without sacrificing performance, drastically cutting battery usage. Unfortunately, this probably won't happen for a while, but it is definitely something to look forward to.
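
To put the wakelock point in rough numbers, here is a back-of-the-envelope sketch in Python. The wakeup count and per-wakeup durations are made-up values chosen purely for illustration; they are not measurements from these tests.

```python
def awake_seconds_per_day(wakeups_per_day, ms_per_wakeup):
    """Total time the CPU spends awake per day for periodic background work."""
    return wakeups_per_day * ms_per_wakeup / 1000.0


# Hypothetical figures, chosen only for illustration.
WAKEUPS_PER_DAY = 2000  # assumed number of background wakeups per day
BASELINE_MS = 250       # assumed awake time per wakeup
FASTER_MS = 230         # the same work finishing 20 ms sooner

baseline = awake_seconds_per_day(WAKEUPS_PER_DAY, BASELINE_MS)
faster = awake_seconds_per_day(WAKEUPS_PER_DAY, FASTER_MS)
saved = baseline - faster

print(f"awake time saved per day: {saved:.0f} s ({saved / baseline:.0%} of baseline)")
```

With these assumed numbers, trimming 20 milliseconds from each wakeup adds up to 40 fewer seconds of awake time per day. How much battery that actually represents depends on the hardware, which is exactly what the tests below try to find out.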

In short, I want to isolate and dig into the common activities on our devices to determine if ART can do them more efficiently.

Testing Setup

I used two devices for my measurements, a Nexus 4 and a Nexus 5. Each was flashed with stock factory images of Android 4.4.2 (KOT49H). I wanted to keep the tests as controlled as possible while still keeping them relevant to real-world scenarios. To this end, both handsets were set up with a blank Gmail account so there wouldn’t be any unpredictable notifications or messages, but regular communication with Google’s servers would continue.

Wi-Fi stayed enabled and configured to connect only to a specific access point. SIM cards were not installed at any point and Bluetooth was disabled during all tests.

Aside from the apps included with the factory images from Google, the only other installed app was a custom battery logging tool I built for these tests. No apps were allowed to update during or between tests, including the Play Store and Play Services Framework, which like to silently update.

Disclaimer: As with the performance benchmarks, I want to be clear that these are simply the results of my own tests and can't represent every possible scenario. Differences in hardware, software versions, and other external factors could skew results in unpredictable ways. In other words, don't be angry if your experiences don't line up with my own.

Test 1 - Idle With Background Processes

No matter how powerful and useful our smartphones and tablets are, they usually have to sit with the screen off to conserve energy. Even when left unused, a little bit of power is consumed every time a service spins up to download the latest weather information, sync messages with an email server, or poll for updates from Facebook and Twitter.

Even though most events involve communicating with web services, I wanted to remove the networking component and the potential variables that might come from irregular or unpredictable data. To simulate the right conditions, I wrote a simple app that wakes up every 5 minutes, runs a standard sorting algorithm on a moderately sized data set, and then logs the battery level. Each test ran for just over 24 hours. These are the results:
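
For readers curious about the shape of that workload, here is a minimal sketch in Python. It is not the actual test app, which ran on Android; the data set size, the seed, and the `read_battery_percent` callable are all assumptions standing in for details that aren't spelled out above, and on a real device the battery reading would come from the platform's battery API.

```python
import random
import time


def make_dataset(size=50_000, seed=42):
    """Build the same pseudo-random data set on every run so trials are comparable."""
    rng = random.Random(seed)
    return [rng.randint(0, 1_000_000) for _ in range(size)]


def run_wakeup(data, read_battery_percent, log):
    """One simulated wakeup: sort a copy of the data, then record the battery level."""
    started = time.monotonic()
    result = sorted(data)  # stand-in for the "standard sorting algorithm"
    elapsed_ms = (time.monotonic() - started) * 1000.0
    log.append((time.time(), read_battery_percent(), elapsed_ms))
    return result


def run_trial(wakeups, interval_seconds, read_battery_percent, sleep=time.sleep):
    """Repeat the wakeup on a fixed interval and return the collected log."""
    data = make_dataset()
    log = []
    for i in range(wakeups):
        run_wakeup(data, read_battery_percent, log)
        if i < wakeups - 1:
            sleep(interval_seconds)  # the real test waited 5 minutes here
    return log
```

A 24-hour trial at 5-minute intervals would be roughly `run_trial(288, 300, read_battery_percent)`. Using a fixed seed keeps the work identical from run to run, so any difference in battery drain comes from the runtime rather than the data.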

The differences are pretty small given the timeline, but they do favor Dalvik by a little bit. All in all, this isn’t particularly surprising, as many built-in functions that occur as a part of running background services are already heavily optimized or based in native code, which we already know isn't running quite as smoothly on ART. In a more typical environment, we would probably see both sides becoming slightly more balanced. I suspect this might start to tip the other way after the next version of Android rolls out.

From left to right: Nexus 4 (Dalvik then ART), Nexus 5 (Dalvik then ART)

Test 2 - Video Rundown

Quite a few readers specifically requested a video rundown. I don't personally think it's a good way to even compare phones because it's a bit of a biased test, but it's easy enough to do and produced a side effect that was very interesting.

I chose a 720p resolution MP4 with x264 encoding and looped it for exactly 3 hours 25 minutes. Multiple players were used, but there was no discernible difference in the results, so MX Player is filling in on the screenshots.

The numbers don't differ by much, but ART on the Nexus 4 does come out slightly ahead and there's a tie on the Nexus 5. Of course, video codecs are already heavily optimized and usually built from native code, typically on hardware with built-in decoders, so there was little chance of a significant power savings here.

From left to right: Nexus 4 (Dalvik then ART), Nexus 5 (Dalvik then ART)

The more interesting outcome from this test actually has to do with the battery usage report. Under Dalvik, we see the mediaserver and MX Player processes are clearly consuming most of the power, along with a healthy amount for the Android OS. However, ART attributes most of the power usage to the screen. Perhaps I’m thinking of that in reverse and Dalvik is under-reporting what it takes to power the screen while ART is calling it correctly. I’ll leave this up to the commenters to debate, but this is a strong indication that we’ll need to be a little more careful about diagnosing battery usage in the future.

Test 3 - Animation Rundown

As many have observed, the current incarnation of ART just feels much smoother. There certainly are fewer glitches and dropped frames during animations, particularly in web browsers and launchers. In fact, that’s one of the top reasons people have given for making the switch. I decided to investigate if these optimizations resulted in any sort of gain or loss to battery life. As it turns out, the results aren't just significant, they actually show us one of the weak points in ART.

This test called for another custom application - something that could roll through animations for a few hours without any interaction or randomness. I wrote a simple app that animated in a list with 100 cards, each with a picture and some text, then continuously scrolled back and forth through the list. Each trial ran for exactly 3 hours.

For the first time, ART breaks through with a definitive win in battery life. Oddly, only the Nexus 4 came out ahead while the Nexus 5 was behind, but more on that in a bit. At least this gives us some evidence that there are some places where ART could extend battery life - under the right circumstances. Animations tend to create a lot of objects in memory and must be able to make changes quickly. Some targeted optimizations to enhance animations would make a lot of sense, and enhanced battery life would make for a great side-effect.

Still, we have to ask why one device shows a win for ART while the other does not. I considered a few possibilities, but it occurred to me that a simple oversight of my own may have exposed a performance issue. The first version of my testing app used a single set of images on the cards, all of which were exactly 1/2 of the size necessary for rendering on a Nexus 4, which meant that they were 1/3 of the size for the Nexus 5. It looks like image processing was responsible for eating up a significant amount of power.
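
To see why the same images would cost more on the Nexus 5, it helps to work out the scaling involved. The sketch below assumes that "size" refers to linear dimensions (width and height) and that the work of scaling grows with the number of output pixels; both are my simplifications rather than measured facts.

```python
from fractions import Fraction


def upscale_factors(source_fraction):
    """Given the source image size as a fraction of the target's linear
    dimensions, return (linear scale factor, multiplier on pixel count)."""
    linear = 1 / Fraction(source_fraction)
    return linear, linear ** 2


for device, fraction in [("Nexus 4", Fraction(1, 2)), ("Nexus 5", Fraction(1, 3))]:
    linear, pixels = upscale_factors(fraction)
    print(f"{device}: scale {linear}x per side, {pixels}x the pixels to produce")
```

Under those assumptions, each image has to be stretched to 4 times its pixel count on the Nexus 4 and 9 times on the Nexus 5, so the Nexus 5 does more than twice the scaling work for every card drawn.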

I enlarged the images to match the PPI of each device and ran through the tests again. This produced the winning combination I think we were all hoping to see: ART was finally the winner on both testing devices. Sadly, the differences are still mostly negligible, but it's enough to be optimistic.

From left to right: Nexus 4 (Dalvik then ART), Nexus 5 (Dalvik then ART)

Conclusions

These results should hopefully give some context to the effect of ART on battery life. Even though the outcomes don't explicitly favor ART, they also don't really contradict any one story on the Internet. If anything, this just proves that there are situations where both runtimes can shine. Additionally, there are many other optimizations that can’t be accurately measured. None of them are represented here, and they could be responsible for more significant experiences.

I expect the majority of the people claiming wildly better battery life are experiencing a placebo effect. After all, every test I ran demonstrated no more than about 2%-4% difference after burning through nearly 50% of the battery.
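
For a sense of scale, the sketch below converts a gap like that into a relative difference. It assumes the 2%-4% figure means percentage points of total battery, and the two drain values are hypothetical readings picked to fall inside that range, not numbers from my logs.

```python
def relative_difference(drain_a, drain_b):
    """Gap between two battery drains, as a fraction of the larger drain."""
    return abs(drain_a - drain_b) / max(drain_a, drain_b)


# Hypothetical readings, in percentage points of battery used during a trial.
dalvik_drain = 50
art_drain = 47

gap_points = abs(dalvik_drain - art_drain)
print(f"gap: {gap_points} points, {relative_difference(dalvik_drain, art_drain):.0%} relative")
```

With these example values, a 3-point gap on a 50-point drain works out to a 6% relative difference, which is measurable in a controlled test but very hard to notice in day-to-day use.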

Still, I'm fairly certain early adopters won’t be swayed to return to Dalvik, and I'm not really sure they should be. If you’re on the fence about switching, I would still advise sticking to Dalvik, at least until the next version of Android. Aside from smoother animations and scrolling, this version certainly doesn’t offer any profound increase in performance or battery life, and it’s still prone to occasional bugs. At this stage, it’s still something best left to developers and some enthusiasts.

It’s important to remember that these underwhelming results are coming from a preview version of software the Android Team did not intend for regular users to even start using yet. The code isn’t optimized to the extent that we know it can be, and there are surely several safety checks to guard against bugs, all of which are adding overhead that will eventually become unnecessary. Of course, nobody wants ART to land in the crosshairs of an upcoming Bug Watch (wink), so let’s not be too hasty about stripping out those safety measures. Nonetheless, the current incarnation is simply about getting everything working and introducing a new runtime to the world. We should be looking forward to future versions for more tangible improvements.

I’m pretty sure I’ll be hammering out an even more thorough set of benchmarks when Android 4.5 (or 5.0, or whatever) launches in a few months. In the meantime, Part 4 is coming up really soon, so keep your eyes peeled!