
Arduino vs. embedded C - AVR reversing

While teaching some embedded C basics, I was asked what the benefits of embedded C are over the classic Arduino language on an Arduino-based board. This article looks at what we can learn by reversing a really simple program compiled with both methods for the Arduino Uno.

Prerequisites

  • arduino-cli: Command Line Interface for Arduino
  • AVR cross compiler: sudo apt install gcc-avr
  • Optional: an Arduino simulator such as SimulIDE

Sample program

[Figure: arduino_blink.jpg]

We want to write the simplest possible program whose only goal is to turn on the built-in LED, connected to digital pin 13 (PB5, i.e. bit 5 of port B) on the Arduino Uno.

Arduino

void setup()
{
    pinMode(LED_BUILTIN, OUTPUT);
}
void loop()
{
    digitalWrite(LED_BUILTIN, HIGH);
}

Embedded C

#include <avr/io.h>
int main()
{
    DDRB = 1 << PB5;
    PORTB = 1 << PB5;
    return 0;
}
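
As a quick check of the expression used above, here is a small host-side sketch that evaluates 1 << PB5 on a PC rather than on the board. It assumes PB5 is 5, as avr-libc defines it for the ATmega328P; bit_mask is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Assumption: mirrors the avr-libc definition of PB5 for the ATmega328P. */
#define PB5 5

/* Build a one-bit mask for the given bit position (0..7). */
static uint8_t bit_mask(unsigned bit)
{
    return (uint8_t)(1u << bit);
}
```

Calling bit_mask(PB5) gives 0x20, i.e. only bit 5 set, which is the value assigned to both DDRB and PORTB in the program.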

Code and compilation

$ ls -lR
./led-arduino:
total 4
-rw-r--r-- 1 pascal pascal 104 mai    1 12:58 led-arduino.ino
./led-embedded:
total 4
-rw-r--r-- 1 pascal pascal 81 mai    1 13:00 led-embedded.ino
$ arduino-cli compile --fqbn arduino:avr:uno led-arduino
$ arduino-cli compile --fqbn arduino:avr:uno led-embedded

Binary comparison

              Arduino     Embedded C
Storage use   724 bytes   144 bytes

The embedded C binary is about 5 times smaller than the Arduino one, which is a bit “weird” since both programs do the same thing! Let’s find out why by disassembling the binaries and reading the assembly code.

arduino-cli produces an *.elf binary and a *.hex file. The latter contains just the bytes to be loaded into the Arduino's flash, encoded as text in the Intel HEX format.
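
To make the *.hex file less opaque, here is a minimal sketch of a checksum test for one line of such a file, assuming the usual Intel HEX record layout: ':' followed by a byte count, a 16-bit address, a record type, the data bytes and a checksum, all written as hex digit pairs. The helper names hex_digit and ihex_record_ok are hypothetical, and the line is expected without its trailing newline.

```c
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Return 1 if `line` is a well-formed record with a valid checksum, else 0. */
static int ihex_record_ok(const char *line)
{
    size_t len = strlen(line);
    unsigned sum = 0;
    unsigned count;
    size_t i;

    /* Shortest record: ':' + count, address (2 bytes), type, checksum. */
    if (len < 11 || line[0] != ':' || (len - 1) % 2 != 0)
        return 0;

    /* Add up every byte of the record, checksum included. */
    for (i = 1; i < len; i += 2) {
        int hi = hex_digit(line[i]);
        int lo = hex_digit(line[i + 1]);
        if (hi < 0 || lo < 0)
            return 0;
        sum += (unsigned)(hi * 16 + lo);
    }

    /* The first byte announces how many data bytes the record carries. */
    count = (unsigned)(hex_digit(line[1]) * 16 + hex_digit(line[2]));
    if ((len - 1) / 2 != count + 5)
        return 0;

    /* The checksum is chosen so that the low byte of the sum is zero. */
    return (sum & 0xFF) == 0;
}
```

For illustration, a record built by hand from the 12 bytes of main at address 0x80 would read ":0C00800080E284B985B990E080E008952A". It is not copied from the generated file, whose records may be split differently.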

Reversing binaries

$ avr-objdump -S led-arduino/led-arduino.arduino.avr.uno.elf > led-arduino.asm
$ avr-objdump -S led-embedded/led-embedded.arduino.avr.uno.elf > led-embedded.asm

Arduino binary

void init()
{
	// this needs to be called before setup() or some functions won't
	// work there
	sei();
 174:	78 94       	sei
	
	// on the ATmega168, timer 0 is also used for fast hardware pwm
	// (using phase-correct PWM would mean that timer 0 overflowed half as often
	// resulting in different millis() behavior on the ATmega8 and ATmega168)
#if defined(TCCR0A) && defined(WGM01)
	sbi(TCCR0A, WGM01);
 176:	84 b5       	in	r24, 0x24	; 36
 178:	82 60       	ori	r24, 0x02	; 2
 17a:	84 bd       	out	0x24, r24	; 36
	sbi(TCCR0A, WGM00);
 17c:	84 b5       	in	r24, 0x24	; 36
 17e:	81 60       	ori	r24, 0x01	; 1
 180:	84 bd       	out	0x24, r24	; 36
	// this combination is for the standard atmega8
[...]

This is only part of the Arduino disassembly: the assembly listing is really long for such a simple program. The excerpt above comes from init(), which runs before setup() and configures the timers. The rest is easy to understand when we look at the source code of pinMode() and digitalWrite() (https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_digital.c): they do much more than write a value into a register.
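
To give a feel for that extra work, here is a simplified host-side model of the pin lookup that happens before the register write. It is a sketch under assumptions, not the real Arduino source: the sim_* names and the simulated register arrays are hypothetical, and the pin mapping is the standard Uno one (pins 0 to 7 on port D, 8 to 13 on port B, 14 to 19 on port C).

```c
#include <stdint.h>

/* Simulated registers (hypothetical stand-ins for the real DDRx/PORTx). */
enum { SIM_PORT_B = 0, SIM_PORT_C = 1, SIM_PORT_D = 2, SIM_NOT_A_PIN = -1 };
static uint8_t sim_ddr[3];
static uint8_t sim_port[3];

/* Map an Uno pin number to a port index and store its bit mask in *mask. */
static int sim_pin_lookup(uint8_t pin, uint8_t *mask)
{
    if (pin <= 7)  { *mask = (uint8_t)(1u << pin);        return SIM_PORT_D; }
    if (pin <= 13) { *mask = (uint8_t)(1u << (pin - 8));  return SIM_PORT_B; }
    if (pin <= 19) { *mask = (uint8_t)(1u << (pin - 14)); return SIM_PORT_C; }
    return SIM_NOT_A_PIN;
}

/* Model of pinMode(pin, OUTPUT): set the direction bit of the pin. */
static void sim_pin_mode_output(uint8_t pin)
{
    uint8_t mask;
    int port = sim_pin_lookup(pin, &mask);
    if (port == SIM_NOT_A_PIN)
        return;
    sim_ddr[port] |= mask;
}

/* Model of digitalWrite(pin, value): read-modify-write of one port bit. */
static void sim_digital_write(uint8_t pin, int high)
{
    uint8_t mask;
    int port = sim_pin_lookup(pin, &mask);
    if (port == SIM_NOT_A_PIN)
        return;
    /* The real function also turns off PWM if the pin is attached to a
     * timer and disables interrupts around the update; omitted here. */
    if (high)
        sim_port[port] |= mask;
    else
        sim_port[port] &= (uint8_t)~mask;
}
```

Calling sim_pin_mode_output(13) and then sim_digital_write(13, 1) leaves 0x20 in both simulated port B registers, the same end state as the two assignments of the embedded C version, but reached through a lookup, a validity check and a read-modify-write.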

Garretlab made an analysis of both functions:

Embedded C binary

00000080 <main>:
#include <avr/io.h>
int main()
{
  DDRB = 1<<PB5;
  80:	80 e2       	ldi	r24, 0x20	; 32
  82:	84 b9       	out	0x04, r24	; 4
  PORTB = 1<<PB5;
  84:	85 b9       	out	0x05, r24	; 5
  return 0;
  86:	90 e0       	ldi	r25, 0x00	; 0
  88:	80 e0       	ldi	r24, 0x00	; 0
  8a:	08 95       	ret

At first sight, it’s clearly simpler. Let’s decode the instructions step by step:

ldi r24, 0x20: according to the AVR ISA [1], ldi (load immediate) loads the value 0x20 into r24, one of the general-purpose working registers of the AVR core.
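
The two bytes printed by objdump for this instruction, 80 e2, can be decoded by hand. The sketch below assumes the LDI encoding from the AVR instruction set, 1110 KKKK dddd KKKK, and that the bytes are stored little-endian, so they form the 16-bit word 0xE280. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded fields of an LDI instruction. */
struct ldi_fields {
    unsigned rd;   /* destination register number, 16..31 */
    unsigned k;    /* 8-bit immediate value */
};

/* Decode `word` as LDI (bit pattern 1110 KKKK dddd KKKK).
 * Returns 1 and fills *out on success, 0 if the word is not an LDI. */
static int decode_ldi(uint16_t word, struct ldi_fields *out)
{
    if ((word & 0xF000) != 0xE000)
        return 0;
    /* dddd selects one of r16..r31. */
    out->rd = 16u + ((word >> 4) & 0x0Fu);
    /* High nibble of K sits in bits 11..8, low nibble in bits 3..0. */
    out->k = ((word >> 4) & 0xF0u) | (word & 0x0Fu);
    return 1;
}
```

Decoding 0xE280 this way yields rd = 24 and k = 0x20, which matches the ldi r24, 0x20 shown in the listing. It also explains why ldi can only target r16 to r31: the register field is 4 bits wide.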

[Figure: registers]

out 0x04, r24: this instruction writes the value stored in r24 to I/O address 0x04 (the address of DDRB according to [2]). 0x20 = 0b100000, so bit 5 of DDRB (DDB5) is set to 1, which configures PB5 as an output 😉

[Figure: mapping]

out 0x05, r24: the same thing for PORTB (I/O address 0x05): bit 5 is set to 1, which drives PB5 high and lights the LED.
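
The three instructions decoded above can be replayed on a PC with a toy model of the CPU state. This is a minimal sketch, not a real simulator: it only knows LDI and OUT, it assumes the OUT encoding 1011 1AAr rrrr AAAA from the AVR instruction set, and the mini_avr names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal machine state: 32 working registers and the 64-byte I/O space. */
struct mini_avr {
    uint8_t r[32];
    uint8_t io[64];
};

/* Execute one instruction word; only LDI and OUT are modelled.
 * Returns 1 if the word was executed, 0 if it is not supported. */
static int mini_avr_step(struct mini_avr *cpu, uint16_t word)
{
    if ((word & 0xF000) == 0xE000) {            /* LDI Rd, K */
        unsigned rd = 16u + ((word >> 4) & 0x0Fu);
        cpu->r[rd] = (uint8_t)(((word >> 4) & 0xF0u) | (word & 0x0Fu));
        return 1;
    }
    if ((word & 0xF800) == 0xB800) {            /* OUT A, Rr */
        unsigned rr = (word >> 4) & 0x1Fu;
        unsigned a = ((word >> 5) & 0x30u) | (word & 0x0Fu);
        cpu->io[a] = cpu->r[rr];
        return 1;
    }
    return 0;
}

/* Run a sequence of words; returns how many were executed before the
 * first unsupported one. */
static size_t mini_avr_run(struct mini_avr *cpu, const uint16_t *code, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (!mini_avr_step(cpu, code[i]))
            break;
    }
    return i;
}
```

Feeding it the words 0xE280, 0xB984 and 0xB985 (the bytes 80 e2, 84 b9 and 85 b9 from the listing, read as little-endian words) on a zeroed state leaves 0x20 at I/O addresses 0x04 and 0x05: DDRB and PORTB both have bit 5 set.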

Conclusion

Embedded C code is usually faster and smaller, provided we make the effort to study the datasheet. But it’s worth it when we’re working with embedded systems! 😉

References

  1. AVR ISA
  2. ATmega328 datasheet