C and C++
Q: What compiler should I use?
GCC is a compiler suite, created by the Free Software Foundation, which has support for the ARM7TDMI processor in the GBA. Most developers, both professional and amateur, use a version of GCC. If you run Microsoft Windows or Linux on the x86 architecture, try the "devkitARM" distribution of GCC, which you can download at http://www.devkitpro.org/.
Professional developers with a ton of money to spare may use a proprietary compiler. If you are a professional, being paid for your work, it may be worth your while to consider ARM's own compiler or Green Hills compilers.
Q: How do I install devkitARM on Windows?
Use the latest devkitPro Updater, an installer written in NSIS that automatically fetches and installs the required components for you. If you are developing for the Game Boy Advance or Nintendo DS, you probably want everything but devkitPPC and devkitPSP.
This installer places files inside folders called 'bin', 'lib', 'include', etc. inside C:\devkitPro\devkitARM\. There should be a file called C:\devkitPro\devkitARM\bin\arm-eabi-gcc.exe. It's not something to be double-clicked; you'll have to use the command prompt. Most GCC users are familiar with their favorite operating system's command prompt. If you have never used a command prompt, learn DOS; the Windows command prompt is copied nearly wholesale from DOS's.
Ordinarily, you will use GNU Make to build your projects, and a conforming makefile handles the PATH environment variable for you, based on the contents of the DEVKITPRO and DEVKITARM variables. But if you are using devkitARM components outside of a makefile, such as arm-eabi-nm to see what is taking up space in the binary, you'll have to add the path of the folder containing arm-eabi-gcc.exe to your PATH. Don't do this system-wide, or builds will fail for mysterious reasons. Instead, make a batch file that adds to the PATH for only one command prompt session:
@echo off
set PATH=C:\devkitPro\devkitARM\bin;%PATH%
cmd
And then double-click this batch file.
If you are writing your own makefile instead of using the makefile from the libgba project template, there are some more details that apply to cross-compilers such as devkitARM: When linking a program, you'll need to specify the Thumb C library (-mthumb) that supports interworking (-mthumb-interwork) and the GBA memory layout (-specs=gba.specs for ROM or -specs=gba_mb.specs for multiboot), or you'll just get a white screen:
arm-eabi-gcc -Wall -mthumb -mthumb-interwork -specs=gba_mb.specs hello.c -o hello.elf
arm-eabi-objcopy -O binary hello.elf hello.mb
Q: How do I do XYZ in C or in C++?
See the comp.lang.c FAQ or C++ FAQ Lite.
Q: How do I put multiple files in a program?
Make a header (.h) file containing the class/struct declarations and function prototypes, and #include it from your .c and .cpp files.
DON'T:
Add other C files to your main.c by #include "foo.c".
DO:
Add other C files by compiling them separately (arm-eabi-gcc {options} -c foo.c -o foo.o) and linking them together (arm-eabi-gcc {options} a.o b.o c.o d.o -o game.elf).
Q: How do I put more than one source code file into a project?
Use GNU Make, which comes with MinGW or MSYS. You might find the GNU Make manual and this tutorial by sajiimori helpful.
Q: How do I put large (> 16 KB) arrays into a GBA program without it crashing?
The linker script included with devkitARM puts arrays and other variables into IWRAM unless you tell it otherwise. The trouble is that IWRAM is only 32 KiB. For arrays that you don't plan to modify, use the keyword const, which will instruct the linker to put the entire array into ROM (or EWRAM for .mb programs). For arrays that you do plan to modify, put them into EWRAM using a section attribute on the array's definition:
- __attribute__((section (".sbss"))) char foo[8192];
Puts the variable in EWRAM and initializes it to zero at program start. (Initializer values are ignored.)
- __attribute__((section (".ewram"))) char foo[8192] = {3, 4, /*... */ };
Puts the variable in EWRAM and initializes it to the given values at program start. (This uses space in the binary even if initializer values are not given.)
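As a rough sketch, the three placements described above (const data, zero-filled EWRAM, initialized EWRAM) might be wrapped in macros like this. The macro names IN_EWRAM_BSS and IN_EWRAM_DATA and the array names are made up for illustration, and the `__arm__` check is only an assumption so that the same file still compiles with a host compiler, where the attributes would be meaningless.

```c
/* Hypothetical convenience macros; the names are illustrative only.
   The section names are the ones used in the answer above. */
#if defined(__arm__)
#define IN_EWRAM_BSS  __attribute__((section(".sbss")))
#define IN_EWRAM_DATA __attribute__((section(".ewram")))
#else
#define IN_EWRAM_BSS   /* host build: ordinary variable */
#define IN_EWRAM_DATA  /* host build: ordinary variable */
#endif

/* Never modified: const lets the linker keep it out of IWRAM. */
const unsigned char level_map[4] = { 1, 2, 3, 4 };

/* Large scratch buffer: zero-filled at program start, no initializer. */
IN_EWRAM_BSS unsigned char decompress_buffer[32768];

/* Modifiable data with starting values: costs space in the binary. */
IN_EWRAM_DATA unsigned char player_state[4] = { 3, 4, 0, 0 };
```

Running arm-eabi-nm on the resulting .elf is one way to check which addresses the arrays actually landed at.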
Q: Where can I get a header file describing the GBA hardware?
Most developers nowadays use the header files that come with libgba or libnds. They contain several useful macros to define the addresses of the memory-mapped I/O registers and to form RGB color values, plus constants that represent bitwise settings.
This paragraph is historical. At one time, the Pin Eight GBA header file by Damian Yerrick (tepples on gbadev forum), distributed with most of his GBA software, was popular. It contains macros much like those in the libgba headers, but with different names, initially for what were thought to be legal purposes. In many cases, where there exist multiple sets of registers for a given task (such as the DMA, timers, background scrolling, background control, etc), the macros simulate a structure in register space, an array of registers, or even an array of structs. It began to fall out of use around the time libfat was adopted because libfat depended on the libgba headers. However, libgba headers do use arrays and structures in many cases.
Q: How do I use C effectively?
Some of these tips will work. But remember that clarity is more important than optimization. According to Donald Knuth, "premature optimization is the root of all evil."
Don't divide too much. Because the ARM architecture has no hardware divider, and Nintendo forgot to put one in the GBA's I/O area like it did on the Super NES, the / and % operators are extremely slow. Don't use them unless you really have to. Alternatives, in order of most preferable to least preferable:
- For division by a power of 2, bitshifting; for modulus by a power of 2, the & operator.
- For division by an approximate constant, fixed-point multiplication by its reciprocal. (Recent GCC may do this for you.)
- For division by an exact constant (as in binary to decimal conversion), 1. multiply by the reciprocal rounded properly, 2. multiply the quotient by the divisor, and 3. correct the quotient if it is off by one: with the reciprocal rounded up, subtract one from the quotient if the dividend is less than the product. (Recent GCC may do this for you as well.)
- For division by a variable in a small range (as in DDA algorithms), make a lookup table of reciprocals.
- For division by a variable in a large range, use the BIOS divide call, which is faster than the divide code that GCC inserts into your code, or a custom divider in IWRAM, which is even faster than the BIOS routine (which is optimized for size rather than speed, as is everything else in the BIOS).
- On the DS, you can use the hardware divider in the I/O area, but be careful with interrupts: there is only one context for both the main thread and for interrupts.
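Here is a minimal sketch of the first and third alternatives in the list above, assuming unsigned operands. The function names and the choice of a 16-bit dividend are illustrative assumptions. The reciprocal is rounded up, so the estimate is never too small and, for 16-bit inputs, at most one too large; the multiply-back step catches that case.

```c
#include <stdint.h>

/* Power-of-two divisor: a shift and a mask replace / and %
   (valid as written for unsigned values only). */
static inline uint32_t div8(uint32_t x) { return x >> 3; }
static inline uint32_t mod8(uint32_t x) { return x & 7u; }

/* 0.16 fixed-point reciprocal of 10, rounded up: ceil(65536 / 10). */
#define RECIP10 6554u

/* Exact n / 10 for a 16-bit dividend without using the / operator. */
static inline uint32_t div10_u16(uint16_t n)
{
    uint32_t q = ((uint32_t)n * RECIP10) >> 16; /* 1. multiply by reciprocal */
    if (q * 10u > n)                            /* 2. multiply back by divisor */
        q--;                                    /* 3. estimate was one too high */
    return q;
}
```

The remainder, if needed, is then n - q * 10, which again avoids the % operator.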
Avoid floating point. Don't use floating-point arithmetic unless you absolutely have to. The GBA has no dedicated floating-point hardware, and all floating-point operations have to be run in a slow software FPU emulator. Use fixed-point arithmetic instead; if you need floating-point arithmetic to create a data table, create it on the host computer before compiling the program.
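A minimal fixed-point sketch, assuming a 24.8 format (8 fractional bits) in a 32-bit integer. The type and function names are made up for this example, and the right shifts of negative values rely on GCC's arithmetic-shift behavior.

```c
#include <stdint.h>

typedef int32_t fixed;   /* 24.8 fixed point: value = raw / 256 */
#define FIX_SHIFT 8

/* Convert an integer to fixed point. */
static inline fixed fix_from_int(int x) { return (fixed)x * (1 << FIX_SHIFT); }

/* Convert back to an integer, rounding toward negative infinity. */
static inline int fix_to_int(fixed x) { return (int)(x >> FIX_SHIFT); }

/* Multiply two fixed-point values; the 64-bit intermediate keeps the
   product from overflowing before it is shifted back down. */
static inline fixed fix_mul(fixed a, fixed b)
{
    return (fixed)(((int64_t)a * b) >> FIX_SHIFT);
}
```

Addition and subtraction need no helper: two values in the same format add and subtract as plain integers.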
Avoid trigonometry. Use lookup tables instead. Write a program that runs on the host computer to build a lookup table containing values of cos(x*Pi/512) from x=0..256, and then write cos() and sin() functions that index into that table. This table contains only one quadrant of sin(), so you'll need to perform appropriate reflections.
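The reflections can be sketched as follows. To keep the example short, it assumes a scaled-down table: 32 angle units per full circle instead of 1024, with 9 entries holding cos(x*Pi/16)*256 rounded to the nearest integer. The names and the 1.8 fixed-point scale are illustrative assumptions; a real table would come from the host-side generator program.

```c
/* Quarter-wave table: cos(x*Pi/16) * 256 for x = 0..8. */
static const short cos_quarter[9] = {
    256, 251, 237, 213, 181, 142, 98, 50, 0
};

/* Cosine of an angle given in units of 1/32 of a circle. */
int lut_cos(unsigned int angle)
{
    angle &= 31;                        /* wrap to one full circle */
    if (angle > 16)
        angle = 32 - angle;             /* cos(2*Pi - x) = cos(x) */
    if (angle > 8)
        return -cos_quarter[16 - angle];/* cos(Pi - x) = -cos(x) */
    return cos_quarter[angle];
}

/* sin(x) = cos(x - Pi/2); adding 24 is subtracting 8 modulo 32. */
int lut_sin(unsigned int angle)
{
    return lut_cos(angle + 24);
}
```

With the full-size table from the text, the same logic applies with 1024, 512, and 256 in place of 32, 16, and 8.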
Avoid calling small functions in an inner loop. Calling a function flushes the processor's instruction pipeline and saves the contents of registers to the stack, and returning reads the stack and flushes the pipeline. Inline functions, on the other hand, are pasted directly into your code stream. For a small performance boost, make your smallest functions inline functions wherever it makes sense. At higher optimization settings (-O2 and especially -O3), GCC will try to do this for you when the caller and callee are in the same file. Use the static keyword to give GCC a hint that a function will be called only from one file; use static inline to give GCC an even stronger hint.
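For illustration, a small helper marked static inline and called from an inner loop might look like this. The function names are hypothetical; the color layout (5 bits each of red, green, blue from the low bit up) is the GBA's 15-bit format.

```c
#include <stdint.h>

/* Pack three 5-bit channels into a 15-bit GBA color.
   Small enough that GCC can paste it into each caller. */
static inline uint16_t rgb5(unsigned r, unsigned g, unsigned b)
{
    return (uint16_t)(r | (g << 5) | (b << 10));
}

/* Inner loop: one rgb5() per pixel, with no call overhead once inlined. */
void fill_gradient(uint16_t *dst, unsigned count, unsigned g, unsigned b)
{
    for (unsigned i = 0; i < count; i++)
        dst[i] = rgb5(i & 31u, g, b);
}
```

Whether the call is really inlined still depends on the optimization level, so check the disassembly if it matters.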
Use fast RAM. On the Game Boy Advance, compiling your code as ARM and putting it in IWRAM will make it faster. However, you have only 32 KB of IWRAM, so choose what to put in IWRAM wisely. But once you've profiled your program and decided which speed-critical functions to put in IWRAM, there are two ways to do it:
- arm-eabi-gcc -Wall -O2 -marm -mthumb-interwork -c mixer.c -o mixer.iwram.o
Puts everything in the compiled module into IWRAM.
- __attribute__((section (".iwram"),long_call))
int mixer(void) { /*...*/ }
Puts only one function's binary code into IWRAM.
Likewise, code that manipulates savegame data or flash cart bankswitch registers should usually run in EWRAM:
- __attribute__((section (".ewram"),long_call))
void bytecpy(void *restrict in_dst, const void *restrict in_src, unsigned int length) { /*...*/ }
Puts the function's binary code into EWRAM.
Less of this should be necessary on the Nintendo DS, whose ARM9 CPU has instruction and data caches.
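A sketch of the per-function approach on the GBA, with the attribute hidden behind a macro. The macro name IN_IWRAM and the mixing function are hypothetical, and the `__arm__` check is only there so the file also compiles on a host computer. Note that the attribute alone does not select ARM or Thumb mode; that still comes from -marm or -mthumb on the command line.

```c
/* Hypothetical macro name; the attribute is the one shown above. */
#if defined(__arm__)
#define IN_IWRAM __attribute__((section(".iwram"), long_call))
#else
#define IN_IWRAM  /* host build: ordinary function */
#endif

/* Speed-critical inner loop: average two streams of signed samples.
   Uses a shift rather than a division; relies on GCC's arithmetic
   right shift for negative sums. */
IN_IWRAM void mix_samples(const signed char *a, const signed char *b,
                          signed char *out, int count)
{
    for (int i = 0; i < count; i++)
        out[i] = (signed char)((a[i] + b[i]) >> 1);
}
```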
Q: How do I use C++ effectively?
C++ is a powerful language, but with great power comes great responsibility. Too many novices read through a "21-day" C++ book and come out of it with just enough knowledge to be dangerous. StoneCypher is a big fan of C++, and he has written a post clearing up some people's misconceptions.
Use whatever features of C and C++ that you're comfortable with, but know the cost of what you use. C++ is a big language; to help you find your way around, StoneCypher recommends the Effective C++ books by Scott Meyers. Not everybody considers C++ as maintainable as C, especially due to the larger context required to understand a given piece of code. I've read that Linus Torvalds isn't a big fan either, but that might be in part because Linux is older than the C++ standard.
Template types may produce large code, filling available RAM, unless you're very careful. Unfortunately, a lot of people are not. Know where your templates are instantiated, and know which classes are actually template instantiations (such as std::string).
The C++ standard library itself is a maze of templates. DevkitARM includes GNU libstdc++, which unfortunately is tuned for use as a shared library on Linux, not as a static library on an embedded system. A Hello World program for GBA that uses <iostream> compiles to over 180,000 bytes on devkitARM, even if you use -Os -Wl,-gc-sections to make ld try harder at removing unreachable code. A quick pass through arm-eabi-nm reveals that much of this space is used for locale support and the floating-point emulator. StoneCypher claimed that he and Wintermute were able to work around this, but he refused to give any details other than "RTFM". It may be possible to fix this if you have a lot of time on your hands, such as by replacing GNU libstdc++ with uClibc++.
But it's perfectly fine to use C++ with the parts of the standard library that it inherits from the C language, such as <cstdio> and <cstdlib>. In this environment, you can use C-style error handling through return values instead of exceptions. Consider T *foo = new(std::nothrow) T;, which returns NULL if it can't allocate an object, just like malloc does. If you know you're not using exceptions in a program, give g++ the -fno-exceptions flag when you compile and link for a minor performance boost.
Run-time type identification (RTTI) is used for the dynamic_cast<> and typeid operators. One can often refactor an object-oriented design so as not to use these operators. If you know you're not using RTTI in a program, give g++ the -fno-rtti flag when you compile and link for a minor performance boost.
If declaring a pure virtual method gives "undefined reference to 'write'", you are using old devkitARM. If you cannot upgrade, see this topic for a solution.
If your C++ program is much larger than you expect, or you start running out of RAM (especially in a GBA multiboot or on the DS ARM7), see tepples' list of six space-saving tips.
Q: In C++, what is the difference between a struct and a class?
A class can contain data members (also called fields), member functions (also called methods), and virtual member functions; so can a struct. A class can contain members with public, protected (subclass API), or private (internal) privilege; so can a struct. A class can inherit or be inherited; so can a struct.
In fact, C++ sees only one difference between a struct and a class:
A struct's members before the first privilege declaration are public; those of a class are private.
The other difference is that the C language accepts only struct. C and C++ are officially separate languages, but often, code in the C++ language must work together with code in the C language. Programmers tend to apply the term "struct" to data types that match the "Plain Old Data" (POD) semantics inherited from C. The difference between POD and non-POD is orthogonal to the difference between class and struct, but this distinction remains a good practice in order to mark types as safe for use in C.
Q: My program works when compiled without optimizations (-O0), but when compiled with optimizations (-O, -O2, -O3), it misbehaves. How do I fix this?
The following paragraph is historical, as libgba headers handle this correctly.
Most likely: You forgot to declare your registers as volatile. When optimizing your program, GCC removes repeated accesses to memory addresses that it assumes do not change. This is not true of the memory-mapped registers such as the vertical scan position register (REG_VCOUNT) and the joypad data register. To tell GCC that the values of those registers change behind your program's back, you have to declare them as volatile.
/* Instead of this: */
#define REG_VCOUNT (*(unsigned short *)0x04000006)
/* Do this */
#define REG_VCOUNT (*(volatile unsigned short *)0x04000006)
Also use volatile for variables that an interrupt service routine can change.
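A minimal sketch of the interrupt case, assuming a vertical blank handler that counts frames. The function names are hypothetical, and how the handler gets registered depends on your interrupt setup, which is not shown here.

```c
/* Shared between the interrupt handler and the main loop.
   Without volatile, an optimizing compiler may read it once and
   never notice that the handler changed it. */
static volatile unsigned int frame_count;

/* Hypothetical vblank handler: called once per frame by the
   interrupt dispatcher. */
void on_vblank(void)
{
    frame_count++;
}

/* Number of frames counted since the given starting value. */
unsigned int frames_since(unsigned int start)
{
    return frame_count - start;
}

/* Spin until the handler has run at least once more.
   On hardware this returns at the next vblank interrupt. */
void wait_next_frame(void)
{
    unsigned int start = frame_count;
    while (frame_count == start) {
        /* each pass re-reads frame_count because it is volatile */
    }
}
```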
Far less likely: You ran into an optimizer bug. Check the GCC mailing lists to see if any ARM optimizer bugs have been reported in the last couple months. You may find that optimizer bugs have been fixed in a version of devkitARM based on a more recent GCC.
Q: How should I specify what files to rebuild?
It used to be common to use batch files that rebuild everything every time, especially for the simplest projects. Nowadays, most projects use GNU Make, an automated rebuilding tool which is included with MSYS. The libgba and libnds project templates include makefiles that automatically discover and compile all C source code files in a specific folder. See also the GNU Make manual.
Q: How do I set up devkitARM with my favorite integrated development environment?
You can try the Eclipse instructions or the Visual Studio instructions.
Q: How do I generate pseudo-random numbers?
See the topics linked from here.