Assembler

From Sega Retro

An assembler is a computer program for translating assembly language (essentially, a mnemonic representation of machine language) into object code. A cross assembler runs on one type of processor but produces code for another.

As well as translating assembly instruction mnemonics into opcodes, assemblers let the programmer use symbolic names for memory locations, which saves tedious calculation and avoids manually updating addresses whenever a program is modified. They also provide macro facilities for textual substitution, typically used to encode common short sequences of instructions that run inline instead of in a subroutine.
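The three jobs described above (mnemonic-to-opcode translation, symbolic names, and macro substitution) can be sketched with a toy two-pass assembler. This is only an illustration: the instruction set, the opcode values, the `PAUSE` macro, and the function names are all invented for the example and do not correspond to any real processor or assembler.

```python
# Hypothetical toy instruction set: mnemonic -> (opcode byte, takes an operand?)
OPCODES = {
    "NOP": (0x00, False),
    "LDA": (0x01, True),
    "JMP": (0x02, True),
    "HLT": (0xFF, False),
}


def expand_macros(lines, macros):
    """Textual substitution: replace a macro name with its body lines."""
    out = []
    for line in lines:
        if line in macros:
            out.extend(macros[line])
        else:
            out.append(line)
    return out


def assemble(source, macros=None):
    # Strip ';' comments and blank lines, then expand macros inline.
    lines = [raw.split(";")[0].strip() for raw in source.splitlines()]
    lines = expand_macros([l for l in lines if l], macros or {})

    # Pass 1: record the address of every label and collect instructions.
    symbols, instructions, address = {}, [], 0
    for line in lines:
        if line.endswith(":"):
            symbols[line[:-1]] = address
            continue
        parts = line.split()
        opcode, takes_operand = OPCODES[parts[0].upper()]
        instructions.append((opcode, parts[1] if takes_operand else None))
        address += 2 if takes_operand else 1

    # Pass 2: emit bytes, replacing symbolic names with their addresses.
    code = bytearray()
    for opcode, operand in instructions:
        code.append(opcode)
        if operand is not None:
            value = symbols[operand] if operand in symbols else int(operand, 0)
            code.append(value & 0xFF)
    return bytes(code), symbols


SOURCE = """
start:
    LDA 5       ; load an immediate value
    PAUSE       ; macro, expanded inline
    JMP end     ; forward reference to a label
    NOP
end:
    HLT
"""
MACROS = {"PAUSE": ["NOP", "NOP"]}  # hypothetical macro definition

code, symbols = assemble(SOURCE, MACROS)
print(code.hex(), symbols)
```

The first pass exists because `JMP end` refers to a label that has not been seen yet. Adding or removing an instruction before `end:` changes its address, and the assembler recomputes it without any change to the `JMP` line.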

Assemblers are far simpler to write than compilers for high-level languages, and have been available since the 1950s. Modern assemblers, especially for RISC-based architectures such as MIPS, Sun SPARC and HP PA-RISC, optimize instruction scheduling to exploit the CPU pipeline efficiently.

High-level assemblers provide high-level-language abstractions such as advanced control structures, high-level procedure/function declarations and invocations, and high-level abstract data types including structures/records, unions, classes, and sets.

On Unix systems, the assembler is traditionally called "as", although it is not a single body of code, being typically written anew for each port. A number of Unix variants use GAS, the GNU Assembler.

Even within a single processor family, each assembler has its own dialect. Some assemblers can read another assembler's dialect: for example, TASM can read old MASM code, but not the reverse. FASM and NASM have similar syntax, but each supports different macros, which can make code difficult to translate from one to the other. The basics are the same, but the advanced features differ.

Assembly code can sometimes be ported between operating systems that run on the same type of CPU. Calling conventions often differ only slightly, or not at all, between operating systems, so with care some portability is achievable, usually by linking with a C library whose interface does not change between operating systems. However, it is not possible to link portably with C libraries that require the caller to use preprocessor macros that may change between operating systems. For example, many things in libc depend on the preprocessor to do OS-specific, C-specific things to the program before compiling, and some functions and symbols are not even guaranteed to exist outside of the preprocessor. Worse, the size and field order of structs, as well as the size of certain typedefs such as off_t, are entirely unavailable in assembly language and do differ even between versions of Linux. This makes it impossible to portably call libc functions other than those that take only simple integers or pointers as parameters.
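To see why struct layout is a problem, it can help to inspect the numbers that a C compiler works out silently and that an assembly programmer would have to hard-code. The sketch below uses Python's ctypes module for this. The `Record` struct is hypothetical and not taken from any real libc header, and `c_long` is only a stand-in for a platform-dependent typedef such as off_t.

```python
import ctypes


class Record(ctypes.Structure):
    # Hypothetical struct for illustration only.
    _fields_ = [
        ("flag", ctypes.c_char),
        ("offset", ctypes.c_long),  # stand-in for a typedef such as off_t
        ("count", ctypes.c_int),
    ]


def describe(struct_type):
    """Return the total size and a {field: (offset, size)} map for a struct."""
    layout = {}
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        layout[name] = (field.offset, field.size)
    return ctypes.sizeof(struct_type), layout


total, layout = describe(Record)
print("sizeof(Record) =", total)
for name, (offset, size) in layout.items():
    print(f"{name}: offset {offset}, size {size}")
```

The printed offsets depend on the platform's ABI. Because `c_long` has different sizes on different platforms, hand-written assembly that hard-codes one set of offsets reads the wrong bytes on a system that uses another.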

Many people use an emulator to debug assembly-language programs.
