Implementation of the GNU Assembler
Jan 21, 2021
Sections and Relocation
- Assigning run-time addresses to sections is called relocation.
- An object file written by as has at least three sections, any of which may be empty. These are named text, data and bss sections.
- text section and data section
- hold the program
- text section is often shared among processes: it contains instructions, constants and the like.
- data section of a running program is usually alterable: for example, C variables would be stored in the data section.
- bss section
- contains zeroed bytes when your program begins running
- used to hold uninitialized variables or common storage
- was invented to eliminate those explicit zeros from object files
- absolute section
- addresses that do not change when relocating
- undefined section
- a catch-all for address references to objects not in the preceding sections.
- In the bss section you may allocate address space, but you may not dictate data to load into it before your program executes.
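To make the idea of relocation concrete, here is a minimal Python sketch that lays the text, data and bss sections out back to back and gives each one a run-time base address. The function name `assign_bases`, the start address and the section sizes are all hypothetical; a real linker such as ld also handles alignment, linker scripts and many more sections.

```python
def assign_bases(sizes, start=0x1000):
    """Give each section a run-time base address (simplified relocation)."""
    bases = {}
    address = start  # hypothetical load address, chosen only for illustration
    for name in ("text", "data", "bss"):
        bases[name] = address
        # An empty or missing section takes no room.
        address += sizes.get(name, 0)
    return bases


# Hypothetical section sizes in bytes.
bases = assign_bases({"text": 0x40, "data": 0x10, "bss": 0x20})
for name, base in bases.items():
    print(f"{name}: {base:#x}")
```

Addresses in the absolute section are left out on purpose, since they do not change when relocating.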
- Labels: a label represents the current value of the active location counter.
- Local Symbol Names: the first 1: is named L1C-A1, the 44th 3: is named L3C-A44, where C-A stands for the control-A character.
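The renaming of local symbols can be sketched in Python as follows. This is only an illustration of the naming scheme quoted above (the letter L, the digit, a control-A character, then the ordinal of that label); the helper `local_symbol_names` is hypothetical, and other versions of as may use a different prefix or separator.

```python
CTRL_A = "\x01"  # the control-A character, written C-A in the note above


def local_symbol_names(labels):
    """Map each local label digit, in source order, to its internal name."""
    counts = {}
    names = []
    for digit in labels:
        # The ordinal counts how many times this digit has been defined so far.
        counts[digit] = counts.get(digit, 0) + 1
        names.append(f"L{digit}{CTRL_A}{counts[digit]}")
    return names


# Labels 1:, 3:, 1: appearing in that order in a source file.
names = local_symbol_names(["1", "3", "1"])
print([name.replace(CTRL_A, "C-A") for name in names])
```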
- The special symbol . refers to the current address that as is assembling into.
- Symbol Attributes:
- for a symbol that labels a location in the text, data, bss or absolute sections the value is the number of addresses from the start of that section to the label.
- for symbols in the text, data and bss sections, the value changes as ld changes section base addresses during linking.
- Absolute symbols' values do not change during linking.
- The value of an undefined symbol
- If it is 0 then the symbol is not defined in this assembler source file, and ld tries to determine its value from other files linked into the same program.
- A non-zero value represents a .comm common declaration. The value is how much common storage to reserve, in bytes (addresses). The symbol refers to the first address of the allocated storage.
- The type attribute of a symbol contains relocation (section) information, any flag settings indicating that a symbol is external, and (optionally) other information for linkers and debuggers.
- The exact format depends on the object-code output format in use.
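The way a symbol's value depends on its section can be illustrated with a small Python sketch. It assumes the value stored for a symbol is its offset from the start of its section, as described above; the function `final_value`, the base addresses and the symbol offsets are hypothetical.

```python
def final_value(section, value, bases):
    """Return a symbol's value once section base addresses are known."""
    if section == "absolute":
        # Absolute symbols keep their value during linking.
        return value
    # For text, data and bss the stored value is an offset from the section start.
    return bases[section] + value


# Hypothetical base addresses chosen by the linker.
section_bases = {"text": 0x1000, "data": 0x2000, "bss": 0x3000}

print(hex(final_value("text", 0x10, section_bases)))
print(hex(final_value("absolute", 0x10, section_bases)))
```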
- Empty expressions
- Integer Expressions
- arguments delimited by operators
- Arguments are symbols, numbers or subexpressions.
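As a rough sketch of how such an expression breaks down into arguments and operators, the Python evaluator below handles decimal numbers, symbols, parenthesized subexpressions and the operators +, - and *. It is a deliberately small illustration, not the assembler's own parser: the name `evaluate`, the symbol table and the limited operator set are assumptions, and unary operators and error handling are left out.

```python
import re

# A token is a decimal number, a symbol name, or one of the operators/parentheses.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_.][\w.]*|[-+*()])")


def evaluate(text, symbols):
    """Evaluate an integer expression using a table of symbol values."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def argument():
        # An argument is a number, a symbol, or a parenthesized subexpression.
        tok = take()
        if tok == "(":
            value = expression()
            take()  # consume the closing ")"
            return value
        if tok.isdigit():
            return int(tok)
        return symbols[tok]

    def product():
        value = argument()
        while peek() == "*":
            take()
            value *= argument()
        return value

    def expression():
        value = product()
        while peek() in ("+", "-"):
            op = take()
            rhs = product()
            value = value + rhs if op == "+" else value - rhs
        return value

    return expression()


print(evaluate("start + 4 * (2 + 1)", {"start": 100}))
```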
- Adjust and remove extra whitespace.
- Remove comments.
- Convert character constants into their numeric values.
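The three steps above can be sketched in Python on a single source line. This is a simplified illustration, not the assembler's actual preprocessor: the function `preprocess` is hypothetical, `#` is assumed as the line comment character, the steps run in a convenient order rather than the listed one, and string literals and escaped character constants are not handled.

```python
import re


def preprocess(line, comment_char="#"):
    """Strip comments, convert simple character constants, tidy whitespace."""
    # Replace each C-style comment with a single space.
    line = re.sub(r"/\*.*?\*/", " ", line)
    # Drop everything from the line comment character to the end of the line.
    line = line.split(comment_char, 1)[0]
    # Turn a simple character constant such as 'J into its numeric value.
    line = re.sub(r"'(.)", lambda m: str(ord(m.group(1))), line)
    # Collapse runs of whitespace into one space and trim both ends.
    return re.sub(r"\s+", " ", line).strip()


print(preprocess("  .byte   'J  /* letter */ # note"))
```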
- C-style comments: /* ... */
- the line comment character: @ (or others, depending on the target).
.byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
.ascii "Ring the bell\7" # A string constant.
.octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
.float 0f-314159265358979323846264338327\
95028841971.693993751E-40 # - pi, a flonum.
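To check that several of the spellings in the .byte line really denote the same value, here is a small Python sketch of integer constant parsing. The function `parse_integer` is hypothetical and covers only decimal, octal (leading 0), hexadecimal (0x or 0X) and plain character constants; it deliberately leaves out 092 and the escaped form '\J, whose handling is more involved.

```python
def parse_integer(token):
    """Return the numeric value of a simple integer or character constant."""
    if token.startswith("'"):
        # Character constant: the value of the character after the quote.
        return ord(token[1])
    if token[:2].lower() == "0x":
        return int(token[2:], 16)
    if token.startswith("0") and len(token) > 1:
        # A leading zero marks an octal number.
        return int(token[1:], 8)
    return int(token)


values = [parse_integer(t) for t in ["74", "0112", "0x4A", "0X4a", "'J"]]
print(values)
```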