Programming tricks: Know your sections

Post by **Dbug** » Sun Nov 28, 2010 1:37 pm

Recently I had to help some people transfer some software from tape to disk. The difficulty was that the program used so much memory that it was virtually impossible to load it without destroying the operating system.

Since the program used the whole area from $400 to $bfdf, loading it overwrote page 4, which Sedoric uses, meaning that you could not save it to disk afterwards.

Fortunately, since the source code was available, I was able to find a way to solve the problem. The secret? Use the BSS section!

I guess you already know about the .text and .zero directives when using XA.

.text defines the area where the code goes. So typically in this program the .text section was defined to start at $400.

.zero defines the location of variables in zero page. Everything declared after the .zero is going to be allocated from $00 to $ff.

There are two other directives that can be used, namely .data and .bss, and most people don't use them because they don't know what they can be used for.

Truth be told, on the Oric you can do without .data. It's used to separate the actual code from the data. It does matter on machines where the code can be relocated to load at various locations, but on the Oric we don't really need that one.

Now, what about .bss?

Well, BSS historically stands for "Block Started by Symbol". This section is for uninitialized data: things like temporary buffers, unpacking areas, saved game state, things that would just contain zeroes at the start of the program.

By definition you can't have instructions, or data defined by .byt or .word, in the BSS section. All you can use are things like *=value, .dsb, etc...

The trick is that the assembler groups together all the things by sections (all the .zero together, all the .text together, all the .data together, all the .bss sections together) and only saves in the final executable the content of .data and .text.

The content of .zero and .bss is not saved. All it does is to compute the addresses of the various elements defined, and make sure the code works fine when you try to use them.

So how to use that? Guess an example will help.

Let's say you have this program:

Code: Select all

 .text
*=$500   

Start
 ; Save the screen content to the buffer 
 ldx #0
loop
 lda $bb80+256*0,x
 sta Buffer+256*0,x
 lda $bb80+256*1,x
 sta Buffer+256*1,x
 lda $bb80+256*2,x
 sta Buffer+256*2,x
 lda $bb80+256*3,x
 sta Buffer+256*3,x
 dex
 bne loop
 rts

Buffer
 .dsb 1024  ; 4 pages of 256 bytes, the amount the loop above copies

If you assemble that, the resulting TAP file will be more than 1024 bytes. It will contain the code with the lda, sta and loop code, but it will also contain the 1024-byte buffer full of zeroes.

Do you really need to save 1024 bytes full of zeroes in the tape file?
Not really.

So instead you can rewrite this program like this:

Code: Select all

 .text
*=$500   

Start
 ; Save the screen content to the buffer 
 ldx #0
loop
 lda $bb80+256*0,x
 sta Buffer+256*0,x
 lda $bb80+256*1,x
 sta Buffer+256*1,x
 lda $bb80+256*2,x
 sta Buffer+256*2,x
 lda $bb80+256*3,x
 sta Buffer+256*3,x
 dex
 bne loop
 rts

 .bss
*=$1000
Buffer
 .dsb 1024  ; 4 pages of 256 bytes, the amount the loop above copies

Now when you save it, only the few bytes used by the loop will be saved; the buffer itself is just defined as a label pointing to $1000.

In this particular case the program will work because it does not assume that the buffer contains zeroes. If your program really needs the buffer to be full of zeroes, then you have to add some manual clean-up code for the BSS data.

On machines like the Atari ST, the program header contains information about the sections. This is used to "relocate" the program to whatever address is available (on the Atari ST you can have more than one program in memory at any given time, so you can't use absolute addresses). The system also knows where the BSS section is and how large it is, so it clears it for you.

On the Oric we do not have such a system, so you have to clear the BSS yourself, else it will contain whatever was there before loading, which most probably will be the value $55 all over the place (the famous UUUUUUUUU pattern you can see when things go wrong).

To solve this problem, I generally have in my code a StartBSS label, an EndBss label, and a "clear BSS" routine that makes sure that everything between the two labels is forced to zero.
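A minimal sketch of what such a routine could look like in XA syntax. The names ClearBss, StartBss, EndBss, bss_ptr and WorkBuffer are picked for this example only, and it assumes the .zero section origin has already been set up and has two free bytes. It walks a 16-bit pointer from StartBss up to (but not including) EndBss and stores a zero at each address:

Code: Select all

 .zero
bss_ptr            ; hypothetical 16-bit pointer in zero page
 .dsb 2

 .text
ClearBss
 lda #<StartBss    ; low byte of the start address
 sta bss_ptr
 lda #>StartBss    ; high byte of the start address
 sta bss_ptr+1
 ldy #0            ; Y stays at 0, only the pointer moves
clear_loop
 lda bss_ptr       ; stop as soon as the pointer reaches EndBss
 cmp #<EndBss
 bne clear_byte
 lda bss_ptr+1
 cmp #>EndBss
 beq clear_done
clear_byte
 lda #0
 sta (bss_ptr),y   ; write one zero
 inc bss_ptr       ; 16-bit increment of the pointer
 bne clear_loop
 inc bss_ptr+1
 jmp clear_loop
clear_done
 rts

 .bss
*=$1000
StartBss
WorkBuffer         ; example buffer, any .dsb declarations go here
 .dsb 512
EndBss

This is written for clarity rather than speed. Call ClearBss once at start-up, before anything reads the buffers.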

To go back to the first paragraph, I found out that the program had about 10 kilobytes of buffers containing zeroes. So all I did was create a BSS section (set to the address $400), move the declarations for the various buffers to the BSS section, modify the program so it loads higher in memory, and add some code to clear the BSS area on start-up.

Doing that gave me two benefits:
- The program was reduced from 48k to 40k, which is a real benefit for people still wanting to load it from tape (significantly reduced loading time)
- The program can now be loaded from tape and saved to disk, because until you actually run it the operating system is not trashed (of course you have to disable the auto-run in the header first!)

Hope this was informative

Post by **Twilighte** » Mon Nov 29, 2010 3:01 pm

Could this same .bss be used for compiling zero page routines?
For example zero page routines only ever reside in zero-page once the program is loaded and the driver has shifted the code to zero-page.
This means I have a difficulty when I try to do self-modifying zero-page code.

Code: Select all

ZeroPageRoutine
source lda $A000
       sta $030F
       inc source+1
       bne skip1
       inc source+2
skip1  rts

If that code was compiled inside a .text section, it would treat source as the location in the .text rather than any zero-page place.

So by duplicating the same code within a .bss structure it should be possible to avoid this situation and get it to do the thing I want...

Code: Select all

 .text
*=$500

ZeroPageRoutine
       lda $A000
       sta $030F
       inc source+1
       bne skip1
       inc source+2
skip1  rts

 .bss
*=$00
ZeroPageRoutine
source lda $A000
       sta $030F
       inc $FF
       bne skip1
       inc $FF
skip1  rts

Is this right, or is there a better way without duplicating? :p

Post by **Dbug** » Mon Nov 29, 2010 6:34 pm

.bss cannot contain anything except "virtual zeros", so no it cannot be used for that.

That being said, this code does what you are asking for:

Code: Select all

	.text

_main
	jsr CopyZeroPageRoutine	
	jsr ZeroPageRoutine
	rts
	
StartZeroPageRoutine 
	*=$50
 
ZeroPageRoutine 
	nop
source 
	lda $A000 
	sta $030F 
	inc source+1 
	bne skip1 
	inc source+2    
skip1  
	rts
EndZeroPageRoutine

	*=StartZeroPageRoutine+EndZeroPageRoutine-ZeroPageRoutine
	
CopyZeroPageRoutine	
	ldx #0
loop	
	lda StartZeroPageRoutine,x
	sta ZeroPageRoutine,x
	inx
	cpx #EndZeroPageRoutine-ZeroPageRoutine
	bne loop
	rts

I specifically placed the copy routine at the end to show that the zero page block does not break the main assembly.

I admit it's not pretty, but it worked.

Post by **Chema** » Mon Nov 29, 2010 8:05 pm

Just read this and I am not sure if I got it correctly (I only gave it a quick read), but I guess I can explain how I use .bss to set code and data for the overlay RAM. Obviously only labels are evaluated and nothing is generated, but this way I can have code in the overlay RAM.

What I do is have a

Code: Select all

.bss
*=$c000

somewhere at the end of the files that go into normal memory. Then, via includes or in the osdk_configure, I add files with the code and data that will be placed in overlay. For instance the multiplication tables or the music in 1337, or all the sound code, among other things, in Space: 1999.

The same files are included in the project that generates the tap file which I also put on the disk, and which is afterwards loaded by the main program at $c000. In this case the main file issues the *=$c000 but not the .bss, so the code is compiled this time.

So the main lesson is that the .bss section tells the assembler to evaluate the labels but NOT to generate code.

Post by **Dbug** » Mon Nov 29, 2010 8:14 pm

I'm doing that as well

Like in BuggyBoy, where I have a file just to define the content of all my big buffers in overlay memory.

The important point is that you can have as many .text, .zero or .bss sections as you want; they will all be packed together.

You don't have to respect a particular order either.

What I do is start every one of my files with a section declaration, so the file inclusion order does not matter.
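As a rough sketch of that habit, a source file could look like the one below. The file name and all the labels are made up for illustration, and it assumes the start address of each section has already been set once elsewhere in the project with *=.

Code: Select all

; player.s (hypothetical file name)

 .zero             ; zero page variables of this file
_player_x
 .dsb 1

 .text             ; code of this file, appended to the other .text parts
_MovePlayer
 inc _player_x
 rts

 .bss              ; buffers of this file, nothing is written to the TAP
_player_history
 .dsb 256

Because each block names its own section, the assembler continues every section where the previous file left it, whatever the inclusion order.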