Basically, I made 64 copies of the volume conversion table (so the "and #3" is no longer necessary), moved the code that sets the PSG to accept data on register 8 out of the main loop, and swapped the bits around so the two-bit values are cheaper to extract.
The inner loop now looks like this:
The NOPs are only there to keep the music playing at the same speed (give or take), so this new code sounds the same but, thanks to the changes, is 16 NOPs faster on each pass through the inner loop.
PlaySample:
    stx RES
    sty RES+1
    sta _auto_end_sample_check+1

loop_read_page:
    ldy #0
    lda (RES),y
    sta RESB

    ldy #4
loop_decode_byte:
    lda RESB
    tax
    lsr
    lsr
    sta RESB

    lda TableVolumeConversionData,x
    sta $030F
    lda #$FD
    sta $030C
    lda #$DD
    sta $030C

    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop

    dey
    bne loop_decode_byte

    inc RES
    bne skip_high_byte
    inc RES+1
skip_high_byte:
    lda RES+1
_auto_end_sample_check
    cmp #$12                ; Self modified based on which samples play
    bne loop_read_page
    rts
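For anyone who wants to check the table trick without an assembler, here is a small Python model of what the inner loop does. It is only a sketch: `BASE_VOLUMES` holds made-up volume values (the real table contents are not shown here), and the packing order in `pack` is my assumption about how the bits were swapped, namely that the first sample played sits in the two lowest bits.

```python
# Hypothetical 2-bit -> PSG volume mapping; the real values are not given in the post.
BASE_VOLUMES = [0, 5, 10, 15]

# 64 copies of the 4-entry table: any byte value 0..255 can index it directly,
# and only the two lowest bits select the volume (this replaces the "and #3").
TABLE = BASE_VOLUMES * 64


def decode_byte(packed):
    """Mirror loop_decode_byte: look up with the whole byte, then shift right twice."""
    volumes = []
    for _ in range(4):
        volumes.append(TABLE[packed])  # lda TableVolumeConversionData,x
        packed >>= 2                   # lsr / lsr / sta RESB
    return volumes


def pack(samples):
    """Assumed packing: four 2-bit samples, first one played in the lowest bits."""
    assert len(samples) == 4 and all(0 <= s <= 3 for s in samples)
    packed = 0
    for position, sample in enumerate(samples):
        packed |= sample << (2 * position)
    return packed


if __name__ == "__main__":
    print(decode_byte(pack([3, 0, 1, 2])))
```

The point of the 256-entry table is that the upper six bits of the index can be garbage: the lookup gives the same result as masking first, at the cost of 252 extra bytes of table.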
What the speed-up means is that it's probably viable to move the routine into an interrupt, so other things can be done at the same time.
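To put a rough number on that headroom, the inner loop can be cycle-counted with the standard 6502 timings. This is a back-of-the-envelope sketch with several assumptions of mine: `RESB` lives in zero page, the table is page-aligned so the indexed load never pays a page-crossing penalty, the branch stays within one page, and the CPU runs at 1 MHz. It also ignores the small per-byte overhead of the outer loop.

```python
# Standard 6502 cycle counts for the addressing modes used in the inner loop.
# Assumptions: RESB in zero page, no page-crossing penalties anywhere.
CYCLES = {
    "lda zp": 3, "tax": 2, "lsr": 2, "sta zp": 3,
    "lda abs,x": 4, "sta abs": 4, "lda #": 2,
    "nop": 2, "dey": 2, "bne taken": 3,
}

LOOP_BODY = (
    ["lda zp", "tax", "lsr", "lsr", "sta zp"]    # fetch byte, keep index, shift
    + ["lda abs,x", "sta abs"]                    # table lookup, write to $030F
    + ["lda #", "sta abs", "lda #", "sta abs"]    # write $FD then $DD to $030C
    + ["nop"] * 16                                # timing padding
    + ["dey", "bne taken"]                        # loop control
)

CPU_HZ = 1_000_000  # assumption: 1 MHz CPU clock

total_cycles = sum(CYCLES[op] for op in LOOP_BODY)
padding_cycles = 16 * CYCLES["nop"]
work_cycles = total_cycles - padding_cycles

if __name__ == "__main__":
    print("cycles per pass:", total_cycles)
    print("of which padding:", padding_cycles)
    print("approx. values per second:", round(CPU_HZ / total_cycles))
    print("fraction of time spent in NOPs:", round(padding_cycles / total_cycles, 2))
```

Under those assumptions a bit under half of each pass is padding, which is the time an interrupt-driven version could hand back to the main program (minus the interrupt entry and exit overhead).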
PS: There are other optimizations possible; I just wanted to check the main "bang for the buck" ones.