Rambo/Tivoli Pirat

Dbug
Site Admin
Posts: 4821
Joined: Fri Jan 06, 2006 10:00 pm
Location: Oslo, Norway

Post by Dbug »

I've been playing a bit with the disassembly made by Jede, and I had a go at restructuring the code a little.
Basically, I made 64 copies of the volume conversion table (so the "and #3" is no longer necessary), moved the code that sets the PSG to accept data on register 8 out of the main loop, and swapped the bits around so the two-bit values are cheaper to extract.
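
To make the "64 copies" idea concrete, here is a minimal host-side sketch in Python that builds such a table. The four volume levels ($00, $07, $0B, $0E) are taken from the tables posted further down this thread; the function names and the `.byt` output layout are assumptions made for illustration, not part of the actual player or of any OSDK tool.

```python
# Hypothetical helper: builds a 256-byte volume table made of 64 back-to-back
# copies of the 4-entry conversion table, so the packed byte can be used
# directly as an index without masking it with "and #3" first.
# The four levels come from the tables shown later in this thread.
VOLUME_LEVELS = [0x00, 0x07, 0x0B, 0x0E]

def build_replicated_table(levels=VOLUME_LEVELS):
    """Return 256 entries where entry i equals levels[i & 3]."""
    return [levels[i & 3] for i in range(256)]

def to_byt_lines(table, per_line=4):
    """Format a table as '.byt' lines in the style used in this thread."""
    lines = []
    for start in range(0, len(table), per_line):
        chunk = table[start:start + per_line]
        lines.append("  .byt " + ",".join("$%02X" % value for value in chunk))
    return lines

table = build_replicated_table()
byt_lines = to_byt_lines(table)
```

Since the index register holds the whole packed byte, whatever the upper six bits contain, the lookup lands on the same level that `byte & 3` would have selected, which is why the mask can be dropped.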

The inner loop now looks like this:

Code:

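PlaySample:
        stx     RES                         ; X/Y = low/high byte of the sample start address
        sty     RES+1
        sta     _auto_end_sample_check+1    ; A = high byte of the end address (patches the cmp below)
loop_read_page:
        ldy     #0
        lda     (RES),y                     ; Fetch the next packed byte (four 2-bit samples)
        sta     RESB

        ldy     #4                          ; Four samples per byte
loop_decode_byte:
        lda     RESB
        tax                                 ; Whole byte used as index, the low two bits pick the volume
        lsr
        lsr
        sta     RESB                        ; Next sample is now in the low two bits

        lda     TableVolumeConversionData,x
        sta     $030F                       ; VIA port A: volume value for the PSG
        lda     #$FD
        sta     $030C                       ; VIA PCR: PSG writes the data
        lda     #$DD
        sta     $030C                       ; VIA PCR: PSG back to inactive

        nop                                 ; 16 nops of padding to keep the original playback speed
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop


        dey
        bne     loop_decode_byte

        inc     RES                         ; Move on to the next byte of sample data
        bne     skip_high_byte
        inc     RES+1
skip_high_byte:
        lda     RES+1

_auto_end_sample_check
        cmp     #$12                        ; Self modified based on which samples play
        bne     loop_read_page
        rts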
The "nops" are only there to keep the music playing at the same speed (give or take), so this new code sounds the same as before, but thanks to the changes it has 16 nops (32 cycles) to spare on every pass through loop_decode_byte, i.e. for each two-bit sample being replayed.
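
For a rough feel of the timing, here is a small Python tally of the cycles spent in one pass through `loop_decode_byte`. It is only a sketch under assumptions: the cycle counts are the standard 6502 ones, RESB is assumed to live in zero page, the table lookup and the branch are assumed not to cross a page boundary, the CPU is taken to run at about 1 MHz, and the per-byte overhead outside the inner loop is ignored.

```python
# Rough cycle budget for one pass through loop_decode_byte.
# Assumptions: RESB in zero page, no page crossing on the indexed load or
# on the branch, ~1 MHz CPU clock, per-byte overhead ignored.
INNER_LOOP = [
    ("lda RESB", 3),
    ("tax", 2),
    ("lsr", 2),
    ("lsr", 2),
    ("sta RESB", 3),
    ("lda TableVolumeConversionData,x", 4),
    ("sta $030F", 4),
    ("lda #$FD", 2),
    ("sta $030C", 4),
    ("lda #$DD", 2),
    ("sta $030C", 4),
    ("dey", 2),
    ("bne loop_decode_byte (taken)", 3),
]
NOP_COUNT = 16
NOP_CYCLES = 2
CPU_HZ = 1_000_000

work_cycles = sum(cycles for _, cycles in INNER_LOOP)
padding_cycles = NOP_COUNT * NOP_CYCLES
total_cycles = work_cycles + padding_cycles
approx_rate_hz = CPU_HZ / total_cycles
```

Under those assumptions the padding accounts for a bit under half of each pass, which is the time an interrupt-driven version could hand back to the rest of the program.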

What that means is that it's probably viable to put the routine in an interrupt, so other things can be done at the same time.

PS: There are other optimizations possible, I just wanted to check the main "bang for the buck" ones :)
jede
Flying Officer
Posts: 191
Joined: Tue Mar 14, 2006 11:53 am
Location: France

Re: Rambo/Tivoli Pirat

Post by jede »

When I did the disassembly, I was sure that this code could be optimized.

Anyway, if you make an OSDK tool to generate music like this one, or a faster player, let me know :)
Dbug
Site Admin
Posts: 4821
Joined: Fri Jan 06, 2006 10:00 pm
Location: Oslo, Norway

Re: Rambo/Tivoli Pirat

Post by Dbug »

It's only been 4 years since the last post, but I think I managed to optimize the player by removing the shifting part.

Basically, instead of doing this four-iteration loop:

Code:

        ldy #4
loop_decode_byte: 
        lda RESB
        tax
        lsr
        lsr
        sta RESB

        lda     TableVolumeConversionData,x   

        [Write to YM + delay]

        dey
        bne     loop_decode_byte
we just need four tables (of 256 bytes each) which return the proper value for the two bits we are interested in (whatever the six other bits are), which results in this code:

Code:

        lda     TableVolumeConversionData1,x
        [Write to YM + delay]
        lda     TableVolumeConversionData2,x
        [Write to YM + delay]
        lda     TableVolumeConversionData3,x
        [Write to YM + delay]
        lda     TableVolumeConversionData4,x
        [Write to YM + delay]
The tables look like this:

Code:

TableVolumeConversionData1
  .byt $00,$00,$00,$00
  .byt $00,$00,$00,$00
  (...)
  .byt $07,$07,$07,$07
  .byt $07,$07,$07,$07
  (...)
  .byt $0B,$0B,$0B,$0B
  .byt $0B,$0B,$0B,$0B
  (...)
  .byt $0E,$0E,$0E,$0E
  .byt $0E,$0E,$0E,$0E

TableVolumeConversionData2
  .byt $00,$00,$00,$00
  .byt $00,$00,$00,$00
  .byt $00,$00,$00,$00
  .byt $00,$00,$00,$00
  .byt $07,$07,$07,$07
  .byt $07,$07,$07,$07
  (...)

TableVolumeConversionData3
  .byt $00,$00,$00,$00
  .byt $07,$07,$07,$07
  .byt $0B,$0B,$0B,$0B
  .byt $0E,$0E,$0E,$0E
  (...)

TableVolumeConversionData4
  .byt $00,$07,$0B,$0E
  .byt $00,$07,$0B,$0E
  (...)
That makes the code much more compact, and it uses fewer registers, so there is less stuff to save and restore when we want to move it into an IRQ.

Now, performance-wise, I have no idea how that compares to the usual 4-bit player I'm using. Maybe I should try converting one of my old demos, like Nyan Cat, with the two methods.
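
The four tables are tedious to type by hand, so here is a minimal Python sketch that generates them. The bit positions (table 1 decodes bits 7-6, table 4 decodes bits 1-0) and the four volume levels are read off the listings above; the function and variable names are hypothetical, and this is not an existing OSDK tool.

```python
# Hypothetical generator for the four 256-byte lookup tables.
# Table k returns the volume for one 2-bit field of the index, whatever
# the other six bits are. Field positions follow the listings above.
VOLUME_LEVELS = [0x00, 0x07, 0x0B, 0x0E]
SHIFTS = [6, 4, 2, 0]  # table 1 -> bits 7-6, ..., table 4 -> bits 1-0

def build_tables(levels=VOLUME_LEVELS, shifts=SHIFTS):
    """Return one 256-entry table per shift amount."""
    return [[levels[(index >> shift) & 3] for index in range(256)]
            for shift in shifts]

def to_byt_source(name, table, per_line=4):
    """Render one table as a label followed by '.byt' lines."""
    lines = [name]
    for start in range(0, len(table), per_line):
        chunk = table[start:start + per_line]
        lines.append("  .byt " + ",".join("$%02X" % value for value in chunk))
    return "\n".join(lines)

tables = build_tables()
source = "\n\n".join(
    to_byt_source("TableVolumeConversionData%d" % (number + 1), table)
    for number, table in enumerate(tables)
)
```

Writing `source` to a file gives something that can be included next to the player; the price of dropping the shifts is 1024 bytes of tables.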
jbperin
Flight Lieutenant
Posts: 502
Joined: Wed Nov 06, 2019 11:00 am
Location: Valence, France

Re: Rambo/Tivoli Pirat

Post by jbperin »

Looks good... :)

Do you plan to update your repository with this smart optimisation?
Dbug
Site Admin
Posts: 4821
Joined: Fri Jan 06, 2006 10:00 pm
Location: Oslo, Norway

Re: Rambo/Tivoli Pirat

Post by Dbug »

I only update when I have working code that is tested, commented, and that I'm happy with.

That can take years.

I *never* just commit because I changed one line of code :p

Also, I reuse threads instead of creating new ones when I'm still working on the same project ;)
Dbug
Site Admin
Posts: 4821
Joined: Fri Jan 06, 2006 10:00 pm
Location: Oslo, Norway

Re: Rambo/Tivoli Pirat

Post by Dbug »

For those interested, the original music the demo uses has been found:



It's the 1985 "Jungle" remix of the Rambo "First Blood Part II" title music.

https://www.discogs.com/master/99889-Fi ... m-Rambo-II