Incorrect IRQ timings in Oric emulators

Dbug
Site Admin
Posts: 4437
Joined: Fri Jan 06, 2006 10:00 pm
Location: Oslo, Norway


Post by Dbug »

Looks like Fabrice and Xeron are not totally done with their Oric work :)

After the Solskogen party I decided to see whether I could optimize the replay routine (both in terms of CPU time and quality), so I started coding a small profiler. Then I compared the sound quality between Euphoric and Oricutron... and noticed that the profiler did not return the same values on the two.

To be sure, I ran it on two of my real Orics and got a third set of profiler values (both Orics agree on the same values). Finally I downloaded MESS to check, and MESS gives yet another value (but the closest to a real Oric).

So here are the results (the first number is the routine called from the ROM, the second is the optimized IRQ running from overlay memory):

Oricutron: 289/279
Euphoric: 2C9/2BD
MESS: 2A3/xxx (MESS does not support floppy drive emulation)
Pravetz: 2A7/29B
Atmos: 2A7/xxx (my floppy drive does not work on that one)

So as you can see, the CPU time for the real Orics is somewhere between what Oricutron and Euphoric take. I suspect that the speed difference comes from a difference in the number of clock cycles counted for the IRQ sequence.

What would be cool is if everybody who has a working Oric or any emulator could run the small test program, with or without a disk drive (or Cumulus) connected, and report the numbers that appear at the top left.
When the top right message shows a red ROM, it means it uses the slow ROM-based IRQ; when you get a green OVR, it means it uses the faster overlay-RAM-based IRQ system. I'm also interested in reports about crashes or lack of sound.

Here is the link: http://www.defence-force.org/download/o ... chmark.tap

Thanks :)
Xeron
Emulation expert
Posts: 426
Joined: Sat Mar 07, 2009 5:18 pm

Post by Xeron »

On Oricutron, CPU/VIA interactions are (or should be) cycle exact, so fixing this should just be a matter of finding inaccuracies in the instruction cycle counts of the CPU and fixing them... which doesn't sound like fun :)
Chema
Game master
Posts: 3013
Joined: Tue Jan 17, 2006 10:55 am
Location: Gijón, SPAIN

Post by Chema »

Hi Dbug,

On my Atmos (no disk drive) it oscillates between 2A6 and 2A7 (I'd say it spends a bit more time on 2A6).

It is, btw, a very nice experiment. Sounds a bit dirty (with a lot of noise) but still great!

Post by Dbug »

Chema: Yes, I also have the 2A6/2A7 oscillation on my machines, so your experience matches mine.

About the sound quality, well, it's still 4-bit samples playing at 4 kHz :) I have some ideas on how to improve the quality (basically error propagation instead of just truncating each individual sample value).
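
To make the "error propagation" idea concrete, here is a minimal Python sketch, not the actual replay code. It assumes 8-bit source samples and a 4-bit output, and the function names are made up for the illustration. The truncating version drops the low nibble; the other one rounds to the nearest of 16 levels (level * 17 covers 0..255) and carries the rounding error into the next sample.

```python
def quantize_truncate(samples):
    """Map 8-bit samples (0..255) to 4-bit levels by dropping the low nibble."""
    return [s >> 4 for s in samples]


def quantize_error_feedback(samples):
    """Map 8-bit samples to 4-bit levels, carrying the rounding error forward."""
    levels = []
    error = 0  # what the previous outputs still "owe" the signal, in 8-bit units
    for s in samples:
        wanted = s + error
        level = (wanted + 8) // 17       # nearest level; level * 17 spans 0..255
        level = max(0, min(15, level))   # safety clamp
        error = wanted - level * 17      # stays within -8..8
        levels.append(level)
    return levels


quiet = [8] * 16                       # a very quiet, constant input
print(quantize_truncate(quiet))        # every sample collapses to level 0
print(quantize_error_feedback(quiet))  # alternates between 0 and 1
```

With truncation the quiet input disappears completely, while the error-feedback version keeps its average level at the cost of some added high-frequency noise.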

Post by Dbug »

[This post was edited to show the results with version 1.1 of the test]

Some updates: I actually found another issue. I modified my program to use VIA timer 2 to accurately measure the time taken by some arbitrary routine while my sample player is running in the background (using an IRQ triggered by VIA timer 1).

Based on that I ran five tests:
- Compute the time taken to just set the value of timer 2 and then read it back immediately, to have a good base value
- Measure the time taken at startup on the Oric with just the normal 100 Hz system interrupt running (the one that reads the keyboard, blinks the cursor, handles internal timers, etc...)
- Measure the time when my sample player is not initialized, but with interrupts disabled on the 6502 (SEI instruction before the test).
- Measure the time when my sample player is initialized (with timer 1 set up), but with interrupts disabled on the 6502 (SEI instruction before the test).
- Measure the time when everything is initialized and interrupts are allowed on the 6502 (CLI instruction before the test).

I have to say that the results are entertaining :)

For the reference here is a screenshot taken on my TV connected to a real Oric Atmos 48k:
Image

And here are the results we got from three different emulators:
- Euphoric 1007:
Image

- Oricutron 0.7:
Image

- Mess 0143u2b:
Image

As you can see the numbers are quite different, but from these we can draw some conclusions:
- The base test returns $11 on the Atmos; both Euphoric and Oricutron are close with $12, and Mess is a bit faster with $f. So here we are between +1 and -2 cycles compared to the real machine.
- If you take into account this small difference in the timer value reading, Mess seems to be remarkably similar to the real machine on the normal Oric setup: same $13c3 difference, and $5047+2=$5049, $640a+2=$640c, so it's consistent with what my Atmos shows. Euphoric has a worse MAX value, but the diff is still relatively close. Oricutron is the furthest off with $14C3 instead of $13C3, that's 256 more clock cycles.
- Euphoric is the only emulator which has different clock cycle values for the tests run with SEI; this means that, for some strange reason, even with the 6502 not accepting any IRQ it still loses some CPU time when you play with the VIA. Definitely a real bug here.
- When IRQs are disabled Oricutron is almost like the reference Atmos, just off by plus one (465D instead of 465C), while Mess is off by minus two (465A instead of 465C). That's consistent with the +1/-2 timer 2 reading difference we had at the start.
- The last test is the most interesting. Basically, instead of the single 100 Hz interrupt (so that's two interrupts per frame) we have a 4 kHz interrupt (so that's 80 interrupts per frame). Mess is actually still consistent there, showing the same values as the Atmos, minus two cycles. Euphoric is about 662 clock cycles higher ($6a9c-$6806) than the reference Atmos, which divided by 80 amounts to about 8 additional clock cycles per IRQ. Oricutron on the other hand is way too fast, by 901 clock cycles ($6806-$6481), so that would be about 11 missing clock cycles per IRQ.
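
The per-IRQ estimate in the last point is plain hex arithmetic, sketched below in Python for anyone who wants to redo it with their own readings. The function names are made up for the illustration. The 50 Hz frame rate is the one implied by "100 Hz = two interrupts per frame", and $6806 is taken as the Atmos value since that is the figure both quoted differences (662 and 901) are consistent with.

```python
FRAME_HZ = 50  # implied by "100 Hz interrupt = two interrupts per frame"


def irqs_per_frame(irq_hz):
    """Number of interrupts that fire during one video frame."""
    return irq_hz // FRAME_HZ


def irq_error(emulator_value, reference_value, irq_hz=4000):
    """Return (total cycle difference, average difference per interrupt)."""
    total = emulator_value - reference_value
    return total, total / irqs_per_frame(irq_hz)


ATMOS = 0x6806  # reference value from the real machine
for name, value in (("Euphoric", 0x6A9C), ("Oricutron", 0x6481)):
    total, per_irq = irq_error(value, ATMOS)
    print(f"{name}: {total:+d} cycles in total, {per_irq:+.1f} per IRQ")
```

A positive result means the emulator spends more cycles than the Atmos; a negative one means it spends fewer.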

Stay tuned :)

PS: You can find the source code on the SVN server: http://miniserve.defence-force.org/svn/ ... Benchmark/
The program itself is on the FTP: http://www.defence-force.org/ftp/forum/ ... Bench1.tap
Last edited by Dbug on Sat Aug 06, 2011 1:46 pm, edited 3 times in total.

Post by Chema »

From the above post, I understood that (as IRQs are disabled for some tests) there is a difference of one cycle between oricutron and real Orics.

Not sure if it would be a good idea, but maybe having the actual code of the routine would at least give a hint as to which instruction could be the culprit? Since it is just one cycle, maybe it is one of those extra cycles due to page crossing?

About the bug in Euphoric: it might look uglier, but at least we know that something is eating up cycles when interrupts are disabled. I think you should email Fabrice about this. After all he modified Euphoric after the tests with 1337, so he will probably be willing to look into this.

Finally, maybe it is not an issue with the cycles taken by interrupt handling. Nevertheless, for the record, I found in a book the cycles used by the interrupt sequence (just in case it can help). It uses 7 cycles and fetches the opcode of the first instruction of the service routine in the 8th.

BTW the interrupt disable flag is automatically turned on by the micro.

Post by Dbug »

Actually I found a bug in my test: I did not reset the min/max values between the tests. I'm making a new version which also checks the CPU time used by the normal IRQ, the one you get when booting the Oric.

I will update the SVN and the numbers afterwards. The numbers changed, but they still indicate real issues in the timings :)

Post by Dbug »

I updated the post containing the screenshots with a new analysis, more detailed numbers, and possible explanations of where the clock cycles differ.

The benchmark code has been updated on the SVN depot as well.


Post by Xeron »

If anyone wants to help fix this in Oricutron, there is a function in 6502.c called m6502_set_icycles, which calculates the number of cycles the next instruction will take.

At the top of the function, it checks for breakpoints. The cycles calculation is in the "switch( nextop )" block @ line 547.

Post by Dbug »

If we consider that with IRQs disabled there is only one cycle of difference between Oricutron and the Atmos, I believe there is no problem in any of the base instruction timings. I think it has more to do with the handling of the VIA timers. Possibly something like a one-cycle difference in the way the latch/start of counting is done.

My rationale is that the TIME2 OFFSET value is $12 instead of $11, which in terms of instructions run by the processor amounts to exactly this code:

Code:

	sei
	
	; Start the timing
	ldy #0 
	jsr _ProfilerReset 
	
	ldy #0 
	jsr _ProfilerRead 
	; At this point the timing is done
	
	lda _ProfilerTimer 
	sta tmp0 
	lda _ProfilerTimer+1 
	sta tmp0+1 
	
	sec 
	lda #<( $ffff) 
	sbc tmp0 
	sta tmp0 
	lda #>( $ffff) 
	sbc tmp0+1 
	sta tmp0+1 
	
	lda tmp0 
	sta _ProfilerTimerOffset 
	lda tmp0+1 
	sta _ProfilerTimerOffset+1 
	;
	; At this point _ProfilerTimerOffset contains
	; $11 on an Oric Atmos
	; $12 on Oricutron and Euphoric
	; $0f on MESS
	
_ProfilerReset
	lda #$ff
	sta VIA_T2C_L
	sta VIA_T2C_H  
	rts
	
_ProfilerRead
	lda VIA_T2C_L
	ldx VIA_T2C_H
	sta _ProfilerTimer+0
	stx _ProfilerTimer+1
	rts

_ProfilerTimerMin
	.word $ffff 
	
_ProfilerTimerOffset
	.word $0 

As you can see there are no exotic or rarely used addressing modes, just immediate loads, jsr, sta: a grand total of 13 instructions.
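
The final subtraction in the listing ($ffff minus the timer value, done a byte at a time with sec/sbc) can be mirrored in a few lines of Python to see how a timer reading turns into the reported offset. This is only a sketch: the helper names are made up, and the timer readings $ffee, $ffed and $fff0 are simply the values implied by the three offsets quoted in the listing's comments.

```python
def sbc(a, m, carry):
    """6502 SBC in binary mode: A - M - (1 - C). Returns (result, carry_out)."""
    diff = a - m - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


def elapsed_from_timer(t2_low, t2_high):
    """Cycles counted since timer 2 was loaded with $ffff."""
    lo, carry = sbc(0xFF, t2_low, 1)   # sec / lda #<$ffff / sbc tmp0
    hi, _ = sbc(0xFF, t2_high, carry)  # lda #>$ffff / sbc tmp0+1
    return (hi << 8) | lo


for machine, (low, high) in {
    "Oric Atmos": (0xEE, 0xFF),
    "Oricutron / Euphoric": (0xED, 0xFF),
    "MESS": (0xF0, 0xFF),
}.items():
    print(f"{machine}: ${elapsed_from_timer(low, high):02x}")
```

Since the minuend is $ff in both bytes, no borrow can ever occur, so the result is just the one's complement of the timer value.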

Post by Xeron »

OK, I've investigated the timer 2 test at the start. After writing $FFFF to the timer 2 latch, the following instructions are executed:

Code:

inst.    cyc  t2 before  t2 after  t2 diff  total
-----------------------------------------------------
rts       6   $ffff      $fff9       6        6 ($06)
ldy imm   2   $fff9      $fff7       2        8 ($08)
jsr abs   6   $fff7      $fff1       6       14 ($0e)
lda abs   4   $fff1      $ffed       4       18 ($12)

As you can see, this sequence is 18 cycles. I'm guessing the real Oric result means that either the VIA takes an extra cycle when you set the counter via the latch, or the lda instruction does the actual read 1 cycle before the instruction completes. Both make sense, so I guess I need to find out which one is correct.

Post by Xeron »

I investigated further last night, and discussed it with Dbug on IRC. Just thought I'd update this thread.

After looking at the timing diagrams in the 6522 datasheet, it looked like the timer reload is OK, so it looks like the load instruction retrieves the value on the cycle before the end of the instruction. I have added support for this, and now the timer 2 value and SEI tests match the reference from real hardware.

Now we just need to figure out why the tests with interrupts enabled are so far out...
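
Here is a small Python model of that one-cycle effect, using the four instructions and cycle counts from the earlier table. It is a sketch, not Oricutron's code: the names are invented, and the only assumption is that timer 2 counts down once per CPU cycle starting from $ffff.

```python
SEQUENCE = [
    # (mnemonic, cycles, reads_timer)
    ("rts", 6, False),
    ("ldy imm", 2, False),
    ("jsr abs", 6, False),
    ("lda abs", 4, True),
]


def measured_offset(sequence, read_on_last_cycle):
    """Return $ffff minus the timer value seen by the reading instruction."""
    timer = 0xFFFF
    sampled = None
    for _name, cycles, reads_timer in sequence:
        for cycle in range(1, cycles + 1):
            if reads_timer and read_on_last_cycle and cycle == cycles:
                sampled = timer          # bus read happens during the final cycle
            timer = (timer - 1) & 0xFFFF
        if reads_timer and not read_on_last_cycle:
            sampled = timer              # read only after the whole instruction ran
    return 0xFFFF - sampled


print(hex(measured_offset(SEQUENCE, read_on_last_cycle=False)))  # 0x12
print(hex(measured_offset(SEQUENCE, read_on_last_cycle=True)))   # 0x11
```

Sampling the timer after all four cycles of the lda gives $12, while sampling it during the last cycle gives the $11 seen on the real Atmos.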

Post by Dbug »

What I can do, is make a new version of the test with a high frequency IRQ that does nothing except "bit VIA / rti".

The advantage is that we know exactly how long the IRQ should take, so we can easily find out the exact number of cycles we are missing.
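
As a rough sketch of what "exactly how long" means here, the Python below adds up the standard NMOS 6502 cycle counts for such an empty handler. It assumes the bit uses absolute addressing (the VIA is not in zero page) and that the handler is reached directly from the IRQ vector, with no ROM trampoline in between; the names are made up for the illustration.

```python
IRQ_ENTRY_CYCLES = 7  # interrupt sequence: push PC and status, fetch the vector
BIT_ABS_CYCLES = 4    # bit VIA register, absolute addressing
RTI_CYCLES = 6        # pull status and PC


def empty_handler_cycles():
    """Cycles consumed by one 'bit VIA / rti' interrupt."""
    return IRQ_ENTRY_CYCLES + BIT_ABS_CYCLES + RTI_CYCLES


def expected_overhead(irq_count):
    """Cycles that irq_count empty interrupts should add to a measurement."""
    return irq_count * empty_handler_cycles()


print(empty_handler_cycles())  # 17
print(expected_overhead(80))   # 1360, i.e. one frame at 4 kHz
```

Any difference between this figure and what an emulator reports can then be attributed to the interrupt handling itself rather than to the sample player code.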

Post by Xeron »

Well, I'm a bit closer to the real hardware result now :)

Image

The SVN commit message explains the change well enough:
"Fixed an issue with the 6502 irq emulation where the cycles for the next instruction were calculated, the machine was emulated for that many cycles, then the 6502 instruction was executed.

The problem with this is that if an interrupt was raised during the cycles for that instruction, the actual instruction executed would be the first one of the irq. In real hw, that's like an irq travelling back in time and causing the CPU to execute a different instruction. It also has the side effect of the wrong number of cycles being executed for the rest of the machine for that instruction.

Now, the cycles are calculated for the next cpu instruction, and the machine is emulated for that many cycles as before, but now, when the CPU instruction is executed, it always executes the instruction used to calculate the cycles. This then behaves like the real hw; an irq only happens once the current instruction has finished."
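
The ordering problem described in that commit message can be illustrated with a toy Python model. This is not Oricutron's code: the instruction lists, the irq_at parameter and the way cycles are accounted in the "old" path are all assumptions made for the sketch. It tracks how far the rest of the machine was advanced versus what the executed instructions really cost.

```python
IRQ_ENTRY_CYCLES = 7  # standard 6502 interrupt sequence


def simulate(main, handler, irq_at, fixed):
    """Run a toy CPU and return (machine_cycles, cpu_cycles, trace)."""
    machine = cpu = 0
    trace = []
    queue = list(main)
    irq_raised = irq_taken = False
    while queue:
        if fixed and irq_raised and not irq_taken:
            # Instruction boundary: enter the handler before choosing an opcode.
            irq_taken = True
            queue = list(handler) + queue
            machine += IRQ_ENTRY_CYCLES
            cpu += IRQ_ENTRY_CYCLES
            trace.append("irq")
            continue
        name, cycles = queue[0]  # 1. work out the next instruction's cycles
        machine += cycles        # 2. emulate the rest of the machine
        if machine >= irq_at:
            irq_raised = True    #    the timer fired during those cycles
        if not fixed and irq_raised and not irq_taken:
            # Old order: the IRQ hijacks the instruction that was just timed.
            irq_taken = True
            queue = list(handler) + queue
            name, cycles = queue[0]
            cpu += IRQ_ENTRY_CYCLES
            trace.append("irq")
        queue.pop(0)             # 3. execute
        cpu += cycles
        trace.append(name)
    return machine, cpu, trace


MAIN = [("lda", 4), ("sta", 4), ("inx", 2)]
HANDLER = [("bit", 4), ("rti", 6)]
print(simulate(MAIN, HANDLER, irq_at=6, fixed=False))
print(simulate(MAIN, HANDLER, irq_at=6, fixed=True))
```

In the old ordering the handler starts before the sta that was in progress, and the machine ends up advanced by fewer cycles than the CPU actually used; in the fixed ordering the sta completes first and both counters agree.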