Page 4 of 8

### Re: Novalight - very fast tape loading

Posted: Tue Feb 26, 2019 11:18 pm
I think I found a way to save room.
This will be at the price of a little additional fast-speed loading time and more bank swapping (sorry ISS, the structure will be more complex :p).
But this should remove the dictionary from page 2, thus making Novalight compatible with ALC and Loriciels programs.

So I'm changing my plans: next release will be v1.2 with these heavy structure changes, the multipart booster, and probably no more "old loader / no loader" option.

### Re: Novalight - very fast tape loading

Posted: Fri Mar 01, 2019 4:56 am
Symoon wrote:
Sat Feb 23, 2019 1:20 pm
Are there any mathematicians around? I think my dictionary in Novalight can be optimized.
I think this is less about maths and more about logic.
Symoon wrote:
Sat Feb 23, 2019 1:20 pm
What I'm doing now (the RLE compression has already been performed and the concerned bytes are ignored):
1- calculate the time taken by each uncompressed byte in the Novalight signal, which is: occurrences * encoding time. For instance, if "Z" lasts 19 samples and is present 100 times in the TAP file, its total time is 19*100 = 1900
2- the dictionary has 14 entries, so I take the 14 highest "total times" calculated. This way I'm removing a maximum of time from the uncompressed signal.
3- in the dictionary, each entry will last a different time (between 9 and 16 samples), so I'm optimizing the dictionary assignment: among the 14 bytes, the bytes with the highest occurrences will be assigned to the shortest new encoding times. This way, I'm minimizing the new time I re-insert into the signal.
Assigning variable length to samples as a function of their frequency of apparition is a well known technique and is called "entropy encoding" (cf https://en.wikipedia.org/wiki/Entropy_encoding).
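For reference, the three quoted steps could be sketched roughly like this (illustrative Python with hypothetical names; Novalight's real tool works on TAP files with the RLE bytes already filtered out, which is not shown here):

```python
# A rough sketch of the quoted heuristic (hypothetical names).
from collections import Counter

def build_dictionary(data, byte_length, slot_lengths):
    """data: iterable of byte values; byte_length[b]: encoding time of b
    in samples; slot_lengths: the 14 dictionary encoding times (9..16)."""
    counts = Counter(data)
    # Steps 1-2: rank bytes by total uncompressed time, keep the top 14.
    total_time = {b: counts[b] * byte_length[b] for b in counts}
    chosen = sorted(total_time, key=total_time.get, reverse=True)[:len(slot_lengths)]
    # Step 3: the most frequent of the chosen bytes gets the shortest slot.
    chosen.sort(key=lambda b: counts[b], reverse=True)
    return dict(zip(chosen, sorted(slot_lengths)))
```

As the rest of the thread shows, this ranking by current total time is not optimal; it is only the starting point of the discussion.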
Symoon wrote:
Sat Feb 23, 2019 1:20 pm
For instance: in a TAP file, a byte "A" that lasts 22 samples is present 4 times, and another one "B" that lasts 12 samples is present 8 times.

Code: Select all

``````A: 22 * 4 = 88
B: 12 * 8 = 96    total 88 + 96 = 184``````
If I pick "B", as it is the longest time in the signal (removing 96 samples instead of 88 if I had taken "A"), I get this in the final signal:

Code: Select all

``4*8 = 32; 32 + 88 = 120 samples``
But if I had taken "A", removing less time from the original signal, I get this in the end:

Code: Select all

``4*4 = 16; 16 + 96 = 112 samples``
Which is better!
So I save time with my method but it may not be optimal... But I really can't figure out how to calculate the best result. What is the best way to do it? Is there a magic formula, with 256 possible byte values combined into a 14-byte dictionary?
It is not a question of a formula: you cannot know in advance how much space you save unless you actually compute it.

What should be computed for each candidate is NOT how much space it *currently* takes, but how much space encoding it *will* remove.

In your case:

Code: Select all

`````` gain(A) = current_cost(A) - cost_if_encoded(A) = 88 - (4 * 4) = 88 - 16 = 72
gain(B) = current_cost(B) - cost_if_encoded(B) = 96 - (4 * 8) = 96 - 32 = 64``````
With this, it is fairly clear that A is the better candidate, since it gains you more samples than B would.
What you are currently doing is assuming that the byte that currently takes the most space will get you the better gain. But as you proved yourself, this is false: what matters is to take the byte that will shrink the most.

If you look at the maths of it:

Code: Select all

``````A: 22 samples, present 4 times => cost(A) = len(A) * freq(A)
                                           =   22   *   4
``````
The gain, as computed above is:

Code: Select all

``````gain(A) = cost(A) - cost_if_encoded(A)
``````
and obviously

Code: Select all

``````cost_if_encoded(A) = encoded_len(A) * freq(A)
``````
if we now expand all components of gain, that gives:

Code: Select all

``````gain(A) = (len(A) * freq(A)) - (encoded_len(A) * freq(A))
``````
there is freq(A) on both sides so we can factorize:

Code: Select all

``````gain(A) = freq(A) * (len(A) - encoded_len(A))
``````
So in your case:

Code: Select all

``````gain(A) = 4 * (22 - 4) = 4 * 18 = 72
gain(B) = 8 * (12 - 4) = 8 * 8 = 64``````
So, just compute for each candidate, the value

Code: Select all

``count * (nb_samples - nb_encoded_samples)``
and take the one with the biggest value.
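As a minimal sketch of that rule (illustrative names, using the A/B numbers from the quoted example):

```python
def gain(count, nb_samples, nb_encoded_samples):
    # gain = occurrences * (current length - new encoded length), in samples
    return count * (nb_samples - nb_encoded_samples)

# The example from the thread: A lasts 22 samples and occurs 4 times,
# B lasts 12 samples and occurs 8 times; both would re-encode to 4 samples.
candidates = {'A': (4, 22), 'B': (8, 12)}
best = max(candidates, key=lambda b: gain(*candidates[b], 4))
# best is 'A', since gain(A) = 72 > gain(B) = 64
```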

Hope this helps!

### Re: Novalight - very fast tape loading

Posted: Fri Mar 01, 2019 7:18 am
Thanks for this detailed reply
NekoNoNiaow wrote:
Fri Mar 01, 2019 4:56 am
So in your case:

Code: Select all

``````gain(A) = 4 * (22 - 4) = 4 * 18 = 72
gain(B) = 8 * (12 - 4) = 8 * 8 = 64``````
So, just compute for each candidate, the value

Code: Select all

``count * (nb_samples - nb_encoded_samples)``
and take the one with the biggest value.
After a few hours of headache, I had finally realised that it was logical to use the same "unit" to evaluate what I remove (frequency*time) and what I add (I was only considering frequency!).
A problem remained: calculating the real gain for a 14-byte dictionary and 256 possible byte values (and thus encoding times) depends on the chosen bytes; that choice itself depends on the new length, which in turn depends on the position I assign each chosen byte in the dictionary (each position having its own time length).
So if I'm not mistaken, this simply meant finding "the best dictionary", which meant calculating something like 256 to the power of 14 combinations (a 34-digit number!).

I simplified it by ignoring the dictionary position and using an average "new encoding time" (13 samples for a byte) instead of finding the exact one (between 9 and 16 according to its position).

So in the end in the code, I'm simply checking the program to load, and each time I find a byte, I'm updating its gain (in a table) by adding (its length in samples - 13).
I could see that just adding this "-13" led to one different choice for Zorgons Revenge and saved a little loading time.

I don't think it's worth trying to be 100% exact with the new length (I have no idea how anyway), as it would bring crazy complexity to save a tiny fraction of a second (something like 0.00x)
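In code, the "-13" shortcut described above might look roughly like this (a sketch with illustrative names, not Novalight's actual tool):

```python
# Each byte's potential gain is accumulated against an average dictionary
# length of 13 samples instead of its exact slot length (9..16).
AVG_DICT_LEN = 13

def approx_gains(data, byte_length):
    gains = {}
    for b in data:  # single pass over the program to load
        gains[b] = gains.get(b, 0) + (byte_length[b] - AVG_DICT_LEN)
    return gains    # the 14 bytes with the highest values go in the dictionary
```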

### Re: Novalight - very fast tape loading

Posted: Sat Mar 02, 2019 6:13 pm
You do not need to consider all possible combinations!

Just sorting each candidate with the method I described and putting them into the dictionary in order is sufficient to obtain an optimal choice. Any other method is mathematically guaranteed to result in a bigger size.

Think about it this way:
The best choice for the first selection in the dictionary is the byte with the biggest gain, right?
This much is obvious. So the first choice is a no brainer.
Now, what is the best choice for the second position in the dictionary?
Well, this is the exact same problem as for the first choice.
The only thing that changed is the encoded length of that byte.
So, just redo the gain computation for all remaining candidates and select the best one.
Third one?
Same problem again, with a new encoded length, so same method.

In the end, if you have n bytes to put in the dictionary, then you only need to apply that method n times, no need to test every combination since you have a method that gives you the best choice at every step.

This is exactly how entropy coding works: if you sort bytes by their frequency * size in decreasing order, and give the corresponding place in the dictionary in that same order (with encoded length increasing progressively in the dictionary), then you have the most optimal encoding possible.

I could explain this more formally with mathematical induction if needed but I am typing on my iPad so that is a bit of a pain right now.

In any case: ditch the average computation, it does not work. Just loop on your list of candidates and apply the size*frequency sort at each step, and you will have your best possible combination almost instantly (O(n^2 * log(n)) to be precise).

### Re: Novalight - very fast tape loading

Posted: Sat Mar 02, 2019 9:37 pm
NekoNoNiaow wrote:
Sat Mar 02, 2019 6:13 pm
Think about it this way:
The best choice for the first selection in the dictionary is the byte with the biggest gain, right?
This much is obvious. So the first choice is a no brainer.
You're right if you consider the gain as the time removed less the time added by the dictionary encoding. But you don't know this dictionary time yet, since you don't know the position of your byte in the dictionary... So you can't know the exact gain at this first step, and thus can't know if it's worth being selected for the dictionary.
In the end, if you have n bytes to put in the dictionary, then you only need to apply that method n times, no need to test every combination since you have a method that gives you the best choice at every step.
Yep but I just don't know how to calculate the 1st step; I may be misunderstanding you, but for me finding the real optimal choice forbids splitting the problem into different steps!
That's what I did 1st, and it showed I was wrong.
This is exactly how entropy coding works: if you sort bytes by their frequency * size in decreasing order, and give the corresponding place in the dictionary in that same order (with encoded length increasing progressively in the dictionary), then you have the most optimal encoding possible.
Potentially wrong :p

Imagine you have A, lasting 10 and repeated 1000 times. That makes 10000. You'll consider it as a better choice than B lasting 90 and repeated 110 times (which is 9900).
So you select A. Imagine it's the 14th byte you selected (14th in the final decreasing order). So you drop B, no more room in the dictionary.
And then you have the new value for A, lasting 6. So you removed 10000, and now add 6*1000 = 6000. Gain = 10000-6000 = 4000.
What if you had chosen B ? Removing 9900, and adding 6*110 (660) = a gain of 9240 instead of 4000. That's much better, because your 1st step didn't take into consideration the real potential gain, but just what you removed, which is not always optimal.
In any case: ditch the average computation, it does not work
It did give a better result with Zorgons Revenge by selecting one byte instead of another.
To be honest I thought about this problem for about 2 or 3 days and changed my mind almost every hour. I also felt silly when I typed this "-13" in the code, but it did give a shorter loading time.

### Re: Novalight - very fast tape loading

Posted: Sat Mar 02, 2019 11:39 pm
Symoon wrote:
Sat Mar 02, 2019 9:37 pm
NekoNoNiaow wrote:
Sat Mar 02, 2019 6:13 pm
Think about it this way:
The best choice for the first selection in the dictionary is the byte with the biggest gain, right?
This much is obvious. So the first choice is a no brainer.
You're right if you consider the gain as the time removed less the time added by the dictionary encoding. But you don't know this dictionary time yet, since you don't know the position of your byte in the dictionary... So you can't know the exact gain at this first step, and thus can't know if it's worth being selected for the dictionary.
I think you missed the fact that this method must be used to fill the entire dictionary.

For every empty entry in the dictionary (the first being the one with the smallest encoding, the last being the one with the biggest encoding, and the size increasing constantly), you apply the selection method:

1 - compute the gain at this stage for all byte candidates using:

Code: Select all

``gain(candidate) = freq(candidate) * (length(candidate) - dictionary_length)``
2 - put the candidate with the best gain in the first available entry (the one with the smallest encoded length) of the dictionary
3 - remove the chosen candidate from your candidate list and go back to step 1 until there are no available entries left in the dictionary

If your dictionary encodes bytes with gradually increasing sizes, then choosing the best gain at each step guarantees to choose the best gain overall. That is a mathematical consequence of that ordering and it can be formally proven by induction or even visually.
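The 1-2-3 loop above could be sketched like this (illustrative names; candidates are (freq, length-in-samples) pairs):

```python
# Greedy fill, as described: recompute every candidate's gain against the
# encoded length of the entry currently being filled.
def fill_dictionary(candidates, slot_lengths):
    remaining = dict(candidates)           # byte -> (freq, length_in_samples)
    dictionary = []
    for dict_len in sorted(slot_lengths):  # smallest encoding first
        best = max(remaining,
                   key=lambda b: remaining[b][0] * (remaining[b][1] - dict_len))
        dictionary.append(best)            # best gain takes this entry
        del remaining[best]
    return dictionary
```

On the A/B counter-example from earlier in the thread (A: 10 samples occurring 1000 times, B: 90 samples occurring 110 times, entry length 6), this picks B first, as intended.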
Symoon wrote:
Sat Mar 02, 2019 9:37 pm
In the end, if you have n bytes to put in the dictionary, then you only need to apply that method n times, no need to test every combination since you have a method that gives you the best choice at every step.
Yep but I just don't know how to calculate the 1st step; I may be misunderstanding you, but for me finding the real optimal choice forbids splitting the problem into different steps!
That's what I did 1st, and it showed I was wrong.
This is not what I suggested.
What I suggested implies to redo the best candidate computation for every empty dictionary entry.

As I explained, you were not maximizing the gain at every step, you were maximizing using the size currently taken by each candidate. But as you proved yourself, that does not work: you have to maximize by the best size gain obtainable with the dictionary entry with the smallest encoding length.
Symoon wrote:
Sat Mar 02, 2019 9:37 pm
This is exactly how entropy coding works: if you sort bytes by their frequency * size in decreasing order, and give the corresponding place in the dictionary in that same order (with encoded length increasing progressively in the dictionary), then you have the most optimal encoding possible.
Potentially wrong :p
This is in every book on compression, I doubt they are wrong.
Symoon wrote:
Sat Mar 02, 2019 9:37 pm
Imagine you have A, lasting 10 and repeated 1000 times. That makes 10000. You'll consider it as a better choice than B lasting 90 and repeated 110 times (which is 9900).
So you select A. Imagine it's the 14th byte you selected (14th in the final decreasing order). So you drop B, no more room in the dictionary.
And then you have the new value for A, lasting 6. So you removed 10000, and now add 6*1000 = 6000. Gain = 10000-6000 = 4000.
What if you had chosen B ? Removing 9900, and adding 6*110 (660) = a gain of 9240 instead of 4000. That's much better, because your 1st step didn't take into consideration the real potential gain, but just what you removed, which is not always optimal.
You just proved that you did not read my method, because it dictates precisely to choose B for these exact reasons.

1- you compute gain(A) = freq(A) * (len(A) - encoded_len(A)) = 1000 * (10 - 6) = 1000 * 4 = 4000
2- you compute gain(B) = freq(B) * (len(B) - encoded_len(B)) = 110 * (90 - 6) = 110 * 84 = 9240

These are the gains to use to select the best candidate!

That means you select B for the current entry.
Then for the remaining candidates, you redo the same computations for all of them, using the encoding length of the next dictionary entry.

Go back to the 1-2-3 steps above I listed and that is exactly what should be done.
The great part of this method is that if you do that for selecting every entry in the dictionary (in increasing encoding length), then this guarantees to choose the optimal candidates on the whole.

As I explained, entropy coding is a case where maximizing the result at every step maximizes the result globally.
Symoon wrote:
Sat Mar 02, 2019 9:37 pm
In any case: ditch the average computation, it does not work
It did give a better result with Zorgons Revenge by selecting one byte instead of another.
To be honest I thought about this problem for about 2 or 3 days and changed my mind almost every hour. I also felt silly when I typed this "-13" in the code, but it did give a shorter loading time.
This is by pure chance: your first method was suboptimal and you just happened to hit a case where this new method works better, but I guarantee you that in most cases my method will do even better. Since it is optimal for this particular encoding method, there is no way you can get any better than it.

It looks to me that you did not fully understand what I was suggesting, since the example you gave above is exactly what I suggest not to do.

### Re: Novalight - very fast tape loading

Posted: Sun Mar 03, 2019 12:12 am
NekoNoNiaow wrote:
Sat Mar 02, 2019 11:39 pm
For every empty entry in the dictionary
Aaaah, this is the key that I missed! I kept trying to find the right order before assigning anything to the dictionary... Sorry it took so long to understand, I feel stupid
It all makes sense now, thanks a lot
I'll give it a try

### Re: Novalight - very fast tape loading

Posted: Sun Mar 03, 2019 3:46 am
Symoon wrote:
Sun Mar 03, 2019 12:12 am
NekoNoNiaow wrote:
Sat Mar 02, 2019 11:39 pm
For every empty entry in the dictionary
Aaaah, this is the key that I missed! I kept trying to find the right order before assigning anything to the dictionary... Sorry it took so long to understand, I feel stupid
It all makes sense now, thanks a lot
I'll give it a try
Ahem, the fault does not lie with you; the reason is probably that I did not explain myself very well the first time.

However, I am very ashamed to admit that unfortunately I was wrong in saying this is optimal.
After writing my last post, I decided to illustrate visually how that method was optimal and in the process I actually demonstrated to myself that this was incorrect.

An important point though, is that the method, although not optimal, is probably overall close to optimal as the data below will show. A combinatorial approach would work better (although it could be prohibitively expensive) but maybe an intermediate approach between the two would work well enough. (I will explain that more in detail below.)

Now to the test data: I generated some random stats corresponding to a file to compress, selecting a few compression candidates and decided some arbitrary encoding values for the dictionary entries and graphed visually the gain obtained by each candidate for each entry.

Here is the data itself; all gain values were computed using the formula given in previous posts.
The corresponding Google spreadsheet with the data and graph is available here for those who want to verify my computations:
https://docs.google.com/spreadsheets/d/ ... sp=sharing

Since the gain numbers are difficult to interpret, I graphed the gains for each candidate for every entry of the dictionary.
Each line in the graph below corresponds to the gains of a given candidate and, logically, they all decrease as the entries in the dictionary increase in encoded size.
As you can see, the data proves me absolutely wrong: for several dictionary entries, there are cases where inverting the choices given by my method is the optimal choice.
You will notice however that in most cases there is not much difference between the choice suggested by my method and the inverted choice, so the method globally is not too bad, but it is certainly not optimal.

This said, most of the lines of the graph decrease almost in parallel, which generally means that my method kinda works well; however, it is when they cross or come very close to one another that it makes the wrong choice.

What to do then?
What I would recommend would be to mix my method with just one level of combinatorial exploration. The data seems to suggest that inversions usually concern only crossing or very close lines, and these do not seem to involve more than two points (at least in my sample), so that may be enough.
Obviously more real-life testing would be needed to compare that with other methods you experimented with so far.

Here is what the algorithm could look like:

1 - For each candidate, compute their gain for the current empty dictionary entry, write down the candidate with the biggest gain and the candidate with the second best gain (respectively called entry1_c1 and entry1_c2).
2 - For each candidate, compute their gains for the next empty dictionary entry, once again write down the gains of the best two candidates (respectively entry2_c1 and entry2_c2).
3 - Compute which of the choices is the best by computing the gain over the two entries:

Code: Select all

``````normal_gain_over_two_entries = entry1_c1 + entry2_c2
inverted_gain_over_two_entries = entry1_c2 + entry2_c1``````
4 - If normal_gain_over_two_entries > inverted_gain_over_two_entries, then simply select the best candidate in this round, otherwise, select the second best candidate instead.
5 - repeat for the next dictionary entry.
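Steps 1-5 might be sketched like this (a sketch only; `remaining` maps each candidate byte to a (freq, length) pair, and all names are illustrative):

```python
# One-level lookahead: compare keeping the best candidate for the current
# entry against swapping it with the runner-up over this entry and the next.
def pick_with_lookahead(remaining, len_now, len_next):
    g = lambda b, L: remaining[b][0] * (remaining[b][1] - L)
    rank = lambda L: sorted(remaining, key=lambda b: g(b, L), reverse=True)
    c1, c2 = rank(len_now)[:2]     # entry1_c1, entry1_c2
    d1, d2 = rank(len_next)[:2]    # entry2_c1, entry2_c2
    normal = g(c1, len_now) + g(d2, len_next)
    inverted = g(c2, len_now) + g(d1, len_next)
    return c1 if normal >= inverted else c2
```

For instance, with A occurring 100 times at 20 samples and B occurring 10 times at 120 samples, and entry lengths 9 then 16, the plain greedy choice would be B for the first entry, but the lookahead flips the choice to A because the inverted assignment gains more over the two entries combined.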

Here you go.
I must apologize again for insisting that my method was optimal when the data proves me completely wrong. I am not sure why I (wrongly) remembered that this greedy algorithm (Wikipedia) was optimal but clearly this memory of mine was completely false.

So, sorry for that again, I will be more careful from now on.

### Re: Novalight - very fast tape loading

Posted: Sun Mar 03, 2019 7:32 am
Nooooooo
Ha ha, this problem drives me crazy as I'm sure its solution is not so hard, but each time we find something, it proves to be a bit more complicated :p

Edit: I didn't see at first how the graph proved you wrong, but OK, I see now.
That's what my brain painfully tried to show me last week and ended up telling me, in a simple way, before I collapsed: all combinations should be calculated, as every gain depends on all the other choices one could make.

### Re: Novalight - very fast tape loading

Posted: Sun Mar 03, 2019 8:02 am
NekoNoNiaow wrote:
Sun Mar 03, 2019 3:46 am
I must apologize again for insisting that my method was optimal when the data proves me completely wrong.
Please don't, this problem is evil (and I also tend to read a bit fast as I work in short spare-time slots).

BTW, I also realize maybe I should have given more details of the real values with Novalight:
- each "normal" byte is encoded between 17 and 29 samples (actually 5+4*[3;6])
- each entry of the dictionary is encoded between 9 and 16 samples (actually [6;7]+[3;9])

So dictionary lengths are, from 1st to 14th position: 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16.
This way:
- we are sure any byte can be selected in the dictionary and be shorter
- maybe the two-step-depth calculation can be neglected, as the gain would really be minimal with such dictionary lengths, which makes your first solution good enough? (That's probably also why using an average seems to give good results, which it would not if there were a much bigger difference between the first and the last lengths.)
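As a quick check, the 14 slot lengths listed above follow directly from the [6;7]+[3;9] encoding:

```python
# Each dictionary byte costs a start sinusoid of 6 or 7 samples plus a
# position sinusoid of 3..9 samples, giving 2 * 7 = 14 distinct slots.
slot_lengths = sorted(start + pos for start in (6, 7) for pos in range(3, 10))
# slot_lengths == [9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16]
```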

### Re: Novalight - very fast tape loading

Posted: Mon Mar 04, 2019 12:35 am
Symoon wrote:
Sun Mar 03, 2019 8:02 am
NekoNoNiaow wrote:
Sun Mar 03, 2019 3:46 am
I must apologize again for insisting that my method was optimal when the data proves me completely wrong.
Please don't, this problem is evil (and I also tend to read a bit fast as I work in short spare-time slots).
Evil, hence absolutely fascinating.
Symoon wrote:
Sun Mar 03, 2019 8:02 am
BTW, I also realize maybe I should have given more details of the real values with Novalight:
- each "normal" byte is encoded between 17 and 29 samples (actually 5+4*[3;6])
- each entry of the dictionary is encoded between 9 and 16 samples (actually [6;7]+[3;9])

So dictionary lengths are, from 1st to 14th position: 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16.
This way:
- we are sure any byte can be selected in the dictionary and be shorter
- maybe the two-step-depth calculation can be neglected, as the gain would really be minimal with such dictionary lengths, which makes your first solution good enough? (That's probably also why using an average seems to give good results, which it would not if there were a much bigger difference between the first and the last lengths.)
Interesting, thanks for these details.
May I ask what determines these sizes? Both the non-encoded byte sizes and the sizes of dictionary-encoded bytes.
And I guess this question also implies delving into the details of how you select which bits to use to form all these "normal" and "dictionary" bytes.

I assume that your goal when deciding which bits to emit for each byte is to guarantee that no sequence of bits can ever be mistaken for another?
So the reason these "normal" bytes have the sizes you list above are that you have somehow determined that this condition is always verified?

### Re: Novalight - very fast tape loading

Posted: Mon Mar 04, 2019 6:53 am
NekoNoNiaow wrote:
Mon Mar 04, 2019 12:35 am
May I ask what determines these sizes? Both the non-encoded byte sizes and the sizes of dictionary-encoded bytes.
Sure

With Novalight, a byte begins with a sinusoid coding a 'stop/start' bit, which is there to give the Oric time to build and store the previous byte once all its bits have been read. This 'stop/start' bit must last at least 5 samples (115µs) to let the Oric work.
But to save time, I thought this could also be a 'switch' bit: if it lasts 5 samples, what follows is a 'normal' byte; 6 or 7 means a dictionary byte; 8 means an RLE-repeated byte; and 9 means 'end of program'. This way, the time lost letting the Oric work becomes useful.
Then follow other sinusoids, the whole is encoded like this:

Code: Select all

``````Coding format in WAV signal:
Sinusoid 1 length in samples      Sinusoid 2 (and if required, 3, 4 and 5) length in samples
3   Stop
4   (unused)
5   Start normal byte             3  00 \
    (Oric needs at least          4  01 |__ Four times to code a byte
    5 samples time to work)       5  10 |
                                  6  11 /
                                  7  1111 -- replaces 6+6 samples for 1111 bits
6   Start byte from dictionary 1  [3;9]: byte position in dictionary
7   Start byte from dictionary 2  [3;9]: byte position in dictionary
8   Start repeat previous byte    RLE; for N repeats: (N-1)*3 + 4  (4 being for the last repetition)
9   End of program``````
Why not use 2 samples? Because the Oric does not react below 3 samples.
Why code '00' with the shortest period? Because statistics on thousands of TAP files show Oric programs are made of 62% zeroes.
Why not use 8 and 9 samples to code more complexity in a 'normal byte'? Because there is no more room in the code, nor working time: adding more complex decoding would take longer than the 3-sample sinusoid time (which is the major time constraint), so information would be lost.
The same goes for all kinds of bytes: sometimes the decoding loop lasts only 1 or 2 µs less than the 3-sample length.
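From the table above, the sample length of a 'normal' byte can be computed; note that the bit-pair order (MSB first) and the exact handling of the 7-sample '1111' shortcut are my assumptions here, not verified Novalight code:

```python
# Samples per bit pair, from the table: 00->3, 01->4, 10->5, 11->6.
PAIR_SAMPLES = {'00': 3, '01': 4, '10': 5, '11': 6}

def normal_byte_samples(byte):
    bits = f'{byte:08b}'                     # assumed MSB-first pair order
    pairs = [bits[i:i + 2] for i in range(0, 8, 2)]
    total = 5                                # start/switch bit: 5 samples
    i = 0
    while i < len(pairs):
        if pairs[i] == '11' and i + 1 < len(pairs) and pairs[i + 1] == '11':
            total += 7                       # '1111' in 7 samples, not 6+6
            i += 2
        else:
            total += PAIR_SAMPLES[pairs[i]]
            i += 1
    return total
```

With these assumptions, a byte of all zeroes costs 5 + 4*3 = 17 samples, matching the stated minimum of 17.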

### Re: Novalight - very fast tape loading

Posted: Thu Mar 07, 2019 10:34 pm
Currently working at heavily restructuring Novalight so the dictionary is stored in page 1, with the code, thus improving compatibility.

Not there yet...
1.2_working_screen.png (4.42 KiB)

### Re: Novalight - very fast tape loading

Posted: Fri Mar 08, 2019 6:07 am
Done! Successfully loaded 3D Fongus this morning. Still needs testing with all the options and ROM 1.0.
But I'm glad I'm not using page 2 anymore, that's a relief, and the loading penalty for this patch is only 0.013 seconds. It also saves 8 more bytes for the stack.
Novalight is now cut into 8 parts!

Next steps before v1.2 release:
- try adding a memory relocation if a loading in page 1 is detected (actually, should I make it automatic or set up a user option to choose the memory page?)
- optimizing the dictionary as discussed with NekoNoNiaow

### Re: Novalight - very fast tape loading

Posted: Fri Mar 08, 2019 1:27 pm
Symoon wrote:
Fri Mar 08, 2019 6:07 am
- try adding a memory relocation if a loading in page 1 is detected (actually, should I make it automatic or set up a user option to choose the memory page?)
Well, thinking about it, I'll try to make it automatic, but also add a user option (that will have to indicate the memory page number to use) that will override the automatic relocation. Not sure I know how to program this yet, but I'll try!