Hi all,
I'm trying to replicate bas2tap (which is written in quite terrible C++), and since I'm a Python programmer, I'm writing it in Python of course.
The only thing I really don't understand is a few bytes in a tokenized BASIC line.
The first word (2 bytes) is a starting address, but the starting address of what? In the bas2tap source code it's counted from 0x501, but when I check a file saved from Oricutron with a hex editor, the starting address for the first line is always 0x50F. And I never managed to match anything sensible for the start address of the next line. How should this value be calculated?
As I understand it, the next word (2 bytes) is the line number, followed by the actual tokenized line, which ends with a single 00 byte.
Basic TAP file format
Re: Basic TAP file format
Hi,
Basic:
10 REM LIGNE 10
20 PRINT "TEST"
Tokenized lines:
         +--+----> Address for next line ($0510)
         |  |  +--+----> Line number (10)
         |  |  |  |  +----> Token for REM
         |  |  |  |  |  +-----------------------+----> LIGNE 10
         |  |  |  |  |  |                       |  +----> End Of Line
00000501 10 05 0a 00 9d 20 4c 49 47 4e 45 20 31 30 00 |..... LIGNE 10.|
         +--+----> Address for next line ($051d)
         |  |  +--+----> Line number (20)
         |  |  |  |  +----> Token for PRINT
         |  |  |  |  |  +-----------------+----> "TEST"
         |  |  |  |  |  |                 |  +----> End Of Line
00000510 1d 05 14 00 ba 20 22 54 45 53 54 22 00 |..... "TEST".|
         +--+----> End of Program
0000051d 00 00
Tape file (CSAVE "TESTSAVE"):
                                    +--+----> End address
                                    |  |  +--+----> Start address
00000000 16 16 16 16 24 ff ff 00 00 05 1f 05 01 03 54 45 |....$.........TE|
00000010 53 54 53 41 56 45 00 10 05 0a 00 9d 20 4c 49 47 |STSAVE...... LIG|
00000020 4e 45 20 31 30 00 1d 05 14 00 ba 20 22 54 45 53 |NE 10...... "TES|
00000030 54 22 00 00 00 0b |T"....|
I wrote two utilities in Python, bas2txt and txt2bas, to translate from tokenized to text and from text to tokenized.
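To make the "address for next line" rule from the dump concrete, here is a minimal Python sketch (not the bas2txt/txt2bas code mentioned above). It assumes the program is loaded at $0501 as in the dump, and that each line body has already been tokenized (REM = $9d and PRINT = $ba are taken from the dump). The names `link_lines` and `BASIC_START` are made up for this illustration.

```python
import struct

BASIC_START = 0x0501  # address of the first line, as in the dump (assumption)


def link_lines(lines, start=BASIC_START):
    """Join already-tokenized lines into one program image.

    lines: iterable of (line_number, body_bytes), where body_bytes is the
    tokenized text without the trailing 00.
    """
    out = bytearray()
    addr = start
    for number, body in lines:
        # 2 bytes link + 2 bytes line number + body + 1 byte end-of-line
        next_addr = addr + 4 + len(body) + 1
        # Both words are stored low byte first inside the program
        out += struct.pack("<HH", next_addr, number) + body + b"\x00"
        addr = next_addr
    out += b"\x00\x00"  # end-of-program marker
    return bytes(out)


program = link_lines([
    (10, b"\x9d LIGNE 10"),   # REM LIGNE 10
    (20, b'\xba "TEST"'),     # PRINT "TEST"
])
print(program.hex(" "))
```

The link in each line is simply the address of the byte right after that line's terminating 00, so the first line (15 bytes at $0501) points to $0510 and the second (13 bytes) points to $051d, matching the dump.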
Re: Basic TAP file format
Thank you very much for such an illustrative explanation.
Apparently it was pure luck that my next-line addresses were the same in both test cases.
Once I get my tokenizer working, my plan is to build a higher-level BASIC abstraction like:
if a > 100 then
    do something
    do something else
endif
or
switch a then
    case 1: print"foobar": break
    case 2: print"barfoo": break
endswitch
Re: Basic TAP file format
\o/ got it working!
Got some small issues with byte ordering. For some strange reason (I blame English engineers back in the day), the header addresses use MSB-first (big-endian) order, but the actual code lines use LSB-first (little-endian) order. Ever heard of consistency..?
Also, I couldn't figure out why on earth there are 2 random bytes at the end of the file...
But now the journey continues, and I can start implementing a much higher-level syntax on top of the standard Oric syntax.
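For anyone hitting the same byte-order issue, here is a small Python sketch using `struct` on bytes copied from the dumps earlier in the thread. The offsets are only what those dumps show (addresses 4 bytes after the $24 marker); the meaning of the other header bytes is not assumed here.

```python
import struct

# First 16 bytes of the tape dump quoted earlier in the thread.
tape = bytes.fromhex("16 16 16 16 24 ff ff 00 00 05 1f 05 01 03 54 45")

# Skip the 0x16 sync bytes and the 0x24 marker that follows them.
pos = tape.index(0x24) + 1

# In that dump the two addresses sit 4 bytes after the marker,
# high byte first, so ">HH" (big-endian) reads them.
end_addr, start_addr = struct.unpack_from(">HH", tape, pos + 4)

# The first four bytes of a tokenized line are low byte first,
# so "<HH" (little-endian) reads them.
first_line = bytes.fromhex("10 05 0a 00")
next_line_addr, line_number = struct.unpack("<HH", first_line)

print(hex(end_addr), hex(start_addr), hex(next_line_addr), line_number)
# prints: 0x51f 0x501 0x510 10
```

The little-endian words inside the program are plausibly just the 6502's native order, since the 6502 stores addresses low byte first; the header order is a separate convention of the tape format.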
Re: Basic TAP file format
Good news.
You're right, the header uses MSB-first order, and the ROM saves BASIC programs with one more byte than necessary ($0b in my previous post).
So the end address in the header also needs to be one more than the real end of the BASIC program in memory.
You can add an arbitrary byte.
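As a rough Python sketch of that rule: the function below pads the program with one extra byte and writes the end address to cover it. `build_tap` is a hypothetical helper; the header bytes other than the two addresses and the file name are copied verbatim from the CSAVE dump earlier in the thread, without assuming what they mean, so treat it as a starting point rather than a complete TAP writer.

```python
import struct


def build_tap(name, program, start=0x0501, pad=b"\x00"):
    """Wrap a tokenized program in a tape header like the one in the dump."""
    data = program + pad                  # one byte more than the program
    end = start + len(data) - 1           # so: real end of program + 1
    header = (
        b"\x16\x16\x16\x16\x24"           # sync bytes and marker
        + b"\xff\xff\x00\x00"             # copied verbatim from the dump
        + struct.pack(">HH", end, start)  # addresses, high byte first
        + b"\x03"                         # copied verbatim from the dump
        + name.encode("ascii") + b"\x00"  # zero-terminated file name
    )
    return header + data


# Tokenized program from the earlier post (30 bytes, $0501 to $051e).
program = bytes.fromhex(
    "10 05 0a 00 9d 20 4c 49 47 4e 45 20 31 30 00"
    " 1d 05 14 00 ba 20 22 54 45 53 54 22 00"
    " 00 00"
)

# pad=b"\x0b" reproduces the dump exactly; any byte value should do.
tap = build_tap("TESTSAVE", program, pad=b"\x0b")
print(tap.hex(" "))
```

With the 30-byte program at $0501 the real end is $051e, and the header carries $051f, as in the dump.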