Build Your Own Basic Stamp



The Basic Stamp revolutionised the world of embedded computing



The Basic Stamp from Parallax Inc changed the face of embedded computing.

By providing a small module containing a microprocessor pre-programmed with a Basic language interpreter, an EEPROM program storage memory, a compiler and a program downloading facility, Parallax put microcontroller technology into the hands of hobbyists and amateurs, and even made professional software engineers look up with excitement.

Creating the Basic Stamp was a master-stroke; being programmable in Basic meant that there was no complex assembly language to be learnt, and its construction as a single module meant it was easy to wire up and program, using a serial or parallel interface lead, and no expensive specialised device programmer.

The Basic Stamp was the forerunner of, and inspiration for, many clones of the device, and many alternatives such as the Atom Microcontroller, BasicX and OOPic systems.

The original Basic Stamp was a module mainly because microcontrollers at the time it was created were very small, were not flash programmable and didn't have onboard EEPROM, so an external memory chip had to be used.

As microcontrollers have improved, it is now possible to find those which are flash programmable and have EEPROM onboard, and it is therefore entirely feasible that a Basic Stamp can be re-created in a single chip, and, in particular, using a very low-cost Microchip PIC microcontroller.

This is exactly what Revolution Education Ltd appear to have done with their PICAXE range of microcontrollers.

I have always felt that these "Basic language interpreter microcontrollers" were quite expensive for what they were ( although I can appreciate that purchase cost is not the sole issue in deciding what is good or not, value for money or not ), so I was delighted to see that the PICAXE range costs little more than it would to buy an unprogrammed PIC.

One of the biggest problems I see with the PICAXE ( beyond its limited memory and data capacities which are intrinsic to Basic Stamp clones ), is that serial communications are limited to 2400 baud, which is quite slow.

The thought occurred to me that it must be possible to write a Basic language interpreter for a PIC ( or any other microcontroller if it has a suitable architecture ), and it would be possible to then run the serial interfaces at faster baud rates.

It also seemed, as Parallax's PBASIC language is the de facto programming language for this type of device, that simply interpreting what the Parallax PBASIC compiler generates as a compiled image would allow a Basic Stamp clone to be built which has many development tools readily available for it.

The challenge then, was to create a Basic Stamp 1 clone ( like the PICAXE ), but with additional baud rates for serial interfacing.

The Parallax PBASIC compiler allows baud rates to be specified as a constant, normally in the range 0 to 7 for input, and 0 to 15 for output, but doesn't complain if values greater than this are specified, and as it turned out, whatever the value is, it is placed in the compiled program image where the interpreter can get at it.

Replicating the Basic Stamp 1 interpreter would allow existing Basic Stamp 1 code to execute, but additional, higher, baud rates could be provided.

The challenge to create a Basic Stamp 1 interpreter, running on a Microchip PIC, had started.



Latest News on the DIYSTAMP Distribution Archive

Archive Version 9.09 ( 2002-11-17 )

DIYSTAMP supports creating .ASM output files
Added CODE2ASM Compiled Image Converter.


Where to Start ?

The first place to turn these days when embarking upon a quest for knowledge is "The Internet". If the information you want can't be found there, it doesn't exist, or you're unlikely to find it by any other means.

One of the first significant references found was the "Basic Stamp Divided By Four", which is also known by the appellation "BS/4".

This is a PBASIC interpreter, written by Antti Lukats of Sistudio, which is a cut down version of the Basic Stamp, running on the very limited PIC 16C84 and 16F84 chips. These have 64 bytes of EEPROM, allowing programs which are up to a quarter of the maximum size of the original Basic Stamp programs to run. Hence its name.

The interpreter is available for download for burning into a PIC, from Reflection Technology.

Unfortunately, development has lapsed, and there is no source code available.

Another interpreter, along similar lines to the BS/4, is the ST1-64 from the BSS Club, the PIC Interpreters Club.

Again, there is an interpreter available for burning into a PIC, but no source code, and the site doesn't seem to have been updated since March, 2000.

Prolonged searching for a ready written, freely available, royalty free, source code provided, PBASIC interpreter got me nowhere.

The only way forward was to create an interpreter from scratch.

Not an easy option, but one which was appealing. Even if I never created an actual interpreter, attempting to get there is part of the fun of software engineering.


Getting Underway

Before even thinking about how to create your own PBASIC interpreter, the first thing you need to know is what a compiled PBASIC program looks like. We don't particularly care about how we compile a program, as there are a couple of free tools to do this available ( these are discussed later ). All the interpreter needs is the compiled image to execute.

Unfortunately, but unsurprisingly, Parallax are in the business of making money and aren't publicly disclosing how their compiler or interpreter works or what the compiled code looks like.

Luckily, there are some enterprising individuals out there who are never thwarted by brick walls. One such person is Chuck McManis, who reverse engineered almost the entire Basic Stamp compiled code in his now famous Decoding the Basic Stamp article, which he published on the web.

With Chuck's useful information, it's fairly easy to design a disassembler for the compiled image, and creating a disassembler is a good place to start. Firstly, it's a good way to get to grips with the compiled code format; to understand how each line is compiled, and find all those areas which Chuck hasn't touched upon - such as the GOSUB Return Address Table, and the allocation for user defined EEPROM data.

Although a disassembler only reports what's in the compiled image, and doesn't care about the semantics of execution, it is a step forward towards the creation of an interpreter.

Of course, writing a disassembler doesn't come without its own trials and tribulations.

The compiled "DEBUG" statement doesn't have any of the information about the original source code stored with it, only an indicator of its own position in the compiled image, and is thus impossible to disassemble correctly.

The "FOR" statement has much of the source code information supplied moved to the compiled code of the "NEXT" statement, which means the two have to be married up again. The job here is made easier because each compiled "NEXT" token jumps to the line after the "FOR" statement, so the "FOR" to which a "NEXT" applies can be easily found.

Additionally, testing the disassembler shows up a number of interesting features of the PBASIC language, compiler technology and interpreter implementation.

The mysteriously missing 0000 arithmetic operator code in McManis's article revealed itself to be a unary negation assignment operator ( "= -" ), and the reason for having the 0001, "=", arithmetic operator became clear. If an assignment statement has the same variable on the left of the assignment as on the right ( as in "LET B0 = B0 * 2" ), then the "=" opcode is left out, as an optimisation, only appearing when an assignment is made into a different variable on the left to that on the right.

The 11-bit address value starting in the second byte of the compiled image, turned out to be a pointer to the byte and bit immediately after the end of the program code. This is also the start of the GOSUB Return Address Table, if it exists.

Despite all these issues cropping up, the disassembler turned out to be quite simple to implement, although it is rather bare on the user interfacing side.


The UNSTAMP Disassembler

The UNSTAMP Disassembler decodes a Basic Stamp 1 ( BS1 ) compiled image from the CODE.OBJ file created during compilation. It handles compiled PBASIC programs targeted at the BS1 only.

The UNSTAMP Disassembler is available as an MS-DOS executable and as Basic Source Code.

Downloading the UNSTAMP Disassembler

The UNSTAMP Disassembler Version 2.03 is available for download as part of the DIYSTAMP.ZIP Distribution Archive. The source code consists of the UNSTAMP.BAS file and a number of .BAZ "include files" ( please see README.TXT in the Distribution Archive for details ), and the UNSTAMP.EXE file is the disassembler executable. The entire Distribution Archive can be downloaded by clicking the link below ...

  Download DIYSTAMP.ZIP - Version 9.12 ( 357 KB )

Version 2.03 of the UNSTAMP Disassembler is the latest version.

Although I am running a Virus Checker on my development PC, please check the DIYSTAMP.ZIP and UNSTAMP.EXE files after downloading and unzipping to ensure that they are virus free.

Using the UNSTAMP Disassembler

The UNSTAMP Disassembler must be run in the same directory as the CODE.OBJ file is located.

The disassembler can either be placed in that directory or placed in a directory which is included in the "SET PATH=" environment variable, or it can be run by prefixing its name with the fully qualified path of where it is installed. In short, the UNSTAMP Disassembler is just like any other MS-DOS executable you will encounter.

The disassembler is run by using the "UNSTAMP filename" command. This will read the CODE.OBJ file and create the filename.LST and filename.DIS files. These files are described below.

The filename must not include an extension, and must not include any disk, directory or other relative path information. It must not be greater than eight characters long; long filenames are not supported.

Command line help, and version details, can be obtained by using the "UNSTAMP /?" command.

Output Files

The UNSTAMP Disassembler creates two output files ...

A .LST file which shows how and where the compiled image tokens are held in the compiled image, and the source code for the compiled lines.

A .DIS file which contains just source code determined from the compiled image.

The best way to see the functionality of the disassembler is to try it; create a .BS1 or .BAS source code file ( including the "BSAVE" line, and without using any "SYMBOL" definitions ), compile using the Parallax or BSS Club compilers, disassemble using UNSTAMP, and look at the two resulting files.

For an explanation of the tokens held within the compiled image, and shown in the disassembler output, please see Chuck McManis's Decoding the Basic Stamp article.

Tokens are shown against their address in the compiled image, which is given in the form xx:y, where xx indicates the byte in which the token starts ( 00 being the first address of the image, and FF being the last ), and y is the bit offset within that byte where the token starts ( with 0 being the leftmost, most significant bit and 7 being the rightmost, least significant bit of the byte ).
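
As a rough illustration of the xx:y notation, the following Python sketch converts a flat bit index into the byte:bit form and reads bits out of an image. The function names are hypothetical and are not taken from the UNSTAMP source, and reading a multi-bit value most significant bit first across byte boundaries is an assumption based on the bit numbering described above.

```python
def format_address(bit_index):
    """Format a flat bit index as xx:y (hex byte, then bit offset 0-7)."""
    byte, bit = divmod(bit_index, 8)
    return f"{byte:02X}:{bit}"


def read_bit(image, byte, bit):
    """Return the bit at byte:bit, where bit 0 is the most significant bit."""
    return (image[byte] >> (7 - bit)) & 1


def read_bits(image, byte, bit, count):
    """Read count bits starting at byte:bit.

    ASSUMPTION: values are stored most significant bit first and may
    run on into the following byte.
    """
    value = 0
    index = byte * 8 + bit
    for offset in range(count):
        b, o = divmod(index + offset, 8)
        value = (value << 1) | read_bit(image, b, o)
    return value
```

With this numbering, the last bit of a 256 byte image is at FF:7, which is what format_address(2047) produces.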

Note that program statements run down the compiled image ( from address 00 towards FF ), while user defined EEPROM data runs upwards ( from FF towards 00 ). The disassembler displays the user defined EEPROM data in the order it would have appeared in the original source code.

The disassembler can only create a source code representation based upon the information which is held within the CODE.OBJ file, and therefore it is impossible to know what label names were used and what "SYMBOL" definitions were made in the source code.

Labels are auto-generated in numerically ascending order throughout the disassembly files and are prefixed by "L" ( for "label" ) if they are the destination of an "IF ... THEN", "GOTO" or "BRANCH" statement, and prefixed by "S" ( for "subroutine" ) if they are the destination of a "GOSUB" statement.

Because "DEBUG" statements cannot be disassembled, these are always disassembled as "DEBUG B0".

Although both the Parallax and BSS Club compilers appear to compile source code using "brute force and ignorance", keeping almost everything entered as source code within the compiled image without optimisation, there may be some optimisations applied which result in some source code being disassembled differently to the original source code. The disassembled code should, however, be functionally equivalent to the original.

The UNSTAMP Source Code

The UNSTAMP Disassembler is written in Basic and is compatible with the FirstBasic 1.00 shareware compiler and the PowerBasic 2.10f compiler from PowerBasic Inc.

The source should be fairly easily convertible into other variants of the Basic language, including QBasic and Visual Basic, and even alternative programming languages, such as C, C++ and Java.

Reporting Bugs in the UNSTAMP Disassembler

The UNSTAMP disassembler is based upon the sterling work by Chuck McManis, supplemented by my own experiments to fill in the details missing from his reverse engineering efforts.

With Parallax's compiled image definitions otherwise cloaked in secrecy, there is no real way to confirm that the reverse engineering effort is complete and accurate in all respects, other than to generate code through their compiler, disassemble it and check the two sources match.

This is a tortuous task, as there are a massive number of combinations of almost every statement, and so although there has been a considerable effort to test the interpretation of the compiled image, testing cannot be exhaustive.

This means that there may be errors in the disassembler, both due to misinterpretations of the compiled image format and in coding the disassembler itself.

Before reporting a bug, please check that you have the latest Version 2.03 UNSTAMP Disassembler.

If you do find any significant bugs, please send an email detailing the problem ( with an indication as to how to replicate the bug, and specifying the disassembler version number ) to hippy@psynet.net.

Licensing

The UNSTAMP Disassembler is provided as Freeware for personal, educational and non-commercial use. However, it must not be used to reverse engineer, or to attempt to reverse engineer, any compiled images which you have not created yourself, or compiled images which have been delivered with, or pre-downloaded into, a Basic Stamp or any similar device.

The UNSTAMP Disassembler Executable and Source Code may be modified and redistributed providing that the Licensing and Copyright statements contained therein are unchanged, and no charge other than distribution cost is made.

For commercial use of the UNSTAMP Disassembler, or any program derived from it, please contact the original author, and Copyright holder, at hippy@psynet.net.

Warranty

The UNSTAMP Disassembler is provided "as is", without any warranty of any kind, and without any guarantee as to fitness for purpose.

Downloading, installing and using the UNSTAMP Disassembler is undertaken at your own risk.


Compilers

To disassemble a compiled image, you need to create one. This means you will need a PBASIC compiler, and you need one which stores the object code as a file on disk, so the disassembler can access it.

There are two primary choices for the compiler; the official STAMP.EXE compiler from Parallax and the ST1.EXE compiler from BSS Club which is part of their ST1-64 package.

Both compilers are MS-DOS based ( and have been used successfully at an MS-DOS Prompt under Windows 98 Second Edition ), and are free to download.

The Parallax STAMP.EXE Compiler

To create a CODE.OBJ file, a "BSAVE" line must be added to the PBASIC Source Code which you are compiling.

The source code is compiled using the "STAMP filename.ext" command ( the file extension can be left off if it is .BAS ), which invokes the full screen editor, compiler and loader. The Alt-R key is used to compile the source, which is then normally downloaded into a connected Basic Stamp automatically. If there is no Basic Stamp connected, the compiler will produce an error message, but will have created a CODE.OBJ file.

The STAMP.EXE compiler is exited by using the ESC key.

The BSS Club ST1.EXE Compiler

Unlike the Parallax compiler, the ST1.EXE compiler is entirely command line based, and will create a CODE.OBJ file by using the "ST1 filename.ext" command. The filename extension is not optional, and must be included, but having a "BSAVE" line in the PBASIC source code is optional.

Because the BSS Club compiler is entirely command line based, it is much easier to use than the Parallax compiler, and will normally be the preferred compiler to use. It must, however, be borne in mind that the BSS Club compiler has been developed on the back of a reverse engineering effort, and may not always create the same object code as the Parallax compiler, nor support the PBASIC language completely.

In particular, the ST1 compiler will not compile assignment statements which use unary negation ( such as "LET B0 = -1" ), whereas the official Parallax compiler will, and it throws a runtime error for "EEPROM ( 256 )", and when any EEPROM value larger than 8 bits is used. It also reports an EEPROM full error when a program fills up the entire EEPROM exactly.

On the other hand, it will accept self-assignments ( such as "LET B0 = B0" ) which the Parallax compiler won't, unless followed by an arithmetic operator.

The BS4.EXE Compiler

The Basic Stamp Divided By Four interpreter includes BS4.EXE, a full-screen editor and compiler. This is a very nice MS-DOS "Window" application.

Unfortunately, the compiler is not as robust as either the Parallax or BSS Club compilers, allowing more than sixteen GOSUB's to be compiled without error, and generating completely wrong compiled image tokens for "LET B0 = -B1", and it probably has other bugs as well.

It is therefore recommended that the BS4.EXE compiler is not used, unless there is a particular reason for doing so, such as comparing the output of different compilers. Anything it does produce by way of a compiled image should be treated as suspicious, or wrong.


Lessons Learned

Compiling test programs and disassembling the compiled image revealed an awful lot of information about the compiled image and how parts of the interpreter work, beyond the information provided by McManis.

Both the Parallax and BSS Club compilers do very little optimisation; if you write "LET B0 = B1 - 0 - 0 - 0", then all those unnecessary zero subtractions are included in the compiled image. If you put an "END" at the end of your program, the compiler will still add one of its own.

The only optimisation which has been seen is that where the variable on the left hand side of an equation is the same as the one on the right; such as in "LET B0 = B0 + 1". In these cases, no assignment operator appears in the compiled image, only the equation itself.

Many instructions include the addresses of the subsequent instructions, or a part of themselves. Presumably this is because the original processor in the Basic Stamp had a limited stack and constrained memory, so it's easier to store these addresses away somewhere and come back to them later.

In the case of the power-down and sleep commands, I guess it's easier to store the 'start here when you wake-up' pointer than the current interpreter address, but it's not at all clear at the moment why this would be the case.

The handling of GOSUB's is quite interesting. In a normal processor, every GOSUB will push its return address on to the stack, every RETURN will pop the return address, and continue executing from there.

The Basic Stamp's limited stack size precludes this, so a clever technique is used. Every GOSUB is numbered, and every compiled GOSUB line includes this number and the target destination for the GOSUB. When the GOSUB is executed, the number of the GOSUB rather than the return address is pushed to a virtual stack ( implemented as an array in memory ), and on return the number of the GOSUB is popped from the virtual stack. The GOSUB number is used as an index into an array of GOSUB Return Addresses stored in the compiled image, after the rest of the compiled program. The address so indexed is taken, and execution continues at that address, which is the one immediately after the GOSUB instruction.
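
The mechanism can be modelled with a short Python sketch. This is a toy model only, not the interpreter's actual code: the class and method names are hypothetical, addresses are plain integers, and the return address table is passed in as a list rather than read from a compiled image.

```python
class GosubMachine:
    """Toy model of GOSUB/RETURN using GOSUB numbers instead of addresses."""

    def __init__(self, return_addresses):
        # One entry per GOSUB in the program, indexed by GOSUB number.
        self.return_addresses = list(return_addresses)
        self.stack = []  # virtual stack holding GOSUB numbers only

    def gosub(self, gosub_number, target):
        """Record which GOSUB was taken and return the address to jump to."""
        self.stack.append(gosub_number)
        return target

    def ret(self):
        """Pop the GOSUB number and look up where execution resumes."""
        gosub_number = self.stack.pop()
        return self.return_addresses[gosub_number]
```

The point of the design is that each stack entry only needs enough bits to number the GOSUB's in the program, rather than enough bits to hold a full return address.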

There is a limit of just sixteen GOSUB's which can appear in any PBASIC program. Although this limit appears to have been somewhat arbitrarily set, clues to the limitation are found in the Basic Stamp's operational description.

The variable w6 is reserved for use within subroutines and must not be used by the programmer within any subroutines, and the contents of w6 will be 'corrupted' after a GOSUB. Given that there are sixteen GOSUB Return Addresses allowed, identified by a 4-bit number, and subroutines can be nested four deep, it looks likely that w6 is being used as the GOSUB Return Address Stack.

It is quite remarkable that none of the compilers makes any attempt to check for the dangerous use of w6, and by association b12 and b13, or for the overly deep nesting of GOSUB's.
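
The arithmetic behind that guess is that four 4-bit GOSUB numbers fit exactly into one 16-bit word. A minimal Python sketch of such a packing is given below. The layout shown ( newest entry in the low four bits ) is purely an assumption for illustration; how the Basic Stamp really arranges w6 is not known from the information above, and the function names are hypothetical.

```python
def push_gosub(w6, gosub_number):
    """Push a 4-bit GOSUB number onto a 16-bit word used as a stack."""
    # ASSUMED layout: newest entry in the low nibble, oldest in the high.
    return ((w6 << 4) | (gosub_number & 0xF)) & 0xFFFF


def pop_gosub(w6):
    """Pop the most recent GOSUB number; returns (new_w6, gosub_number)."""
    return w6 >> 4, w6 & 0xF
```

In this model a fifth nested push silently discards the oldest entry, which is consistent with a nesting limit of four, and any program which writes to w6 inside a subroutine would overwrite the stored GOSUB numbers.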

The FOR / NEXT constructs are handled differently in PBASIC than in other Basic dialects. Most of the semantics of the FOR statement are transferred to the NEXT statement during compilation, making the FOR a simple, initial assignment, with NEXT determining whether to loop again or not.

This means that a FOR / NEXT loop will always execute at least once in PBASIC, even where it would not execute at all in other dialects. The statement "FOR B0 = 1 TO 0" will always execute once. Having "FOR B0 = 255 TO 0 STEP -1" causes a more severe problem, as the Basic Stamp deals only with positive maths. When the index variable ( b0 ) reaches zero and is decremented, it becomes -1, but this is really 255 in positive only maths, which is in range, and thus the loop will continue for ever.

It is unclear why there is a separate "FOR" token, when the initial assignment could have been compiled as an equivalent "LET" construct, nor is it clear why, given that the "FOR" token exists, simple assignments to variables ( ie "LET B0=B1" ) were not optimised as a "FOR" initialisation, which uses less compiled code space.
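
This behaviour can be modelled in a few lines of Python. It is only a sketch under stated assumptions: the exact test made by the compiled NEXT token is not given here, so the model assumes that NEXT steps the 8-bit index with wraparound and loops again while the result is still on the correct side of the limit. The function name and the iteration cap are hypothetical.

```python
def run_for_loop(start, limit, step=1, max_iterations=1000):
    """Return the index values seen by the loop body, capped in length."""
    value = start & 0xFF
    seen = []
    while len(seen) < max_iterations:
        seen.append(value)              # body runs before any test is made
        value = (value + step) & 0xFF   # NEXT: step with 8-bit wraparound
        if step >= 0:
            in_range = value <= limit   # ASSUMED test for positive steps
        else:
            in_range = value >= limit   # ASSUMED test for negative steps
        if not in_range:
            break
    return seen
```

Under these assumptions run_for_loop(1, 0) makes a single pass, while run_for_loop(255, 0, -1) only stops when the iteration cap is reached, because an unsigned value can never be less than zero.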

Surprisingly, as I discovered later, the hardest part of compiling an image and disassembling it turned out to be handling the data held in the first byte of the compiled image. This is the number of the byte in which the first unused bit of EEPROM code ( after the tokens and GOSUB Return Address Table ) lies, and this number is stored inverted in the compiled image.

The 11-bit address, which starts in the second byte of the compiled image, points to the first unused bit after the tokens, and points at the GOSUB Return Address Table if one exists.

While this can be generated relatively easily ( once it's realised exactly what the value signifies ), it is not possible to use this value to quickly determine which is the last EEPROM byte used for token and GOSUB Return Address storage, nor the size of the compiled image.
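
A hedged Python sketch of reading these two header values follows. The inversion of the first byte is as described above. The exact bit layout of the 11-bit address is an assumption: it is taken to occupy the whole of the second byte plus the top three bits of the third, most significant bit first, and to split into an 8-bit byte number and a 3-bit bit offset. The function name is hypothetical.

```python
def decode_header(image):
    """Return (first_free_byte, end_byte, end_bit) from an image header."""
    # Byte 0 holds a byte number, stored inverted (one's complement).
    first_free_byte = image[0] ^ 0xFF
    # ASSUMPTION: the 11-bit pointer is all of byte 1 plus the top three
    # bits of byte 2, giving an 8-bit byte number and a 3-bit bit offset.
    pointer = (image[1] << 3) | (image[2] >> 5)
    end_byte, end_bit = pointer >> 3, pointer & 0x7
    return first_free_byte, end_byte, end_bit
```

If the assumed layout is right, the second pair of values corresponds directly to the xx:y address of the end of the program tokens.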


The DIYSTAMP Compiler

Having written the UNSTAMP disassembler, I had gained a good understanding of the compiled image, the format of compiled statements and the semantics of the compiled tokens.

To gain an even deeper understanding, it was decided that a compiler would be written before starting work with an actual interpreter. Having to generate an actual compiled image from source code would show up any areas of misunderstanding or oversight. Writing a compiler would also create an alternative to using that provided by Parallax and others, and allow the chance to extend the PBASIC programming language for the BS1 to incorporate common features used by professional programmers which are unimplemented in those compilers.

The Compiler

The Parallax BS1 compiler ( STAMP.EXE ) is extremely lax in accepting semantically incorrect programs, preferring to place the onus for program correctness upon the programmer rather than rejecting incorrect programs. While this makes a compiler extremely simple to write, it allows incorrect code to be written, and unexpected operation to occur when it is executed.

The most common fault is with the use of the PINx variables which should not be used to specify pin numbers in the BUTTON, HIGH, INPUT, LOW, OUTPUT, PULSIN, PULSOUT, PWM, REVERSE, SERIN, SEROUT and SOUND commands.

The PINx variables always contain the value 0 or 1, reflecting the status of the voltage driving the input pin. When they are used to specify a pin number, the value is taken, and thus only pin 0 or 1 is referred to; not the pin which the programmer thought had been specified.
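
A check for this particular mistake is simple to express. The Python sketch below is not how any of the compilers discussed here implement their checks; it is a minimal illustration which assumes that the pin number is the first argument of each command, that a line carries no leading label, and that the variables are written as pin0 to pin7. The function name and message text are hypothetical.

```python
import re

# Commands which take a pin number (list taken from the text above).
PIN_COMMANDS = ("BUTTON", "HIGH", "INPUT", "LOW", "OUTPUT", "PULSIN",
                "PULSOUT", "PWM", "REVERSE", "SERIN", "SEROUT", "SOUND")

# ASSUMPTION: the pin number is the first argument after the command.
_PIN_MISUSE = re.compile(
    r"^\s*(" + "|".join(PIN_COMMANDS) + r")\s+(PIN[0-7])\b", re.IGNORECASE)


def check_pin_argument(line):
    """Return a warning string if a PINx variable is used as a pin number."""
    match = _PIN_MISUSE.match(line)
    if match is None:
        return None
    command, variable = match.group(1).upper(), match.group(2).upper()
    return f"{command}: {variable} holds 0 or 1, not a pin number"
```

For example, "LOW pin0" is flagged, whereas "LOW 0" and "LET B0 = pin0" are accepted.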

The other common mistake is in using the RANDOM command with something other than a word variable. Although such a mistake won't be rejected, the resulting execution of the program is undefined and unpredictable. Using RANDOM with the PORT variable is an obviously stupid thing to do ( randomly changing pins to input or output ), but the compiler won't complain.

Why Parallax took the decision not to report semantic errors is unclear, and a mystery to anyone who has ever written an assembler or compiler, especially when the Basic Stamp is targeted at novices and those who may be unfamiliar with programming.

The decision also creates a nightmare for professional programmers who are notorious for not reading manuals, preferring to write code while relying on the compiler to tell them when they have stepped outside the bounds of reasonableness and correctness.

The failure to add semantic checking is a major one, and unforgivable from a technical point of view, not least because it can make debugging erroneous code almost impossible; "LOW pin0" may, or may not, operate correctly depending upon the state of external hardware, and in the worst case may operate during weeks of testing, only to fail, for an apparently inexplicable reason, long after that particular line of code was written.

Whereas the ST1 and BS4 compilers slavishly follow the Parallax approach of leaving the programmer to dig their own graves and guess what they did wrong, the DIYSTAMP Compiler provides a lot of semantic checking, in particular where pin numbering by variables may occur, misuse of w6, b12 and b13 within subroutines, and detection of many other mistakes. The compiler also extends the syntax to permit constructs which ought to have been included in the original Parallax compiler.

The main additions are -

  • More flexible symbol definitions ...

    Arithmetic expressions, with precedence override, are handled ( ie SYMBOL X = 1 + 2 )
    Full character strings are handled ( ie SYMBOL X = "Hello" )
    Character string concatenation is handled ( ie SYMBOL X = "Hello" + "World" )
    Compiler defined symbols for time and date

  • Symbol defined variables can have !, % and $ postfix characters to aid identifying the purpose of variables ( ie SYMBOL FRED$ = "STRING" )

  • Numeric constant prefixes have been extended to include ...

    0x, 0h and &h to specify hexadecimal numbers ( ie $AB, 0xAB, &hAB )
    0b and &b to specify binary numbers ( ie 0b0101, &b0101 )
    0o and &o to specify octal numbers ( ie 0o67, &o67 )

  • Include files can be specified by using INCLUDE "file.ext"

  • Two character strings can be used as 16-bit constants ( ie LET W0 = "AB" )

  • IF ... GOTO, IF ... THEN GOTO and IF ... THEN GOSUB are supported

  • REPEAT / UNTIL and WHILE / WEND are supported

  • ON ... GOTO is supported

  • Boolean variables allowed in IF, REPEAT and WHILE expressions

  • A PLAY command, to produce music easily, is added

  • The DEBUG command doesn't require variables to be specified

  • Support for increment and decrement instructions ( ie B0++ and W1-- )

  • Support for increment and decrement by value ( ie B0 ++ 2, W1 -- 3 )

  • Alternative arithmetic operator naming supported ( XOR, MOD )

  • Alternative comparator operator naming supported ( == and != )

  • Shift left ( << ) and shift right ( >> ) operators supported

  • Access to EEPROM can be done using an eeprom[] array

  • Compiled image sizes of 64, 128 and 256 bytes
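
To illustrate the numeric constant prefixes in the list above, here is a minimal Python sketch of a constant parser. It is not taken from the DIYSTAMP source: the function name is hypothetical, the "$" and "%" entries are the standard PBASIC hexadecimal and binary prefixes, and no range checking against 8-bit or 16-bit limits is attempted.

```python
# Prefix -> number base. "$" and "%" are the standard PBASIC prefixes;
# the remaining entries are the extended prefixes from the list above.
_PREFIXES = {
    "$": 16, "0x": 16, "0h": 16, "&h": 16,
    "%": 2, "0b": 2, "&b": 2,
    "0o": 8, "&o": 8,
}


def parse_constant(text):
    """Convert a numeric constant such as 0xAB, &b0101 or 42 to an int."""
    lowered = text.strip().lower()
    for prefix, base in _PREFIXES.items():
        if lowered.startswith(prefix) and len(lowered) > len(prefix):
            return int(lowered[len(prefix):], base)
    return int(lowered, 10)  # no recognised prefix: plain decimal
```

With this sketch, $AB, 0xAB and &hAB all evaluate to 171, and 0b0101 and &b0101 both evaluate to 5.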

Code optimisations are also performed when appropriate, including the removal of redundant arithmetic operations and unnecessary END statements.

The NEWSTUFF.BS1 file in the DIYSTAMP.ZIP Distribution Archive illustrates the new syntactical constructs supported.

The additional syntax and semantic checks may mean that a program which compiled without errors using the Parallax compiler may not when using the DIYSTAMP Compiler. To compile code which conforms to the original Parallax syntax, applying relaxed semantic checking, the /STRICT switch may be specified on the command line when the DIYSTAMP Compiler is run.

The DIYSTAMP Compiler generates a .PRN listing file which shows the result of compilation, and comprehensive error messages are generated when a syntactical or semantic error is detected. Errors are also given in a .ERR error listing file.

Code which compiles using the Parallax compiler should compile using the DIYSTAMP Compiler when no dubious semantics are used, and will always compile if the /STRICT switch is specified.

Code using the enhancements provided by the DIYSTAMP Compiler may not always compile through the Parallax and BSS Club compilers. If the code compiles with the /STRICT switch, then it should also compile using the Parallax and BSS Club compilers.

The compiled image created by the DIYSTAMP Compiler ( in the CODE.OBJ file ) will always be 100% compatible with the BS1 interpreter, provided that a compiled image of 256 bytes is generated ( neither /64 nor /128 used ).

Downloading the DIYSTAMP Compiler

The DIYSTAMP Compiler Version 2.02 is available for download as part of the DIYSTAMP.ZIP Distribution Archive. The source code consists of the DIYSTAMP.BAS file and a number of .BAZ "include files" ( please see README.TXT in the Distribution Archive for details ), and the DIYSTAMP.EXE file is the compiler executable. The entire Distribution Archive can be downloaded by clicking the link below ...

  Download DIYSTAMP.ZIP - Version 9.12 ( 357 KB )

Version 2.02 of the DIYSTAMP Compiler is the latest version.

Although I am running a Virus Checker on my development PC, please check the DIYSTAMP.ZIP and DIYSTAMP.EXE files after downloading and unzipping to ensure that they are virus free.

Using the DIYSTAMP Compiler

The DIYSTAMP Compiler can either be placed in the directory from which it is to be run or placed in a directory which is included in the "SET PATH=" environment variable, or it can be run by prefixing its name with the fully qualified path of where it is installed. In short, the DIYSTAMP Compiler is just like any other MS-DOS executable you will encounter.

The compiler is run by using the "DIYSTAMP filename" command. This will read the specified source code file and create the filename.PRN, filename.OBJ, CODE.OBJ files and a filename.ERR file if any errors are detected. These files are described below.

The filename can specify a file in the directory where DIYSTAMP is run from, or may be a fully qualified filename or relative path to the source code file. Wildcard filenames are supported. The filename must not be greater than eight characters long; long filenames are not supported.

Command line help, and version details, can be obtained by using the "DIYSTAMP /?" command.

There is a set of test programs for 'regression testing' of the compiler included with the Distribution Archive; these are held in the TESTS sub-directory. These can be compiled by using the "DIYSTAMP .\TESTS\*" command.

Output Files

No matter where the source file is located, the output files created by the DIYSTAMP Compiler will be placed in the directory from which the compiler was executed.

The DIYSTAMP Compiler creates three or four output files ...

A .PRN file which shows how and where the compiled image tokens are placed in the compiled image, against the source code for the compiled lines, along with other information pertinent to the compilation.

A .OBJ file and a CODE.OBJ file, both of which contain the compiled image.

A .ERR file will be generated if any errors are detected. Checking whether or not this file exists will allow the success or failure of the compilation to be determined if the compiler is run within a .BAT batch file.
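As a rough illustration of the .ERR check described above, here is a minimal Python sketch rather than a .BAT file. The function names are my own invention for this example, and it assumes that any stale .ERR file from an earlier run has been deleted before compiling; it does not run the compiler itself, it only looks at the files that should be left in the output directory.

```python
import tempfile
from pathlib import Path, PureWindowsPath


def expected_outputs(source, out_dir):
    """List the output files the article says a compile of `source` creates.

    `source` may be a bare name, a relative path or a fully qualified
    MS-DOS style path; only its base name matters, because the output
    always lands in the directory the compiler was run from.
    """
    stem = PureWindowsPath(source).stem.upper()
    out = Path(out_dir)
    return {
        "prn": out / f"{stem}.PRN",    # listing of tokens against source
        "obj": out / f"{stem}.OBJ",    # compiled image
        "code": out / "CODE.OBJ",      # second copy of the compiled image
        "err": out / f"{stem}.ERR",    # only present when errors were found
    }


def compile_succeeded(source, out_dir):
    """Success means a compiled image exists and no .ERR file was written."""
    files = expected_outputs(source, out_dir)
    return files["obj"].exists() and not files["err"].exists()


# Small demonstration using fake output files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "NEWSTUFF.OBJ").write_bytes(b"")
    ok_without_err = compile_succeeded("NEWSTUFF", demo_dir)
    (Path(demo_dir) / "NEWSTUFF.ERR").write_text("error")
    ok_with_err = compile_succeeded(".\\TESTS\\NEWSTUFF.BS1", demo_dir)
```

The same test can of course be made in a batch file by checking whether the .ERR file exists after the compiler returns.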

The best way to see the functionality of the compiler is to try it. There are a number of examples of source code included in the DIYSTAMP.ZIP Distribution Archive which can be used.

The NEWSTUFF.BS1 file illustrates the enhanced syntax available with the DIYSTAMP Compiler, and can be compiled using the "DIYSTAMP NEWSTUFF" command.

For an explanation of the tokens held within the compiled image, and shown in the .PRN file, please see Chuck McManis's Decoding the Basic Stamp article.

Tokens are shown against their address in the compiled image, which is given in the form xx:y, where xx indicates the byte in which the token starts ( 00 being the first address of the image, and FF being the last ), and y is the bit offset within that byte where the token starts ( with 0 being the leftmost, most significant bit and 7 being the rightmost, least significant bit of the byte ).
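The xx:y notation can be turned into a position in the image quite mechanically. The following Python sketch shows one way of doing it; the function names are hypothetical, and reading a token which spans a byte boundary by carrying on into the next higher byte is my assumption based on program statements running from 00 towards FF.

```python
def parse_token_address(text):
    """Split an 'xx:y' address into (byte index, bit offset).

    xx is a two digit hexadecimal byte address (00 to FF) and y is the
    bit offset within that byte (0 = most significant, 7 = least).
    """
    byte_text, bit_text = text.split(":")
    byte_index = int(byte_text, 16)
    bit_offset = int(bit_text)
    if not 0 <= byte_index <= 0xFF or not 0 <= bit_offset <= 7:
        raise ValueError(f"address out of range: {text}")
    return byte_index, bit_offset


def read_bits(image, byte_index, bit_offset, count):
    """Read `count` bits, most significant first, starting at the address."""
    position = byte_index * 8 + bit_offset
    value = 0
    for i in range(position, position + count):
        bit = (image[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value
```

For example, parse_token_address("1A:3") gives byte 26, bit 3, which is the fourth bit from the left of the twenty-seventh byte of the image.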

Note that program statements run down the compiled image ( from address 00 towards FF ), while user defined EEPROM data runs upwards ( from FF towards 00 ). The compiler displays the user defined EEPROM data in the order they appeared in the source code, interleaved between other tokens.

The DIYSTAMP Compiler Source Code

The DIYSTAMP Compiler is written in Basic and is compatible with the FirstBasic 1.00 shareware compiler and the PowerBasic 2.10f compiler from PowerBasic Inc.

The source should be fairly easily convertible into other variants of the Basic language, including QBasic and Visual Basic, and even alternative programming languages, such as C, C++ and Java.

Reporting Bugs in the DIYSTAMP Compiler

Operation of the DIYSTAMP compiler has been checked by compiling numerous source code files through the DIYSTAMP compiler, the BSS Club and Parallax compilers, and checking that the compiled images match. It is possible that there may be some source code configurations which have not been checked, and for which the behaviour of the DIYSTAMP Compiler will not match that of the Parallax compiler; this is most likely to be in cases where the Parallax compiler will allow programs with incorrect, or questionable, semantics to compile whereas the DIYSTAMP compiler won't. To check whether this is the case, please attempt to recompile using the /STRICT switch, which will turn off most semantic checks, and see if the problem still remains.

Before reporting a bug, please check that you have the latest Version 2.02 DIYSTAMP Compiler.

If you do find any significant bugs, please send an email detailing the problem ( with an indication as to how to replicate the bug, and specifying the compiler version number ) to hippy@psynet.net.

Licensing

The DIYSTAMP Compiler is provided as Freeware for personal, educational and non-commercial use.

The DIYSTAMP Compiler Executable and Source Code may be modified and redistributed providing that the Licensing and Copyright statements contained therein are unchanged, and no charge other than distribution cost is made.

For commercial use of the DIYSTAMP Compiler, or any program derived from it, please contact the original author, and Copyright holder, at hippy@psynet.net.

Warranty

The DIYSTAMP Compiler is provided "as is", without any warranty of any kind, and without any guarantee as to fitness for purpose.

Downloading, installing and using the DIYSTAMP Compiler is undertaken at your own risk.


Enhancements to the Basic Stamp

With the Basic Stamp architecture now well understood, it is necessary to step back and decide what we want our own interpreter to do. We could choose to stick with exactly what Parallax and PBASIC offer us, but we have a golden opportunity to build upon their original design.

When Parallax designed the original Basic Stamp, they were constrained by the availability of processors which they could use within that product. This greatly limited what the PBASIC interpreter could do, and led to the limitations on data memory, program size and subroutine calls which would not have existed had a better processor been available for use at the time.

Now that processors have improved, and are available at lower prices, with more memory, in-built EEPROM, Flash code storage, and a variety of enhancements, it is possible to overcome the previous limitations and produce an enhanced Basic Stamp style interpreter. This is the primary goal of the Build Your Own Stamp project.

Providing additional serial baud rates is a simple exercise in extending the interpreter to accept a wider range of values which specify the baud rate to use. Other enhancements ( except those which are provided for by the compiler, and generate compiled images which are 100% compatible with the Basic Stamp ), require more significant changes to the interpreter, and changes to the compiler to support the code generation for the enhanced version.

A primary goal is that the enhanced interpreter must execute compiled image code which has been generated for the Basic Stamp, and so enhancements must be made to fit in with the existing compiled image structure. The enhanced interpreter will run original PBASIC compiled images and enhanced compiled images, but it is accepted that the enhanced images will not run on a Basic Stamp. The DIYSTAMP Compiler will generate compiled code that will run on the Basic Stamp, the enhanced interpreter, and often both.

The most obvious enhancement which can be made is to lose the GOSUB Return Address Table. With more data memory, an interpreter can push the actual return addresses to a stack, rather than an indicator of which GOSUB call it was, and the table becomes completely redundant. This frees up all compiled image space which would have been taken up by the table.

The consequence of using a proper subroutine stack is that w6, b12 and b13 are no longer corrupted within subroutines, giving those variables back to the programmer, who need no longer fear losing data stored within them during execution.
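To make the idea of a proper subroutine stack concrete, here is a minimal Python sketch of how an interpreter might handle GOSUB and RETURN once real return addresses are pushed. The class name and the depth limit of 16 are assumptions of mine for illustration only; the actual limit would depend on how much data memory a given interpreter can spare.

```python
class ReturnStack:
    """Stack of return addresses; `depth` is a hypothetical limit."""

    def __init__(self, depth=16):
        self.depth = depth
        self.addresses = []

    def gosub(self, return_address, target):
        """Remember where to resume, then hand back the address to jump to."""
        if len(self.addresses) >= self.depth:
            raise OverflowError("GOSUB nesting too deep")
        self.addresses.append(return_address)
        return target

    def ret(self):
        """Pop the most recent return address for a RETURN statement."""
        if not self.addresses:
            raise RuntimeError("RETURN without GOSUB")
        return self.addresses.pop()
```

Because the stack lives in the interpreter's own memory, nothing needs to be stored in w6, and no table of return addresses needs to be carried in the compiled image.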

The PBASIC interpreter allows the addressing of 64 variables, of which 56 are mapped onto words, bytes, bits and I/O ports and pins for use; the eight unused addresses can be used to provide additional variables for programmer's use.

The most obvious enhancement is to provide w7 and its corresponding byte parts b14 and b15, which is easily accomplished.

With an increase in data memory available, it is desirable to allow the programmer to use that memory if the interpreter isn't going to. The data memory can be treated as an array of bytes which the programmer can use ( ram[...] ), and it makes sense to provide a means of accessing the memory by way of indexing. To support this, three additional variables are added - ram[w0], ram[w1] and ram[w2].

One thing which is noticeably lacking in the original Basic Stamp is the ability to use interrupts. Although changes on input lines can be acted upon by using the BUTTON command, it would be much nicer to have the code idling, with an automatic jump to a subroutine when the input lines change.

This can be supported by two additional variables; an Interrupt Handler variable ( set only by the INTERRUPT statement ), which stores the address of the interrupt handler to use, and an Interrupt Mask variable ( mask ) to determine which input lines should be monitored.

As well as providing indexed access into the ram[...] array, it is desirable to access that array directly ( ie, ram[45] ). This is achieved by modification of the compiled image tokens which represent constant numbers.

The compiled image deals with 1, 4, 8 and 16-bit numbers differently, compacting the image when small numbers are used. This means that a number below 16 will always be stored as a 1 or 4-bit number and a number below 256 will be stored as an 8-bit number.

If we find an 8-bit number which has its top four bits cleared, or a 16-bit number with its top eight bits cleared, we know that this is not what the compiler would have generated for that constant. We can use this feature to gain access to 264 previously unused tokens.

We use 256 of these tokens to provide read-only access to ram[0] through ram[255], and use the other eight to provide read-only access to another array, rtc[0] through rtc[7]. The rtc[...] array is used to provide date and time information which was not available on the original Basic Stamp.
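The rule for spotting these spare tokens can be sketched in a few lines of Python. The function names are hypothetical, and treating 0 and 1 as the only values stored in the 1-bit form is my assumption; the text above only says that numbers below 16 use the 1 or 4-bit forms. How the spare tokens are divided between ram[...] and rtc[...] is deliberately left out of the sketch.

```python
def canonical_width(value):
    """Return the number of bits the compiler would use for a constant."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("constant must fit in 16 bits")
    if value < 2:
        return 1        # assumption: only 0 and 1 use the 1-bit form
    if value < 16:
        return 4
    if value < 256:
        return 8
    return 16


def is_spare_encoding(width, value):
    """True if a constant is stored wider than the compiler would store it.

    Covers the two cases named in the text: an 8-bit number with its top
    four bits cleared, and a 16-bit number with its top eight bits cleared.
    """
    if width == 8:
        return value < 0x10
    if width == 16:
        return value < 0x100
    return False
```

An interpreter which finds is_spare_encoding() to be true knows that the token is not an ordinary constant, and can treat it as an array access instead.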

The use of the ram[...] array not only provides additional byte variables but also allows for the creation of large data tables without utilising EEPROM data as would normally be necessary; a program is first run to initialise the ram[...] array, and subsequent programs can then be used to utilise the pre-loaded data, and modify it as required.

All addresses used in IF .. THEN, GOTO, GOSUB and INTERRUPT commands are stored as a token which is 11 bits long; an 8-bit indicator of the byte within which the destination of the jump starts, and a 3-bit offset into the byte to where the first bit of the token jumped to is.
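With 8 of the 11 bits taken by the byte index, 3 bits remain for the bit offset, which is exactly enough to select one of the eight bits in a byte. A minimal Python sketch of packing and unpacking such a token follows; the function names are my own, and placing the byte index in the upper 8 bits is an assumption which makes the token value equal to the linear bit address of the destination.

```python
def pack_address(byte_index, bit_offset):
    """Combine a byte index (0-255) and bit offset (0-7) into 11 bits."""
    if not 0 <= byte_index <= 0xFF:
        raise ValueError("byte index must be 0 to 255")
    if not 0 <= bit_offset <= 7:
        raise ValueError("bit offset must be 0 to 7")
    # Assumed layout: byte index in the top 8 bits, offset in the low 3.
    return (byte_index << 3) | bit_offset


def unpack_address(token):
    """Split an 11-bit address token into (byte index, bit offset)."""
    if not 0 <= token <= 0x7FF:
        raise ValueError("address token must fit in 11 bits")
    return token >> 3, token & 0b111
```

Under this layout the address shown as 1A:3 in the .PRN file would be held as the value 211, that being 26 x 8 + 3.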

Addresses must therefore point to valid tokens which are the start of a command. This means that there are numerous possible address tokens which are unusable in most programs; however, it is difficult to determine which they are, except for those which indicate a jump to a token which claims to start in the first 19 bits of the compiled code. These bits are not part of the executable code, but indicators of the size of the compiled program, and where the program ends. We also know that the first five bits of the program are those that form the first token of the program, of which four would never be jumped to. Likewise, the last four bits of the compiled image would never be jumped to either; a total of 27 known, and guaranteed, special cases of address destinations.

We can use this feature to generate special destination addresses for IF ... THEN, GOTO, GOSUB and INTERRUPT commands.

The maximum compiled image size of 256 bytes is somewhat limiting, but can be extended by allowing an interpreter to load multiple compiled images and allow the programs to jump from one image to another. Jumping to one of the special addresses will cause a jump to a specific compiled image. To aid implementation, the number of pages can be limited to 16; if the address is less than 16, it is a jump to one of the compiled images, in page 0 to 15.
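How an interpreter might tell an inter-page jump from an ordinary one can be sketched as below, in Python. This assumes the 11-bit address token is read as a linear bit address into the image (byte index times eight, plus bit offset); the names are hypothetical, and "reserved" is simply my label for the header addresses 16 to 18, which the text above does not assign a meaning to.

```python
HEADER_BITS = 19    # leading bits which hold size information, not code
PAGE_COUNT = 16     # pages 0 to 15


def classify_destination(address):
    """Decide what an 11-bit destination address means to the interpreter."""
    if not 0 <= address <= 0x7FF:
        raise ValueError("address must fit in 11 bits")
    if address < PAGE_COUNT:
        return ("page", address)        # jump into another compiled image
    if address < HEADER_BITS:
        return ("reserved", address)    # special, but unassigned here
    return ("local", address)           # ordinary jump within this image
```

With 16 pages of 256 bytes each, an interpreter supporting every page would have 4096 bytes of compiled image available to it.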

Whether a particular interpreter will support multiple compiled images, and if it does, how many, will depend upon the amount of token storage space that the processor running the interpreter has. Although single-chip interpreters are unlikely to support more than one compiled image at a time, interpreters which utilise a PC platform as the 'microcontroller' may well. Using a PC to emulate a Basic Stamp is overkill with just 256 bytes of image to play with, but with 16 pages, a total of 4096 bytes of image, this may well be practical.

It may also be ideal in cases where the compiled image and interpreter are burnt into a microcontroller's program memory, rather than uploaded into a microcontroller which has only the interpreter built in, and stores the compiled image in on-chip EEPROM.

Separate compiled images can be used as continuations of programs which have exceeded the size of a single page, and can be used as entire subroutines or interrupt handlers. Libraries of routines can be held in separate compiled images by calling them with a variable to specify which routine needs to be executed by way of a BRANCH statement when that page is entered. Returning from one page to another is done by using the RETURN statement; a call to another page is as easy as a call to a local routine.

All these optimisations fit in with the existing compiled image format; were a compiled image using these enhancements run on a Basic Stamp, it would probably execute without crashing ( except in the case of the loss of the GOSUB Return Address Table and the 'badly formed' inter-page address tokens ), but the results would be unpredictable. It is this feature which allows both PBASIC and enhanced compiled images to be run on the enhanced interpreter without the interpreter needing to be aware of which target the code was generated for.

There are further enhancements which can be incorporated, but these require that the interpreter is able to determine if the image is for a Basic Stamp or an enhanced interpreter, as the compiled image is different in each case.

The original Basic Stamp compiled images include address tokens as part of instructions to overcome limitations of the original processor. These are no longer required as the limitations have been removed with later processor capabilities, and if these redundant addresses can be removed, it will free up code space.

Both the DEBUG and SERIN commands generate an unnecessary address token, both of which point to an address immediately after the address token itself. Because of this characteristic, we can determine at compile time what that address will be, and likewise, an interpreter is able to predict what the address should be.

Using this knowledge, we know that the most significant bit of the address token will always match the most significant bit of the address which immediately follows it. If the two bits are not the same then it is not a valid address token.

This fact is used to remove the redundant addresses in an enhanced compiled image. When a redundant address is found ( in a non-enhanced compiled image ), the first bit will match the most significant bit of the address that would be expected after the address token, and therefore there is a complete address token. If the bit does not match, then we know the address token is not there. The compiler has only to generate a correct address token, or a single inverted bit when in enhanced mode, and the interpreter has to only check a single bit to determine which of the two cases it is. The enhanced interpreter will therefore execute original Basic Stamp compiled images and enhanced images with very little modification.
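The single bit test can be sketched in Python as follows. This is only an illustration under a few assumptions of mine: positions are linear bit addresses counted from the start of the image, bits are read most significant first, and the address expected after a full token is simply the current position plus 11. The function names are hypothetical.

```python
ADDRESS_BITS = 11


def bit_at(image, position):
    """Return the bit at a linear bit position, most significant bit first."""
    return (image[position // 8] >> (7 - position % 8)) & 1


def skip_redundant_address(image, position):
    """Return the position of the first bit after the redundant address.

    `position` is where the address token of a DEBUG or SERIN would start.
    """
    expected = position + ADDRESS_BITS              # where a full token points
    expected_msb = (expected >> (ADDRESS_BITS - 1)) & 1
    if bit_at(image, position) == expected_msb:
        return position + ADDRESS_BITS              # full token: original image
    return position + 1                             # inverted bit: enhanced image
```

In an original image the interpreter steps over all 11 bits; in an enhanced image it steps over just one, a saving of 10 bits for every DEBUG or SERIN in the program.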

The token space saved with the optimisation of DEBUG and SERIN depends upon how many times these are used within the program. In the case of DEBUG, the saving reduces the impact of adding DEBUG statements during code development quite considerably.

The READ and WRITE statements also include redundant address tokens, however these cannot be optimised away as they are with DEBUG and SERIN, as it is not known, at the time when the address token is encountered, where the address will point to, as this is dependent upon other tokens which follow.

To gain write access to the ram[...] and rtc[...] arrays using a constant index, we can utilise the IF token sequence. Because it is impossible to compare with the Interrupt Handler variable, and the variable identifying index isn't used by the original Basic Stamp, we can use a comparison with the Interrupt Handler to indicate that this is not really an IF statement, but a specifier for the storage into the ram[...] or rtc[...] arrays. Up to that point, the tokens will have conformed with those expected for an IF statement, and will have had no adverse effect on the operation of the interpreter or data variables; subsequent tokens can be dealt with as if they were part of a LET assignment sequence.

As can be seen, considerable enhancements can be made to the original Basic Stamp by fairly easily extending an interpreter for the Basic Stamp. The enhancements are incorporated within the original compiled image, and do not cause excessive compiled image bloat. Any increase in code size is likely to be offset by code optimisations elsewhere.


The Enhanced DIYSTAMP Compiler

Version 2.02 and above of the DIYSTAMP Compiler supports all the enhancements which are described above, and is capable of generating code in a Basic Stamp compatible format, and in the newly defined enhanced format.

All that needs to be done to generate an enhanced compiled image is to specify the /ENHANCED switch on the DIYSTAMP command line or include it within the source code itself. The ENHANCED.BS1 file in the DIYSTAMP.ZIP Distribution Archive illustrates the new enhanced capabilities supported.

There is a set of test programs for 'regression testing' of the compiler included with the Distribution Archive; these are held in the TESTS sub-directory. These can be compiled by using the "DIYSTAMP .\TESTS\* /ENHANCED" command.

Version 2.03 and above of the UNSTAMP Disassembler supports the disassembly of enhanced compiled images.

The main additions to the enhanced interpreter are -

  • Additional w7, b14 and b15 variables

  • Access to ram[...] array data

  • Access to rtc[...] array data

  • Interrupt handling

  • Support for multiple page compiled images

As there may be interpreters which will not support all enhancements, as they may be implemented on a microcontroller which is constrained in terms of memory or other features, every enhanced compiled image is given a 'code signature' which flags all the enhancements used, and required to allow that compiled image to be executed properly.

When the compiled image is uploaded to the interpreter chip, it can check if any features are required which it does not implement, so there is no danger that an interpreter will fail to execute as expected because it isn't able to support what the compiled image needs.
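The check an interpreter makes against the code signature amounts to a simple comparison of flags. The Python sketch below shows the idea; the flag names and bit values are entirely hypothetical and are not the actual layout of the code signature, which is not described here.

```python
# Hypothetical feature flags, one bit per enhancement listed above.
EXTRA_VARIABLES = 0x01   # w7, b14 and b15
RAM_ARRAY = 0x02         # ram[...] access
RTC_ARRAY = 0x04         # rtc[...] access
INTERRUPTS = 0x08        # interrupt handling
MULTI_PAGE = 0x10        # multiple page compiled images

FEATURE_NAMES = {
    EXTRA_VARIABLES: "EXTRA_VARIABLES",
    RAM_ARRAY: "RAM_ARRAY",
    RTC_ARRAY: "RTC_ARRAY",
    INTERRUPTS: "INTERRUPTS",
    MULTI_PAGE: "MULTI_PAGE",
}


def missing_feature_names(required, supported):
    """Name every feature the image needs which the interpreter lacks."""
    missing = required & ~supported
    return [name for flag, name in FEATURE_NAMES.items() if missing & flag]


def can_run(required, supported):
    """True if the interpreter implements everything the image requires."""
    return not missing_feature_names(required, supported)
```

An interpreter built for a small microcontroller might support only EXTRA_VARIABLES and RAM_ARRAY, and would then refuse an image whose signature includes MULTI_PAGE rather than attempt to run it.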


The CODE2ASM Compiled Image Converter

Getting a Compiled Image from the DIYSTAMP or any other compiler is all well and good, but it can't be used unless it is uploaded into a microcontroller which contains a PBASIC interpreter or into a Basic Stamp Emulator or Simulator.

From my own point of view, which seems to be backed up by those who have shown an interest in this project, there is a requirement for having a PBASIC interpreter which is burnt into a target microcontroller along with a predefined PBASIC program. This allows a smaller interpreter to be built, as it is not necessary to provide upload capabilities, and it also allows an interpreter to be developed without having to sort out the uploading first.

Another good reason for using an interpreter with a pre-loaded and predefined PBASIC program is that the program code cannot be easily extracted by a user, nor can it be inadvertently overwritten by any other PBASIC program.

To allow Compiled Images to be incorporated within a microcontroller, or any other program, the CODE2ASM utility is provided within the DIYSTAMP.ZIP Distribution Archive. It converts a Compiled Image into a file in a format suitable for use with almost any other programming language. The generated file can of course be processed further if so required.

The DIYSTAMP compiler ( from Version 2.02 upwards ) also supports the generation of an include file by using the /ASM command line switch. All switches supported by CODE2ASM are also supported by DIYSTAMP.

Documentation on the CODE2ASM utility and examples of use are included in the CODE2ASM.TXT file which is included in the Distribution Archive. The source code is also included.

If you are using this technique to create your own interpreter, it is recommended that the generated file be included twice ( which may require two different conversions to be made to get the correct formats ); once in read only code space ( for program execution ) and again in EEPROM ( for data storage ), so a running program can update any data which is stored in memory. This will require the interpreter to fetch program tokens from one location and read and write data in another. Partitioning a program in this way will allow up to 256 bytes of PBASIC tokens and 256 bytes of EEPROM data simultaneously, although such a program would very likely crash if run using an interpreter where both share the same 256 bytes of memory. If a program does not need to update the EEPROM data when executing, it is possible to store the program wholly in the microcontroller's read only memory.


The Interpreter

My original plan was to develop a PC based interpreter, or more correctly, a simulator, and then port the interpreter over to a PIC.

I have created a basic PC based simulator for PBASIC which supports all compiled tokens, but doesn't allow any I/O control; PC's do not have many convenient I/O lines beyond those provided by the parallel line printer port, sound card joystick inputs and a few serial port control lines.

The exercise did reveal some issues related to developing a properly embedded interpreter; ensuring compact program code, and the necessity of being able to get the compiled tokens in a PC file uploaded into the interpreting device.

The obvious move, having created the basics of an interpreter, was to start porting the interpreter to a PIC. Unfortunately, but for ease of design, the interpreter has been written in Basic and is not easy to compile directly to PIC code; the 'porting' is going to have to be program creation from scratch.

This is not a particularly large problem as that will need to be done no matter what the target processor chosen is, but it does present me with a number of problems ...

I have not used PIC processors for a few years now, and am pretty rusty in that field.

I truly loathe the PIC Assembly Language; and I mean really loathe it. So much so that I would not use it given an option.

When I was using PIC's regularly I wrote my own cross-assembler which supported assembly language mnemonics akin to the 6800 and 8051 series processors. The cross-assembler was designed for the old 12-bit PIC's ( 16C5X family ) and does not now appear to be simple to adapt for use with the current 14-bit PIC's ( 16F627 etc ).

I have looked at various C, and other high-level language cross-compilers but I am not convinced that these will be suitable in creating an interpreter which is fast and compact. I may be wrong, but low-level coding looks like the best way to proceed.

I am also at a disadvantage when it comes to development tools as I don't have any PIC programming hardware and don't want to get bogged-down in home-building and getting that to work. Although ready-built programmers can be purchased relatively cheaply, I don't want to invest money in a project which is going to go nowhere.

The project going nowhere is what worries me at the present. I may have set my goals too high, and have now wandered out into waters which are too deep; understanding PIC's, understanding PIC Assembly Language, understanding PIC programming, needing programming tools and having to write and debug the interpreter on top of that.

This may be just a short-term lack of focus, and loss of confidence, but I need to step back and decide how to move on, if at all.

There is considerable interest in a Basic Stamp clone, and it appears that many people are also considering the development of a PIC based interpreter. The best way forward may be to let them do that job, and I can then concentrate on the PC side tools needed; a field in which I am happier to operate.

Other projects, both design and development and mundane household tasks, have delayed the DIYSTAMP project, but I would like to see it fulfilled.

Whilst I am uncomfortable diving-in with further PIC based development, I have considered using the Nintendo GameBoy and the Psion Organiser II as the hardware platform for further development. These are both incredibly cheap to get hold of, and offer considerably more than some PIC's do at the same second-hand price. There is a phenomenal amount of architectural and system documentation available on each, and Software Development Kits are freely available.

Both are full microcomputer systems with large memory spaces and include LCD displays; 160 x 144 pixels for the GameBoy and 2 lines x 16 characters for the Psion. With additional interfaces these would be just as well suited to controlling projects as a Basic Stamp Clone would.

Both have advantages over PIC's but also have their problems, requiring hardware interfaces to be built and programmers to be purchased. Although the gains to be made look useful, their use takes us away from the single-chip Basic Stamp clone design envisaged when I embarked upon the project.

The support and encouragement to continue with a PIC based interpreter has not gone unnoticed, and it is the obvious, and more technically correct, way to proceed.

I do have access to an EPE ICEbreaker development system, utilising the PIC 16F877 which may be a suitable platform to move on with, and it is in this direction I think the next steps will be.

There will thus be a short time-out, while I take stock of the current project state, look at the ways I can get the project moved on, and decide how best to proceed. Please bear with me, and I'll let you know how I've progressed and what my plans are in the near future, by the end of November 2002 at the latest.


Basic Stamp and PBASIC are registered trademarks of Parallax Inc. PICAXE is a trademark of Revolution Education Ltd. PICmicro is a registered trademark of Microchip Inc. MS-DOS is a registered trademark of Microsoft Corporation. Build Your Own Basic Stamp, DIYSTAMP, DIYSTAMP Compiler, SHOWCODE, SHOWCODE Compiled Image Viewer, UNSTAMP, UNSTAMP Disassembler, SIMSTAMP, SIMSTAMP PC Simulator, RUNSTAMP and RUNSTAMP Interpreter are trademarks of the Happy Hippy.





Associated Articles

  DIYSTAMP.ZIP

  The PICAXE Processors
  Making Music with the Basic Stamp

  PICASM Assembler



Sites to Visit

  Parallax Inc

  PBASIC Manual
  STAMP.EXE

  BSS Club

  ST1-64v2.ZIP

  Reflection Technology

  BS4.ZIP

  PowerBASIC Inc



First published on Sunday the 4th of August, 2002 at 16:51:43
Last upload was on Thursday the 8th of January, 2004 at 14:07:32