As in most programming languages, Pancode deals with different types of data. Pancode is dynamically typed and has several built-in types of data. The types in Pancode are: numbers, strings, lists, functions, boxes, and symbols. I'll discuss each of these in detail below. There is no formal Boolean type in Pancode: any value other than zero is truthy (although conventionally -1 is used to represent the canonical "true"), and zero is falsy.
First, a note on comparisons. Many of the comparison operators,
such as =
and ≤
, take two arguments
and compare them, for equality, inequality, or some ambient
ordering. Equality never fails, in the sense that it always
returns either true or false. Two values of different types
always compare nonequal. Two values of the same type have
special rules to determine equality, which will be discussed
below. It's an error to attempt to order two values of different
types. Two values of the same type may have an ordering defined
on them, as discussed below.
Numbers in Pancode are IEEE floating-point values. Equality and
ordering are defined as specified in the relevant IEEE standard.
Integers (both positive and negative) can be pushed onto the
stack using the literal integer itself (with a preceding minus
sign, when relevant). Note that this is one case in Pancode
where whitespace matters: - 1
is a call to the -
subtract
operator followed by the number one, whereas -1
is
the literal number negative one.
The IEEE value infinity is written using ∞
.
Negative infinity is written using -∞
(again, there
must be no space between the two characters). NaN (not a number) is written using 👿
, because that's
usually how I feel when NaN shows up in my computations.
Strings are effectively arrays of characters. I make no
guarantee as to the encoding scheme used to store them, so for
all intents and purposes you should regard them as simply an
array of code points that is somehow distinct from the
general-purpose "list" type. Strings can be pushed using "
quotation
marks. Currently, the only supported escape sequences which can
appear in a Pancode string are \"
(for a quotation
mark), \\
(for a literal backslash), and \n
(for
a newline). Note that you are free to include literal newlines
in your strings, and they'll be preserved. \n
is
merely provided as an, often cleaner, alternative.
Two strings compare equal if they have the same characters. Strings are compared for ordering using standard dictionary order on the code points.
Strings may be marked as regular expressions using the r
command.
A string marked as a regexp still behaves like a string for most
intents and purposes, but a few searching commands will treat it
differently. A regexp will never compare equal to a string. If
two strings are ordered and one is a regexp, they still follow
the usual ordering, unless they are the same string, in which
case the regexp is considered greater.
Lists are sequences of Pancode values. Lists can contain any
Pancode data, including other lists. Lists are generally
constructed using curly braces {1 2 3}
(Note that this is not
special syntax; this is a call to the {
command,
then three numbers, then a call to the }
command),
and they print to the screen the same way.
Lists in Pancode can be either eager or lazy. Eager lists are inherently finite and are fully stored in memory at all times. Lazy lists may store only a finite prefix and a "thunk" indicating how to compute the rest of the list. Lazy lists can be finite or infinite. Evaluating additional elements of a lazy list is called forcing the list.
Two lists are equal if they have the same length and all of their values compare equal. Lists are ordered using dictionary order, and their values are compared recursively.
Laziness does not affect the ordering or equality comparisons of lists. A lazy list can be equal to an eager list, provided they contain the same elements. If two lists are compared for equality or ordering and at least one is lazy, the lazy list(s) are forced to the point necessary to determine the ordering. Note that comparing two infinite lists for equality can cause an infinite loop.
Note: Lazy lists are not yet fully implemented in Pancode, so some list functions may not work on them yet. For more details, see Issue #15. This restriction only applies to lazy lists.
Functions are blocks of code that can be run. A function can be
constructed using square brackets. For instance, [1 +]
is
a function which, when called, runs the 1
command followed by the +
command.
More succinctly, it is a function which adds one to its sole
argument. Functions can be called directly using $
,
which pops a function off the stack and calls it. More often,
functions are called indirectly by various control flow commands.
It is an error to try to order two functions. Function equality is undefined, in the sense that two functions may or may not compare equal, regardless of their definitions, and this result may change at any point.
Slips are blocks of code enclosed in corner brackets (「
and 」
).
When a slip is evaluated as literal code, it evaluates its
inside elements.
You generally don't need to deal with slips directly. Slips are created as a result of some forms of syntax sugar or macro expansions and generally behave behind the scenes.
It is an error to try to order two slips. Slip equality, like function equality, is undefined.
A box is a simple cell which contains a single Pancode value.
Boxes can be constructed using ⊂
and extracted
using ⊃
. Boxes are generally used to suppress
scalar extension on various commands, when we want a command to
run on a whole list rather than on each element.
Boxes can also be used to delay evaluation. An expression quoted
by a single apostrophe '
is represented in the
source code with the box type, and when a box is evaluated, it
pushes its contents onto the stack.
Two boxes compare equal if and only if their contents do. The order of two boxes is determined by their order of their contents, and it is an error to attempt to order two boxes whose contents are not orderable.
Symbols are the basic structure of Pancode programs. A symbol
consists of a single character representation and zero or more
modifiers. Every single-character command which is not
interpreted as special syntax is itself a symbol command. If a
literal symbol command appears in the code, it will normally be
evaluated. Thus, to push an actual symbol to the stack, the
single quote '
special
command can be used. That is, when it appears literally in the
code, +
is a command which adds numbers, but '+
is
a command which, when evaluated, pushes the symbol +
to
the stack. This is also different from the backtick syntax, as `+
is
equivalent to `[+]
, i.e. a function which adds numbers.
Two symbols are equal if they have the same character and the
same modifiers. Numerical modifiers are compared sequentially,
and so are prime modifiers, but the order of different types of
modifiers is irrelevant, so +⓪′
is equal to +′⓪
but +⓪①
and +①⓪
are
distinct. When symbols are compared for ordering, they are first
compared for their characters, then for the numerical modifiers
in order, then finally for the prime modifier.
Certain symbols have special designated meanings in Pancode.
These symbols are sometimes called sentinel values and are used
to indicate certain behaviors to built-in commands. When using a
symbol as a sentinel value, one should take care to ensure that
the symbol has no modifiers attached. For instance, ε
is
the null sentinel (see below), but ε⓪
is a distinct
symbol which happens to have the same text but different
modifiers. Pancode operations will not treat the latter as a
sentinel value. Sentinel values, when used as symbol commands,
always push themselves onto the stack, discarding any modifiers.
Currently, there are three sentinels defined in Pancode. Each sentinel is represented by a single character.
{
, the array start sentinel, is used by the }
command
to allow a natural array construction syntax in Pancode.⚐
, the white flag sentinel, is passed to a fold
function to indicate that an empty list was provided. Generally, if given ⚐
as
an argument, a command will return its "neutral" or identity value, so +
will
return 0, ×
will return 1, etc.ε
, the null sentinel, is returned by many
commands in case there's no other reasonable value to return,
such as when trying to access an out-of-bounds list position, or
when trying to read from input when there is no input.