- BAD The bytecode VM should check whether we have the right values on the stack. This check can be removed in release mode.
- IDEA Make structs non-GC'd and make `object` a GC'd equivalent? Objects can be used to handle by-ref situations, but structs can be used in most cases
  - Objects can also perhaps implement `interface` type stuff as below
- IDEA Rather than having arrays or indexing syntax built in, maybe just have a primitive "slice" type? Pretty much the Zig approach
  - A slice + encoding/iterator covers like 90% of cases
  - If we want these things to be efficiently encoded, we should just start having multiple values on the stack
    - A slice is just a length int value + a `TINY_VAL_SLICE_PTR`, so we don't have to box every slice and we also don't have to blow up the size of every `Tiny_Value`
  - But if we're gonna support unboxed structs and stuff, then we might as well use an OCaml-like value representation (use a high bit to represent whether anything is an `Object*` vs a primitive)
    - There are some complications with this model; can no longer have true 64-bit ints without boxing
    - Might also be troublesome for embedded systems where we'd want integer type not to be the same width as pointer type
      - Oh but actually `Tiny_Value` is already always at least pointer width
  - We can support unboxed arrays via the extern system
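A minimal sketch of the OCaml-like representation mentioned above, assuming one pointer-sized word per value. OCaml itself tags the low bit, so this sketch does the same instead of the high bit floated in the note; it relies on heap objects being at least 2-byte aligned. `TaggedValue` and the `tv_*` helpers are hypothetical names, not part of Tiny.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical one-word value: low bit 1 = immediate integer, low bit 0 = pointer.
// Assumes real object pointers never have bit 0 set (alignment >= 2).
typedef uintptr_t TaggedValue;

static bool tv_is_int(TaggedValue v) { return (v & 1u) != 0; }

// Only (pointer width - 1) bits of payload survive the shift, which is
// why true 64-bit ints would need boxing under this model.
static TaggedValue tv_from_int(intptr_t i) {
    return ((uintptr_t)i << 1) | 1u;
}

// Assumes two's complement and arithmetic right shift for signed values,
// which holds on mainstream compilers but is implementation-defined in C.
static intptr_t tv_to_int(TaggedValue v) {
    assert(tv_is_int(v));
    return (intptr_t)v >> 1;
}

static TaggedValue tv_from_ptr(void *p) {
    assert(((uintptr_t)p & 1u) == 0); // must be aligned
    return (uintptr_t)p;
}

static void *tv_to_ptr(TaggedValue v) {
    assert(!tv_is_int(v));
    return (void *)v;
}
```

The win is that integers and object pointers share a single word with no separate type tag; the cost is the lost integer bit noted above.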
- IDEA First class namespaces/modules would be pretty sick; easy way to soup up the macro system by making them truly hygienic
  - Basically like OCaml, but I believe we can make the language simpler somehow using them
  - Types can be namespaces? Or namespaces can have types?
  - Basically formalize the "protocol" thing we've been doing with the indexing syntax
- IDEA What if all `struct`s are structurally typed? Assuming structs
- IDEA Since it's become such a common pattern to call methods like `arr->aint_len()`, maybe a shorthand like `arr:len()` which just compiles to `{type(arr)}_len()` would simplify a lot of code
- BAD No way to signal errors. Would be nice if we had errors in addition to options, like `int!`
- REFACTOR Unify symbol types for foreign and regular functions. Both of them have indices, argument types, and return types. The only difference is ellipsis and the way that the compiled code interacts with them
  - Actuallyyyyy, this is not that straightforward; foreign functions just track tags for their args, no names, for example
  - Could do with some simplifying helpers though
- BAD No for..in type thing
  - Another protocol thing? Just anything that has a `_len` and a `_get_index` would work for now
  - Could also do a true "iterator" thing where it expects a `_iter`, `_iter_next` to be defined and goes from there?
    - 90% of use cases should work with just an integer IMO; the iterator case should probably be explicit
    - Nullable types could make it easier to do the `_iter_next` thing
    - I could have an unboxed integer iterator easily by just stuffing it into a pointer (light native) or even literally just having it be an integer at runtime (since `Tiny_Value` technically boxes it anyways)
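A rough sketch of what a for..in loop could desugar to under the `_len` / `_get_index` protocol idea, modeled in C with function pointers. Everything here (`SeqProtocol`, `for_each`, `IntArray`) is a hypothetical stand-in for illustration, not Tiny's actual API.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical model of the "{type}_len" / "{type}_get_index" protocol.
typedef struct {
    int (*len)(const void *seq);
    int (*get_index)(const void *seq, int i);
} SeqProtocol;

typedef void (*BodyFn)(int element, void *ctx);

// What `for x in seq { body }` could lower to: a plain integer loop.
static void for_each(const SeqProtocol *p, const void *seq, BodyFn body, void *ctx) {
    assert(p != NULL && body != NULL);
    int n = p->len(seq); // length is read once, before the loop starts
    for (int i = 0; i < n; ++i) {
        body(p->get_index(seq, i), ctx);
    }
}

// Example sequence type that satisfies the protocol.
typedef struct {
    const int *items;
    int count;
} IntArray;

static int int_array_len(const void *s) { return ((const IntArray *)s)->count; }
static int int_array_get_index(const void *s, int i) { return ((const IntArray *)s)->items[i]; }

static const SeqProtocol INT_ARRAY_PROTOCOL = { int_array_len, int_array_get_index };

// Loop body used for demonstration: accumulates elements into *ctx.
static void sum_body(int element, void *ctx) { *(int *)ctx += element; }
```

The integer-loop form needs no iterator object at all, which matches the "90% of use cases" point; an explicit `_iter` / `_iter_next` variant would need a separate lowering.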
- BAD No function pointers
  - Allow specifying types like `func(int, int) void` which just get turned into `func__int__int__void` in line with the simple flat type system
  - You can then use functions like values
  - No closures for now; this means that function pointers can be unboxed
  - Only non-ellipsis foreign functions can be used as values
  - We can add closures on top of this easily later
    - How would the upvalues work? In some other languages, all upvalues work as though they were created in the same scope as the closure, so you can have things like a counter in the outer scope
    - Golang deals with this by allocating the upvalues on the heap; we may have to box upvalues too to make this work
    - Could stuff all upvalues into a single (anonymous?) struct and then have the function receive that as the first parameter
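The flattening of `func(int, int) void` into `func__int__int__void` described above can be sketched as plain string building. `flatten_func_type` is a hypothetical helper; how the real compiler would represent argument types is an assumption here (plain type-name strings).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: builds "func__<arg>__...__<ret>" into `out`.
// Returns 0 on success, -1 if the name does not fit in `out_size` bytes.
static int flatten_func_type(const char *const *arg_types, int arg_count,
                             const char *ret_type, char *out, size_t out_size) {
    assert(ret_type != NULL && arg_count >= 0);

    int n = snprintf(out, out_size, "func");
    if (n < 0 || (size_t)n >= out_size) return -1;
    size_t used = (size_t)n;

    // Argument types first, then the return type, each prefixed with "__".
    for (int i = 0; i <= arg_count; ++i) {
        const char *part = (i < arg_count) ? arg_types[i] : ret_type;
        n = snprintf(out + used, out_size - used, "__%s", part);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Because the result is just another flat type name, it can be looked up like any other type in a flat type table.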
- BAD All types are nullable?
- BAD No debugger of any kind
- BAD No language server
  - Thinking I'll just write this one in Python (I know) and bind to Tiny using (autogenerated) ctypes
    - There are a lot of reasonable libraries for LSP in Python and as a first pass this is fine
  - A lot of functions come from C bindings so we need some way to bring those in easily
    - Could have a function which dumps bound functions/symbols to a buffer which the language server can load up and then provide completions for
      - This doesn't work with macros, which are the primary mechanism for e.g. generic data structures
    - Maybe the people embedding Tiny are responsible for compiling a DLL that has a `CreateLanguageServerState` function exposed. Our Python language server loads this DLL, calls that function, and uses the resulting state to figure out what functions/types are exposed.
      - I feel like this is the most correct approach; macros can do a lot of stuff (introduce new types, new functions) in unpredictable ways so there's no getting around some sort of live approach
    - Alternatively, you dump all the definitions every time your executable runs? Then we don't even need to do C bindings at all, we just use the definition JSON?
      - Could be useful for other tools too
      - What happens if the code fails to compile though? Can't dump symbols then
        - I guess that's fine, you need to have a baseline which compiles
      - Just point the LSP to an executable that dumps the symbols you care about for a given file, and the LSP will run it as you make changes
- COOL Trailing commas in function calls are fine! That makes it really easy to do the "parameter pack" thing below kinda generically
- BAD In macros, when we're doing codegen on top of functions, sometimes we want to "forward" parameter packs, and this ends up being really annoying (see `CallWaitMacro` and `Delegate` stuff)
- BAD Would be nice to have a generic AST visitor function that takes in some context
- IDEA Be able to launch other `StateThread`s and cooperatively yield from them
  - They don't share heaps though so what do we do about that?
  - Should we instead just have an actual coroutine/fiber thing that just has a stack?
- IDEA Adopt Zig's `break value` syntax to support expression-oriented blocks

      a := 10
      x := {
          if a > 5 {
              break 10
          } else {
              break 20
          }
      }
- TEST Short circuiting `&&`
- TEST Short circuiting `||`
- TEST Break and continue in for loops
- TEST Got rid of null-terminated strings but didn't really add too many tests
- BAD Right now always crashes if the same symbol is bound twice
- BAD Cannot store any context `void*` with modules/bound functions
- BAD You can return `any`, but you cannot assign `any` to anything; it is more like "unknown" I guess? Except in the event that you're returning it lol???
- BAD Array out of bounds asserts in debug and segfaults in release
- BAD `Tiny_Value` is 16 bytes because we store type tag even though technically we can get away with only boxing "any" or reference types (see the `snow-common` branch)
  - Likely a big performance win
  - Will need typed bytecode
- BAD Cannot cast reference types to any
- BAD Accessing null values causes segfault
- BAD No type aliases (strong)
- BUG The following compiles without error (no checking of whether function returns value)
  - This would require some amount of control flow analysis though (for the non-trivial case)
  - Could just always require the top-level block to contain a return expression? Could catch 90% of these bugs

      func test(): int {}
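The cheap version of this check (require a `return` directly in the function's top-level block, no control flow analysis) could look roughly like the following. The AST shapes (`Stmt`, `FuncDecl`) are invented for the sketch and do not mirror Tiny's real AST.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical, heavily simplified AST.
typedef enum { STMT_RETURN, STMT_OTHER } StmtKind;

typedef struct {
    StmtKind kind;
} Stmt;

typedef struct {
    const char *name;
    bool returns_value; // false for functions declared without a return type
    const Stmt *body;   // top-level statements of the function block only
    size_t body_len;
} FuncDecl;

// Conservative check: a value-returning function must contain a `return`
// directly in its top-level block. Nested blocks are deliberately ignored.
static bool check_returns(const FuncDecl *f) {
    assert(f != NULL);
    if (!f->returns_value) return true;
    for (size_t i = 0; i < f->body_len; ++i) {
        if (f->body[i].kind == STMT_RETURN) return true;
    }
    return false;
}
```

This would flag `func test(): int {}`, but it would also reject a function whose only returns sit inside both branches of an if/else; that is the price of skipping real control flow analysis.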
- BAD No error types/error info
  - Add an "error" primitive type which is just an error code (int? string? its own primitive type?) and a pointer to a context object
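One possible shape for that "error" primitive, plus a result type playing the role of something like `int!`, sketched in C. `ErrorValue`, `IntResult`, `checked_div`, and the error codes are all hypothetical; an int code is just one of the options floated above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical error primitive: a code plus an optional context pointer.
typedef struct {
    int code;      // 0 means "no error" in this sketch
    void *context; // optional extra info, owned by whoever raised the error
} ErrorValue;

// Hypothetical model of an `int!` value: either an int or an error.
typedef struct {
    ErrorValue err;
    int value; // only meaningful when err.code == 0
} IntResult;

enum { ERR_DIV_BY_ZERO = 1, ERR_OVERFLOW = 2 };

static IntResult int_ok(int v) {
    IntResult r = { { 0, NULL }, v };
    return r;
}

static IntResult int_fail(int code, void *context) {
    assert(code != 0);
    IntResult r = { { code, context }, 0 };
    return r;
}

static bool int_is_ok(IntResult r) { return r.err.code == 0; }

// Example fallible operation returning the result type.
static IntResult checked_div(int a, int b) {
    if (b == 0) return int_fail(ERR_DIV_BY_ZERO, "division by zero");
    if (a == INT_MIN && b == -1) return int_fail(ERR_OVERFLOW, "overflow");
    return int_ok(a / b);
}
```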
- BAD No runtime type information on struct object (could probably stuff a `uint16_t` struct ID somewhere at least?)
  - Could even just get rid of the `nfields` since I think that's only used for pretty printing and debug checks
- BAD No interface to allocate/deallocate using the thread's context allocator
- BAD No standard regex
- BAD No syntax highlighting
- BAD Pretty printing `%q` in printf
- BAD No panics??
- BAD Memory unsafety introduced by varargs functions (unless we do runtime type checking)
  - For example, `i64_add_many`
- BAD Sometimes error messages point to the wrong line (off by one?)
- BAD No functions-as-values (not even without captures)
- Could patch this hole with runtime polymorphism but ehhhhh
- Could also technically implement this with a C library haha
- BAD No ranges/range-based loops
- BAD No multiline comments
- BAD No builtin array or dict
- Mainly for type safety; parametric polymorphism (at the library level only?) could solve this
- The library-only parapoly prevents the script code from becoming too complex
- Builtin array or dict will probably cover most use cases though (see Golang)
- BAD No 64-bit integers
- BAD No char type
- BAD No polymorphism of any kind
- BAD We don't pass in alignment to user-provided allocation function
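For the alignment point, a sketch of what an alignment-aware allocation callback could look like, with a bump arena behind it. The `AllocFn` signature is an assumption made up for illustration and is not Tiny's current allocator interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical callback shape: same idea as a user-supplied malloc,
// but the caller also states the alignment it needs.
typedef void *(*AllocFn)(void *userdata, size_t size, size_t align);

typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Arena;

// Bump allocator that rounds the next free address up to `align`.
static void *arena_alloc(void *userdata, size_t size, size_t align) {
    Arena *a = userdata;
    assert(align != 0 && (align & (align - 1)) == 0); // power of two only

    uintptr_t start = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t padding = (size_t)(aligned - start);
    size_t remaining = a->capacity - a->used;

    if (padding > remaining || size > remaining - padding) return NULL;

    a->used += padding + size;
    return (void *)aligned;
}
```

Without the `align` parameter the arena would have to assume the worst-case alignment for every allocation, which wastes space for small objects.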
- Refactor VM from compiler
- First class types (store a "type" which is just an integer in the VM)
- Once "any" is safe, we can have typed bytecode instructions; `OP_ADD_INT`, `OP_EQUAL_STRING` etc
  - Untagged values
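A toy illustration of typed instructions over untagged values: the opcode, not the value, decides how the bits are interpreted. `OP_ADD_INT` and `OP_EQUAL_STRING` come from the note above; `OP_PUSH_CONST`, `OP_HALT`, and the rest of this mini VM are invented for the sketch. It uses C strings for brevity, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Untagged value: no type field, so it is only as wide as its largest member.
typedef union {
    int i;
    bool b;
    const char *s;
} Value;

typedef enum { OP_PUSH_CONST, OP_ADD_INT, OP_EQUAL_STRING, OP_HALT } Op;

typedef struct {
    Op op;
    int operand; // constant index for OP_PUSH_CONST, unused otherwise
} Instr;

// Runs until OP_HALT and returns the value on top of the stack.
// The asserts are debug-only stack checks and vanish under NDEBUG.
static Value run(const Instr *code, const Value *consts) {
    Value stack[16];
    int sp = 0;
    for (const Instr *ip = code;; ++ip) {
        switch (ip->op) {
        case OP_PUSH_CONST:
            assert(sp < 16);
            stack[sp++] = consts[ip->operand];
            break;
        case OP_ADD_INT: // both operands are known to be ints at compile time
            assert(sp >= 2);
            stack[sp - 2].i = stack[sp - 2].i + stack[sp - 1].i;
            --sp;
            break;
        case OP_EQUAL_STRING: {
            assert(sp >= 2);
            bool eq = strcmp(stack[sp - 2].s, stack[sp - 1].s) == 0;
            stack[sp - 2].b = eq;
            --sp;
            break;
        }
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}
```

This only stays safe if the compiler never emits `OP_ADD_INT` for operands it cannot prove are ints, which is why "any" has to be made safe first.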
- Function overloading
- "Method" sugar (first argument to func matches type of x => x.func())
- Do not `exit` anywhere in the library; user code must be able to handle errors
- Less haphazard allocation: Allow the user to supply a malloc, use Arenas
- Function return type inference (Kind of unsafe at times tho)
- Interfaces? There is currently no way of doing polymorphism and I think the Golang approach to interfaces is pretty nice:

      interface Stringable
      {
          to_string(): str
      }

      struct Vec2
      {
          x: float
          y: float
      }

      func to_string(v: Vec2): str {
          return strcat(ntos(v.x), ",", ntos(v.y))
      }

      func stringify(s: Stringable): str {
          return s.to_string()
      }

      func thing(v: Vec2) {
          // This still works because there might be functionality that was created to work with certain
          // interfaces that you don't want to repeat for your particular struct
          v.stringify()
      }
- Anonymous functions (closures)

HARD TO IMPLEMENT AND MAYBE OVERSTEPPING THE BOUNDARIES OF A SCRIPTING LANGUAGE:

- SIMPLE GENERICS (See how C# does it):

      // NOTE: This is not a C++ template; the types are just used for type-checking at the call site;
      // the use of a generic type inside the body of the function is simply not allowed (i.e. trying to
      // access a property on an object of that type for example)
      // SYNTAX FOR COMPILE-TIME PARAMETER IS NOT FINAL
      func get(e: entity, $t): t { ... }

      // e is an entity; this will produce an ItemComponent; cast is not necessary
      e.get(ItemComponent)

  Of course, if it's implemented at the Tiny level, then generics must also be implemented at the C level:

      // $$t means pass in the actual type value as well (so we can inspect it in the C code to get the appropriate component for example)
      Tiny_BindFunction(state, "get(entity, $$t): t", Lib_GetComponent);

      Tiny_RegisterType(state, "array($t)");

      // $t is matched with the t of the array passed in
      Tiny_BindFunction(state, "get(array($t), int): t");
- BAD No array syntax, no way to iterate arrays
- Make `CurTok` local or at least thread local
- BAD Cannot access type information in C binding functions
  - Would be mega useful to make things type-safe
  - For example, could allow you to succinctly define a `delegate` binding
    - Just need the function name, nothing else
- BAD No ternary operator
- IDEA ReScript style pipe operator

      arr := array_int()

      // Equivalent to array_int_push(arr, 10)
      arr->array_int_push(10)
      arr->array_int_push(20)

      arr->array_int_get(10)
- BAD Cannot do `.` subscript on cast expression values
- BAD No designated struct init
- BUG Comparing structs does nothing
- BUG Assigning to arguments doesn't seem to work, repro

      func mutate_arg(i: int) {
          i = 10
          // Prints 5
          printf("%i\n", i)
      }

      mutate_arg(5)
- BAD NULL TERMINATED STRINGS??
- BUG The following compiles without error

      func split_lines(s: str): array {
          return ""
      }
- BAD `||` does not short circuit
- BAD `%c` is not handled as a format specifier
- BUG `&&` doesn't short circuit?
- BUG `continue` doesn't seem to run the "step" part of the for loop
- Make sure VM bytecode instructions and data are aligned properly
- Safer "any" type: must be explicitly converted
- NOTE Slices no longer needed since index sugar added
- BAD No slices! I think the quickest way to support indexing arbitrary sequences is to allow them to produce a slice
  - Actually this may not be that great after all because this necessitates having generic types (e.g. one for slice)
  - What if I just allow native values to be indexable (along with string)? Just add `getIndex` and `setIndex` to `Tiny_NativeProp`
  - Alternatively (and this works better with our macro system) allow for operator overloading (maybe just the indexing operator)
    - Something like `func __index_str(s: str, i: int) { return stridx(s, i) }`, nothin fancy

          "hello"[0] -> stridx("hello", 0)

          module M = struct
            let ( + ) a b = add a b
          end
          open M
          let open M in 10 + 10

          array_push(a, 10)
          a->array_push(10) // turns
    - Would be nice if we inlined these (a function that just consists of returning the result of another function)
    - Maybe for now only allow adding operator overloads in C (but they should be `BindFunction`, not via `NativeProp`; maybe `BindOperator`)
    - In order to make it explicit that you're calling an overloaded operator, we can have them be like `str.[0]` (note the `.`). We can do this for `+` and `-` so `a .+ b` (this could also solve the problem of which side we get the operator from like `+.` would mean grab the overload from the rhs, and so on)