A versatile command line tool to deal with streams, with a (mostly point-free) functional approach.
State: working and usable, but I want to 'compile' the script, that
is, lower it to a linear sequence of instructions. The current way
of interpreting is very wasteful. Expect it to be broken every other day!
Also see the 'wip and such' below.
Similarly to commands such as awk or sed, sel takes
a script to apply to its standard input to produce its
standard output.
In its most basic form, the script given to sel is
a series of functions separated by , (comma). See the
complete syntax below. In this way, each function
transforms its input and passes its output to the next one
(- is the function that returns the input stream):
$ printf 12-42-27 | sel -, split :-:, map [add 1], join :-:
13-43-28
$ printf abc | sel -, codepoints # same as 'sel codepoint -'
97
98
99
When the first argument names a file starting with #!
the file is read and parsed first. Any additional arguments
are also parsed in continuation of the script.
$ cat pred.sel
#!/usr/bin/env sel
sub 1
$ sel pred.sel 5
4
TLDR:
- lists: {0b1, 0o2, 0x3, 4.2}, strings: :hi how you:
- my-func first-arg [add 1 2] third-arg
- f, g is g(f(..)) (or pipe f g or [flip compose] g f)
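The equivalence between `f, g`, `pipe f g` and `[flip compose] g f` can be illustrated outside of sel. The following is a minimal Python sketch, not sel's implementation: `pipe`, `compose` and `flip` are written here as plain Python helpers, and the three lambdas are hypothetical stand-ins for `split :-:`, `map [add 1]` and `join :-:`.

```python
from functools import reduce

def pipe(*funcs):
    """Left-to-right composition: pipe(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)

def compose(g, f):
    """Right-to-left composition: compose(g, f)(x) == g(f(x))."""
    return lambda x: g(f(x))

def flip(fn):
    """Swap the two arguments of a binary function."""
    return lambda a, b: fn(b, a)

# Hypothetical Python stand-ins for: split :-:, map [add 1], join :-:
split_dash = lambda s: s.split("-")
map_add_one = lambda items: [str(int(item) + 1) for item in items]
join_dash = lambda items: "-".join(items)

script = pipe(split_dash, map_add_one, join_dash)
print(script("12-42-27"))  # 13-43-28
```

Reading the comma as "then" gives the same left-to-right order as `pipe`.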
Special characters and keywords:
, : = [ ] def let use { }
3 special forms:
- def name :description: value will define a new name that is essentially replaced by value wherever it is used;
- use :some/file: f will 'import' all the defined names from the file as f-<name> (top level not evaluated);
- let pattern result fallback will make a function of one argument that computes result if pattern matches; pattern can introduce names (eg let {a, b,, rest} [add a b] 0, where the ,, rest matches the rest of the list); if pattern is irrefutable then there is no fallback.
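As a rough illustration of the `let` form, here is what `let {a, b,, rest} [add a b] 0` could correspond to in Python. This is a sketch under assumptions: the function name is made up, and it assumes the pattern needs at least two items to match, with `rest` taking whatever is left.

```python
def let_sum_first_two(value, fallback=0):
    """Hypothetical Python reading of: let {a, b,, rest} [add a b] 0"""
    # pattern {a, b,, rest}: assumed to need at least two items
    if isinstance(value, list) and len(value) >= 2:
        a, b, *rest = value  # rest is bound to the remaining items
        return a + b         # result: [add a b]
    return fallback          # fallback: 0, used when the pattern does not match

print(let_sum_first_two([1, 2, 3, 4]))  # 3
print(let_sum_first_two([1]))           # 0
```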
Here is the complete syntax:
top ::= {'use' <bytes> <word> ','} {'def' <word> <bytes> <value> ','} [<script>]
script ::= <apply> {',' <apply>}
apply ::= (<binding> | <value>) {<value>}
value ::= <atom> | <subscr> | <list> | <pair>
binding ::= 'let' (<irrefut> <value> | <pattern> <value> <value>)
irrefut ::= <word> | <irrefut> '=' <irrefut>
pattern ::= <atom> | <patlist> | <patpair>
patlist ::= '{' [<pattern> {',' <pattern>} [',' [',' <word>]]] '}'
patpair ::= (<atom> | <patlist>) '=' <pattern>
atom ::= <word> | <bytes> | <number>
subscr ::= '[' <script> ']'
list ::= '{' [<apply> {',' <apply>} [',' [',' <apply>]]] '}'
pair ::= (<atom> | <subscr> | <list>) '=' <value>
word ::= /[-a-z]+/
bytes ::= /:([^:]|::)*:/
number ::= /0b[01]+/ | /0o[0-7]+/ | /0x[0-9A-Fa-f]+/ | /[0-9]+(\.[0-9]+)?/
comment ::= '#-' <balanced> [','] | '#' /.*/ '\n'
balanced ::= '[' <matched> ']' | '{' <matched> '}' | <bytes> | /[^\t\n\f\r ]+/
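To make the `bytes` and `number` rules concrete, the two regular expressions from the grammar can be tried directly in Python. This is only a sketch: `parse_number` and `parse_bytes` are hypothetical helpers, and reading `::` as an escaped literal colon is an assumption drawn from the shape of the regex, not something stated above.

```python
import re

# Regular expressions copied from the 'number' and 'bytes' grammar rules.
NUMBER = re.compile(r"0b[01]+|0o[0-7]+|0x[0-9A-Fa-f]+|[0-9]+(\.[0-9]+)?")
BYTES = re.compile(r":([^:]|::)*:")

def parse_number(text):
    """Turn a number literal into a Python int or float."""
    if NUMBER.fullmatch(text) is None:
        raise ValueError(f"not a number literal: {text!r}")
    if text[:2] in ("0b", "0o", "0x"):
        return int(text, 0)  # base 0 lets int() honor the prefix
    return float(text) if "." in text else int(text)

def parse_bytes(text):
    """Strip the delimiters; '::' is assumed to stand for one ':'."""
    if BYTES.fullmatch(text) is None:
        raise ValueError(f"not a bytes literal: {text!r}")
    return text[1:-1].replace("::", ":")

print([parse_number(t) for t in ("0b1", "0o2", "0x3", "4.2")])  # [1, 2, 3, 4.2]
print(parse_bytes(":hi how you:"))  # hi how you
```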
The objective here was to make it possible to type the script plainly in any (most?) shell without worrying about quoting much if at all:
- the script can span multiple arguments, they are joined naturally with a single space
- the single and double quotes are not used, so to feel safer the whole script can be quoted
One case which can cause problems is lists ({ .. }), which
can be interpreted by the shell as a glob (brace expansion) if
they do not contain a space. For that reason, it is highly
recommended to keep the space after the , separating items.
Type notations are inspired by Haskell:
- number and bytestring: Num and Str;
- list: [a];
- function: a -> b, when b is itself a function it will be a -> x -> y, but when a is a function then it is (x -> y) -> b;
- pair: (a, b).
Lists and bytestrings can take a + suffix (eg. Str+
and [Num]+) which represents a potentially unbounded
value (simplest example is repeat 1 :: [Num]+, an
infinite list of 1s).
The item type of a literal list is inferred as the list is parsed:
- {1, 2, 3, :soleil:} not ok because inferred as [Num]
- {repeat 1, {1}} :: [[Num]+] ok because {1} can 'lose' its bounded characteristic safely
- {{1}, repeat 1} not ok because inferred as [[Num]] at the first item and repeat 1 can never 'lose' its unbounded characteristic safely
TODO: idk if this is still the case
The CLI -t option will give the type of the expression.
When a direct function argument doesn't match the parameter, one of these functions is automatically inserted:
| wanted | true type | inserted |
|---|---|---|
| Num | Str+ | , tonum, |
| Str | Num | , tostr, |
| [Num]+ | Str+ | , codepoints, |
| [Str]+ | Str+ | , graphemes, |
| Str+ | [Str+]+ | , ungraphemes, |
| Str+ | [Num]+ | , uncodepoints, |
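The table can be read as a lookup keyed on the (wanted, true type) pair. Below is a minimal Python sketch of that reading. It is not how sel implements it: `inserted_for` is a hypothetical helper, and types are compared as plain strings, which ignores any rule letting a bounded value stand in for an unbounded one.

```python
# (wanted, true type) -> function inserted between the two, as in the table.
COERCIONS = {
    ("Num", "Str+"): "tonum",
    ("Str", "Num"): "tostr",
    ("[Num]+", "Str+"): "codepoints",
    ("[Str]+", "Str+"): "graphemes",
    ("Str+", "[Str+]+"): "ungraphemes",
    ("Str+", "[Num]+"): "uncodepoints",
}

def inserted_for(wanted, actual):
    """Return the name of the inserted function, or None if there is none."""
    if wanted == actual:
        return None  # types already agree, nothing to insert
    return COERCIONS.get((wanted, actual))

print(inserted_for("Num", "Str+"))     # tonum
print(inserted_for("[Num]+", "Str+"))  # codepoints
print(inserted_for("Num", "[Num]+"))   # None
```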
There is also a (for now temporary) behavior on the output depending on the type:
- Num: printed with a newline
- (a, b): printed with a tabulation between the two and a newline
- Str: printed as is
- [a]: printed with a newline after each entry
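The output rules above can be mimicked in a few lines of Python. This is a sketch only: `render` is a hypothetical helper, Python tuples, lists and strings stand in for sel pairs, lists and bytestrings, and how nested values are printed is not specified above, so entries are simply converted with Python's own string formatting.

```python
def render(value):
    """Format a value following the per-type output rules."""
    if isinstance(value, tuple) and len(value) == 2:
        a, b = value
        return f"{a}\t{b}\n"  # pair: tab between the two, then a newline
    if isinstance(value, list):
        return "".join(f"{entry}\n" for entry in value)  # newline after each entry
    if isinstance(value, str):
        return value  # Str: printed as is
    return f"{value}\n"  # Num: printed with a newline

print(render([97, 98, 99]), end="")
print(render((1, "one")), end="")
```

With the list `[97, 98, 99]` this gives one number per line, which lines up with the codepoints example earlier in this document.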
The existing functions can be queried with -l:
$ sel -l
[... list of every functions ...]
$ sel -l map add
map :: (a -> b) -> [a]+ -> [b]+
make a new list by applying an unary operation to each value from a list
add :: Num -> Num -> Num
add two numbers
$ sel -l :: 'a -> Num'
[... list of matching functions ...]
Python, Haskell, Rust, KDL, jq, tree-sitter, dt, Helix
- try to free indices that are not used
- polish for cases such as 2 as being distinct
- ex of inf type (a -> a) -> a <- (b -> Num) -> b
- something about pseudo syntaxes in named type ('paramof', 'returnof', 'a=b', also '?' and '?abc')
something like $PYTHONSTARTUP, between prelude and user script
process description of defs (eg. markdown-ish?)
maybe name for var types in there
- {1, 2, 3}, map ln could tostr in map
- split :-:, map [add1] could tonum in map
- add 1, tonum could tostr in between
- constant folding; because pure, identify what is not compile-time known:
  - can fold: literal (numbers, bytestrings), a list if all items can be folded, a call if all arguments are provided and can be folded
  - cannot fold: input, infinite sources cannot be turned into a finite structure but can still be expressed statically, 'control-point' functions and/or functions with side effects if ever
- thunks? but I'm wondering if there is a way to even more directly put the instruction at the location at c-time rather than packing them at r-time
- lifetime tracking, or maybe 'duplication tracking'
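The 'can fold' / 'cannot fold' rule from the constant folding note can be written down as a recursive predicate. The sketch below is in Python and purely illustrative: the `Lit`, `Lst` and `Call` node types, the `arity` field and the `UNFOLDABLE` set are all assumptions made for the example, not sel's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class Lit:
    value: object  # a number or bytestring literal

@dataclass
class Lst:
    items: list  # a literal list of nodes

@dataclass
class Call:
    name: str
    arity: int  # number of arguments the function expects
    args: list  # arguments provided so far

# Hypothetical names: the input stream and an infinite source.
UNFOLDABLE = {"input", "repeat"}

def can_fold(node):
    """True when the node is fully known at compile time."""
    if isinstance(node, Lit):
        return True
    if isinstance(node, Lst):
        return all(can_fold(item) for item in node.items)
    if isinstance(node, Call):
        if node.name in UNFOLDABLE:
            return False
        # a call folds only if every argument is provided and folds too
        return len(node.args) == node.arity and all(can_fold(a) for a in node.args)
    return False

print(can_fold(Call("add", 2, [Lit(1), Lit(2)])))  # True
print(can_fold(Call("add", 2, [Lit(1)])))          # False
```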
the GitHub "Need inspiration?" bit was "super-spoon"