.\" Manpage for specs.
.\" Open an issue at https://github.com/yoavnir/specs2016 to correct errors or typos
.mso www.tmac
.TH man 1 "1 May 2026" "0.9.9" "specs man page"
.SH NAME
specs \- a text processing tool
.SH SYNOPSIS
specs [switches] spec-units
.SH DESCRIPTION
.B specs
is a command line utility for parsing and re-arranging text input. It allows re-alignment of fields, some format conversion, and re-formatting multiple lines into single lines or vice versa. Input comes from standard input, and output flows to standard output.
.P
This is a rewrite of the specs pipeline stage from CMS, with substantial changes.
.P
This version is liberally based on the CMS Pipelines User's Guide and Reference
.URL "https://www.vm.ibm.com/library/740pdfs/74625200.pdf"
-- especially chapters 16, 24, and 20.
.SS "Spec Units"
.B Spec Units
are the building blocks of a
.B specs
.I specification.
Each spec unit tells the
.B specs
engine to perform some action. The most common spec unit is a
.I data field
which consists of five arguments, three of which may be omitted:
.P
.RS 5
[fieldIdentifier] InputPart [conversion] OutputPart [alignment]
.RE
.P
A
.B fieldIdentifier
is a single letter followed by a colon (like
.I a:
), that maps to the input or output of a single data field unit for later reference such as
in later data fields or in
.I expressions.
If the fieldIdentifier is at the start of the data field unit, it contains the input. If it
takes the place of the OutputPart, it contains the output. For example:
.RS 5
a: w1 ucase b:
.RE
.I a
contains the first word of the input, while
.I b
contains an uppercase version of
.I a.
.P
The
.B InputPart
argument may be any of the following:
.RS 3
.P
A range of characters, such as `5`, `3-7`, or `5.8`, the last one indicating 8 characters starting in the 5th position. Note that the indexing of characters is 1- rather than 0-based.
.P
A range of words, such as `w5` or `words 5-7`, where words are separated by one or more `wordseparator` characters -- locale-defined whitespace by default. The word indexing is 1-based.
.P
A range of fields, such as `fields 5` or `f5-7`, where fields are separated by exactly one `fieldseparator` character -- a tab by default. The field indexing is 1-based.
.P
.B TODclock
-- A 64-bit formatted timestamp, giving microseconds since the Unix epoch.
.P
.B DTODclock
-- A 64-bit formatted timestamp, giving microseconds since the Unix epoch. The difference is that TODclock shows the time when this run of
.I specs
began, while DTODclock gives the time of producing the current record.
.P
.B NUMBER
or
.B RECNO
-- A record counter as a 10-digit decimal number.
.P
.B TIMEDIFF
-- A 12-digit decimal number indicating the number of microseconds since the invocation of the program.
.P
The
.B ID
keyword followed by a previously defined
.B FieldIdentifier.
.P
The
.B PRINT
keyword followed by a calculated expression.
.B PRINT
can be replaced by a question mark (
.B ?
) either as its own command-line token or prefixing the calculated expression.
.P
A string literal, optionally enclosed by delimiters, such as `/TODclock/` or `'NUMBER'`. Note that passing the single quotes through on the command line requires enclosing them in double quotes.
.P
A
.B SUBSTring
of another InputPart.
.RE
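.P
As a rough illustration of the 1-based character ranges described above, the following Python sketch maps the forms `5`, `3-7`, and `5.8` onto string slices. The function name
.I char_range
is hypothetical and is not part of
.B specs;
only the three range forms shown above are handled.
.nf
```python
def char_range(record, spec):
    """Return the characters of record selected by a 1-based range spec.

    Handles only 'N' (one character), 'A-B' (positions A through B) and
    'A.L' (L characters starting at position A). Illustration only.
    """
    if "-" in spec:
        start, end = (int(part) for part in spec.split("-"))
    elif "." in spec:
        start, length = (int(part) for part in spec.split("."))
        end = start + length - 1
    else:
        start = end = int(spec)
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return record[start - 1:end]


print(char_range("abcdefghijklmnop", "5"))    # e
print(char_range("abcdefghijklmnop", "3-7"))  # cdefg
print(char_range("abcdefghijklmnop", "5.8"))  # efghijkl
```
.fi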
The
.B OutputPart
argument specifies where to put the source:
.RS 3
.P
Absolute position (such as `1`) with no limit on output length.
.P
A range (such as `1-5` or `1.5`).
.P
.B N
or
.B NEXT
for placing the output immediately after the previous output.
.P
.B NW
or
.B NEXTWORD
for placing the output following a space character after the previous output.
.P
.B NF
or
.B NEXTFIELD
for placing the output following a tab character after the previous output.
.P
A field identifier specification such as
.B a:
or
.B C:
\&.
.RE
The alignment argument can be `l`, `c`, or `r`, for
.BR left ,
.BR center ,
and
.B right
respectively. Yes, you can use the entire word. In fact, for compatibility with other countries, you can even spell it
.B centre.
.RE
The
.B OutputPart
argument can also be specified as a
.B Composed Output Placement
argument. It is enclosed in mandatory parentheses and contains 1, 2, or 3 comma-separated expressions. The expressions are evaluated at each cycle to produce the starting column, the field width, and the alignment of the field.
.P
With one expression, that expression is the starting column. For example:
.P
specs w1 (@cols)
.P
places the first word at the column given by the
.I @cols
label.
.P
With two expressions, they are the starting column and the width. A width of zero is a special value denoting the natural width of the output. For example:
.P
specs w1 (1,10)
.P
places the first word in columns 1-10.
.P
With three expressions, the third expression is evaluated as a string and determines the alignment. If that value begins with a capital or lower-case
.I c
the field is centered; if it begins with a capital or lower-case
.I r
the field is right-aligned; otherwise it is left-aligned. For example:
.P
specs /hello/ (1,@cols,'R')
.P
right-aligns the string in the available display width.
.P
If the alignment field is two characters in length, the second character is a digit between
.I 1-5,
and the content is longer than the output field, the truncation will use an ellipsis as follows: For a value of
.I 1,
.B specs
will output an ellipsis followed by the suffix of the content; for
.I 5,
.B specs
will output a prefix followed by an ellipsis; for
.I 2, 3,
and
.I 4,
.B specs
will output a prefix, an ellipsis and a suffix, with the prefix being one third, one half, or two thirds of the output string respectively.
.P
If a composed output placement argument appears in a particular data field, neither a regular Output Placement nor an Alignment argument may appear. The first arguments in a composed output placement can be elided, as in
.I (,,'R').
If that is done, the default for the output start is the next place, equivalent to the function
.I next(),
while the default for the second argument is the rest of the line, based on the
.I @cols
label. In other words, omitted arguments default the same way they would if the field were placed at the next available position and allowed to use the remaining width.
.P
The conversion argument can specify any of the following conversions:
.IP "rot13" 3
Encrypts the bytes using the ROT-13 cipher.
.IP "C2B" 3
Converts characters to binary: "AB" --> "0100000101000010".
.IP "C2X" 3
Converts characters to hexadecimal: "AB" --> "4142".
.IP "B2C" 3
Converts binary to characters: "0100000101000010" --> "AB". Will throw an exception if called with an invalid character.
.IP "X2CH" 3
Converts hexadecimal to characters: "4142" --> "AB". Will throw an exception if called with an invalid character.
.IP "b2x" 3
Converts binary data to hex.
.IP "D2X" 3
Convert decimal to hex: "314159265" --> "12b9b0a1".
.IP "X2D" 3
Convert hex to decimal: "12b9b0a1" --> "314159265".
.IP "ucase" 3
Converts text to uppercase.
.IP "lcase" 3
Converts text to lowercase.
.IP "BSWAP" 3
Byte swap: reverses the order of bytes, so "AB" --> "BA".
.IP "ti2f" 3
Convert internal time format (8-byte microseconds since the epoch) to printable format using the conventions of strftime, plus %xf for fractional seconds, where x represents number of digits from 0 to 6.
.IP "tf2i" 3
Convert printable time format to the internal 8-byte representation.
.IP "s2tf" 3
Convert a decimal number with up to six decimal places, representing seconds since the epoch, to printable format using the conventions of strftime, plus %xf for fractional seconds, where x represents number of digits from 0 to 6.
.IP "tf2s" 3
Convert printable time format to a decimal number, representing seconds since the epoch.
.IP "mcs2tf" 3
Convert a number, representing microseconds since the epoch, to printable format using the conventions of strftime, plus %xf for fractional seconds, where x represents number of digits from 0 to 6.
.IP "tf2mcs" 3
Convert printable time format to a number, representing microseconds since the epoch.
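.P
The worked examples given for some of the conversions above can be reproduced with a few lines of Python. This is only a sketch of the documented input/output pairs, assuming ASCII input; the function names are hypothetical stand-ins for the conversion keywords and do not describe how
.B specs
implements them.
.nf
```python
def c2x(text):
    """Characters to hexadecimal digits."""
    return text.encode("ascii").hex()


def d2x(decimal_text):
    """Decimal digits to lowercase hexadecimal digits."""
    return format(int(decimal_text), "x")


def x2d(hex_text):
    """Hexadecimal digits to decimal digits."""
    return str(int(hex_text, 16))


def bswap(text):
    """Reverse the order of the bytes (characters, for ASCII input)."""
    return text[::-1]


print(c2x("AB"))         # 4142
print(d2x("314159265"))  # 12b9b0a1
print(x2d("12b9b0a1"))   # 314159265
print(bswap("AB"))       # BA
```
.fi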
.SS "Other Spec Units"
Other spec units may also be used:
.IP "READ" 3
Causes the program to read the next line of input. If we have already read the last line, the read line is taken to be the empty string.
.IP "READSTOP" 3
Causes the program to read the next line of input. If we have already read the last line, no more processing is done for this iteration.
.IP "UNREAD" 3
Causes the program to push the current active record back to the reader, so that the next iteration of the specification or the next
.I READ
or
.I READSTOP
spec unit will read it back. This is useful when looping with READ searching for the next line to run the specification on.
.IP "WRITE" 3
Immediately writes the current output record and resets the output buffer to empty so that subsequent spec units in the same cycle can build a new output record.
.IP "NOWRITE" 3
Suppresses the output record for the current processing cycle. The specification continues to execute normally, but no output record is written at the end of the cycle.
.I NOPRINT
is a synonym for
.I NOWRITE.
.IP "NOPRINT" 3
A synonym for
.I NOWRITE.
.IP "ASSERT" 3
Followed by a condition, this keyword halts the specification immediately if the condition evaluates to
.B false
or zero. The normal use of assertions is to make sure that the read data or the internal state makes sense.
.IP "ABEND" 3
The
.I ABnormal END
keyword is used to abruptly end the run of the specification. The token that follows the ABEND token is output in the error message. ABEND units make sense only within conditionals, as they are never part of the normal processing of a record.
.IP "WORDSEPARATOR and FIELDSEPARATOR" 3
Declares a string of characters to be the word or field separators respectively, which affects word and field ranges. For
.I WORDSEPARATOR
it is possible to use the special value
.I default
to make all whitespace defined by the locale work as a word separator. This keyword can also be used locally in a
.I SUBSTRING
spec unit.
.IP "REDO" 3
Causes the current output line to become the new input line.
.IP "SPLITW" 3
Splits the current input record by words and produces one output record for each word.
Any spec units before the
.B SPLITW
form a prefix that is replicated in each output record.
Any spec units after the
.B SPLITW
(such as
.B REDO
) are applied to each output record individually.
An optional
.B WORDSEPARATOR
may follow the
.B SPLITW
keyword to specify a custom word separator for splitting.
An optional
.B OF
clause may follow to specify which part of the input record to split. The
.B OF
clause accepts the same input parts as
.B SUBSTRING:
a character range, a word range, or a field range.
.P
Example:
.P
echo "one two three" | specs splitw 1
.P
produces three output records: "one", "two", "three".
.P
echo "one two three" | specs 'prefix:' 1 splitw nextword
.P
produces: "prefix: one", "prefix: two", "prefix: three".
.P
echo "a:b:c are items" | specs splitf fs : of word 1 1
.P
produces: "a", "b", "c" (splitting only the first word by field separator).
.P
Nested
.B SPLITW
or
.B SPLITF
units in the same specification are not allowed.
.IP "SPLITF" 3
Splits the current input record by fields and produces one output record for each field.
Works like
.B SPLITW
but splits by field separator instead of word separator. Empty fields are preserved.
An optional
.B FIELDSEPARATOR
may follow the
.B SPLITF
keyword to specify a custom field separator for splitting.
An optional
.B OF
clause may follow to specify which part of the input record to split, accepting the same input parts as
.B SUBSTRING.
.IP "SET" 3
Followed by a single assignment operation or multiple assignment operations separated by semicolons, SET performs the assignments, storing the resulting values in the numbered counters. See the Expressions and Assignments section.
.IP "CONTINUE" 3
Stops execution of the current cycle immediately and goes on to the next record. Any output already accumulated for the current cycle remains in the output buffer, and is handled by the normal end-of-cycle logic: it is written, unless suppressed by
.B NOWRITE
or by
.B PRINTONLY,
and if
.B PRINTONLY KEEP
is in effect, the buffered output may be carried forward instead of being discarded. This is typically used within an
.B IF
block.
.IP "SKIP-WHILE and SKIP-UNTIL" 3
These statements are one-time gates for the beginning of a specification. They cause the specification to skip input lines
.B while
a condition is true, or
.B until
the condition becomes true. More precisely, each of them stops the current cycle and goes on to the next record, similar to the
.B CONTINUE
unit, but once the pass condition has occurred, the condition is never evaluated again and future records are passed unchecked. It usually only makes sense to place
.B SKIP-WHILE
and
.B SKIP-UNTIL
units at the beginning of the specification.
.P
For example:
.P
specs w1 a: SKIP-WHILE "a<20200701" 1-* 1
.P
skips records until the first record whose first word is at least 20200701, then processes that record and all subsequent records normally.
.P
Similarly:
.P
specs w1 a: SKIP-UNTIL "a=BEGIN" 1-* 1
.P
skips records until the first record whose first word is BEGIN, then processes that record and all subsequent records normally.
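.P
The one-time gate behaviour can be sketched in Python with
.I itertools.dropwhile,
which likewise stops testing its condition after the first record that passes. The helper names and sample records below are hypothetical; the sketch models only the gating, not the rest of a specification.
.nf
```python
from itertools import dropwhile


def skip_while(records, condition):
    """Drop leading records while condition holds, then pass the rest unchecked."""
    return list(dropwhile(condition, records))


def skip_until(records, condition):
    """Drop leading records until condition first holds, then pass the rest."""
    return list(dropwhile(lambda record: not condition(record), records))


dates = ["20200629 a", "20200701 b", "20200615 c"]
# Mirrors SKIP-WHILE "a<20200701": the late 20200615 record still passes,
# because the gate is never evaluated again once it has opened.
print(skip_while(dates, lambda r: int(r.split()[0]) < 20200701))

lines = ["junk", "BEGIN here", "junk again"]
# Mirrors SKIP-UNTIL "a=BEGIN".
print(skip_until(lines, lambda r: r.split()[0] == "BEGIN"))
```
.fi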
.SS "MainOptions"
These are optional spec units that appear at the beginning of the specification and modify the behavior of the entire specification.
.IP "STOP" 3
This option is followed by either the keyword
.B ALLEOF,
the keyword
.B ANYEOF,
or a number indicating an input stream. This indicates when the specification stops. The default is
.B ALLEOF
which means that the specification terminates when every input stream is exhausted. When some but not all of the streams are exhausted, the exhausted streams are treated as if they emit empty records. With
.B ANYEOF
the specification terminates when any of the streams is exhausted. With a numeric value the specification terminates when the specified stream is exhausted. Other streams, if exhausted, are treated as if they emit empty records.
.IP "PRINTONLY" 3
This option instructs
.B specs
to suppress output records unless a specified
.I break level
is established. The break level is either a field identifier (case matters) or it can be the keyword
.B EOF
which specifies records are suppressed until the input is exhausted or the condition specified with
.B STOP
is satisfied.
.IP "KEEP" 3
This option, always following
.B PRINTONLY
instructs
.B specs
not to reset the output buffer when a record is not output because the break level has not been established. This allows the content from several records to be aggregated into a single output record.
.SS "Conditions and Loops"
A specification can include conditions and loops.
Conditions begin with the word
.B if
followed by an
.B expression
that evaluates to true or false, followed by the token
.B then,
followed by some
.B Spec Units.
Those will be executed only if the condition evaluates to true. They may be followed by an
.B else
token followed by more
.B Spec Units
that will be executed if the expression is not true, or they may be followed by an
.B elseif
token with its own condition,
.B then
token, and set of
.B Spec Units.
The chain of
.B elseif
tokens may be arbitrarily long, but there may be at most one
.B else
token. The conditional block ends with an
.B endif
token. For example:
.RS 5
.B if
#2 > 5
.B then
.RS 4
/big/ 1
.RE
.B elseif
#2 > 3
.B then
.RS 4
/medium/ 1
.RE
.B else
.RS 4
/small/ 1
.RE
.B endif
.RE
The loop available in
.B specs
is a
.B while
loop. It begins with the
.B while
token, followed by an
.B expression
that evaluates to true or false, followed by the token
.B do,
and a series of
.B Spec Units
that will be executed as long as the expression evaluates to true. The series of
.B Spec Units
is terminated by the token
.B done.
Example:
.RS 5
.B while
#2 > 0
.B do
.RS 4
print /#2/ 1
.RS 1
.RE
write
.RS 1
.RE
set /#2 -= 1/
.RE
.B done
.RE
.SS "While-Guard"
The
.I While-Guard
feature, available from version 0.9.5, makes an effort to prevent
.B specs
from entering endless loops. For example, consider this specification:
.RS 5
.B set
"#1 := 5"
.RS 1
.RE
.B while
#1 > 3
.B do
.RS 4
.B set
"#1 += 1"
.RE
.B done
.RE
Without
.I while-guard
this specification will loop forever. To solve this,
.B specs
keeps a counter for each
.I while
statement that it increments each time the loop is entered. This counter is reset to zero if a record is read from any input. The
.B specs
program exits when the counter reaches 5000.
.I While-Guard
is not perfect. To disable it, you can use the command-line switch
.B --no-while-guard
or you can override the maximum iteration count at which the program exits by setting the
.B while-guard-limit
to some integer value.
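.P
The counting scheme described above can be sketched in Python. The class and function names are hypothetical, and the sketch assumes the guard fires when the counter reaches the limit exactly; it is an illustration of the idea rather than the
.B specs
implementation.
.nf
```python
class WhileGuardError(RuntimeError):
    """Raised when a loop runs too long without reading any input."""


class WhileGuard:
    """One guard per while statement, as described in the text."""

    def __init__(self, limit=5000):
        self.limit = limit
        self.count = 0

    def enter_loop(self):
        """Count one loop entry and fail once the limit is reached."""
        self.count += 1
        if self.count >= self.limit:
            raise WhileGuardError("while loop reached %d iterations" % self.limit)

    def record_read(self):
        """Reading a record counts as progress, so the counter restarts."""
        self.count = 0


def run_example(limit=5000):
    """Run the non-terminating loop from the text under a guard."""
    guard = WhileGuard(limit)
    counter = 5               # set "#1 := 5"
    try:
        while counter > 3:    # while #1 > 3 do
            guard.enter_loop()
            counter += 1      # set "#1 += 1"
    except WhileGuardError:
        return guard.count
    return None


print(run_example())  # 5000
```
.fi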
.SS "Control Breaks"
The
.B Field Identifiers
have another use. When used with the
.B BREAK
keyword or the
.B break()
function they act as flow control. A
.B break level
is
.I established
when the value of the corresponding field identifier changes from the previous iteration. For
example, consider a three-field CSV file where the first field is the department name, the second is
the employee's first name, and the third is his or her last name. Suppose further that the file is
sorted by department name. You can print this out without repeating the department name like this:
.RS 5
.nf
FIELDSEPARATOR ,
c: FIELD 1 .
FIELD 3 10
/,/ NEXT
FIELD 2 NEXTWORD
BREAK c
ID c 1
.fi
.RE
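.P
The effect of this control break can be approximated in Python: the department is shown only on records where its value differs from the previous record. The function name and sample rows are hypothetical, and the sketch assumes that a break is also established on the very first record and that department names fit in nine columns.
.nf
```python
def print_departments(lines):
    """Format department,first,last rows, showing each department once."""
    output = []
    previous = None
    for line in lines:
        department, first, last = line.split(",")
        # The break is established when the department differs from the
        # value seen on the previous record.
        shown = department if department != previous else ""
        previous = department
        # Department in columns 1-9, last name from column 10, then first name.
        output.append((shown.ljust(9) + last + ", " + first).rstrip())
    return output


rows = ["Sales,Ann,Lee", "Sales,Bob,Ray", "Tools,Cy,Poe"]
for row in print_departments(rows):
    print(row)
```
.fi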
.SS "RunIn and RunOut Cycles"
A
.B cycle
is a single run of the specification on the current active input record. A cycle may read additional input records, produce zero output records, or produce multiple output records. If the specification contains
.B read
or
.B readstop
tokens, a single cycle can consume more than one input record.
.P
The
.B runin
cycle is the first cycle. In the runin cycle, the function
.B first()
returns 1. This can be used for initial processing such as printing of headers or setting initial values.
.P
The
.B runout
cycle happens
.I after
the last line has been read, but only when the specification requires a runout cycle. It consists of the spec items that follow the
.B EOF
token, or (when
.I select second
is used) conditional specifications with the
.B eof()
function. Example:
.RS 5
.nf
if first() then
    /Item/ 1 /Square/ nw write
    /====/ 1 /======/ nw write
endif
a: w1 1.4 right
print "a*a" 6.6 right
set '#0+=(a*a)'
EOF
/==========/ 1 write
/Total:/ 1
print #0 nw
.fi
.RE
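.P
For readers more used to procedural code, the shape of this specification (a header in the runin cycle, one row per record, a total in the runout cycle) can be sketched in Python. The function name is hypothetical, integer input is assumed, and the exact number formatting produced by
.B specs
is not modelled.
.nf
```python
def squares_report(lines):
    """Build a header, one row per input line, and a total at end of input."""
    output = []
    total = 0                           # plays the role of counter #0
    for index, line in enumerate(lines):
        if index == 0:                  # first() is true in the runin cycle
            output.append("Item Square")
            output.append("==== ======")
        value = int(line.split()[0])    # a: w1
        total += value * value          # set '#0+=(a*a)'
        # Columns 1-4 and 6-11, both right-aligned.
        output.append("%4d %6d" % (value, value * value))
    # Runout: the units after EOF run once, after the last record.
    output.append("==========")
    output.append("Total: %d" % total)
    return output


for row in squares_report(["1", "2", "3"]):
    print(row)
```
.fi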
.SS "Input Streams"
The keyword
.B SELECT
along with the keywords
.B FIRST
and
.B SECOND
can be used to select between two input stations. The
.B FIRST
input station is the regular primary input. The
.B SECOND
input station holds the previous record from the primary input.
.P
Imagine an input stream that has the natural numbers. Let's run the following specification:
.RS 5
.nf
specs w1 1 /- 1 =/ NW SELECT SECOND w1 nw
.fi
.RE
The result will be the following:
.RS 5
.nf
1 - 1 =
2 - 1 = 1
3 - 1 = 2
4 - 1 = 3
5 - 1 = 4
.fi
.RE
and so on.
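.P
The same pairing of each record with its predecessor can be sketched in Python. The function name is hypothetical, and the sketch assumes non-empty records and an empty "previous record" before the first one, which is what the first output line above suggests.
.nf
```python
def minus_one_lines(records):
    """Pair the first word of each record with that of the previous record."""
    output = []
    previous = ""          # nothing has been read before the first record
    for record in records:
        first_word = record.split()[0]
        prev_words = previous.split()
        prev_word = prev_words[0] if prev_words else ""
        output.append((first_word + " - 1 = " + prev_word).rstrip())
        previous = record
    return output


for line in minus_one_lines(["1", "2", "3", "4", "5"]):
    print(line)
```
.fi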
.P
The keyword
.B SELECT
can also be used with a number between 1 and 8 to denote different input streams. The files that supply streams 2 through 8 are named with the
.B --is2
through
.B --is8
command-line switches. Every cycle of the specification begins with
.B specs
set to stream #1. For example, suppose we have two files: f1 and f2, both of which contain lists of numbers, and we want to add them:
.RS 5
.nf
specs -i f1 --is2 f2 a: W1 1 '+' NW SELECT 2 b: W1 NW '=' NW "?a+b" NW
.fi
.RE
If both files contained the natural numbers we would get the following output:
.RS 5
.nf
1 + 1 = 2
2 + 2 = 4
3 + 3 = 6
4 + 4 = 8
5 + 5 = 10
.fi
.RE
and so on.
.SS "Output Streams"
The keyword
.B OUTSTREAM
can be used to select between multiple output streams. Output stream #1 is the default primary output stream. Output streams #2 through #8 can be files specified with the
.B --os2
through
.B --os8
command-line switches. There is one additional output stream, which is the standard error. It is selected with the keyword
.B STDERR
for a total of up to 9 output streams. Example:
.RS 5
.nf
ls -l | specs --os2 filesizes --os3 owners
W9 1 WRITE
OUTSTREAM 2 W5 1 WRITE
OUTSTREAM 3 w3 1
.fi
.RE
This specification will print the list of owners to file
.IR owners ,
the file sizes to file
.IR filesizes ,
and the file names to standard output.
.SS "Expressions and Assignments"
Expressions are used in PRINT data units as well as in assignments. Assignments are used in SET Spec Units.
Expressions are made up of numbers, field identifiers (no colon needed), and counter numbers preceded by a hash mark. They allow ordinary arithmetic and logical operations as well as function calls for pre-defined functions:
.IP "Unary Operators" 3
+ (plus - does nothing), - (minus), ! (logical NOT)
.IP "Binary Arithmetic Operators" 3
+, -, *, / (division), // (integer division), % (remainder)
.IP "Binary String Operator" 3
|| (string append)
.IP "Binary Arithmetic/String Logical Operators" 3
<, >, <=, >=, =, !=
.IP "Binary Strict Logical Operators" 3
<<, >>, <<=, >>=, ==, !==
.IP "Binary Logical Operators" 3
& (AND), | (OR)
.P
Assignments assign the result of an expression into a
.I numbered counter.
For example:
.RS 5
specs SET #3:=b+3
.RE
.P
.I SET
statements can also be compound. For example:
.RS 5
specs SET "#3:=b+3 ; #3:=b-3"
.RE
.P
Assignments can also be used as expressions. When used as such, the value returned is the content of the counter after executing the assignment. For example, either of the following specifications:
.RS 5
.nf
specs print "#0:=2+2" 1
specs "?#0:=2+2"
.fi
.RE
will output 4. The assignment is performed and the value stored in the counter.
.IP "Assignment Operators" 3
:=, +=, -=, *=, /=, //=, %=, ||=
.IP "At-sign (@) Operator" 3
The at-sign allows the inclusion of user-defined and system-defined labels as strings in expressions. The double at-sign (@@) substitutes for the entire input record.
.SS "Built-In Functions"
.IP "abs(x)" 3
The absolute function returns the absolute value of the number passed to it. Will return an integer if the parameter is integer, or float otherwise.
.IP "ceil(x), floor(x), round(x,d)" 3
These functions return the closest numbers to
.IR x :
.B ceil
returns the smallest whole number no smaller than
.IR x ;
.B floor
returns the largest whole number no greater than
.IR x ;
.B round
returns the number closest to
.I x
with up to
.I d
decimal places. If
.I d
is omitted,
.B round
returns the closest whole number.
.IP "fact(n)" 3
The
.B fact
function returns the factorial of
.I n.
.IP "combinations(n,k) and permutations(n,k)" 3
These functions give the number of ways in which
.I k
elements can be chosen from a set of
.I n
elements. The difference is that with
.B combinations
the order of the elements within the subgroup (or the order in which they were selected)
does not matter, while with
.B permutations,
it does. For example, if we are choosing 2 elements from a group of 6,
.B permutations
will return 30, because there are 6 ways to choose the first element and 5 to choose the
second element. On the other hand,
.B combinations
will return 15, because it doesn't matter which of the two chosen elements was chosen first.
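.IP
The figures in this example can be checked with a short Python sketch based on the usual factorial formulas. The Python functions below merely share their names with the
.B specs
functions; they are not the
.B specs
implementation.
.nf
```python
import math


def permutations(n, k):
    """Ordered selections of k elements out of n: n! / (n-k)!."""
    return math.factorial(n) // math.factorial(n - k)


def combinations(n, k):
    """Unordered selections: every group of k is counted k! times above."""
    return permutations(n, k) // math.factorial(k)


print(permutations(6, 2))  # 30
print(combinations(6, 2))  # 15
```
.fi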
.IP "fmt(value,format,digits,decimal,separator)" 3
The
.B fmt
function formats a floating-point
.B value
as a string. The
.B format
argument can be omitted, or it can begin with
.I f
for a
.I fixed
number of
.B digits
after the decimal point, or
.I s
for
.I scientific
notation. When
.B format
is omitted, the
.B digits
argument sets the total number of digits displayed. The
.B decimal
argument sets the character used for the decimal point (default is a period), while the
.B separator
argument sets the character used as thousands separator (default is none).
.IP "pretty(value,flimit,ilimit,locale)" 3
The
.B pretty
function formats the number in
.B value
as a printable string with commas. The optional
.B flimit
and
.B ilimit
parameters determine how many digits the floating-point or integer number should have (to the left of the decimal point) before the function switches to scientific notation. These parameters default to 10 and infinite digits respectively. The number is formatted according to the locale specified in the
.B locale
configuration string. The optional
.B locale
parameter overrides that configuration.
.IP "pow(x,y)" 3
The power function returns x to the power of y. Much like arithmetic operations, if either operand is float, the result is a float. If both operands are integers, the result is an integer. Otherwise, if both operands are whole numbers, the result is integer, otherwise it is float.
.IP "rand(x)" 3
The random function returns a random integer value from 0 up to but not including
.B x.
If
.B x
is omitted, returns a random
.I real
value no smaller than 0.0 and smaller than 1.0.
.IP "sin(x), cos(x), tan(x), dsin(x), dcos(x), dtan(x)" 3
These trigonometric functions return the sine, cosine and tangent of an angle.
.B sin, cos,
and
.B tan
accept an argument in radians, while
.B dsin, dcos,
and
.B dtan
accept an argument in degrees.
.IP "arcsin(x), arccos(x), arctan(x), arcdsin(x), arcdcos(x), arcdtan(x)" 3
These trigonometric functions return the inverse sine, cosine and tangent.
.B arcsin, arccos,
and
.B arctan
return an angle in radians, while
.B arcdsin, arcdcos,
and
.B arcdtan
return an angle in degrees.
.IP "exp(x) and log(x,base)" 3
These return the
.B exponential
of x (e raised to the power of x) or the
.B logarithm
of x. By default, the logarithm function returns the
.I natural
log.
.IP "string(x)" 3
Returns the same value as the argument, but forced to be stored as a string. There is nothing preventing this from later evaluating as a number, so
.I string(3)+2
will evaluate to 5.
.IP "substitute(haystack,needle,subst,max)" 3
Returns the string in
.I haystack
where occurrences of
.I needle
are replaced by the string
.I subst
for at most
.I max
(by default: 1) times. The special value
.B """U"""
for
.I max
is used to indicate that all occurrences should be replaced.
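.IP
The replacement-count rule can be sketched in Python on top of
.I str.replace.
The Python function only mirrors the argument order described above, and the sample strings are made up; it is a sketch of the documented behaviour, not the
.B specs
implementation.
.nf
```python
def substitute(haystack, needle, subst, max_count=1):
    """Replace up to max_count occurrences of needle; "U" means unlimited."""
    if max_count == "U":
        return haystack.replace(needle, subst)
    return haystack.replace(needle, subst, int(max_count))


print(substitute("a-b-c-d", "-", "+"))       # a+b-c-d
print(substitute("a-b-c-d", "-", "+", 2))    # a+b+c-d
print(substitute("a-b-c-d", "-", "+", "U"))  # a+b+c+d
```
.fi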
.IP "sqrt(x)" 3
The square root function always returns a float.
.IP "c2u(x)" 3
Returns an unsigned decimal number from the binary representation of x.
.IP "c2d(x)" 3
Returns a signed decimal number from the binary representation of x.
.IP "c2f(x)" 3
Returns a floating point number assuming that x is a binary representation of a floating point number in the native encoding of the platform. On most platforms valid lengths are 4, 8, and 16 bytes (32, 64, and 128 bits).
.IP "frombin(x)" 3
Returns a decimal number from the binary representation in x.
.IP "tobine(x,d)" 3
Returns a binary representation of the unsigned integer in x with field of length d bits. Valid values for d are 8, 16, 32, and 64.
.IP "tobin(x)" 3
Returns a binary representation of the unsigned integer in x. The field length is automatically determined by the value of x, but will be 1, 2, 4, or 8 characters in length.
.IP "length(x)" 3
Returns the length of the argument when viewed as a string. For example, len(37) is 2; len('hello') is 5.
.IP "first()" 3
Returns 1 during the
.B runin
cycle, and zero otherwise.
.IP "eof()" 3
Returns 1 during the
.B runout
cycle, and zero otherwise.
.IP "number()" 3
Returns the number of times the specification has so far been run on different records.
.IP "recno()" 3
Returns the number of input records that have been read so far. If the specification contains no
.I READ
or
.I READSTOP
spec units, then this number will be equal to the result of
.B number().
Otherwise, it will be higher.
.IP "record()" 3
Returns the entire input record. Equivalent to
.B range(1,-1).
.IP "range(s,e)" 3
Returns the range of characters from
.B s
(default first) to
.B e
(default last) from the input record. If the end of the range exceeds the end of the input, the result is truncated at the end. If the start of the range exceeds the end of the input, the result is the empty string. If the start is greater than the end, the function returns NaN.
.IP "wordcount(s,p)" 3
Returns the number of words in the string
.B s,
or in the current record if
.B s
is not specified. The separator used is
.B p.
If
.B p
is not specified, the separator is the current word separator if processing the current record, or a
.I blank space
if processing
.B s.
.IP "word(i)" 3
Returns the i-th word in the input record.
.IP "wordstart(i)" 3
Returns the starting position of the i-th word in the input record.
.IP "wordend(i)" 3
Returns the ending position of the i-th word in the input record.
.I range(wordstart(i),wordend(i))
is equivalent to
.I word(i)
but less efficient.
.IP "wordrange(s,e)" 3
Returns the string from the start of the
.B s-th
word (default first) to the end of the
.B e-th
word (default last). The expression
.I range(wordstart(s),wordend(e))
gives the same result but is less efficient.
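.IP
The relationship between the word functions and character positions can be sketched in Python. All names are hypothetical, the separator is fixed to a single blank, and the sketch assumes that runs of separators delimit a single word boundary (so empty words never occur).
.nf
```python
def word_spans(record: str, sep: str = " "):
    # 1-based, inclusive (start, end) positions of every word.
    spans = []
    start = None
    for pos, ch in enumerate(record, start=1):
        if ch != sep and start is None:
            start = pos
        elif ch == sep and start is not None:
            spans.append((start, pos - 1))
            start = None
    if start is not None:
        spans.append((start, len(record)))
    return spans

def word_sketch(record: str, i: int) -> str:
    s, e = word_spans(record)[i - 1]
    return record[s - 1:e]

def wordrange_sketch(record: str, s: int, e: int) -> str:
    spans = word_spans(record)
    # From the start of word s to the end of word e, separators included.
    return record[spans[s - 1][0] - 1:spans[e - 1][1]]
```
.fi
Here len(word_spans(record)) plays the role of the word count, and the first and second element of each pair correspond to the word start and word end positions.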
.IP "fieldcount(s,p)" 3
Returns the number of fields in the string
.B s,
or in the current record if
.B s
is not specified. The separator used is
.B p.
If
.B p
is not specified, the separator is the current field separator if processing the current record, or a
.I tab
if processing
.B s.
.IP "field(i)" 3
Returns the i-th field in the input record.
.IP "fieldstart(i)" 3
Returns the starting position of the i-th field in the input record.
.IP "fieldend(i)" 3
Returns the ending position of the i-th field in the input record.
.I range(fieldstart(i),fieldend(i))
is equivalent to
.I field(i)
but less efficient.
.IP "fieldrange(s,e)" 3
Returns the string from the start of the
.B s-th
field (default first) to the end of the
.B e-th
field (default last). The expression
.I range(fieldstart(s),fieldend(e))
gives the same result but is less efficient.
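.IP
A comparable Python sketch for fields follows. The names are hypothetical, the separator is fixed to a tab, and the sketch assumes that every separator delimits a field, so two adjacent tabs enclose an empty field. That last point is an assumption about behavior, not something stated above.
.nf
```python
TAB = chr(9)

def field_spans(record: str, sep: str = TAB):
    # 1-based, inclusive (start, end) positions of every field.
    # An empty field is represented with end == start - 1.
    spans = []
    start = 1
    for pos, ch in enumerate(record, start=1):
        if ch == sep:
            spans.append((start, pos - 1))
            start = pos + 1
    spans.append((start, len(record)))
    return spans

def field_sketch(record: str, i: int) -> str:
    s, e = field_spans(record)[i - 1]
    return record[s - 1:e]

def fieldrange_sketch(record: str, s: int, e: int) -> str:
    spans = field_spans(record)
    # From the start of field s to the end of field e, separators included.
    return record[spans[s - 1][0] - 1:spans[e - 1][1]]
```
.fi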
.IP "splus(s,o,[l])" 3
Returns a substring of the current record, starting with the first character at offset
.B o
from the first instance of the string
.B s,
and having a length of
.B l
which defaults to 1.
.B o
may be non-positive. If the search string
.B s
is not found or if the start of the substring underflows or overflows the input record, an empty string is returned.
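.IP
The search-then-offset behavior can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the offset is measured from the first character of the match; the text above does not say whether it is counted from the start or the end of the matched string.
.nf
```python
def splus_sketch(record: str, s: str, o: int, l: int = 1) -> str:
    found = record.find(s)          # 0-based index of the first instance
    if found < 0:
        return ""                   # search string not found
    start = found + o               # offset may be zero or negative
    if start < 0 or start >= len(record):
        return ""                   # start underflows or overflows the record
    return record[start:start + l]
```
.fi
With the record "key=value;x", this sketch gives "value" for a search string of "=", an offset of 1, and a length of 5, and "key" for an offset of -3 and a length of 3.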
.IP "wplus(s,o,[l])" 3
Returns a substring of the current record, starting with the first word at offset
.B o
(measured in words) from the first instance of the word
.B s,
and having a length of
.B l
words, which defaults to 1.
.B o
may be non-positive. If the search word
.B s
is not found or if the start of the substring underflows or overflows the input record, an empty string is returned.
.IP "fplus(s,o,[l])" 3
Returns a substring of the current record, starting with the first field at offset
.B o
(measured in fields) from the first instance of the field
.B s,
and having a length of
.B l
fields, which defaults to 1.
.B o
may be non-positive. If the search field
.B s
is not found or if the start of the substring underflows or overflows the input record, an empty string is returned.
.IP "tf2mcs(string,format)" 3
Converts a string containing a date and time representation using the format specified in the
.I format
argument to a floating point or integer number representing the number of microseconds since the UNIX epoch. Formatting is as described for the similarly named conversion.
.IP "mcs2tf(microseconds,format)" 3
Converts a number (integer or float) representing the number of microseconds since the UNIX epoch to a printable string, using the format specified in the
.I format
argument.
.IP "tf2s(string,format)" 3
Converts a string containing a date and time representation using the format specified in the
.I format
argument to a floating point or integer number representing the number of seconds since the UNIX epoch. Formatting is the same as for tf2mcs.
.IP "s2tf(seconds,format)" 3
Converts a number (integer or float) representing the number of seconds since the UNIX epoch to a printable string, using the format specified in the
.I format
argument.
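.IP
The four time conversions are two inverse pairs that differ only in their unit. A minimal Python sketch of that relationship is shown below. The function names are hypothetical, and the sketch assumes strftime-style format directives and UTC; the real format syntax and time zone handling are described elsewhere in this manual and may differ.
.nf
```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def tf2mcs_sketch(text: str, fmt: str) -> int:
    # Parse the text, treat it as UTC (assumption), count microseconds.
    moment = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(microseconds=1)

def mcs2tf_sketch(microseconds, fmt: str) -> str:
    return (EPOCH + timedelta(microseconds=microseconds)).strftime(fmt)

def tf2s_sketch(text: str, fmt: str) -> float:
    # Same conversion, expressed in seconds.
    return tf2mcs_sketch(text, fmt) / 1_000_000

def s2tf_sketch(seconds, fmt: str) -> str:
    return mcs2tf_sketch(round(seconds * 1_000_000), fmt)
```
.fi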
.IP "x2d(string,length)" 3
Returns the value in
.I string
converted to decimal. If
.I length
is missing or non-positive, the resulting decimal is signed; otherwise it is unsigned.
.IP "next()" 3
Returns the column of the next character to print if a
.I spec unit
specifies the
.B NEXT
position.
.IP "exact(expression)" 3
Returns whether the evaluation of
.I expression
results in an exact result (1) or not (0).
.B NOTE:
This function has some limitations. For example, Python functions are always taken to return an exact value when they return an integer or a string, but an inexact value when they return a float, which may or may not be correct. When unsure, the
.B exact
function errs on the side of returning 0.
.SS "Built-In String Functions"
.IP "abbrev(information, info, len)" 3
Returns
.I 1