A few days ago I learned about pz, a Python library that exposes a few simple one-letter shorthands for line-based editing of pipes at the command line. I immediately thought it had potential.
This is simple and clever. The default `s` variable holds the contents of stdin. @jbangdev idea? 🤔 https://t.co/pBBcIfEIJb
— Edoardo Vacchi (@evacchi) February 12, 2022
I liked the idea so I pestered Max Andersen: what if JBang supported that kind of shorthand syntax? It turned out Max was already working on something:
This might be possible sooner than I thought it would be. pic.twitter.com/nO4CHgQenV
— Max Rydahl Andersen (@maxandersen) February 3, 2022
This has been available since JBang 0.90.0 (but you should use v0.90.1).
I was pretty sure that this new feature could be used to implement AWK-like scripting using Java:
This+data frame lib = jawk !
— Edoardo Vacchi (@evacchi) February 18, 2022
Over the following days I played around with a short Prelude to enhance these kinds of one-liners on JBang. But today I came across a blog post about Prig on Hacker News. Prig is «like AWK, but uses Go for “scripting”».
Well, that did it. I had to show I could do the same with JBang. I give you: Prelude.jsh
It is a very short collection of utilities. The main idea is that the class Line may be used to split a stdin line into the fields of an AWK-like “record”. I didn’t call it record so as not to confuse it with Java 16+’s record feature.
class Line {
    private final String line;
    private final String pattern;
    private final String[] fields;
    public final int nf;
    Line(String line_, String pattern_) {
        // every final field must be assigned, including pattern
        line = line_; pattern = pattern_; fields = line.split(pattern);
        nf = fields.length == 1 ? 0 : fields.length;
    }
    Line(String line) { this(line, "\\s+"); }
    // s(0) is the whole line; s(1)..s(n) are the individual fields
    public String s(int n) { return (n == 0)? line : fields[n-1]; }
    public int i(int n) { return Integer.parseInt(s(n)); }
    public double d(int n) { return Double.parseDouble(s(n)); }
    public String toString() { return line; }
}
All it does is split a line into whitespace-separated fields. You can access a field with Line#s. At index 0 you’ll find the entire line; at 1..n you’ll find the first..n-th field. Line#d and Line#i are just shorthands to convert the n-th field to a double or an integer. Line#nf gives you the number of fields, just like AWK’s NF.
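To make the indexing concrete, here is a minimal standalone sketch that can be run without JBang. The class name LineDemo and the sample inputs are made up for illustration, and the nested Line is a trimmed copy of the class above (the pattern field is left out because nothing reads it).

```java
public class LineDemo {
    // Trimmed copy of Prelude's Line, kept here only so the sketch is self-contained.
    static class Line {
        private final String line;
        private final String[] fields;
        public final int nf;

        Line(String line, String pattern) {
            this.line = line;
            this.fields = line.split(pattern);
            this.nf = fields.length == 1 ? 0 : fields.length;
        }

        Line(String line) { this(line, "\\s+"); }

        public String s(int n) { return (n == 0) ? line : fields[n - 1]; }
        public int i(int n) { return Integer.parseInt(s(n)); }
        public double d(int n) { return Double.parseDouble(s(n)); }
    }

    public static void main(String[] args) {
        Line log = new Line("GET /robots.txt HTTP/1.1");
        System.out.println(log.s(0)); // the whole line
        System.out.println(log.s(2)); // /robots.txt
        System.out.println(log.nf);   // 3

        Line timing = new Line("1 GET 3.14159");
        System.out.println(timing.i(1)); // 1
        System.out.println(timing.d(3)); // 3.14159

        // The two-argument constructor takes any regex as the field separator.
        Line csv = new Line("a,b,c", ",");
        System.out.println(csv.s(3)); // c
    }
}
```

Two things to keep in mind with this version of Line: a line holding a single field reports nf as 0, and a line that starts with whitespace gets an empty first field, because String#split keeps a leading empty string.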
There you go. Now suppose you want to print the second field of each line in logs.txt:
$ cat logs.txt
GET /robots.txt HTTP/1.1
HEAD /README.md HTTP/1.1
GET /wp-admin/ HTTP/1.0
You would write:
$ cat logs.txt | jbang -s Prelude.jsh -c \
'lines().map(Line::new).map(l -> l.s(2)).forEach(s -> println(s))'
Of course, you’ll first need to download Prelude.jsh:
$ curl -L https://bit.ly/prelude-jsh -o Prelude.jsh
Oh, by the way, since JBang is awesome, you can also write:
$ cat logs.txt | jbang -s https://bit.ly/prelude-jsh -c \
'lines().map(Line::new).map(l -> l.s(2)).forEach(s -> println(s))'
🚨 Update: JBang v0.91.0 has become even awesomer: you can now skip the download and use the catalog I posted here:
$ cat logs.txt | jbang -s prelude@evacchi -c \
'lines().map(Line::new).map(l -> l.s(2)).forEach(s -> println(s))'
Now, because creating a Line object, then mapping it and then printing each result is so frequent, I also defined a few shorthands for you:
Stream<Line> $lines() { return lines().map(Line::new); }
void $$(Function<Line, Object> f) { $lines().map(f).forEach(o -> println(o)); }
There you go, now you can write:
$ cat logs.txt | jbang -s Prelude.jsh -c '$$(l -> l.s(2))'
And of course, now we can implement the example found in the Prig blog post:
$ cat logs.txt | jbang -s Prelude.jsh -c '$$(l -> "https://example.com" + l.s(2))'
But let’s see how we may implement the other examples as well.
The average of the third column in average.txt:
$ cat average.txt
a b 400
c d 200
e f 200
g h 200
would be:
$ cat average.txt | jbang -s Prelude.jsh -c \
'$lines().mapToInt(l -> l.i(l.nf)).average().ifPresent(d -> println(d))'
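The ifPresent at the end is there because IntStream#average returns an OptionalDouble, which is empty when there are no input lines. A plain-Java sketch of the same computation, with no Prelude involved, makes that visible; the class and method names here are hypothetical and the sample data is the content of average.txt.

```java
import java.util.List;
import java.util.OptionalDouble;

public class AverageDemo {
    // Average of the last whitespace-separated field of each line, parsed as an int.
    static OptionalDouble averageOfLastField(List<String> lines) {
        return lines.stream()
                .map(l -> l.trim().split("\\s+"))
                .mapToInt(f -> Integer.parseInt(f[f.length - 1]))
                .average();
    }

    public static void main(String[] args) {
        List<String> sample = List.of("a b 400", "c d 200", "e f 200", "g h 200");
        averageOfLastField(sample).ifPresent(System.out::println); // 250.0

        // With no lines the OptionalDouble is empty, so nothing is printed.
        averageOfLastField(List.of()).ifPresent(System.out::println);
    }
}
```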
Format the third column of millis.txt as milliseconds:
$ cat millis.txt
1 GET 3.14159
2 HEAD 4.0
3 GET 1.0
is just:
$ cat millis.txt | jbang -s Prelude.jsh -c \
'$lines().filter(l -> l.s(0).matches(".*(GET|HEAD).*"))
.forEach(l -> printf("%.0fms\n", l.d(3)*1000))'
This is only slightly more cumbersome because String#matches must match the entire string, hence the leading .*( and the trailing ).* in the pattern. You may easily add a shorthand to Line to decorate the pattern and avoid the noise, e.g.:
boolean matches(int n, String pattern) { return s(n).matches(".*(" + pattern + ").*"); }
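The difference between matching the whole string and searching inside it is easy to check in isolation. The sketch below is one possible illustration, not part of the Prelude: MatchDemo and its two helpers are hypothetical names, and the second helper uses Matcher#find from java.util.regex as an alternative that needs no decoration at all.

```java
import java.util.regex.Pattern;

public class MatchDemo {
    // Whole-string match: this is what String#matches does.
    static boolean fullMatch(String s, String regex) {
        return s.matches(regex);
    }

    // Substring search: true if the regex matches anywhere inside s.
    static boolean found(String s, String regex) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        String line = "1 GET 3.14159";
        System.out.println(fullMatch(line, "GET|HEAD"));       // false
        System.out.println(fullMatch(line, ".*(GET|HEAD).*")); // true
        // Without the group, | splits the pattern into ".*GET" and "HEAD.*".
        System.out.println(fullMatch(line, ".*GET|HEAD.*"));   // false
        System.out.println(found(line, "GET|HEAD"));           // true
    }
}
```

The third call shows why the parentheses matter whenever the pattern contains an alternation.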
Finally, counting word frequency in words.txt:
$ cat words.txt
The foo barfs
foo the the the
In fact, this does not even require the Prelude!
$ cat words.txt | jbang -c \
'println(lines().flatMap(s -> Stream.of(s.split("\\s+")))
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting())))'
The JBang line-editing feature does not stop here. You have all of JBang’s power at your fingertips: you can declare dependencies, extend the prelude further… have fun!
Thanks to Ben Hoyt for nerd-sniping me!