ls but The Unix Way
spell.mywire.org:1960/~stack/0257.gmi
Where I build a composable ls toy...
Jan 10 · 4 months ago · 👍 RubyMaelstrom
11 Comments ↓
🚀 RubyMaelstrom · Jan 10 at 22:39:
Looks like a fun project, at the very least you'll understand better how ls does what it does!
Interesting argument!
The claim that a tool as basic as `ls` is not unix-y enough is certainly controversial, given that `-l` already seems to be present in the 1st edition: https://man.cat-v.org/unix-1st/1/ls
Looking forward to seeing where this path may lead you!
🐦 wasolili [...] · Jan 10 at 23:50:
I have a T-shirt with the summary of ls options, and it is a lot of text!
I would love to see this shirt
👻 darkghost · Jan 11 at 00:46:
I've actually been playing around with antique unix on SIMH and indeed ls has always functioned in a way that does not conform to the unix philosophy. It might just be the earliest example of the limitations of the unix philosophy because your toy program here shows the unix philosophical way to properly handle this. Sacrifices must be made in the name of usability. I about blew my lid trying to work in the hierarchical directories of RT-11. Changing directories means setting an environment variable! How? set default device:[directory] yes, the variable name is default! 🤯
*ahem* the apologists' response may be that it is still a text stream.
A few comments:
- Use TSV (tab-separated values), as more commands in Unix default to tab as a separator, like cut and sort. Also, your program will mess up if a comma appears in a filename.
- If the filename is "." or "..", don't bother with it.
- Don't print the header line, or make it an option to print it.
Those alone will get rid of several commands in your pipe line with no real loss.
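The three suggestions above can be sketched in a few lines of Python. This is only an illustration, not the OP's toy: the function name `list_tsv`, the `header` flag, and the choice of columns (name and size in bytes) are all assumptions, and the sketch assumes filenames contain no tab or newline.

```python
import os

def list_tsv(path=".", header=False):
    """Return one tab-separated line per directory entry: name, size in bytes.

    Hypothetical helper; the columns are chosen only for illustration.
    """
    lines = []
    if header:
        # The header is opt-in, so downstream tools need not strip it.
        lines.append("name\tsize")
    # os.scandir never yields "." or "..", so no filtering is needed.
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        size = entry.stat(follow_symlinks=False).st_size
        lines.append(f"{entry.name}\t{size}")
    return lines

if __name__ == "__main__":
    print("\n".join(list_tsv(".")))
```

A comma in a filename passes through untouched here, since only the tab has meaning as a separator.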
🚀 stack [OP] · Jan 11 at 17:54:
I just figured out that TSV is not like CSV, as it does not escape the data. Definitely more sensible.
I am kind of surprised there is no universal support for the ASCII field and record separator characters...
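For reference, the two characters in question are the unit separator (US, 0x1F) between fields and the record separator (RS, 0x1E) between records. A minimal Python sketch of using them follows; the `encode` and `decode` names are made up for illustration. One caveat: on Unix these bytes are still legal inside filenames, so this is tidier than commas but not airtight.

```python
US = "\x1f"  # ASCII unit separator: sits between fields
RS = "\x1e"  # ASCII record separator: terminates each record

def encode(records):
    """Join fields with US and terminate every record with RS."""
    return "".join(US.join(fields) + RS for fields in records)

def decode(text):
    """Inverse of encode: split on RS, then split each record on US."""
    records = text.split(RS)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final terminator
    return [rec.split(US) for rec in records]
```

Commas, tabs, and even newlines inside a field survive a round trip without any escaping.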
isn't there already stat + du to get everything out of the fs instead of ls?
🚀 stack [OP] · Jan 12 at 18:58:
Yes, stat is yet another variation on the theme -- it can do what ls does and more:
Again, it does everything. You can get stat to output TSV I suppose, as it has a finer control of what is printed.
However, if you just want names of files in a directory there is no need to stat each file, right?
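As a small illustration of that point, Python's `os.listdir` returns the names from the directory entries without calling stat on each file. The wrapper name `names_only` is hypothetical.

```python
import os

def names_only(path="."):
    """Return the sorted names in a directory.

    os.listdir only reads the directory entries; it does not stat each file.
    Like ls without -a's special entries, it never returns "." or "..".
    """
    return sorted(os.listdir(path))
```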
i got used to using du for traversing a dirent. i know that disk usage doesn't sound like the right name of a command for that. in the plan9 mailing lists i believe i have seen more than once the need for an abstracted command to walk a tree in order to capture what -R or du are doing here. i guess this paired with stat in some cases (cause you might be walking a VFS where stat almost makes no sense) and something to do the presentation is what could be the parts of ls you are talking about.
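That three-part split (walk, optional stat, presentation) might look roughly like this in Python. It is a sketch under assumptions: the stage names `walk`, `with_size`, and `present` are invented here, and only non-directory entries are listed.

```python
import os

def walk(root):
    """Stage 1: yield the path of every non-directory entry under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes os.walk visit subdirectories in name order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)

def with_size(paths):
    """Stage 2 (optional): attach a size to each path using lstat."""
    for path in paths:
        yield path, os.lstat(path).st_size

def present(rows):
    """Stage 3: format (path, size) rows as tab-separated lines."""
    return [f"{path}\t{size}" for path, size in rows]

if __name__ == "__main__":
    print("\n".join(present(with_size(walk(".")))))
```

On a filesystem where stat makes little sense, the middle stage can simply be left out and the walker's output used directly.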
This is a tricky problem. In most Unix-like systems, the only character that cannot appear anywhere in a path is NUL. Therefore, NUL is the only character that could unambiguously separate paths in a listing. However, NUL is also used to terminate strings in C, so most POSIX utilities have undefined or unspecified behavior upon encountering it. Usually, they either strip the NULs or give an error.
It's a double-edged sword: by nature, a tool that could list any possible set of paths perfectly couldn't have its output processed as text by other tools, because the only entry separator that works perfectly is usually not allowed in text. I wouldn't be surprised if that's why `ls` was designed the way it was in the first place.
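To make the trade-off concrete, here is a rough Python sketch of a NUL-terminated listing, the same idea as `find -print0` paired with `xargs -0`. The function names `encode_paths` and `decode_paths` are hypothetical. The output has to be handled as bytes, which is exactly why line-oriented text tools cannot consume it.

```python
import os

def encode_paths(paths):
    """Terminate every path with a NUL byte, the one byte a path cannot contain."""
    return b"".join(os.fsencode(p) + b"\0" for p in paths)

def decode_paths(data):
    """Inverse of encode_paths: split the byte stream on NUL."""
    parts = data.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final terminator
    return [os.fsdecode(p) for p in parts]
```

Names containing newlines, tabs, or commas come back intact, because none of those bytes is treated as a separator.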
Other alternatives are binary formats, such as DER or SDSER. (My own idea of operating system design uses something similar rather than text-only output; it is possible to write programs that do something similar on UNIX as well, although it is not quite the same.)