`to-lines` should support a configurable EOL char #1070
Comments
That's not entirely true since it is considerably slower than the
It's hard to see how the |
Note that if |
On further reflection I think |
However, one property that it doesn't have is streaming - The point of |
I don't understand why introducing a new command is preferable to adding a
I'm also dismayed that the second example is 1.5 times slower than the first. I would have expected the two examples to be very close in execution time, with the second possibly a smidge slower due to the second pipe, but not 1.5x slower. Having said that, those aren't particularly realistic examples. More realistic would be comparing 1024 instances of a 128 byte literal:
The |
You can of course bend the language and generalize the concept of "line" - but would a new user be able to guess that? When a new user wants to add NUL to each string in a stream, would they look for |
They might if something like an |
This change, combined with other changes to `edit:command-history`, makes feeding its output into `fzf` extremely fast. Which makes it a practical alternative to using the builtin Ctrl-R binding for searching command history. It's also just generally useful to have efficient ways to process "lines" that are null terminated rather than newline (or cr-nl on Windows). Resolves elves#1070 Related elves#1053
My opinion remains that this functionality should be supported by a different pair of commands. |
@xiaq, I think adding another pair of commands is a mistake for a couple of reasons. First, despite what I wrote earlier, I no longer think there is any point in supporting arbitrary "line" terminators. I can't think of a single instance (at least in the recent past) where I have needed to process lines that were not null or (cr)nl terminated. Second, I think you're too hung up on a narrow definition of "line". You wrote earlier:
I disagree. And I would bet a survey of Elvish users would agree with me. Would a new user who wants to add, or split on, NUL when writing or reading a byte stream know to look for |
Also, if we add the * |
@xiaq wrote on IRC that the goal is to avoid conflicts between builtins and "widely used programs"; therefore, there isn't much risk in new It still seems to me that introducing new commands in the |
Please see #1070 (comment), the problem remains unaddressed in how new users would discover that a command to parse null-terminated strings is called |
I don't think that's a good idea either - this is now twisting the definition of "end of lines".
They do, but would they call them "lines"? I understand that it makes sense for you to generalize the concept of "lines", and it may even seem quite elegant. If Elvish were some kind of proprietary text-processing software that specialized in text parsing and generation, I would be convinced that it's better to say "lines" instead of saying "strings that are terminated by some character" every time. But Elvish is not that, and this is really an unnecessary deviation from common sense. You can argue with me but you can't argue with common sense.
How will new users discover that they should use
Because a "line" in this context is merely a sequence of bytes with a particular sentinel: I'm going to cave in since it seems I'm not going to be able to convince you of the correctness of my perspective. 😞 Getting this feature implemented is more important for the improved performance of common scenarios than the specific names of the commands that provide the feature. |
If I look at a list of functions - for example, by pressing Tab at the beginning of a command - and I see
Whether or not to generalize the concept of "lines" to null-terminated strings is a design choice, and "correctness" is always relative to some extent when it comes to design choices. It depends on the designer's past experience of what worked and what didn't, and the designer's intended usage scenarios. I'm sure that this generalization makes a ton of sense to you, and I've acknowledged that your perspective is perfectly valid. It's just not my perspective, even after taking your arguments into account. It's certainly not great when people draw different conclusions when presented with the same argument. But that's part of human nature.
But there's also something in general I wish to talk about. I find it really counter-productive for you to frame design discussions as one person convincing the other that their perspective is "correct". Elvish is not the "correct" shell. It's never intended to be. I wouldn't even claim that Elvish is more "correct" than POSIX shell. It just makes much more sense to me. Obviously it is not entirely subjective - I listen to other people's arguments and try to understand them to the extent allowed by my experience. But at the end of the day I draw my conclusion, and that's what ends up being committed into Elvish's codebase. If you are on a quest to find the shell that feels most correct to you, the respectful thing to do is to start your own project, not to try to argue me into building Elvish in your preferred manner.
The Elvish project is not some kind of Unix shell elysium, and I'm no gatekeeper to such an elysium. The project is a small garden I take care of as a hobby. I open it up for other people to enjoy because that's part of the hobby, but ultimately I decide which flowers are grown in the garden. If you do decide to continue contributing to Elvish, please understand that your "correct" shell is never going to be exactly the same as my "correct" shell, and please respect everyone else's individual agency.
This change makes feeding output to commands which handle NUL-terminated "lines" (e.g., `fzf --read0` or `xargs -0`) extremely fast compared to using an explicit Elvish loop that does `print $val"\x00"`. Similarly for handling input from commands that produce NUL-terminated "lines" (e.g., `find . -print0`) compared to an Elvish loop using `read-upto "\x00"`. Resolves elves#1070 Related elves#1053
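For readers unfamiliar with the convention being discussed, here is a minimal sketch of what writing and reading NUL-terminated "lines" means. It is written in Python purely for illustration; it is not Elvish code and not the implementation from the commit above, and the function names `to_terminated` and `from_terminated` are hypothetical helpers invented for this example.

```python
def to_terminated(values, terminator="\0"):
    """Join values so that each one is followed by the terminator."""
    return "".join(value + terminator for value in values)


def from_terminated(data, terminator="\0"):
    """Split data on the terminator; a final unterminated chunk is kept."""
    chunks = data.split(terminator)
    # A trailing terminator leaves one empty chunk at the end; drop it.
    if chunks and chunks[-1] == "":
        chunks.pop()
    return chunks


# Hypothetical command-history entries; the second one contains a newline,
# which would be split in two if newline were used as the terminator.
history = ["ls -l", "echo 'two\nlines'", "git status"]
encoded = to_terminated(history)
decoded = from_terminated(encoded)
```

The round trip preserves the entry with the embedded newline, which is one reason a consumer such as `xargs -0` prefers NUL as the separator.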
I was working on a change to the Elvish unit test framework to avoid deadlocks due to the shell writing more output than can be buffered by a pipe. So I wrote a test that did, among other things, this:
What I want is exactly 128 KiB of the letter "x", without a trailing newline, written to stdout. What I get as a consequence of the `to-lines` is the 128 KiB of "x"s plus a newline. The solution was to replace the above with this:
I'm perfectly happy with the solution above. What I'm wondering is how many other people would have reached for the `to-lines` command like I did and perhaps not noticed they were getting an unwanted newline? Also, should there be an alternative, such as `to-lines &n`, to suppress the newline that might be easier to discover and more intuitive for some people?