Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add time builtin #295

Closed
huiyiqun opened this issue Dec 7, 2016 · 14 comments
Closed

Add time builtin #295

huiyiqun opened this issue Dec 7, 2016 · 14 comments
Milestone

Comments

@huiyiqun
Copy link
Contributor

huiyiqun commented Dec 7, 2016

It's very inconvenient without it.

@Artoria2e5
Copy link

Artoria2e5 commented Dec 9, 2016

If you are looking for a "builtin command", GNU's time will certainly suffice. What we need from bash is the time syntax keyword, which can time a whole block of code as well as loops.

@huiyiqun
Copy link
Contributor Author

huiyiqun commented Dec 9, 2016

I guess buitin is enough for it could execute a function and outout a time delta?

@xiaq
Copy link
Member

xiaq commented Dec 9, 2016

There is an experimental -time builtin that takes a callable $fn, calls it, and writes out the time it takes:

~> -time { echo hello }
hello
286µs

It is a tricky to save the time in a variable and leave the output of $fn untouched, which is precisely why it is now experimental.

@Artoria2e5 there isn't a need for time to be a syntax keyboard since elvish has lambdas.

@huiyiqun
Copy link
Contributor Author

huiyiqun commented Dec 9, 2016

How about separating user time and system time?

@krader1961
Copy link
Contributor

It is a tricky to save the time in a variable and leave the output of $fn untouched...

Perhaps add a time option that specifies the var in which to store the time(s)? For example, -time &var=times { put time me }; put $@times. If the option is not used the time(s) are put to stdout in a human readable form as happens now and will be mixed into the stdout of the command/function. Note that if the option is used the value is a float64, not a human readable string, and has millisecond or microsecond precision. You can argue which units should be used but for this particular command microseconds should be good enough for the foreseeable future. Too, milliseconds is probably sufficiently granular today making microseconds a bit of future proofing.

@zzamboni
Copy link
Contributor

AFAICT, -time's output always has the following structure:

output from lambda
time of lambda

e.g.:

[~]─> put (-time { put abc })
▶ abc
▶ 26.887µs
[~]─> put (-time { put abc bcd })
▶ abc
▶ bcd
▶ 86.305µs
[~]─> put (-time { nop })
▶ 29.738µs
[~]─> put (-time { ls -d / })
▶ /
▶ 17.893638ms
[~]─> put (-time { ls / })
▶ Applications
▶ bin
▶ cores
...
▶ 12.060905ms

In my opinion, as long as this is consistent, the time and the output can be cleanly separated:

[~]─> @res = (-time { ls / })
[~]─> output time = $res[:-1] $res[-1]
[~]─> put $output
▶ [Applications bin cores dev Library opt private sbin System Users usr Volumes 'etc -> private/etc' 'home -> /System/Volumes/Data/home' 'Network -> /System/Volumes/Data/Network' 'tmp -> private/tmp' 'var -> private/var']
[~]─> put $time
▶ 14.491929ms

A couple of comments:

  • It would be nice if the output were returned first, so that separating it wouldn't be a two-step process (one could do time @out = (-time { ls / } )), but I think this would mean caching all the output from the lambda, so not sure if it's possible.
  • It would be nice to return the value in some defined unit as a number instead of specifying a unit suffix.

@krader1961
Copy link
Contributor

It would be nice to return the value in some defined unit as a number instead of specifying a unit suffix.

@zzamboni, I think there are two distinct use cases. The most common is using it interactively in an adhoc manner to time the execution of a function or command. In which case it should automatically write appropriately scaled values with named units (e.g., 3.7s) to stdout or stderr. The other is non-interactive where the function using time wants to capture the data. In that case it should be a set of float64 values in a specific unit (probably microseconds but perhaps milliseconds).

The main problem with the time command of POSIX shells is that they always write their output to the stderr byte stream. Which makes using them in a non-interactive context exceedingly difficult and prone to errors given that the command being timed can also write to the stderr byte stream. Elvish can do better by making it safe to use time in a non-interactive context without resorting to parsing the output. This is especially important with respect to future proofing the feature. What happens when we want time to include other information in its output?

@krader1961
Copy link
Contributor

It would be better, IMHO, if the output didn't have to be captured and split apart for a couple of reasons. First, it makes it harder to enhance the time command with additional information. It is fairly obvious the output should include elapsed, user mode, and system mode time in the next version of this command. What isn't obvious is whether we'll want to include other information in the future. Second, it is almost certainly going to be less efficient both in terms of memory and CPU. Capturing the output in a map var makes that far easier to do in a backward compatible manner with minimal overhead. In fact, I'm inclined to make capturing the output in a locally scoped var the default behavior. Then either the time command can automatically format it in a human readable form if used interactively, or an option can be added to do so, or an adjunct command can be used to format the data which makes it possible for the user to write a function to customize how the info is displayed.

Capturing the data in a map also makes it easy to wrap its use in a function that can do useful things like run the timed function n times and display just the best of the n runs or perform a basic statistical analysis of the results.

@xiaq
Copy link
Member

xiaq commented Apr 21, 2020

@krader1961 adding an option for which variable to save the time in can work but it requires making time. Normal commands do not have access to variables in other scopes.

A better way is to let time take a callback. It is slightly more verbose but also more flexible:

# To save time info in a variable:
time-info = ''
time { do-expensive-things } &cb=[i]{ time-info = $i }
# To write time info to stderr (default behavior when &cb is unspecified):
time { do-expensive-things } &cb=[i]{ echo $i >&2 }

The option needs a better name than &cb though.

@xiaq xiaq added this to the 0.14 milestone Apr 21, 2020
@xiaq xiaq closed this as completed in 90a34f2 Apr 26, 2020
@krader1961
Copy link
Contributor

@xiaq, Any objection to augmenting the callback to take a map rather than a simple scalar? It would be nice if there were a simple means of passing additional information in a backwards compatible manner. Specifically, I'm envisioning adding values like system and user mode CPU time on platforms where that makes sense (e.g., all UNIX platforms and probably Windows). For now it's fine if the map has a single elapsed key.

@xiaq
Copy link
Member

xiaq commented Apr 27, 2020

@krader1961 What you described is only applicable to external commands; see #990.

@krader1961
Copy link
Contributor

What you described is only applicable to external commands; see #990.

It is also applicable to arbitrary elvish code. Yes, there is the risk that work done by unrelated threads will be included in the data. That needs to be clearly documented. But that will normally be an insignificant source of variation.

@xiaq
Copy link
Member

xiaq commented Apr 28, 2020

@krader1961 I think you mean implementing this by taking the difference in rusage before and after the execution? That indeed works in simple scenarios but as you acknowledged, it includes all the work done by unrelated threads. This might be not so common, but it makes the API misleading on a fundamental level - if the command takes a closure, the user should expect it to give you information only pertaining to the closure.

A better approach is to expose Getrusage, and the user can take measurements themselves before and after executing a piece of code, and calculate the difference themselves.

@xiaq
Copy link
Member

xiaq commented Apr 28, 2020

See #992.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants