diff options
author | Eric Sunshine <sunshine@sunshineco.com> | 2022-11-08 20:08:29 +0100 |
---|---|---|
committer | Taylor Blau <me@ttaylorr.com> | 2022-11-08 21:10:49 +0100 |
commit | 5f0321a9f25bd1366a6f13c2e2b51d6e1e2a5ded (patch) | |
tree | 7d68756da92ab7c7788082776db6e1d2b17a5f68 /shell.c | |
parent | chainlint: tighten accuracy when consuming input stream (diff) | |
download | git-5f0321a9f25bd1366a6f13c2e2b51d6e1e2a5ded.tar.xz git-5f0321a9f25bd1366a6f13c2e2b51d6e1e2a5ded.zip |
chainlint: latch start/end position of each token
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.
In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:
package Token;
sub new {
my ($class, $lexeme, $start, $end) = @_;
bless [$lexeme, $start, $end] => $class;
}
sub as_string {
my $self = shift @_;
return $self->[0];
}
sub compare {
my ($x, $y) = @_;
if (UNIVERSAL::isa($y, 'Token')) {
return $x->[0] cmp $y->[0];
}
return $x->[0] cmp $y;
}
use overload (
'""' => 'as_string',
'cmp' => 'compare'
);
The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.
The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Diffstat (limited to 'shell.c')
0 files changed, 0 insertions, 0 deletions