Skip to content

Commit

Permalink
feat(fmt): fmt comments with local insertions (FuelLabs#4192)
Browse files Browse the repository at this point in the history
## Description

Closes FuelLabs#3938

### Problem

I've submitted a few PRs (FuelLabs#4129, FuelLabs#4093, FuelLabs#4005) implementing a new way of
formatting comments in tackling FuelLabs#3938, but after submitting FuelLabs#4129 and
following that working on formatting storage comments in a local branch
realized that the method of formatting previously by writing
`write_comments(..)` between every span is getting very unwieldy. That
method results in a lot of repeated calls to `write_comments()`, and in
some cases where we have to check if some value exists, it would lead to
even more vertical space taken up because we have to do something like
this:

```rust
...
if let Some(item) = my_item {
  write_comments(
    formatted_code,
    start..end,
    formatter,
  )
  // do other things
}
...
```

My sense is that this will probably lead to the formatter code being
hard to maintain in future because of the length of each `Format`
implementations, and having to work your way through a lot of
conditional code and repeated calls to `write_comments(..)` isn't a very
nice experience. In FuelLabs#4005, the comment formatting for `ItemAbi` alone
led to **56 additional line insertions** in that PR (previously it was
54 LoC, which means the LoC doubled!)

### Solution

We adapt the original approach of inserting comments, but this time on a
local scale. That means we parse each node on its own, compare its
unformatted and formatted versions, and find out where to insert
comments found between spans. Unlike the original approach, this
approach will parse and format while the AST node is being visited and
formatted - that means the entire `handle_comments(..)` that reparses
the entire `Module` can be deprecated once this PR is in.

This is done mainly through the function `rewrite_with_comments(..)`
that is called at the end of a `impl Format`. Essentially what this
function does is it parses the unformatted and formatted versions of the
currently visited node with `parse_snippet::<P: sway_parse::Parse +
Format>(..)`, finding the comments between the unformatted spans, and
inserting them into the formatted spans.

Based on the `CommentKind` of the found comments, we do some formatting
to decide how they should be written.

Of course, this also means that unlike the previously proposed method of
using `write_comments()` everywhere, where we have the indentation
context, this new approach means that we yet again do not have the local
context. To figure out indentation, we assume that each comment will
always be 'pinned' to some node below it - if this node does not exist
we try to 'pin' to a node above it instead.


## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand
areas.
- [ ] I have updated the documentation where relevant (API docs, the
reference, and the Sway book).
- [x] I have added tests that prove my fix is effective or that my
feature works.
- [ ] I have added (or requested a maintainer to add) the necessary
`Breaking*` or `New Feature` labels where relevant.
- [ ] I have done my best to ensure that my PR adheres to [the Fuel Labs
Code Review
Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.
  • Loading branch information
eightfilms authored Mar 10, 2023
1 parent d735e99 commit 318fcde
Show file tree
Hide file tree
Showing 22 changed files with 454 additions and 421 deletions.
4 changes: 2 additions & 2 deletions examples/config_time_constants/src/main.sw
Original file line number Diff line number Diff line change
Expand Up @@ -29,9 +29,9 @@ abi TestContract {
}

impl TestContract for Contract {
// ANCHOR: using_configurables
// ANCHOR: using_configurables
fn return_configurables() -> (u8, bool, [u32; 3], str[4], StructWithGeneric<u8>) {
(U8, BOOL, ARRAY, STR_4, STRUCT)
}
// ANCHOR_END: using_configurables
// ANCHOR_END: using_configurables
}
2 changes: 1 addition & 1 deletion examples/option/src/main.sw
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ fn main() {
match result {
// The division was valid
Option::Some(x) => std::logging::log(x),
// The division was invalid
// The division was invalid
Option::None => std::logging::log("Cannot divide by 0"),
}
}
2 changes: 1 addition & 1 deletion examples/storage_variables/src/main.sw
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ impl StorageExample for Contract {
storage.var2.z = true;
}
// ANCHOR_END: storage_write

// ANCHOR: storage_read
#[storage(read)]
fn get_something() -> (u64, u64, b256, bool) {
Expand Down
19 changes: 19 additions & 0 deletions sway-ast/src/token.rs
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,25 @@ pub enum CommentKind {
/// var foo = 1; // Trailing comment
/// ```
Trailing,

/// An inlined comment is a block comment nested between 2 tokens without a newline after it.
///
/// # Examples
///
/// ```sway
/// fn some_function(baz: /* inlined comment */ u64) {}
/// ```
Inlined,

/// A multiline comment is a block comment that may be nested between 2 tokens with 1 or more newlines within it.
///
/// # Examples
///
/// ```sway
/// fn some_function(baz: /* multiline
/// comment */ u64) {}
/// ```
Multilined,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
Expand Down
23 changes: 13 additions & 10 deletions sway-parse/src/token.rs
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,6 @@ pub fn lex_commented(
.filter(|&c| c == '\n')
.count()
> 0;

// We found a comment at the start of file, which should be accounted for as a Newlined comment.
let start_of_file_found = search_end == 0 && index == 0;

Expand Down Expand Up @@ -187,7 +186,7 @@ pub fn lex_commented(
continue;
}
Some((_, '*')) => {
if let Some(token) = lex_multiline_comment(&mut l, index) {
if let Some(token) = lex_block_comment(&mut l, index) {
token_trees.push(token);
}
continue;
Expand Down Expand Up @@ -393,7 +392,7 @@ fn lex_line_comment(
}
}

fn lex_multiline_comment(l: &mut Lexer<'_>, index: usize) -> Option<CommentedTokenTree> {
fn lex_block_comment(l: &mut Lexer<'_>, index: usize) -> Option<CommentedTokenTree> {
// Lexing a multi-line comment.
let _ = l.stream.next();
let mut unclosed_indices = vec![index];
Expand All @@ -405,6 +404,9 @@ fn lex_multiline_comment(l: &mut Lexer<'_>, index: usize) -> Option<CommentedTok
None
};

// We first start by assuming that block comments are inlined.
let mut comment_kind = CommentKind::Inlined;

loop {
match l.stream.next() {
None => return unclosed_multiline_comment(l, unclosed_indices),
Expand All @@ -419,13 +421,7 @@ fn lex_multiline_comment(l: &mut Lexer<'_>, index: usize) -> Option<CommentedTok
// We could represent them as several ones, but that's unnecessary.
let end = slash_ix + '/'.len_utf8();
let span = span(l, start, end);
return Some(
Comment {
span,
comment_kind: CommentKind::Newlined,
}
.into(),
);
return Some(Comment { span, comment_kind }.into());
}
}
Some(_) => {}
Expand All @@ -436,6 +432,13 @@ fn lex_multiline_comment(l: &mut Lexer<'_>, index: usize) -> Option<CommentedTok
Some((_, '*')) => unclosed_indices.push(next_index),
Some(_) => {}
},
Some((_, '\n')) => {
// If we find a newline character while lexing, this means that the block comment is multiline.
// Example:
// /* this is a
// multilined block comment */
comment_kind = CommentKind::Multilined;
}
Some(_) => {}
}
}
Expand Down
248 changes: 246 additions & 2 deletions swayfmt/src/comments.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,16 @@
use ropey::Rope;
use std::{fmt::Write, ops::Range};
use sway_ast::token::{Comment, CommentKind};
use sway_types::Spanned;
use sway_types::{Span, Spanned};

use crate::{
formatter::FormattedCode, utils::map::comments::CommentMap, Formatter, FormatterError,
formatter::FormattedCode,
parse::parse_snippet,
utils::map::{
byte_span::{ByteSpan, LeafSpans},
comments::CommentMap,
},
Format, Formatter, FormatterError,
};

pub type UnformattedCode = String;
Expand Down Expand Up @@ -60,6 +67,9 @@ fn write_trailing_comment(
/// Given a range, writes comments contained within the range. This function
/// removes comments that are written here from the CommentMap for later use.
///
/// Most comment formatting should be done using `rewrite_with_comments` in
/// the context of the AST, but in some cases (eg. at the end of module) we require this function.
///
/// Returns:
/// `Ok(true)` on successful execution with comments written,
/// `Ok(false)` on successful execution and if there are no comments within the given range,
Expand Down Expand Up @@ -102,6 +112,13 @@ pub fn write_comments(
CommentKind::Trailing => {
write_trailing_comment(formatted_code, comment)?;
}
CommentKind::Inlined => {
// We do a trim and truncate here to ensure that only a single whitespace separates
// the inlined comment from the previous token.
formatted_code.truncate(formatted_code.trim_end().len());
write!(formatted_code, " {} ", comment.span().as_str(),)?;
}
CommentKind::Multilined => {}
}
}
}
Expand All @@ -117,6 +134,233 @@ pub fn write_comments(
Ok(true)
}

/// The main function that rewrites a piece of formatted code with comments found in its unformatted version.
///
/// This takes a given AST node's unformatted span, its leaf spans and its formatted code (a string) and
/// parses the equivalent formatted version to get its leaf spans. We traverse the spaces between both
/// formatted and unformatted leaf spans to find possible comments and inserts them between.
pub fn rewrite_with_comments<T: sway_parse::Parse + Format + LeafSpans>(
formatter: &mut Formatter,
unformatted_span: Span,
unformatted_leaf_spans: Vec<ByteSpan>,
formatted_code: &mut FormattedCode,
last_formatted: usize,
) -> Result<(), FormatterError> {
// Since we are adding comments into formatted code, in the next iteration the spans we find for the formatted code needs to be offsetted
// as the total length of comments we added in previous iterations.
let mut offset = 0;
let mut to_rewrite = formatted_code[last_formatted..].to_string();

let formatted_leaf_spans = parse_snippet::<T>(&formatted_code[last_formatted..])
.unwrap()
.leaf_spans();

let mut previous_unformatted_leaf_span = unformatted_leaf_spans
.first()
.ok_or(FormatterError::CommentError)?;
let mut previous_formatted_leaf_span = formatted_leaf_spans
.first()
.ok_or(FormatterError::CommentError)?;
for (unformatted_leaf_span, formatted_leaf_span) in unformatted_leaf_spans
.iter()
.zip(formatted_leaf_spans.iter())
{
// Search for comments between the previous leaf span's end and the next leaf span's start
let range = std::ops::Range {
start: previous_unformatted_leaf_span.end,
end: unformatted_leaf_span.start,
};
let iter = formatter.comments_context.map.comments_between(&range);

let mut comments_found = vec![];
for i in iter {
comments_found.push(i.clone());
}

if !comments_found.is_empty() {
let extra_newlines = collect_extra_newlines(unformatted_span.clone(), &comments_found);

offset += insert_after_span(
previous_formatted_leaf_span,
comments_found.clone(),
offset,
&mut to_rewrite,
extra_newlines,
)?;

formatter
.comments_context
.map
.retain(|bs, _| !bs.contained_within(&range))
}

previous_unformatted_leaf_span = unformatted_leaf_span;
previous_formatted_leaf_span = formatted_leaf_span;
}

formatted_code.truncate(last_formatted);
write!(formatted_code, "{to_rewrite}")?;
Ok(())
}

/// Collect extra newline before comment(s). The main purpose of this function is to maintain
/// newlines between comments when inserting multiple comments at once.
fn collect_extra_newlines(unformatted_span: Span, comments_found: &Vec<Comment>) -> Vec<usize> {
// The first comment is always assumed to have no extra newlines before itself.
let mut extra_newlines = vec![0];

if comments_found.len() == 1 {
return extra_newlines;
}

let mut prev_comment: Option<&Comment> = None;
for comment in comments_found {
if let Some(prev_comment) = prev_comment {
// Get whitespace between the end of the previous comment and the start of the next.
let whitespace_between = unformatted_span.as_str()[prev_comment.span().end()
- unformatted_span.start()
..comment.span().start() - unformatted_span.start()]
.to_string();

// Count the number of newline characters we found above.
// By default, we want 0 extra newlines, but if there are more than 1 extra newline, we want to squash it to 1.
let mut extra_newlines_count = 0;
if whitespace_between.chars().filter(|&c| c == '\n').count() > 1 {
extra_newlines_count = 1;
};

extra_newlines.push(extra_newlines_count);
}

prev_comment = Some(comment);
}

extra_newlines
}

/// Check if a block is empty. When formatted without comments, empty code blocks are formatted into "{}", which is what this check is for.
fn is_empty_block(formatted_code: &FormattedCode, end: usize) -> bool {
formatted_code.chars().nth(end - 1) == Some('{') && formatted_code.chars().nth(end) == Some('}')
}

/// Main driver of writing comments. This function is a no-op if the block of code is empty.
///
/// This iterates through comments inserts each of them after a given span and returns the offset.
/// While inserting comments this also inserts whitespaces/newlines so that alignment is intact.
/// To do the above, there are some whitespace heuristics we stick to:
///
/// 1) Assume comments are anchored to the line below, and follow its alignment.
/// 2) In some cases the line below is the end of the function eg. it contains only a closing brace '}'.
/// in such cases we then try to anchor the comment to the line above.
/// 3) In the cases of entirely empty blocks we actually should prefer using `write_comments` over
/// `rewrite_with_comments` since `write_comments` would have the formatter's indentation context.
fn insert_after_span(
from: &ByteSpan,
comments_to_insert: Vec<Comment>,
offset: usize,
formatted_code: &mut FormattedCode,
extra_newlines: Vec<usize>,
) -> Result<usize, FormatterError> {
let mut comment_str = String::new();

// We want to anchor the comment to the next line, and here,
// we make the assumption here that comments will never be right before the final leaf span.
let mut indent = formatted_code[from.end + offset..]
.chars()
.take_while(|c| c.is_whitespace())
.collect::<String>();

// In the case of empty blocks, we do not know the indentation of comments at that time.
// Writing comments in empty blocks has to be deferred to `write_comments` instead, which will
// contain the Formatter's indentation context.
if !is_empty_block(formatted_code, from.end) {
// There can be cases where comments are at the end.
// If so, we try to 'pin' our comment's indentation to the previous line instead.
if formatted_code.chars().nth(from.end + offset + indent.len()) == Some('}') {
// It could be possible that the first comment found here is a Trailing,
// then a Newlined.
// We want all subsequent newlined comments to follow the indentation of the
// previous line that is NOT a comment.
if comments_to_insert
.iter()
.any(|c| c.comment_kind == CommentKind::Newlined)
{
// Find and assign the indentation of the previous line to `indent`.
let prev_line = formatted_code[..from.end + offset]
.trim_end()
.chars()
.rev()
.take_while(|&c| c != '\n')
.collect::<String>();
indent = prev_line
.chars()
.rev()
.take_while(|c| c.is_whitespace())
.collect();
if let Some(comment) = comments_to_insert.first() {
if comment.comment_kind != CommentKind::Trailing {
comment_str.push('\n');
}
}
}
}

for (comment, extra_newlines) in comments_to_insert.iter().zip(extra_newlines) {
// Check for newlines to preserve.
for _ in 0..extra_newlines {
comment_str.push('\n');
}

match comment.comment_kind {
CommentKind::Trailing => {
if comments_to_insert.len() > 1 && indent.starts_with('\n') {
write!(comment_str, " {}", comment.span().as_str())?;
} else {
writeln!(comment_str, " {}", comment.span().as_str())?;
}
}
CommentKind::Newlined => {
if comments_to_insert.len() > 1 && indent.starts_with('\n') {
write!(comment_str, "{}{}", indent, comment.span().as_str())?;
} else {
writeln!(comment_str, "{}{}", indent, comment.span().as_str())?;
}
}
CommentKind::Inlined => {
if !formatted_code[..from.end].ends_with(' ') {
write!(comment_str, " ")?;
}
write!(comment_str, "{}", comment.span().as_str())?;
if !formatted_code[from.end + offset..].starts_with([' ', '\n']) {
write!(comment_str, " ")?;
}
}
CommentKind::Multilined => {
write!(comment_str, "{}{}", indent, comment.span().as_str())?;
}
};
}

let mut src_rope = Rope::from_str(formatted_code);

// We do a sanity check here to ensure that we don't insert an extra newline
// if the place at which we're going to insert comments already ends with '\n'.
if let Some(char) = src_rope.get_char(from.end + offset) {
if char == '\n' && comment_str.ends_with('\n') {
comment_str.pop();
}
};

// Insert the actual comment(s).
src_rope.insert(from.end + offset, &comment_str);

formatted_code.clear();
formatted_code.push_str(&src_rope.to_string());
}

Ok(comment_str.len())
}

#[cfg(test)]
mod tests {
use crate::utils::map::byte_span::ByteSpan;
Expand Down
Loading

0 comments on commit 318fcde

Please sign in to comment.