Optimize LSP parse_project (FuelLabs#4414)
## Description

Closes FuelLabs#4413

This change drastically improves the UX of the VSCode plugin in GitHub
Codespaces. It's noticeably faster running locally too. The key
optimizations are:

#### Only one thread of `parse_project` can run at a time

- A write lock taken on the cached `diagnostics` forces each new
`parse_project` execution to wait until the previous one has finished
- A semaphore with 2 permits, accessed with `try_acquire`, means that at
most one thread of `parse_project` can wait on the lock at a time. When
the first one finishes, the waiting one runs against the latest version
of the user's code. All others fail to acquire a permit and exit
immediately.

This was a big improvement because previously, as the user was typing,
5+ threads of `parse_project` would kick off simultaneously, all
operating on the same data stores (taking turns with read & write
locks). We only really care that the most recent version of the user's
code is compiled & indexed.
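The gating described above can be sketched with a std-only analogue. The real implementation uses `tokio::sync::Semaphore` and parking_lot locks; the `ParseGate` type, field names, and counts below are illustrative stand-ins, not the actual LSP code:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical std-only analogue of the tokio pattern described above:
// a two-permit gate in front of a lock. At most one thread is parsing,
// at most one more is waiting, and everyone else bails out immediately.
struct ParseGate {
    permits: AtomicUsize, // stands in for tokio::sync::Semaphore::new(2)
    parsing: Mutex<()>,   // stands in for the write lock on `diagnostics`
}

impl ParseGate {
    fn new() -> Self {
        Self {
            permits: AtomicUsize::new(2),
            parsing: Mutex::new(()),
        }
    }

    /// Returns true if this caller got to parse, false if it bailed out.
    fn parse(&self, parses_run: &AtomicUsize) -> bool {
        // try_acquire: take a permit only if one is free; never block here.
        let got_permit = self
            .permits
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |p| p.checked_sub(1))
            .is_ok();
        if !got_permit {
            return false; // one parse is running and one is already queued
        }
        let _guard = self.parsing.lock().unwrap(); // wait for the active parse
        parses_run.fetch_add(1, Ordering::SeqCst); // stand-in for compiling
        self.permits.fetch_add(1, Ordering::SeqCst); // release the permit
        true
    }
}

fn main() {
    let gate = Arc::new(ParseGate::new());
    let parses_run = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let (gate, parses_run) = (Arc::clone(&gate), Arc::clone(&parses_run));
            thread::spawn(move || gate.parse(&parses_run))
        })
        .collect();
    let ran = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|&got_to_parse| got_to_parse)
        .count();
    assert!(ran >= 1 && ran <= 8);
    assert_eq!(parses_run.load(Ordering::SeqCst), ran);
    println!("{} of 8 threads actually parsed", ran);
}
```

The exact number of threads that parse depends on timing, but every thread that fails `try_acquire` returns immediately instead of piling up on the lock.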

#### Hot swapping the engines with write lock

Previously, we only used a write lock to clear the engines, then read
locks to do the rest of the parsing. After limiting the number of
parsing threads and the resulting latency improvement, I encountered an
issue where we'd sometimes request stale TypeIds from the engines. To
prevent this, I'm using a new set of engines for compilation
(`pkg::check`), and only storing them in the session once compilation is
complete. The write lock applies only to the swapping process, but
ensures that no readers will read incomplete data.

- This also means we clear the data stores _after_ compilation is
complete.
- I had to remove the `read` lock from `create_runnables`, since it
caused a deadlock (`create_runnables` is called from the section of code
that now holds a write lock)
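A minimal sketch of this swap pattern, with a stand-in `Engine` type in place of the real `TypeEngine`/`DeclEngine` (names and types here are illustrative, not the actual LSP code):

```rust
use std::sync::RwLock;

// Hypothetical sketch of the "hot swap" described above: compile into a
// brand-new engine with no lock held, then take the write lock only for
// the cheap swap. Readers see either the complete old state or the
// complete new state, never a half-cleared one.
#[derive(Default)]
struct Engine {
    entries: Vec<String>,
}

struct Session {
    engine: RwLock<Engine>,
}

impl Session {
    fn parse_project(&self, source: &[&str]) {
        // 1. Compile into a fresh engine. No lock is held here, so readers
        //    keep using the previous results while we work.
        let mut new_engine = Engine::default();
        for item in source {
            new_engine.entries.push(item.to_string()); // stand-in for pkg::check
        }

        // 2. Swap under the write lock. Readers block only for this moment.
        let mut engine = self.engine.write().unwrap();
        *engine = new_engine;
        // (Other stores, like the token map, would be cleared and refilled
        // here while the write guard is still held.)
    }
}

fn main() {
    let session = Session {
        engine: RwLock::new(Engine::default()),
    };
    session.parse_project(&["fn main", "fn test"]);
    assert_eq!(session.engine.read().unwrap().entries.len(), 2);
    println!("engine holds {} entries", session.engine.read().unwrap().entries.len());
}
```

The key design choice is that the expensive work happens against data no reader can see yet, so the write lock's critical section shrinks to a pointer-sized swap.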

#### Spawn blocking for `parse_project`

We're now calling the synchronous `session::parse_project` in a
[spawn_blocking](https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html)
thread so that it doesn't block tokio's runtime, allowing other requests
like `inlay_hints` to be processed on whatever runtime threads are
available.
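The effect can be illustrated with plain threads. The server itself uses `tokio::task::spawn_blocking`; this std-only sketch (with a hypothetical `parse_project_sync` stand-in) just shows why moving the synchronous parse off the serving thread keeps other requests responsive:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for the slow, synchronous session::parse_project call.
fn parse_project_sync() -> &'static str {
    thread::sleep(Duration::from_millis(50)); // stand-in for compilation
    "diagnostics"
}

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Blocking work happens on its own thread...
        tx.send(parse_project_sync()).unwrap();
    });

    // ...so this thread stays free to service cheap requests meanwhile.
    let mut served = 0;
    loop {
        match rx.try_recv() {
            Ok(result) => {
                println!("parse finished: {}, served {} requests meanwhile", result, served);
                break;
            }
            Err(mpsc::TryRecvError::Empty) => {
                served += 1; // stand-in for handling e.g. inlay_hints
                thread::sleep(Duration::from_millis(1));
            }
            Err(e) => panic!("{}", e),
        }
    }
    assert!(served > 0);
}
```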

#### Publish compiler diagnostics only when they're fresh

Since we have "passive" threads of `parse_project` that simply wait for
the active thread to finish, there's no need for them to publish their
results: they'd be identical to what the active thread before them
already gathered. In practice this cuts the number of
`publishDiagnostics` messages roughly in half, freeing the server to
handle other requests. It also reduces the number of squiggly lines
jumping around reporting warnings/errors while the user is typing.
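On the caller side the rule is simply "publish only if the parse was fresh". A sketch of that control flow, with hypothetical function names (the real `parse_project` returns `Result<bool, LanguageServerError>`):

```rust
// Hypothetical sketch: parse_project reports whether it actually produced
// fresh results; passive callers (no permit) return Ok(false) and stay silent.
fn parse_project(got_permit: bool) -> Result<bool, String> {
    if !got_permit {
        return Ok(false); // another thread is parsing and one is already queued
    }
    // ... compile, swap engines, cache diagnostics ...
    Ok(true)
}

fn on_did_change(got_permit: bool, published: &mut u32) {
    if let Ok(true) = parse_project(got_permit) {
        *published += 1; // stand-in for client.publish_diagnostics(...)
    }
}

fn main() {
    let mut published = 0;
    // A burst of edits: only the calls that won a permit publish.
    for got_permit in [true, false, false, true, false] {
        on_did_change(got_permit, &mut published);
    }
    assert_eq!(published, 2);
    println!("published {} of 5 bursts", published);
}
```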

#### Performance

Here's a video of it running in GitHub Codespaces on a **2-core**
machine, the smallest option available. Previously it was taking
upwards of 1 minute to render inlay hints, mostly due to requests timing
out and getting cancelled. Now it takes 2ms plus the ~1 second of
compile time :-)


https://user-images.githubusercontent.com/47993817/230686772-a521c072-19c5-4729-8dad-a48630b8735d.mp4

#### Notes

I had to increase the timeout on a test that waits for diagnostics to be
published. I suspect this means that parsing takes slightly longer
because of the locks, but overall the LSP performs better because it's
doing less throwaway work.

## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand
areas.
- [ ] I have updated the documentation where relevant (API docs, the
reference, and the Sway book).
- [ ] I have added tests that prove my fix is effective or that my
feature works.
- [ ] I have added (or requested a maintainer to add) the necessary
`Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs
Code Review
Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.

---------

Co-authored-by: Joshua Batty <[email protected]>
sdankel and JoshuaBatty authored Apr 12, 2023
1 parent 999557c commit cd936a5
Showing 5 changed files with 95 additions and 69 deletions.
2 changes: 1 addition & 1 deletion sway-lsp/src/capabilities/diagnostic.rs
@@ -4,7 +4,7 @@ use sway_error::warning::CompileWarning;
use sway_error::{error::CompileError, warning::Warning};
use sway_types::{LineCol, Spanned};

#[derive(Debug)]
#[derive(Debug, Default, Clone)]
pub struct Diagnostics {
pub warnings: Vec<Diagnostic>,
pub errors: Vec<Diagnostic>,
97 changes: 62 additions & 35 deletions sway-lsp/src/core/session.rs
@@ -32,6 +32,7 @@ use sway_core::{
};
use sway_types::{Span, Spanned};
use sway_utils::helpers::get_sway_files;
use tokio::sync::Semaphore;
use tower_lsp::lsp_types::{
CompletionItem, GotoDefinitionResponse, Location, Position, Range, SymbolInformation,
TextDocumentContentChangeEvent, TextEdit, Url,
@@ -60,6 +61,11 @@ pub struct Session {
pub type_engine: RwLock<TypeEngine>,
pub decl_engine: RwLock<DeclEngine>,
pub sync: SyncWorkspace,
// Limit the number of threads that can wait to parse at the same time. One thread can be parsing
// and one thread can be waiting to start parsing. All others will return the cached diagnostics.
parse_permits: Arc<Semaphore>,
// Cached diagnostic results that require a lock to access. Readers will wait for writers to complete.
diagnostics: Arc<RwLock<Diagnostics>>,
}

impl Session {
@@ -72,6 +78,8 @@ impl Session {
type_engine: <_>::default(),
decl_engine: <_>::default(),
sync: SyncWorkspace::new(),
parse_permits: Arc::new(Semaphore::new(2)),
diagnostics: Arc::new(RwLock::new(Diagnostics::default())),
}
}

@@ -108,9 +116,22 @@ impl Session {
&self.token_map
}

pub fn parse_project(&self, uri: &Url) -> Result<Diagnostics, LanguageServerError> {
self.token_map.clear();
self.runnables.clear();
/// Wait for the cached [Diagnostics] to be unlocked after parsing and return a copy.
pub fn wait_for_parsing(&self) -> Diagnostics {
self.diagnostics.read().clone()
}

/// Parses the project and returns true if the compiler diagnostics are new and should be published.
pub fn parse_project(&self, uri: &Url) -> Result<bool, LanguageServerError> {
// Acquire a permit to parse the project. If there are none available, return false. This way,
// we avoid publishing the same diagnostics multiple times.
let permit = self.parse_permits.try_acquire();
if permit.is_err() {
return Ok(false);
}

// Lock the diagnostics result to prevent multiple threads from parsing the project at the same time.
let mut diagnostics = self.diagnostics.write();

let manifest_dir = PathBuf::from(uri.path());
let locked = false;
@@ -140,19 +161,35 @@ impl Session {
pkg::BuildPlan::from_lock_and_manifests(&lock_path, &member_manifests, locked, offline)
.map_err(LanguageServerError::BuildPlanFailed)?;

let mut diagnostics = Diagnostics {
warnings: vec![],
errors: vec![],
};

*self.type_engine.write() = <_>::default();
*self.decl_engine.write() = <_>::default();
let type_engine = &*self.type_engine.read();
let decl_engine = &*self.decl_engine.read();
let engines = Engines::new(type_engine, decl_engine);
let new_type_engine = TypeEngine::default();
let new_decl_engine = DeclEngine::default();
let tests_enabled = true;
let results = pkg::check(&plan, BuildTarget::default(), true, tests_enabled, engines)
.map_err(LanguageServerError::FailedToCompile)?;

let results = pkg::check(
&plan,
BuildTarget::default(),
true,
tests_enabled,
Engines::new(&new_type_engine, &new_decl_engine),
)
.map_err(LanguageServerError::FailedToCompile)?;

// Acquire locks for the engines before clearing anything.
let mut type_engine = self.type_engine.write();
let mut decl_engine = self.decl_engine.write();

// Update the engines with the new data.
*type_engine = new_type_engine;
*decl_engine = new_decl_engine;

// Clear other data stores.
self.token_map.clear();
self.runnables.clear();

// Create context with write guards to make readers wait until the update to token_map is complete.
// This operation is fast because we already have the compile results.
let ctx = ParseContext::new(&self.token_map, Engines::new(&type_engine, &decl_engine));

let results_len = results.len();
for (i, res) in results.into_iter().enumerate() {
// We can convert these destructured elements to a Vec<Diagnostic> later on.
@@ -172,8 +209,12 @@
} = value.unwrap();

let ast_res = CompileResult::new(typed, warnings, errors);
let typed_program = self.compile_res_to_typed_program(&ast_res)?;
let ctx = ParseContext::new(&self.token_map, engines);

// Get a reference to the typed program AST.
let typed_program = ast_res.value.as_ref().ok_or_else(|| {
*diagnostics = get_diagnostics(&ast_res.warnings, &ast_res.errors);
LanguageServerError::FailedToParse
})?;

// The final element in the results is the main program.
if i == results_len - 1 {
@@ -186,7 +227,7 @@
self.parse_ast_to_tokens(&parsed, &ctx, |an, _ctx| parsed_tree.traverse_node(an));

// Finally, create runnables and populate our token_map with typed ast nodes.
self.create_runnables(typed_program);
self.create_runnables(typed_program, &decl_engine);

let typed_tree = TypedTree::new(&ctx, &typed_program.root.namespace);
typed_tree.collect_module_spans(typed_program);
@@ -198,7 +239,7 @@
self.save_parsed_program(parsed.to_owned().clone());
self.save_typed_program(typed_program.to_owned().clone());

diagnostics = get_diagnostics(&ast_res.warnings, &ast_res.errors);
*diagnostics = get_diagnostics(&ast_res.warnings, &ast_res.errors);
} else {
// Collect tokens from dependencies and the standard library prelude.
self.parse_ast_to_tokens(&parsed, &ctx, |an, ctx| {
Expand All @@ -210,7 +251,7 @@ impl Session {
});
}
}
Ok(diagnostics)
Ok(true)
}

pub fn token_ranges(&self, url: &Url, position: Position) -> Option<Vec<Range>> {
@@ -414,23 +455,9 @@ impl Session {
root_nodes.chain(sub_nodes).for_each(|n| f(n, ctx));
}

/// Get a reference to the [ty::TyProgram] AST.
fn compile_res_to_typed_program<'a>(
&'a self,
ast_res: &'a CompileResult<ty::TyProgram>,
) -> Result<&'a ty::TyProgram, LanguageServerError> {
ast_res
.value
.as_ref()
.ok_or(LanguageServerError::FailedToParse {
diagnostics: get_diagnostics(&ast_res.warnings, &ast_res.errors),
})
}

/// Create runnables if the `TyProgramKind` of the `TyProgram` is a script.
fn create_runnables(&self, typed_program: &ty::TyProgram) {
fn create_runnables(&self, typed_program: &ty::TyProgram, decl_engine: &DeclEngine) {
// Insert runnable test functions.
let decl_engine = &self.decl_engine.read();
for (decl, _) in typed_program.test_fns(decl_engine) {
// Get the span of the first attribute if it exists, otherwise use the span of the function name.
let span = decl
4 changes: 1 addition & 3 deletions sway-lsp/src/error.rs
@@ -1,8 +1,6 @@
use swayfmt::FormatterError;
use thiserror::Error;

use crate::capabilities::diagnostic::Diagnostics;

#[derive(Debug, Error)]
pub enum LanguageServerError {
// Inherited errors
@@ -17,7 +15,7 @@ pub enum LanguageServerError {
#[error("Failed to compile. {0}")]
FailedToCompile(anyhow::Error),
#[error("Failed to parse document")]
FailedToParse { diagnostics: Diagnostics },
FailedToParse,
#[error("Error formatting document: {0}")]
FormatError(FormatterError),
}
59 changes: 30 additions & 29 deletions sway-lsp/src/server.rs
@@ -1,6 +1,6 @@
pub use crate::error::DocumentError;
use crate::{
capabilities::{self, diagnostic::Diagnostics},
capabilities,
config::{Config, Warnings},
core::{session::Session, sync},
error::{DirectoryError, LanguageServerError},
@@ -19,6 +19,7 @@ use std::{
sync::Arc,
};
use sway_types::{Ident, Spanned};
use tokio::task;
use tower_lsp::lsp_types::*;
use tower_lsp::{jsonrpc, Client, LanguageServer};
use tracing::metadata::LevelFilter;
@@ -62,26 +63,27 @@ impl Backend {
}

async fn parse_project(&self, uri: Url, workspace_uri: Url, session: Arc<Session>) {
// pass in the temp Url into parse_project, we can now get the updated AST's back.
let diagnostics = match session.parse_project(&uri) {
Ok(diagnostics) => diagnostics,
Err(err) => {
tracing::error!("{}", err.to_string().as_str());
if let LanguageServerError::FailedToParse { diagnostics } = err {
diagnostics
} else {
Diagnostics {
warnings: vec![],
errors: vec![],
}
}
}
};
self.publish_diagnostics(&uri, &workspace_uri, session, diagnostics)
.await;
let should_publish = run_blocking_parse_project(uri.clone(), session.clone()).await;
if should_publish {
self.publish_diagnostics(&uri, &workspace_uri, session)
.await;
}
}
}

/// Runs parse_project in a blocking thread, because parsing is not async.
async fn run_blocking_parse_project(uri: Url, session: Arc<Session>) -> bool {
task::spawn_blocking(move || match session.parse_project(&uri) {
Ok(should_publish) => should_publish,
Err(err) => {
tracing::error!("{}", err);
matches!(err, LanguageServerError::FailedToParse)
}
})
.await
.unwrap_or_default()
}

/// Returns the capabilities of the server to the client,
/// indicating its support for various language server protocol features.
pub fn capabilities() -> ServerCapabilities {
@@ -150,13 +152,7 @@ impl Backend {
Ok(session)
}

async fn publish_diagnostics(
&self,
uri: &Url,
workspace_uri: &Url,
session: Arc<Session>,
diagnostics: Diagnostics,
) {
async fn publish_diagnostics(&self, uri: &Url, workspace_uri: &Url, session: Arc<Session>) {
let diagnostics_res = {
let mut diagnostics_to_publish = vec![];
let config = &self.config.read();
@@ -173,6 +169,7 @@
}
Warnings::Default => {}
}
let diagnostics = session.wait_for_parsing();
if config.diagnostic.show_warnings {
diagnostics_to_publish.extend(diagnostics.warnings);
}
@@ -259,7 +256,7 @@ impl LanguageServer for Backend {
// update this file with the new changes and write to disk
match session.write_changes_to_file(&uri, params.content_changes) {
Ok(_) => {
self.parse_project(uri, params.text_document.uri.clone(), session.clone())
self.parse_project(uri, params.text_document.uri.clone(), session)
.await;
}
Err(err) => tracing::error!("{}", err.to_string()),
@@ -406,9 +403,12 @@
params: SemanticTokensParams,
) -> jsonrpc::Result<Option<SemanticTokensResult>> {
match self.get_uri_and_session(&params.text_document.uri) {
Ok((uri, session)) => Ok(capabilities::semantic_tokens::semantic_tokens_full(
session, &uri,
)),
Ok((uri, session)) => {
let _ = session.wait_for_parsing();
Ok(capabilities::semantic_tokens::semantic_tokens_full(
session, &uri,
))
}
Err(err) => {
tracing::error!("{}", err.to_string());
Ok(None)
@@ -511,6 +511,7 @@ impl Backend {
) -> jsonrpc::Result<Option<Vec<InlayHint>>> {
match self.get_uri_and_session(&params.text_document.uri) {
Ok((uri, session)) => {
let _ = session.wait_for_parsing();
let config = &self.config.read().inlay_hints;
Ok(capabilities::inlay_hints::inlay_hints(
session,
2 changes: 1 addition & 1 deletion sway-lsp/tests/utils/src/lib.rs
@@ -104,7 +104,7 @@ pub async fn assert_server_requests(
tokio::spawn(async move {
let request_stream = socket.take(expected_requests.len()).collect::<Vec<_>>();
let requests =
tokio::time::timeout(timeout.unwrap_or(Duration::from_secs(5)), request_stream)
tokio::time::timeout(timeout.unwrap_or(Duration::from_secs(10)), request_stream)
.await
.expect("Timed out waiting for requests from server");
