Language Server Protocol Tutorial: From VSCode to Vim

The main artifact of all your work is most likely plain text files. So why don’t you use Notepad to create them?

Syntax highlighting and automatic formatting are just the tip of the iceberg. What about linting, code completion, and semi-automatic refactoring? These are all very good reasons to use a “real” code editor. These features are vital to our day-to-day work, but do we understand how they work?

In this Language Server Protocol tutorial, we’ll explore these questions a bit and find out what makes our text editors tick. In the end, together we’ll implement a basic language server along with example clients for VSCode, Sublime Text 3, and Vim.

Compilers vs. Language Services

We’ll skip over syntax highlighting and formatting for now, which are handled with static analysis (an interesting topic in its own right), and focus on the main feedback we get from these tools. There are two main categories: compilers and language services.

Compilers take in your source code and spit out a different form. If the code does not follow the language’s rules, the compiler will return errors. These are quite familiar. The problem is that compiling is usually quite slow and limited in scope. How about offering assistance while you’re still creating the code?

This is what language services provide. They can give you insights into your codebase while it’s still in the works, and probably a lot faster than compiling the whole project.

The scope of these services is varied. It can be something as simple as returning a list of all the symbols in the project, or something complex like returning steps to refactor code. These services are the primary reason we use our code editors. If we just wanted to compile and see errors, we could do that with a few keystrokes. Language services give us more insights, and very quickly.

Betting on a Text Editor for Programming

Notice that we have not called out specific text editors yet. Let’s explain why with an example.

Say you’ve developed a new programming language called Lapine. It’s a beautiful language and the compiler gives terrific Elm-like error messages. Additionally, you are able to provide code completion, references, refactoring help, and diagnostics.

Which code/text editor do you support first? What about after that? You’ve got an uphill battle fighting to get people to adopt it, so you want to make it as easy as possible. You don’t want to pick the wrong editor and miss out on users. What if you keep your distance from the code editors and focus on your specialty—the language and its features?

Language Servers

Enter language servers. These are tools that talk to language clients and provide the insights we’ve mentioned. They are independent of text editors for the reasons we just described with our hypothetical situation.

As usual, another layer of abstraction is just what we need. Language servers promise to break the tight coupling of language tools and code editors. Language creators can wrap their features in a server once, and code/text editors can add small extensions to turn themselves into clients. It’s a win for everyone. To facilitate this, though, we need to agree on how these clients and servers will communicate.

Lucky for us, this isn’t hypothetical. Microsoft has already begun by defining the Language Server Protocol.

As with most great ideas, it grew out of necessity rather than foresight. Many code editors had already started adding support for various language features; some features outsourced to third-party tools, some done under the hood within the editors. Scalability issues came about, and Microsoft took the lead on splitting things. Yes, Microsoft paved the way to move these features out of the code editors rather than hoarding them within VSCode. They could have kept building their editor, locking in users—but they set them free.

Language Server Protocol

The Language Server Protocol (LSP) was defined in 2016 to help separate language tools and editors. There are still many VSCode fingerprints on it, but it is a major step in the direction of editor agnosticism. Let’s examine the protocol a bit.

Clients and servers—think code editors and language tools—communicate in simple text messages. These messages have HTTP-like headers, JSON-RPC content, and may originate from either the client or server. The JSON-RPC protocol defines requests, responses, and notifications and a few basic rules around them. A key feature is that it is designed to work asynchronously, so clients/servers can deal with messages out of order and with a degree of parallelism.
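
To make the “HTTP-like headers, JSON-RPC content” idea concrete, here is a minimal sketch of how one message could be framed for the wire. The helper name encodeMessage is made up for this illustration, and the initialize parameters are the bare minimum rather than what a real editor would send.

```javascript
// Minimal sketch of LSP message framing: a Content-Length header,
// a blank line, then the JSON-RPC body.
// encodeMessage is a hypothetical helper, not part of any library.
const encodeMessage = (payload) => {
  const body = JSON.stringify(payload)
  // Content-Length counts bytes, not characters.
  const length = Buffer.byteLength(body, 'utf8')
  return `Content-Length: ${length}\r\n\r\n${body}`
}

// A bare-bones request; real clients send far richer params.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'initialize',
  params: { processId: null, rootUri: null, capabilities: {} },
}

console.log(encodeMessage(request))
```

Libraries such as the one used later in this tutorial handle this framing for you, so you rarely write it by hand.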

In short, JSON-RPC allows a client to request another program to run a method with parameters and return a result or an error. LSP builds on this and defines the methods available, expected data structures, and a few more rules around the transactions. For example, there’s a handshake process when the client starts up the server.
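
As a rough illustration of the three JSON-RPC message kinds, the sketch below tells them apart by the fields they carry. The classify function is a hypothetical helper written for this example. The sample messages use method names from LSP (shutdown, initialized) and a standard JSON-RPC error code.

```javascript
// JSON-RPC 2.0 message kinds, told apart by which fields are present.
// classify is a hypothetical helper for illustration only.
const classify = (message) => {
  const hasId = message.id !== undefined
  if (message.method !== undefined) {
    // A method with an id expects a reply; without one it is fire-and-forget.
    return hasId ? 'request' : 'notification'
  }
  if (hasId && ('result' in message || 'error' in message)) {
    return 'response'
  }
  return 'invalid'
}

const examples = [
  // Request: has an id, so the sender expects a response with the same id.
  { jsonrpc: '2.0', id: 1, method: 'shutdown' },
  // Successful response to request 1.
  { jsonrpc: '2.0', id: 1, result: null },
  // Error response: -32601 is the JSON-RPC code for "Method not found".
  { jsonrpc: '2.0', id: 2, error: { code: -32601, message: 'Method not found' } },
  // Notification: no id, so no response is sent.
  { jsonrpc: '2.0', method: 'initialized', params: {} },
]

examples.forEach((message) => {
  console.log(classify(message), JSON.stringify(message))
})
```

Because responses are matched to requests by id, either side can answer out of order.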

The server is stateful and only meant to handle a single client at a time. There are no explicit restrictions on communication, so a language server could run on a different machine than the client. In practice, though, that would be pretty slow for real-time feedback, because language servers and clients work with the same files and are pretty chatty.

The LSP has a decent amount of documentation once you know what to look for. As mentioned, much of this is written within the context of VSCode, though the ideas have a much broader application. For example, the protocol specification is written in TypeScript. To aid explorers unfamiliar with VSCode and TypeScript, here’s a primer.

LSP Message Types

There are many groups of messages defined in the Language Server Protocol. They can be roughly divided into “admin” and “language features.” Admin messages contain those used in the client/server handshake, opening/altering files, etc. Importantly, this is where clients and servers share which features they handle. Certainly, different languages and tools offer different features. This also allows for incremental adoption. Langserver.org names a half-dozen key features that clients and servers should support, at least one of which is required to make the list.

Language features are what we’re most interested in. Of these, there is one to call out specifically: the diagnostic message. Diagnostics are one of the key features. When you open a file, you mostly take it for granted that diagnostics will run and that your editor will tell you if there is something wrong with the file. The way this happens with LSP is:

  1. The client opens the file and sends textDocument/didOpen to the server.
  2. The server analyzes the file and sends the textDocument/publishDiagnostics notification.
  3. The client parses the results and displays error indicators in the editor.
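
The three steps above can be sketched as plain message payloads. The field names follow the LSP specification. The file path, the text, the source name, and the toMarker helper are all made up for illustration.

```javascript
const uri = 'file:///tmp/notes.txt' // hypothetical file

// 1. Client -> server: the file was opened (a notification, so no id).
const didOpen = {
  jsonrpc: '2.0',
  method: 'textDocument/didOpen',
  params: {
    textDocument: {
      uri,
      languageId: 'plaintext',
      version: 1,
      text: 'foo is not welcome here.',
    },
  },
}

// 2. Server -> client: diagnostics for that file (also a notification).
const publishDiagnostics = {
  jsonrpc: '2.0',
  method: 'textDocument/publishDiagnostics',
  params: {
    uri,
    diagnostics: [{
      // Positions are zero-based line and character offsets.
      range: { start: { line: 0, character: 0 }, end: { line: 0, character: 3 } },
      severity: 2, // 2 means Warning in the LSP specification
      source: 'example-linter', // hypothetical tool name
      message: 'foo looks suspicious.',
    }],
  },
}

// 3. Client: turn each diagnostic into something displayable.
// toMarker is a hypothetical stand-in for the editor's own UI code.
const toMarker = ({ range, message }) =>
  `${range.start.line + 1}:${range.start.character + 1} ${message}`

console.log(didOpen.method)
console.log(publishDiagnostics.params.diagnostics.map(toMarker))
```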

This is a passive way to get insights from your language services. A more active example would be finding all the references for the symbol under your cursor. This would go something like:

  1. The client sends textDocument/references to the server, specifying a location in a file.
  2. The server figures out the symbol, locates references in this and other files, and responds with a list.
  3. The client displays the references to the user.
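
Unlike diagnostics, this exchange is a request with a matching response. Here is a sketch of what the two messages might look like, with field names taken from the LSP specification. The file paths, positions, and the formatLocation helper are invented for the example.

```javascript
// 1. Client -> server: a request, so it carries an id.
const referencesRequest = {
  jsonrpc: '2.0',
  id: 42,
  method: 'textDocument/references',
  params: {
    textDocument: { uri: 'file:///project/main.txt' }, // hypothetical file
    position: { line: 9, character: 4 }, // zero-based cursor position
    context: { includeDeclaration: true },
  },
}

// 2. Server -> client: a response with the same id and a list of locations.
const referencesResponse = {
  jsonrpc: '2.0',
  id: 42,
  result: [
    {
      uri: 'file:///project/main.txt',
      range: { start: { line: 9, character: 4 }, end: { line: 9, character: 7 } },
    },
    {
      uri: 'file:///project/other.txt',
      range: { start: { line: 2, character: 0 }, end: { line: 2, character: 3 } },
    },
  ],
}

// 3. Client: show one-based file:line:column entries to the user.
// formatLocation is a hypothetical display helper.
const formatLocation = ({ uri, range }) =>
  `${uri}:${range.start.line + 1}:${range.start.character + 1}`

console.log(referencesResponse.result.map(formatLocation))
```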

A Blacklist Tool

We could surely dig into the specifics of the Language Server Protocol, but let’s leave that for client implementers. To cement the idea of editor and language tool separation, we’ll play the role of tool creator.

We will keep it simple and, instead of creating a new language and features, we will stick to diagnostics. Diagnostics are a good fit: They’re just warnings about a file’s content. A linter returns diagnostics. We’ll make something similar.

We will make a tool to notify us of words we would like to avoid. Then, we’ll provide that functionality to a couple of different text editors.

The Language Server

First, the tool. We’ll bake this right into a language server. For simplicity, this will be a Node.js app, though we could do it with any tech able to use streams for reading and writing.

Here is the logic. Given some text, this method returns an array of the matched blacklisted words and the indices where they were found.

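const getBlacklisted = (text) => {
  const blacklist = [
    'foo',
    'bar',
    'baz',
  ]
  // Match any blacklisted word on word boundaries, case-insensitively.
  const regex = new RegExp(`\\b(${blacklist.join('|')})\\b`, 'gi')
  const results = []
  // Declare the loop variable so it does not leak as an implicit global.
  let matches
  // Stop after 100 matches.
  while ((matches = regex.exec(text)) && results.length < 100) {
    results.push({
      value: matches[0],
      index: matches.index,
    })
  }
  return results
}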
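
To see what this returns, here is a quick usage sketch. The function is repeated in compact form so the snippet runs on its own. The sample sentence is made up.

```javascript
// Compact copy of the function above so this snippet is self-contained.
const getBlacklisted = (text) => {
  const blacklist = ['foo', 'bar', 'baz']
  const regex = new RegExp(`\\b(${blacklist.join('|')})\\b`, 'gi')
  const results = []
  let matches
  while ((matches = regex.exec(text)) && results.length < 100) {
    results.push({ value: matches[0], index: matches.index })
  }
  return results
}

// The match keeps the casing found in the text.
console.log(getBlacklisted('Foo walks into a bar.'))
// [ { value: 'Foo', index: 0 }, { value: 'bar', index: 17 } ]

// Word boundaries mean that "foobar" is not reported.
console.log(getBlacklisted('foobar'))
// []
```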

Now, let’s make it a server.

const {
  TextDocuments,
  createConnection,
} = require('vscode-languageserver')
const {TextDocument} = require('vscode-languageserver-textdocument')

const connection = createConnection()
const documents = new TextDocuments(TextDocument)

connection.onInitialize(() => ({
  capabilities: {
    textDocumentSync: documents.syncKind,
  },
}))

documents.listen(connection)
connection.listen()

Here, we are using the vscode-languageserver package. The name is misleading, as it can certainly work outside VSCode. This is one of the many “fingerprints” you see of LSP’s origins. vscode-languageserver takes care of the lower-level protocol and allows you to focus on the use cases. This snippet starts a connection and ties it into a document manager. When a client connects to the server, the server will tell it that it would like to be notified of text documents being opened and changed.

We could stop here. This is a fully functioning, albeit pointless, LSP server. Instead, let’s respond to document changes with some diagnostic information.

documents.onDidChangeContent(change => {
  connection.sendDiagnostics({
    uri: change.document.uri,
    diagnostics: getDiagnostics(change.document),
  })
})

Finally, we connect the dots between the document that changed, our logic, and the diagnostics response.

const getDiagnostics = (textDocument) =>
  getBlacklisted(textDocument.getText())
    .map(blacklistToDiagnostic(textDocument))

const {
  DiagnosticSeverity,
} = require('vscode-languageserver')

const blacklistToDiagnostic = (textDocument) => ({ index, value }) => ({
  severity: DiagnosticSeverity.Warning,
  range: {
    start: textDocument.positionAt(index),
    end: textDocument.positionAt(index + value.length),
  },
  message: `${value} is blacklisted.`,
  source: 'Blacklister',
})

Our diagnostics payload will be the result of running the document’s text through our function, then mapped to the format expected by the client.
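
The only subtle part of that mapping is turning a character offset into the line/character position the protocol expects. The sketch below uses a simplified stand-in for the document’s positionAt method. It assumes "\n" line endings and is not the library’s implementation, and the toRange helper and sample text are invented for the example.

```javascript
// Simplified stand-in for TextDocument.positionAt (assumes "\n" line endings).
// This is an illustration, not the library's implementation.
const positionAt = (text, offset) => {
  const lines = text.slice(0, offset).split('\n')
  return {
    line: lines.length - 1, // zero-based line number
    character: lines[lines.length - 1].length, // zero-based column
  }
}

// Hypothetical helper: convert one match into an LSP-style range.
const toRange = (text, { index, value }) => ({
  start: positionAt(text, index),
  end: positionAt(text, index + value.length),
})

const sample = 'hello\nfoo and bar\n'

console.log(toRange(sample, { value: 'bar', index: 14 }))
// { start: { line: 1, character: 8 }, end: { line: 1, character: 11 } }
```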

This script will create all that for you.

curl -o- https://raw.githubusercontent.com/reergymerej/lsp-article-resources/revision-for-6.0.0/blacklist-server-install.sh | bash

Note: If you’re uncomfortable with strangers adding executables to your machine, please check the source. It creates the project, downloads index.js, and npm links it for you.

COMPLETE SERVER SOURCE

The final blacklist-server source is:

#!/usr/bin/env node

const {
  DiagnosticSeverity,
  TextDocuments,
  createConnection,
} = require('vscode-languageserver')

const {TextDocument} = require('vscode-languageserver-textdocument')

const getBlacklisted = (text) => {
  const blacklist = [
    'foo',
    'bar',
    'baz',
  ]
  const regex = new RegExp(`\\b(${blacklist.join('|')})\\b`, 'gi')
  const results = []
  let matches
  while ((matches = regex.exec(text)) && results.length < 100) {
    results.push({
      value: matches[0],
      index: matches.index,
    })
  }
  return results
}

const blacklistToDiagnostic = (textDocument) => ({ index, value }) => ({
  severity: DiagnosticSeverity.Warning,
  range: {
    start: textDocument.positionAt(index),
    end: textDocument.positionAt(index + value.length),
  },
  message: `${value} is blacklisted.`,
  source: 'Blacklister',
})

const getDiagnostics = (textDocument) =>
  getBlacklisted(textDocument.getText())
    .map(blacklistToDiagnostic(textDocument))

const connection = createConnection()
const documents = new TextDocuments(TextDocument)

connection.onInitialize(() => ({
  capabilities: {
    textDocumentSync: documents.syncKind,
  },
}))

documents.onDidChangeContent(change => {
  connection.sendDiagnostics({
    uri: change.document.uri,
    diagnostics: getDiagnostics(change.document),
  })
})

documents.listen(connection)
connection.listen()

LANGUAGE SERVER PROTOCOL TUTORIAL: TIME FOR A TEST DRIVE

After the project is linked, try running the server, specifying stdio as the transport mechanism:

blacklist-server --stdio

It’s now listening on stdio for the LSP messages we talked about before. We could provide those manually, but let’s create a client instead.
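
For the curious, providing a message by hand might look roughly like the sketch below. It assumes a Unix-like system with blacklist-server on the PATH (on Windows, spawning an npm-linked command may need extra options). The frame and talkToServer helpers are hypothetical names, and the initialize parameters are the bare minimum.

```javascript
const { spawn } = require('child_process')

// Wrap a JSON-RPC payload in the Content-Length framing the protocol uses.
const frame = (payload) => {
  const body = JSON.stringify(payload)
  return `Content-Length: ${Buffer.byteLength(body, 'utf8')}\r\n\r\n${body}`
}

// Hypothetical helper: start the server and send a hand-written
// initialize request, printing whatever comes back on stdout.
const talkToServer = (command = 'blacklist-server') => {
  const server = spawn(command, ['--stdio'])
  server.stdout.on('data', (chunk) => console.log(chunk.toString()))
  server.stdin.write(frame({
    jsonrpc: '2.0',
    id: 1,
    method: 'initialize',
    params: { processId: process.pid, rootUri: null, capabilities: {} },
  }))
  return server
}

// Not called here because it needs the linked server:
// const server = talkToServer()
// setTimeout(() => server.kill(), 1000)
```

If everything is wired up, the reply should be a framed JSON-RPC response whose result describes the server’s capabilities. A real client keeps going from there with the initialized notification and document events, which is exactly the work we would rather delegate.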

Language Client: VSCode

As this technology originated in VSCode, it seems appropriate to start there. We’ll create an extension that will create an LSP client and connect it to the server we just made.

There are a number of ways to create a VSCode extension, including using Yeoman and the appropriate generator, generator-code. For simplicity, though, let’s do a barebones example.

Let’s clone the boilerplate and install its dependencies:

git clone git@github.com:reergymerej/standalone-vscode-ext.git blacklist-vscode
cd blacklist-vscode
npm i # or yarn

Open the blacklist-vscode directory in VSCode.

Press F5 to start another VSCode instance, debugging the extension.

In the first VSCode instance’s “debug console,” you will see the text, “Look, ma. An extension!”

We’ve now got a basic VSCode extension working without all the bells and whistles. Let’s make it an LSP client. Close both VSCode instances and from within the blacklist-vscode directory, run:

npm i vscode-languageclient

Replace extension.js with:

const { LanguageClient } = require('vscode-languageclient')

module.exports = {
  activate(context) {
    const executable = {
      command: 'blacklist-server',
      args: ['--stdio'],
    }

    const serverOptions = {
      run: executable,
      debug: executable,
    }

    const clientOptions = {
      documentSelector: [{
        scheme: 'file',
        language: 'plaintext',
      }],
    }

    const client = new LanguageClient(
      'blacklist-extension-id',
      'Blacklister',
      serverOptions,
      clientOptions
    )

    context.subscriptions.push(client.start())
  },
}

This uses the vscode-languageclient package to create an LSP client within VSCode. Unlike vscode-languageserver, this is tightly coupled to VSCode. In short, what we’re doing in this extension is creating a client and telling it to use the server we created in the previous steps. Glossing over the VSCode extension specifics, we can see that we’re telling it to use this LSP client for plain text files.

To test drive it, open the blacklist-vscode directory in VSCode. Press F5 to start another instance, debugging the extension.

In the new VSCode instance, create a plain text file and save it. Type “foo” or “bar” and wait a moment. You will see warnings that these are blacklisted.

That’s it! We didn’t have to recreate any of our logic, just coordinate the client and server.

Let’s do it again for another editor, this time Sublime Text 3. The process will be quite similar and a little easier.

Language Client: Sublime Text 3

First, open ST3 and open the command palette. We need a framework to make the editor an LSP client. Type “Package Control: Install Package” and hit enter. Find the package “LSP” and install it. Once complete, we have the ability to specify LSP clients. There are many presets, but we’re not going to use those. We’ve created our own.

Again, open the command palette. Find “Preferences: LSP Settings” and hit enter. This will open the configuration file, LSP.sublime-settings, for the LSP package. To add a custom client, use the configuration below.

{
  "clients": {
    "blacklister": {
      "command": [
        "blacklist-server",
        "--stdio"
      ],
      "enabled": true,
      "languages": [
        {
          "syntaxes": [
            "Plain text"
          ]
        }
      ]
    }
  },
  "log_debug": true
}

This may look familiar from the VSCode extension. We defined a client, told it to work on plain text files, and specified the language server.

Save the settings, then create and save a plain text file. Type “foo” or “bar” and wait. Again, you’ll see warnings that these are blacklisted. The treatment—how the messages are displayed in the editor—is different. However, our functionality is the same. We barely even did anything this time to add support to the editor.

Language “Client”: Vim

If you’re still not convinced that this separation of concerns makes it easy to share features across text editors, here are the steps to add the same functionality to Vim via Coc.

Open Vim and type :CocConfig, then add:

"languageserver": {
  "blacklister": {
    "command": "blacklist-server",
    "args": ["--stdio"],
    "filetypes": ["text"]
  }
}

Client-server Separation Lets Languages and Language Services Thrive

Separating the responsibility of language services from the text editors they are used in is clearly a win. It allows language feature creators to focus on their specialty and editor creators to do the same. It’s a fairly new idea, but adoption is spreading.

Now that you’ve got a basis to work from, maybe you can find a project and help move this idea forward. The editor flame war will never end, but that’s OK. As long as the language abilities can exist outside specific editors, you are free to use whatever editor you like.
