Writing My First Vim Plugin

Screenshot of Vim editing a file, with a search for the term "fibonacci" in progress. The plugin I wrote has added "3 matches of /\vfibonacci/" to the statusline.
Screenshot of Vim editing a file, with a search for the term “fibonacci” in progress. The plugin I wrote has added “3 matches of /\vfibonacci/” to the statusline.

Last night, I put my first full Vim plugin up on GitHub. It’s called match-count-statusline. It’s intended for anyone who uses Vim to edit text files. It adds a small item to the statusline (Vim’s version of a status bar) describing how many times the current search occurs in the current buffer and, if there’s enough room, what the current search pattern is.

This began as something I needed because I kept opening files and needing to count how many times a term or pattern occurred in that file. I wrote a small script to do it, and I tacked on more things to the script over the next few days and weeks.

Eventually I felt confident enough to share it. I’d also like to share about the process of writing it and preparing it for use by others. If you’re interested, read on.

Who Should Read This?

This guide is meant for advanced Vim users who wish to understand more about the process of writing a plugin through a first-hand narrative.

I won’t be explaining how to use or configure Vim itself, nor the basics of how to write in Vim’s scripting language. I’ll assume familiarity with both. If you don’t know Vim’s scripting language (which I am going to call “VimL” for the rest of this post1), I cannot recommend enough Steve Losh’s Learn Vimscript the Hard Way. It is freely available online, and if you find it useful, I encourage you to buy a copy as well.

That book will set you on your way, but it can never include everything you need to know. Vim is always changing, and its own documentation is deep and voluminous. It is for very good reason that Steve sends his readers to Vim’s own documentation at the end of every section—he could not ever hope to include everything in a single book.

I’ll be referring to my plugin as it exists at the time of this writing by commit hash. It is liable to change in the future as I learn how to improve it. The newest version should always be found here, which will incorporate fixes and improvements.

The Problem

I tend to like to open log files in Vim and search around in them to find patterns while I’m investigating a problem at work. I’ve developed over time a lot of tools for doing so. Some of them are crude, but when I find myself doing the same crude thing over and over, I tend to want to polish it.

One common thing I do is discover how often a search pattern occurs in a log. This indicates to me whether something is signal or noise. I’ll place my cursor over something like an ID, press *, and then count how many times it occurs and write that down in another file.

I could do this from the command line using something like grep | wc -l, but I get a lot of value from seeing the whole context and being able to skip between the occurrences of my search, finding out what happened before and after, and so on. I’m very visual. I like to see them in the file.

So to count them, I’d been doing a crude thing. I would run :g//d and watch how many lines got deleted. (Without a pattern specified, that command will delete whatever the current search pattern is.) This has a few drawbacks. First, I had to undelete the lines after I was done. In large files, it was somewhat slow. And finally, it only told me the count of lines, not the count of occurrences.

I figured what I really wanted was simple, and it should be built into Vim.

The Solution

It is not really built into Vim, though. I figured I’d never find it in Vim’s vast documentation, so I searched online and found a Vim Tips article called, “Count number of matches of a pattern.” The very first thing offered in the article was the answer I needed.

:%s/pattern//gn

I knew that when the pattern was already searched for, it could be implied, so this simplified down even more.

:%s///gn

That did it. I ran that after searching for something, and a message printed to the bottom of the screen resembling the following.

2 matches on 2 lines

I could have stopped there.

However, I tend to forget obscure looking commands, and I thought it would be nice if this happened automatically anytime I searched for something. I knew graphical text editors tended to automatically surface information like this in their status bar, and I was inspired to make the same thing happen.

The Solution According to Vim

I probably should not have given up on Vim’s documentation so easily. I didn’t know that the substitution command (:s) was the place to start, but my use of the global command (:g) before might have been a hint.

What I didn’t know is that Vim already internally documents how to count items, and it’s filed under “count-items,” logically enough. You can find it mentioned under the documentation for the substitution section (:help :s, or more specifically, :help :s_flags and see the “n” flag) as well. To find out how this works, run :help count-items. You’ll see it suggests ways to count characters, words, and so on. Generally, it boils down to a very similar command.

:%s//&/gn

Again, this assumes an existing pattern has been searched for. The one difference is the ampersand. The ampersand has a special meaning in the substitution, referring to the thing being replaced. In this case, it would effectively replace a thing with itself, causing nothing to happen. But since the “n” flag is there, no replacement will happen regardless, so there’s effectively no difference if it’s included or not. I have included it in subsequent examples as it is safer (in case the “n” is forgotten).

The Script

So that single, small command forms the core of the functionality I later expanded.

To display it, I learned how to insert an item into vim-airline‘s statusline. That wasn’t too difficult once I found an example online, and you can see how I accomplish that here. Using that, I began a small script in my personal configuration files which added the count when I was searching for a pattern, and otherwise displayed nothing.

Over the next several days, I added on improvements because I found it was slow on larger files. This included caching its results momentarily rather than updating constantly. I also found that each time it updated the search, it would also move the mouse cursor, so I needed to fix that.

The resulting script grew in size and became safer and better performing. Eventually, I figured I’d put enough work into it that it would be safe enough to share with others without breaking their Vim installs.

The Plugin

To create a plugin, I added some documentation, reworked the script to make it configurable, and “licensed” it into the public domain. Then I put it all in a Git repository and hosted it on Github.

I want to spend the rest of this post talking about all the funny little things I did in the core of the script contained in the plugin to build it up from that one substitute hack into more fully fledged functionality.

Below, I’ll be referencing parts of the script itself as contained in the plugin. Each section covers a different topic I applied as I elaborated on the design and improved the plugin so it could be used by more people.

Opening Stanza

The script begins as many plugins do, with an opening stanza ensuring it’s safe to run.

" require 7.4.1658 for v:vim_did_enter
if &compatible
      \ || (v:version <= 704 && !has('patch1658'))
      \ || exists('g:loaded_match_count_statusline')
  finish
endif
let g:loaded_match_count_statusline = v:true

This bit of code makes three checks. First, it checks if &compatible is set. If it is, then there’s no need to check anything else—the script won’t work, and we can exit right here.

Next, it checks for a version of Vim greater than or equal to 7.4.1658 (using Vim’s peculiar syntax for doing so, the predefined variable v:version). That patch, 1658, is the one in which v:vim_did_enter was introduced. (I discovered this by searching the vim_dev mailing list.) That patch was introduced two and a half years ago, so I felt comfortable with that baseline.

Finally, I check to see if my plugin is already loaded using a global variable I will set directly after this condition. If the variable is already set, it means I’ve reached that line before, and I should exit before reaching it again. This prevents loading the script more than once.

The “finish” command ends the script if any of these conditions are true.

The Global Flag

Notice how the above examples of the substitute command all included the “g” flag. That tells the substitution to apply to all matches in a line, not just the first one.

That probably seems like a funny default: only doing one substitution per line and then moving along. It’s just the way regular expressions work from the olden days, and Vi (the editor that Vim is based on) picked up the behavior. Vim never breaks compatibility, so its substitution also needs that “g” flag to apply  substitutions globally.

For that reason, Vim introduced a setting called &gdefault (see :help gdefault). It has the effect of implying a “g” flag on every substitution. Unfortunately, it also means that if you add the “g” flag to a substitution while that setting is on, it undoes the effect. That means you never know what the “g” flag will do unless you know what the setting &gdefault is. That’s not confusing, right?

All of that means that setting &gdefault would effectively break counting if I did not check it. That’s what these lines are all about.

if &gdefault
  let s:match_command = '%s//&/ne'
else
  let s:match_command = '%s//&/gne'
endif

Here, a command to run to search for matches is saved to a variable based on whether &gdefault is set.

There is one minor drawback to doing this here as opposed to dynamically setting it every time the match is calculated. If the user changes &gdefault after loading Vim, the counts will be incorrect when a pattern occurs more than once on a line. This could be a bug! The fix would be to set the match command each time we run a match.

Settings and Defaults

When I decided to release this as a plugin, I realized that I had baked in several decisions about the script that I knew others would disagree with, and I wanted to give them the freedom to configure the script the way they wanted. The next several lines check for global variables which the user may set and then either use their values or use my defaults. Here’s an example of a couple of such settings.

let s:start =
      \ get(g:, 'match_count_start', '')

let s:end =
      \ get(g:, 'match_count_end', '')

To use a variable in Vim, while falling back to a default if it’s undefined, an old way was to use the exists() function. My syntax uses a stupid Vim trick involving variable scoping.

Notice how all the variables I use begin with a letter followed by a colon. These are known in VimL as “internal variables” (see :help internal-variables), and each kind begins with a letter denoting its scope, a colon, and then the name of the variable. (Vim also refers to these scopes as “name spaces” in its documentation.)

Each scope has a special meaning and is treated by Vim in a special way. For example, variables scoped to b: only exist for the one buffer you’re in, and no other buffer can see that variable.

More importantly for our purposes, though, every scope is also considered by Vim to be a Dictionary unto itself, and you can treat all the variables in it as keys containing values. Weird? Yes, weird. But it lets you do things like, “:echo has_key(g:, 'match_count_end'),” which will tell you if that variable is defined.

So in the case of each global setting, I use the get() function, which takes a Dictionary, a key, and optionally also a default value to use in case the key is missing. This allows me to set variables scoped to my script (in s:) based on whether variables in the global scope (in g:) are defined or not. Then the rest of the script uses those script-scoped variables.

This unfortunately means that no changes to the global variables can get picked up while Vim is running. To fix this, I could set all these inside the match counting function I defined, if I chose.

Caching

Caching is one of the earlier things I implemented, and it’s gone through a few revisions. Here’s how it works today.

The basic idea is, the match counting function gets called repeatedly every time the statusline needs to get redrawn, and this happens far more often than the match count itself will change. If the script remembers the match count for each buffer, the match counting function (MatchCountStatusline()) won’t have to recalculate until it’s really necessary, saving itself some work.

So caching is one among many ways MatchCountStatusline() attempts to defer doing any hard work. How does MatchCountStatusline() find out when to update its cache? A couple of ways.

The cache is a simple Dictionary made anew for each buffer. It remembers the last time the cached values were referenced or updated, the number of changes to the current buffer, the current search pattern, and the number of matches.

let s:unused_cache_values = {
      \   'pattern':     -1,
      \   'changedtick': -1,
      \   'match_count': -1,
      \   'last_run':    -1
      \ }

The default values for the cache are ones that could never occur normally, so I call them “sentinel values.” Sentinel values are generally used to represent invalid values, so in this case I can recognize when the cache contains values which have not yet been set by MatchCountStatusline().

When MatchCountStatusline() is first run, if the buffer doesn’t already have a cache, a new one is created with the default sentinel values.

let b:count_cache = 
    \ get(b:, 'count_cache', copy(s:unused_cache_values))

I know that because the cache is scoped to the buffer, each buffer gets a new set of values. On subsequent calls to MatchCountStatusline() for that buffer, it reuses the cache which already exists. (Notice also that I had to copy() the values. If I didn’t, modifications to the buffer’s count cache would modify the script’s count cache defaults and would affect all the other buffers as well. See :help copy.)

A few lines later, MatchCountStatusline() checks to see if the values in the cache match the state of the buffer. If there is a match, there is no need to continue, and the function ends here. Otherwise, we need to set the cache’s values with new ones farther down.

if b:count_cache.pattern == @/
    \ && b:count_cache.changedtick == b:changedtick
  return s:PrintMatchCount(b:count_cache)
endif

It’s a simple check. All I look at here is whether the pattern has changed or the buffer has changed. I don’t have to look at the buffer itself to determine whether it’s changed. A predefined Vim variable increments for every change, called b:changedtick. If it’s changed, I know the buffer has changed. See :help b:changedtick. Those two things let me know if a match count needs to be recalculated for the current buffer.

Notice also that the current search pattern is stored in a register called @/. All registers can be referred to by a name which starts with the @ sigil. There are a bunch of generic registers with single letter names, “a” through “z,” and then there are special-use registers such as @/. It’s easy to see where it got its name, with a little thought—the / key begins a search in Vim, so the @/ register holds the search pattern. Like any register, it may be echoed, compared to, or even assigned to. Any time in my script I want to see what’s being searched for, I inspect @/. For more information, see :help @/ and :help registers. The kinds of registers available will definitely surprise you!


You probably noticed the cache holds two other things besides the number of changes to the current buffer and the current search pattern. The match_count value is the thing we’re caching, so we never care what its value is until we’re printing it out. But last_run deserves an explanation.

I decided early on that I wanted to keep and print out the same cached values for a short period of time, even if the buffer has changed. That’s what last_run is for. It lets me know how long the cache has been holding onto those values so I can know when they do need updating.

Why would I want to keep old values around, even if they don’t reflect reality anymore? It’s not always necessary to reflect reality the very instant it changes. Humans don’t need updates right away for everything—a delay of a tenth or a quarter of a second is tolerable sometimes.

When someone is actively typing, or if they’re entering a new search (and &incsearch is enabled), every single keystroke would invalidate the cache and trigger a new calculation. If we instead wait for a fraction of a second, we can allow the user to type a few characters before updating. By the time they look down, we’ve probably already updated, but we won’t have wasted time needlessly updating for every keystroke.

The function responsible for determining if enough time has passed before even checking the cache is called s:IsCacheStale(). (Notice that it’s script-scoped, so that it doesn’t collide with other scripts’ functions.) It comes before the cache check in the program, so it “short-circuits” that logic, precluding it when it’s not needed. The function s:IsCacheStale() has a very simple mandate—it receives the cache and checks how much time has gone by. If enough time has passed, it returns a true value. Otherwise, it evaluates to false. How it implements this is a little complicated.

I won’t copy in the whole body of the function here, but go glance over it if you want. There are a couple of weird things going on, and I’ll touch on them here.

First, instead of using ones and zeros to represent true and false, I’m using Vim’s predefined values of v:true and v:false. Vim introduced these in 7.4.1154 to help with JSON parsing. They evaluate to 1 and 0 respectively, so I’m using them to make the script more readable.

Next, I put in a little stanza at the beginning to optimize when loading a new buffer.

if a:count_cache == s:unused_cache_values
  if has('reltime')
    let a:count_cache.last_run = reltime()
  else
    let a:count_cache.last_run = localtime()
  endif

  return v:false
endif

The very first time that MatchCountStatusline() gets called for a buffer, recall that it puts in the sentinel values for the buffer’s count cache. Here, I detect whether or not that’s the case. If it is, I merely return false (meaning that the cache is not stale), but I also importantly update the last_run cached value with the current time, so that this condition never evaluates to true in the future for this buffer.

What this accomplishes is that the first time a buffer is loaded, the cache always appears to be stale (even though it really only contains invalid values). This serves to cause no match counting to occur for the first cache timeout period (by default, a tenth of a second) after a buffer gets loaded. After that, the first match count occurs. This lets the buffer get up on its feet and lets other plugins and automatic commands do things before we start.

Finally, both here and further down, I test whether the “reltime” feature exists in Vim and use it only if it does. Otherwise, I fall back to using the localtime() function. There are a handful of other places I’ve checked for functionality before using it. This is one of the ways I’ve tried to make the plugin safer for a wider audience. I imagine most people will have a Vim with “reltime” available, but I couldn’t tell how widely available it would be.

The cache check algorithm is pretty simple otherwise. It checks the current time against the last_run time. If it’s greater than or equal to the cache timeout, last_run gets updated, and the function returns true (meaning the cache is stale). Otherwise it returns false, leaving last_run alone so that the next time it’s called, it will be clear how much time has passed since the last time the cache was stale.

Both by giving the cache a little time before it’s updated, and by checking its values, MatchCountStatusline() avoids updating the cache (and running the expensive match-counting calculating) whenever possible.

Another Optimization: v:vim_did_enter

One of the more recent Vim features I took advantage of was the v:vim_did_enter predefined variable (see :help v:vim_did_enter). It allows MatchCountStatusline() to know if it’s being called before Vim has even loaded all the way, and so it’s the very first check. This means that the cache isn’t even warmed up before Vim has properly started up, so the cache grace period kicks in after. This allows Vim to start up faster.

" don't bother executing until Vim has fully loaded
if v:vim_did_enter == v:false
  return ''
endif

File Size Checking

I discovered early on that match counting works really poorly for large files. I tried throwing in a couple of optimizations for the counting itself (I’ll mention those below), but the counting still dragged, and so I assumed it would never be fast enough to work transparently and interactively.

Then I considered whether I could do the counting asynchronously. I eventually ruled this out for now because it means shelling outside of the Vim process, as near as I can tell. I wanted my script to be purely VimL and self-contained.

So I settled on checking for whether the size of the file exceeds a certain size and stopping there. The function responsible for checking is called s:IsLargeFile().

function! s:IsLargeFile(force)
  if a:force
    return v:false
  else
    if getfsize(expand(@%)) >= s:max_file_size_in_bytes
      return v:true
    else
      return v:false
    endif
  endif
endfunction

Notice that it takes an argument called force. This allows MatchCountStatusline() to tell it to skip checking the file size and just report that the file is safe in all cases. Otherwise, this one is very straightforward.

Toggling Match Counting

I realized that sometimes, even for large files, I still wanted a count, so I implemented a command which would manipulate two buffer variables to force match counting to occur (or would disable match counting altogether). The logic here is a bit weird because of the file size checking. I wanted to be able to force things on if they’re turned off due to the file size, and I wanted to be able to toggle things normally otherwise.

You can review the function which implements the toggling, but I won’t cover it in more detail here.

Optimizing Match Counting Itself

We’ve covered several preliminary checks that MatchCountStatusline() makes, and we’re down to the nitty gritty—updating the cache and then calling s:PrintMatchCount() to render the statusline.

When MatchCountStatusline() sees there’s no pattern for the current buffer, its job is easy—it clears the cached values, and that’s it.

" don't count matches that aren't being searched for
if @/ == ''
  let b:count_cache.pattern     = ''
  let b:count_cache.match_count = 0
  let b:count_cache.changedtick = b:changedtick
[...]

Otherwise, we have work to do.

(Incidentally, there’s a small optimization I forgot to make here. See it? We’re updating the cache. We can set last_run here and keep these cached values for longer. We should do this here and anywhere below where we update the cache.)


Since we’re now in the part of the script involves the actual calculation, and since it has the chance of raising exceptions, I begin a try/catch/finally block here. It’s important to begin it here, when I start to change the state of the buffer, so that I know my changes will get fixed up by the finally block below. The worst case scenario would be attempting to count the matches, having an error occur, and leaving the user’s editor in a broken state.

Now we’re free to begin mucking about. First thing I do is save the current view.

" freeze the view in place
let l:view = winsaveview()

In the finally block, the last thing I do is restore this view.

call winrestview(l:view)

This ensures that nothing visibly changes while I do what I’m about to do. (See :help views-sessions, :help winsaveview, and :help winrestview.) One problem I had was that the core match count command (s//&/gne) would cause the cursor to jump to the beginning of the line. Saving and restoring the view fixes that problem.

Next, I disable automatic commands and highlight search. Likewise, I re-enable those later on. Because these are compile-time features, I check that I can do so before I do. I save off their values as local variables so that I can restore them as they were before I touched them.

By turning these features off, Vim won’t attempt to fire off events during the fake substitution it’s about to run, which can save a little time. Turning off &hlsearch in particular is one of Vim’s own suggestions in :help count-items.

Match Counting

Finally, I perform the actual count.

" this trick counts the matches (see :help count-items)
redir => l:match_output
silent! execute s:match_command
redir END

I added a comment to help others understand how this worked because it’s probably the least obvious part of the program.

The redir here sends all the output from that command to the l:match_output local variable. The actual command uses echom to display a message with the results, like I showed above, something like “2 matches on 2 lines“. If no matches are found, the output is empty due to our use of the n flag.

Then the string is parsed, and the cache is updated with the new match count and other values. The finally block restores things how they were, as I mentioned earlier. And at the very end of the MatchCountStatusline() function, the cache is rendered into a string by s:PrintMatchCount().

What Else?

There are several things I didn’t cover here, like how I implemented the global customizations. I might cover those in a future post if there’s interest, but they’re straightforward enough if you followed the above.

Making it appear in the statusline was interesting, but it’s more of interest to vim-airline users only, probably. Feel free to look at the stanza which attempts to guess how to do that, though.

Do also check out the documentation for my plugin, which consumed most of the time it took to get it ready for wider distribution.

I hope that you found what you were looking for in this post and took away at least one useful thing!