Making ijq Fast

I recently received an issue report that ijq was performing slowly. The issue claimed that, when used on a large (16 MB) JSON file, ijq was “too slow to be usable”. I downloaded the test file which was (helpfully) provided in the issue and tried it myself. Even using the most charitable definition, I agree that this classifies as “unusably slow”. Note that in this screen recording I am displaying my key presses to show just how unresponsive ijq is:

In the rest of this post I’ll talk about why ijq was performing so badly, and what I changed to fix it. The end result is that ijq is now, dare I say, not just usable, but downright snappy:

Background

First, a brief explanation of how ijq works.

ijq is, more or less, an interactive TUI wrapper around jq. When ijq starts, the input document (whether read from a file or stdin) is read and processed with jq using the default filter (.), and the output is written to the left (Input) pane. A second jq process is run which processes the input document using whatever filter the user supplied to ijq; that output is written to the right (Output) document. The input text box is the filter passed to jq. Whenever the input text box changes, jq is run on the input document using the new filter and the output is written to the right pane.

Conceptually, this is fairly simple, but there are plenty of ways to do this wrong (as I learned).

The input document is represented as a Document object:

type Document struct {
	input   string
	filter  string
	options Options
}

The Document object implements the WriterTo interface for writing to an io.Writer object:

// Filter the document with the given jq filter and options
func (d *Document) WriteTo(w io.Writer) (n int64, err error) {
	opts := d.options
	if _, ok := w.(*tview.TextView); ok {
		// Writer is a TextView, so set options accordingly
		opts.forceColor = true
		opts.monochrome = false
		opts.compact = false
		opts.rawOutput = false
	}

	args := append(opts.ToSlice(), d.filter)
	cmd := exec.Command(d.options.command, args...)
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return 0, err
	}

	go func() {
		defer stdin.Close()
		_, _ = io.WriteString(stdin, d.input)
	}()

	out, err := cmd.CombinedOutput()
	if err != nil {
		if exiterr, ok := err.(*exec.ExitError); ok {
			exiterr.Stderr = out
		}
		return 0, err
	}

	if tv, ok := w.(*tview.TextView); ok {
		w = tview.ANSIWriter(tv)
		tv.Clear()
	}

	m, err := w.Write(out)
	n = int64(m)
	return n, err
}

If the Writer is a TextView (the TUI widget used for the input and output panes) then we set some options which are passed to the child jq process. The child process is started and the document’s input is piped to the process’s stdin. When the process finishes, the stdout is written to the Writer.

The filter input text box has a callback which is fired anytime its value changes:

filterInput.SetChangedFunc(func(text string) {
	go app.QueueUpdateDraw(func() {
		errorView.Clear()
		doc.filter = text
		outputView.ScrollToBeginning()
		_, err := doc.WriteTo(outputView)
		if err != nil {
			exitErr, ok := err.(*exec.ExitError)
			if ok {
				fmt.Fprint(tview.ANSIWriter(errorView), string(exitErr.Stderr))
			}

			return
		}
	})
})

Whenever the input changes, we update the current Document’s filter and call WriteTo using the output text view. If the jq subprocess returns an error, write the error text to the error text view.

These two functions are the backbone of ijq, and both have fatal flaws (experienced Go and/or UI developers have perhaps already spotted them) responsible for ijq’s slow performance.

Before moving on, it’s worth mentioning that these problems had not been noticed so far because I have only ever used ijq on relatively small inputs. When the input is small enough, the typing speed of the user becomes the bottleneck, not the jq process or updating the text views, so these issues have gone undetected (by me) for a long time (maybe other ijq users have noticed them, but until this issue report I was unaware!).

Problem

Refer back to the filter input callback implementation above. The first problem is the use of QueueUpdateDraw. ijq uses a TUI widget framework called tview to manage the application run loop. tview’s documentation for QueueUpdate states:

QueueUpdate is used to synchronize access to primitives from non-main goroutines. The provided function will be executed as part of the event loop and thus will not cause race conditions with other such update functions or the Draw() function.

QueueUpdate (and QueueUpdateDraw) is used to modify UI primitives from non-main threads. Importantly, these functions block the caller until the main thread processes the callback and synchronizes the state. tview’s wiki page on Concurrency says:

Any event handlers you install, e.g. InputField.SetChangedFunc() or Table.SetSelectedFunc(), are invoked from the main goroutine. It is safe to make changes to your primitives in these handlers. If they are invoked in response to a key event, you must not call Application.Draw() as the application’s main loop will do that for you and calling Application.Draw() (or Application.QueueUpdate() or Application.QueueUpdateDraw()) from within an event handler will lead to a deadlock.

The use of QueueUpdateDraw() in this handler is in violation of this advice. The function is started in a separate goroutine (it’s called with go), so this avoids deadlock, but it still blocks the main thread until the WriteTo function finishes. Recall that WriteTo runs the jq subprocess, so effectively this runs each jq process on the main thread every time the filter input changes.

This change was added (by me) over 3 years ago, because it (apparently) solved another issue with race conditions in the prior implementation (no doubt that it did: when everything is on the main thread there are no race conditions).

The second important problem is in the WriteTo implementation above. This function uses the CombinedOutput function to get both stdout and stderr from the jq process. This function writes the process’s stdout and stderr into a temporary buffer. After the jq process completes, that buffer is then written to the output text view.

This is slow, because we first wait for the process to write to the buffer, and only when that is finished do we start writing to the text view. This is the cause of ijq’s long startup time (visible in the first video at the top of the post). Instead, we should be able to pipe the output from the subprocess directly to the text view so that the output is visible immediately.

Solutions

Fixing the first problem was straightforward: drop QueueUpdateDraw, run WriteTo in a separate thread, and use a ChangedFunc on the output view to redraw the application when the output view changes. With this change ijq immediately becomes more responsive. User input is never blocked on the jq process.

This is suboptimal though, because a new jq process is spawned on every input change. We should be able to cancel old processes when the input changes, and if multiple input change callbacks are fired before starting a new jq process, we should only ever use the most recent input. Go’s native channels are almost a good fit here, but what we want is a LIFO (last-in, first-out), whereas Go’s channels are FIFO (first-in, first-out): every input written into a channel is read out in the same order. On the receiving end of the channel, there’s no (easy) way to only read the most recent one. So instead of channels, we use a regular old mutex and condition variable. The mutex guards changes to the Document object and a pending variable. When the filter input changes, the Document object is updated, pending is set to true, any existing jq processes are cancelled, and the condition variable is signalled. A separate thread waits for pending to become true and when it does, makes a copy of the Document object before starting a new jq process.

Fixing the second problem was a bit trickier. Go’s Cmd object can use any object that implements the io.Writer interface in its Stdout and Stderr fields. Instead of writing to a temporary buffer, waiting for the process to complete, and then copying to the text view, we can simply write to the text view immediately! This change, combined with removing another incorrect use of QueueUpdateDraw in the app startup, solves the initial startup time issue.

But sometimes fixing one issue introduces another: in this case, when the jq process writes directly to the text view, we have to clear the text view before running the jq process (or else each successive process will append to the previous process’s output). But when we do this, the output view is cleared every time the user updates the filter input. We only want the output view to change when there is actually new visible output. This ensures that if the user types in an invalid filter, the last output is still visible. This is an important property for an interactive jq tool where experimenting with the right filter is a large part of the value proposition.

To address this, we make a new type that wraps a TextView and tracks its “dirty” state. When the text view is dirty, we call the view’s Clear() method before it is written (and then reset the dirty state).

type pane struct {
	tv    *tview.TextView
	dirty bool
}

func (pane *pane) Write(p []byte) (n int, err error) {
	if pane.dirty {
		pane.tv.Clear()
		pane.dirty = false
	}

	return pane.tv.Write(p)
}

The dirty state is then set before the jq process is invoked.

With these changes, ijq’s performance problems are solved!

Conclusion

I’m not sure there is a lesson here other than re-learning that concurrency (and UIs) are hard, even in languages like Go which give you great tools to manage concurrency. Ultimately, for me, it was a learning experience and a fun excuse to hack on a side project. If you are an ijq user, I hope this makes your user experience even better!

Last modified on