lmontopo and I recently contributed to jquast’s wcwidth, a pure Python implementation of the C library function of the same name. Contributing was really easy, and jquast was very supportive and helpful. Working on wcwidth led me to (re)discover Blessed, a terminal wrapper used in wcwidth to assist with manually inspecting whether character sequences were the right length in wcwidth.
Blessings is “a thin, practical wrapper around terminal capabilities in Python.” Blessed is jquast’s fork that slightly thickens this wrapping of terminal capabilities with useful features like keypress detection. Curtsies is a library I wrote for bpython-curtsies that probably crosses the line from wrapper to whatever something that isn’t a wrapper anymore is. Originally I wrote my own terminal interaction code, but in June I slotted in Blessings and it improved the code a bunch. Upon looking into Blessed I’ve found I’m again duplicating functionality, and I’m excited to make some net negative lines of code commits soon.
Both Curtsies and Blessed implement context managers1 for terminal modes like cbreak0 and have some capabilities for formatting strings containing terminal control sequences (Curtsies uses formatted string objects with custom methods while Blessed uses functions that act on strings containing a wider variety of terminal control sequences than Curtsies allows). But I was most interested in comparing our libraries’ approaches to detecting keys.
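To make the idea of a terminal-mode context manager concrete, here is a minimal sketch built only on the standard library `termios` and `tty` modules. The `cbreak` function below is a hypothetical helper written for this post, not the implementation either library ships, and it only works on Unix-like systems.

```python
import contextlib
import termios
import tty


@contextlib.contextmanager
def cbreak(fd):
    """Put the terminal behind `fd` into cbreak mode, restoring it on exit."""
    original = termios.tcgetattr(fd)  # remember the current settings
    try:
        tty.setcbreak(fd)  # keys arrive immediately, without line buffering
        yield
    finally:
        # Runs whether the block exits normally or by an exception.
        termios.tcsetattr(fd, termios.TCSADRAIN, original)
```

It would be used as `with cbreak(sys.stdin.fileno()): ...`, and the terminal goes back to its previous settings even if the body raises.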
Mapping bytes read from stdin to what key a user probably pressed to cause those bytes to be there is something Darius helped me think through with Sturm, though I’ve since strayed significantly from his clean ideas. If you’re going to follow along with all the details of this post you might open up or pull down Blessed and Curtsies.
I prefer Blessed’s approach to almost everything that Curtsies also does. Here are some of the differences between the two and why I like the approach Blessed takes for each. If you’re looking for general takeaways, scan the post for the bits in bold.
Ways Blessed’s key detection is nicer than Curtsies’:
curtsies.Input object is the equivalent to
inkey() is a normal method, it can accept optional arguments, while configuring the behavior of calling
curtsies.Input object behaves is done at instantiation.
First general takeaway: make the magic optional.
I think it’s cool that Input objects implement the Python iterator protocol, but it’s more difficult to understand (and to write about in a blog post).
It’s also nice to document the behavior of a method in the docs for that method, though probably fine if defaults can be set in a constructor if that makes the API more convenient. I should have a method like inkey() and then alias
Keystroke objects, while calling next() on a Curtsies Input object returns unicode strings or Event objects. Curtsies makes up new names for these, while Keystroke objects include original Curses names and convenient aliases.
Building off of the Curses names instead of making up new ones is mostly a matter of taste. But the potential for typos and not having the assistance of a linter just isn’t worth the slightly prettier syntax (my original justification) of
if key in (u'<SPACE>', u'<ESC>'): do_stuff()
The embarrassingly obvious general takeaway: use enums or constants instead of strings.
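A small sketch of what that takeaway could look like with the standard library `enum` module. The `Key` class and its member names are made up for illustration; they are not the constants Blessed or Curtsies define.

```python
import enum


class Key(enum.Enum):
    """Hypothetical key names, for illustration only."""
    SPACE = "<SPACE>"
    ESC = "<ESC>"
    UP = "<UP>"


def should_quit(key):
    # A typo such as Key.ECS raises AttributeError right away (and a linter
    # can flag it), whereas the string u'<ECS>' would silently never match.
    return key in (Key.SPACE, Key.ESC)
```

The comparison reads almost the same as the string version, but mistakes surface as errors instead of as keys that quietly do nothing.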
- Fewer keys are detected by Blessed than by Curtsies. For the most part these could easily be added, so this isn’t worth mentioning except for the decision not to support old-style meta (which adds 128 to the value of the key pressed).
This is because Blessed assumes bytes read in from stdin are always individually decodable with that stream’s encoding, so old-style meta keys cause decoding errors. This allows for a nice decomposition of the problem: first convert read bytes to unicode characters, then detect multiple characters which are part of a terminal control sequence.
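The two-step decomposition can be sketched roughly as follows. This is my own illustration under simplifying assumptions, not Blessed’s code: the `SEQUENCES` table and both function names are hypothetical, and a real table would be much larger.

```python
import codecs

# Hypothetical table of escape sequences, for illustration only.
SEQUENCES = {
    "\x1b[A": "KEY_UP",
    "\x1b[B": "KEY_DOWN",
    "\x1b[3~": "KEY_DELETE",
}


def decode_chunks(chunks, encoding="utf-8"):
    """Step 1: turn an iterable of byte strings into unicode text."""
    decoder = codecs.getincrementaldecoder(encoding)()
    # The decoder buffers incomplete multi-byte characters between calls.
    return "".join(decoder.decode(chunk) for chunk in chunks)


def split_keys(text):
    """Step 2: split text into known key names and plain characters."""
    keys = []
    i = 0
    while i < len(text):
        for sequence, name in SEQUENCES.items():
            if text.startswith(sequence, i):
                keys.append(name)
                i += len(sequence)
                break
        else:
            keys.append(text[i])  # not part of a known sequence
            i += 1
    return keys
```

For example, `split_keys(decode_chunks([b"\xc3", b"\xa9\x1b", b"[Aq"]))` handles a character split across reads before looking for sequences. It also shows the cost of the assumption: old-style meta-a would arrive as the byte 0xE1, and `decode_chunks([b"\xe1", b"a"])` raises `UnicodeDecodeError` under UTF-8.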
The lesson for me here: resist implementing features that are hard! Next time I get that urge to implement something for completeness, I should put it in a branch for that hack session - not everything has to end up in master. doy also advised me not to worry about meta keys - they’re terrible, why would you want to support them? The decomposition Blessed uses of bytes to unicode and unicode to key sequence really cleans things up, and regardless of how correct it is (though it sounds like it’s pretty correct) it’s worth considering an assumption that makes writing the code so much easier.
inkey() is really just for keys, while Curtsies’ next(Input) might return a SigIntEvent or a PasteEvent, and custom events can be scheduled as well. The goal of the Input object in Curtsies was to write code like:
for event in curtsies.Input():
    # react to event
with OS events included in this loop.
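For a loop like that to work, the object only has to implement the iterator protocol. Here is a minimal sketch using a hypothetical `EventSource` class that replays a fixed list of events; it stands in for a real input object and is not how Curtsies is implemented.

```python
class EventSource:
    """Hypothetical stand-in for an input object; replays a fixed list."""

    def __init__(self, pending):
        self._pending = list(pending)

    def __iter__(self):
        # A for loop calls this once to get the iterator.
        return self

    def __next__(self):
        # Called once per cycle of the loop to get the next value.
        if not self._pending:
            raise StopIteration  # ends the for loop
        return self._pending.pop(0)
```

With this, `for event in EventSource(["a", "<SigInt>", "b"]): ...` sees keys and other events in one stream, which is the shape of code the Input object was aiming for.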
It wouldn’t be hard to build a reactor object using Blessed’s inkey() since signals3 can interrupt it. The Curtsies Input object can be interrupted by a signal from another thread - but that would be possible to simulate by sending a signal that would interrupt the call. The Blessed decomposition is nicer than the one Curtsies uses, with the Input responsible for being a reactor for all
Blessed also generalizes the idea of signal interruption, while I considered each signal separately as I wanted to write a handler for it in Bpython. When curtsies.Input.__next__ is interrupted it can return the signal as an event for SigInts only.
This decision was wrapped up in other concerns as well at the time, but the lesson I take from it is to more aggressively modularize code. As my needs for the Input object changed I should have reconsidered its
Blessed carefully uses terminal capabilities to build key sequences, and then augments them with empirically found sequences. Curtsies just looks for key sequences that worked for me or that Bpython users suggested. When I found some keys weren’t detected by curses, I abandoned its key names altogether, but building onto existing standards in a compatible way is a better idea.
The Blessed code is just better.
jquast’s comments assume the reader knows Python, and that they don’t know the ins and outs of terminals. They’re great comments! Write more comments about domain-specific concerns!
Another code quality difference: I should use existing idioms in places I made up my own thing. Even if it’s not a performance bottleneck, if there’s a standard library solution to a problem I ought to use it because it better communicates the problem being solved. Ways Blessed does this better than Curtsies:
- using stdlib incremental decoder encodings.utf_8.IncrementalDecoder instances where I use a state machine I build myself4
- using a collections.deque for read bytes buffer - sure inserting at the beginning of a 6 item list isn’t a big deal performance-wise, but deque also describes how we’re going to use it
- using existing infrastructure to discover key sequences (curses)
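As a rough illustration of the deque point, here is a sketch of a buffer for characters that have been read but not yet consumed. The `CharBuffer` class is hypothetical and only meant to show why a `collections.deque` describes the access pattern well; it is not taken from either library.

```python
from collections import deque


class CharBuffer:
    """Hypothetical buffer of characters read but not yet consumed."""

    def __init__(self):
        self._chars = deque()

    def feed(self, chars):
        self._chars.extend(chars)  # newest input goes on the right

    def pop(self):
        return self._chars.popleft()  # oldest character comes off the left

    def push_back(self, chars):
        # extendleft() inserts items one at a time, which reverses them,
        # so reverse first to keep the original order.
        self._chars.extendleft(reversed(chars))

    def __len__(self):
        return len(self._chars)
```

Both ends are cheap to work with, and the method names make it obvious that data enters on one side and leaves (or is put back) on the other.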
I like that Blessed uses methods where I use local functions. Besides being more testable, it’s clearer to see what parameters something takes, particularly since instance variable references are pretty obviously distinguishable from local variable references in Python, while outer scope references aren’t.
- Both libraries process one additional byte at a time, but inkey() always reads a single byte at a time with os.read(stdin.fileno(), 1) while Curtsies tries to read as much as possible and utilizes this information about what came in on the same read - for example, to tell the difference between a plain escape key and an escape-prepended other key. I think my approach is more elegant, but it doesn’t work.5 I don’t like having to do a timed delay to detect the escape key, but it sounds like it’s necessary. Getting it right is important. I’ll go easy on myself here because I only recently found out this doesn’t quite work in the general case, and haven’t observed it not working yet - but if I were going to use it perhaps it was my responsibility to research whether it worked generally.
Never reading a byte that isn’t part of a requested keypress also allows a Blessed Terminal object to share stdin with other readers, while the Curtsies object should have complete control since it might read too many
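The combination of single-byte reads and a timed delay after an escape byte can be sketched like this. It is my own simplified illustration, not Blessed’s implementation: `read_key_bytes` and the `esc_delay` default are made up, only CSI-style sequences (ESC `[` ... final byte) and ESC-prefixed single keys are recognized, and it relies on `select` working on the file descriptor, so it is Unix-only.

```python
import os
import select

ESC = b"\x1b"


def read_key_bytes(fd, esc_delay=0.05):
    """Read the bytes of one key, one byte at a time.

    After an escape byte, wait up to `esc_delay` seconds for more input
    before deciding it was the escape key on its own.
    """
    data = os.read(fd, 1)
    if data != ESC:
        return data
    while select.select([fd], [], [], esc_delay)[0]:
        byte = os.read(fd, 1)
        if not byte:  # end of file
            break
        data += byte
        if len(data) == 2 and byte != b"[":
            break  # ESC-prefixed key such as b"\x1ba"
        if len(data) > 2 and 0x40 <= byte[0] <= 0x7E:
            break  # final byte of a CSI sequence such as b"\x1b[A"
    return data
```

Because it never reads past the end of the key it is collecting, any bytes that follow stay in the stream for the next call or for another reader.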
Ways Blessed’s key detection is different from Curtsies’:
Paste events - Curtsies tries to detect that bytes read are probably due to the user pasting text because they were entered very quickly by setting a flag when more than ~10 bytes are read on a single os.read(sys.stdin.fileno(), 1024). I really like this feature, but it could be approximated by timing when keypresses happen without modifying Blessed’s code at all.
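Here is one way the timing approximation might look, as a sketch. The function name and both thresholds are assumptions picked for illustration, and it assumes each key is a single character recorded together with the time it was received.

```python
def group_pastes(events, max_gap=0.005, min_length=10):
    """Collapse rapid runs of (timestamp, char) pairs into paste events.

    Returns a list of ("key", char) and ("paste", text) tuples.
    """
    result = []
    run = []  # consecutive events separated by at most max_gap seconds

    def flush():
        if len(run) >= min_length:
            result.append(("paste", "".join(char for _, char in run)))
        else:
            result.extend(("key", char) for _, char in run)
        run.clear()

    for timestamp, char in events:
        if run and timestamp - run[-1][0] > max_gap:
            flush()  # the gap was long enough to end the current run
        run.append((timestamp, char))
    flush()
    return result
```

The caller would record `time.time()` next to each keypress and pass the pairs in; anything typed at human speed comes back as individual keys.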
inkey() requires the developer to have manually set cbreak or raw mode. Curtsies enters cbreak itself and optionally installs a signal handler that can cause an event signalling that the SIGINT occurred from the call. I wanted the library user not to have to worry about this (and this difference reflects the difference between a terminal wrapper and a library that provides a service more separated from implementation), so I think this difference was justified. I think I’ll keep this functionality in the event reactor object I hope to write for Curtsies that will use Blessed’s
I hope to leverage Blessed for key detection and remove the analogous bits of the Curtsies code. Not only does this make sense from an API perspective, I think the Blessed code is a lot better. In order to do this, I’ll want to:
- add escape key-prepended sequences to Blessed
- add support in Blessed for currently supported features in Curtsies like distinguishing ctrl-left and ctrl-right from normal forms
- add cursor position querying to Blessed - right now Curtsies Window objects are coupled with Input objects to make this work properly
- build a reactor/scheduler object with the tools Blessed provides.
with open('file.txt') in Python? That’s a context manager, and files (which ought to be closed no matter what) and setting terminal modes (which ought to be set back to what they were when the program exits so the cursor is visible again etc.) are good candidates for context managers - there’s cleanup we want to do whether we leave a block of code normally or by a raised exception. It does the same thing try: ... finally: ... does, so check those out first if you’re confused.
The Python iterator protocol allows an object to be used in a for loop. The important bit here is the __next__() method (.next() in Python 2) which will get invoked each cycle through the for loop to get the value to assign to the target variable before the body of the for loop is executed.
Signals like SIGINT (which is triggered by ctrl-c) are messages from the operating system to the running program which in Python can interrupt normal execution of the program and trigger execution of other code called a signal handler. They can interrupt blocking system calls like
To give myself a break on this one, this was made necessary by the decision to detect old-style meta key combinations, but at least I could have mimicked the interface in constructing my own incremental decoder.
Darius says on his new computer this doesn’t work anymore - the escape byte gets written first in such a way that the os.read returns it alone, and then a later read gets the rest of the signal. So it sounds like the delay for escape key is probably required.
But perhaps this would have been ok with ungetc? I’d appreciate advice on whether ungetc is reasonable to use on stdin and how to nicely use it from Python because it could be useful for cursor query code I want to write for Blessed.