Emacs Lisp Profiling

Emacs lisp is slow. Like, really slow. Python can sum 1,000,000 numbers in less than a tenth of a second, while Emacs Lisp takes an order of magnitude longer.

timeit(stmt="reduce(add, [5] * 100)",
       setup="from operator import add ; from functools import reduce",
       number=10000) # 0.079246
(timeit 10000 (cl-reduce '+ (make-list 100 5))) ; 0.710435

This means it is of utmost importance to write performant elisp code, as poorly performing elisp code is 10x more noticeable than poorly performing python code. Furthermore, elisp code is typically user-facing. Somebody is actively waiting for your code to finish running, and they probably can't do anything with their active buffer until it finishes, thanks to emacs' synchronous nature.

Below, I will profile some elisp code, disassemble some of emacs' C source, and hopefully clear up some common myths about writing performant elisp code.

Methodology

As with any performance benchmarks, absolute numbers are significantly less important than the relation between two benchmarks. However, if you are interested in how long emacs will take to do certain operations on your box, here are the specifications of the machine I'm performing these tests on:

  Part        SKU
  ----------- ------------------
  Processor   i3-5005U
  Memory      4GB LPDDR3-1333
  Emacs       25.1 (Arch; -O2)
  Kernel      4.9.6

We will be using this macro to test our code; It's a modified version of a macro from the emacs mailing list ca. 2006.

(defmacro timeit (run-times &rest body)
  "Measure the time it takes to evaluate BODY."
  `(let ((time (current-time))
         (i 0))
     (while (< i ,run-times)
       ,@body
       (setq i (1+ i)))
     (message "%.06f" (float-time (time-since time)))))

Now that our specs and testing methodology have been covered, let's get to it. First, let's put the debate about \“right\” way to write a map in elisp to rest.

Profiling Maps

There are plenty of ways to write a generic map in emacs lisp – the most popular ways being the association list, or alist, property list, or plist, and hash tables. Some people swear by alists, while others stand by hash tables because of their C implementation and significantly better big-O runtimes. Others, like me, think plists look cleaner than both types, especially when the data is structurally complex.

(setq hash #s(hash-table size 65 data (:top kek :kek bur :foo bar :baz quux :xyzzy ham)))
(setq plist '(:top kek :kek bur :foo bar :baz quux :xyzzy ham))
(setq alist '((:top . kek)
              (:kek . bur)
              (:foo . bar)
              (:baz . quux)
              (:xyzzy . ham)))

(timeit 1000000 (plist-get plist :kek)) ; 0.254863
(timeit 1000000 (gethash :kek hash)) ; 0.287944
(timeit 1000000 (alist-get :kek alist)) ; 0.335358

For those of you that don't know, #s is a reader macro which allows significantly easier definition of hash tables, giving definitions a plist-like syntax. All hash tables have a size of 65 by default, and a size of 200000 did not produce different results.

It looks like plists win for a small n, while alists surprisingly perform worse than hash tables. It appears alist-get has some overhead which could be worked out, because the function would logically be very similar to plist-get.

Hash tables, surprisingly, have very light overhead compared to plists, so using them for small datasets is also a viable option. How about huge datasets? Let's populate a list with 300 elements, and then the element we're looking for, using dash:

(setq plist (-snoc (-interleave (make-list 300 :foo) (make-list 300 'bar)) :kek 'bur))
(setq alist (-snoc (-zip (make-list 300 :foo) (make-list 300 'bar)) '(:kek . bur)))
(defun alist-to-ht (list)
  (let ((ht (make-hash-table :size 400)))
    (dolist (item list)
      (puthash (car item) (cdr item) ht))
    ht))
(setq hash (alist-to-ht alist))

(timeit 1000000 (plist-get plist :kek)) ; 1.480839
(timeit 1000000 (gethash :kek hash))    ; 0.285781
(timeit 1000000 (alist-get :kek alist)) ; 1.290769

As anybody who has taken a first-year CS course would guess, the hash table wins when n becomes large. It appears alist-get performs slightly better for large n as well. I wonder why? Let's look at plist-get, first.

Disassembling plist-get

Checking out the emacs git repository, registering the source directory with emacs, and pressing the location link after typing C-h f plist-get will take us right where we need to go: the definition of plist-get in emacs' C source

DEFUN ("plist-get", Fplist_get, Splist_get, 2, 2, 0,
       doc: /* long docstring */ )
  (Lisp_Object plist, Lisp_Object prop)
{
  FOR_EACH_TAIL_SAFE (plist)
  {
    if (! CONSP (XCDR (li.tail)))
      break;
    if (EQ (prop, XCAR (li.tail)))
      return XCAR (XCDR (li.tail));
    li.tail = XCDR (li.tail);
    if (EQ (li.tail, li.tortoise))
      break;
  }
  return Qnil;
}

Oh, so that's why they call it editor macros. It looks like FOR_EACH_TAIL_SAFE is doing a lot of either unnecessary or expensive operations in its incrementor. alist-get appears to suffer from some overhead from being written in elisp, but perhaps iterating through each list on the elisp side of things is less heavy.

Thankfully, I don't have to read through that heap of macros myself; gcc can do it for me. gcc -E to the rescue!

(8) Splist_get = { { PVEC_SUBR << PSEUDOVECTOR_AREA_BITS }, { .a2 = Fplist_get }, 2, 2, "plist-get", 0, 0}; Lisp_Object Fplist_get
(Lisp_Object plist, Lisp_Object prop)
{ // Hope you like horizontal scrolling
  for (struct for_each_tail_internal li = { plist, plist, 2, 0, 2 };
       (((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 < (9223372036854775807L)) ? - (1 << 3) : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons) || ((void) 0, 0);
       (li.tail = (((void) (0 && ((((enum Lisp_Type)
                                    ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 < (9223372036854775807L))
                                                   ? - (1 << 3) : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))),
                   (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) ((li.tail) - (Lisp_Cons)), 8))->u.cdr,
        ((--li.q != 0 || (( 0 ) ? maybe_quit () : (void) 0, 0 < --li.n)
        || (li.q = li.n = li.max <<= 1, li.n >>= (((0) < 0) + ((((((!!((((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) >> 4) & 8)
        + !!((((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) >> 4) & 4)
        + !!((((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) >> 4) & 2)
        + !!((((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) >> 4) & 1))
        + (!!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) & 8)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) & 4)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) & 2)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 8) & 1)))
        + ((!!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 4) & 8)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 4) & 4)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 4) & 2)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) >> 4) & 1))
        + (!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 16) & 1))))
        + (((!!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) >> 4) & 8)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) >> 4) & 4)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) >> 4) & 2)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) >> 4) & 1))
        + (!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 8) & 1)))
        + ((!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) >> 4) & 1))
        + (!!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 31 >> 1) & 1)))))
        + ((((!!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) >> 4) & 8)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) >> 4) & 4)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) >> 4) & 2)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) >> 4) & 1))
        + (!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 8) & 1)))
        + ((!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) >> 4) & 1)) + (!!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 16) & 1)))) + (((!!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) >> 4) & 1))
        + (!!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 8) & 1)))
        + ((!!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 4) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 4) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 4) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) >> 4) & 1))
        + (!!((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) & 8)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) & 4)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) & 2)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 31 >> 2) & 1))))))
        + (((((!!(((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) >> 4) & 8)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) >> 4) & 4)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) >> 4) & 2)
        + !!(((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) >> 4) & 1)) + (!!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 8) & 1)))
        + ((!!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) >> 4) & 1)) + (!!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 16) & 1))))
        + (((!!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) >> 4) & 1)) + (!!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 8) & 1)))
        + ((!!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 4) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 4) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 4) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 31 >> 1) >> 4) & 1)) + (!!((((0x7fff * 2 + 1)) >> 31 >> 1) & 8)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 1) & 4)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 1) & 2)
        + !!((((0x7fff * 2 + 1)) >> 31 >> 1) & 1)))))
        + ((((!!((((((0x7fff * 2 + 1)) >> 16) >> 8) >> 4) & 8)
        + !!((((((0x7fff * 2 + 1)) >> 16) >> 8) >> 4) & 4)
        + !!((((((0x7fff * 2 + 1)) >> 16) >> 8) >> 4) & 2)
        + !!((((((0x7fff * 2 + 1)) >> 16) >> 8) >> 4) & 1)) + (!!(((((0x7fff * 2 + 1)) >> 16) >> 8) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 8) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 8) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 8) & 1)))
        + ((!!(((((0x7fff * 2 + 1)) >> 16) >> 4) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 4) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 4) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 16) >> 4) & 1))
        + (!!((((0x7fff * 2 + 1)) >> 16) & 8)
        + !!((((0x7fff * 2 + 1)) >> 16) & 4)
        + !!((((0x7fff * 2 + 1)) >> 16) & 2)
        + !!((((0x7fff * 2 + 1)) >> 16) & 1))))
        + (((!!(((((0x7fff * 2 + 1)) >> 8) >> 4) & 8)
        + !!(((((0x7fff * 2 + 1)) >> 8) >> 4) & 4)
        + !!(((((0x7fff * 2 + 1)) >> 8) >> 4) & 2)
        + !!(((((0x7fff * 2 + 1)) >> 8) >> 4) & 1)) + (!!((((0x7fff * 2 + 1)) >> 8) & 8)
        + !!((((0x7fff * 2 + 1)) >> 8) & 4)
        + !!((((0x7fff * 2 + 1)) >> 8) & 2)
        + !!((((0x7fff * 2 + 1)) >> 8) & 1)))
        + ((!!((((0x7fff * 2 + 1)) >> 4) & 8)
        + !!((((0x7fff * 2 + 1)) >> 4) & 4)
        + !!((((0x7fff * 2 + 1)) >> 4) & 2)
        + !!((((0x7fff * 2 + 1)) >> 4) & 1)) + (!!(((0x7fff * 2 + 1)) & 8)
        + !!(((0x7fff * 2 + 1)) & 4)
        + !!(((0x7fff * 2 + 1)) & 2)
        + !!(((0x7fff * 2 + 1)) & 1)))))))),
        li.tortoise = li.tail,  0 )) && ((li.tail) == (li.tortoise))) ? ((void) (li.tail = builtin_lisp_symbol (0))) : (void) 0)) {
    if (!(((enum Lisp_Type) (((((void) ( 0 && ((((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L) ) ? - (1 << 3)
        : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))), (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) ((li.tail) - (Lisp_Cons)), 8))->u.cdr)
        & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L)) ? - (1 << 3) : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons))
      break;
    if (((prop) == ((((void) ( 0  && ((((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L) ) ? - (1 << 3)
       : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))), (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) ((li.tail) - (Lisp_Cons)), 8))->car)))
      return (((void) ( 0  && ((((enum Lisp_Type) (((((void) ( 0  && ((((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L) ) ? - (1 << 3)
             : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))), (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) ((li.tail) - (Lisp_Cons)), 8))->u.cdr)
             & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L) ) ? - (1 << 3) : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))),
             (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) (((((void) ( 0  && ((((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2
             < (9223372036854775807L)) ? - (1 << 3) : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))), (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t)
             ((li.tail) - (Lisp_Cons)), 8))->u.cdr) - (Lisp_Cons)), 8))->car;
             li.tail = (((void) ( 0  && ((((enum Lisp_Type) ((li.tail) & ~(((0x7fffffffffffffffL >> (3 - 1)) / 2 <  (9223372036854775807L)) ? - (1 << 3)
             : (0x7fffffffffffffffL >> (3 - 1))))) == Lisp_Cons)))), (struct Lisp_Cons *) __builtin_assume_aligned ((void *) (intptr_t) ((li.tail) - (Lisp_Cons)), 8))->u.cdr;
    if (((li.tail) == (li.tortoise)))
      break;
  }
  return builtin_lisp_symbol (0);
}

Oh dear, that's very unhelpful, and there's no way that's getting formatted correctly on mobile. We need to go deeper. Disassembling the file reveals that plist_get takes around 80 instructions per loop, and has 11 possible jumps per iteration. Five of those jumps are in the for loop's incrementation step. I'm sure we don't need five jumps to move through an iterator, but that's for another day.

  5acd:   55                      push   rbp
  5ace:   48 89 e5                mov    rbp,rsp
  5ad1:   48 83 ec 40             sub    rsp,0x40
  5ad5:   48 89 7d c8             mov    QWORD PTR [rbp-0x38],rdi
  5ad9:   48 89 75 c0             mov    QWORD PTR [rbp-0x40],rsi
FOR_EACH_TAIL_SAFE (plist)
  5add:   48 8b 45 c8             mov    rax,QWORD PTR [rbp-0x38]
  5ae1:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
  5ae5:   48 8b 45 c8             mov    rax,QWORD PTR [rbp-0x38]
  5ae9:   48 89 45 d8             mov    QWORD PTR [rbp-0x28],rax
  5aed:   48 c7 45 e0 02 00 00    mov    QWORD PTR [rbp-0x20],0x2
  5af4:   00
  5af5:   48 c7 45 e8 00 00 00    mov    QWORD PTR [rbp-0x18],0x0
  5afc:   00
  5afd:   66 c7 45 f0 02 00       mov    WORD PTR [rbp-0x10],0x2
  5b03:   e9 e7 00 00 00          jmp    5bef <Fplist_get+0x122>
  {
    if (! CONSP (XCDR (li.tail)))
  5b08:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5b0c:   48 83 e8 03             sub    rax,0x3
  5b10:   48 8b 40 08             mov    rax,QWORD PTR [rax+0x8]
  5b14:   83 e0 07                and    eax,0x7
  5b17:   83 f8 03                cmp    eax,0x3
  5b1a:   0f 85 e1 00 00 00       jne    5c01 <Fplist_get+0x134>
      break;
    if (EQ (prop, XCAR (li.tail)))
  5b20:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5b24:   48 83 e8 03             sub    rax,0x3
  5b28:   48 8b 00                mov    rax,QWORD PTR [rax]
  5b2b:   48 3b 45 c0             cmp    rax,QWORD PTR [rbp-0x40]
  5b2f:   75 18                   jne    5b49 <Fplist_get+0x7c>
      return XCAR (XCDR (li.tail));
  5b31:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5b35:   48 83 e8 03             sub    rax,0x3
  5b39:   48 8b 40 08             mov    rax,QWORD PTR [rax+0x8]
  5b3d:   48 83 e8 03             sub    rax,0x3
  5b41:   48 8b 00                mov    rax,QWORD PTR [rax]
  5b44:   e9 c6 00 00 00          jmp    5c0f <Fplist_get+0x142>
    li.tail = XCDR (li.tail);
  5b49:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5b4d:   48 83 e8 03             sub    rax,0x3
  5b51:   48 8b 40 08             mov    rax,QWORD PTR [rax+0x8]
  5b55:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
    if (EQ (li.tail, li.tortoise))
  5b59:   48 8b 55 d0             mov    rdx,QWORD PTR [rbp-0x30]
  5b5d:   48 8b 45 d8             mov    rax,QWORD PTR [rbp-0x28]
  5b61:   48 39 c2                cmp    rdx,rax
  5b64:   0f 84 9a 00 00 00       je     5c04 <Fplist_get+0x137>
FOR_EACH_TAIL_SAFE (plist)
  5b6a:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5b6e:   48 83 e8 03             sub    rax,0x3
  5b72:   48 8b 40 08             mov    rax,QWORD PTR [rax+0x8]
  5b76:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
  5b7a:   0f b7 45 f0             movzx  eax,WORD PTR [rbp-0x10]
  5b7e:   83 e8 01                sub    eax,0x1
  5b81:   66 89 45 f0             mov    WORD PTR [rbp-0x10],ax
  5b85:   0f b7 45 f0             movzx  eax,WORD PTR [rbp-0x10]
  5b89:   66 85 c0                test   ax,ax
  5b8c:   75 46                   jne    5bd4 <Fplist_get+0x107>
  5b8e:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
  5b92:   48 83 e8 01             sub    rax,0x1
  5b96:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
  5b9a:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
  5b9e:   48 85 c0                test   rax,rax
  5ba1:   7f 31                   jg     5bd4 <Fplist_get+0x107>
  5ba3:   48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
  5ba7:   48 01 c0                add    rax,rax
  5baa:   48 89 45 e0             mov    QWORD PTR [rbp-0x20],rax
  5bae:   48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
  5bb2:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
  5bb6:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
  5bba:   66 89 45 f0             mov    WORD PTR [rbp-0x10],ax
  5bbe:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
  5bc2:   48 c1 f8 10             sar    rax,0x10
  5bc6:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
  5bca:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5bce:   48 89 45 d8             mov    QWORD PTR [rbp-0x28],rax
  5bd2:   eb 1b                   jmp    5bef <Fplist_get+0x122>
  5bd4:   48 8b 55 d0             mov    rdx,QWORD PTR [rbp-0x30]
  5bd8:   48 8b 45 d8             mov    rax,QWORD PTR [rbp-0x28]
  5bdc:   48 39 c2                cmp    rdx,rax
  5bdf:   75 0e                   jne    5bef <Fplist_get+0x122>
  5be1:   bf 00 00 00 00          mov    edi,0x0
  5be6:   e8 00 00 00 00          call   5beb <Fplist_get+0x11e>
  5beb:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
  5bef:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
  5bf3:   83 e0 07                and    eax,0x7
  5bf6:   83 f8 03                cmp    eax,0x3
  5bf9:   0f 84 09 ff ff ff       je     5b08 <Fplist_get+0x3b>
  5bff:   eb 04                   jmp    5c05 <Fplist_get+0x138>
      break;
  5c01:   90                      nop
  5c02:   eb 01                   jmp    5c05 <Fplist_get+0x138>
      break;
  5c04:   90                      nop
  }

return Qnil;
  5c05:   bf 00 00 00 00          mov    edi,0x0
  5c0a:   e8 00 00 00 00          call   5c0f <Fplist_get+0x142>

Disassembling alist-get

alist-get is an O(1) elisp wrapper around assq, another C function. Inspecting the source and disassembly reveals a slight efficiency gain over plists.

DEFUN ("assq", Fassq, Sassq, 2, 2, 0,
       doc: /* long docstring */
  (Lisp_Object key, Lisp_Object list)
{
  FOR_EACH_TAIL (list) {
    if (CONSP (XCAR (li.tail)) && EQ (XCAR (XCAR (li.tail)), key))
      return XCAR (li.tail);
  }
  return Qnil;
}
    39d6:   55                      push   rbp
    39d7:   48 89 e5                mov    rbp,rsp
    39da:   48 83 ec 40             sub    rsp,0x40
    39de:   48 89 7d c8             mov    QWORD PTR [rbp-0x38],rdi
    39e2:   48 89 75 c0             mov    QWORD PTR [rbp-0x40],rsi
  FOR_EACH_TAIL (list)
    39e6:   48 8b 45 c0             mov    rax,QWORD PTR [rbp-0x40]
    39ea:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
    39ee:   48 8b 45 c0             mov    rax,QWORD PTR [rbp-0x40]
    39f2:   48 89 45 d8             mov    QWORD PTR [rbp-0x28],rax
    39f6:   48 c7 45 e0 02 00 00    mov    QWORD PTR [rbp-0x20],0x2
    39fd:   00
    39fe:   48 c7 45 e8 00 00 00    mov    QWORD PTR [rbp-0x18],0x0
    3a05:   00
    3a06:   66 c7 45 f0 02 00       mov    WORD PTR [rbp-0x10],0x2
    3a0c:   e9 c3 00 00 00          jmp    3ad4 <Fassq+0xfe>
    if (CONSP (XCAR (li.tail)) && EQ (XCAR (XCAR (li.tail)), key))
    3a11:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3a15:   48 83 e8 03             sub    rax,0x3
    3a19:   48 8b 00                mov    rax,QWORD PTR [rax]
    3a1c:   83 e0 07                and    eax,0x7
    3a1f:   83 f8 03                cmp    eax,0x3
    3a22:   75 28                   jne    3a4c <Fassq+0x76>
    3a24:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3a28:   48 83 e8 03             sub    rax,0x3
    3a2c:   48 8b 00                mov    rax,QWORD PTR [rax]
    3a2f:   48 83 e8 03             sub    rax,0x3
    3a33:   48 8b 00                mov    rax,QWORD PTR [rax]
    3a36:   48 3b 45 c8             cmp    rax,QWORD PTR [rbp-0x38]
    3a3a:   75 10                   jne    3a4c <Fassq+0x76>
      return XCAR (li.tail);
    3a3c:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3a40:   48 83 e8 03             sub    rax,0x3
    3a44:   48 8b 00                mov    rax,QWORD PTR [rax]
    3a47:   e9 b5 00 00 00          jmp    3b01 <Fassq+0x12b>
  FOR_EACH_TAIL (list)
    3a4c:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3a50:   48 83 e8 03             sub    rax,0x3
    3a54:   48 8b 40 08             mov    rax,QWORD PTR [rax+0x8]
    3a58:   48 89 45 d0             mov    QWORD PTR [rbp-0x30],rax
    3a5c:   0f b7 45 f0             movzx  eax,WORD PTR [rbp-0x10]
    3a60:   83 e8 01                sub    eax,0x1
    3a63:   66 89 45 f0             mov    WORD PTR [rbp-0x10],ax
    3a67:   0f b7 45 f0             movzx  eax,WORD PTR [rbp-0x10]
    3a6b:   66 85 c0                test   ax,ax
    3a6e:   75 4b                   jne    3abb <Fassq+0xe5>
    3a70:   e8 00 00 00 00          call   3a75 <Fassq+0x9f>
    3a75:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
    3a79:   48 83 e8 01             sub    rax,0x1
    3a7d:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
    3a81:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
    3a85:   48 85 c0                test   rax,rax
    3a88:   7f 31                   jg     3abb <Fassq+0xe5>
    3a8a:   48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
    3a8e:   48 01 c0                add    rax,rax
    3a91:   48 89 45 e0             mov    QWORD PTR [rbp-0x20],rax
    3a95:   48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
    3a99:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
    3a9d:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
    3aa1:   66 89 45 f0             mov    WORD PTR [rbp-0x10],ax
    3aa5:   48 8b 45 e8             mov    rax,QWORD PTR [rbp-0x18]
    3aa9:   48 c1 f8 10             sar    rax,0x10
    3aad:   48 89 45 e8             mov    QWORD PTR [rbp-0x18],rax
    3ab1:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3ab5:   48 89 45 d8             mov    QWORD PTR [rbp-0x28],rax
    3ab9:   eb 19                   jmp    3ad4 <Fassq+0xfe>
    3abb:   48 8b 55 d0             mov    rdx,QWORD PTR [rbp-0x30]
    3abf:   48 8b 45 d8             mov    rax,QWORD PTR [rbp-0x28]
    3ac3:   48 39 c2                cmp    rdx,rax
    3ac6:   75 0c                   jne    3ad4 <Fassq+0xfe>
    3ac8:   48 8b 45 c0             mov    rax,QWORD PTR [rbp-0x40]
    3acc:   48 89 c7                mov    rdi,rax
    3acf:   e8 00 00 00 00          call   3ad4 <Fassq+0xfe>
    3ad4:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3ad8:   83 e0 07                and    eax,0x7
    3adb:   83 f8 03                cmp    eax,0x3
    3ade:   0f 84 2d ff ff ff       je     3a11 <Fassq+0x3b>
    3ae4:   48 8b 45 d0             mov    rax,QWORD PTR [rbp-0x30]
    3ae8:   48 8b 55 c0             mov    rdx,QWORD PTR [rbp-0x40]
    3aec:   48 89 d6                mov    rsi,rdx
    3aef:   48 89 c7                mov    rdi,rax
    3af2:   e8 00 00 00 00          call   3af7 <Fassq+0x121>
  return Qnil;
    3af7:   bf 00 00 00 00          mov    edi,0x0
    3afc:   e8 00 00 00 00          call   3b01 <Fassq+0x12b>
}
    3b01:   c9                      leave
    3b02:   c3                      ret

For those keeping count, that's 70 instructions per loop, and 9 jumps. That'd be the source of our constant-factor speedup. The slowness for small values of n are probably because of the elisp layer, which runs only once per call.

Of course, big n is better served with hash tables, as the prettier literal syntax of plists and alists becomes unwieldy after n grows larger than 5.

I have plenty of other things to cover about emacs' performance, like the myth of recursive vs. iterative code, lists and vectors, and the common lisp libraries. I'll cover those in a future blog.