[Larceny-users] Performance of List Based Data Structures With ERR5RS Recorcs

Felix Klock

2008-09-14 19:40:43 UTC

Ray-

Post by Ray Racine
Naively I did not expect this result in the sense of O(1) for vector-
like vs. O(n) of cons lists.

I think you may have to use larger records/lists before you'll see the
O(1) versus O(n) matter more than the constant factors. (A list of
length 5 is still pretty short.)

Post by Ray Racine
Records are almost 10x slower then cons lists.

I think you may need to take care in what you are comparing: the cost
of accessing fields in a particular data representation, or the cost
of a procedure call versus a primitive operation that executes as
inlined machine code.

Here's a couple more cases I added to your example:

;;; RAYS ORIGINAL CODE
;;; (with one change, replacing the symbol 'lists with the symbol
'lists-ca/dr)

(import (rnrs))
(import (err5rs records syntactic))
(import (primitives time))

(define-record-type rec
#t #t a b c d e)

'records

(time
(let ((r (make-rec 'a 'b 'c 'd 'e)))
(do ((i 10000000 (fx- i 1)))
((fxzero? i)
(rec-e r))
(rec-a r)
(rec-b r)
(rec-c r)
(rec-d r)
(rec-e r)
)))

'lists-ca/dr

(time
(let ((data '(a b c d e)))
(do ((i 10000000 (fx- i 1)))
((fxzero? i)
(cadr (cdddr data)))
(car data)
(cadr data)
(caddr data)
(cadddr data)
(cadr (cdddr data))
)))

;;; FELIXS NEW CODE

'lists-ref

(time
(let ((data '(a b c d e)))
(do ((i 10000000 (fx- i 1)))
((fxzero? i)
(list-ref data 4))
(list-ref data 0)
(list-ref data 1)
(list-ref data 2)
(list-ref data 3)
(list-ref data 4)
)))

'lists-accessors

(define (list-a data) (car data))
(define (list-b data) (cadr data))
(define (list-c data) (caddr data))
(define (list-d data) (cadddr data))
(define (list-e data) (cadr (cdddr data)))

(time
(let ((data '(a b c d e)))
(do ((i 10000000 (fx- i 1)))
((fxzero? i)
(list-ref data 4))
(list-a data)
(list-b data)
(list-c data)
(list-d data)
(list-e data)
)))

;;; END OF CODE

The 'list-accessors code at the end is meant to be the closest match
to the 'records code, since both involve procedure invocations for
"field access"

Here are the results I get when I run this. (I don't have the specs
for the machine immediately available, but for future reference, I'm
doing these experiments on the machine named "artichoke" in the CCIS
lab)

Post by Ray Racine
records
Words allocated: 0

Words reclaimed: 0
Elapsed time...: 1115 ms (User: 1112 ms; System: 4 ms)
Elapsed GC time: 0 ms (CPU: 0 in 0 collections.)
e

Post by Ray Racine
lists-ca/dr
Words allocated: 0

Words reclaimed: 0
Elapsed time...: 141 ms (User: 140 ms; System: 0 ms)
Elapsed GC time: 0 ms (CPU: 0 in 0 collections.)
e

Post by Ray Racine
lists-ref
Words allocated: 0

Words reclaimed: 0
Elapsed time...: 1759 ms (User: 1752 ms; System: 4 ms)
Elapsed GC time: 0 ms (CPU: 0 in 0 collections.)
e

Post by Ray Racine
lists-accessors
Words allocated: 0

Words reclaimed: 0
Elapsed time...: 635 ms (User: 636 ms; System: 0 ms)
Elapsed GC time: 0 ms (CPU: 0 in 0 collections.)
e

So from this I conclude:
1. Inlined car/cdr beats field access via function invocation (in all
three cases).
2. Out-of-line car/cdr beats record accessor functions (by about 2x,
not 10x).
3. List-ref is slower than record accessor functions.

A more thorough experiment would also include a comparison with a
representation based directly on vectors. A more thorough analysis
would also include disassembly of the machine code generated for the
different cases above.

I admit that conclusion 2 above is a little surprising. I suspect the
record accessors are paying some fixed cost in checking that the input
is of the correct record type; the corresponding predicates for pairs
are heavily bummed.

-Felix