[Larceny-users] How to lock down Larceny memory

Discussion:

David Rush

2009-07-13 12:05:02 UTC

I'm currently working on ODBC bindings for Larceny. Unfortunately the
ODBC standard requires the caller to manage all memory, which is a
little bit problematic. Thus far I have been able to marshall data out
of the bytevectors I am using as buffers, but I have now reached the
point where I will need to keep a bytevector pinned at the address it
has when I first push it through the FFI across many different calls
(while fetching data). I fully expect Larceny to invoke the collector
somewhere as I build up the in-memory data structures from the DB
storage, so I'd like to know how to protect my buffers from getting
moved by the collector... assuming that it can be done, that is.

Thanks in advance.

david

--
GPG Public key at http://cyber-rush.org/drr/gpg-public-key.txt

Felix Klock

2009-07-13 15:05:29 UTC

David-

> [...] I will need to keep a bytevector pinned at the address it
> has when I first push it through the FFI across many different calls
> (while fetching data). I fully expect Larceny to invoke the collector
> somewhere as I build up the in-memory data structures from the DB
> storage, so I'd like to know how to protect my buffers from getting
> moved by the collector [...]

The short answer: the Larceny FFI has to do this too (my Scheme
Workshop 2008 paper [1] talks a bit about why), so we have some
support for it; look in lib/Ffi/memory.sch.

The long answer:

Right now Larceny does not have proper support for object pinning;
that is, there is no primitive operation that takes an arbitrary
previously allocated object and commands the GC to cease moving that
object until it is unpinned or unreachable.

Having said that, one can almost define a procedure that would have
the semantics given in the previous paragraph.

I am going to break the problem of pinning an object into smaller
parts.

Part 1. Allocating an object and ensuring it will never be moved by
the collector (the "immovable object" problem)

There are procedures in the Larceny FFI library that do this:
make-nonrelocatable-bytevector, cons-nonrelocatable, and
make-nonrelocatable-vector, all defined in lib/Ffi/memory.sch. So the
software engineer in me wants to say "just use the same procedures
that the FFI uses; they will be supported as much as the FFI is." But
there is a caveat: the current implementation of all three procedures
puts the objects into the static area, which means the storage they
allocate will never be reclaimed.

This may not matter if you are only allocating a small number of
bytevectors via this technique. If it does matter, there is another
approach that you can use; I've included it as footnote [2]. (The
approach is a definite hack and is not officially supported, which is
why I'm putting it into a footnote.)

Having said that, I *want* to fix the implementation of the procedures
used by the FFI and offer it via the same API. I have thought of ways
to use tricks similar to footnote [2] to get nonrelocatable objects
that can still be reclaimed by garbage collection. I cannot spend
time on the problem in the short term, but if you need better support
for this feature than is currently present, let us know. (If you were
to offer to help test alpha-quality implementations of the feature,
that might act as a catalyst towards getting them implemented.)

----

Part 2. How to replace one previously allocated object with another
newly allocated object (the "become" problem, though I was tempted to
call it the "unstoppable force" problem :)

I hope you don't actually need to do this; it sounded from your
problem description above that a solution to part 1 would suffice for
you. I have included one approach to it in footnote [3], but there is
no guarantee that this hack applies to your situation.

----

We may offer proper support for object pinning in the future; I know
Will has told me that he sees it as the right direction.

-Felix

[1] http://www.ccs.neu.edu/home/will/scheme2008/paper7.pdf

[2] There are essentially two ways to allocate an object and ensure it
never moves in the current runtime system. One way is to allocate the
object to the static area; one might think of the static area as a
generation that is so long-lived that it is assumed to be immortal by
the garbage collector.

The other way is to allocate the object to the Large Object Space
(LOS); the LOS is an area where the runtime employs a mark-sweep
technique to collect the objects rather than incur the cost of copying
them. There is currently no way to directly allocate into the LOS,
but any object that is promoted out of the nursery *and* is larger
than a certain threshold will end up there. You can certainly ensure
that those two conditions are satisfied (for bytevectors and vectors);
for the latter, just allocate a larger object. For the former,
invoking the (collect) procedure will force the nursery to be
collected, so all of the objects in the nursery will be promoted out
of it, including any large objects that happened to fit into the
nursery.

[3] The sro procedure can be used to build up a vector holding every
reachable object in the heap. To get the effect of replacing an
object X with another object Y, one could use sro to find all of the
objects that refer to X and then bang on them to now refer to object
Y. You would need to make sure to traverse three kinds of traversable
objects holding such references: pairs, vector-likes, and procedures.
The code to do this is easy to write, once one knows about the six
procedures {vector-like,procedure}-{length,ref,set!}; it's just such a
simple-mindedly slow and scary hack that I do not want to recommend it
to the faint of heart. (Also, there exist constructions that will
hide references that need to be updated in bytevectors, but I consider
those invalid inputs for the "become" problem.)
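
A minimal sketch of the rewriting step described in footnote [3]; it
is an illustration, not code from Larceny. It assumes the caller
already has a vector of candidate objects (the role sro plays above)
and, to stay in portable Scheme, it only patches pairs and ordinary
vectors. The vector-like and procedure cases would need the
Larceny-specific accessors named in the footnote. The name
replace-references! is hypothetical.

```scheme
(import (scheme base))

;; replace-references! is a hypothetical helper, not a Larceny procedure.
;; objects: a vector of heap objects to inspect (a stand-in for the
;;          vector that sro would build).
;; Every pair field or vector slot that is eq? to old is overwritten
;; with new. Procedures and other vector-like objects are NOT handled.
(define (replace-references! objects old new)
  (define (patch-pair! p)
    (when (eq? (car p) old) (set-car! p new))
    (when (eq? (cdr p) old) (set-cdr! p new)))
  (define (patch-vector! v)
    (do ((i 0 (+ i 1)))
        ((= i (vector-length v)))
      (when (eq? (vector-ref v i) old)
        (vector-set! v i new))))
  (do ((i 0 (+ i 1)))
      ((= i (vector-length objects)))
    (let ((obj (vector-ref objects i)))
      (cond ((pair? obj) (patch-pair! obj))
            ((vector? obj) (patch-vector! obj))))))
```

The cost is one pass over every object handed in, which is why this
only makes sense as a last resort.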

David Rush

2009-07-13 19:31:38 UTC

>> [...] I will need to keep a bytevector pinned at the address it
>> has when I first push it through the FFI across many different calls

> The short answer: the Larceny FFI has to do this too (my Scheme Workshop
> 2008 paper [1] talks a bit about why), so we have some support for it; look
> in lib/Ffi/memory.sch.

Ok :)
(re: allocating bytevectors that won't ever be collected)

This is relatively easy to manage by using an old-style free-list in
my user-space. Obviously, this is less than ideal.
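
The free-list idea can be sketched roughly as follows. This is an
illustration rather than code from the thread: the names
make-buffer-pool, pool-acquire!, and pool-release! are hypothetical,
and the allocator is passed in as a parameter. The example uses
make-bytevector so that it runs in any R7RS Scheme; in Larceny one
would presumably pass make-nonrelocatable-bytevector instead, assuming
it accepts a length argument.

```scheme
(import (scheme base))

;; A pool remembers how to allocate a buffer, the fixed buffer size,
;; and a list of buffers that have been released and can be reused.
(define-record-type buffer-pool
  (make-raw-pool allocate size free)
  buffer-pool?
  (allocate pool-allocate)
  (size pool-size)
  (free pool-free set-pool-free!))

;; allocate: a procedure taking a length and returning a buffer.
(define (make-buffer-pool allocate size)
  (make-raw-pool allocate size '()))

;; Reuse a released buffer if one exists; otherwise allocate a new one.
(define (pool-acquire! pool)
  (let ((free (pool-free pool)))
    (if (null? free)
        ((pool-allocate pool) (pool-size pool))
        (begin
          (set-pool-free! pool (cdr free))
          (car free)))))

;; Put a buffer back on the free list instead of dropping it, so
;; storage that is never reclaimed is at least recycled.
(define (pool-release! pool buf)
  (set-pool-free! pool (cons buf (pool-free pool))))

;; Example pool of 4096-byte buffers (the size is an arbitrary choice).
;; ASSUMPTION: in Larceny, swap the lambda for an allocator that
;; returns nonrelocatable storage.
(define pool
  (make-buffer-pool (lambda (n) (make-bytevector n 0)) 4096))
```

Because buffers allocated in the static area are never reclaimed, the
pool only bounds the leak to the peak number of buffers in use at
once.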

> Having said that, I *want* to fix the implementation of the procedures used
> by the FFI and offer it via the same API. I have thought of ways to use
> tricks similar to footnote [2] to get nonrelocatable objects that can still
> be reclaimed by garbage collection. I cannot spend time on the problem in
> the short term, but if you need better support for this feature than is
> currently present, let us know. (If you were to offer to help test
> alpha-quality implementations of the feature, that might act as a catalyst
> towards getting them implemented.)

Hmmm...well better support would definitely help out a lot. I have
embarked on a skunkworks project re-implementing a 90s-era financial
application (currently a monolithic windoesn't app using ODBC) as a
server farm to go into a high-volume OLTP environment. I would be
happy to test alpha code, assuming that I can transport a bootstrap
larceny heap image from a 32-bit debian build because I don't have a
properly configured environment for building Larceny on Win32. Past
experience leads me to believe that the heaps are fairly portable -
unless you know better, of course.

ATM, I am trying a hack using foreign-stdlib. That should still leave
me close to the kind of interface you would need for pinnable
bytevectors because, regardless of the solution you choose for
Larceny's implementation, at some point you have to finalize the
pinned extent. This is isomorphic to a manual free so it seems like it
would be all good.

> Part 2. How to replace one previously allocated object with another newly
> allocated object
> I hope you don't actually need to do this

No. And I remember the issue in Smalltalk, and it involved ugly hacks
there which are essentially equivalent to the solution you proposed.

> We may offer proper support for object pinning in the future; I know Will
> has told me that he sees it as the right direction.

Yay. Between the workarounds you suggested and the prospects of future
features, I'm good to go :)

> [2] The other way is to allocate the object to the Large Object Space (LOS); the

Evil hack :) But I will remember it if I ever get back around to my
soft synthesizer work. For that one I need big buffers to hand off to
windows...

david rush

Felix Klock

2009-07-13 19:50:00 UTC

David-

> [...] I would be
> happy to test alpha code, assuming that I can transport a bootstrap
> larceny heap image from a 32-bit debian build because I don't have a
> properly configured environment for building Larceny on Win32. Past
> experience leads me to believe that the heaps are fairly portable -
> unless you know better, of course.

Well, if we ask you to help test, I assume we'd hand off win32 heaps
to you to try out...

(I think attempting to directly use a heap built on a linux-x86 host
on a win32-x86 host may lead to some compatibility problems... but
maybe it's only a problem if one attempts to use the twobit.heap... I
have not tried to do it in a while.)

-Felix