I had heard that the vector class in the C++ STL doesn't double its capacity every time it needs to grow (as we were taught to do in data structures class), but instead multiplies it by 1.5. In a recent article on comp.lang.c++.moderated, Andrew Koenig explains why. It's related to the golden ratio and the Fibonacci sequence:
From: "Andrew Koenig" <ark@acm.org> Subject: Re: vector growth factor of 1.5 Newsgroups: comp.lang.c++.moderated Date: 5 Nov 2003 17:26:10 -0500 Organization: AT&T Worldnet There is a technical reason to prefer 1.5 to 2 -- more specifically, to prefer values less than 1+sqrt(5)/2. Suppose you are using a first-fit memory allocator, and you're progressively appending to a vector. Then each time you reallocate, you allocate new memory, copy the elements, then free the old memory. That leaves a gap, and it would be nice to be able to use that memory eventually. If the vector grows too rapidly, it will always be too big for the available memory. It turns out that if the growth factor is >= 1+sqrt(5)/2, the new memory will always be too big for the hole that has been left sofar; if it is <1+sqrt(5)/2, the new memory will eventually fit. So 1.5 is small enough to allow the memory to be recycled.
3 comments:
Why?
Hello Amit,
Thank you very much for all your excellent posts and articles, particularly on path-finding. Your experience and explanations have taught me a great deal about topics I'm becoming increasingly interested in, and I really appreciate the significant effort you have clearly put into it all.
I realize that this original post is quite old now and has not seen any recent comment activity, but I'm hoping you may still get notified about my new comment here and may be able to respond when you have time.
Maybe I have made an error, but my calculation of 1+sqrt(5)/2 yields a little over 2.118 as the threshold factor described. I too was taught in college that doubling the size was good practice, and doubling still seems to fall within that acceptable range. So why would 1.5 be the better choice?
I'm also not quite grasping how the outgrown, originally allocated memory block would leave a gap that is reusable later, given that the vector presumably continues to expand. That doesn't make sense to me yet either.
Thanks in advance for any explanation you might be able to provide, at your convenience.
-Pip
Hi Pip, thanks! I think this only applies to certain types of allocators.
Suppose your vector was size N initially, and then you resized it to S*N. The memory allocator allocates a new block of S*N bytes after the original N bytes, the vector copies the elements over to the new space, and the original block is freed. That leaves the memory as N free bytes, then S*N used bytes.
Then you resize again, to S*S*N. The new block can't fit in those N free bytes, so it's allocated at the end, the elements are copied, and the S*N bytes are released. Those two holes are adjacent, so your memory now looks like a single hole of N + S*N free bytes, then S*S*N used bytes.
The *next* time you resize, to S*S*S*N, you need a new block. If you want it to reuse those N + S*N bytes, you need S*S*S*N < N + S*N.
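To check this concretely, here's a toy simulation of the model above. It's my sketch, not from the original thread, and it assumes one contiguous heap, first-fit placement, and no other allocations happening in between:

#include <iostream>

// Toy model of the reallocation pattern described above: one contiguous
// heap, first-fit placement, nothing else allocating in between.
// "hole" is the total free space to the left of the live block, in units
// of the initial size N; the next block fits once growth * live <= hole.
void simulate(double growth, int max_reallocs) {
    double hole = 0.0, live = 1.0;
    for (int i = 1; i <= max_reallocs; ++i) {
        double next = growth * live;
        if (next <= hole) {
            std::cout << "factor " << growth << ": hole reused at reallocation "
                      << i << " (hole " << hole << "N, block " << next << "N)\n";
            return;
        }
        hole += live;  // the old block is freed and joins the hole
        live = next;
    }
    std::cout << "factor " << growth << ": hole never reused after "
              << max_reallocs << " reallocations\n";
}

int main() {
    simulate(1.5, 100);  // reuses the hole at the fifth reallocation
    simulate(2.0, 100);  // the new block always outgrows the hole
}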
For S = 1.5 that inequality actually fails (3.375*N > 2.5*N), but the hole keeps accumulating with every reallocation, and by the fifth reallocation the new block does fit. Whether it *ever* fits depends on S: the cutoff works out to be the golden ratio, (1+sqrt(5))/2, which is about 1.618. So Koenig's "1+sqrt(5)/2" should be read with the whole 1+sqrt(5) over 2, which is why your calculation gave 2.118. That also answers your first question: a growth factor of 2 is above the threshold, so with this kind of allocator the hole never gets reused, while 1.5 is safely below it. The golden ratio is the positive solution of S*S = S + 1, the same relation behind the Fibonacci numbers, where each number is the sum of the previous two; that's where the sqrt(5) comes from.
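Here's the algebra behind that cutoff, as a sketch in LaTeX notation (my derivation, not from the original thread, using the same S and N):

After the $k$-th reallocation the live block has size $S^k N$ and the hole to its
left has size $N(1 + S + \cdots + S^{k-1}) = N \frac{S^k - 1}{S - 1}$.
The next block fits into the hole when
\[
  S^{k+1} N \;\le\; N \frac{S^k - 1}{S - 1}
  \quad\Longleftrightarrow\quad
  S^k \left( S^2 - S - 1 \right) \;\le\; -1 .
\]
If $S^2 - S - 1 < 0$, the left-hand side tends to $-\infty$ as $k$ grows, so the
inequality eventually holds; if $S^2 - S - 1 \ge 0$, it never does. The boundary
case $S^2 = S + 1$ has the golden ratio $S = \frac{1 + \sqrt{5}}{2} \approx 1.618$
as its positive root, the same equation satisfied in the limit by ratios of
consecutive Fibonacci numbers.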
Also, this analysis probably doesn't apply if you have a different type of memory allocator, or if you're allocating other memory elsewhere in the program…