Struct wasmtime::PoolingAllocationConfig

pub struct PoolingAllocationConfig { /* private fields */ }

Configuration options used with InstanceAllocationStrategy::Pooling to change the behavior of the pooling instance allocator.

This structure has a builder-style API in the same manner as Config and is configured with Config::allocation_strategy.

Note that usage of the pooling allocator does not affect compiled WebAssembly code. Compiled *.cwasm files, for example, are usable both with and without the pooling allocator.

§Advantages of Pooled Allocation

The main benefit of the pooling allocator is to make WebAssembly instantiation both faster and more scalable in terms of parallelism. Allocation is faster because virtual memory is already configured and ready to go within the pool, so there’s no need to mmap (for example on Unix) a new region and configure it with guard pages. Avoiding mmap also avoids whole-process virtual memory locks, which improves scalability and performance.

Additionally, with pooled allocation it’s possible to create “affine slots” to a particular WebAssembly module or component over time. For example, if the same module is instantiated multiple times over time, the pooling allocator will, by default, attempt to reuse the same slot. This means that the slot has been pre-configured and can retain virtual memory mappings for a copy-on-write image, for example (see Config::memory_init_cow for more information). As a result, in a steady state instance deallocation is a single madvise to reset linear memory to its original contents, followed by a single (optional) mprotect during the next instantiation to shrink memory back to its original size. Compared to non-pooled allocation, this avoids the need to mmap a new region of memory, munmap it, and mprotect regions too.

Another benefit of pooled allocation is that it’s possible to configure things such that no virtual memory management is required at all in a steady state. For example, a pooling allocator can be configured with Config::memory_init_cow disabled, dynamic bounds checks enabled through Config::static_memory_maximum_size(0), and sufficient space through PoolingAllocationConfig::table_keep_resident / PoolingAllocationConfig::linear_memory_keep_resident. With all these options in place no virtual memory tricks are used at all and everything is manually managed by Wasmtime (for example, resetting memory is a memset(0)). This is not as fast in a single-threaded scenario but can provide benefits in high-parallelism situations, as no virtual memory locks or IPIs need to happen.

§Disadvantages of Pooled Allocation

Despite the above advantages to instantiation performance, the pooling allocator is not enabled by default in Wasmtime. One reason is that the performance advantages are not necessarily portable; for example, while the pooling allocator works on Windows, it has not been tuned for performance on Windows in the same way it has on Linux.

Additionally, the main cost of the pooling allocator is that it requires a very large reservation of virtual memory (on the order of most of the addressable virtual address space). WebAssembly 32-bit linear memories in Wasmtime are, by default, 4G address space reservations with a 2G guard region both before and after the linear memory. Memories in the pooling allocator are contiguous, which means that only a guard after each linear memory is needed, because the previous slot’s post-guard serves as the next slot’s pre-guard. This means that, by default, the pooling allocator uses 6G of virtual memory per WebAssembly linear memory slot. 6G of virtual memory is roughly 32.6 bits of a 64-bit address. Many 64-bit systems can only actually use 48-bit addresses by default (although this can be extended on some architectures nowadays), and of those 48 bits one is reserved to indicate kernel-vs-userspace. This leaves 47-32.6=14.4 bits, meaning you can have at most roughly 21k slots of linear memories (2^47 bytes divided by 6G) on many systems by default. This is a relatively small number and shows how the pooling allocator can quickly exhaust all of virtual memory.
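
The address-space arithmetic above can be checked with a few lines of Rust. This is only a back-of-the-envelope sketch: the helper names (slot_bytes, max_slots, slot_bits) are made up for illustration and are not part of Wasmtime, and the result ignores everything else mapped into the process.

```rust
/// Bytes in one GiB.
const GIB: u64 = 1 << 30;

/// Virtual memory reserved per linear memory slot: the address space
/// reservation plus a single trailing guard region (hypothetical helper).
fn slot_bytes(reservation: u64, guard: u64) -> u64 {
    reservation + guard
}

/// Upper bound on the number of slots that fit in a `2^address_bits`-byte
/// address space, ignoring all other mappings in the process.
fn max_slots(address_bits: u32, slot_bytes: u64) -> u64 {
    (1u64 << address_bits) / slot_bytes
}

/// Number of address bits consumed by a single slot.
fn slot_bits(slot_bytes: u64) -> f64 {
    (slot_bytes as f64).log2()
}
```

With the defaults described above, slot_bytes(4 * GIB, 2 * GIB) is 6 GiB, and max_slots(47, 6 * GIB) gives the ceiling on linear memory slots for a 47-bit userspace address space.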

Another disadvantage of the pooling allocator is that it may keep memory alive when nothing is using it. A previously used slot for an instance might have paged-in memory that will not get paged out until the Engine owning the pooling allocator is dropped. While suitable for some applications this behavior may not be suitable for all applications.

Finally, the configuration values for the maximum number of instances, memories, tables, etc., must all be fixed up-front. There’s not always a clear answer as to what these values should be, so not all applications may be able to work with this constraint.

Implementations§

impl PoolingAllocationConfig

pub fn max_unused_warm_slots(&mut self, max: u32) -> &mut Self

Configures the maximum number of “unused warm slots” to retain in the pooling allocator.

The pooling allocator operates over slots to allocate from, and each slot is considered “cold” if it’s never been used before or “warm” if it’s been used by some module in the past. Slots in the pooling allocator additionally track an “affinity” flag to a particular core wasm module. When a module is instantiated into a slot then the slot is considered affine to that module, even after the instance has been deallocated.

When a new instance is created then a slot must be chosen, and the current algorithm for selecting a slot is:

  • If there are slots that are affine to the module being instantiated, then the most recently used slot is selected to be allocated from. This is done to improve reuse of resources such as memory mappings and additionally try to benefit from temporal locality for things like caches.

  • Otherwise if there are more than N affine slots to other modules, then one of those affine slots is chosen to be allocated. The slot chosen is picked on a least-recently-used basis.

  • Finally, if there are N or fewer affine slots to other modules, then the non-affine slots are allocated from.

This setting, max_unused_warm_slots, is the value for N in the above algorithm. The purpose of this setting is to have a knob over the RSS impact of “unused slots” for a long-running wasm server.

If this setting is set to 0, for example, then affine slots are aggressively reused on a least-recently-used basis. A “cold” slot is only used if there are no affine slots available to allocate from. This means that the set of slots used over the lifetime of a program is the same as the maximum concurrent number of wasm instances.

If this setting is set to infinity, however, then cold slots are prioritized to be allocated from. This means that the set of slots used over the lifetime of a program will approach PoolingAllocationConfig::total_memories, or the maximum number of slots in the pooling allocator.
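
The selection policy described above can be sketched as follows. This is a simplified model rather than Wasmtime’s implementation: ModuleId, FreeSlot, pick_slot, and the logical last_used timestamp are hypothetical names, and falling back to a warm slot when no cold slot is free is an assumption.

```rust
/// Hypothetical identifier for a core wasm module.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ModuleId(u32);

/// Hypothetical description of a slot that is currently free.
#[derive(Clone, Copy, Debug)]
struct FreeSlot {
    index: usize,
    /// `None` means the slot is cold (never used before).
    affinity: Option<ModuleId>,
    /// Logical timestamp of the last use; larger means more recent.
    last_used: u64,
}

/// Picks a free slot for `module`, where `max_unused_warm_slots` is N.
fn pick_slot(
    free: &[FreeSlot],
    module: ModuleId,
    max_unused_warm_slots: usize,
) -> Option<usize> {
    // 1. Prefer the most recently used slot affine to this module.
    if let Some(slot) = free
        .iter()
        .filter(|s| s.affinity == Some(module))
        .max_by_key(|s| s.last_used)
    {
        return Some(slot.index);
    }

    // Every remaining warm slot is affine to some other module.
    let warm: Vec<&FreeSlot> = free.iter().filter(|s| s.affinity.is_some()).collect();
    let lru_warm = warm.iter().min_by_key(|s| s.last_used).map(|s| s.index);
    let cold = free.iter().find(|s| s.affinity.is_none()).map(|s| s.index);

    // 2. More than N warm slots: reuse the least recently used one.
    if warm.len() > max_unused_warm_slots {
        return lru_warm;
    }

    // 3. Otherwise take a cold slot (assumed fallback: a warm slot if none is cold).
    cold.or(lru_warm)
}
```

In this model a limit of 0 always reuses a warm slot when one is free, while a very large limit keeps picking cold slots until none remain.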

Wasmtime does not aggressively decommit all resources associated with a slot when the slot is not in use. For example the PoolingAllocationConfig::linear_memory_keep_resident option can be used to keep memory associated with a slot, even when it’s not in use. This means that the total set of used slots in the pooling instance allocator can impact the overall RSS usage of a program.

The default value for this option is 100.

pub fn decommit_batch_size(&mut self, batch_size: usize) -> &mut Self

The target number of decommits to do per batch.

This is not precise, as we can queue up decommits at times when we aren’t prepared to immediately flush them, and so we may go over this target size occasionally.

A batch size of one effectively disables batching.

Defaults to 1.
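
As a rough illustration of what batching means here, the sketch below models a queue that flushes once the target batch size is reached. DecommitQueue and its fields are hypothetical and do not mirror Wasmtime’s internals; in particular, the case where decommits are queued past the target because a flush is not yet possible is not modeled.

```rust
/// Hypothetical model of a decommit queue; not Wasmtime's internal type.
struct DecommitQueue {
    /// Target number of decommits per batch.
    batch_size: usize,
    /// Pending (address, length) regions waiting to be decommitted.
    pending: Vec<(usize, usize)>,
    /// Number of batches flushed so far (for illustration only).
    flushed_batches: usize,
}

impl DecommitQueue {
    fn new(batch_size: usize) -> Self {
        Self {
            // Treat a batch size of zero like one.
            batch_size: batch_size.max(1),
            pending: Vec::new(),
            flushed_batches: 0,
        }
    }

    /// Queues a region and flushes once the target batch size is reached.
    fn enqueue(&mut self, addr: usize, len: usize) {
        self.pending.push((addr, len));
        if self.pending.len() >= self.batch_size {
            self.flush();
        }
    }

    /// Flushes all pending regions as one batch.
    fn flush(&mut self) {
        if self.pending.is_empty() {
            return;
        }
        // A real allocator would issue the decommit system calls here.
        self.pending.clear();
        self.flushed_batches += 1;
    }
}
```

With a batch size of one, every enqueue flushes immediately, which matches the statement that a batch size of one effectively disables batching.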

pub fn async_stack_zeroing(&mut self, enable: bool) -> &mut Self

Configures whether or not stacks used for async futures are reset to zero after usage.

When async support is enabled for Wasmtime through the async_support method and the call_async variant of calling WebAssembly is used, Wasmtime will create a separate runtime execution stack for each future produced by call_async. During the deallocation process Wasmtime won’t, by default, reset the contents of the stack back to zero.

When this option is enabled it can be seen as a defense-in-depth mechanism to reset a stack back to zero. This is not required for correctness and can be a costly operation in highly concurrent environments due to modifications of the virtual address space requiring process-wide synchronization.

This option defaults to false.

pub fn async_stack_keep_resident(&mut self, size: usize) -> &mut Self

How much memory, in bytes, to keep resident for async stacks allocated with the pooling allocator.

When PoolingAllocationConfig::async_stack_zeroing is enabled then Wasmtime will reset the contents of async stacks back to zero upon deallocation. This option can be used to perform the zeroing operation with memset up to a certain threshold of bytes instead of using system calls to reset the stack to zero.

Note that when using this option the memory associated with async stacks will never be decommitted.

pub fn linear_memory_keep_resident(&mut self, size: usize) -> &mut Self

How much memory, in bytes, to keep resident for each linear memory after deallocation.

This option is only applicable on Linux and has no effect on other platforms.

By default Wasmtime will use madvise to reset the entire contents of linear memory back to zero when a linear memory is deallocated. With this option, Wasmtime instead uses memset to zero memory up to the configured number of bytes, which can, in some configurations, reduce the number of page faults taken when a slot is reused.
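
One way to picture the effect of a keep-resident threshold is as a split of the region being reset into a part zeroed in place and a remainder handed back to the operating system. The helper below is a hypothetical sketch of that split, assuming the threshold applies to the start of the region; it is not Wasmtime’s code.

```rust
/// Splits a region of `accessible` bytes into
/// `(bytes zeroed with memset, bytes reset with madvise)`
/// for a given keep-resident threshold (hypothetical helper).
fn reset_plan(accessible: usize, keep_resident: usize) -> (usize, usize) {
    // Never memset more than the region actually contains.
    let memset_len = accessible.min(keep_resident);
    (memset_len, accessible - memset_len)
}
```

For a 1 MiB memory and a 64 KiB threshold this yields 64 KiB zeroed with memset and the rest reset through the kernel; a threshold of 0 leaves everything to madvise.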

pub fn table_keep_resident(&mut self, size: usize) -> &mut Self

How much memory, in bytes, to keep resident for each table after deallocation.

This option is only applicable on Linux and has no effect on other platforms.

This option is the same as PoolingAllocationConfig::linear_memory_keep_resident except that it is applicable to tables instead.

pub fn total_component_instances(&mut self, count: u32) -> &mut Self

The maximum number of concurrent component instances supported (default is 1000).

This provides an upper-bound on the total size of component metadata-related allocations, along with PoolingAllocationConfig::max_component_instance_size. The upper bound is

total_component_instances * max_component_instance_size

where max_component_instance_size is rounded up to the size and alignment of the internal representation of the metadata.
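
The bound can be computed as in the sketch below. The alignment is a parameter here because the actual size and alignment of the internal representation are not stated in this documentation; round_up_to and metadata_upper_bound are hypothetical helper names.

```rust
/// Rounds `size` up to the next multiple of `align`, which must be a
/// power of two (hypothetical helper).
fn round_up_to(size: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two());
    (size + align - 1) & !(align - 1)
}

/// Upper bound, in bytes, on metadata allocations:
/// `total_instances * round_up(max_instance_size)`.
fn metadata_upper_bound(total_instances: u32, max_instance_size: u64, align: u64) -> u64 {
    u64::from(total_instances) * round_up_to(max_instance_size, align)
}
```

With the documented defaults of 1000 instances and 1 MiB per instance, and an assumed alignment of 16 bytes, the bound is 1000 MiB.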

pub fn max_component_instance_size(&mut self, size: usize) -> &mut Self

The maximum size, in bytes, allocated for a component instance’s VMComponentContext metadata.

The wasmtime::component::Instance type has a static size but its internal VMComponentContext is dynamically sized depending on the component being instantiated. This size limit loosely correlates to the size of the component, taking into account factors such as:

  • number of lifted and lowered functions
  • number of memories
  • number of inner instances
  • number of resources

If the allocated size per instance is too small then instantiation of a component will fail at runtime with an error indicating how many bytes were needed.

The default value for this is 1MiB.

This provides an upper-bound on the total size of component metadata-related allocations, along with PoolingAllocationConfig::total_component_instances. The upper bound is

total_component_instances * max_component_instance_size

where max_component_instance_size is rounded up to the size and alignment of the internal representation of the metadata.

pub fn max_core_instances_per_component(&mut self, count: u32) -> &mut Self

The maximum number of core instances a single component may contain (default is 20).

This method (along with PoolingAllocationConfig::max_memories_per_component, PoolingAllocationConfig::max_tables_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.

If a component will instantiate more core instances than count, then the component will fail to instantiate.

pub fn max_memories_per_component(&mut self, count: u32) -> &mut Self

The maximum number of Wasm linear memories that a single component may transitively contain (default is 20).

This method (along with PoolingAllocationConfig::max_core_instances_per_component, PoolingAllocationConfig::max_tables_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.

If a component transitively contains more linear memories than count, then the component will fail to instantiate.

pub fn max_tables_per_component(&mut self, count: u32) -> &mut Self

The maximum number of tables that a single component may transitively contain (default is 20).

This method (along with PoolingAllocationConfig::max_core_instances_per_component, PoolingAllocationConfig::max_memories_per_component, and PoolingAllocationConfig::max_component_instance_size) allows you to cap the amount of resources a single component allocation consumes.

If a component transitively contains more tables than count, then the component will fail to instantiate.

pub fn total_memories(&mut self, count: u32) -> &mut Self

The maximum number of concurrent Wasm linear memories supported (default is 1000).

This value has a direct impact on the amount of memory allocated by the pooling instance allocator.

The pooling instance allocator allocates a memory pool, where each entry in the pool contains the reserved address space for each linear memory supported by an instance.

The memory pool will reserve a large quantity of host process address space to elide the bounds checks required for correct WebAssembly memory semantics. Even with 64-bit address spaces, the address space is limited when dealing with a large number of linear memories.

For example, on Linux x86_64, the userland address space limit is 128 TiB. That might seem like a lot, but each linear memory will reserve 6 GiB of space by default.

pub fn total_tables(&mut self, count: u32) -> &mut Self

The maximum number of concurrent tables supported (default is 1000).

This value has a direct impact on the amount of memory allocated by the pooling instance allocator.

The pooling instance allocator allocates a table pool, where each entry in the pool contains the space needed for each WebAssembly table supported by an instance (see table_elements to control the size of each table).

pub fn total_stacks(&mut self, count: u32) -> &mut Self

The maximum number of execution stacks allowed for asynchronous execution, when enabled (default is 1000).

This value has a direct impact on the amount of memory allocated by the pooling instance allocator.

pub fn total_core_instances(&mut self, count: u32) -> &mut Self

The maximum number of concurrent core instances supported (default is 1000).

This provides an upper-bound on the total size of core instance metadata-related allocations, along with PoolingAllocationConfig::max_core_instance_size. The upper bound is

total_core_instances * max_core_instance_size

where max_core_instance_size is rounded up to the size and alignment of the internal representation of the metadata.

pub fn max_core_instance_size(&mut self, size: usize) -> &mut Self

The maximum size, in bytes, allocated for a core instance’s VMContext metadata.

The Instance type has a static size but its VMContext metadata is dynamically sized depending on the module being instantiated. This size limit loosely correlates to the size of the Wasm module, taking into account factors such as:

  • number of functions
  • number of globals
  • number of memories
  • number of tables
  • number of function types

If the allocated size per instance is too small then instantiation of a module will fail at runtime with an error indicating how many bytes were needed.

The default value for this is 1MiB.

This provides an upper-bound on the total size of core instance metadata-related allocations, along with PoolingAllocationConfig::total_core_instances. The upper bound is

total_core_instances * max_core_instance_size

where max_core_instance_size is rounded up to the size and alignment of the internal representation of the metadata.

pub fn max_tables_per_module(&mut self, tables: u32) -> &mut Self

The maximum number of defined tables for a core module (default is 1).

This value controls the capacity of the VMTableDefinition table in each instance’s VMContext structure.

The allocated size of the table will be tables * sizeof(VMTableDefinition) for each instance regardless of how many tables are defined by an instance’s module.

pub fn table_elements(&mut self, elements: u32) -> &mut Self

The maximum table elements for any table defined in a module (default is 20000).

If a table’s minimum element limit is greater than this value, the module will fail to instantiate.

If a table’s maximum element limit is unbounded or greater than this value, the maximum will be table_elements for the purpose of any table.grow instruction.

This value is used to reserve the maximum space for each supported table; table elements are pointer-sized in the Wasmtime runtime. Therefore, the space reserved for each instance is tables * table_elements * sizeof::<*const ()>.
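
The reservation formula above translates directly into Rust. The function name table_reservation_bytes is made up for this sketch; only the formula itself comes from the text.

```rust
use std::mem::size_of;

/// Space reserved for the tables of one instance:
/// `tables * table_elements * size_of::<*const ()>()`.
fn table_reservation_bytes(tables: u32, table_elements: u32) -> usize {
    tables as usize * table_elements as usize * size_of::<*const ()>()
}
```

With the documented defaults (1 table per module, 20000 elements), table_reservation_bytes(1, 20_000) is 20000 pointer-sized entries per instance.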

pub fn max_memories_per_module(&mut self, memories: u32) -> &mut Self

The maximum number of defined linear memories for a module (default is 1).

This value controls the capacity of the VMMemoryDefinition table in each core instance’s VMContext structure.

The allocated size of the table will be memories * sizeof(VMMemoryDefinition) for each core instance regardless of how many memories are defined by the core instance’s module.

pub fn max_memory_size(&mut self, bytes: usize) -> &mut Self

The maximum byte size that any WebAssembly linear memory may grow to.

This option defaults to 4 GiB, meaning that for 32-bit linear memories there are no restrictions. 64-bit linear memories will not be allowed to grow beyond 4 GiB by default.

If a memory’s minimum size is greater than this value, the module will fail to instantiate.

If a memory’s maximum size is unbounded or greater than this value, the maximum will be max_memory_size for the purpose of any memory.grow instruction.
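
The two rules above (reject a minimum that is too large, clamp the maximum otherwise) can be modeled as below. effective_max_bytes is a hypothetical helper written for this sketch, with all sizes in bytes; it is not a Wasmtime API.

```rust
/// Returns the maximum size a memory may grow to under the limit, or an
/// error if the memory's minimum size already exceeds the limit.
fn effective_max_bytes(
    min: u64,
    declared_max: Option<u64>,
    max_memory_size: u64,
) -> Result<u64, String> {
    if min > max_memory_size {
        return Err(format!(
            "memory minimum of {min} bytes exceeds the limit of {max_memory_size} bytes"
        ));
    }
    // An unbounded or too-large maximum is clamped to the limit.
    Ok(declared_max.map_or(max_memory_size, |max| max.min(max_memory_size)))
}
```

A memory declared without a maximum is therefore treated as if its maximum were max_memory_size for the purpose of memory.grow.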

This value is used to control the maximum accessible space for each linear memory of a core instance. This can be thought of as a simple mechanism like Store::limiter to limit memory at runtime. This value can also affect striping/coloring behavior when used in conjunction with memory_protection_keys.

The virtual memory reservation size of each linear memory is controlled by the Config::static_memory_maximum_size setting and this method’s configuration cannot exceed Config::static_memory_maximum_size.

pub fn total_gc_heaps(&mut self, count: u32) -> &mut Self

The maximum number of concurrent GC heaps supported (default is 1000).

This value has a direct impact on the amount of memory allocated by the pooling instance allocator.

The pooling instance allocator allocates a GC heap pool, where each entry in the pool contains the space needed for each GC heap used by a store.

Trait Implementations§

impl Clone for PoolingAllocationConfig

fn clone(&self) -> PoolingAllocationConfig

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for PoolingAllocationConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for PoolingAllocationConfig

fn default() -> PoolingAllocationConfig

Returns the “default value” for a type.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize = _

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.