Gaussian Splatting

class fvdb.ProjectedGaussianSplats(impl: ProjectedGaussianSplats, _private: Any = None)[source]

A class representing a set of Gaussian splats projected onto a batch of 2D image planes.

A ProjectedGaussianSplats instance contains the 2D projections of 3D Gaussian splats, which can be used to render images onto the image planes. Instances of this class are created by the projection methods of GaussianSplat3d, such as GaussianSplat3d.project_gaussians_for_images() and GaussianSplat3d.project_gaussians_for_images_and_depths().

Note

The reason to have a separate class for projected Gaussian splats is to be able to run projection once, and then render the splats multiple times (e.g. rendering crops) without re-projecting them each time. This can save significant computation time.

property antialias: bool

Return whether antialiasing was enabled during the projection of the Gaussian splats.

Returns:

antialias (bool) – True if antialiasing was enabled during projection, False otherwise.

property depths: Tensor

Return the depth of each projected Gaussian in each image plane. The depth is defined as the distance from the camera to the mean of the Gaussian along the camera’s viewing direction.

Returns:

depths (torch.Tensor) – A tensor of shape (C, N) representing the depth of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.

property eps_2d: float

Return the epsilon value used during the projection of the Gaussian splats to avoid numerical issues. This value is used to clamp very small radii during projection.

Returns:

eps_2d (float) – The epsilon value used during projection.

property far_plane: float

Return the far plane distance used during the projection of the Gaussian splats.

Returns:

far_plane (float) – The far plane distance.

property image_height: int

Return the height of the image planes used during the projection of the Gaussian splats.

Returns:

image_height (int) – The height of the image planes.

property image_width: int

Return the width of the image planes used during the projection of the Gaussian splats.

Returns:

image_width (int) – The width of the image planes.

property inv_covar_2d: Tensor

Return the inverse of the 2D covariance matrices of the Gaussians projected into each image plane. These define the spatial extent of the ellipse for each splatted Gaussian. Because these matrices are symmetric, each one is packed into its three unique entries (Cxx, Cxy, Cyy).

Returns:

inv_covar_2d (torch.Tensor) – A tensor of shape (C, N, 3) representing the packed inverse 2D covariance matrices, where C is the number of image planes, N is the number of projected Gaussians, and the last dimension holds the packed entries (Cxx, Cxy, Cyy).
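As a rough illustration of the packing described above, the sketch below inverts one symmetric 2x2 covariance and keeps only the three unique entries of the inverse. The helper names pack_inverse_covariance and unpack_symmetric are hypothetical and not part of fvdb, and plain Python floats stand in for tensors.

```python
def pack_inverse_covariance(cxx: float, cxy: float, cyy: float) -> tuple[float, float, float]:
    """Invert the symmetric matrix [[cxx, cxy], [cxy, cyy]] and return the
    three unique entries (xx, xy, yy) of the inverse."""
    det = cxx * cyy - cxy * cxy
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # Closed-form inverse of a 2x2 matrix: swap the diagonal, negate the off-diagonal.
    return (cyy / det, -cxy / det, cxx / det)


def unpack_symmetric(packed: tuple[float, float, float]) -> list[list[float]]:
    """Expand packed (xx, xy, yy) entries back into a full 2x2 matrix."""
    xx, xy, yy = packed
    return [[xx, xy], [xy, yy]]


packed = pack_inverse_covariance(2.0, 1.0, 1.0)
full = unpack_symmetric(packed)
```

Storing three values instead of four avoids duplicating the off-diagonal entry for every projected Gaussian.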

property means2d: Tensor

Return the 2D projected means (in pixel units) of the Gaussians in each image plane.

Returns:

means2d (torch.Tensor) – A tensor of shape (C, N, 2) representing the 2D projected means, where C is the number of image planes, N is the number of projected Gaussians, and the last dimension contains the (x, y) coordinates of the means in pixel space.
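To make the pixel-space means and the depths more concrete, here is a minimal sketch of projecting a single 3D mean with one world-to-camera matrix and one 3x3 projection matrix. It assumes a perspective pinhole camera that looks along the camera-space +z axis and a column-vector convention; project_point is a hypothetical helper, not an fvdb function, and nested lists stand in for tensors.

```python
def project_point(world_point, world_to_camera, projection):
    """Return ((x, y) in pixels, depth) for one 3D point.

    world_to_camera is a 4x4 matrix and projection a 3x3 matrix,
    both given as row-major nested lists.
    """
    homogeneous = (world_point[0], world_point[1], world_point[2], 1.0)
    # Transform into camera space (only the first three rows are needed).
    cam = [sum(world_to_camera[r][c] * homogeneous[c] for c in range(4)) for r in range(3)]
    depth = cam[2]  # assumed: depth is the camera-space z coordinate
    if depth <= 0.0:
        raise ValueError("point is behind the camera")
    # Project to homogeneous pixel coordinates, then divide by the last component.
    pix = [sum(projection[r][c] * cam[c] for c in range(3)) for r in range(3)]
    return (pix[0] / pix[2], pix[1] / pix[2]), depth


identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
intrinsics = [[100.0, 0.0, 64.0], [0.0, 100.0, 48.0], [0.0, 0.0, 1.0]]
mean_2d, depth = project_point((0.5, -0.25, 2.0), identity, intrinsics)
```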

property min_radius_2d: float

Return the minimum radius (in pixels) used to clip Gaussians during projection. Gaussians whose projected radius is less than this value are ignored to avoid numerical issues.

Returns:

min_radius_2d (float) – The minimum radius used during projection.

property near_plane: float

Return the near plane distance used during the projection of the Gaussian splats.

Returns:

near_plane (float) – The near plane distance.

property opacities: Tensor

Return the opacities of each projected Gaussian in each image plane.

Returns:

opacities (torch.Tensor) – A tensor of shape (C, N) representing the opacity of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.

property projection_type: ProjectionType

Return the projection type used during the projection of the Gaussian splats.

Returns:

projection_type (ProjectionType) – The projection type (e.g. ProjectionType.PERSPECTIVE or ProjectionType.ORTHOGRAPHIC).

property radii: Tensor

Return the 2D radii (in pixels) of each projected Gaussian in each image plane. The radius of a Gaussian is the maximum extent of the Gaussian along any direction in the image plane.

Returns:

radii (torch.Tensor) – A tensor of shape (C, N) representing the 2D radius of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.

property render_quantities: Tensor

Return the render quantities of each projected Gaussian in each image plane. The render quantities are used for shading and lighting calculations during rendering.

Returns:

render_quantities (torch.Tensor) – A tensor of shape (C, N, D) representing the render quantities of each projected Gaussian, where C is the number of image planes, N is the number of projected Gaussians, and D is the number of feature channels for each Gaussian (see GaussianSplat3d.num_channels).

property sh_degree_to_use: int

Return the spherical harmonic degree used during the projection of the Gaussian splats.

Note

This indicates up to which degree the spherical harmonics coefficients were projected for each Gaussian. For example, if this value is 0, only the diffuse (degree 0) coefficients were projected. If this value is 2, coefficients up to degree 2 were projected.

Returns:

sh_degree_to_use (int) – The spherical harmonic degree used during projection.

property tile_gaussian_ids: Tensor

Return a tensor containing the Gaussian ID for each tile/Gaussian intersection.

Returns:

tile_gaussian_ids (torch.Tensor) – A tensor of shape (M,) containing the ID of the Gaussian in each intersection, where M is the total number of tile/Gaussian intersections.

property tile_offsets: Tensor

Return the starting offset of the set of intersections for each tile into tile_gaussian_ids.

Returns:

tile_offsets (torch.Tensor) – A tensor of shape (C, TH, TW) where C is the number of image planes, TH is the number of tiles in the height dimension, and TW is the number of tiles in the width dimension.
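The sketch below shows one way such offsets could be used to look up the Gaussians that intersect a tile. It assumes, without confirmation from the text above, that the intersections of a tile are stored contiguously in tile_gaussian_ids and that a tile's range ends where the next tile's range begins. The offsets are flattened to a 1D list for brevity, and gaussians_in_tile is a hypothetical helper.

```python
def gaussians_in_tile(tile_offsets_flat, tile_gaussian_ids, tile_index):
    """Return the Gaussian IDs intersecting one tile (assumed layout, see above)."""
    start = tile_offsets_flat[tile_index]
    if tile_index + 1 < len(tile_offsets_flat):
        end = tile_offsets_flat[tile_index + 1]
    else:
        # The last tile runs to the end of the intersection list.
        end = len(tile_gaussian_ids)
    return tile_gaussian_ids[start:end]


# Four tiles and six intersections; tile 1 is empty.
tile_offsets_flat = [0, 2, 2, 5]
tile_gaussian_ids = [7, 3, 1, 7, 9, 4]
per_tile = [gaussians_in_tile(tile_offsets_flat, tile_gaussian_ids, t) for t in range(4)]
```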

class fvdb.GaussianSplat3d(impl: GaussianSplat3d, _private: Any = None)[source]

An efficient data structure representing a Gaussian splat radiance field in 3D space.

A GaussianSplat3d instance contains a set of 3D Gaussian splats, each defined by its mean position, orientation (quaternion), scale, opacity, and spherical harmonics coefficients for color representation.

Together, these define a radiance field which can be volume rendered to produce images and depths from arbitrary viewpoints. This class provides a variety of methods for rendering and manipulating Gaussian splat radiance fields.

Background

Mathematically, the radiance field represented by a GaussianSplat3d is defined as a sum of anisotropic 3D Gaussians, with view-dependent features represented using spherical harmonics. The radiance field \(R(x, v)\) accepts as input a 3D position \(x \in \mathbb{R}^3\) and a viewing direction \(v \in \mathbb{S}^2\), and is defined as:

\[ \begin{aligned} R(x, v) &= \sum_{i=1}^{N} o_i \cdot \alpha_i(x) \cdot SH(v; C_i) \\ \alpha_i(x) &= \exp\left(-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right) \\ \Sigma_i &= R(q_i)^T \cdot \text{diag}(S_i) \cdot R(q_i) \end{aligned} \]

where:

  • \(N\) is the number of Gaussians (see num_gaussians).

  • \(\mu_i \in \mathbb{R}^3\) is the mean of the i-th Gaussian (see means).

  • \(\Sigma_i \in \mathbb{R}^{3 \times 3}\) is the covariance matrix of the i-th Gaussian, defined by its diagonal scale \(S_i \in \mathbb{R}^3\) (see scales) and orientation quaternion \(q_i \in \mathbb{R}^4\) (see quats).

  • \(o_i \in [0, 1]\) is the opacity of the i-th Gaussian (see opacities).

  • \(SH(v; C_i)\) is the spherical harmonics function evaluated at direction \(v\) with coefficients \(C_i\).

  • \(R(q_i)\) is the rotation matrix corresponding to the quaternion \(q_i\).
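The definitions above can be sketched in plain Python for a single Gaussian. The code follows the covariance formula exactly as written, \(\Sigma_i = R(q_i)^T \text{diag}(S_i) R(q_i)\), and uses the fact that \(R(q_i)\) is orthonormal, so \(\Sigma_i^{-1} = R(q_i)^T \text{diag}(1 / S_i) R(q_i)\). The (w, x, y, z) quaternion ordering and the helper names are assumptions for illustration, not fvdb API.

```python
import math


def quat_to_rotation(q):
    """Rotation matrix of a unit quaternion given as (w, x, y, z) (assumed ordering)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def gaussian_alpha(x, mean, quat, scales):
    """Evaluate alpha(x) = exp(-0.5 (x - mu)^T Sigma^{-1} (x - mu))."""
    rot = quat_to_rotation(quat)
    diff = [x[k] - mean[k] for k in range(3)]
    # Rotate the offset; the quadratic form then reduces to a sum over the axes.
    local = [sum(rot[r][c] * diff[c] for c in range(3)) for r in range(3)]
    quadratic_form = sum(local[k] ** 2 / scales[k] for k in range(3))
    return math.exp(-0.5 * quadratic_form)


identity_quat = (1.0, 0.0, 0.0, 0.0)
alpha_at_mean = gaussian_alpha((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), identity_quat, (1.0, 4.0, 1.0))
alpha_offset = gaussian_alpha((0.0, 2.0, 0.0), (0.0, 0.0, 0.0), identity_quat, (1.0, 4.0, 1.0))
```

Avoiding an explicit 3x3 inverse keeps the evaluation cheap and numerically simple.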

To render images from a GaussianSplat3d, you volume render the radiance field using

\[I(u, v) = \int_{t \in r(u, v)} T(t) R(r(t), d) dt\]

where \(r(u, v)\) is the camera ray through pixel \((u, v)\), \(d\) is the viewing direction of the ray, and \(T(t) = \exp\left(-\int_{0}^{t} R(r(s), d) ds\right)\) is the accumulated transmittance along the ray up to distance \(t\).

To render depths, you compute

\[D(u, v) = \int_{t \in r(u, v)} t \cdot T(t) \sum_{i=1}^{N} o_i \cdot \alpha_i(r(t)) dt\]

PLY_VERSION_STRING = 'fvdb_ply 1.0.0'

Version string written to PLY files saved using the save_to_ply() method. This string will be written in the comment section of the PLY file to identify the version of the fvdb library used to save the file. The comment will have the form comment fvdb_gs_ply <PLY_VERSION_STRING>.
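If you need to check which version wrote a file, one option is to scan the comment lines of the PLY header. The sketch below works on header text only and relies on the comment form described above; the example header is hand-written for illustration, so files produced by save_to_ply() may contain other comments and properties. read_ply_comments is a hypothetical helper.

```python
def read_ply_comments(header_text: str) -> list[str]:
    """Collect the text of every 'comment' line that appears before 'end_header'."""
    comments = []
    for line in header_text.splitlines():
        line = line.strip()
        if line == "end_header":
            break
        if line.startswith("comment "):
            comments.append(line[len("comment "):])
    return comments


# Hypothetical header, constructed from the documented comment form.
example_header = """ply
format binary_little_endian 1.0
comment fvdb_gs_ply fvdb_ply 1.0.0
element vertex 2
property float x
end_header
"""

prefix = "fvdb_gs_ply "
comments = read_ply_comments(example_header)
fvdb_versions = [c[len(prefix):] for c in comments if c.startswith(prefix)]
```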

__getitem__(index: slice) GaussianSplat3d[source]
__getitem__(index: Tensor) GaussianSplat3d

Select Gaussians using either an integer index tensor, a boolean mask tensor, or a slice.

Note

If accumulate_mean_2d_gradients or accumulate_max_2d_radii is enabled on this GaussianSplat3d instance, the returned GaussianSplat3d will also contain the corresponding accumulated values.

Example usage:

import torch

# Assume gsplat3d is an existing GaussianSplat3d instance

# Using a slice
gs_subset = gsplat3d[10:20]  # Selects Gaussians from index 10 to 19

# Using an integer index tensor
indices = torch.tensor([0, 2, 4, 6])
gs_subset = gsplat3d[indices]  # Selects Gaussians at indices 0, 2, 4, and 6

# Using a boolean mask tensor (its length must equal num_gaussians)
mask = gsplat3d.opacities > 0.5
gs_subset = gsplat3d[mask]  # Selects Gaussians where mask is True

Parameters:

index (slice | torch.Tensor) – A slice object or a 1D tensor containing either integer indices or a boolean mask.

Returns:

gaussian_splat_3d (GaussianSplat3d) – A new instance of GaussianSplat3d containing only the selected Gaussians.

__setitem__(index: slice, value: GaussianSplat3d) None[source]
__setitem__(index: Tensor, value: GaussianSplat3d) None

Set the values of Gaussians in this GaussianSplat3d instance using either an integer index tensor, a boolean mask tensor, or a slice.

Note

If index is an integer tensor that contains duplicate indices, the Gaussians from value that map to the same index are written in an unspecified order, so which of them ends up stored at that index is not deterministic.

Note

If accumulate_mean_2d_gradients or accumulate_max_2d_radii is enabled on this GaussianSplat3d instance, the corresponding accumulated values will also be updated for the selected Gaussians, based on the values from the value instance. If value does not have these accumulations enabled, the accumulated values for the selected Gaussians will be reset to zero.

Example:

# Using a slice
gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with 10 Gaussians
gsplat3d[10:20] = gs_subset  # Sets Gaussians from index 10 to 19

# Using an integer index tensor
indices = torch.tensor([0, 2, 4, 6])
gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with 4 Gaussians
gsplat3d[indices] = gs_subset  # Sets Gaussians at indices 0, 2, 4, and 6

# Using a boolean mask tensor (its length must equal num_gaussians)
mask = gsplat3d.opacities > 0.5
gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with one Gaussian per True entry in mask
gsplat3d[mask] = gs_subset  # Sets Gaussians where mask is True

Parameters:
  • index (torch.Tensor | slice) – A slice object or a 1D tensor containing either integer indices or a boolean mask.

  • value (GaussianSplat3d) – The GaussianSplat3d instance containing the new values to set. Must have the same number of Gaussians as the selected indices or mask.

property accumulate_max_2d_radii: bool

Returns whether to track the maximum 2D projected radius of each Gaussian across calls to render_* functions. This is used by certain optimization techniques to ensure that the Gaussians do not become too large or too small during the optimization process.

See also

See accumulated_max_2d_radii for the actual maximum radii values.

Returns:

accumulate_max_radii (bool) – True if the maximum 2D radii are being tracked across rendering calls, False otherwise.

property accumulate_mean_2d_gradients: bool

Returns whether to track the average norm of the gradient of projected means for each Gaussian during the backward pass of projection. This property is used by certain optimization techniques to split/prune/duplicate Gaussians. The accumulated 2d gradient norms are defined as follows:

\[\sum_{t=1}^{T} \| \partial_{L_t} \mu_i^{2D} \|_2\]

where \(\mu_i^{2D}\) is the projection of the mean of Gaussian \(g_i\) onto the image plane, and \(L_t\) is the loss at iteration \(t\).

See also

See accumulated_mean_2d_gradient_norms for the actual average norms of the gradients.

Returns:

accumulate_mean_2d_grads (bool) – True if the average norm of the gradient of projected means is being tracked, False otherwise.

property accumulated_gradient_step_counts: Tensor

Returns the accumulated gradient step counts for each Gaussian.

If this GaussianSplat3d instance is set to track accumulated gradients (i.e., accumulate_mean_2d_gradients is True), then this tensor contains the number of gradient steps that have been applied to each Gaussian during optimization.

If accumulate_mean_2d_gradients is False, this property will be an empty tensor.

Note

To reset the counts, call the reset_accumulated_gradient_state() method.

Returns:

step_counts (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the accumulated gradient step count for a Gaussian.

property accumulated_max_2d_radii: Tensor

Returns the maximum 2D projected radius (in pixels) for each Gaussian across all calls to render_* functions. This is used by certain optimization techniques to ensure that the Gaussians do not become too large or too small during the optimization process.

If this GaussianSplat3d instance is set to track maximum 2D radii (i.e., accumulate_max_2d_radii is True), then this tensor contains the maximum 2D radius for each Gaussian.

If accumulate_max_2d_radii is False, this property will be an empty tensor.

Note

To reset the maximum radii to zero, you can call the reset_accumulated_gradient_state() method.

Returns:

max_radii (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the maximum 2D radius for a Gaussian across all optimization iterations.

property accumulated_mean_2d_gradient_norms: Tensor

Returns the average norm of the gradient of projected (2D) means for each Gaussian across every backward pass. This is used by certain optimization techniques to split/prune/duplicate Gaussians. The accumulated 2d gradient norms are defined as follows:

\[\sum_{t=1}^{T} \| \partial_{L_t} \mu_i^{2D} \|_2\]

where \(\mu_i^{2D}\) is the projection of the mean of Gaussian \(g_i\) onto the image plane, and \(L_t\) is the loss at iteration \(t\).

Note

To reset the accumulated norms, call the reset_accumulated_gradient_state() method.

Returns:

accumulated_grad_2d_norms (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the average norm of the gradient of projected means for a Gaussian across all optimization iterations. The norm is computed in 2D space, i.e., the projected means.
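The relationship between the summed norms in the formula and the per-Gaussian step counts can be sketched as follows. This is one plausible reading rather than a description of fvdb's internals: it assumes the norms are summed on every backward pass in which a Gaussian receives a gradient, and that an average is obtained by dividing by the number of such passes. All names are hypothetical.

```python
import math


def accumulate_step(norm_sums, step_counts, grads_2d):
    """Record one backward pass.

    grads_2d[i] is the (dx, dy) gradient of the loss with respect to the
    projected mean of Gaussian i, or None if it received no gradient.
    """
    for i, grad in enumerate(grads_2d):
        if grad is None:
            continue
        norm_sums[i] += math.hypot(grad[0], grad[1])
        step_counts[i] += 1


def average_norms(norm_sums, step_counts):
    """Average accumulated norm per Gaussian (0.0 where nothing was recorded)."""
    return [s / c if c > 0 else 0.0 for s, c in zip(norm_sums, step_counts)]


norm_sums = [0.0, 0.0]
step_counts = [0, 0]
accumulate_step(norm_sums, step_counts, [(3.0, 4.0), None])
accumulate_step(norm_sums, step_counts, [(0.0, 1.0), (6.0, 8.0)])
averages = average_norms(norm_sums, step_counts)
```

Densification heuristics typically compare such an average against a threshold to decide which Gaussians to split or duplicate.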

static cat(splats: Sequence[GaussianSplat3d], accumulate_mean_2d_gradients: bool = False, accumulate_max_2d_radii: bool = False, detach: bool = False) GaussianSplat3d[source]

Concatenates a sequence of GaussianSplat3d instances into a single GaussianSplat3d instance.

The returned GaussianSplat3d will contain all the Gaussians from the input instances, in the order they were provided.

Note

All input GaussianSplat3d instances must have the same number of channels and spherical harmonic degree.

Note

If accumulate_mean_2d_gradients is True, the concatenated instance will track the average norm of projected mean gradients for each Gaussian during the backward pass of projection. This value is copied over from each input instance if they were tracking it, and initialized to zero otherwise.

Note

If accumulate_max_2d_radii is True, the concatenated instance will track the maximum 2D radii for each Gaussian during the backward pass of projection. This value is copied over from each input instance if they were tracking it, and initialized to zero otherwise.

Parameters:
  • splats (Sequence[GaussianSplat3d]) – A sequence of GaussianSplat3d instances to concatenate.

  • accumulate_mean_2d_gradients (bool) – If True, copies over the accumulated mean 2D gradients for each GaussianSplat3d into the new one, or initializes it to zero if the input instance was not tracking it. Defaults to False.

  • accumulate_max_2d_radii (bool) – If True, copies the accumulated maximum 2D radii for each GaussianSplat3d into the concatenated one, or initializes it to zero if the input instance was not tracking it. Defaults to False.

  • detach (bool) – If True, detaches the concatenated GaussianSplat3d from the computation graph. Defaults to False.

Returns:

GaussianSplat3d – A new instance of GaussianSplat3d containing the concatenated Gaussians.

detach() GaussianSplat3d[source]

Return a new GaussianSplat3d instance whose tensors are detached from the computation graph. This is useful when you want to stop tracking gradients for this instance.

Returns:

gaussian_splat (GaussianSplat3d) – A new GaussianSplat3d instance whose tensors are detached.

detach_() None[source]

Detaches this GaussianSplat3d instance from the computation graph in place. This modifies the current instance to stop tracking gradients.

Note

This method modifies the current instance and does not return a new instance.

property device: device

Returns the device on which the tensors managed by this GaussianSplat3d instance are stored.

Returns:

device (torch.device) – The device of this GaussianSplat3d instance.

property dtype: dtype

Returns the data type of the tensors managed by this GaussianSplat3d instance (e.g., torch.float32, torch.float64).

Returns:

torch.dtype – The data type of the tensors managed by this GaussianSplat3d instance.

classmethod from_ply(filename: Path | str, device: str | device = 'cuda') tuple[GaussianSplat3d, dict[str, str | int | float | Tensor]][source]

Create a GaussianSplat3d instance from a PLY file.

Parameters:
  • filename (pathlib.Path | str) – The path of the PLY file to load the data from.

  • device (str | torch.device) – The device to load the data onto. Default is “cuda”.

Returns:
  • splats (GaussianSplat3d) – An instance of GaussianSplat3d initialized with the data from the PLY file.

  • metadata (dict[str, str | int | float | torch.Tensor]) – A dictionary of metadata where the keys are strings and the values are either strings, ints, floats, or tensors. Can be empty if no metadata is saved in the PLY file.

classmethod from_state_dict(state_dict: dict[str, Tensor]) GaussianSplat3d[source]

Creates a GaussianSplat3d instance from a state dictionary generated by state_dict(). This method is typically used to load a saved state of the GaussianSplat3d instance.

A state dictionary must contain the following keys, which are all the parameters required to initialize a GaussianSplat3d. Here N denotes the number of Gaussians (see num_gaussians).

  • 'means': Tensor of shape (N, 3) representing the means of the Gaussians.

  • 'quats': Tensor of shape (N, 4) representing the quaternions of the Gaussians.

  • 'log_scales': Tensor of shape (N, 3) representing the log scales of the Gaussians.

  • 'logit_opacities': Tensor of shape (N,) representing the logit opacities of the Gaussians.

  • 'sh0': Tensor of shape (N, 1, D) representing the diffuse SH coefficients where D is the number of channels (see num_channels).

  • 'shN': Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).

  • 'accumulate_max_2d_radii': bool Tensor with a single element indicating whether to track the maximum 2D projected radius of each Gaussian.

  • 'accumulate_mean_2d_gradients': bool Tensor with a single element indicating whether to track the average norm of the gradient of projected means for each Gaussian.

It can also optionally contain the following keys:

  • 'accumulated_gradient_step_counts': Tensor of shape (N,) representing the accumulated gradient step counts for each Gaussian.

  • 'accumulated_max_2d_radii': Tensor of shape (N,) representing the maximum 2D projected radius for each Gaussian across every iteration of optimization.

  • 'accumulated_mean_2d_gradient_norms': Tensor of shape (N,) representing the average norm of the gradient of projected means for each Gaussian across every iteration of optimization.

Parameters:

state_dict (dict[str, torch.Tensor]) – A dictionary containing the state of the GaussianSplat3d instance, usually generated via the state_dict() method.

Returns:

gaussian_splat (GaussianSplat3d) – An instance of GaussianSplat3d initialized with the provided state dictionary.
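Before loading a checkpoint, it can help to check a dictionary against the key lists above. The following sketch only compares key names; it does not validate shapes or dtypes, and whether from_state_dict() itself rejects unexpected keys is not stated here. check_state_dict_keys is a hypothetical helper.

```python
REQUIRED_KEYS = frozenset({
    "means", "quats", "log_scales", "logit_opacities", "sh0", "shN",
    "accumulate_max_2d_radii", "accumulate_mean_2d_gradients",
})
OPTIONAL_KEYS = frozenset({
    "accumulated_gradient_step_counts",
    "accumulated_max_2d_radii",
    "accumulated_mean_2d_gradient_norms",
})


def check_state_dict_keys(state_dict):
    """Return (missing required keys, keys that appear in neither list)."""
    keys = set(state_dict)
    missing = sorted(REQUIRED_KEYS - keys)
    unexpected = sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS)
    return missing, unexpected


# Placeholder values; only the key names matter for this check.
missing, unexpected = check_state_dict_keys({"means": None, "foo": None})
```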

classmethod from_tensors(means: Tensor, quats: Tensor, log_scales: Tensor, logit_opacities: Tensor, sh0: Tensor, shN: Tensor, accumulate_mean_2d_gradients: bool = False, accumulate_max_2d_radii: bool = False, detach: bool = False) GaussianSplat3d[source]

Create a new GaussianSplat3d from the provided tensors. This constructs a new Gaussian splat radiance field with the specified means, orientations, scales, opacities, and spherical harmonics coefficients.

Note

The GaussianSplat3d stores the log of the scales (log_scales) rather than the scales directly. This ensures numerical stability, especially when optimizing the scales, since the covariance of each Gaussian is defined as \(\Sigma = R(q)^T S R(q)\), where \(R(q)\) is the rotation matrix defined by the unit quaternion of the Gaussian, and \(S = diag(\exp(log\_scales))\).

Note

The GaussianSplat3d stores the logit of opacities (logit_opacities) rather than the opacities directly. The actual opacities are obtained by applying the sigmoid function to the logit opacities. This ensures opacities are always in the range [0, 1] and improves numerical stability during optimization.

Parameters:
  • means (torch.Tensor) – Tensor of shape (N, 3) representing the means of the gaussians, where N is the number of gaussians.

  • quats (torch.Tensor) – Tensor of shape (N, 4) representing the quaternions (orientations) of the gaussians, where N is the number of gaussians.

  • log_scales (torch.Tensor) – Tensor of shape (N, 3) representing the log scales of the gaussians, where N is the number of gaussians.

  • logit_opacities (torch.Tensor) – Tensor of shape (N,) representing the logit opacities of the gaussians, where N is the number of gaussians.

  • sh0 (torch.Tensor) – Tensor of shape (N, 1, D) representing the diffuse SH coefficients where D is the number of channels (see num_channels).

  • shN (torch.Tensor) – Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).

  • accumulate_mean_2d_gradients (bool, optional) – If True, tracks the average norm of the gradient of projected means for each Gaussian during the backward pass of projection. This is useful for some optimization techniques, such as the one in the original paper. Defaults to False.

  • accumulate_max_2d_radii (bool, optional) – If True, tracks the maximum 2D radii for each Gaussian during the backward pass of projection. This is useful for some optimization techniques, such as the one in the original paper. Defaults to False.

  • detach (bool, optional) – If True, creates copies of the input tensors and detaches them from the computation graph. Defaults to False.

property log_scales: Tensor

Returns the log of the scales for each Gaussian. Gaussians are represented in 3D space as ellipsoids defined by their means, orientations (quaternions), and scales, i.e.,

\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]

where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).

Note

The GaussianSplat3d stores the log of the scales (log_scales) rather than the scales directly. This ensures numerical stability, especially when optimizing the scales. To read the scales directly, see the scales property (which is read-only).

Returns:

log_scales (torch.Tensor) – A tensor of shape (N, 3) where N is the number of Gaussians (see num_gaussians). Each row represents the log of the scale of a Gaussian in 3D space.

property logit_opacities: Tensor

Return the logit (inverse of sigmoid) of the opacities of each Gaussian in the scene.

Note

The GaussianSplat3d stores the logit of opacities (logit_opacities) rather than the opacities directly. The actual opacities are obtained by applying the sigmoid function to the logit opacities. To read the opacities directly, see the opacities property (which is read-only).

Returns:

logit_opacities (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the logit of the opacity of a Gaussian in 3D space.
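The logit and log parameterizations can be illustrated with scalar values. This sketch uses only the standard math module in place of tensor operations; the function names are illustrative. Note that the logit is undefined for opacities of exactly 0 or 1.

```python
import math


def sigmoid(value: float) -> float:
    """Map an unconstrained logit to an opacity in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def logit(probability: float) -> float:
    """Inverse of sigmoid; requires 0 < probability < 1."""
    return math.log(probability / (1.0 - probability))


# Opacity 0.8 is stored as its logit and recovered with sigmoid.
logit_opacity = logit(0.8)
recovered_opacity = sigmoid(logit_opacity)

# Scale 0.05 is stored as its log and recovered with exp.
log_scale = math.log(0.05)
recovered_scale = math.exp(log_scale)
```

Because sigmoid and exp map any real number to a valid opacity or a positive scale, an optimizer can update the stored values freely without extra constraints.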

property means: Tensor

Return the means (3D positions) of the Gaussians in this GaussianSplat3d. The means represent the center of each Gaussian in 3D space, i.e., each Gaussian \(g_i\) is defined as:

\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]

where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).

Returns:

torch.Tensor – A tensor of shape (N, 3) where N is the number of Gaussians (see num_gaussians). Each row represents the mean of a Gaussian in 3D space.

property num_channels: int

Returns the number of channels in the Gaussian splatting representation. For example, if you are rendering RGB images, this property returns 3.

Returns:

num_channels (int) – The number of channels.

property num_gaussians: int

Returns the number of Gaussians in the Gaussian splatting representation. This is the total number of individual gaussian splats that are being used to represent the scene.

Returns:

num_gaussians (int) – The number of Gaussians.

property num_sh_bases: int

Returns the number of spherical harmonics (SH) bases used in the Gaussian splatting representation.

Note

The number of SH bases is related to the SH degree (see sh_degree) by the formula \(K = (sh\_degree + 1)^2\), where \(K\) is the number of spherical harmonics bases.

Returns:

num_sh_bases (int) – The number of spherical harmonics bases.
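A small sketch of the relationship between SH degree, number of bases, and the coefficient tensor shapes documented for sh0 and shN. The helper names are hypothetical; only the formula \(K = (sh\_degree + 1)^2\) and the shapes (N, 1, D) and (N, K-1, D) come from this reference.

```python
def sh_bases_for_degree(sh_degree: int) -> int:
    """Number of spherical harmonics bases K for a given degree."""
    if sh_degree < 0:
        raise ValueError("sh_degree must be non-negative")
    return (sh_degree + 1) ** 2


def sh_coefficient_shapes(num_gaussians: int, num_channels: int, sh_degree: int) -> dict:
    """Shapes of the diffuse (sh0) and directionally varying (shN) coefficients."""
    k = sh_bases_for_degree(sh_degree)
    return {
        "sh0": (num_gaussians, 1, num_channels),
        "shN": (num_gaussians, k - 1, num_channels),
    }


bases_per_degree = [sh_bases_for_degree(d) for d in range(4)]
shapes = sh_coefficient_shapes(num_gaussians=1000, num_channels=3, sh_degree=3)
```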

property opacities: Tensor

Returns the opacities of the Gaussians in the Gaussian splatting representation. The opacities encode the visibility of each Gaussian in the scene.

Note

This property is read-only. To ensure numerical stability, GaussianSplat3d stores the logit (inverse of sigmoid) of the opacities, which you can modify instead. See logit_opacities.

Returns:

opacities (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the opacity of a Gaussian.

project_gaussians_for_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]

Projects this GaussianSplat3d onto one or more image planes for rendering depth images in those planes. You can render depth images from the projected Gaussians by calling render_from_projected_gaussians().

Note

The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.

Note

All images being rendered must have the same width and height.

See also

fvdb.ProjectedGaussianSplats for the projected Gaussians representation.

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Project the Gaussians for rendering depth images onto C image planes
projected_gaussians = gaussian_splat_3d.project_gaussians_for_depths(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the C images
    image_height, # height of the C images
    near, # near clipping plane
    far) # far clipping plane

# Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
# in each image plane.
# Returns a tensor of shape [C, 100, 100, 1] containing the depth images,
# and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
# of each pixel.
cropped_depth_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
    projected_gaussians,
    crop_width=100,
    crop_height=100,
    crop_origin_w=10,
    crop_origin_h=10)

# To get the depth images, divide the last channel by the alpha values
true_depths_1 = cropped_depth_images_1[..., -1:] / cropped_alphas

Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note that all images must have the same width.

  • image_height (int) – The height of the images to be rendered. Note that all images must have the same height.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is ProjectionType.PERSPECTIVE.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:

projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering depth images or further processing.
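The division by alpha used in the example above can fail for pixels that no Gaussian covers, where alpha is zero. The sketch below shows the same normalization on plain Python lists with a small guard value; the eps guard and the function name normalize_depths are assumptions added for illustration, not documented fvdb behavior.

```python
def normalize_depths(accumulated_depths, alphas, eps=1e-10):
    """Divide each alpha-weighted depth by the alpha of its pixel.

    eps is an assumed guard so that uncovered pixels (alpha == 0)
    produce 0.0 instead of raising a division error.
    """
    return [depth / max(alpha, eps) for depth, alpha in zip(accumulated_depths, alphas)]


# Three pixels: partially covered, uncovered, and mostly covered.
true_depths = normalize_depths([1.5, 0.0, 0.4], [0.75, 0.0, 0.8])
```

With tensors, the same idea is usually expressed by clamping the alpha tensor to a small minimum before dividing.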

project_gaussians_for_images(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]

Projects this GaussianSplat3d onto one or more image planes for rendering multi-channel (see num_channels) images in those planes. You can render images from the projected Gaussians by calling render_from_projected_gaussians().

Note

The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.

Note

All images being rendered must have the same width and height.

See also

fvdb.ProjectedGaussianSplats for the projected Gaussians representation.

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Project the Gaussians for rendering images onto C image planes
projected_gaussians = gaussian_splat_3d.project_gaussians_for_images(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the C images
    image_height, # height of the C images
    near, # near clipping plane
    far) # far clipping plane

# Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
# in each image plane.
# Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels),
# and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
# of each pixel.
cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
    projected_gaussians,
    crop_width=100,
    crop_height=100,
    crop_origin_w=10,
    crop_origin_h=10)
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note that all images must have the same width.

  • image_height (int) – The height of the images to be rendered. Note that all images must have the same height.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is ProjectionType.PERSPECTIVE.

  • sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:

projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering images or further processing.
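
The (C, 3, 3) projection matrices described above map camera-space points to homogeneous pixel coordinates. As a minimal sketch, assuming a standard pinhole model (the helper names make_pinhole_projection and project_point, and the focal lengths and principal point used here, are hypothetical and not part of fvdb), one such matrix and its effect on a single point might look like this:

```python
def make_pinhole_projection(fx, fy, cx, cy):
    """Return a 3x3 pinhole intrinsics matrix as nested lists (assumed model)."""
    return [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def project_point(projection, point_camera):
    """Project a camera-space point (x, y, z) to pixel coordinates (u, v)."""
    # The matrix-vector product gives homogeneous pixel coordinates (u*z, v*z, z).
    hom = [sum(row[k] * point_camera[k] for k in range(3)) for row in projection]
    # Dividing by the last component yields the pixel location.
    return hom[0] / hom[2], hom[1] / hom[2]


# Illustrative intrinsics for a 640x480 image (made-up values).
K = make_pinhole_projection(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
u, v = project_point(K, (0.1, -0.2, 2.0))
```

A point on the optical axis, such as (0, 0, 1), lands on the principal point (cx, cy). Stacking C such matrices gives a tensor of the documented (C, 3, 3) shape.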

project_gaussians_for_images_and_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]

Projects this GaussianSplat3d onto one or more image planes for rendering multi-channel (see num_channels) images with depths in the last channel. You can render images+depths from the projected Gaussians by calling render_from_projected_gaussians().

Note

The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.

Note

All images being rendered must have the same width and height.

See also

fvdb.ProjectedGaussianSplats for the projected Gaussians representation.

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Project the Gaussians for rendering images onto C image planes
projected_gaussians = gaussian_splat_3d.project_gaussians_for_images_and_depths(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the C images
    image_height, # height of the C images
    near, # near clipping plane
    far) # far clipping plane

# Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
# in each image plane.
# Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels + 1 for depth),
# and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
# of each pixel.
cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
    projected_gaussians,
    crop_width=100,
    crop_height=100,
    crop_origin_w=10,
    crop_origin_h=10)

cropped_images = cropped_images_1[..., :-1]  # Extract image channels

# Divide by alpha to get the final true depth values
cropped_depths = cropped_images_1[..., -1:] / cropped_alphas  # Extract depth channel
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note that all images must have the same width.

  • image_height (int) – The height of the images to be rendered. Note that all images must have the same height.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:

projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering images or further processing.

property quats: Tensor

Returns the unit quaternions representing the orientation of the covariance of the Gaussians in this GaussianSplat3d, i.e., each Gaussian \(g_i\) is defined as:

\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]

where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).

Returns:

quats (torch.Tensor) – A tensor of shape (N, 4) where N is the number of Gaussians (see num_gaussians). Each row represents the unit quaternion of a Gaussian in 3D space.
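
The covariance formula above can be evaluated by hand for a single Gaussian. The following is a sketch only: it follows the expression \(\Sigma_i = R(q_i)^T S_i R(q_i)\) exactly as written, and it assumes the quaternion components are ordered (w, x, y, z), which the text does not state. The helper names are hypothetical.

```python
import math


def quat_to_rotation(q):
    """Rotation matrix for a unit quaternion, ASSUMED to be ordered (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(q, log_scales):
    """Evaluate Sigma = R^T S R with S = diag(exp(log_scales))."""
    r = quat_to_rotation(q)
    s = [math.exp(v) for v in log_scales]
    # (R^T S R)[i][j] = sum_k R[k][i] * s[k] * R[k][j]
    return [
        [sum(r[k][i] * s[k] * r[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# With the identity quaternion, Sigma reduces to the diagonal matrix S.
sigma = covariance((1.0, 0.0, 0.0, 0.0), (0.0, math.log(2.0), math.log(3.0)))
```

For any unit quaternion the result is symmetric, and its trace equals the sum of the diagonal entries of \(S_i\), since rotating a matrix leaves its trace unchanged.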

render_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.3, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]

Render C depth maps from this GaussianSplat3d from C camera views.

Note

All depth maps being rendered must have the same width and height.

Example:

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Render depth maps from C camera views
# depth_images is a tensor of shape [C, H, W, 1]
# alpha_images is a tensor of shape [C, H, W, 1]
depth_images, alpha_images = gaussian_splat_3d.render_depths(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the depth maps
    image_height, # height of the depth maps
    near, # near clipping plane
    far) # far clipping plane

true_depths = depth_images / alpha_images  # Get true depth values by dividing by alpha
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the depth maps to be rendered. Note these are the same for all depth maps being rendered.

  • image_height (int) – The height of the depth maps to be rendered. Note these are the same for all depth maps being rendered.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • depth_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the depth maps, and W is the width of the depth maps. Each element represents the depth value at that pixel in the depth map.

  • alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.
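
The example above divides the rendered depth by alpha to recover true depth values. Where a pixel is fully transparent (alpha of 0), that division is undefined. The sketch below shows the per-pixel arithmetic with a guard for that case; the helper normalize_depth, the eps threshold, and the choice to return 0.0 for uncovered pixels are assumptions for illustration, not fvdb behavior.

```python
def normalize_depth(accumulated_depth, alpha, eps=1e-8):
    """Return accumulated_depth / alpha, or 0.0 where alpha is (near) zero."""
    if alpha <= eps:
        # No Gaussian covers this pixel, so there is no meaningful depth.
        return 0.0
    return accumulated_depth / alpha


# One row of three pixels (made-up values): half covered, uncovered, fully covered.
depth_row = [2.0, 0.0, 3.0]
alpha_row = [0.5, 0.0, 1.0]
true_depths = [normalize_depth(d, a) for d, a in zip(depth_row, alpha_row)]
```

In the first pixel, Gaussians at depth 4.0 that cover the pixel with alpha 0.5 accumulate a depth of 2.0, and dividing by alpha recovers 4.0. On tensors, the same guard could be expressed with torch.where.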

render_from_projected_gaussians(projected_gaussians: ProjectedGaussianSplats, crop_width: int = -1, crop_height: int = -1, crop_origin_w: int = -1, crop_origin_h: int = -1, tile_size: int = 16) tuple[Tensor, Tensor][source]

Render a set of images from Gaussian splats that have already been projected onto image planes (See for example project_gaussians_for_images()). This method is useful when you want to render images from pre-computed projected Gaussians, for example, when rendering crops of images without having to re-project the Gaussians.

Note

If you want to render the full image, pass negative values for crop_width, crop_height, crop_origin_w, and crop_origin_h (default behavior). To render full images, all these values must be negative or this method will raise an error.

Note

If your crop goes beyond the image boundaries, the resulting image will be clipped to be within the image boundaries.

Example:

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Project the Gaussians for rendering images onto C image planes
projected_gaussians = gaussian_splat_3d.project_gaussians_for_images_and_depths(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the C images
    image_height, # height of the C images
    near, # near clipping plane
    far) # far clipping plane

# Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
# in each image plane.
# Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels + 1 for depth),
# and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
# of each pixel.
cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
    projected_gaussians,
    crop_width=100,
    crop_height=100,
    crop_origin_w=10,
    crop_origin_h=10)

cropped_images = cropped_images_1[..., :-1]  # Extract image channels

# Divide by alpha to get the final true depth values
cropped_depths = cropped_images_1[..., -1:] / cropped_alphas  # Extract depth channel
Parameters:
  • projected_gaussians (ProjectedGaussianSplats) – An instance of fvdb.ProjectedGaussianSplats containing the projected Gaussians after spherical harmonic evaluation. This object should have been created by calling project_gaussians_for_images(), project_gaussians_for_depths(), project_gaussians_for_images_and_depths(), etc.

  • crop_width (int) – The width of the crop to render. If -1, the full image width is used. Default is -1.

  • crop_height (int) – The height of the crop to render. If -1, the full image height is used. Default is -1.

  • crop_origin_w (int) – The x-coordinate of the top-left corner of the crop. If -1, the crop starts at (0, 0). Default is -1.

  • crop_origin_h (int) – The y-coordinate of the top-left corner of the crop. If -1, the crop starts at (0, 0). Default is -1.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. This parameter controls the size of the tiles used for rendering the images. You shouldn’t set this parameter unless you really know what you are doing.

Returns:
  • rendered_images (torch.Tensor) – A tensor of shape (C, H, W, D) where C is the number of image planes, H is the height of the rendered images, W is the width of the rendered images, and D is the number of channels (e.g., RGB, RGBD, etc.).

  • alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of cameras, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.
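
The crop rules in the notes above (all-negative arguments select the full image, and crops extending past the image are clipped) can be sketched as plain arithmetic. This is one plausible reading under stated assumptions: the helper clip_crop is hypothetical, and the exact output shape fvdb produces for a partially out-of-bounds crop is not specified here.

```python
def clip_crop(image_width, image_height, crop_width, crop_height, origin_w, origin_h):
    """Return (x0, y0, width, height) of the crop after clipping to the image."""
    # All-negative arguments request the full image.
    if crop_width < 0 and crop_height < 0 and origin_w < 0 and origin_h < 0:
        return 0, 0, image_width, image_height
    # Mixing negative and non-negative values is treated as an error.
    if min(crop_width, crop_height, origin_w, origin_h) < 0:
        raise ValueError("crop arguments must be all negative or all non-negative")
    # Clip the far corner of the crop to the image boundary.
    x1 = min(origin_w + crop_width, image_width)
    y1 = min(origin_h + crop_height, image_height)
    return origin_w, origin_h, max(x1 - origin_w, 0), max(y1 - origin_h, 0)


inside = clip_crop(640, 480, 100, 100, 10, 10)
clipped = clip_crop(640, 480, 100, 100, 600, 450)
full = clip_crop(640, 480, -1, -1, -1, -1)
```

Here the 100x100 crop at (600, 450) in a 640x480 image would shrink to 40x30 under this reading.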

render_images(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]

Render C multi-channel images (see num_channels) from this GaussianSplat3d from C camera views.

Note

All images being rendered must have the same width and height.

Example:

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Render images from C camera views.
# images is a tensor of shape [C, H, W, D] where D is the number of channels
# alpha_images is a tensor of shape [C, H, W, 1]
images, alpha_images = gaussian_splat_3d.render_images(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the images
    image_height, # height of the images
    near, # near clipping plane
    far) # far clipping plane
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.

  • image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • images (torch.Tensor) – A tensor of shape (C, H, W, D) where C is the number of camera views, H is the height of the images, W is the width of the images, and D is the number of channels.

  • alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.

render_images_and_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]

Render C multi-channel images (see num_channels) with depth as the last channel from this GaussianSplat3d from C camera views.

Note

All images being rendered must have the same width and height.

Example:

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Render images with depth maps from C camera views.
# images is a tensor of shape [C, H, W, D + 1] where D is the number of channels
# alpha_images is a tensor of shape [C, H, W, 1]
images, alpha_images = gaussian_splat_3d.render_images_and_depths(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the images
    image_height, # height of the images
    near, # near clipping plane
    far) # far clipping plane

colors = images[..., :-1]  # Extract image channels (kept separate so `images` still holds the depth channel)

depths = images[..., -1:] / alpha_images  # Extract depth channel by dividing by alpha
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.

  • image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • images (torch.Tensor) – A tensor of shape (C, H, W, D + 1) where C is the number of camera views, H is the height of the images, W is the width of the images, and D is the number of channels.

  • alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.

render_num_contributing_gaussians(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]

Renders C images where each pixel contains the number of contributing Gaussians for that pixel from C camera views.

Note

All images being rendered must have the same width and height.

Example:

# Assume gaussian_splat_3d is an instance of GaussianSplat3d
# Render the number of contributing Gaussians per pixel from C camera views.
# num_gaussians is a tensor of shape [C, H, W, 1]
# alpha_images is a tensor of shape [C, H, W, 1]
num_gaussians, alpha_images = gaussian_splat_3d.render_num_contributing_gaussians(
    world_to_camera_matrices, # tensor of shape [C, 4, 4]
    projection_matrices, # tensor of shape [C, 3, 3]
    image_width, # width of the images
    image_height, # height of the images
    near, # near clipping plane
    far) # far clipping plane

num_gaussians_cij = num_gaussians[c, i, j, 0]  # Number of contributing Gaussians at pixel (i, j) in camera c
Parameters:
  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.

  • image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, W is the width of the images. Each element represents the number of contributing Gaussians at that pixel.

  • alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.

render_top_contributing_gaussian_ids(num_samples: int, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]

Renders the ids of the top num_samples contributing Gaussians in C camera views, i.e., the ids of the most opaque Gaussians contributing to each pixel in each image.

Note

If there are fewer than num_samples Gaussians contributing to a pixel, the remaining ids will be set to -1, and their corresponding weights will be set to 0.0.

Parameters:
  • num_samples (int) – The number of top contributing Gaussians to return for each pixel.

  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.

  • image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • top_contributing_gaussian_ids (torch.Tensor) – An int64 tensor of shape (C, H, W, num_samples) where C is the number of cameras, H is the height of the images, W is the width of the images, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the id of a Gaussian that contributes to the pixel.

  • weights (torch.Tensor) – A tensor of shape (C, H, W, num_samples) where C is the number of cameras, H is the height of the images, W is the width of the images, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the transmittance-weighted opacity of the Gaussian that contributes to the pixel (i.e. its proportion of the visible contribution to the pixel).
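
Because pixels with fewer than num_samples contributors are padded with an id of -1 and a weight of 0.0, downstream code usually needs to drop those entries before using the ids as indices. A minimal sketch for a single pixel, using plain lists and made-up values (the helper valid_contributors is hypothetical):

```python
def valid_contributors(ids, weights):
    """Pair ids with weights for one pixel, dropping the -1 padding entries."""
    return [(i, w) for i, w in zip(ids, weights) if i >= 0]


# One pixel rendered with num_samples=4, where only two Gaussians contributed.
pixel_ids = [17, 3, -1, -1]
pixel_weights = [0.6, 0.25, 0.0, 0.0]

contributors = valid_contributors(pixel_ids, pixel_weights)
# Total weight of the returned contributors for this pixel.
covered = sum(w for _, w in contributors)
```

On the full (C, H, W, num_samples) tensors, the same filter corresponds to a boolean mask of the form ids >= 0.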

property requires_grad: bool

Returns whether the tensors tracked by this GaussianSplat3d instance are set to require gradients. This is typically set to True if you want to optimize the parameters of the Gaussians.

Example:

gsplat3d = GaussianSplat3d(...)  # Some GaussianSplat3d instance
gsplat3d.requires_grad = True  # Enable gradient tracking for optimization

assert gsplat3d.means.requires_grad  # Now the means will require gradients
assert gsplat3d.covariances.requires_grad  # Now the covariances will require gradients
assert gsplat3d.logit_opacities.requires_grad  # Now the logit opacities will require gradients
assert gsplat3d.log_scales.requires_grad  # Now the log scales will require gradients
assert gsplat3d.sh0.requires_grad  # Now the diffuse SH coefficients will require gradients
assert gsplat3d.shN.requires_grad  # Now the directionally varying SH coefficients will require gradients
Returns:

requires_grad (bool) – True if gradients are required, False otherwise.

reset_accumulated_gradient_state() None[source]

Reset the accumulated projected gradients of the means if accumulate_mean_2d_gradients is True, and the accumulated max 2D radii if accumulate_max_2d_radii is True.

The values of accumulated_projected_mean_2d_gradients, accumulated_max_2d_radii, and accumulated_gradient_step_counts will be zeroed out after this call.

See also

accumulate_mean_2d_gradients() and accumulate_max_2d_radii(), which control whether these values are accumulated during rendering and backward passes.

save_ply(filename: Path | str, metadata: Mapping[str, str | int | float | Tensor] | None = None) None[source]

Save this GaussianSplat3d to a PLY file, including any metadata provided.

Parameters:
  • filename (pathlib.Path | str) – The path to the PLY file to save.

  • metadata (dict[str, str | int | float | torch.Tensor] | None) – An optional dictionary of metadata where the keys are strings and the values are either strings, ints, floats, or tensors. Defaults to None.

property scales: Tensor

Returns the scales of the Gaussians in the Gaussian splatting representation. The scales are the eigenvalues of the covariance matrix of each Gaussian, i.e., each Gaussian \(g_i\) is defined as:

\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]

where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).

Note

This property is read-only. GaussianSplat3d stores the log of the scales to ensure numerical stability; to change the scales, modify log_scales.

Returns:

scales (torch.Tensor) – A tensor of shape (N, 3) where N is the number of Gaussians. Each row represents the scale of a Gaussian in 3D space.

set_state(means: Tensor, quats: Tensor, log_scales: Tensor, logit_opacities: Tensor, sh0: Tensor, shN: Tensor) None[source]

Set the underlying tensors managed by this GaussianSplat3d instance.

Note: If accumulate_mean_2d_gradients and/or accumulate_max_2d_radii are True, this method will reset the gradient state (see reset_accumulated_gradient_state()).

Parameters:
  • means (torch.Tensor) – Tensor of shape (N, 3) representing the means of the Gaussians. N is the number of Gaussians (see num_gaussians).

  • quats (torch.Tensor) – Tensor of shape (N, 4) representing the quaternions of the Gaussians. N is the number of Gaussians (see num_gaussians).

  • log_scales (torch.Tensor) – Tensor of shape (N, 3) representing the log scales of the Gaussians. N is the number of Gaussians (see num_gaussians).

  • logit_opacities (torch.Tensor) – Tensor of shape (N,) representing the logit opacities of the Gaussians. N is the number of Gaussians (see num_gaussians).

  • sh0 (torch.Tensor) – Tensor of shape (N, 1, D) representing the diffuse SH coefficients where N is the number of Gaussians (see num_gaussians), and D is the number of channels (see num_channels).

  • shN (torch.Tensor) – Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where N is the number of Gaussians (see num_gaussians), D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).
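
The shapes listed above are tied together by N, D, and the number of SH bases K. As a sketch for checking tensors before calling set_state(), the hypothetical helper below tabulates the expected shapes; it assumes K = (sh_degree + 1)^2, the relationship given under sh_degree.

```python
def expected_state_shapes(num_gaussians, num_channels, sh_degree):
    """Return the tensor shapes documented for set_state(), keyed by argument name."""
    # Number of SH bases K for the given degree (assumed: K = (sh_degree + 1)^2).
    num_sh_bases = (sh_degree + 1) ** 2
    return {
        "means": (num_gaussians, 3),
        "quats": (num_gaussians, 4),
        "log_scales": (num_gaussians, 3),
        "logit_opacities": (num_gaussians,),
        "sh0": (num_gaussians, 1, num_channels),
        "shN": (num_gaussians, num_sh_bases - 1, num_channels),
    }


# Example: 1000 RGB Gaussians with degree-3 spherical harmonics.
shapes = expected_state_shapes(num_gaussians=1000, num_channels=3, sh_degree=3)
```

Comparing tuple(tensor.shape) against these entries before the call makes shape mismatches easier to diagnose. With sh_degree 0, shN has a middle dimension of 0.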

property sh0: Tensor

Returns the diffuse spherical harmonics coefficients of the Gaussians in this GaussianSplat3d. These coefficients are used to represent the diffuse color/feature of each Gaussian.

Returns:

sh0 (torch.Tensor) – A tensor of shape (N, 1, D) where N is the number of Gaussians (see num_gaussians), and D is the number of channels (see num_channels). Each row represents the diffuse SH coefficients for a Gaussian.

property shN: Tensor

Returns the directionally varying spherical harmonics coefficients of the Gaussians in this GaussianSplat3d. These coefficients are used to represent the direction-dependent color/feature of each Gaussian.

Returns:

shN (torch.Tensor) – A tensor of shape (N, K-1, D) where N is the number of Gaussians (see num_gaussians), D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases). Each row represents the directionally varying SH coefficients for a Gaussian.

property sh_degree: int

Returns the degree of the spherical harmonics used in the Gaussian splatting representation. This value is 0 for diffuse SH coefficients and >= 1 for directionally varying SH coefficients.

Note

This is not the same as the number of spherical harmonics bases (see num_sh_bases). The relationship between the degree and the number of bases is given by \(K = (sh\_degree + 1)^2\), where \(K\) is the number of spherical harmonics bases.

Returns:

sh_degree (int) – The degree of the spherical harmonics.
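
The relationship between the degree and the number of bases can be checked with a few lines of plain Python. This is only an illustrative sketch: the function names are hypothetical and are not part of fvdb.

```python
import math


def num_sh_bases_from_degree(sh_degree: int) -> int:
    """K = (sh_degree + 1)^2."""
    return (sh_degree + 1) ** 2


def sh_degree_from_num_bases(num_sh_bases: int) -> int:
    """Invert K = (sh_degree + 1)^2; K must be a positive perfect square."""
    root = math.isqrt(num_sh_bases)
    if root == 0 or root * root != num_sh_bases:
        raise ValueError(f"{num_sh_bases} is not a valid number of SH bases")
    return root - 1


# Degree, number of bases K, and the K-1 directionally varying bases held in shN.
for degree in range(4):
    k = num_sh_bases_from_degree(degree)
    print(degree, k, k - 1)
# prints:
# 0 1 0
# 1 4 3
# 2 9 8
# 3 16 15
```

For example, a degree of 3 gives K = 16, so sh0 has shape (N, 1, D) and shN has shape (N, 15, D).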

sparse_render_num_contributing_gaussians(pixels_to_render: Tensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
sparse_render_num_contributing_gaussians(pixels_to_render: JaggedTensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[JaggedTensor, JaggedTensor]

Renders the number of Gaussians which contribute to each pixel specified in the input.

See also

render_num_contributing_gaussians() for rendering dense images of contributing Gaussians.

Parameters:
  • pixels_to_render (torch.Tensor | JaggedTensor) – A torch.Tensor of shape (C, R, 2) or a fvdb.JaggedTensor of shape (C, R_c, 2) representing the pixels to render for each camera, where C is the number of camera views and R/R_c is the number of pixels to render per camera. Each value is an (x, y) pixel coordinate.

  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. All rendered images share the same width.

  • image_height (int) – The height of the images to be rendered. All rendered images share the same height.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • num_contributing_gaussians (torch.Tensor | JaggedTensor) – A tensor of shape (C, R) (if this method was called with pixels_to_render as a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c) (if this method was called with pixels_to_render as a fvdb.JaggedTensor) where C is the number of cameras, and R/R_c is the number of pixels to render per camera. Each element represents the number of contributing Gaussians at that pixel.

  • alphas (torch.Tensor | JaggedTensor) – A tensor of shape (C, R) (if this method was called with pixels_to_render as a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c) (if this method was called with pixels_to_render as a fvdb.JaggedTensor) where C is the number of cameras, and R/R_c is the number of pixels to render per camera. Each element represents the alpha value (opacity) at that pixel such that 0 <= alpha < 1, and 0 means the pixel is fully transparent, and 1 means the pixel is fully opaque.
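
The difference between R and R_c is that a jagged input lets each camera have its own number of pixels. The sketch below uses plain Python lists as a stand-in for a fvdb.JaggedTensor to show that structure. The helper name is hypothetical, and the bounds check reflects an assumption (that requested pixels should lie inside the image) rather than documented fvdb behavior.

```python
def clip_pixels_to_image(pixels_per_camera, image_width, image_height):
    """Keep only the (x, y) pixels that fall inside the image, per camera."""
    return [
        [
            (x, y)
            for (x, y) in pixels
            if 0 <= x < image_width and 0 <= y < image_height
        ]
        for pixels in pixels_per_camera
    ]


# Two cameras (C = 2) with a different number of requested pixels each.
pixels_per_camera = [
    [(0, 0), (639, 479), (640, 10)],  # the last pixel is out of bounds in x
    [(5, 7)],
]
kept = clip_pixels_to_image(pixels_per_camera, image_width=640, image_height=480)

# R_c for each camera after clipping.
pixels_per_camera_count = [len(pixels) for pixels in kept]
print(pixels_per_camera_count)  # prints [2, 1]
```

Under this layout, the outputs for a jagged input would hold one value per kept pixel, grouped by camera in the same way.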

sparse_render_top_contributing_gaussian_ids(num_samples: int, pixels_to_render: Tensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
sparse_render_top_contributing_gaussian_ids(num_samples: int, pixels_to_render: JaggedTensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[JaggedTensor, JaggedTensor]

Renders the ids of the top num_samples contributing Gaussians at each of the specified pixels across C camera views, i.e. the ids of the most opaque Gaussians contributing to each pixel in each image.

Note

If there are fewer than num_samples Gaussians contributing to a pixel, the remaining ids will be set to -1, and their corresponding weights will be set to 0.0.

Parameters:
  • num_samples (int) – The number of top contributing Gaussians to return for each pixel.

  • pixels_to_render (torch.Tensor | JaggedTensor) – A torch.Tensor of shape (C, R, 2) or a fvdb.JaggedTensor of shape (C, R_c, 2) representing the pixels to render for each camera, where C is the number of camera views and R/R_c is the number of pixels to render per camera. Each value is an (x, y) pixel coordinate.

  • world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.

  • projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.

  • image_width (int) – The width of the images to be rendered. All rendered images share the same width.

  • image_height (int) – The height of the images to be rendered. All rendered images share the same height.

  • near (float) – The near clipping plane distance for the projection.

  • far (float) – The far clipping plane distance for the projection.

  • projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.

  • tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.

  • min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.

  • eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.

  • antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.

Returns:
  • top_contributing_gaussian_ids (torch.Tensor | JaggedTensor) – A long tensor of shape (C, R, num_samples) (if pixels_to_render was a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c, num_samples) (if pixels_to_render was a fvdb.JaggedTensor), where C is the number of cameras, R/R_c is the number of pixels being rendered per image, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the id of a Gaussian that contributes to the pixel.

  • weights (torch.Tensor | JaggedTensor) – A tensor of shape (C, R, num_samples) (if pixels_to_render was a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c, num_samples) (if pixels_to_render was a fvdb.JaggedTensor), where C is the number of cameras, R/R_c is the number of pixels being rendered per image, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the transmittance-weighted opacity of the Gaussian that contributes to the pixel (i.e. its proportion of the visible contribution to the pixel).
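
To make "transmittance-weighted opacity" and the -1 / 0.0 padding concrete, here is a minimal single-pixel sketch in plain Python using standard front-to-back alpha compositing. It is not fvdb's implementation: the function name is hypothetical, ids are simply positions in the front-to-back order, and ranking by weight is an assumption about what "top contributing" means.

```python
def top_contributing(alphas, num_samples):
    """Rank the Gaussians covering one pixel by transmittance-weighted opacity.

    alphas: per-Gaussian opacities at this pixel, sorted front to back.
    Returns (ids, weights, pixel_alpha), with ids and weights padded to
    num_samples entries using -1 and 0.0 respectively.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    weighted = []
    for gaussian_id, alpha in enumerate(alphas):
        weighted.append((transmittance * alpha, gaussian_id))
        transmittance *= 1.0 - alpha

    # Largest weight first; ties are broken by the lower id.
    weighted.sort(key=lambda item: (-item[0], item[1]))
    top = weighted[:num_samples]

    ids = [gaussian_id for _, gaussian_id in top]
    weights = [weight for weight, _ in top]

    # Pad when fewer than num_samples Gaussians contribute.
    padding = num_samples - len(top)
    ids += [-1] * padding
    weights += [0.0] * padding

    pixel_alpha = 1.0 - transmittance  # equals the sum of all weights
    return ids, weights, pixel_alpha


ids, weights, pixel_alpha = top_contributing([0.5, 0.5, 0.2], num_samples=4)
print(ids)  # prints [0, 1, 2, -1]
```

In this example the three weights are approximately 0.5, 0.25, and 0.05, and they sum to the accumulated pixel alpha of about 0.8.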

state_dict() dict[str, Tensor][source]

Return a dictionary containing the state of the GaussianSplat3d instance. This is useful for serializing the state of the object for saving or transferring.

A state dictionary always contains the following keys where N denotes the number of Gaussians (see num_gaussians):

  • 'means': Tensor of shape (N, 3) representing the means of the Gaussians.

  • 'quats': Tensor of shape (N, 4) representing the quaternions of the Gaussians.

  • 'log_scales': Tensor of shape (N, 3) representing the log scales of the Gaussians.

  • 'logit_opacities': Tensor of shape (N,) representing the logit opacities of the Gaussians.

  • 'sh0': Tensor of shape (N, 1, D) representing the diffuse SH coefficients where D is the number of channels (see num_channels).

  • 'shN': Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).

  • 'accumulate_max_2d_radii': bool Tensor with a single element indicating whether to track the maximum 2D radii for gradients.

  • 'accumulate_mean_2d_gradients': bool Tensor with a single element indicating whether to track the average norm of the gradient of projected means for each Gaussian.

It can also optionally contain the following keys if accumulate_mean_2d_gradients and/or accumulate_max_2d_radii are set to True:

  • 'accumulated_gradient_step_counts': Tensor of shape (N,) representing the accumulated gradient step counts for each Gaussian.

  • 'accumulated_max_2d_radii': Tensor of shape (N,) representing the maximum 2D projected radius for each Gaussian across every iteration of optimization.

  • 'accumulated_mean_2d_gradient_norms': Tensor of shape (N,) representing the average norm of the gradient of projected means for each Gaussian across every iteration of optimization.

See also

from_state_dict() for constructing a GaussianSplat3d from a state dictionary.

Returns:

state_dict (dict[str, torch.Tensor]) – A dictionary containing the state of the GaussianSplat3d instance.
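
When loading a saved state dictionary, it can help to check the keys and shapes listed above before constructing an instance. The following sketch works on any mapping and makes no fvdb calls; the constant and function names are hypothetical.

```python
REQUIRED_KEYS = (
    "means",
    "quats",
    "log_scales",
    "logit_opacities",
    "sh0",
    "shN",
    "accumulate_max_2d_radii",
    "accumulate_mean_2d_gradients",
)

OPTIONAL_KEYS = (
    "accumulated_gradient_step_counts",
    "accumulated_max_2d_radii",
    "accumulated_mean_2d_gradient_norms",
)


def expected_shapes(num_gaussians, num_sh_bases, num_channels):
    """Shapes of the per-Gaussian tensors, following the list above."""
    n, k, d = num_gaussians, num_sh_bases, num_channels
    return {
        "means": (n, 3),
        "quats": (n, 4),
        "log_scales": (n, 3),
        "logit_opacities": (n,),
        "sh0": (n, 1, d),
        "shN": (n, k - 1, d),
    }


def check_state_dict(state):
    """Return (missing required keys, keys that are not documented)."""
    missing = [key for key in REQUIRED_KEYS if key not in state]
    unknown = [
        key
        for key in state
        if key not in REQUIRED_KEYS and key not in OPTIONAL_KEYS
    ]
    return missing, unknown


# Placeholder values stand in for tensors; only the keys are inspected.
state = {key: None for key in REQUIRED_KEYS}
print(check_state_dict(state))  # prints ([], [])
```

With real tensors, each entry of expected_shapes() could be compared against tuple(tensor.shape) for the matching key.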

to(dtype: dtype | None = None) GaussianSplat3d[source]
to(device: str | device | None = None, dtype: dtype | None = None) GaussianSplat3d
to(other: Tensor) GaussianSplat3d
to(other: GaussianSplat3d) GaussianSplat3d
to(other: Grid) GaussianSplat3d
to(other: GridBatch) GaussianSplat3d
to(other: JaggedTensor) GaussianSplat3d

Move the GaussianSplat3d instance to a different device, change its data type, or both.

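Parameters:
  • device (str | torch.device | None) – The device to move the GaussianSplat3d instance to.

  • dtype (torch.dtype | None) – The data type to convert the GaussianSplat3d instance to.

  • other (torch.Tensor | GaussianSplat3d | Grid | GridBatch | JaggedTensor) – An existing object to match. In the overloads that accept other, the target is taken from this object instead of being passed explicitly.
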
Returns:

gaussian_splat_3d (GaussianSplat3d) – A new instance of GaussianSplat3d with the specified device and/or data type.