Gaussian Splatting
- class fvdb.ProjectedGaussianSplats(impl: ProjectedGaussianSplats, _private: Any = None)[source]
A class representing a set of Gaussian splats projected onto a batch of 2D image planes.
A ProjectedGaussianSplats instance contains the 2D projections of 3D Gaussian splats, which can be used to render images onto the image planes. Instances of this class are created by calling methods such as GaussianSplat3d.project_gaussians_for_images() and GaussianSplat3d.project_gaussians_for_images_and_depths().
Note
The reason to have a separate class for projected Gaussian splats is to be able to run projection once, and then render the splats multiple times (e.g. rendering crops) without re-projecting them each time. This can save significant computation time.
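As a rough illustration of the crop use case, the sketch below enumerates crop windows that cover an image. The helper name `iter_crops` is hypothetical and not part of fvdb; the dictionary keys mirror the crop arguments used in the examples on this page. Each window would be passed to a render call on an already-projected set of splats, so projection runs only once.

```python
from typing import Dict, Iterator


def iter_crops(image_width: int, image_height: int, crop_size: int) -> Iterator[Dict[str, int]]:
    """Yield crop windows that tile an image_width x image_height image.

    Hypothetical helper: crops on the right and bottom edges are shrunk
    so that every window stays inside the image.
    """
    for origin_h in range(0, image_height, crop_size):
        for origin_w in range(0, image_width, crop_size):
            yield {
                "crop_width": min(crop_size, image_width - origin_w),
                "crop_height": min(crop_size, image_height - origin_h),
                "crop_origin_w": origin_w,
                "crop_origin_h": origin_h,
            }


# A 250x100 image split into 100-pixel crops along the width.
crops = list(iter_crops(image_width=250, image_height=100, crop_size=100))
```

Each dictionary can be unpacked as keyword arguments into the rendering call, after the projected splats argument.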
- property antialias: bool
Return whether antialiasing was enabled during the projection of the Gaussian splats.
- Returns:
antialias (bool) – True if antialiasing was enabled during projection, False otherwise.
- property depths: Tensor
Return the depth of each projected Gaussian in each image plane. The depth is defined as the distance from the camera to the mean of the Gaussian along the camera’s viewing direction.
- Returns:
depths (torch.Tensor) – A tensor of shape (C, N) representing the depth of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.
- property eps_2d: float
Return the epsilon value used during the projection of the Gaussian splats to avoid numerical issues. This value is used to clamp very small radii during projection.
- Returns:
eps_2d (float) – The epsilon value used during projection.
- property far_plane: float
Return the far plane distance used during the projection of the Gaussian splats.
- Returns:
far_plane (float) – The far plane distance.
- property image_height: int
Return the height of the image planes used during the projection of the Gaussian splats.
- Returns:
image_height (int) – The height of the image planes.
- property image_width: int
Return the width of the image planes used during the projection of the Gaussian splats.
- Returns:
image_width (int) – The width of the image planes.
- property inv_covar_2d: Tensor
Return the inverse of the 2D covariance matrices of the Gaussians projected into each image plane. These define the spatial extent of the ellipse for each splatted Gaussian. Because covariance matrices are symmetric, each one is packed into three values (Cxx, Cxy, Cyy).
- Returns:
inv_covar_2d (torch.Tensor) – A tensor of shape (C, N, 3) representing the packed inverse 2D covariance matrices, where C is the number of image planes, N is the number of projected Gaussians, and the last dimension holds the packed entries (Cxx, Cxy, Cyy).
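The packed layout can be illustrated with a small sketch. It expands one packed (Cxx, Cxy, Cyy) triple into a full symmetric 2x2 matrix and evaluates the standard Gaussian falloff exp(-0.5 * d^T A d) at a pixel offset d from the projected mean. The function names are hypothetical, and whether the renderer evaluates exactly this expression is an assumption.

```python
import math


def unpack_symmetric_2x2(packed):
    """Expand a packed (Cxx, Cxy, Cyy) triple into a full 2x2 matrix."""
    cxx, cxy, cyy = packed
    return [[cxx, cxy], [cxy, cyy]]


def gaussian_falloff(packed_inv_covar, dx, dy):
    """Evaluate exp(-0.5 * d^T A d) for the offset d = (dx, dy) from the
    2D mean, where A is the packed inverse covariance."""
    a, b, c = packed_inv_covar
    quad = a * dx * dx + 2.0 * b * dx * dy + c * dy * dy
    return math.exp(-0.5 * quad)


inv_cov = (0.5, 0.0, 2.0)  # toy packed inverse covariance
full = unpack_symmetric_2x2(inv_cov)
weight_at_mean = gaussian_falloff(inv_cov, 0.0, 0.0)
weight_off_center = gaussian_falloff(inv_cov, 1.0, 1.0)
```

Working on the packed triple directly avoids materializing the redundant off-diagonal entry.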
- property means2d: Tensor
Return the 2D projected means (in pixel units) of the Gaussians in each image plane.
- Returns:
means2d (torch.Tensor) – A tensor of shape (C, N, 2) representing the 2D projected means, where C is the number of image planes, N is the number of projected Gaussians, and the last dimension contains the (x, y) coordinates of the means in pixel space.
- property min_radius_2d: float
Return the minimum radius (in pixels) used to clip Gaussians during projection. Gaussians whose projected radius is less than this value are ignored to avoid numerical issues.
- Returns:
min_radius_2d (float) – The minimum radius used during projection.
- property near_plane: float
Return the near plane distance used during the projection of the Gaussian splats.
- Returns:
near_plane (float) – The near plane distance.
- property opacities: Tensor
Return the opacities of each projected Gaussian in each image plane.
- Returns:
opacities (torch.Tensor) – A tensor of shape (C, N) representing the opacity of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.
- property projection_type: ProjectionType
Return the projection type used during the projection of the Gaussian splats.
- Returns:
projection_type (ProjectionType) – The projection type (e.g. ProjectionType.PERSPECTIVE or ProjectionType.ORTHOGRAPHIC).
- property radii: Tensor
Return the 2D radii (in pixels) of each projected Gaussian in each image plane. The radius of a Gaussian is the maximum extent of the Gaussian along any direction in the image plane.
- Returns:
radii (torch.Tensor) – A tensor of shape (C, N) representing the 2D radius of each projected Gaussian, where C is the number of image planes, and N is the number of projected Gaussians.
- property render_quantities: Tensor
Return the render quantities of each projected Gaussian in each image plane. The render quantities are used for shading and lighting calculations during rendering.
- Returns:
render_quantities (torch.Tensor) – A tensor of shape (C, N, D) representing the render quantities of each projected Gaussian, where C is the number of image planes, N is the number of projected Gaussians, and D is the number of feature channels for each Gaussian (see GaussianSplat3d.num_channels).
- property sh_degree_to_use: int
Return the spherical harmonic degree used during the projection of the Gaussian splats.
Note
This indicates up to which degree the spherical harmonics coefficients were projected for each Gaussian. For example, if this value is 0, only the diffuse (degree 0) coefficients were projected. If this value is 2, coefficients up to degree 2 were projected.
- Returns:
sh_degree_to_use (int) – The spherical harmonic degree used during projection.
- property tile_gaussian_ids: Tensor
Return a tensor containing the ID of each tile/gaussian intersection.
- Returns:
tile_gaussian_ids (torch.Tensor) – A tensor of shape (M,) containing the IDs of the Gaussians.
- property tile_offsets: Tensor
Return the starting offset of the set of intersections for each tile into tile_gaussian_ids.
- Returns:
tile_offsets (torch.Tensor) – A tensor of shape (C, TH, TW) where C is the number of image planes, TH is the number of tiles in the height dimension, and TW is the number of tiles in the width dimension.
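The relationship between the two tile tensors can be sketched with plain lists. This assumes, without confirmation from this page, that the offsets have been flattened to one entry per tile and that each tile's intersections form a contiguous run ending where the next tile's run begins. The helper name `ids_for_tile` is hypothetical.

```python
def ids_for_tile(tile_offsets_flat, tile_gaussian_ids, tile_index):
    """Return the Gaussian IDs intersecting one tile.

    Assumption: a tile's run starts at its offset and ends at the next
    tile's offset, or at the end of tile_gaussian_ids for the last tile.
    """
    start = tile_offsets_flat[tile_index]
    if tile_index + 1 < len(tile_offsets_flat):
        end = tile_offsets_flat[tile_index + 1]
    else:
        end = len(tile_gaussian_ids)
    return tile_gaussian_ids[start:end]


# Toy data: 4 tiles (flattened) and 5 tile/Gaussian intersections.
tile_offsets_flat = [0, 2, 2, 4]
tile_gaussian_ids = [7, 3, 3, 9, 1]

per_tile = [ids_for_tile(tile_offsets_flat, tile_gaussian_ids, t) for t in range(4)]
```

In this toy data, Gaussian 3 appears in two tiles and the second tile has no intersections, which shows up as two equal consecutive offsets.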
- class fvdb.GaussianSplat3d(impl: GaussianSplat3d, _private: Any = None)[source]
An efficient data structure representing a Gaussian splat radiance field in 3D space.
A GaussianSplat3d instance contains a set of 3D Gaussian splats, each defined by its mean position, orientation (quaternion), scale, opacity, and spherical harmonics coefficients for color representation.
Together, these define a radiance field which can be volume rendered to produce images and depths from arbitrary viewpoints. This class provides a variety of methods for rendering and manipulating Gaussian splat radiance fields. These include:
Rendering images with arbitrary channels using spherical harmonics for view-dependent color representation (render_images(), render_images_and_depths()).
Rendering depth maps (render_depths(), render_images_and_depths()).
Rendering features at arbitrary sparse pixel locations (sparse_render_features()).
Rendering depths at arbitrary sparse pixel locations (sparse_render_depths()).
Computing which Gaussians contribute to each pixel in an image plane (render_num_contributing_gaussians(), render_top_contributing_gaussian_ids()).
Computing the set of Gaussians which contribute to a set of sparse pixel locations (sparse_render_num_contributing_gaussians(), sparse_render_top_contributing_gaussian_ids()).
Saving and loading Gaussian splat data to/from PLY files (save_to_ply(), from_ply()).
Slicing, indexing, and masking Gaussians to create new GaussianSplat3d instances.
Concatenating multiple GaussianSplat3d instances into a single instance (cat()).
Background
Mathematically, the radiance field represented by a GaussianSplat3d is defined as a sum of anisotropic 3D Gaussians, with view-dependent features represented using spherical harmonics. The radiance field \(R(x, v)\) accepts as input a 3D position \(x \in \mathbb{R}^3\) and a viewing direction \(v \in \mathbb{S}^2\), and is defined as:
\[\begin{aligned}R(x, v) &= \sum_{i=1}^{N} o_i \cdot \alpha_i(x) \cdot SH(v; C_i)\\ \alpha_i(x) &= \exp\left(-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right)\\ \Sigma_i &= R(q_i)^T \cdot \text{diag}(S_i) \cdot R(q_i)\end{aligned}\]
where:
\(N\) is the number of Gaussians (see num_gaussians).
\(\mu_i \in \mathbb{R}^3\) is the mean of the i-th Gaussian (see means).
\(\Sigma_i \in \mathbb{R}^{3 \times 3}\) is the covariance matrix of the i-th Gaussian, defined by its diagonal scale \(S_i \in \mathbb{R}^3\) (see scales) and orientation quaternion \(q_i \in \mathbb{R}^4\) (see quats).
\(o_i \in [0, 1]\) is the opacity of the i-th Gaussian (see opacities).
\(SH(v; C_i)\) is the spherical harmonics function evaluated at direction \(v\) with coefficients \(C_i\).
\(R(q_i)\) is the rotation matrix corresponding to the quaternion \(q_i\).
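The definition of \(\alpha_i(x)\) can be checked numerically with a minimal pure-Python sketch. It follows the formulas above literally, using the fact that for a rotation matrix \(R\), \(\Sigma^{-1} = R^T \text{diag}(1 / S) R\). The function names are hypothetical, and the (w, x, y, z) quaternion ordering is an assumption that this page does not specify.

```python
import math


def quat_to_rotation(q):
    """Rotation matrix for a unit quaternion q = (w, x, y, z).

    The component ordering is an assumption made for this sketch.
    """
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def alpha(x, mean, quat, scale):
    """alpha(x) = exp(-0.5 (x - mu)^T Sigma^{-1} (x - mu)) with
    Sigma = R^T diag(S) R, so Sigma^{-1} = R^T diag(1 / S) R."""
    rot = quat_to_rotation(quat)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # local = R (x - mu): the offset expressed in the Gaussian's own frame.
    local = [sum(rot[r][c] * diff[c] for c in range(3)) for r in range(3)]
    quad = sum(v * v / s for v, s in zip(local, scale))
    return math.exp(-0.5 * quad)


identity_quat = (1.0, 0.0, 0.0, 0.0)
origin = (0.0, 0.0, 0.0)
a_center = alpha(origin, origin, identity_quat, (1.0, 1.0, 1.0))
a_offset = alpha((1.0, 0.0, 0.0), origin, identity_quat, (4.0, 1.0, 1.0))
```

At the mean the falloff is 1, and it decays with the scaled squared distance along each local axis.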
To render images from a GaussianSplat3d, you volume render the radiance field using
\[I(u, v) = \int_{t \in r(u, v)} T(t) R(r(t), d) dt\]
where \(r(u, v)\) is the camera ray through pixel \((u, v)\), \(d\) is the viewing direction of the ray, and \(T(t) = \exp\left(-\int_{0}^{t} R(r(s), d) ds\right)\) is the accumulated transmittance along the ray up to distance \(t\).
To render depths, you compute
\[D(u, v) = \int_{t \in r(u, v)} t \cdot T(t) \sum_{i=1}^{N} o_i \cdot \alpha_i(r(t)) dt\]
- PLY_VERSION_STRING = 'fvdb_ply 1.0.0'
Version string written to PLY files saved using the save_to_ply() method. This string will be written in the comment section of the PLY file to identify the version of the fvdb library used to save the file. The comment will have the form comment fvdb_gs_ply <PLY_VERSION_STRING>.
- __getitem__(index: slice) GaussianSplat3d[source]
- __getitem__(index: Tensor) GaussianSplat3d
Select Gaussians using either an integer index tensor, a boolean mask tensor, or a slice.
Note
If accumulate_mean_2d_gradients or accumulate_max_2d_radii is enabled on this GaussianSplat3d instance, the returned GaussianSplat3d will also contain the corresponding accumulated values.
Example usage:
    # Using a slice
    gs_subset = gsplat3d[10:20]  # Selects Gaussians from index 10 to 19

    # Using an integer index tensor
    indices = torch.tensor([0, 2, 4, 6])
    gs_subset = gsplat3d[indices]  # Selects Gaussians at indices 0, 2, 4, and 6

    # Using a boolean mask tensor
    mask = torch.tensor([True, False, True, False, ...])  # Length must be num_gaussians
    gs_subset = gsplat3d[mask]  # Selects Gaussians where mask is True
- Parameters:
index (slice | torch.Tensor) – A slice object or a 1D tensor containing either integer indices or a boolean mask.
- Returns:
gaussian_splat_3d (GaussianSplat3d) – A new instance of GaussianSplat3d containing only the selected Gaussians.
- __setitem__(index: slice, value: GaussianSplat3d) None[source]
- __setitem__(index: Tensor, value: GaussianSplat3d) None
Set the values of Gaussians in this GaussianSplat3d instance using either an integer index tensor, a boolean mask tensor, or a slice.
Note
If the integer index tensor contains duplicate indices, the Gaussians from value assigned to those duplicates overwrite one another in a random order.
Note
If accumulate_mean_2d_gradients or accumulate_max_2d_radii is enabled on this GaussianSplat3d instance, the corresponding accumulated values will also be updated for the selected Gaussians, based on the values from the value instance. If value does not have these accumulations enabled, the accumulated values for the selected Gaussians will be reset to zero.
Example:
    # Using a slice
    gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with 10 Gaussians
    gsplat3d[10:20] = gs_subset  # Sets Gaussians from index 10 to 19

    # Using an integer index tensor
    indices = torch.tensor([0, 2, 4, 6])
    gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with 4 Gaussians
    gsplat3d[indices] = gs_subset  # Sets Gaussians at indices 0, 2, 4, and 6

    # Using a boolean mask tensor
    mask = torch.tensor([True, False, True, False, ...])  # Length must be num_gaussians
    gs_subset: GaussianSplat3d = ...  # Some GaussianSplat3d instance with one Gaussian per True entry in mask
    gsplat3d[mask] = gs_subset  # Sets Gaussians where mask is True
- Parameters:
index (torch.Tensor | slice) – A slice object or a 1D tensor containing either integer indices or a boolean mask.
value (GaussianSplat3d) – The GaussianSplat3d instance containing the new values to set. Must have the same number of Gaussians as the selected indices or mask.
- property accumulate_max_2d_radii: bool
Returns whether to track the maximum 2D projected radius of each Gaussian across calls to render_* functions. This is used by certain optimization techniques to ensure that the Gaussians do not become too large or too small during the optimization process.
See also
See accumulated_max_2d_radii for the actual maximum radii values.
- Returns:
accumulate_max_radii (bool) – True if the maximum 2D radii are being tracked across rendering calls, False otherwise.
- property accumulate_mean_2d_gradients: bool
Returns whether to track the average norm of the gradient of projected means for each Gaussian during the backward pass of projection. This property is used by certain optimization techniques to split/prune/duplicate Gaussians. The accumulated 2d gradient norms are defined as follows:
\[\sum_{t=1}^{T} \| \partial_{L_t} \mu_i^{2D} \|_2\]where \(\mu_i^{2D}\) is the projection of the mean of Gaussian \(g_i\) onto the image plane, and \(L_t\) is the loss at iteration \(t\).
See also
See accumulated_mean_2d_gradient_norms for the actual average norms of the gradients.
- Returns:
accumulate_mean_2d_grads (bool) – True if the average norm of the gradient of projected means is being tracked, False otherwise.
- property accumulated_gradient_step_counts: Tensor
Returns the accumulated gradient step counts for each Gaussian.
If this GaussianSplat3d instance is set to track accumulated gradients (i.e., accumulate_mean_2d_gradients is True), then this tensor contains the number of gradient steps that have been applied to each Gaussian during optimization.
If accumulate_mean_2d_gradients is False, this property will be an empty tensor.
Note
To reset the counts, call the reset_accumulated_gradient_state() method.
- Returns:
step_counts (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the accumulated gradient step count for a Gaussian.
- property accumulated_max_2d_radii: Tensor
Returns the maximum 2D projected radius (in pixels) for each Gaussian across all calls to render_* functions. This is used by certain optimization techniques to ensure that the Gaussians do not become too large or too small during the optimization process.
If this GaussianSplat3d instance is set to track maximum 2D radii (i.e., accumulate_max_2d_radii is True), then this tensor contains the maximum 2D radius for each Gaussian.
If accumulate_max_2d_radii is False, this property will be an empty tensor.
Note
To reset the maximum radii to zero, call the reset_accumulated_gradient_state() method.
- Returns:
max_radii (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the maximum 2D radius for a Gaussian across all optimization iterations.
- property accumulated_mean_2d_gradient_norms: Tensor
Returns the average norm of the gradient of projected (2D) means for each Gaussian across every backward pass. This is used by certain optimization techniques to split/prune/duplicate Gaussians. The accumulated 2d gradient norms are defined as follows:
\[\sum_{t=1}^{T} \| \partial_{L_t} \mu_i^{2D} \|_2\]where \(\mu_i^{2D}\) is the projection of the mean of Gaussian \(g_i\) onto the image plane, and \(L_t\) is the loss at iteration \(t\).
Note
To reset the accumulated norms, call the reset_accumulated_gradient_state() method.
- Returns:
accumulated_grad_2d_norms (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the average norm of the gradient of projected means for a Gaussian across all optimization iterations. The norm is computed in 2D space, i.e., on the projected means.
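The text describes an average while the formula above is a running sum. One way to reconcile the two, assumed here rather than confirmed by this page, is to keep a running sum of norms together with a per-Gaussian step count and to divide on read. The class below is a hypothetical pure-Python sketch of that bookkeeping, not fvdb's implementation.

```python
import math


class GradientNormAccumulator:
    """Hypothetical bookkeeping for per-Gaussian 2D gradient norms."""

    def __init__(self, num_gaussians):
        self.norm_sums = [0.0] * num_gaussians
        self.step_counts = [0] * num_gaussians

    def accumulate(self, grads_2d):
        """Add one backward pass.

        grads_2d[i] is the (dx, dy) gradient of the loss with respect to the
        projected mean of Gaussian i, or None if it received no gradient.
        """
        for i, grad in enumerate(grads_2d):
            if grad is None:
                continue
            self.norm_sums[i] += math.hypot(grad[0], grad[1])
            self.step_counts[i] += 1

    def average_norms(self):
        """Sum of norms divided by step count (0.0 where the count is zero)."""
        return [s / c if c > 0 else 0.0
                for s, c in zip(self.norm_sums, self.step_counts)]

    def reset(self):
        """Clear both the running sums and the step counts."""
        self.norm_sums = [0.0] * len(self.norm_sums)
        self.step_counts = [0] * len(self.step_counts)


acc = GradientNormAccumulator(num_gaussians=2)
acc.accumulate([(3.0, 4.0), None])        # only Gaussian 0 received a gradient
acc.accumulate([(0.0, 0.0), (6.0, 8.0)])  # both received a gradient
averages = acc.average_norms()
```

Densification heuristics typically compare such averages against a threshold to decide which Gaussians to split or duplicate.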
- static cat(splats: Sequence[GaussianSplat3d], accumulate_mean_2d_gradients: bool = False, accumulate_max_2d_radii: bool = False, detach: bool = False) GaussianSplat3d[source]
Concatenates a sequence of GaussianSplat3d instances into a single GaussianSplat3d instance.
The returned GaussianSplat3d will contain all the Gaussians from the input instances, in the order they were provided.
Note
All input GaussianSplat3d instances must have the same number of channels and spherical harmonic degree.
Note
If accumulate_mean_2d_gradients is True, the concatenated instance will track the average norm of projected mean gradients for each Gaussian during the backward pass of projection. This value is copied over from each input instance if they were tracking it, and initialized to zero otherwise.
Note
If accumulate_max_2d_radii is True, the concatenated instance will track the maximum 2D radii for each Gaussian during the backward pass of projection. This value is copied over from each input instance if they were tracking it, and initialized to zero otherwise.
- Parameters:
splats (Sequence[GaussianSplat3d]) – A sequence of GaussianSplat3d instances to concatenate.
accumulate_mean_2d_gradients (bool) – If True, copies the accumulated mean 2D gradients for each GaussianSplat3d into the concatenated one, or initializes them to zero if the input instance was not tracking them. Defaults to False.
accumulate_max_2d_radii (bool) – If True, copies the accumulated maximum 2D radii for each GaussianSplat3d into the concatenated one, or initializes them to zero if the input instance was not tracking them. Defaults to False.
detach (bool) – If True, detaches the concatenated GaussianSplat3d from the computation graph. Defaults to False.
- Returns:
GaussianSplat3d – A new instance of GaussianSplat3d containing the concatenated Gaussians.
- detach() GaussianSplat3d[source]
Return a new GaussianSplat3d instance whose tensors are detached from the computation graph. This is useful when you want to stop tracking gradients for this instance.
- Returns:
gaussian_splat (GaussianSplat3d) – A new GaussianSplat3d instance whose tensors are detached.
- detach_() None[source]
Detaches this GaussianSplat3d instance from the computation graph in place. This modifies the current instance to stop tracking gradients.
Note
This method modifies the current instance and does not return a new instance.
- property device: device
Returns the device on which the tensors managed by this GaussianSplat3d instance are stored.
- Returns:
device (torch.device) – The device of this GaussianSplat3d instance.
- property dtype: dtype
Returns the data type of the tensors managed by this GaussianSplat3d instance (e.g., torch.float32, torch.float64).
- Returns:
torch.dtype – The data type of the tensors managed by this GaussianSplat3d instance.
- classmethod from_ply(filename: Path | str, device: str | device = 'cuda') tuple[GaussianSplat3d, dict[str, str | int | float | Tensor]][source]
Create a GaussianSplat3d instance from a PLY file.
- Parameters:
filename (Path | str) – The path of the PLY file to load.
device (str | torch.device) – The device to load the data onto. Default is "cuda".
- Returns:
splats (GaussianSplat3d) – An instance of GaussianSplat3d initialized with the data from the PLY file.
metadata (dict[str, str | int | float | torch.Tensor]) – A dictionary of metadata where the keys are strings and the values are either strings, ints, floats, or tensors. Can be empty if no metadata is saved in the PLY file.
- classmethod from_state_dict(state_dict: dict[str, Tensor]) GaussianSplat3d[source]
Creates a GaussianSplat3d instance from a state dictionary generated by state_dict(). This method is typically used to load a saved state of the GaussianSplat3d instance.
A state dictionary must contain the following keys, which are all the required parameters to initialize a GaussianSplat3d. Here N denotes the number of Gaussians (see num_gaussians).
'means': Tensor of shape (N, 3) representing the means of the Gaussians.
'quats': Tensor of shape (N, 4) representing the quaternions of the Gaussians.
'log_scales': Tensor of shape (N, 3) representing the log scales of the Gaussians.
'logit_opacities': Tensor of shape (N,) representing the logit opacities of the Gaussians.
'sh0': Tensor of shape (N, 1, D) representing the diffuse SH coefficients where D is the number of channels (see num_channels).
'shN': Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).
'accumulate_max_2d_radii': bool Tensor with a single element indicating whether to track the maximum 2D radii for gradients.
'accumulate_mean_2d_gradients': bool Tensor with a single element indicating whether to track the average norm of the gradient of projected means for each Gaussian.
It can also optionally contain the following keys:
'accumulated_gradient_step_counts': Tensor of shape (N,) representing the accumulated gradient step counts for each Gaussian.
'accumulated_max_2d_radii': Tensor of shape (N,) representing the maximum 2D projected radius for each Gaussian across every iteration of optimization.
'accumulated_mean_2d_gradient_norms': Tensor of shape (N,) representing the average norm of the gradient of projected means for each Gaussian across every iteration of optimization.
- Parameters:
state_dict (dict[str, torch.Tensor]) – A dictionary containing the state of the GaussianSplat3d instance, usually generated via the state_dict() method.
- Returns:
gaussian_splat (GaussianSplat3d) – An instance of GaussianSplat3d initialized with the provided state dictionary.
- classmethod from_tensors(means: Tensor, quats: Tensor, log_scales: Tensor, logit_opacities: Tensor, sh0: Tensor, shN: Tensor, accumulate_mean_2d_gradients: bool = False, accumulate_max_2d_radii: bool = False, detach: bool = False) GaussianSplat3d[source]
Create a new GaussianSplat3d from the provided tensors. This constructs a new Gaussian splat radiance field with the specified means, orientations, scales, opacities, and spherical harmonics coefficients.
Note
The GaussianSplat3d stores the log of the scales (log_scales) rather than the scales directly. This ensures numerical stability, especially when optimizing the scales, since the covariance of each Gaussian is defined as \(R(q)^T S R(q)\), where \(R(q)\) is the rotation matrix defined by the unit quaternion of the Gaussian, and \(S = diag(\exp(log\_scales))\).
Note
The GaussianSplat3d stores the logit of opacities (logit_opacities) rather than the opacities directly. The actual opacities are obtained by applying the sigmoid function to the logit opacities. This ensures opacities are always in the range [0, 1] and improves numerical stability during optimization.
- Parameters:
means (torch.Tensor) – Tensor of shape (N, 3) representing the means of the Gaussians, where N is the number of Gaussians.
quats (torch.Tensor) – Tensor of shape (N, 4) representing the quaternions (orientations) of the Gaussians, where N is the number of Gaussians.
log_scales (torch.Tensor) – Tensor of shape (N, 3) representing the log scales of the Gaussians, where N is the number of Gaussians.
logit_opacities (torch.Tensor) – Tensor of shape (N,) representing the logit opacities of the Gaussians, where N is the number of Gaussians.
sh0 (torch.Tensor) – Tensor of shape (N, 1, D) representing the diffuse SH coefficients where D is the number of channels (see num_channels).
shN (torch.Tensor) – Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).
accumulate_mean_2d_gradients (bool, optional) – If True, tracks the average norm of the gradient of projected means for each Gaussian during the backward pass of projection. This is useful for some optimization techniques, such as the one in the original paper. Defaults to False.
accumulate_max_2d_radii (bool, optional) – If True, tracks the maximum 2D radii for each Gaussian during the backward pass of projection. This is useful for some optimization techniques, such as the one in the original paper. Defaults to False.
detach (bool, optional) – If True, creates copies of the input tensors and detaches them from the computation graph. Defaults to False.
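The two reparameterizations described in the notes above can be illustrated with scalar values. This is a minimal sketch using only the standard library; the helper names `logit` and `sigmoid` are defined here for illustration and are not fvdb functions.

```python
import math


def logit(p):
    """Inverse of the sigmoid: maps an opacity in (0, 1) to an unconstrained real."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Maps any real number back into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Opacity round trip: store the logit, recover the opacity with a sigmoid.
opacity = 0.8
logit_opacity = logit(opacity)
recovered_opacity = sigmoid(logit_opacity)

# Scale round trip: store the log, recover the scale with an exponential.
scale = 0.05
log_scale = math.log(scale)
recovered_scale = math.exp(log_scale)

# An optimizer may push the stored values anywhere on the real line;
# the recovered quantities stay valid (opacity in range, scale positive).
very_negative_opacity = sigmoid(-50.0)
very_negative_scale = math.exp(-100.0)
```

Because the stored parameters are unconstrained, gradient steps never need clamping to keep opacities in range or scales positive.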
- property log_scales: Tensor
Returns the log of the scales for each Gaussian. Gaussians are represented in 3D space, as ellipsoids defined by their means, orientations (quaternions), and scales. i.e.
\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).
Note
The GaussianSplat3d stores the log of the scales (log_scales) rather than the scales directly. This ensures numerical stability, especially when optimizing the scales. To read the scales directly, see the scales property (which is read-only).
- Returns:
log_scales (torch.Tensor) – A tensor of shape (N, 3) where N is the number of Gaussians (see num_gaussians). Each row represents the log of the scale of a Gaussian in 3D space.
- property logit_opacities: Tensor
Return the logit (inverse of sigmoid) of the opacities of each Gaussian in the scene.
Note
The GaussianSplat3d stores the logit of opacities (logit_opacities) rather than the opacities directly. The actual opacities are obtained by applying the sigmoid function to the logit opacities. To read the opacities directly, see the opacities property (which is read-only).
- Returns:
logit_opacities (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the logit of the opacity of a Gaussian in 3D space.
- property means: Tensor
Return the means (3D positions) of the Gaussians in this GaussianSplat3d. The means represent the center of each Gaussian in 3D space, i.e., each Gaussian \(g_i\) is defined as:
\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]
where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).
- Returns:
torch.Tensor – A tensor of shape (N, 3) where N is the number of Gaussians (see num_gaussians). Each row represents the mean of a Gaussian in 3D space.
- property num_channels: int
Returns the number of channels in the Gaussian splatting representation. For example, if you are rendering RGB images, this method will return 3.
- Returns:
num_channels (int) – The number of channels.
- property num_gaussians: int
Returns the number of Gaussians in the Gaussian splatting representation. This is the total number of individual gaussian splats that are being used to represent the scene.
- Returns:
num_gaussians (int) – The number of Gaussians.
- property num_sh_bases: int
Returns the number of spherical harmonics (SH) bases used in the Gaussian splatting representation.
Note
The number of SH bases is related to the SH degree (see sh_degree) by the formula \(K = (sh\_degree + 1)^2\), where \(K\) is the number of spherical harmonics bases.
- Returns:
num_sh_bases (int) – The number of spherical harmonics bases.
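As a quick arithmetic check of the formula above, the sketch below computes \(K\) for a given degree and the resulting shapes of the sh0 and shN tensors described elsewhere on this page. Both function names are hypothetical helpers written for this example.

```python
def num_sh_bases(sh_degree: int) -> int:
    """K = (sh_degree + 1)^2."""
    return (sh_degree + 1) ** 2


def sh_coefficient_shapes(num_gaussians: int, sh_degree: int, num_channels: int):
    """Shapes of the diffuse (sh0) and directionally varying (shN) coefficients."""
    k = num_sh_bases(sh_degree)
    sh0_shape = (num_gaussians, 1, num_channels)
    shn_shape = (num_gaussians, k - 1, num_channels)
    return sh0_shape, shn_shape


# 1000 Gaussians, degree-3 spherical harmonics, RGB channels.
sh0_shape, shn_shape = sh_coefficient_shapes(1000, 3, 3)
```

For degree 3 this gives 16 bases in total: one diffuse base in sh0 and 15 directional bases in shN.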
- property opacities: Tensor
Returns the opacities of the Gaussians in the Gaussian splatting representation. The opacities encode the visibility of each Gaussian in the scene.
Note
This property is read-only. GaussianSplat3d stores the logit (inverse of sigmoid) of the opacities to ensure numerical stability, which you can modify. See logit_opacities.
- Returns:
opacities (torch.Tensor) – A tensor of shape (N,) where N is the number of Gaussians (see num_gaussians). Each element represents the opacity of a Gaussian.
- project_gaussians_for_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]
Projects this GaussianSplat3d onto one or more image planes for rendering depth images in those planes. You can render depth images from the projected Gaussians by calling render_projected_gaussians().
Note
The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.
Note
All images being rendered must have the same width and height.
See also
fvdb.ProjectedGaussianSplats for the projected Gaussians representation.
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Project the Gaussians for rendering depth images onto C image planes
    projected_gaussians = gaussian_splat_3d.project_gaussians_for_depths(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the C images
        image_height,              # height of the C images
        near,                      # near clipping plane
        far)                       # far clipping plane

    # Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
    # in each image plane.
    # Returns a tensor of shape [C, 100, 100, 1] containing the depth images,
    # and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
    # of each pixel.
    cropped_depth_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
        projected_gaussians,
        crop_width=100,
        crop_height=100,
        crop_origin_w=10,
        crop_origin_h=10)

    # To get the depth images, divide the last channel by the alpha values
    true_depths_1 = cropped_depth_images_1[..., -1:] / cropped_alphas
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note that all images must have the same width.
image_height (int) – The height of the images to be rendered. Note that all images must have the same height.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is ProjectionType.PERSPECTIVE.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering depth images or further processing.
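The final step of the example above divides the rendered depth by the accumulated alpha. Pixels that no Gaussian covers have an alpha of zero, so a guarded division may be useful. The sketch below shows one possible guard on plain lists of per-pixel values; the function name, the `eps` threshold, and the choice of 0.0 for empty pixels are all assumptions made for illustration.

```python
def normalize_depths(accumulated_depths, alphas, eps=1e-10):
    """Divide alpha-weighted depths by alpha, pixel by pixel.

    Pixels whose alpha is not above eps are assigned a depth of 0.0
    (a hypothetical convention for "nothing rendered here").
    """
    return [d / a if a > eps else 0.0
            for d, a in zip(accumulated_depths, alphas)]


# Three toy pixels: half covered, empty, and three-quarters covered.
rendered = [1.0, 0.0, 1.5]
alphas = [0.5, 0.0, 0.75]
depths = normalize_depths(rendered, alphas)
```

With tensors, the same guard can be expressed by clamping the alpha values before dividing.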
- project_gaussians_for_images(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]
Projects this GaussianSplat3d onto one or more image planes for rendering multi-channel (see num_channels) images in those planes. You can render images from the projected Gaussians by calling render_projected_gaussians().
Note
The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.
Note
All images being rendered must have the same width and height.
See also
fvdb.ProjectedGaussianSplats for the projected Gaussians representation.
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Project the Gaussians for rendering images onto C image planes
    projected_gaussians = gaussian_splat_3d.project_gaussians_for_images(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the C images
        image_height,              # height of the C images
        near,                      # near clipping plane
        far)                       # far clipping plane

    # Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
    # in each image plane.
    # Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels),
    # and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
    # of each pixel.
    cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
        projected_gaussians,
        crop_width=100,
        crop_height=100,
        crop_origin_w=10,
        crop_origin_h=10)
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note that all images must have the same width.
image_height (int) – The height of the images to be rendered. Note that all images must have the same height.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is ProjectionType.PERSPECTIVE.
sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering images or further processing.
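The projection_matrices parameter is described only as mapping camera-space points to homogeneous pixel coordinates. As a rough illustration, the sketch below assumes a standard pinhole intrinsics matrix for the perspective case. The helper names make_projection_matrix and project_point are hypothetical and not part of fvdb, and nested lists stand in for one (3, 3) slice of the tensor.

```python
def make_projection_matrix(fx, fy, cx, cy):
    """Build a 3x3 pinhole intrinsics matrix (assumed layout) as nested lists."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def project_point(K, point_cam):
    """Project a camera-space point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point_cam
    # Homogeneous pixel coordinates: K @ [x, y, z]^T
    u_h = K[0][0] * x + K[0][1] * y + K[0][2] * z
    v_h = K[1][0] * x + K[1][1] * y + K[1][2] * z
    w_h = K[2][0] * x + K[2][1] * y + K[2][2] * z
    # Perspective divide turns homogeneous coordinates into pixel coordinates
    return (u_h / w_h, v_h / w_h)


K = make_projection_matrix(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(project_point(K, (0.0, 0.0, 2.0)))  # a point on the optical axis
```

Under this assumption, a point on the optical axis lands on the principal point (cx, cy).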
- project_gaussians_for_images_and_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) ProjectedGaussianSplats[source]
Projects this GaussianSplat3d onto one or more image planes for rendering multi-channel (see num_channels) images with depths in the last channel. You can render images+depths from the projected Gaussians by calling render_from_projected_gaussians().
Note
The reason to have a separate projection and rendering step is to enable rendering crops of an image without having to project the Gaussians again.
Note
All images being rendered must have the same width and height.
See also
fvdb.ProjectedGaussianSplats for the projected Gaussians representation.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Project the Gaussians for rendering images onto C image planes
    projected_gaussians = gaussian_splat_3d.project_gaussians_for_images_and_depths(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the C images
        image_height,              # height of the C images
        near,                      # near clipping plane
        far)                       # far clipping plane

    # Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
    # in each image plane.
    # Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels + 1 for depth),
    # and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
    # of each pixel.
    cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
        projected_gaussians,
        crop_width=100,
        crop_height=100,
        crop_origin_w=10,
        crop_origin_h=10)

    cropped_images = cropped_images_1[..., :-1]  # Extract image channels
    # Extract the depth channel and divide by alpha to get the final true depth values
    cropped_depths = cropped_images_1[..., -1:] / cropped_alphas
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note that all images must have the same width.
image_height (int) – The height of the images to be rendered. Note that all images must have the same height.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
projected_gaussians (ProjectedGaussianSplats) – An instance of ProjectedGaussianSplats containing the projected Gaussians. This object contains the projected 2D representations of the Gaussians, which can be used for rendering images or further processing.
- property quats: Tensor
Returns the unit quaternions representing the orientation of the covariance of the Gaussians in this GaussianSplat3d, i.e., each Gaussian \(g_i\) is defined as:
\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]
where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).
- Returns:
quats (torch.Tensor) – A tensor of shape (N, 4) where N is the number of Gaussians (see num_gaussians). Each row represents the unit quaternion of a Gaussian in 3D space.
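To make the covariance formula above concrete, here is a minimal pure-Python sketch for a single Gaussian. It assumes the quaternion is stored in (w, x, y, z) order, which the text does not state, and the helper names quat_to_rotation and covariance are hypothetical, not fvdb functions.

```python
import math


def quat_to_rotation(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z); the ordering is an assumption."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(q, log_scales):
    """Sigma = R^T S R with S = diag(exp(log_scales)), following the formula in the text."""
    R = quat_to_rotation(q)
    s = [math.exp(v) for v in log_scales]
    # (R^T S R)[i][j] = sum_k R[k][i] * s[k] * R[k][j]
    return [[sum(R[k][i] * s[k] * R[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# The identity quaternion with zero log-scales gives the identity covariance.
print(covariance((1.0, 0.0, 0.0, 0.0), [0.0, 0.0, 0.0]))
```

Because S is diagonal, the result is symmetric for any unit quaternion.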
- render_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.3, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
Render C depth maps from this GaussianSplat3d from C camera views.
Note
All depth maps being rendered must have the same width and height.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Render depth maps from C camera views
    # depth_images is a tensor of shape [C, H, W, 1]
    # alpha_images is a tensor of shape [C, H, W, 1]
    depth_images, alpha_images = gaussian_splat_3d.render_depths(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the depth maps
        image_height,              # height of the depth maps
        near,                      # near clipping plane
        far)                       # far clipping plane

    true_depths = depth_images / alpha_images  # Get true depth values by dividing by alpha
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the depth maps to be rendered. Note these are the same for all depth maps being rendered.
image_height (int) – The height of the depth maps to be rendered. Note these are the same for all depth maps being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
depth_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the depth maps, and W is the width of the depth maps. Each element represents the depth value at that pixel in the depth map.
alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and 1 means the pixel is fully opaque.
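The example above recovers true depth by dividing the rendered depth by alpha. Since alpha can be 0 for fully transparent pixels, a guard against division by zero may be useful. The sketch below is an illustration only: the zero-alpha guard is an addition not described in the source, normalize_depths is a hypothetical helper, and flat Python lists stand in for the (C, H, W, 1) tensors.

```python
def normalize_depths(accumulated_depths, alphas, eps=1e-8):
    """Divide accumulated depth by alpha per pixel; return 0.0 where alpha is (near) zero."""
    return [d / a if a > eps else 0.0
            for d, a in zip(accumulated_depths, alphas)]


# Three pixels: half-covered, fully transparent, and nearly opaque.
print(normalize_depths([1.0, 0.0, 3.0], [0.5, 0.0, 1.0]))
```

Which fill value is appropriate for transparent pixels (0.0 here) depends on the application.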
- render_from_projected_gaussians(projected_gaussians: ProjectedGaussianSplats, crop_width: int = -1, crop_height: int = -1, crop_origin_w: int = -1, crop_origin_h: int = -1, tile_size: int = 16) tuple[Tensor, Tensor][source]
Render a set of images from Gaussian splats that have already been projected onto image planes (see, for example, project_gaussians_for_images()). This method is useful when you want to render images from pre-computed projected Gaussians, for example, when rendering crops of images without having to re-project the Gaussians.
Note
If you want to render the full image, pass negative values for crop_width, crop_height, crop_origin_w, and crop_origin_h (default behavior). To render full images, all these values must be negative or this method will raise an error.
Note
If your crop goes beyond the image boundaries, the resulting image will be clipped to be within the image boundaries.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Project the Gaussians for rendering images onto C image planes
    projected_gaussians = gaussian_splat_3d.project_gaussians_for_images_and_depths(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the C images
        image_height,              # height of the C images
        near,                      # near clipping plane
        far)                       # far clipping plane

    # Now render a crop of size 100x100 starting at (10, 10) from the projected Gaussians
    # in each image plane.
    # Returns a tensor of shape [C, 100, 100, D] containing the images (where D is num_channels + 1 for depth),
    # and a tensor of shape [C, 100, 100, 1] containing the final alpha (opacity) values
    # of each pixel.
    cropped_images_1, cropped_alphas = gaussian_splat_3d.render_from_projected_gaussians(
        projected_gaussians,
        crop_width=100,
        crop_height=100,
        crop_origin_w=10,
        crop_origin_h=10)

    cropped_images = cropped_images_1[..., :-1]  # Extract image channels
    # Extract the depth channel and divide by alpha to get the final true depth values
    cropped_depths = cropped_images_1[..., -1:] / cropped_alphas
- Parameters:
projected_gaussians (ProjectedGaussianSplats) – An instance of fvdb.ProjectedGaussianSplats containing the projected Gaussians after spherical harmonic evaluation. This object should have been created by calling project_gaussians_for_images(), project_gaussians_for_depths(), project_gaussians_for_images_and_depths(), etc.
crop_width (int) – The width of the crop to render. If -1, the full image width is used. Default is -1.
crop_height (int) – The height of the crop to render. If -1, the full image height is used. Default is -1.
crop_origin_w (int) – The x-coordinate of the top-left corner of the crop. If -1, the crop starts at (0, 0). Default is -1.
crop_origin_h (int) – The y-coordinate of the top-left corner of the crop. If -1, the crop starts at (0, 0). Default is -1.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. This parameter controls the size of the tiles used for rendering the images. You shouldn’t set this parameter unless you really know what you are doing.
- Returns:
rendered_images (torch.Tensor) – A tensor of shape (C, H, W, D) where C is the number of image planes, H is the height of the rendered images, W is the width of the rendered images, and D is the number of channels (e.g., RGB, RGBD, etc.).
alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of cameras, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and 1 means the pixel is fully opaque.
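The crop rules in the notes above (all-negative values mean the full image, and crops are clipped to the image boundaries) can be sketched as a small helper. This is one plausible reading of those notes, not the library's implementation: resolve_crop is a hypothetical name, and raising ValueError for mixed negative and non-negative values is an assumption about how the documented error surfaces.

```python
def resolve_crop(image_width, image_height,
                 crop_width=-1, crop_height=-1,
                 crop_origin_w=-1, crop_origin_h=-1):
    """Return (origin_w, origin_h, width, height) of the region that would be rendered."""
    values = (crop_width, crop_height, crop_origin_w, crop_origin_h)
    if all(v < 0 for v in values):
        # Default behavior: render the full image
        return (0, 0, image_width, image_height)
    if any(v < 0 for v in values):
        raise ValueError("crop values must be all negative or all non-negative")
    # Clip the requested rectangle to the image boundaries
    x0 = min(crop_origin_w, image_width)
    y0 = min(crop_origin_h, image_height)
    x1 = min(crop_origin_w + crop_width, image_width)
    y1 = min(crop_origin_h + crop_height, image_height)
    return (x0, y0, x1 - x0, y1 - y0)


print(resolve_crop(640, 480, 100, 100, 600, 450))  # crop hanging over the bottom-right corner
```

Projecting once and calling the render method with several such crops is the use case that the separate projection step is meant to serve.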
- render_images(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
Render C multi-channel images (see num_channels) from this GaussianSplat3d from C camera views.
Note
All images being rendered must have the same width and height.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Render images from C camera views.
    # images is a tensor of shape [C, H, W, D] where D is the number of channels
    # alpha_images is a tensor of shape [C, H, W, 1]
    images, alpha_images = gaussian_splat_3d.render_images(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the images
        image_height,              # height of the images
        near,                      # near clipping plane
        far)                       # far clipping plane
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
images (torch.Tensor) – A tensor of shape (C, H, W, D) where C is the number of camera views, H is the height of the images, W is the width of the images, and D is the number of channels.
alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and 1 means the pixel is fully opaque.
- render_images_and_depths(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, sh_degree_to_use: int = -1, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
Render C multi-channel images (see num_channels) with depth as the last channel from this GaussianSplat3d from C camera views.
Note
All images being rendered must have the same width and height.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Render images with depth maps from C camera views.
    # images_and_depths is a tensor of shape [C, H, W, D + 1] where D is the number of channels
    # alpha_images is a tensor of shape [C, H, W, 1]
    images_and_depths, alpha_images = gaussian_splat_3d.render_images_and_depths(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the images
        image_height,              # height of the images
        near,                      # near clipping plane
        far)                       # far clipping plane

    images = images_and_depths[..., :-1]  # Extract image channels
    depths = images_and_depths[..., -1:] / alpha_images  # Extract depth channel by dividing by alpha
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
sh_degree_to_use (int) – The degree of spherical harmonics to use for rendering. -1 means use all available SH bases. 0 means use only the first SH base (constant color). Note that you can’t use more SH bases than available in the GaussianSplat3d instance. Default is -1.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
images (torch.Tensor) – A tensor of shape (C, H, W, D + 1) where C is the number of camera views, H is the height of the images, W is the width of the images, and D is the number of channels.
alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and 1 means the pixel is fully opaque.
- render_num_contributing_gaussians(world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
Renders C images where each pixel contains the number of contributing Gaussians for that pixel from C camera views.
Note
All images being rendered must have the same width and height.
Example:
    # Assume gaussian_splat_3d is an instance of GaussianSplat3d
    # Render the number of contributing Gaussians per pixel from C camera views.
    # num_gaussians is a tensor of shape [C, H, W, 1]
    # alpha_images is a tensor of shape [C, H, W, 1]
    num_gaussians, alpha_images = gaussian_splat_3d.render_num_contributing_gaussians(
        world_to_camera_matrices,  # tensor of shape [C, 4, 4]
        projection_matrices,       # tensor of shape [C, 3, 3]
        image_width,               # width of the images
        image_height,              # height of the images
        near,                      # near clipping plane
        far)                       # far clipping plane

    # Number of contributing Gaussians at pixel (i, j) in camera c
    num_gaussians_cij = num_gaussians[c, i, j, 0]
- Parameters:
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the number of contributing Gaussians at that pixel.
alpha_images (torch.Tensor) – A tensor of shape (C, H, W, 1) where C is the number of camera views, H is the height of the images, and W is the width of the images. Each element represents the alpha value (opacity) at a pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and 1 means the pixel is fully opaque.
- render_top_contributing_gaussian_ids(num_samples: int, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
Renders the ids of the top num_samples contributing Gaussians in C camera views, i.e., the ids of the most opaque Gaussians contributing to each pixel in each image.
Note
If there are fewer than num_samples Gaussians contributing to a pixel, the remaining ids will be set to -1, and their corresponding weights will be set to 0.0.
- Parameters:
num_samples (int) – The number of top contributing Gaussians to return for each pixel.
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
top_contributing_gaussian_ids (torch.Tensor) – An int64 tensor of shape (C, H, W, num_samples) where C is the number of cameras, H is the height of the images, W is the width of the images, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the id of a Gaussian that contributes to the pixel.
weights (torch.Tensor) – A tensor of shape (C, H, W, num_samples) where C is the number of cameras, H is the height of the images, W is the width of the images, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the transmittance-weighted opacity of the Gaussian that contributes to the pixel (i.e., its proportion of the visible contribution to the pixel).
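One way to consume the returned ids and weights is to blend a per-Gaussian attribute at a single pixel, taking care to skip the -1 padding entries described in the note above. The sketch below is illustrative only: blend_pixel and the features table are hypothetical, and plain lists stand in for one pixel's slice of the tensors.

```python
def blend_pixel(ids, weights, features):
    """Weighted sum of per-Gaussian features for one pixel, skipping padded (-1) ids."""
    num_channels = len(features[0])
    out = [0.0] * num_channels
    for gid, w in zip(ids, weights):
        if gid < 0:
            # Padding entry: fewer than num_samples Gaussians contributed to this pixel
            continue
        for ch in range(num_channels):
            out[ch] += w * features[gid][ch]
    return out


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per Gaussian id
print(blend_pixel([2, 0, -1], [0.5, 0.25, 0.0], features))
```

Checking for negative ids explicitly matters because an index of -1 would otherwise silently select the last row in both Python lists and tensors.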
- property requires_grad: bool
Returns whether the tensors tracked by this GaussianSplat3d instance are set to require gradients. This is typically set to True if you want to optimize the parameters of the Gaussians.
Example:
    gsplat3d = GaussianSplat3d(...)  # Some GaussianSplat3d instance
    gsplat3d.requires_grad = True  # Enable gradient tracking for optimization
    assert gsplat3d.means.requires_grad  # Now the means will require gradients
    assert gsplat3d.covariances.requires_grad  # Now the covariances will require gradients
    assert gsplat3d.logit_opacities.requires_grad  # Now the logit opacities will require gradients
    assert gsplat3d.log_scales.requires_grad  # Now the log scales will require gradients
    assert gsplat3d.sh0.requires_grad  # Now the diffuse SH coefficients will require gradients
    assert gsplat3d.shN.requires_grad  # Now the directionally varying SH coefficients will require gradients
- Returns:
requires_grad (bool) – True if gradients are required, False otherwise.
- reset_accumulated_gradient_state() None[source]
Reset the accumulated projected gradients of the means if accumulate_mean_2d_gradients is True, and the accumulated max 2D radii if accumulate_max_2d_radii is True.
The values of accumulated_projected_mean_2d_gradients, accumulated_max_2d_radii, and accumulated_gradient_step_counts will be zeroed out after this call.
See also
accumulate_mean_2d_gradients() and accumulate_max_2d_radii(), which control whether these values are accumulated during rendering and backward passes.
See also
accumulated_mean_2d_gradient_norms, accumulated_max_2d_radii, and accumulated_gradient_step_counts for the actual accumulated state being reset.
- save_ply(filename: Path | str, metadata: Mapping[str, str | int | float | Tensor] | None = None) None[source]
Save this GaussianSplat3d to a PLY file, including any metadata provided.
- Parameters:
filename (pathlib.Path | str) – The path to the PLY file to save.
metadata (dict[str, str | int | float | torch.Tensor] | None) – An optional dictionary of metadata where the keys are strings and the values are either strings, ints, floats, or tensors. Defaults to None.
- property scales: Tensor
Returns the scales of the Gaussians in the Gaussian splatting representation. The scales are the eigenvalues of the covariance matrix of each Gaussian, i.e., each Gaussian \(g_i\) is defined as:
\[g_i(x) = \exp(-0.5 (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i))\]
where \(\mu_i\) is the mean and \(\Sigma_i = R(q_i)^T S_i R(q_i)\) is the covariance of the i-th Gaussian with \(R(q_i)\) being the rotation matrix defined by the unit quaternion \(q_i\) of the Gaussian, and \(S_i = diag(\exp(log\_scales_i))\).
Note
This property is read-only. GaussianSplat3d stores the log of the scales for numerical stability; to change the scales, modify log_scales.
- Returns:
scales (torch.Tensor) – A tensor of shape (N, 3) where N is the number of Gaussians. Each row represents the scale of a Gaussian in 3D space.
- set_state(means: Tensor, quats: Tensor, log_scales: Tensor, logit_opacities: Tensor, sh0: Tensor, shN: Tensor) None[source]
Set the underlying tensors managed by this GaussianSplat3d instance.
Note: If accumulate_mean_2d_gradients and/or accumulate_max_2d_radii are True, this method will reset the gradient state (see reset_accumulated_gradient_state()).
- Parameters:
means (torch.Tensor) – Tensor of shape (N, 3) representing the means of the Gaussians. N is the number of Gaussians (see num_gaussians).
quats (torch.Tensor) – Tensor of shape (N, 4) representing the quaternions of the Gaussians. N is the number of Gaussians (see num_gaussians).
log_scales (torch.Tensor) – Tensor of shape (N, 3) representing the log scales of the Gaussians. N is the number of Gaussians (see num_gaussians).
logit_opacities (torch.Tensor) – Tensor of shape (N,) representing the logit opacities of the Gaussians. N is the number of Gaussians (see num_gaussians).
sh0 (torch.Tensor) – Tensor of shape (N, 1, D) representing the diffuse SH coefficients where N is the number of Gaussians (see num_gaussians), and D is the number of channels (see num_channels).
shN (torch.Tensor) – Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients where N is the number of Gaussians (see num_gaussians), D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases).
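The shapes listed above can be collected into a small lookup, which may help when checking tensors before calling set_state. This is only a sketch: expected_state_shapes is a hypothetical helper, not part of fvdb, and it returns plain tuples rather than inspecting real tensors.

```python
def expected_state_shapes(num_gaussians, num_sh_bases, num_channels):
    """Map each set_state argument name to the shape documented above."""
    N, K, D = num_gaussians, num_sh_bases, num_channels
    return {
        "means": (N, 3),
        "quats": (N, 4),
        "log_scales": (N, 3),
        "logit_opacities": (N,),
        "sh0": (N, 1, D),       # diffuse SH coefficients
        "shN": (N, K - 1, D),   # directionally varying SH coefficients
    }


# 10 Gaussians, 16 SH bases, 3 channels (e.g. RGB)
print(expected_state_shapes(10, 16, 3)["shN"])
```

With real tensors, a comparison such as tuple(t.shape) == expected[name] would apply the same check.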
- property sh0: Tensor
Returns the diffuse spherical harmonics coefficients of the Gaussians in this GaussianSplat3d. These coefficients are used to represent the diffuse color/feature of each Gaussian.
- Returns:
sh0 (torch.Tensor) – A tensor of shape (N, 1, D) where N is the number of Gaussians (see num_gaussians), and D is the number of channels (see num_channels). Each row represents the diffuse SH coefficients for a Gaussian.
- property shN: Tensor
Returns the directionally varying spherical harmonics coefficients of the Gaussians in the scene. These coefficients are used to represent a direction dependent color/feature of each Gaussian.
- Returns:
torch.Tensor – A tensor of shape (N, K-1, D) where N is the number of Gaussians (see num_gaussians), D is the number of channels (see num_channels), and K is the number of spherical harmonic bases (see num_sh_bases). Each row represents the directionally varying SH coefficients for a Gaussian.
- property sh_degree: int
Returns the degree of the spherical harmonics used in the Gaussian splatting representation. This value is 0 for diffuse SH coefficients and >= 1 for directionally varying SH coefficients.
Note
This is not the same as the number of spherical harmonics bases (see num_sh_bases). The relationship between the degree and the number of bases is given by \(K = (sh\_degree + 1)^2\), where \(K\) is the number of spherical harmonics bases.
- Returns:
sh_degree (int) – The degree of the spherical harmonics.
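The relationship between the degree, the number of bases, and the shapes of sh0 and shN can be checked with a few lines of plain Python. This is a minimal sketch that only restates the formulas above; the helper names num_sh_bases_for_degree and sh_coefficient_shapes are hypothetical and are not part of fvdb.

```python
def num_sh_bases_for_degree(sh_degree: int) -> int:
    """Number of SH bases K for a given degree: K = (sh_degree + 1)^2."""
    if sh_degree < 0:
        raise ValueError("sh_degree must be non-negative")
    return (sh_degree + 1) ** 2


def sh_coefficient_shapes(num_gaussians: int, sh_degree: int, num_channels: int):
    """Expected shapes of sh0 and shN as documented: (N, 1, D) and (N, K-1, D)."""
    k = num_sh_bases_for_degree(sh_degree)
    sh0_shape = (num_gaussians, 1, num_channels)
    shn_shape = (num_gaussians, k - 1, num_channels)
    return sh0_shape, shn_shape


# Degree 3 with RGB features: 16 bases, 1 diffuse + 15 directional per Gaussian.
print(num_sh_bases_for_degree(3))
print(sh_coefficient_shapes(num_gaussians=10, sh_degree=3, num_channels=3))
```

With sh_degree equal to 0 there is a single basis, so shN has a second dimension of size 0 and only the diffuse term remains.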
- sparse_render_num_contributing_gaussians(pixels_to_render: Tensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
- sparse_render_num_contributing_gaussians(pixels_to_render: JaggedTensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[JaggedTensor, JaggedTensor]
Renders the number of Gaussians which contribute to each pixel specified in the input.
See also
render_num_contributing_gaussians() for rendering dense images of contributing Gaussians.
- Parameters:
pixels_to_render (torch.Tensor | JaggedTensor) – A torch.Tensor of shape (C, R, 2) or a fvdb.JaggedTensor of shape (C, R_c, 2) representing the pixels to render for each camera, where C is the number of camera views and R/R_c is the number of pixels to render per camera. Each value is an (x, y) pixel coordinate.
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
num_contributing_gaussians (torch.Tensor | JaggedTensor) – A tensor of shape (C, R) (if this method was called with pixels_to_render as a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c) (if this method was called with pixels_to_render as a fvdb.JaggedTensor) where C is the number of cameras, and R/R_c is the number of pixels to render per camera. Each element represents the number of contributing Gaussians at that pixel.
alphas (torch.Tensor | JaggedTensor) – A tensor of shape (C, R) (if this method was called with pixels_to_render as a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c) (if this method was called with pixels_to_render as a fvdb.JaggedTensor) where C is the number of cameras, and R/R_c is the number of pixels to render per camera. Each element represents the alpha value (opacity) at that pixel such that 0 <= alpha < 1, where 0 means the pixel is fully transparent and values approaching 1 mean the pixel is nearly opaque.
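To make the role of projection_matrices concrete, the following sketch builds one 3x3 matrix and applies it to a camera-space point. It assumes a standard pinhole intrinsics layout (focal lengths fx, fy and principal point cx, cy), which is an assumption for illustration and not something this page states; the helper names are hypothetical and not part of fvdb.

```python
def pinhole_projection_matrix(fx: float, fy: float, cx: float, cy: float):
    """A 3x3 matrix mapping camera-space points to homogeneous pixel coordinates.

    Assumes the common pinhole layout; fvdb itself only requires shape (C, 3, 3).
    """
    return [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]


def project_to_pixel(matrix, point_camera):
    """Multiply the matrix by a camera-space point and divide by the last coordinate."""
    x, y, z = point_camera
    u_h = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2] * z
    v_h = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2] * z
    w_h = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2] * z
    if w_h == 0.0:
        raise ValueError("point lies on the camera plane and cannot be projected")
    return (u_h / w_h, v_h / w_h)


K = pinhole_projection_matrix(fx=100.0, fy=100.0, cx=32.0, cy=24.0)
# A point on the optical axis lands on the principal point.
print(project_to_pixel(K, (0.0, 0.0, 2.0)))
# An off-axis point is shifted in proportion to focal length over depth.
print(project_to_pixel(K, (1.0, 0.5, 2.0)))
```

The resulting (x, y) values are in the same pixel coordinate convention as the entries of pixels_to_render, under the stated pinhole assumption.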
- sparse_render_top_contributing_gaussian_ids(num_samples: int, pixels_to_render: Tensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[Tensor, Tensor][source]
- sparse_render_top_contributing_gaussian_ids(num_samples: int, pixels_to_render: JaggedTensor, world_to_camera_matrices: Tensor, projection_matrices: Tensor, image_width: int, image_height: int, near: float, far: float, projection_type=ProjectionType.PERSPECTIVE, tile_size: int = 16, min_radius_2d: float = 0.0, eps_2d: float = 0.3, antialias: bool = False) tuple[JaggedTensor, JaggedTensor]
Renders the ids of the top num_samples contributing Gaussians in the specified set of pixels across C camera views, i.e., the ids of the most opaque Gaussians contributing to each pixel in each image.
Note
If there are fewer than num_samples Gaussians contributing to a pixel, the remaining ids will be set to -1, and their corresponding weights will be set to 0.0.
- Parameters:
num_samples (int) – The number of top contributing Gaussians to return for each pixel.
pixels_to_render (torch.Tensor | JaggedTensor) – A torch.Tensor of shape (C, R, 2) or a fvdb.JaggedTensor of shape (C, R_c, 2) representing the pixels to render for each camera, where C is the number of camera views and R/R_c is the number of pixels to render per camera. Each value is an (x, y) pixel coordinate.
world_to_camera_matrices (torch.Tensor) – Tensor of shape (C, 4, 4) representing the world-to-camera transformation matrices for C cameras. Each matrix transforms points from world coordinates to camera coordinates.
projection_matrices (torch.Tensor) – Tensor of shape (C, 3, 3) representing the projection matrices for C cameras. Each matrix projects points in camera space into homogeneous pixel coordinates.
image_width (int) – The width of the images to be rendered. Note these are the same for all images being rendered.
image_height (int) – The height of the images to be rendered. Note these are the same for all images being rendered.
near (float) – The near clipping plane distance for the projection.
far (float) – The far clipping plane distance for the projection.
projection_type (ProjectionType) – The type of projection to use. Default is fvdb.ProjectionType.PERSPECTIVE.
tile_size (int) – The size of the tiles to use for rendering. Default is 16. You shouldn’t set this parameter unless you really know what you are doing.
min_radius_2d (float) – The minimum radius (in pixels) below which Gaussians are ignored during rendering.
eps_2d (float) – A value used to pad Gaussians when projecting them onto the image plane, to avoid very small projected Gaussians, which create artifacts and numerical issues.
antialias (bool) – If True, applies opacity correction to the projected Gaussians when using eps_2d > 0.0.
- Returns:
top_contributing_gaussian_ids (torch.Tensor | JaggedTensor) – A long tensor of shape (C, R, num_samples) (if pixels_to_render was a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c, num_samples) (if pixels_to_render was a fvdb.JaggedTensor), where C is the number of cameras, R/R_c is the number of pixels being rendered per image, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the id of a Gaussian that contributes to the pixel.
weights (torch.Tensor | JaggedTensor) – A tensor of shape (C, R, num_samples) (if pixels_to_render was a torch.Tensor) or a fvdb.JaggedTensor of shape (C, R_c, num_samples) (if pixels_to_render was a fvdb.JaggedTensor), where C is the number of cameras, R/R_c is the number of pixels being rendered per image, and num_samples is the number of top contributing Gaussians to return for each pixel. Each element represents the transmittance-weighted opacity of the Gaussian that contributes to the pixel (i.e. its proportion of the visible contribution to the pixel).
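The weights and the -1 / 0.0 padding described above can be illustrated for a single pixel in plain Python. This is a sketch under two assumptions that this page does not confirm: that the transmittance-weighted opacity follows standard front-to-back alpha compositing, w_i = alpha_i * prod_{j<i} (1 - alpha_j), and that the top entries are ranked by that weight. The function names are hypothetical and are not fvdb APIs.

```python
def transmittance_weights(alphas):
    """Front-to-back compositing weights for Gaussians sorted near to far.

    Assumed model: w_i = alpha_i * product over j < i of (1 - alpha_j).
    """
    weights = []
    transmittance = 1.0  # fraction of light not yet absorbed
    for alpha in alphas:
        weights.append(alpha * transmittance)
        transmittance *= 1.0 - alpha
    return weights


def top_contributors(ids, alphas, num_samples):
    """Return (ids, weights) of length num_samples, padded with -1 and 0.0."""
    weights = transmittance_weights(alphas)
    # Assumption: rank by weight, largest first.
    ranked = sorted(zip(ids, weights), key=lambda pair: pair[1], reverse=True)
    ranked = ranked[:num_samples]
    # Pad when fewer than num_samples Gaussians contribute, as the note describes.
    ranked += [(-1, 0.0)] * (num_samples - len(ranked))
    return [i for i, _ in ranked], [w for _, w in ranked]


# Two Gaussians cover this pixel, but three samples are requested.
print(top_contributors(ids=[7, 3], alphas=[0.5, 0.5], num_samples=3))
```

When consuming real outputs, entries whose id is -1 can be masked out before indexing into per-Gaussian attributes.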
- state_dict() dict[str, Tensor][source]
Return a dictionary containing the state of the GaussianSplat3d instance. This is useful for serializing the state of the object for saving or transferring.
A state dictionary always contains the following keys, where N denotes the number of Gaussians (see num_gaussians):
- 'means': Tensor of shape (N, 3) representing the means of the Gaussians.
- 'quats': Tensor of shape (N, 4) representing the quaternions of the Gaussians.
- 'log_scales': Tensor of shape (N, 3) representing the log scales of the Gaussians.
- 'logit_opacities': Tensor of shape (N,) representing the logit opacities of the Gaussians.
- 'sh0': Tensor of shape (N, 1, D) representing the diffuse SH coefficients, where D is the number of channels (see num_channels).
- 'shN': Tensor of shape (N, K-1, D) representing the directionally varying SH coefficients, where D is the number of channels (see num_channels) and K is the number of spherical harmonic bases (see num_sh_bases).
- 'accumulate_max_2d_radii': bool Tensor with a single element indicating whether to track the maximum 2D radii for gradients.
- 'accumulate_mean_2d_gradients': bool Tensor with a single element indicating whether to track the average norm of the gradient of projected means for each Gaussian.
It can also optionally contain the following keys if accumulate_mean_2d_gradients and/or accumulate_max_2d_radii are set to True:
- 'accumulated_gradient_step_counts': Tensor of shape (N,) representing the accumulated gradient step counts for each Gaussian.
- 'accumulated_max_2d_radii': Tensor of shape (N,) representing the maximum 2D projected radius for each Gaussian across every iteration of optimization.
- 'accumulated_mean_2d_gradient_norms': Tensor of shape (N,) representing the average norm of the gradient of projected means for each Gaussian across every iteration of optimization.
See also
from_state_dict() for constructing a GaussianSplat3d from a state dictionary.
- Returns:
state_dict (dict[str, torch.Tensor]) – A dictionary containing the state of the GaussianSplat3d instance.
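Before handing a loaded dictionary to from_state_dict(), it can be useful to check that it has the keys listed above. The following is a minimal sketch in plain Python that uses only the key names from this page; the helper check_state_dict_keys is hypothetical and is not part of fvdb, and it inspects key names only, not tensor shapes or dtypes.

```python
REQUIRED_KEYS = frozenset({
    "means",
    "quats",
    "log_scales",
    "logit_opacities",
    "sh0",
    "shN",
    "accumulate_max_2d_radii",
    "accumulate_mean_2d_gradients",
})

OPTIONAL_KEYS = frozenset({
    "accumulated_gradient_step_counts",
    "accumulated_max_2d_radii",
    "accumulated_mean_2d_gradient_norms",
})


def check_state_dict_keys(state):
    """Return (missing, unexpected) key names, each as a sorted list."""
    keys = set(state)
    missing = sorted(REQUIRED_KEYS - keys)
    unexpected = sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS)
    return missing, unexpected


# Placeholder values stand in for tensors; only the key names are checked.
example = dict.fromkeys(REQUIRED_KEYS)
example.pop("quats")
example["colors"] = None
print(check_state_dict_keys(example))
```

An empty pair of lists means every required key is present and no unknown keys were found.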
- to(dtype: dtype | None = None) GaussianSplat3d[source]
- to(device: str | device | None = None, dtype: dtype | None = None) GaussianSplat3d
- to(other: Tensor) GaussianSplat3d
- to(other: GaussianSplat3d) GaussianSplat3d
- to(other: Grid) GaussianSplat3d
- to(other: GridBatch) GaussianSplat3d
- to(other: JaggedTensor) GaussianSplat3d
Move the GaussianSplat3d instance to a different device, change its data type, or both.
- Parameters:
other (DeviceIdentifier | torch.Tensor | GaussianSplat3d | Grid | GridBatch | JaggedTensor) – The target torch.device, torch.Tensor, Grid, GridBatch, JaggedTensor, or GaussianSplat3d instance to which the GaussianSplat3d instance should be moved.
device (DeviceIdentifier, optional) – The target device to move the GaussianSplat3d instance to.
dtype (torch.dtype, optional) – The target data type for the GaussianSplat3d instance.
- Returns:
gaussian_splat_3d (GaussianSplat3d) – A new instance of GaussianSplat3d with the specified device and/or data type.