fvdb_reality_capture.transforms
- class fvdb_reality_capture.transforms.base_transform.BaseTransform
Base class for all transforms.
Transforms are used to modify an SfmScene before it is used for reconstruction or other processing. They can be used to filter images, adjust camera parameters, or perform other modifications to the scene.
Subclasses of BaseTransform must implement the following methods:
- abstractmethod __call__(input_scene: SfmScene) → SfmScene
Abstract method to apply the transform to the input scene and return the transformed scene.
- Parameters:
input_scene (SfmScene) – The input scene to transform.
- Returns:
output_scene (SfmScene) – The transformed scene.
- abstractmethod static from_state_dict(state_dict: dict[str, Any]) → BaseTransform
Abstract method to create a transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- Returns:
transform (BaseTransform) – An instance of the transform.
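To illustrate the interface, here is a minimal sketch of a hypothetical subclass. The SfmScene members used here (images, filter_images) are assumptions for illustration only; see the concrete transforms documented below for the full set of conventions (name(), state_dict(), and the version attribute).

    from typing import Any

    from fvdb_reality_capture.sfm_scene import SfmScene
    from fvdb_reality_capture.transforms.base_transform import BaseTransform

    class KeepEveryKthImage(BaseTransform):
        """Hypothetical transform that keeps every k-th posed image."""

        version = "1.0.0"

        def __init__(self, k: int = 2):
            self._k = k

        def __call__(self, input_scene: SfmScene) -> SfmScene:
            # Keep images 0, k, 2k, ... (filter_images is an assumed
            # SfmScene helper for selecting a subset of posed images).
            keep = list(range(0, len(input_scene.images), self._k))
            return input_scene.filter_images(keep)

        def state_dict(self) -> dict[str, Any]:
            return {"name": self.name(), "version": self.version, "k": self._k}

        @staticmethod
        def from_state_dict(state_dict: dict[str, Any]) -> "KeepEveryKthImage":
            return KeepEveryKthImage(k=state_dict["k"])

        @staticmethod
        def name() -> str:
            return "KeepEveryKthImage"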
- class fvdb_reality_capture.transforms.Compose(*transforms)
A BaseTransform that composes multiple transforms together in sequence. This is useful for encoding a sequence of transforms into a single object.
The transforms are applied in the order they are provided, allowing for complex data processing pipelines.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    scene_transform = transforms.Compose(
        transforms.NormalizeScene("pca"),
        transforms.DownsampleImages(4),
    )

    input_scene: SfmScene = ...  # Load or create an SfmScene
    transformed_scene: SfmScene = scene_transform(input_scene)

- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene which is the result of applying the composed transforms sequentially to the input scene.
- __init__(*transforms)
Initialize the Compose transform with a sequence of transforms.
- Parameters:
*transforms (tuple[BaseTransform, ...]) – A tuple of BaseTransform instances to compose.
- static from_state_dict(state_dict: dict[str, Any]) → Compose
Create a Compose transform from a state dictionary generated with state_dict().
- static name() → str
Return the name of the Compose transform, i.e. "Compose".
- Returns:
str – The name of the Compose transform, i.e. "Compose".
- state_dict() → dict[str, Any]
Return the state of the Compose transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
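Since state_dict() and from_state_dict() are documented to round-trip, a composed pipeline can be serialized and rebuilt later. A minimal sketch (the contents of the state dictionary are an implementation detail):

    from fvdb_reality_capture import transforms

    scene_transform = transforms.Compose(
        transforms.NormalizeScene("pca"),
        transforms.DownsampleImages(4),
    )

    # Serialize the pipeline to a plain dictionary...
    sd = scene_transform.state_dict()

    # ...and reconstruct an equivalent pipeline from it later.
    restored = transforms.Compose.from_state_dict(sd)
    assert restored.name() == "Compose"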
- class fvdb_reality_capture.transforms.CropScene(bbox: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size, mask_format: Literal['png', 'jpg', 'npy'] = 'png', composite_with_existing_masks: bool = True)
A BaseTransform which crops the input SfmScene points to lie within a specified bounding box. This transform additionally updates the scene's masks to nullify pixels whose rays do not intersect the bounding box.
Note
If the input scene already has masks, these new masks will be composited with the existing masks to ensure that pixels outside the cropped region are properly masked. This can be disabled by setting composite_with_existing_masks to False.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene
    import numpy as np

    # Bounding box in the format (min_x, min_y, min_z, max_x, max_y, max_z)
    scene_transform = transforms.CropScene(bbox=np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]))

    input_scene: SfmScene = ...  # Load or create an SfmScene

    # The transformed scene will have points only within the bounding box, and posed images
    # will have masks updated to nullify pixels corresponding to regions outside the cropped scene.
    transformed_scene: SfmScene = scene_transform(input_scene)

- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene with points cropped to lie within the bounding box specified at initialization, and with masks updated to nullify pixels whose rays do not intersect the bounding box.
- Parameters:
input_scene (SfmScene) – The scene to be cropped.
- Returns:
output_scene (SfmScene) – The cropped scene.
- __init__(bbox: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size, mask_format: Literal['png', 'jpg', 'npy'] = 'png', composite_with_existing_masks: bool = True)
Create a new CropScene transform with a bounding box.
- Parameters:
bbox (NumericMaxRank1) – A bounding box in the format (min_x, min_y, min_z, max_x, max_y, max_z).
mask_format (Literal["png", "jpg", "npy"]) – The format to save the masks in. Defaults to "png".
composite_with_existing_masks (bool) – Whether to composite the generated masks with existing masks for pixels corresponding to regions outside the cropped scene. If set to True, existing masks will be loaded and composited with the new mask. Defaults to True. The resulting composited mask allows a pixel to be valid only if it is valid in both the existing and the new mask.
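For example, to write the crop masks as .npy arrays without compositing them into any masks the scene already has (a usage sketch of the parameters above; with compositing disabled, the new crop masks are used as-is):

    import numpy as np
    from fvdb_reality_capture import transforms

    scene_transform = transforms.CropScene(
        bbox=np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]),
        mask_format="npy",
        composite_with_existing_masks=False,
    )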
- static from_state_dict(state_dict: dict) → CropScene
Create a CropScene transform from a state dictionary created with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (CropScene) – An instance of the CropScene transform.
- static name() → str
Return the name of the CropScene transform, i.e. "CropScene".
- Returns:
str – The name of the CropScene transform, i.e. "CropScene".
- state_dict() → dict
Return the state of the CropScene transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
- class fvdb_reality_capture.transforms.CropSceneToPoints(margin: float = 0.0, mask_format: Literal['png', 'jpg', 'npy'] = 'png', composite_with_existing_masks: bool = True)
A BaseTransform which crops the input SfmScene points to lie within the bounding box around its points plus or minus a padding margin. This transform additionally updates the scene's masks to nullify pixels whose rays do not intersect the bounding box.
Note
If the input scene already has masks, these new masks will be composited with the existing masks to ensure that pixels outside the cropped region are properly masked. This can be disabled by setting composite_with_existing_masks to False.
Note
You may want to use this over CropScene if you want the bounding box to depend on the input scene points rather than being fixed (e.g. if you don't know the bounding box ahead of time). This transform is also useful if you just want to apply conservative masking to the input scene based on its points.
Note
The margin is specified as a fraction of the bounding box size. For example, a margin of 0.1 will expand the bounding box by 10% (5% in each direction). So if the scene's bounding box is (0, 0, 0) to (1, 1, 1), a margin of 0.1 will result in a bounding box of (-0.05, -0.05, -0.05) to (1.05, 1.05, 1.05). The margin can also be negative to shrink the bounding box.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene
    import numpy as np

    # Crop the scene to be 0.1 times smaller than the bounding box around its points
    # (i.e. a margin of -0.1)
    scene_transform = transforms.CropSceneToPoints(margin=-0.1)

    input_scene: SfmScene = ...  # Load or create an SfmScene

    # The transformed scene will have points only within the bounding box of its points
    # minus a factor of 0.1 times the size (i.e. a margin of -0.1).
    # Posed images will have masks updated to nullify pixels corresponding to regions
    # outside the cropped scene.
    transformed_scene: SfmScene = scene_transform(input_scene)
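The margin arithmetic from the note above can be written out directly. A short NumPy sketch illustrating the convention (not the library's internal implementation):

    import numpy as np

    def expand_bbox(points: np.ndarray, margin: float) -> tuple[np.ndarray, np.ndarray]:
        # Axis-aligned bounding box of the points.
        bbox_min, bbox_max = points.min(axis=0), points.max(axis=0)
        # A margin of 0.1 grows the box by 10% of its size per axis
        # (5% on each side); a negative margin shrinks it symmetrically.
        half_pad = 0.5 * margin * (bbox_max - bbox_min)
        return bbox_min - half_pad, bbox_max + half_pad

    pts = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
    print(expand_bbox(pts, 0.1))
    # (array([-0.05, -0.05, -0.05]), array([1.05, 1.05, 1.05]))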
- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene with points cropped to lie within the bounding box of the input scene's points plus or minus the margin specified at initialization, and with masks updated to nullify pixels whose rays do not intersect the bounding box.
- Parameters:
input_scene (SfmScene) – The scene to be cropped.
- Returns:
output_scene (SfmScene) – The cropped scene.
- __init__(margin: float = 0.0, mask_format: Literal['png', 'jpg', 'npy'] = 'png', composite_with_existing_masks: bool = True)
Create a new CropSceneToPoints transform with the given margin.
- Parameters:
margin (float) – The margin factor to apply around the bounding box of the points. Can be negative to shrink the bounding box. This is a fraction of the bounding box size. For example, a margin of 0.1 will expand the bounding box by 10% (5% on each side), while a margin of -0.1 will shrink it by 10% (5% on each side). Defaults to 0.0.
mask_format (Literal["png", "jpg", "npy"]) – The format to save the masks in. Defaults to "png".
composite_with_existing_masks (bool) – Whether to composite the generated masks with existing masks for pixels corresponding to regions outside the cropped scene. If set to True, existing masks will be loaded and composited with the new mask. Defaults to True.
- static from_state_dict(state_dict: dict) → CropSceneToPoints
Create a CropSceneToPoints transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- Returns:
transform (CropSceneToPoints) – An instance of the CropSceneToPoints transform loaded from the state dictionary.
- static name() → str
Return the name of the CropSceneToPoints transform, i.e. "CropSceneToPoints".
- Returns:
str – The name of the CropSceneToPoints transform, i.e. "CropSceneToPoints".
- state_dict() → dict
Return the state of the CropSceneToPoints transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
- class fvdb_reality_capture.transforms.DownsampleImages(image_downsample_factor: int, image_type: Literal['jpg', 'png'] = 'jpg', rescale_sampling_mode: int = 3, rescaled_jpeg_quality: int = 98)
A BaseTransform which downsamples all images in an SfmScene by a specified factor and caches the downsampled images for future use.
You can specify the cached downsampled image type (e.g., "jpg" or "png"), the mode for downsampling (e.g., cv2.INTER_AREA), and the rescaled JPEG quality (if using JPEG).
If the downsampled images already exist in the scene's cache with the correct parameters, they will be loaded from the cache instead of being regenerated.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    scene_transform = transforms.DownsampleImages(4)

    input_scene: SfmScene = ...  # Load or create an SfmScene

    # The returned scene will have paths pointing to images downsampled by a factor of 4.
    transformed_scene: SfmScene = scene_transform(input_scene)

- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene with images downsampled by the specified factor, i.e. images will be resized to (width / image_downsample_factor, height / image_downsample_factor).
- Parameters:
input_scene (SfmScene) – The input scene with images to be downsampled.
- Returns:
output_scene (SfmScene) – The scene with downsampled images.
- __init__(image_downsample_factor: int, image_type: Literal['jpg', 'png'] = 'jpg', rescale_sampling_mode: int = 3, rescaled_jpeg_quality: int = 98)
Create a new DownsampleImages transform with the specified downsampling factor and image caching parameters (image type, downsampling mode, and quality).
Note
We use enums from OpenCV for the rescale_sampling_mode parameter, e.g., cv2.INTER_AREA, cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. This means if you want to change the resampling mode, you will need to import cv2 and pass in the appropriate enum value. See the OpenCV documentation (https://docs.opencv.org/3.4/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121) for more details on valid enum values.
- Parameters:
image_downsample_factor (int) – The factor by which to downsample the images.
image_type (str) – The type of the cached downsampled images, either "jpg" or "png".
rescale_sampling_mode (int) – The interpolation method to use for rescaling images. Note that we use enums from OpenCV for this parameter, e.g., cv2.INTER_AREA, cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc.
rescaled_jpeg_quality (int) – The quality of the JPEG images when saving them to the cache (1-100).
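For example, to override the default interpolation (cv2.INTER_AREA, whose enum value is 3) with bilinear interpolation and cache the results losslessly as PNGs, following the note above:

    import cv2

    from fvdb_reality_capture import transforms

    scene_transform = transforms.DownsampleImages(
        image_downsample_factor=2,
        image_type="png",
        rescale_sampling_mode=cv2.INTER_LINEAR,
    )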
- static from_state_dict(state_dict: dict[str, Any]) → DownsampleImages
Create a DownsampleImages transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (DownsampleImages) – An instance of the DownsampleImages transform.
- static name() → str
Return the name of the DownsampleImages transform, i.e. "DownsampleImages".
- Returns:
str – The name of the DownsampleImages transform, i.e. "DownsampleImages".
- state_dict() → dict[str, Any]
Return the state of the DownsampleImages transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
- class fvdb_reality_capture.transforms.FilterImagesWithLowPoints(min_num_points: int = 0)
A BaseTransform which filters out posed images from an SfmScene that have too few visible points.
Any images whose number of visible points is less than or equal to min_num_points will be removed from the scene.
Note
If the input SfmScene does not have point indices for its posed images (i.e. it has has_visible_point_indices set to False), then this transform is a no-op.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    # Create a transform to filter out images with 50 or fewer visible points.
    scene_transform = transforms.FilterImagesWithLowPoints(min_num_points=50)

    input_scene: SfmScene = ...  # Load or create an SfmScene

    # The transformed scene will only contain posed images with more than 50 visible points.
    transformed_scene: SfmScene = scene_transform(input_scene)
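Because the transform is a no-op when visible-point indices are missing, it is safe to apply unconditionally. A small sketch of that behavior, assuming has_visible_point_indices behaves as documented:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    scene_transform = transforms.FilterImagesWithLowPoints(min_num_points=50)

    input_scene: SfmScene = ...  # Load or create an SfmScene
    output_scene = scene_transform(input_scene)

    if not input_scene.has_visible_point_indices:
        # Without per-image point indices the transform cannot filter anything,
        # so the input scene is returned unmodified.
        assert output_scene is input_scene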
- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene containing only posed images which have more than min_num_points visible points.
Note
If the input SfmScene does not have point indices for its posed images (i.e. fvdb_reality_capture.sfm_scene.SfmScene.has_visible_point_indices is False), then this transform is a no-op.
- Parameters:
input_scene (SfmScene) – The input scene.
- Returns:
output_scene (SfmScene) – A new SfmScene containing only posed images which have more than min_num_points visible points. If the input scene does not have point indices for its posed images, the input scene is returned unmodified.
- __init__(min_num_points: int = 0)
Create a new FilterImagesWithLowPoints transform which removes posed images from the scene whose number of visible points is less than or equal to min_num_points.
- Parameters:
min_num_points (int) – The minimum number of visible points required to keep a posed image in the scene. Posed images with fewer or equal visible points will be removed.
- static from_state_dict(state_dict: dict[str, Any]) → FilterImagesWithLowPoints
Create a FilterImagesWithLowPoints transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (FilterImagesWithLowPoints) – An instance of the FilterImagesWithLowPoints transform.
- property min_num_points: int
Get the minimum number of points required to keep a posed image in the scene when applying this transform.
- Returns:
min_num_points (int) – The minimum number of points required to keep a posed image in the scene when applying this transform.
- static name() → str
Return the name of the FilterImagesWithLowPoints transform, i.e. "FilterImagesWithLowPoints".
- Returns:
str – The name of the FilterImagesWithLowPoints transform, i.e. "FilterImagesWithLowPoints".
- state_dict() → dict[str, Any]
Return the state of the FilterImagesWithLowPoints transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
- class fvdb_reality_capture.transforms.NormalizeScene(normalization_type: Literal['pca', 'none', 'ecef2enu', 'similarity'])
A BaseTransform which normalizes an SfmScene using a variety of approaches. This transform applies a rotation/translation/scaling to the entire scene, including both points and camera poses.
The normalization types available are:
- "pca": Normalizes by centering the scene about its median point, and rotating the point cloud to align with its principal axes.
- "ecef2enu": Converts a scene whose points and camera poses are in Earth-Centered, Earth-Fixed (ECEF) coordinates to East-North-Up (ENU) coordinates, centering the scene around the median point.
- "similarity": Rotates the scene so that +z aligns with the average up vector of the cameras, centers the scene around the median camera position, and rescales the scene to fit within a unit cube.
- "none": Does not apply any normalization to the scene. Effectively a no-op.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    # Create a NormalizeScene transform to normalize the scene using PCA
    transform = transforms.NormalizeScene(normalization_type="pca")

    # Apply the transform to an SfmScene
    input_scene: SfmScene = ...
    output_scene: SfmScene = transform(input_scene)

- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene which is the result of applying the normalization transform to the input scene.
The normalization transform is computed based on the specified normalization type and the contents of the input scene. It is applied to both the points and camera poses in the scene.
- __init__(normalization_type: Literal['pca', 'none', 'ecef2enu', 'similarity'])
Create a new NormalizeScene transform which normalizes an SfmScene using the specified normalization type.
Normalization is applied to both the points and camera poses in the scene.
- Parameters:
normalization_type (str) – The type of normalization to apply. Options are "pca", "none", "ecef2enu", or "similarity".
- static from_state_dict(state_dict: dict[str, Any]) → NormalizeScene
Create a NormalizeScene transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (NormalizeScene) – An instance of the NormalizeScene transform.
- static name() → str
Return the name of the NormalizeScene transform, i.e. "NormalizeScene".
- Returns:
str – The name of the NormalizeScene transform, i.e. "NormalizeScene".
- state_dict() → dict[str, Any]
Return the state of the NormalizeScene transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- valid_normalization_types = ['pca', 'ecef2enu', 'similarity', 'none']
- version = '1.0.0'
- class fvdb_reality_capture.transforms.PercentileFilterPoints(percentile_min: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size, percentile_max: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size)
A BaseTransform that filters points in an SfmScene based on percentile bounds for the x, y, and z coordinates.
When applied to an input scene, this transform returns a new SfmScene with points that fall within the specified percentile bounds of the input scene's points along each axis.
E.g. if percentile_min is (0, 0, 0) and percentile_max is (100, 100, 100), all points will be included in the output scene.
E.g. if percentile_min is (10, 20, 30) and percentile_max is (90, 80, 70), only points with x-coordinates in the 10th to 90th percentile, y-coordinates in the 20th to 80th percentile, and z-coordinates in the 30th to 70th percentile will be included in the output scene.
Example usage:

    from fvdb_reality_capture.transforms import PercentileFilterPoints
    from fvdb_reality_capture.sfm_scene import SfmScene

    # Create a PercentileFilterPoints transform to filter points between the 10th and 90th percentiles
    transform = PercentileFilterPoints(percentile_min=(10, 10, 10), percentile_max=(90, 90, 90))

    # Apply the transform to an SfmScene
    input_scene: SfmScene = ...
    output_scene: SfmScene = transform(input_scene)
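The per-axis selection semantics can be written out in plain NumPy. A sketch illustrating the documented behavior (not the library's internal implementation):

    import numpy as np

    def percentile_filter(points: np.ndarray, pmin, pmax) -> np.ndarray:
        # Per axis, find the coordinate values at the requested percentiles.
        lo = np.array([np.percentile(points[:, a], pmin[a]) for a in range(3)])
        hi = np.array([np.percentile(points[:, a], pmax[a]) for a in range(3)])
        # Keep points that fall inside the bounds on every axis.
        keep = np.all((points >= lo) & (points <= hi), axis=1)
        return points[keep]

    pts = np.random.rand(1000, 3)
    filtered = percentile_filter(pts, pmin=(10, 20, 30), pmax=(90, 80, 70))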
- __call__(input_scene: SfmScene) → SfmScene
Return a new SfmScene with points filtered based on the specified percentile bounds.
- __init__(percentile_min: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size, percentile_max: Tensor | ndarray | int | float | integer | floating | Sequence[int | float | integer | floating] | Size)
Create a new PercentileFilterPoints transform which filters points in an SfmScene based on percentile bounds for the x, y, and z coordinates.
- Parameters:
percentile_min (NumericMaxRank1) – Tuple of minimum percentiles (from 0 to 100) for the x, y, and z coordinates, e.g. (0, 0, 0) to impose no lower bound.
percentile_max (NumericMaxRank1) – Tuple of maximum percentiles (from 0 to 100) for the x, y, and z coordinates, e.g. (100, 100, 100) to impose no upper bound.
- static from_state_dict(state_dict: dict[str, Any]) → PercentileFilterPoints
Create a PercentileFilterPoints transform from a state dictionary generated with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (PercentileFilterPoints) – An instance of the PercentileFilterPoints transform.
- static name() → str
Return the name of the PercentileFilterPoints transform, i.e. "PercentileFilterPoints".
- Returns:
str – The name of the PercentileFilterPoints transform, i.e. "PercentileFilterPoints".
- state_dict() → dict[str, Any]
Return the state of the PercentileFilterPoints transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'
- class fvdb_reality_capture.transforms.Identity
A BaseTransform that performs the identity transform on an SfmScene. This transform returns the input scene unchanged. It can be useful as a placeholder or default transform in a processing pipeline.
Example usage:

    from fvdb_reality_capture import transforms
    from fvdb_reality_capture.sfm_scene import SfmScene

    # Use Identity as a default parameter value
    def append_normalize(transform: transforms.BaseTransform = transforms.Identity()):
        return transforms.Compose(
            transform,
            transforms.NormalizeScene("pca"),
        )

    # Use Identity to return a no-op for later use
    def get_transform(condition: bool) -> transforms.BaseTransform:
        if condition:
            return transforms.DownsampleImages(2)
        else:
            # Still return a valid transform that is a no-op
            return transforms.Identity()

    get_transform(False)  # Returns an Identity transform
    get_transform(True)   # Returns a DownsampleImages transform

- __call__(input_scene: SfmScene) → SfmScene
Return the input SfmScene unchanged.
- Parameters:
input_scene (SfmScene) – The input scene.
- Returns:
output_scene (SfmScene) – The input scene, unchanged.
- static from_state_dict(state_dict: dict[str, Any]) → Identity
Create an Identity transform from a state dictionary created with state_dict().
- Parameters:
state_dict (dict) – The state dictionary for the transform.
- Returns:
transform (Identity) – An instance of the Identity transform.
- static name() → str
Return the name of the Identity transform, i.e. "Identity".
- Returns:
str – The name of the Identity transform, i.e. "Identity".
- state_dict() → dict[str, Any]
Return the state of the Identity transform for serialization.
You can use this state dictionary to recreate the transform using from_state_dict().
- Returns:
state_dict (dict[str, Any]) – A dictionary containing information to serialize/deserialize the transform.
- version = '1.0.0'