Utilities API
General-purpose utility functions, configuration, and helper classes.
embodied_gen.utils.config
embodied_gen.utils.log
embodied_gen.utils.enum
AssetType
dataclass
AssetType()
Bases: str
Enumeration for asset types.
Supported types
| Name | Description |
|---|---|
| MJCF | MuJoCo XML format. |
| USD | Universal Scene Description format. |
| URDF | Unified Robot Description Format. |
| MESH | Mesh file format. |
LayoutInfo
dataclass
LayoutInfo(tree: dict[str, list], relation: dict[str, str | list[str]], objs_desc: dict[str, str] = dict(), objs_mapping: dict[str, str] = dict(), assets: dict[str, str] = dict(), quality: dict[str, str] = dict(), position: dict[str, list[float]] = dict())
Bases: DataClassJsonMixin
Data structure for layout information in a 3D scene.
Attributes:
| Name | Type | Description |
|---|---|---|
| tree | dict[str, list] | Hierarchical structure of scene objects. |
| relation | dict[str, str \| list[str]] | Spatial relations between objects. |
| objs_desc | dict[str, str] | Descriptions of objects. |
| objs_mapping | dict[str, str] | Mapping from object names to categories. |
| assets | dict[str, str] | Asset file paths for objects. |
| quality | dict[str, str] | Quality information for assets. |
| position | dict[str, list[float]] | Position coordinates for objects. |
RenderItems
dataclass
RenderItems()
Bases: str, Enum
Enumeration of render item types for 3D scenes.
Attributes:
| Name | Description |
|---|---|
| IMAGE | Color image. |
| ALPHA | Mask image. |
| VIEW_NORMAL | View-space normal image. |
| GLOBAL_NORMAL | World-space normal image. |
| POSITION_MAP | Position map image. |
| DEPTH | Depth image. |
| ALBEDO | Albedo image. |
| DIFFUSE | Diffuse image. |
RobotItemEnum
dataclass
RobotItemEnum()
Bases: str, Enum
Enumeration of supported robot types.
Attributes:
| Name | Description |
|---|---|
| FRANKA | Franka robot. |
| UR5 | UR5 robot. |
| PIPER | Piper robot. |
Scene3DItemEnum
dataclass
Scene3DItemEnum()
Bases: str, Enum
Enumeration of 3D scene item categories.
Attributes:
| Name | Description |
|---|---|
| BACKGROUND | Background objects. |
| CONTEXT | Contextual objects. |
| ROBOT | Robot entity. |
| MANIPULATED_OBJS | Objects manipulated by the robot. |
| DISTRACTOR_OBJS | Distractor objects. |
| OTHERS | Other objects. |
Methods:
| Name | Description |
|---|---|
| object_list | Returns a list of objects in the scene. |
| object_mapping | Returns a mapping from object to category. |
object_list
classmethod
object_list(layout_relation: dict) -> list
Returns a list of objects in the scene.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layout_relation | dict | Dictionary mapping categories to objects. | required |
Returns:
| Type | Description |
|---|---|
| list | List of objects in the scene. |
Source code in embodied_gen/utils/enum.py
object_mapping
classmethod
object_mapping(layout_relation)
Returns a mapping from object to category.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layout_relation | | Dictionary mapping categories to objects. | required |
Returns:
| Type | Description |
|---|---|
| | Dictionary mapping object names to their category. |
Source code in embodied_gen/utils/enum.py
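The two class methods above can be pictured as flattening and inverting a category-to-objects dictionary. The following is a minimal sketch under that assumption; the helper names and the sample dictionary are hypothetical, and the real methods in embodied_gen/utils/enum.py may handle nesting or missing categories differently.

```python
def object_list_sketch(layout_relation: dict) -> list:
    """Flatten a {category: object or [objects]} dict into one list."""
    objects = []
    for value in layout_relation.values():
        # A category may hold a single name or a list of names.
        objects.extend(value if isinstance(value, list) else [value])
    return objects


def object_mapping_sketch(layout_relation: dict) -> dict:
    """Invert {category: object(s)} into {object: category}."""
    mapping = {}
    for category, value in layout_relation.items():
        names = value if isinstance(value, list) else [value]
        for name in names:
            mapping[name] = category
    return mapping


# Hypothetical input; the category keys are illustrative strings only.
relation = {"background": "kitchen", "manipulated_objs": ["apple", "mug"]}
print(object_list_sketch(relation))
print(object_mapping_sketch(relation))
```

Accepting either a string or a list per category mirrors the `str | list[str]` value type declared for `LayoutInfo.relation`.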
SimAssetMapper
Maps simulator names to asset types.
Provides a mapping from simulator names to their corresponding asset type.
Example
from embodied_gen.utils.enum import SimAssetMapper
asset_type = SimAssetMapper["isaacsim"]
print(asset_type) # Output: 'usd'
Methods:
| Name | Description |
|---|---|
| __class_getitem__ | Returns the asset type for a given simulator name. |
__class_getitem__
classmethod
__class_getitem__(key: str)
Returns the asset type for a given simulator name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Name of the simulator. | required |
Returns:
| Type | Description |
|---|---|
| | AssetType corresponding to the simulator. |
Raises:
| Type | Description |
|---|---|
| KeyError | If the simulator name is not recognized. |
Source code in embodied_gen/utils/enum.py
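Indexing a class directly, as in `SimAssetMapper["isaacsim"]`, relies on Python's `__class_getitem__` hook. The stand-in below sketches that mechanism only. The class name and its lookup table are hypothetical, and just the "isaacsim" to "usd" pair is taken from the example in this section.

```python
class SimAssetMapperSketch:
    """Stand-in for a class that is indexed without being instantiated."""

    # Hypothetical table; the real mapper may cover more simulators and
    # return AssetType members rather than plain strings.
    _mapping = {"isaacsim": "usd"}

    # Python treats __class_getitem__ as an implicit class method.
    def __class_getitem__(cls, key: str) -> str:
        if key not in cls._mapping:
            raise KeyError(f"Unrecognized simulator name: {key!r}")
        return cls._mapping[key]


print(SimAssetMapperSketch["isaacsim"])
```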
SpatialRelationEnum
dataclass
SpatialRelationEnum()
Bases: str, Enum
Enumeration of spatial relations for objects in a scene.
Attributes:
| Name | Description |
|---|---|
| ON | Objects on a surface (e.g., table). |
| IN | Objects in a container or room. |
| INSIDE | Objects inside a shelf or rack. |
| FLOOR | Objects on the floor. |
embodied_gen.utils.geometry
all_corners_inside
all_corners_inside(hull: Path, box: list, threshold: int = 3) -> bool
Checks if at least threshold corners of a box are inside a hull.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| hull | Path | Convex hull path. | required |
| box | list | Box coordinates [x1, x2, y1, y2]. | required |
| threshold | int | Minimum corners inside. | 3 |
Returns:
| Type | Description |
|---|---|
| bool | True if enough corners are inside. |
Source code in embodied_gen/utils/geometry.py
bfs_placement
bfs_placement(layout_file: str, floor_margin: float = 0, beside_margin: float = 0.1, max_attempts: int = 3000, init_rpy: tuple = (1.5708, 0.0, 0.0), rotate_objs: bool = True, rotate_bg: bool = True, rotate_context: bool = True, limit_reach_range: tuple[float, float] | None = (0.2, 0.85), max_orient_diff: float | None = 60, robot_dim: float = 0.12, seed: int = None) -> LayoutInfo
Places objects in a scene layout using BFS traversal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layout_file | str | Path to layout JSON file generated from | required |
| floor_margin | float | Z-offset for objects placed on the floor. | 0 |
| beside_margin | float | Minimum margin for objects placed 'beside' their parent, used when 'on' placement fails. | 0.1 |
| max_attempts | int | Maximum number of attempts to find a non-overlapping placement. | 3000 |
| init_rpy | tuple | Initial rotation (rpy). | (1.5708, 0.0, 0.0) |
| rotate_objs | bool | Whether to randomly rotate objects. | True |
| rotate_bg | bool | Whether to randomly rotate the background. | True |
| rotate_context | bool | Whether to randomly rotate the context asset. | True |
| limit_reach_range | tuple[float, float] \| None | If set, enforce a check that manipulated objects are within the robot's reach range, in meters. | (0.2, 0.85) |
| max_orient_diff | float \| None | If set, enforce a check that manipulated objects are within the robot's orientation range, in degrees. | 60 |
| robot_dim | float | The approximate robot size. | 0.12 |
| seed | int | Random seed for reproducible placement. | None |
Returns:
| Type | Description |
|---|---|
| LayoutInfo | Layout information with object poses. |
Example
from embodied_gen.utils.geometry import bfs_placement
layout = bfs_placement("scene_layout.json", seed=42)
print(layout.position)
Source code in embodied_gen/utils/geometry.py
check_reachable
check_reachable(base_xyz: ndarray, reach_xyz: ndarray, min_reach: float = 0.25, max_reach: float = 0.85) -> bool
Checks if the target point is within the reachable range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_xyz | ndarray | Base position. | required |
| reach_xyz | ndarray | Target position. | required |
| min_reach | float | Minimum reach distance. | 0.25 |
| max_reach | float | Maximum reach distance. | 0.85 |
Returns:
| Type | Description |
|---|---|
| bool | True if reachable, False otherwise. |
Source code in embodied_gen/utils/geometry.py
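A reach check of this kind can be sketched as a distance test against the two bounds. The version below assumes a full 3D Euclidean distance and inclusive bounds; the actual function may measure distance in the horizontal plane only, or treat the bounds differently.

```python
import math


def check_reachable_sketch(base_xyz, reach_xyz, min_reach=0.25, max_reach=0.85):
    """Return True when the target lies within [min_reach, max_reach] of the base."""
    # Assumption: straight-line 3D distance between the two points.
    distance = math.dist(base_xyz, reach_xyz)
    return min_reach <= distance <= max_reach


print(check_reachable_sketch((0.0, 0.0, 0.0), (0.5, 0.0, 0.0)))
```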
compose_mesh_scene
compose_mesh_scene(layout_info: LayoutInfo, out_scene_path: str, with_bg: bool = False) -> None
Composes a mesh scene from layout information and saves to file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layout_info | LayoutInfo | Layout information. | required |
| out_scene_path | str | Output scene file path. | required |
| with_bg | bool | Include background mesh. | False |
Source code in embodied_gen/utils/geometry.py
compute_axis_rotation_quat
compute_axis_rotation_quat(axis: Literal['x', 'y', 'z'], angle_rad: float) -> list[float]
Computes quaternion for rotation around a given axis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| axis | Literal['x', 'y', 'z'] | Axis of rotation. | required |
| angle_rad | float | Rotation angle in radians. | required |
Returns:
| Type | Description |
|---|---|
| list[float] | Quaternion [x, y, z, w]. |
Source code in embodied_gen/utils/geometry.py
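For reference, a rotation by angle θ about a unit axis has the quaternion (axis · sin(θ/2), cos(θ/2)) in [x, y, z, w] order. The sketch below applies that standard formula with the standard library only; it is an illustration, not the library's implementation, and the function name is hypothetical.

```python
import math


def axis_rotation_quat_sketch(axis, angle_rad):
    """Quaternion [x, y, z, w] for a rotation of angle_rad about a principal axis."""
    half = angle_rad / 2.0
    sin_half, cos_half = math.sin(half), math.cos(half)
    unit = {"x": (1.0, 0.0, 0.0), "y": (0.0, 1.0, 0.0), "z": (0.0, 0.0, 1.0)}[axis]
    # Vector part is the unit axis scaled by sin(angle / 2).
    return [unit[0] * sin_half, unit[1] * sin_half, unit[2] * sin_half, cos_half]


quat_z_90 = axis_rotation_quat_sketch("z", math.pi / 2)
print(quat_z_90)
```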
compute_convex_hull_path
compute_convex_hull_path(vertices: ndarray, z_threshold: float = 0.05, interp_per_edge: int = 10, margin: float = -0.02, x_axis: int = 0, y_axis: int = 1, z_axis: int = 2) -> Path
Computes a dense convex hull path for the top surface of a mesh.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vertices | ndarray | Mesh vertices. | required |
| z_threshold | float | Z threshold for top surface. | 0.05 |
| interp_per_edge | int | Interpolation points per edge. | 10 |
| margin | float | Margin for polygon buffer. | -0.02 |
| x_axis | int | X axis index. | 0 |
| y_axis | int | Y axis index. | 1 |
| z_axis | int | Z axis index. | 2 |
Returns:
| Type | Description |
|---|---|
| Path | Matplotlib path object for the convex hull. |
Source code in embodied_gen/utils/geometry.py
compute_pinhole_intrinsics
compute_pinhole_intrinsics(image_w: int, image_h: int, fov_deg: float) -> np.ndarray
Computes pinhole camera intrinsic matrix from image size and FOV.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image_w | int | Image width. | required |
| image_h | int | Image height. | required |
| fov_deg | float | Field of view in degrees. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Intrinsic matrix K. |
Source code in embodied_gen/utils/geometry.py
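One common way to derive a pinhole intrinsic matrix from an image size and a field of view is sketched below. It assumes the FOV is horizontal, pixels are square, and the principal point sits at the image center; the actual function may use a vertical FOV or separate focal lengths, so treat the numbers as illustrative.

```python
import math


def pinhole_intrinsics_sketch(image_w, image_h, fov_deg):
    """Return a 3x3 intrinsic matrix K as nested lists."""
    # Assumption: fov_deg spans the image width, so f = (w / 2) / tan(fov / 2).
    focal = (image_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    return [
        [focal, 0.0, image_w / 2.0],
        [0.0, focal, image_h / 2.0],
        [0.0, 0.0, 1.0],
    ]


K = pinhole_intrinsics_sketch(640, 480, 90.0)
print(K)
```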
compute_xy_bbox
compute_xy_bbox(vertices: ndarray, col_x: int = 0, col_y: int = 1) -> list[float]
Computes the bounding box in XY plane for given vertices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vertices | ndarray | Vertex coordinates. | required |
| col_x | int | Column index for X. | 0 |
| col_y | int | Column index for Y. | 1 |
Returns:
| Type | Description |
|---|---|
| list[float] | [min_x, max_x, min_y, max_y] |
Source code in embodied_gen/utils/geometry.py
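The return layout [min_x, max_x, min_y, max_y] groups both X bounds before both Y bounds, which is easy to misread as [min_x, min_y, max_x, max_y]. A minimal pure-Python sketch of the same computation, using plain tuples instead of an ndarray and a hypothetical function name:

```python
def xy_bbox_sketch(vertices, col_x=0, col_y=1):
    """Return [min_x, max_x, min_y, max_y] over a sequence of vertices."""
    xs = [vertex[col_x] for vertex in vertices]
    ys = [vertex[col_y] for vertex in vertices]
    return [min(xs), max(xs), min(ys), max(ys)]


sample_vertices = [(0.0, 1.0, 5.0), (2.0, -1.0, 5.0), (1.0, 3.0, 0.0)]
print(xy_bbox_sketch(sample_vertices))
```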
find_parent_node
find_parent_node(node: str, tree: dict) -> str | None
Finds the parent node of a given node in a tree.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node | str | Node name. | required |
| tree | dict | Tree structure. | required |
Returns:
| Type | Description |
|---|---|
| str \| None | Parent node name or None. |
Source code in embodied_gen/utils/geometry.py
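A parent lookup over a dictionary-based tree can be sketched as a scan of every child list. This assumes each value in `tree` is a plain list of child names; the actual tree entries may carry extra data (for example a relation alongside each child), in which case the membership test would need adjusting.

```python
def find_parent_sketch(node, tree):
    """Return the key whose child list contains node, or None if there is none."""
    for parent, children in tree.items():
        if node in children:
            return parent
    return None


# Hypothetical scene tree with plain child-name lists.
scene_tree = {"room": ["table"], "table": ["apple", "mug"]}
print(find_parent_sketch("apple", scene_tree))
```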
has_iou_conflict
has_iou_conflict(new_box: list[float], placed_boxes: list[list[float]], iou_threshold: float = 0.0) -> bool
Checks for intersection-over-union conflict between boxes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| new_box | list[float] | New box coordinates. | required |
| placed_boxes | list[list[float]] | List of placed box coordinates. | required |
| iou_threshold | float | IOU threshold. | 0.0 |
Returns:
| Type | Description |
|---|---|
| bool | True if conflict exists, False otherwise. |
Source code in embodied_gen/utils/geometry.py
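The check can be pictured as computing intersection-over-union against every placed box and flagging any value above the threshold. The sketch below assumes boxes use the [x1, x2, y1, y2] layout documented for `all_corners_inside` and that a conflict means IoU strictly greater than the threshold; both points are assumptions about the real function.

```python
def iou_xy_sketch(box_a, box_b):
    """IoU of two axis-aligned boxes given as [x1, x2, y1, y2]."""
    ax1, ax2, ay1, ay2 = box_a
    bx1, bx2, by1, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def has_iou_conflict_sketch(new_box, placed_boxes, iou_threshold=0.0):
    """True if new_box overlaps any placed box by more than iou_threshold."""
    return any(iou_xy_sketch(new_box, placed) > iou_threshold for placed in placed_boxes)


placed = [[0.0, 1.0, 0.0, 1.0]]
print(has_iou_conflict_sketch([0.5, 1.5, 0.5, 1.5], placed))
print(has_iou_conflict_sketch([2.0, 3.0, 2.0, 3.0], placed))
```

With the default threshold of 0.0 and a strict comparison, boxes that merely touch along an edge are not reported as conflicting in this sketch.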
matrix_to_pose
matrix_to_pose(matrix: ndarray) -> list[float]
Converts a 4x4 transformation matrix to a pose (x, y, z, qx, qy, qz, qw).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matrix | ndarray | 4x4 transformation matrix. | required |
Returns:
| Type | Description |
|---|---|
| list[float] | Pose as [x, y, z, qx, qy, qz, qw]. |
Source code in embodied_gen/utils/geometry.py
pose_to_matrix
pose_to_matrix(pose: list[float]) -> np.ndarray
Converts pose (x, y, z, qx, qy, qz, qw) to a 4x4 transformation matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pose | list[float] | Pose as [x, y, z, qx, qy, qz, qw]. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | 4x4 transformation matrix. |
Source code in embodied_gen/utils/geometry.py
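To make the pose layout concrete, the sketch below expands [x, y, z, qx, qy, qz, qw] into a 4x4 homogeneous transform using the standard unit-quaternion rotation formula. It uses nested lists rather than an ndarray, assumes the quaternion is already normalized, and is not the library's own code.

```python
import math


def pose_to_matrix_sketch(pose):
    """Build a 4x4 row-major transform from [x, y, z, qx, qy, qz, qw]."""
    x, y, z, qx, qy, qz, qw = pose
    # Rotation matrix of a unit quaternion (assumed normalized).
    rotation = [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]
    translation = [x, y, z]
    # Append the translation as the fourth column, then the homogeneous row.
    matrix = [rotation[i] + [translation[i]] for i in range(3)]
    matrix.append([0.0, 0.0, 0.0, 1.0])
    return matrix


half = math.sqrt(0.5)
# 90 degree rotation about Z combined with a translation of (1, 2, 3).
matrix_z90 = pose_to_matrix_sketch([1.0, 2.0, 3.0, 0.0, 0.0, half, half])
for row in matrix_z90:
    print(row)
```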
quaternion_multiply
quaternion_multiply(init_quat: list[float], rotate_quat: list[float]) -> list[float]
Multiplies two quaternions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| init_quat | list[float] | Initial quaternion [x, y, z, w]. | required |
| rotate_quat | list[float] | Rotation quaternion [x, y, z, w]. | required |
Returns:
| Type | Description |
|---|---|
| list[float] | Resulting quaternion [x, y, z, w]. |
Source code in embodied_gen/utils/geometry.py
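Quaternion multiplication is not commutative, so the order of the two arguments matters. The sketch below implements the Hamilton product of the first argument with the second in [x, y, z, w] order; whether the library applies `init_quat` or `rotate_quat` first is not stated here, so the ordering in this sketch is an assumption.

```python
import math


def quaternion_multiply_sketch(q1, q2):
    """Hamilton product q1 * q2 for quaternions given as [x, y, z, w]."""
    x1, y1, z1, w1 = q1
    x2, y2, z2, w2 = q2
    return [
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
    ]


half_sqrt = math.sqrt(0.5)
quat_z90 = [0.0, 0.0, half_sqrt, half_sqrt]
# Two successive 90 degree turns about Z give a 180 degree turn about Z.
quat_z180 = quaternion_multiply_sketch(quat_z90, quat_z90)
print(quat_z180)
```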
with_seed
with_seed(seed_attr_name: str = 'seed')
Decorator to temporarily set the random seed for reproducibility.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seed_attr_name | str | Name of the seed argument. | 'seed' |
Returns:
| Type | Description |
|---|---|
| function | Decorator function. |
Source code in embodied_gen/utils/geometry.py
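A decorator that sets a seed only for the duration of a call typically saves the generator state, seeds, runs the function, and restores the state afterwards. The sketch below shows that pattern for Python's `random` module only. It assumes the seed is passed as a keyword argument; the real decorator may also seed other generators such as NumPy's and may locate the seed differently.

```python
import functools
import random


def with_seed_sketch(seed_attr_name="seed"):
    """Decorator factory: seed `random` for one call, then restore its state."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            seed = kwargs.get(seed_attr_name)
            if seed is None:
                # No seed supplied: leave the global generator untouched.
                return func(*args, **kwargs)
            state = random.getstate()
            random.seed(seed)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore the previous state so callers are unaffected.
                random.setstate(state)

        return wrapper

    return decorator


@with_seed_sketch("seed")
def sample_offsets(count, seed=None):
    """Hypothetical seeded function used only to exercise the decorator."""
    return [random.random() for _ in range(count)]


print(sample_offsets(3, seed=42) == sample_offsets(3, seed=42))
```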
embodied_gen.utils.gaussian
export_splats
export_splats(means: Tensor, scales: Tensor, quats: Tensor, opacities: Tensor, sh0: Tensor, shN: Tensor, format: Literal['ply'] = 'ply', save_to: Optional[str] = None) -> bytes
Export a Gaussian Splats model to bytes in PLY file format.
Source code in embodied_gen/utils/gaussian.py
restore_scene_scale_and_position
restore_scene_scale_and_position(real_height: float, mesh_path: str, gs_path: str) -> None
Scales a mesh and corresponding GS model to match a given real-world height.
Uses the 1st and 99th percentile of mesh Z-axis to estimate height, applies scaling and vertical alignment, and updates both the mesh and GS model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| real_height | float | Target real-world height along the Z axis. | required |
| mesh_path | str | Path to the input mesh file. | required |
| gs_path | str | Path to the Gaussian Splatting model file. | required |
Source code in embodied_gen/utils/gaussian.py
embodied_gen.utils.gpt_clients
GPTclient
GPTclient(endpoint: str, api_key: str, model_name: str = 'yfb-gpt-4o', api_version: str = None, check_connection: bool = True, verbose: bool = False, timeout: float = DEFAULT_GPT_TIMEOUT)
A client to interact with GPT models via OpenAI or Azure API.
Supports text and image prompts, connection checking, and configurable parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| endpoint | str | API endpoint URL. | required |
| api_key | str | API key for authentication. | required |
| model_name | str | Model name to use. | 'yfb-gpt-4o' |
| api_version | str | API version (for Azure). | None |
| check_connection | bool | Whether to check API connection. | True |
| verbose | bool | Enable verbose logging. | False |
| timeout | float | Max seconds for a single GPT request. | DEFAULT_GPT_TIMEOUT |
Example
export ENDPOINT="https://yfb-openai-sweden.openai.azure.com"
export API_KEY="xxxxxx"
export API_VERSION="2025-03-01-preview"
export MODEL_NAME="yfb-gpt-4o-sweden"
from embodied_gen.utils.gpt_clients import GPT_CLIENT
response = GPT_CLIENT.query("Describe the physics of a falling apple.")
response = GPT_CLIENT.query(
text_prompt="Describe the content in each image.",
image_base64=["path/to/image1.png", "path/to/image2.jpg"],
)
Source code in embodied_gen/utils/gpt_clients.py
check_connection
check_connection() -> None
Checks whether the GPT API connection is working.
Raises:
| Type | Description |
|---|---|
| ConnectionError | If connection fails. |
Source code in embodied_gen/utils/gpt_clients.py
completion_with_backoff
completion_with_backoff(**kwargs)
Performs a chat completion request with retry/backoff.
Source code in embodied_gen/utils/gpt_clients.py
query
query(text_prompt: str, image_base64: Optional[list[str | Image]] = None, system_role: Optional[str] = None, params: Optional[dict] = None) -> Optional[str]
Queries the GPT model with text and optional image prompts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text_prompt | str | Main text input. | required |
| image_base64 | Optional[list[str \| Image]] | List of image base64 strings, file paths, or PIL Images. | None |
| system_role | Optional[str] | System-level instructions. | None |
| params | Optional[dict] | Additional GPT parameters. | None |
Returns:
| Type | Description |
|---|---|
| Optional[str] | Model response content, or None if error. |
Source code in embodied_gen/utils/gpt_clients.py
embodied_gen.utils.process_media
SceneTreeVisualizer
SceneTreeVisualizer(layout_info: LayoutInfo)
Visualizes a scene tree layout using networkx and matplotlib.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| layout_info | LayoutInfo | Layout information for the scene. | required |
Example
from embodied_gen.utils.process_media import SceneTreeVisualizer
visualizer = SceneTreeVisualizer(layout_info)
visualizer.render(save_path="tree.png")
Source code in embodied_gen/utils/process_media.py
render
render(save_path: str, figsize=(8, 6), dpi=300, title: str = 'Scene 3D Hierarchy Tree')
Renders the scene tree and saves to file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| save_path | str | Path to save the rendered image. | required |
| figsize | tuple | Figure size. | (8, 6) |
| dpi | int | Image DPI. | 300 |
| title | str | Plot image title. | 'Scene 3D Hierarchy Tree' |
Source code in embodied_gen/utils/process_media.py
alpha_blend_rgba
alpha_blend_rgba(fg_image: Union[str, Image, ndarray], bg_image: Union[str, Image, ndarray]) -> Image.Image
Alpha blends a foreground RGBA image over a background RGBA image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fg_image | Union[str, Image, ndarray] | Foreground image (str, PIL Image, or ndarray). | required |
| bg_image | Union[str, Image, ndarray] | Background image (str, PIL Image, or ndarray). | required |
Returns:
| Type | Description |
|---|---|
| Image | Alpha-blended RGBA image. |
Example
from embodied_gen.utils.process_media import alpha_blend_rgba
result = alpha_blend_rgba("fg.png", "bg.png")
result.save("blended.png")
Source code in embodied_gen/utils/process_media.py
check_object_edge_truncated
check_object_edge_truncated(mask: ndarray, edge_threshold: int = 5) -> bool
Checks if a binary object mask is truncated at the image edges.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | ndarray | 2D binary mask. | required |
| edge_threshold | int | Edge pixel threshold. | 5 |
Returns:
| Type | Description |
|---|---|
| bool | True if object is fully enclosed, False if truncated. |
Source code in embodied_gen/utils/process_media.py
combine_images_to_grid
combine_images_to_grid(images: list[str | Image], cat_row_col: tuple[int, int] = None, target_wh: tuple[int, int] = (512, 512), image_mode: str = 'RGB') -> list[Image.Image]
Combines multiple images into a grid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| images | list[str \| Image] | List of image paths or PIL Images. | required |
| cat_row_col | tuple[int, int] | Grid rows and columns. | None |
| target_wh | tuple[int, int] | Target image size. | (512, 512) |
| image_mode | str | Image mode. | 'RGB' |
Returns:
| Type | Description |
|---|---|
| list[Image] | List containing the grid image. |
Example
from embodied_gen.utils.process_media import combine_images_to_grid
grid = combine_images_to_grid(["img1.png", "img2.png"])
grid[0].save("grid.png")
Source code in embodied_gen/utils/process_media.py
filter_image_small_connected_components
filter_image_small_connected_components(image: Union[Image, ndarray], area_ratio: float = 10, connectivity: int = 8) -> np.ndarray
Removes small connected components from the alpha channel of an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Union[Image, ndarray] | Input image. | required |
| area_ratio | float | Minimum area ratio. | 10 |
| connectivity | int | Connectivity for labeling. | 8 |
Returns:
| Type | Description |
|---|---|
| ndarray | Image with filtered alpha channel. |
Source code in embodied_gen/utils/process_media.py
filter_small_connected_components
filter_small_connected_components(mask: Union[Image, ndarray], area_ratio: float, connectivity: int = 8) -> np.ndarray
Removes small connected components from a binary mask.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mask | Union[Image, ndarray] | Input mask. | required |
| area_ratio | float | Minimum area ratio for components. | required |
| connectivity | int | Connectivity for labeling. | 8 |
Returns:
| Type | Description |
|---|---|
| ndarray | Mask with small components removed. |
Source code in embodied_gen/utils/process_media.py
is_image_file
is_image_file(filename: str) -> bool
Checks if a filename is an image file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str | Filename to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if image file, False otherwise. |
Source code in embodied_gen/utils/process_media.py
load_scene_dict
load_scene_dict(file_path: str) -> dict
Loads a scene description dictionary from a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Path to the scene description file. | required |
Returns:
| Type | Description |
|---|---|
| dict | Mapping from scene ID to description. |
Source code in embodied_gen/utils/process_media.py
merge_images_video
merge_images_video(color_images, normal_images, output_path) -> None
Merges color and normal images into a video.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| color_images | list[ndarray] | List of color images. | required |
| normal_images | list[ndarray] | List of normal images. | required |
| output_path | str | Path to save the output video. | required |
Source code in embodied_gen/utils/process_media.py
merge_video_video
merge_video_video(video_path1: str, video_path2: str, output_path: str) -> None
Merges two videos by combining their left and right halves.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| video_path1 | str | Path to first video. | required |
| video_path2 | str | Path to second video. | required |
| output_path | str | Path to save the merged video. | required |
Source code in embodied_gen/utils/process_media.py
parse_text_prompts
parse_text_prompts(prompts: list[str]) -> list[str]
Parses text prompts from a list or file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prompts | list[str] | List of prompts or a file path. | required |
Returns:
| Type | Description |
|---|---|
| list[str] | List of parsed prompts. |
Source code in embodied_gen/utils/process_media.py
render_asset3d
render_asset3d(mesh_path: str, output_root: str, distance: float = 5.0, num_images: int = 1, elevation: list[float] = (0.0,), pbr_light_factor: float = 1.2, pbr_metallic: bool = False, return_key: str = 'image_color/*', output_subdir: str = 'renders', gen_color_mp4: bool = False, gen_viewnormal_mp4: bool = False, gen_glonormal_mp4: bool = False, no_index_file: bool = False, with_mtl: bool = True) -> list[str]
Renders a 3D mesh asset and returns output image paths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mesh_path | str | Path to the mesh file. | required |
| output_root | str | Directory to save outputs. | required |
| distance | float | Camera distance. | 5.0 |
| num_images | int | Number of views to render. | 1 |
| elevation | list[float] | Camera elevation angles. | (0.0,) |
| pbr_light_factor | float | PBR lighting factor. | 1.2 |
| pbr_metallic | bool | Keep metallic material properties for PBR rendering. | False |
| return_key | str | Glob pattern for output images. | 'image_color/*' |
| output_subdir | str | Subdirectory for outputs. | 'renders' |
| gen_color_mp4 | bool | Generate color MP4 video. | False |
| gen_viewnormal_mp4 | bool | Generate view normal MP4. | False |
| gen_glonormal_mp4 | bool | Generate global normal MP4. | False |
| no_index_file | bool | Skip index file saving. | False |
| with_mtl | bool | Use mesh material. | True |
Returns:
| Type | Description |
|---|---|
| list[str] | List of output image file paths. |
Example
from embodied_gen.utils.process_media import render_asset3d
image_paths = render_asset3d(
mesh_path="path_to_mesh.obj",
output_root="path_to_save_dir",
num_images=4,
elevation=(30, -30),
output_subdir="renders",
no_index_file=True,
)
Source code in embodied_gen/utils/process_media.py
vcat_pil_images
vcat_pil_images(images: list[Image], image_mode: str = 'RGB') -> Image.Image
Vertically concatenates a list of PIL images.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| images | list[Image] | List of images. | required |
| image_mode | str | Image mode. | 'RGB' |
Returns:
| Type | Description |
|---|---|
| Image | Vertically concatenated image. |
Example
from PIL import Image
from embodied_gen.utils.process_media import vcat_pil_images
img = vcat_pil_images([Image.open("a.png"), Image.open("b.png")])
img.save("vcat.png")
Source code in embodied_gen/utils/process_media.py
embodied_gen.utils.simulation
FrankaPandaGrasper
FrankaPandaGrasper(agent: BaseAgent, control_freq: float, joint_vel_limits: float = 2.0, joint_acc_limits: float = 1.0, finger_length: float = 0.025)
Bases: object
Provides grasp planning and control for Franka Panda robot.
Attributes:
| Name | Type | Description |
|---|---|---|
| agent | BaseAgent | The robot agent. |
| robot | | The robot instance. |
| control_freq | float | Control frequency. |
| control_timestep | float | Control timestep. |
| joint_vel_limits | float | Joint velocity limits. |
| joint_acc_limits | float | Joint acceleration limits. |
| finger_length | float | Length of gripper fingers. |
| planners | | Motion planners for each environment. |
Initialize the grasper.
Source code in embodied_gen/utils/simulation.py
compute_grasp_action
compute_grasp_action(actor: Entity, reach_target_only: bool = True, offset: tuple[float, float, float] = [0, 0, -0.05], env_idx: int = 0) -> np.ndarray
Compute grasp actions for a target actor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| actor | Entity | Target actor to grasp. | required |
| reach_target_only | bool | Only reach the target pose if True. | True |
| offset | tuple[float, float, float] | Offset for reach pose. | [0, 0, -0.05] |
| env_idx | int | Environment index. | 0 |
Returns:
| Type | Description |
|---|---|
| ndarray | Array of grasp actions. |
Source code in embodied_gen/utils/simulation.py
control_gripper
control_gripper(gripper_state: Literal[-1, 1], n_step: int = 10) -> np.ndarray
Generate gripper control actions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gripper_state | Literal[-1, 1] | Desired gripper state. | required |
| n_step | int | Number of steps. | 10 |
Returns:
| Type | Description |
|---|---|
| ndarray | np.ndarray: Array of gripper actions. |
Source code in embodied_gen/utils/simulation.py
move_to_pose
move_to_pose(pose: Pose, control_timestep: float, gripper_state: Literal[-1, 1], use_point_cloud: bool = False, n_max_step: int = 100, action_key: str = 'position', env_idx: int = 0) -> np.ndarray
Plan and execute motion to a target pose.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pose | Pose | Target pose. | required |
| control_timestep | float | Control timestep. | required |
| gripper_state | Literal[-1, 1] | Desired gripper state. | required |
| use_point_cloud | bool | Use point cloud for planning. | False |
| n_max_step | int | Max number of steps. | 100 |
| action_key | str | Key for action in result. | 'position' |
| env_idx | int | Environment index. | 0 |
Returns:
| Type | Description |
|---|---|
| ndarray | np.ndarray: Array of actions to reach the pose. |
Source code in embodied_gen/utils/simulation.py
SapienSceneManager
SapienSceneManager(sim_freq: int, ray_tracing: bool, device: str = 'cuda')
Manages SAPIEN simulation scenes, cameras, and rendering.
This class provides utilities for setting up scenes, adding cameras, stepping simulation, and rendering images.
Attributes:
| Name | Type | Description |
|---|---|---|
| sim_freq | int | Simulation frequency. |
| ray_tracing | bool | Whether to use ray tracing. |
| device | str | Device for simulation. |
| renderer | SapienRenderer | SAPIEN renderer. |
| scene | Scene | Simulation scene. |
| cameras | list | List of camera components. |
| actors | dict | Mapping of actor names to entities. |
For a usage example, see embodied_gen/scripts/simulate_sapien.py.
Initialize the scene manager.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sim_freq | int | Simulation frequency. | required |
| ray_tracing | bool | Enable ray tracing. | required |
| device | str | Device for simulation. | 'cuda' |
Source code in embodied_gen/utils/simulation.py
create_camera
create_camera(cam_name: str, pose: Pose, image_hw: tuple[int, int], fovy_deg: float) -> sapien.render.RenderCameraComponent
Create a camera in the scene.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cam_name | str | Camera name. | required |
| pose | Pose | Camera pose. | required |
| image_hw | tuple[int, int] | Image resolution (height, width). | required |
| fovy_deg | float | Vertical field of view in degrees. | required |
Returns:
| Type | Description |
|---|---|
| RenderCameraComponent | sapien.render.RenderCameraComponent: The created camera. |
Source code in embodied_gen/utils/simulation.py
initialize_circular_cameras
initialize_circular_cameras(num_cameras: int, radius: float, height: float, target_pt: list[float], image_hw: tuple[int, int], fovy_deg: float) -> list[sapien.render.RenderCameraComponent]
Initialize multiple cameras arranged in a circle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_cameras | int | Number of cameras. | required |
| radius | float | Circle radius. | required |
| height | float | Camera height. | required |
| target_pt | list[float] | Target point to look at. | required |
| image_hw | tuple[int, int] | Image resolution (height, width). | required |
| fovy_deg | float | Vertical field of view in degrees. | required |
Returns:
| Type | Description |
|---|---|
| list[RenderCameraComponent] | list[sapien.render.RenderCameraComponent]: List of cameras. |
Source code in embodied_gen/utils/simulation.py
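The circular arrangement can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names `circular_eye_positions` and `look_at_direction` are hypothetical, and it assumes the circle is centered on the x/y of target_pt, that height is an absolute z value, and that cameras are evenly spaced in angle.

```python
import math


def circular_eye_positions(
    num_cameras: int, radius: float, height: float, target_pt: list[float]
) -> list[list[float]]:
    """Return evenly spaced eye positions on a horizontal circle.

    Hypothetical helper: assumes the circle is centered on the x/y of
    target_pt and that `height` is an absolute z coordinate.
    """
    positions = []
    for i in range(num_cameras):
        angle = 2.0 * math.pi * i / num_cameras
        x = target_pt[0] + radius * math.cos(angle)
        y = target_pt[1] + radius * math.sin(angle)
        positions.append([x, y, height])
    return positions


def look_at_direction(eye: list[float], target: list[float]) -> list[float]:
    """Unit vector pointing from the eye position toward the target."""
    delta = [t - e for e, t in zip(eye, target)]
    norm = math.sqrt(sum(d * d for d in delta))
    return [d / norm for d in delta]


# Four cameras on a circle of radius 2 at z = 1, all looking at the origin.
eyes = circular_eye_positions(4, radius=2.0, height=1.0, target_pt=[0.0, 0.0, 0.0])
directions = [look_at_direction(eye, [0.0, 0.0, 0.0]) for eye in eyes]
```

Each eye position and viewing direction would still need to be converted into a camera pose before a camera can be created from it.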
step_action
step_action(agent: BaseAgent, action: Tensor, cameras: list[RenderCameraComponent], render_keys: list[str], sim_steps_per_control: int = 1) -> dict
Step the simulation and render images from cameras.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| agent | BaseAgent | The robot agent. | required |
| action | Tensor | Action to apply. | required |
| cameras | list | List of camera components. | required |
| render_keys | list[str] | Types of images to render. | required |
| sim_steps_per_control | int | Simulation steps per control. | 1 |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | dict | Dictionary of rendered frames per camera. |
Source code in embodied_gen/utils/simulation.py
capture_frame
capture_frame(scene: Scene, camera: RenderCameraComponent) -> np.ndarray
Capture a single RGB frame from the camera (updates render first).
Source code in embodied_gen/utils/simulation.py
create_panda_agent
create_panda_agent(scene: Scene, control_freq: int, sim_backend: str, render_backend: str, initial_qpos: ndarray | None = None, control_mode: str = 'pd_joint_pos') -> BaseAgent
Create a ManiSkill Panda agent attached to a SAPIEN scene.
Source code in embodied_gen/utils/simulation.py
create_recording_camera
create_recording_camera(scene_manager: SapienSceneManager, eye_pos: list[float], target_pt: list[float], image_hw: tuple[int, int], fovy_deg: float = 45.0, cam_name: str = 'recording_camera') -> sapien.render.RenderCameraComponent
Create a camera looking from eye_pos at target_pt for video capture.
Source code in embodied_gen/utils/simulation.py
estimate_grasp_width
estimate_grasp_width(mesh: Trimesh) -> float
Estimate a conservative top-down grasp width from OBB extents.
Source code in embodied_gen/utils/simulation.py
get_actor_bottom_z
get_actor_bottom_z(actor: Entity) -> float
Get the actor world-space bottom z from its collision mesh.
Source code in embodied_gen/utils/simulation.py
get_actor_mesh
get_actor_mesh(actor: Entity) -> trimesh.Trimesh
Get the actor collision mesh in world coordinates.
Source code in embodied_gen/utils/simulation.py
load_actor_from_urdf
load_actor_from_urdf(scene: Scene | ManiSkillScene, file_path: str, pose: Pose | None = None, env_idx: int = None, use_static: bool = False, update_mass: bool = False, scale: float | ndarray = 1.0) -> sapien.pysapien.Entity
Load a SAPIEN actor from a URDF file and add it to the scene.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scene | Scene \| ManiSkillScene | The simulation scene. | required |
| file_path | str | Path to the URDF file. | required |
| pose | Pose \| None | Initial pose of the actor. | None |
| env_idx | int | Environment index for multi-env setup. | None |
| use_static | bool | Whether the actor is static. | False |
| update_mass | bool | Whether to update the actor's mass from URDF. | False |
| scale | float \| ndarray | Scale factor for the actor. | 1.0 |
Returns:
| Type | Description |
|---|---|
| Entity | sapien.pysapien.Entity: The created actor entity. |
Source code in embodied_gen/utils/simulation.py
load_assets_from_layout_file
load_assets_from_layout_file(scene: ManiSkillScene | Scene, layout: str, z_offset: float = 0.0, init_quat: list[float] = [0, 0, 0, 1], env_idx: int = None) -> dict[str, sapien.pysapien.Entity]
Load assets from an EmbodiedGen layout file and create SAPIEN actors in the scene.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scene | ManiSkillScene \| Scene | The SAPIEN simulation scene. | required |
| layout | str | Path to the EmbodiedGen layout file. | required |
| z_offset | float | Z offset for non-context objects. | 0.0 |
| init_quat | list[float] | Initial quaternion for orientation. | [0, 0, 0, 1] |
| env_idx | int | Environment index. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, Entity] | dict[str, sapien.pysapien.Entity]: Mapping from object names to actor entities. |
Source code in embodied_gen/utils/simulation.py
load_collision_mesh_from_urdf
load_collision_mesh_from_urdf(urdf_path: str) -> trimesh.Trimesh
Load the collision mesh referenced by a URDF in its link frame.
Applies the optional collision/origin transform so the returned mesh sits in the same frame the simulator will use; required for correct spawn-z estimation downstream.
Source code in embodied_gen/utils/simulation.py
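Why the collision origin matters for spawn-z estimation can be shown with a small standalone sketch. It uses plain Python lists instead of trimesh, the helper names `rpy_to_matrix` and `apply_origin` are hypothetical, and the rule "spawn z is the negated bottom z" is an assumption made for illustration, not the library's documented behavior. The rotation follows the URDF convention of fixed-axis roll, pitch, yaw.

```python
import math


def rpy_to_matrix(roll: float, pitch: float, yaw: float) -> list[list[float]]:
    """Rotation matrix for URDF roll-pitch-yaw: R = Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def apply_origin(vertices, xyz, rpy):
    """Rotate each vertex by rpy, then translate by xyz (mesh frame to link frame)."""
    rot = rpy_to_matrix(*rpy)
    moved = []
    for v in vertices:
        rotated = [sum(rot[r][c] * v[c] for c in range(3)) for r in range(3)]
        moved.append([rotated[i] + xyz[i] for i in range(3)])
    return moved


# Unit cube centered on the mesh origin, with a collision origin 0.2 above the link frame.
cube = [[x, y, z] for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]
in_link_frame = apply_origin(cube, xyz=(0.0, 0.0, 0.2), rpy=(0.0, 0.0, 0.0))

bottom_z = min(v[2] for v in in_link_frame)
# Assumed rule: place the link at -bottom_z so the mesh bottom rests on z = 0.
spawn_z = -bottom_z
```

Ignoring the origin in this example would give a bottom z of -0.5 instead of -0.3, so the object would spawn 0.2 too high.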
load_mani_skill_robot
load_mani_skill_robot(scene: Scene | ManiSkillScene, layout: LayoutInfo | str, control_freq: int = 20, robot_init_qpos_noise: float = 0.0, control_mode: str = 'pd_joint_pos', backend_str: tuple[str, str] = ('cpu', 'gpu')) -> BaseAgent
Load a ManiSkill robot agent into the scene.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scene | Scene \| ManiSkillScene | The simulation scene. | required |
| layout | LayoutInfo \| str | Layout info or path to layout file. | required |
| control_freq | int | Control frequency. | 20 |
| robot_init_qpos_noise | float | Noise for initial joint positions. | 0.0 |
| control_mode | str | Robot control mode. | 'pd_joint_pos' |
| backend_str | tuple[str, str] | Simulation/render backend. | ('cpu', 'gpu') |
Returns:
| Name | Type | Description |
|---|---|---|
| BaseAgent | BaseAgent | The loaded robot agent. |
Source code in embodied_gen/utils/simulation.py
quat_from_yaw
quat_from_yaw(yaw_deg: float) -> list[float]
Convert z-axis yaw angle (degrees) to a SAPIEN quaternion (w,x,y,z).
Source code in embodied_gen/utils/simulation.py
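The underlying math is the standard axis-angle formula for a rotation about the z-axis. The sketch below uses a hypothetical name, `yaw_to_quat_wxyz`, and is not the library's source; the real function may round or normalize differently.

```python
import math


def yaw_to_quat_wxyz(yaw_deg: float) -> list[float]:
    """Quaternion (w, x, y, z) for a rotation of yaw_deg degrees about +z."""
    half = math.radians(yaw_deg) / 2.0
    # A rotation about z has no x or y component: q = (cos(a/2), 0, 0, sin(a/2)).
    return [math.cos(half), 0.0, 0.0, math.sin(half)]


identity = yaw_to_quat_wxyz(0.0)
quarter_turn = yaw_to_quat_wxyz(90.0)
```

A yaw of 0 gives the identity quaternion, and a yaw of 90 degrees gives equal w and z components of about 0.7071.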
render_images
render_images(camera: RenderCameraComponent, render_keys: list[Literal['Color', 'Segmentation', 'Normal', 'Mask', 'Depth', 'Foreground']] = None) -> dict[str, Image.Image]
Render images from a given SAPIEN camera.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| camera | RenderCameraComponent | Camera to render from. | required |
| render_keys | list[str] | Types of images to render. | None |
Returns:
| Type | Description |
|---|---|
| dict[str, Image] | dict[str, Image.Image]: Dictionary of rendered images. |
Source code in embodied_gen/utils/simulation.py
set_ground_base_color
set_ground_base_color(scene: Scene, rgba: list[float]) -> None
Update the default ground plane material color for this scene.
Source code in embodied_gen/utils/simulation.py
embodied_gen.utils.tags
embodied_gen.utils.trender
embodied_gen.utils.inference
image3d_model_infer
image3d_model_infer(pipe: TrellisImageTo3DPipeline | Sam3dInference, seg_image: Image, seed: int = None, **kwargs: dict) -> dict[str, any]
Execute 3D generation using Trellis or SAM3D pipeline on input image.
Source code in embodied_gen/utils/inference.py
embodied_gen.utils.llm_resolve
resolve_instance_with_llm
resolve_instance_with_llm(gpt_client: GPTclient, instance_names: list[str], user_spec: str, prompt_template: str | None = None) -> str | None
Map a user description to a single scene instance name via LLM semantic matching.
For example, if the user says "yellow fruit" and the scene contains "banana_001", the function returns "banana_001". It returns None when there is no match or the LLM replies NONE; the caller should then tell the user that the object does not exist and ask for new input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gpt_client | GPTclient | GPT client instance, e.g. embodied_gen.utils.gpt_clients.GPT_CLIENT. | required |
| instance_names | list[str] | List of scene instance names from FloorplanManager.get_instance_names(). | required |
| user_spec | str | User input, e.g. "yellow fruit", "柜子", "the table". | required |
| prompt_template | str \| None | Optional custom prompt; placeholders {instance_list} and {user_spec}. | None |
Returns:
| Type | Description |
|---|---|
| str \| None | The matched instance name (exactly one of instance_names), or None if no match. |
Source code in embodied_gen/utils/llm_resolve.py
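The resolve-then-validate flow described above can be sketched without a real LLM. Everything here is illustrative: `ask_llm` is a stand-in callable rather than the GPTclient interface, `DEFAULT_TEMPLATE` is an invented prompt, and `resolve_instance_sketch` is a hypothetical name. Only the {instance_list} and {user_spec} placeholders and the NONE convention come from the documentation.

```python
from typing import Callable, Optional

# Invented prompt for illustration; the library's default prompt may differ.
DEFAULT_TEMPLATE = (
    "Scene instances: {instance_list}\n"
    "User request: {user_spec}\n"
    "Reply with exactly one instance name from the list, or NONE."
)


def resolve_instance_sketch(
    ask_llm: Callable[[str], Optional[str]],
    instance_names: list[str],
    user_spec: str,
    prompt_template: Optional[str] = None,
) -> Optional[str]:
    """Fill the prompt, query the model, and accept only a known instance name."""
    template = prompt_template or DEFAULT_TEMPLATE
    prompt = template.format(
        instance_list=", ".join(instance_names), user_spec=user_spec
    )
    reply = ask_llm(prompt)
    if reply is None:
        return None
    answer = reply.strip()
    if answer.upper() == "NONE":
        return None
    # Guard against hallucinated names: the result must be in the scene.
    return answer if answer in instance_names else None


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: matches one hard-coded description."""
    return "banana_001" if "yellow fruit" in prompt else "NONE"


names = ["banana_001", "table_002"]
matched = resolve_instance_sketch(fake_llm, names, "yellow fruit")
missing = resolve_instance_sketch(fake_llm, names, "purple sofa")
```

Here `matched` is "banana_001" and `missing` is None, which is the case where the caller would ask the user to re-enter the description.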
embodied_gen.utils.general
embodied_gen.utils.io_utils
URDFFile
URDFFile(urdf_path: str | PathLike)
Small XML helper for reading and writing fields inside one URDF file.
Source code in embodied_gen/utils/io_utils.py
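To show the kind of field access such a helper wraps, here is a minimal sketch using only the standard library's xml.etree.ElementTree. It does not use or reproduce the URDFFile API: the URDF snippet, the file name in it, and the helper names `parse_floats` and `read_collision_fields` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up URDF with one link, one collision mesh, a scale, and an origin.
URDF_TEXT = """<robot name="demo">
  <link name="base">
    <collision>
      <origin xyz="0 0 0.1" rpy="0 0 0"/>
      <geometry>
        <mesh filename="mesh/demo_collision.obj" scale="0.5 0.5 0.5"/>
      </geometry>
    </collision>
  </link>
</robot>"""


def parse_floats(text, default):
    """Parse a space-separated attribute, falling back to a default if absent."""
    if text is None:
        return list(default)
    return [float(token) for token in text.split()]


def read_collision_fields(root):
    """Collect mesh path, scale, and origin of the first collision element."""
    collision = root.find("./link/collision")
    origin = collision.find("origin")
    mesh = collision.find("./geometry/mesh")
    xyz = origin.get("xyz") if origin is not None else None
    rpy = origin.get("rpy") if origin is not None else None
    return {
        "filename": mesh.get("filename"),
        "scale": parse_floats(mesh.get("scale"), (1.0, 1.0, 1.0)),
        "xyz": parse_floats(xyz, (0.0, 0.0, 0.0)),
        "rpy": parse_floats(rpy, (0.0, 0.0, 0.0)),
    }


root = ET.fromstring(URDF_TEXT)
fields = read_collision_fields(root)

# Writing a field back: update the attribute, then serialize the tree.
root.find("./link/collision/geometry/mesh").set("scale", "1 1 1")
updated_text = ET.tostring(root, encoding="unicode")
```

The defaults in `parse_floats` mirror the URDF convention that a missing scale means 1 and a missing origin means the identity transform.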
load_mesh
load_mesh(mesh_path: str | PathLike, *, origin_xyz: Iterable[float] | str | None = None, origin_rpy: Iterable[float] | str | None = None, scale: tuple[float, float, float] | str | Iterable[float] | float | None = None, apply_origin: bool = True, apply_scale: bool = True) -> trimesh.Trimesh
Load a mesh and optionally apply URDF mesh scale and origin transform.
Source code in embodied_gen/utils/io_utils.py
save_mesh
save_mesh(mesh: Trimesh, output_path: str | PathLike, *, origin_xyz: Iterable[float] | str | None = DEFAULT_URDF_ORIGIN_XYZ, origin_rpy: Iterable[float] | str | None = DEFAULT_URDF_ORIGIN_RPY, scale: tuple[float, float, float] | str | Iterable[float] | float | None = DEFAULT_URDF_SCALE, apply_origin: bool = True, apply_scale: bool = True, copy: bool = True) -> str
Save a mesh by optionally undoing the default URDF scale and origin.
Source code in embodied_gen/utils/io_utils.py