first commit

This commit is contained in:
hofee 2025-05-13 09:03:38 +08:00
commit b98753bfbb
121 changed files with 8665 additions and 0 deletions

2
.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
/experiments
/__pycache__

192
Readme.md Normal file
View File

@ -0,0 +1,192 @@
# Next Best View for Reconstruction
## 1. Setup Environment
### 1.1 Install Main Project
```bash
mkdir nbv_rec
cd nbv_rec
git clone https://git.hofee.top/hofee/nbv_reconstruction.git
```
### 1.2 Install PytorchBoot
the environment is based on PytorchBoot, clone and install it from [PytorchBoot](https://git.hofee.top/hofee/PyTorchBoot.git)
```bash
git clone https://git.hofee.top/hofee/PyTorchBoot.git
cd PyTorchBoot
pip install .
cd ..
```
### 1.3 Install Blender (Optional)
If you want to render your own dataset as described in [section 2. Render Datasets](#2-render-datasets), you'll need to install Blender version 4.0 from [Blender Release](https://download.blender.org/release/Blender4.0/). Here is an example of installing Blender on Ubuntu:
```bash
wget https://download.blender.org/release/Blender4.0/blender-4.0.2-linux-x64.tar.xz
tar -xvf blender-4.0.2-linux-x64.tar.xz
```
If blender is not in your PATH, you can add it by:
```bash
export PATH=$PATH:/path/to/blender/blender-4.0.2-linux-x64
```
To run the blender script, you need to install the `pyyaml` and `scipy` package into your blender python environment. Run the following command to print the python path of your blender:
```bash
./blender -b --python-expr "import sys; print(sys.executable)"
```
Then copy the python path `/path/to/blender_python` shown in the output and run the following command to install the packages:
```bash
/path/to/blender_python -m pip install pyyaml scipy
```
### 1.4 Install Blender Render Script (Optional)
Clone the script from [nbv_rec_blender_render](https://git.hofee.top/hofee/nbv_rec_blender_render.git) and rename it to `blender`:
```bash
git clone https://git.hofee.top/hofee/nbv_rec_blender_render.git
mv nbv_rec_blender_render blender
```
### 1.5 Check Dependencies
Switch to the project root directory and run `pytorch-boot scan` or `ptb scan` to check if all dependencies are installed:
```bash
cd nbv_reconstruction
pytorch-boot scan
# or
ptb scan
```
If you see project structure information in the output, it means all dependencies are correctly installed. Otherwise, you may need to run `pip install xxx` to install the missing packages.
## 2. Render Datasets (Optional)
### 2.1 Download Object Mesh Models
Download the mesh models divided into three parts from:
- [object_meshes_part1.zip](None)
- [object_meshes_part2.zip](https://pan.baidu.com/s/1pBPhrFtBwEGp1g4vwsLIxA?pwd=1234)
- [object_meshes_part3.zip](https://pan.baidu.com/s/1peE8HqFFL0qNFhM5OC69gA?pwd=1234)
or download the whole dataset from [object_meshes.zip](https://pan.baidu.com/s/1ilWWgzg_l7_pPBv64eSgzA?pwd=1234)
Download the table model from [table.obj](https://pan.baidu.com/s/1sjjiID25Es_kmcdUIjU_Dw?pwd=1234)
### 2.2 Set Render Configurations
Open file `configs/local/view_generate_config.yaml` and modify the parameters to fit your needs. You are required to at least set the following parameters in `runner-generate`:
- `object_dir`: the directory of the downloaded object mesh models
- `output_dir`: the directory to save the rendered dataset
- `table_model_path`: the path of the downloaded table model
### 2.3 Render Dataset
There are two ways to render the dataset:
#### 2.3.1 Render with Visual Monitoring
If you want to visually monitor the rendering progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to `Project Dashboard - Project Structure - Applications - generate_view`
4. Click the `Run` button to execute the rendering script
#### 2.3.2 Render in Terminal
If you don't need visual monitoring and prefer to run the rendering process directly in the terminal, simply run:
```
ptb run generate_view
```
This command will start the rendering process without launching the UI.
## 3. Preprocess
⚠️ The preprocessing code is currently not managed by `PytorchBoot`. To run the preprocessing:
1. Open the `./preprocess/preprocessor.py` file.
2. Locate the `if __name__ == "__main__":` block at the bottom of the file.
3. Specify the dataset folder by setting `root = "path/to/your/dataset"`.
4. Run the preprocessing script directly:
```
python ./preprocess/preprocessor.py
```
This will preprocess the data in the specified dataset folder.
## 4. Generate Strategy Label
### 4.1 Set Configuration
Open the file `configs/local/strategy_generate_config.yaml` and modify the parameters to fit your needs. You are required to at least set the following parameter:
- `datasets.OmniObject3d.root_dir`: the directory of your dataset
### 4.2 Generate Strategy Label
There are two ways to generate the strategy label:
#### 4.2.1 Generate with Visual Monitoring
If you want to visually monitor the generation progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to Project Dashboard - Project Structure - Applications - generate_strategy
4. Click the `Run` button to execute the generation script
#### 4.2.2 Generate in Terminal
If you don't need visual monitoring and prefer to run the generation process directly in the terminal, simply run:
```
ptb run generate_strategy
```
This command will start the strategy label generation process without launching the UI.
## 5. Train
### 5.1 Set Configuration
Open the file `configs/local/train_config.yaml` and modify the parameters to fit your needs. You are required to at least set the following parameters in the `experiment` section:
```yaml
experiment:
name: your_experiment_name
root_dir: path/to/your/experiment_dir
use_checkpoint: False # if True, the checkpoint will be loaded
epoch: 600 # specific epoch to load, -1 stands for last epoch
max_epochs: 5000 # maximum epochs to train
save_checkpoint_interval: 1 # save checkpoint interval
test_first: True # if True, test process will be performed before training at each epoch
```
Adjust these parameters according to your training requirements.
### 5.2 Start Training
There are two ways to start the training process:
#### 5.2.1 Train with Visual Monitoring
If you want to visually monitor the training progress and machine resource usage:
1. In the terminal, run:
```
ptb ui
```
2. Open your browser and visit http://localhost:5000
3. Navigate to Project Dashboard - Project Structure - Applications - train
4. Click the `Run` button to start the training process
#### 5.2.2 Train in Terminal
If you don't need visual monitoring and prefer to run the training process directly in the terminal, simply run:
```
ptb run train
```
This command will start the training process without launching the UI.
## 6. Evaluation
...

9
app_generate_strategy.py Normal file
View File

@ -0,0 +1,9 @@
from PytorchBoot.application import PytorchBootApplication
from runners.strategy_generator import StrategyGenerator
@PytorchBootApplication("generate_strategy")
class DataGenerateApp:
@staticmethod
def start():
StrategyGenerator("configs/local/strategy_generate_config.yaml").run()

16
app_generate_view.py Normal file
View File

@ -0,0 +1,16 @@
from PytorchBoot.application import PytorchBootApplication
from runners.view_generator import ViewGenerator
@PytorchBootApplication("generate_view")
class ViewGenerateApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
ViewGenerator("configs/local/view_generate_config.yaml").run()

75
app_inference.py Normal file
View File

@ -0,0 +1,75 @@
from PytorchBoot.application import PytorchBootApplication
from runners.global_points_inferencer import GlobalPointsInferencer
from runners.global_and_local_points_inferencer import GlobalAndLocalPointsInferencer
from runners.local_points_inferencer import LocalPointsInferencer
from runners.inference_server import InferencerServer
from runners.evaluate_uncertainty_guide import EvaluateUncertaintyGuide
@PytorchBootApplication("global_points_inference")
class GlobalPointsInferenceApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
GlobalPointsInferencer("./configs/local/global_only_inference_config.yaml").run()
@PytorchBootApplication("global_and_local_points_inference")
class GlobalAndLocalPointsInferenceApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
GlobalAndLocalPointsInferencer("./configs/local/global_pts_and_local_pts_pose.yaml").run()
@PytorchBootApplication("local_points_inference")
class LocalPointsInferenceApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
LocalPointsInferencer("./configs/local/local_only_inference_config.yaml").run()
@PytorchBootApplication("server")
class InferenceServerApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
InferencerServer("./configs/server/server_inference_server_config.yaml").run()
@PytorchBootApplication("evaluate_uncertainty_guide")
class EvaluateUncertaintyGuideApp:
@staticmethod
def start():
'''
call default or your custom runners here, code will be executed
automatically when type "pytorch-boot run" or "ptb run" in terminal
example:
Trainer("path_to_your_train_config").run()
Evaluator("path_to_your_eval_config").run()
'''
EvaluateUncertaintyGuide("./configs/local/uncertainty_guide_evaluation_config.yaml").run()

11
app_sim.py Normal file
View File

@ -0,0 +1,11 @@
from PytorchBoot.application import PytorchBootApplication
from runners.simulator import Simulator
@PytorchBootApplication("sim")
class SimulateApp:
@staticmethod
def start():
simulator = Simulator("configs/local/simulation_config.yaml")
simulator.run("create")
simulator.run("simulate")

9
app_split.py Normal file
View File

@ -0,0 +1,9 @@
from PytorchBoot.application import PytorchBootApplication
from runners.data_spliter import DataSpliter
@PytorchBootApplication("split_data")
class DataSplitApp:
@staticmethod
def start():
DataSpliter("configs/server/server_split_dataset_config.yaml").run()

8
app_train.py Normal file
View File

@ -0,0 +1,8 @@
from PytorchBoot.application import PytorchBootApplication
from PytorchBoot.runners.trainer import DefaultTrainer
@PytorchBootApplication("train")
class TrainApp:
@staticmethod
def start():
DefaultTrainer("configs/server/server_train_config.yaml").run()

Binary file not shown.

162
beans/predict_result.py Normal file
View File

@ -0,0 +1,162 @@
import numpy as np
from sklearn.cluster import DBSCAN
class PredictResult:
def __init__(self, raw_predict_result, input_pts=None, cluster_params=dict(eps=0.5, min_samples=2)):
self.input_pts = input_pts
self.cluster_params = cluster_params
self.sampled_9d_pose = raw_predict_result
self.sampled_matrix_pose = self.get_sampled_matrix_pose()
self.distance_matrix = self.calculate_distance_matrix()
self.clusters = self.get_cluster_result()
self.candidate_matrix_poses = self.get_candidate_poses()
self.candidate_9d_poses = [np.concatenate((self.matrix_to_rotation_6d_numpy(matrix[:3,:3]), matrix[:3,3].reshape(-1,)), axis=-1) for matrix in self.candidate_matrix_poses]
self.cluster_num = len(self.clusters)
@staticmethod
def rotation_6d_to_matrix_numpy(d6):
a1, a2 = d6[:3], d6[3:]
b1 = a1 / np.linalg.norm(a1)
b2 = a2 - np.dot(b1, a2) * b1
b2 = b2 / np.linalg.norm(b2)
b3 = np.cross(b1, b2)
return np.stack((b1, b2, b3), axis=-2)
@staticmethod
def matrix_to_rotation_6d_numpy(matrix):
return np.copy(matrix[:2, :]).reshape((6,))
def __str__(self):
info = "Predict Result:\n"
info += f" Predicted pose number: {len(self.sampled_9d_pose)}\n"
info += f" Cluster number: {self.cluster_num}\n"
for i, cluster in enumerate(self.clusters):
info += f" - Cluster {i} size: {len(cluster)}\n"
max_distance = np.max(self.distance_matrix[self.distance_matrix != 0])
min_distance = np.min(self.distance_matrix[self.distance_matrix != 0])
info += f" Max distance: {max_distance}\n"
info += f" Min distance: {min_distance}\n"
return info
def get_sampled_matrix_pose(self):
sampled_matrix_pose = []
for pose in self.sampled_9d_pose:
rotation = pose[:6]
translation = pose[6:]
pose = self.rotation_6d_to_matrix_numpy(rotation)
pose = np.concatenate((pose, translation.reshape(-1, 1)), axis=-1)
pose = np.concatenate((pose, np.array([[0, 0, 0, 1]])), axis=-2)
sampled_matrix_pose.append(pose)
return np.array(sampled_matrix_pose)
def rotation_distance(self, R1, R2):
R = np.dot(R1.T, R2)
trace = np.trace(R)
angle = np.arccos(np.clip((trace - 1) / 2, -1, 1))
return angle
def calculate_distance_matrix(self):
n = len(self.sampled_matrix_pose)
dist_matrix = np.zeros((n, n))
for i in range(n):
for j in range(n):
dist_matrix[i, j] = self.rotation_distance(self.sampled_matrix_pose[i][:3, :3], self.sampled_matrix_pose[j][:3, :3])
return dist_matrix
def cluster_rotations(self):
clustering = DBSCAN(eps=self.cluster_params['eps'], min_samples=self.cluster_params['min_samples'], metric='precomputed')
labels = clustering.fit_predict(self.distance_matrix)
return labels
def get_cluster_result(self):
labels = self.cluster_rotations()
cluster_num = len(set(labels)) - (1 if -1 in labels else 0)
clusters = []
for _ in range(cluster_num):
clusters.append([])
for matrix_pose, label in zip(self.sampled_matrix_pose, labels):
if label != -1:
clusters[label].append(matrix_pose)
clusters.sort(key=len, reverse=True)
return clusters
def get_center_matrix_pose_from_cluster(self, cluster):
min_total_distance = float('inf')
center_matrix_pose = None
for matrix_pose in cluster:
total_distance = 0
for other_matrix_pose in cluster:
rot_distance = self.rotation_distance(matrix_pose[:3, :3], other_matrix_pose[:3, :3])
total_distance += rot_distance
if total_distance < min_total_distance:
min_total_distance = total_distance
center_matrix_pose = matrix_pose
return center_matrix_pose
def get_candidate_poses(self):
candidate_poses = []
for cluster in self.clusters:
candidate_poses.append(self.get_center_matrix_pose_from_cluster(cluster))
return candidate_poses
def visualize(self):
import plotly.graph_objects as go
fig = go.Figure()
if self.input_pts is not None:
fig.add_trace(go.Scatter3d(
x=self.input_pts[:, 0], y=self.input_pts[:, 1], z=self.input_pts[:, 2],
mode='markers', marker=dict(size=1, color='gray', opacity=0.5), name='Input Points'
))
colors = ['aggrnyl', 'agsunset', 'algae', 'amp', 'armyrose', 'balance',
'blackbody', 'bluered', 'blues', 'blugrn', 'bluyl', 'brbg']
for i, cluster in enumerate(self.clusters):
color = colors[i]
candidate_pose = self.candidate_matrix_poses[i]
origin_candidate = candidate_pose[:3, 3]
z_axis_candidate = candidate_pose[:3, 2]
for pose in cluster:
origin = pose[:3, 3]
z_axis = pose[:3, 2]
fig.add_trace(go.Cone(
x=[origin[0]], y=[origin[1]], z=[origin[2]],
u=[z_axis[0]], v=[z_axis[1]], w=[z_axis[2]],
colorscale=color,
sizemode="absolute", sizeref=0.05, anchor="tail", showscale=False
))
fig.add_trace(go.Cone(
x=[origin_candidate[0]], y=[origin_candidate[1]], z=[origin_candidate[2]],
u=[z_axis_candidate[0]], v=[z_axis_candidate[1]], w=[z_axis_candidate[2]],
colorscale=color,
sizemode="absolute", sizeref=0.1, anchor="tail", showscale=False
))
fig.update_layout(
title="Clustered Poses and Input Points",
scene=dict(
xaxis_title='X',
yaxis_title='Y',
zaxis_title='Z'
),
margin=dict(l=0, r=0, b=0, t=40),
scene_camera=dict(eye=dict(x=1.25, y=1.25, z=1.25))
)
fig.show()
if __name__ == "__main__":
step = 0
raw_predict_result = np.load(f"inference_result_pack/inference_result_pack/{step}/all_pred_pose_9d.npy")
input_pts = np.loadtxt(f"inference_result_pack/inference_result_pack/{step}/input_pts.txt")
print(raw_predict_result.shape)
predict_result = PredictResult(raw_predict_result, input_pts, cluster_params=dict(eps=0.25, min_samples=3))
print(predict_result)
print(len(predict_result.candidate_matrix_poses))
print(predict_result.distance_matrix)
#import ipdb; ipdb.set_trace()
predict_result.visualize()

View File

@ -0,0 +1,115 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: ab_global_pts_and_local_pose
root_dir: "experiments"
epoch: 200 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_test
blender_script_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"
output_dir: "/media/hofee/data/project/exp/new_ab_global_pts_and_local_pose"
pipeline: nbv_reconstruction_pipeline_global
voxel_size: 0.003
min_new_area: 1.0
dataset:
OmniObject3d_test:
root_dir: "/media/hofee/repository/final_test_set/preprocessed_dataset"
model_dir: "/media/hofee/data/data/target/target_formulated_view"
source: seq_reconstruction_dataset_preprocessed
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,115 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: ab_global_pts_and_local_pose
root_dir: "experiments"
epoch: 200 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_test
blender_script_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"
output_dir: "/media/hofee/data/project/exp/new_ab_global_only"
pipeline: nbv_reconstruction_pipeline_global_only
voxel_size: 0.003
min_new_area: 1.0
dataset:
OmniObject3d_test:
source: seq_reconstruction_dataset_preprocessed
root_dir: /media/hofee/repository/final_test_set/preprocessed_dataset
model_dir: /media/hofee/data/data/target/target_formulated_view
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,117 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: ab_global_pts_and_local_pts_pose
root_dir: "experiments"
epoch: 200 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_test
blender_script_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"
output_dir: "/media/hofee/data/project/exp/new_ab_global_pts_and_local_pts_pose"
pipeline: nbv_reconstruction_pipeline_local
voxel_size: 0.003
min_new_area: 1.0
dataset:
OmniObject3d_test:
root_dir: "/media/hofee/repository/final_test_set/preprocessed_dataset"
model_dir: "/media/hofee/data/data/target/target_formulated_view"
source: seq_reconstruction_dataset_preprocessed
# split_file: "C:\\Document\\Datasets\\data_list\\OmniObject3d_test.txt"
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 1280
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,129 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: ab_local_only
root_dir: "experiments"
epoch: 200 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_test
blender_script_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"
output_dir: "/media/hofee/data/project/exp/new_ab_local_only"
pipeline: nbv_reconstruction_pipeline_local_only
voxel_size: 0.003
min_new_area: 1.0
dataset:
# OmniObject3d_train:
# root_dir: "C:\\Document\\Datasets\\inference_test1"
# model_dir: "C:\\Document\\Datasets\\scaled_object_meshes"
# source: seq_reconstruction_dataset_preprocessed
# split_file: "C:\\Document\\Datasets\\data_list\\sample.txt"
# type: test
# filter_degree: 75
# ratio: 1
# batch_size: 1
# num_workers: 12
# pts_num: 8192
# load_from_preprocess: True
OmniObject3d_test:
root_dir: "/media/hofee/repository/final_test_set/preprocessed_dataset"
model_dir: "/media/hofee/data/data/target/target_formulated_view"
source: seq_reconstruction_dataset_preprocessed
# split_file: "C:\\Document\\Datasets\\data_list\\OmniObject3d_test.txt"
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 1280
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 1024
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,36 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: simulation_debug
root_dir: "experiments"
simulation:
robot:
urdf_path: "assets/franka_panda/panda.urdf"
initial_position: [0, 0, 0] # 机械臂基座位置
initial_orientation: [0, 0, 0] # 机械臂基座朝向(欧拉角)
turntable:
radius: 0.3 # 转盘半径(米)
height: 0.1 # 转盘高度
center_position: [0.8, 0, 0.4]
target:
obj_dir: /media/hofee/data/project/python/nbv_reconstruction/nbv_reconstruction/assets/object_meshes
obj_name: "google_scan-box_0185"
scale: 1.0 # 缩放系数
mass: 0.1 # 质量(kg)
rgba_color: [0.8, 0.8, 0.8, 1.0] # 目标物体颜色
camera:
width: 640
height: 480
fov: 40
near: 0.01
far: 5.0
displaytable:

View File

@ -0,0 +1,22 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: debug
root_dir: "experiments"
split:
root_dir: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/scenes"
type: "unseen_instance" # "unseen_category"
datasets:
OmniObject3d_train:
path: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/OmniObject3d_train.txt"
ratio: 0.9
OmniObject3d_test:
path: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/OmniObject3d_test.txt"
ratio: 0.1

View File

@ -0,0 +1,27 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: debug
root_dir: "experiments"
generate:
voxel_threshold: 0.003
overlap_area_threshold: 30
compute_with_normal: False
scan_points_threshold: 10
overwrite: False
seq_num: 10
dataset_list:
- OmniObject3d
datasets:
OmniObject3d:
root_dir: /media/hofee/data/data/test_bottle/view
from: 0
to: -1 # ..-1 means end

View File

@ -0,0 +1,105 @@
runner:
general:
seed: 1
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
parallel: False
experiment:
name: debug
root_dir: "experiments"
use_checkpoint: False
epoch: 600 # -1 stands for last epoch
max_epochs: 5000
save_checkpoint_interval: 1
test_first: True
train:
optimizer:
type: Adam
lr: 0.0001
losses:
- gf_loss
dataset: OmniObject3d_train
test:
frequency: 3 # test frequency
dataset_list:
- OmniObject3d_test
pipeline: nbv_reconstruction_pipeline
dataset:
OmniObject3d_train:
root_dir: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/scenes"
model_dir: "/media/hofee/data/data/scaled_object_meshes"
source: nbv_reconstruction_dataset
split_file: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/OmniObject3d_train.txt"
type: train
ratio: 1
batch_size: 1
num_workers: 12
pts_num: 4096
load_from_preprocess: True
OmniObject3d_test:
root_dir: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/scenes"
model_dir: "/media/hofee/data/data/scaled_object_meshes"
source: nbv_reconstruction_dataset
split_file: "/media/hofee/data/project/python/nbv_reconstruction/sample_for_training/OmniObject3d_train.txt"
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 4096
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline:
pts_encoder: pointnet_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
transformer_seq_encoder:
pts_embed_dim: 1024
pose_embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 2048
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 3072
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
loss_function:
gf_loss:
evaluation_method:
pose_diff:
coverage_rate_increase:
renderer_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"

View File

@ -0,0 +1,130 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: uncertainty_guide_evaluation
root_dir: "experiments"
epoch: 200 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_test
blender_script_path: "/media/hofee/data/project/python/nbv_reconstruction/blender/data_renderer.py"
output_dir: "/media/hofee/data/project/exp/old_uncertainty_guide_evaluation"
output_data_root: "/media/hofee/repository/code/nbv_rec_uncertainty_guide/output/reconstruction"
pipeline: nbv_reconstruction_pipeline_global_only
voxel_size: 0.003
min_new_area: 1.0
dataset:
# OmniObject3d_train:
# root_dir: "C:\\Document\\Datasets\\inference_test1"
# model_dir: "C:\\Document\\Datasets\\scaled_object_meshes"
# source: seq_reconstruction_dataset_preprocessed
# split_file: "C:\\Document\\Datasets\\data_list\\sample.txt"
# type: test
# filter_degree: 75
# ratio: 1
# batch_size: 1
# num_workers: 12
# pts_num: 8192
# load_from_preprocess: True
OmniObject3d_test:
root_dir: "/media/hofee/data/data/new_testset_output"
model_dir: "/media/hofee/data/data/scaled_object_meshes"
source: seq_reconstruction_dataset_preprocessed
# split_file: "C:\\Document\\Datasets\\data_list\\OmniObject3d_test.txt"
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,52 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: 0,1,2,3,4,5,6,7
experiment:
name: debug
root_dir: experiments
generate:
port: 5002
from: 0
to: 50 # -1 means all
object_dir: /media/hofee/data/data/test_bottle/bottle_mesh
table_model_path: /media/hofee/data/data/others/table.obj
output_dir: /media/hofee/data/data/test_bottle/view
binocular_vision: true
plane_size: 10
max_views: 512
min_views: 128
random_view_ratio: 0.002
min_cam_table_included_degree: 20
max_diag: 0.7
min_diag: 0.01
random_config:
display_table:
min_height: 0.05
max_height: 0.15
min_radius: 0.2
max_radius: 0.3
display_object:
min_x: 0
max_x: 0.05
min_y: 0
max_y: 0.05
min_z: 0.01
max_z: 0.01
random_rotation_ratio: 0.0
random_objects:
num: 4
cluster: 0.9
light_and_camera_config:
Camera:
near_plane: 0.01
far_plane: 5
fov_vertical: 25
resolution: [640,400]
eye_distance: 0.10
eye_angle: 25
Light:
location: [0,0,3.5]
orientation: [0,0,0]
power: 150

View File

@ -0,0 +1,92 @@
runner:
general:
seed: 1
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: w_gf_wo_lf_full_debug
root_dir: "experiments"
epoch: 1 # -1 stands for last epoch
test:
dataset_list:
- OmniObject3d_train
blender_script_path: ""
output_dir: ""
pipeline: nbv_reconstruction_global_pts_pipeline
dataset:
OmniObject3d_train:
root_dir: "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy"
model_dir: "/home/data/hofee/project/nbv_rec/data/scaled_object_meshes"
source: seq_nbv_reconstruction_dataset
split_file: "/home/data/hofee/project/nbv_rec/data/OmniObject3d_train.txt"
type: test
filter_degree: 75
ratio: 1
batch_size: 1
num_workers: 12
pts_num: 4096
load_from_preprocess: True
pipeline:
nbv_reconstruction_local_pts_pipeline:
modules:
pts_encoder: pointnet_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: False
nbv_reconstruction_global_pts_pipeline:
modules:
pts_encoder: pointnet_encoder
pose_seq_encoder: transformer_pose_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
transformer_seq_encoder:
pts_embed_dim: 1024
pose_embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 2048
transformer_pose_seq_encoder:
pose_embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256

View File

@ -0,0 +1,53 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: train_ab_global_only
root_dir: "experiments"
epoch: -1 # -1 stands for last epoch
pipeline: nbv_reconstruction_pipeline
voxel_size: 0.003
pipeline:
nbv_reconstruction_pipeline:
modules:
pts_encoder: pointnet_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
transformer_seq_encoder:
embed_dim: 256
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64

View File

@ -0,0 +1,22 @@
runner:
general:
seed: 0
device: cpu
cuda_visible_devices: "0,1,2,3,4,5,6,7"
experiment:
name: debug
root_dir: "experiments"
split: #
root_dir: "/data/hofee/data/packed_preprocessed_data"
type: "unseen_instance" # "unseen_category"
datasets:
OmniObject3d_train:
path: "/data/hofee/data/OmniObject3d_train.txt"
ratio: 0.9
OmniObject3d_test:
path: "/data/hofee/data/OmniObject3d_test.txt"
ratio: 0.1

View File

@ -0,0 +1,167 @@
runner:
general:
seed: 0
device: cuda
cuda_visible_devices: "2"
parallel: False
experiment:
name: newtrain_mlp
root_dir: "experiments"
use_checkpoint: False
epoch: -1 # -1 stands for last epoch
max_epochs: 5000
save_checkpoint_interval: 1
test_first: False
train:
optimizer:
type: Adam
lr: 0.0001
losses:
- mse_loss
dataset: OmniObject3d_train
test:
frequency: 3 # test frequency
dataset_list:
- OmniObject3d_test
- OmniObject3d_val
pipeline: nbv_reconstruction_pipeline_mlp
dataset:
OmniObject3d_train:
root_dir: "/data/hofee/data/new_full_data"
model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset
split_file: "/data/hofee/data/new_full_data_list/new_OmniObject3d_train.txt"
type: train
cache: True
ratio: 1
batch_size: 24
num_workers: 128
pts_num: 8192
load_from_preprocess: True
OmniObject3d_test:
root_dir: "/data/hofee/data/new_full_data"
model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset
split_file: "/data/hofee/data/new_full_data_list/new_OmniObject3d_test.txt"
type: test
cache: True
filter_degree: 75
eval_list:
- pose_diff
ratio: 1
batch_size: 32
num_workers: 12
pts_num: 8192
load_from_preprocess: True
OmniObject3d_val:
root_dir: "/data/hofee/data/new_full_data"
model_dir: "../data/scaled_object_meshes"
source: nbv_reconstruction_dataset
split_file: "/data/hofee/data/new_full_data_list/new_OmniObject3d_train.txt"
type: test
cache: True
filter_degree: 75
eval_list:
- pose_diff
ratio: 0.1
batch_size: 32
num_workers: 12
pts_num: 8192
load_from_preprocess: True
pipeline:
nbv_reconstruction_pipeline_local:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_local_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_global_only:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: gf_view_finder
eps: 1e-5
global_scanned_feat: True
nbv_reconstruction_pipeline_mlp:
modules:
pts_encoder: pointnet++_encoder
seq_encoder: transformer_seq_encoder
pose_encoder: pose_encoder
view_finder: mlp_view_finder
eps: 1e-5
global_scanned_feat: True
module:
pointnet_encoder:
in_dim: 3
out_dim: 1024
global_feat: True
feature_transform: False
pointnet++_encoder:
in_dim: 3
transformer_seq_encoder:
embed_dim: 1280
num_heads: 4
ffn_dim: 256
num_layers: 3
output_dim: 1024
gf_view_finder:
t_feat_dim: 128
pose_feat_dim: 256
main_feat_dim: 2048
regression_head: Rx_Ry_and_T
pose_mode: rot_matrix
per_point_feature: False
sample_mode: ode
sampling_steps: 500
sde_mode: ve
pose_encoder:
pose_dim: 9
out_dim: 256
pts_num_encoder:
out_dim: 64
loss_function:
gf_loss:
mse_loss:
evaluation_method:
pose_diff:
coverage_rate_increase:
renderer_path: "../blender/data_renderer.py"

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,85 @@
import torch
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline_global_only")
class NBVReconstructionGlobalPointsOnlyPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionGlobalPointsOnlyPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pts_encoder"])
self.pose_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pose_encoder"])
self.view_finder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["view_finder"])
self.eps = float(self.config["eps"])
self.enable_global_scanned_feat = self.config["global_scanned_feat"]
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def pertube_data(self, gt_delta_9d):
bs = gt_delta_9d.shape[0]
random_t = torch.rand(bs, device=gt_delta_9d.device) * (1. - self.eps) + self.eps
random_t = random_t.unsqueeze(-1)
mu, std = self.view_finder.marginal_prob(gt_delta_9d, random_t)
std = std.view(-1, 1)
z = torch.randn_like(gt_delta_9d)
perturbed_x = mu + z * std
target_score = - z * std / (std ** 2)
return perturbed_x, random_t, target_score, std
def forward_train(self, data):
main_feat = self.get_main_feat(data)
''' get std '''
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data(best_to_world_pose_9d_batch)
input_data = {
"sampled_pose": perturbed_x,
"t": random_t,
"main_feat": main_feat,
}
estimated_score = self.view_finder(input_data)
output = {
"estimated_score": estimated_score,
"target_score": target_score,
"std": std
}
return output
def forward_test(self,data):
main_feat = self.get_main_feat(data)
repeat_num = data.get("repeat_num", 50)
main_feat = main_feat.repeat(repeat_num, 1)
estimated_delta_rot_9d, in_process_sample = self.view_finder.next_best_view(
main_feat
)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
"in_process_sample": in_process_sample,
}
return result
def get_main_feat(self, data):
combined_scanned_pts_batch = data['combined_scanned_pts']
global_scanned_feat = self.pts_encoder.encode_points(combined_scanned_pts_batch)
main_feat = global_scanned_feat
if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True)
return main_feat

View File

@ -0,0 +1,95 @@
import torch
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline_local_only")
class NBVReconstructionLocalPointsOnlyPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionLocalPointsOnlyPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pts_encoder"])
self.pose_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pose_encoder"])
self.seq_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["seq_encoder"])
self.view_finder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["view_finder"])
self.eps = float(self.config["eps"])
self.enable_global_scanned_feat = self.config["global_scanned_feat"]
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def pertube_data(self, gt_delta_9d):
bs = gt_delta_9d.shape[0]
random_t = torch.rand(bs, device=gt_delta_9d.device) * (1. - self.eps) + self.eps
random_t = random_t.unsqueeze(-1)
mu, std = self.view_finder.marginal_prob(gt_delta_9d, random_t)
std = std.view(-1, 1)
z = torch.randn_like(gt_delta_9d)
perturbed_x = mu + z * std
target_score = - z * std / (std ** 2)
return perturbed_x, random_t, target_score, std
def forward_train(self, data):
main_feat = self.get_main_feat(data)
''' get std '''
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data(best_to_world_pose_9d_batch)
input_data = {
"sampled_pose": perturbed_x,
"t": random_t,
"main_feat": main_feat,
}
estimated_score = self.view_finder(input_data)
output = {
"estimated_score": estimated_score,
"target_score": target_score,
"std": std
}
return output
def forward_test(self,data):
main_feat = self.get_main_feat(data)
repeat_num = data.get("repeat_num", 50)
main_feat = main_feat.repeat(repeat_num, 1)
estimated_delta_rot_9d, in_process_sample = self.view_finder.next_best_view(
main_feat
)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
"in_process_sample": in_process_sample,
}
return result
def get_main_feat(self, data):
scanned_pts_batch = data['scanned_pts']
scanned_n_to_world_pose_9d_batch = data['scanned_n_to_world_pose_9d']
device = next(self.parameters()).device
feat_seq_list = []
for scanned_pts,scanned_n_to_world_pose_9d in zip(scanned_pts_batch,scanned_n_to_world_pose_9d_batch):
scanned_pts = scanned_pts.to(device)
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device)
pts_feat = self.pts_encoder.encode_points(scanned_pts)
pose_feat = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d)
seq_feat = torch.cat([pts_feat, pose_feat], dim=-1)
feat_seq_list.append(seq_feat)
main_feat = self.seq_encoder.encode_sequence(feat_seq_list)
if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True)
return main_feat

81
core/ab_mlp_pipeline.py Normal file
View File

@ -0,0 +1,81 @@
import torch
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline_mlp")
class NBVReconstructionMLPPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionMLPPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pts_encoder"])
self.pose_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pose_encoder"])
self.seq_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["seq_encoder"])
self.view_finder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["view_finder"])
self.eps = float(self.config["eps"])
self.enable_global_scanned_feat = self.config["global_scanned_feat"]
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def forward_train(self, data):
main_feat = self.get_main_feat(data)
''' get std '''
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
input_data = {
"main_feat": main_feat,
}
pred = self.view_finder(input_data)
output = {
"pred": pred,
"gt": best_to_world_pose_9d_batch,
}
return output
def forward_test(self,data):
main_feat = self.get_main_feat(data)
estimated_delta_rot_9d, _ = self.view_finder.next_best_view(main_feat)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
}
return result
def get_main_feat(self, data):
scanned_pts_batch = data['scanned_pts']
scanned_n_to_world_pose_9d_batch = data['scanned_n_to_world_pose_9d']
device = next(self.parameters()).device
feat_seq_list = []
for scanned_pts,scanned_n_to_world_pose_9d in zip(scanned_pts_batch,scanned_n_to_world_pose_9d_batch):
scanned_pts = scanned_pts.to(device)
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device)
pts_feat = self.pts_encoder.encode_points(scanned_pts)
pose_feat = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d)
seq_feat = torch.cat([pts_feat, pose_feat], dim=-1)
feat_seq_list.append(seq_feat)
main_feat = self.seq_encoder.encode_sequence(feat_seq_list)
if self.enable_global_scanned_feat:
combined_scanned_pts_batch = data['combined_scanned_pts']
global_scanned_feat = self.pts_encoder.encode_points(combined_scanned_pts_batch)
main_feat = torch.cat([main_feat, global_scanned_feat], dim=-1)
if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True)
return main_feat

109
core/evaluation.py Normal file
View File

@ -0,0 +1,109 @@
import torch
import numpy as np
from utils.reconstruction import ReconstructionUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
from utils.render import RenderUtil
import PytorchBoot.stereotype as stereotype
import PytorchBoot.namespace as namespace
from PytorchBoot.utils.log_util import Log
@stereotype.evaluation_method("pose_diff")
class PoseDiff:
def __init__(self, _):
pass
def evaluate(self, output_list, data_list):
results = {namespace.TensorBoard.SCALAR: {}}
rot_angle_list = []
trans_dist_list = []
for output, data in zip(output_list, data_list):
gt_pose_9d = data['best_to_world_pose_9d']
pred_pose_9d = output['pred_pose_9d']
gt_rot_6d = gt_pose_9d[:, :6]
gt_trans = gt_pose_9d[:, 6:]
pred_rot_6d = pred_pose_9d[:, :6]
pred_trans = pred_pose_9d[:, 6:]
gt_rot_mat = PoseUtil.rotation_6d_to_matrix_tensor_batch(gt_rot_6d)
pred_rot_mat = PoseUtil.rotation_6d_to_matrix_tensor_batch(pred_rot_6d)
rotation_angles = PoseUtil.rotation_angle_distance(gt_rot_mat, pred_rot_mat)
rot_angle_list.extend(list(rotation_angles))
trans_dist = torch.norm(gt_trans-pred_trans, dim=1).mean().item()
trans_dist_list.append(trans_dist)
results[namespace.TensorBoard.SCALAR]["rot_diff"] = float(sum(rot_angle_list) / len(rot_angle_list))
results[namespace.TensorBoard.SCALAR]["trans_diff"] = float(sum(trans_dist_list) / len(trans_dist_list))
return results
@stereotype.evaluation_method("coverage_rate_increase")
class ConverageRateIncrease:
def __init__(self, config):
self.config = config
self.renderer_path = config["renderer_path"]
def evaluate(self, output_list, data_list):
results = {namespace.TensorBoard.SCALAR: {}}
gt_coverate_increase_list = []
pred_coverate_increase_list = []
cr_diff_list = []
for output, data in zip(output_list, data_list):
scanned_cr = data['scanned_coverage_rate']
gt_cr = data["best_coverage_rate"]
scene_path_list = data['scene_path']
model_points_normals_list = data['model_points_normals']
scanned_view_pts_list = data['scanned_target_pts_list']
pred_pose_9ds = output['pred_pose_9d']
nO_to_nL_pose_batch = data["nO_to_nL_pose"]
voxel_threshold_list = data["voxel_threshold"]
filter_degree_list = data["filter_degree"]
first_frame_to_world = data["first_frame_to_world"]
pred_n_to_world_pose_mats = torch.eye(4, device=pred_pose_9ds.device).unsqueeze(0).repeat(pred_pose_9ds.shape[0], 1, 1)
pred_n_to_world_pose_mats[:,:3,:3] = PoseUtil.rotation_6d_to_matrix_tensor_batch(pred_pose_9ds[:, :6])
pred_n_to_world_pose_mats[:,:3,3] = pred_pose_9ds[:, 6:]
pred_n_to_world_pose_mats = torch.matmul(first_frame_to_world, pred_n_to_world_pose_mats)
for idx in range(len(scanned_cr)):
model_points_normals = model_points_normals_list[idx]
scanned_view_pts = scanned_view_pts_list[idx]
voxel_threshold = voxel_threshold_list[idx]
model_pts = model_points_normals[:,:3]
down_sampled_model_pts = PtsUtil.voxel_downsample_point_cloud(model_pts, voxel_threshold)
old_scanned_cr = self.compute_coverage_rate(scanned_view_pts, None, down_sampled_model_pts, threshold=voxel_threshold)
gt_coverate_increase_list.append(gt_cr[idx]-old_scanned_cr)
scene_path = scene_path_list[idx]
pred_pose = pred_n_to_world_pose_mats[idx]
filter_degree = filter_degree_list[idx]
nO_to_nL_pose = nO_to_nL_pose_batch[idx]
try:
new_pts, _ = RenderUtil.render_pts(pred_pose, scene_path, self.renderer_path, model_points_normals, voxel_threshold=voxel_threshold, filter_degree=filter_degree, nO_to_nL_pose=nO_to_nL_pose)
pred_cr = self.compute_coverage_rate(scanned_view_pts, new_pts, down_sampled_model_pts, threshold=voxel_threshold)
except Exception as e:
Log.warning(f"Error in scene {scene_path}, {e}")
pred_cr = old_scanned_cr
pred_coverate_increase_list.append(pred_cr-old_scanned_cr)
cr_diff_list.append(gt_cr[idx]-pred_cr)
results[namespace.TensorBoard.SCALAR]["gt_cr_increase"] = float(sum(gt_coverate_increase_list) / len(gt_coverate_increase_list))
results[namespace.TensorBoard.SCALAR]["pred_cr_increase"] = float(sum(pred_coverate_increase_list) / len(pred_coverate_increase_list))
results[namespace.TensorBoard.SCALAR]["cr_diff"] = float(sum(cr_diff_list) / len(cr_diff_list))
return results
def compute_coverage_rate(self, scanned_view_pts, new_pts, model_pts, threshold=0.005):
if new_pts is not None:
new_scanned_view_pts = scanned_view_pts + [new_pts]
else:
new_scanned_view_pts = scanned_view_pts
combined_point_cloud = np.vstack(new_scanned_view_pts)
down_sampled_combined_point_cloud = PtsUtil.voxel_downsample_point_cloud(combined_point_cloud,threshold)
return ReconstructionUtil.compute_coverage_rate(model_pts, down_sampled_combined_point_cloud, threshold)

View File

@ -0,0 +1,98 @@
import torch
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline_global")
class NBVReconstructionGlobalPointsPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionGlobalPointsPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pts_encoder"])
self.pose_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pose_encoder"])
self.seq_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["seq_encoder"])
self.view_finder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["view_finder"])
self.eps = float(self.config["eps"])
self.enable_global_scanned_feat = self.config["global_scanned_feat"]
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def pertube_data(self, gt_delta_9d):
bs = gt_delta_9d.shape[0]
random_t = torch.rand(bs, device=gt_delta_9d.device) * (1. - self.eps) + self.eps
random_t = random_t.unsqueeze(-1)
mu, std = self.view_finder.marginal_prob(gt_delta_9d, random_t)
std = std.view(-1, 1)
z = torch.randn_like(gt_delta_9d)
perturbed_x = mu + z * std
target_score = - z * std / (std ** 2)
return perturbed_x, random_t, target_score, std
def forward_train(self, data):
main_feat = self.get_main_feat(data)
''' get std '''
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data(best_to_world_pose_9d_batch)
input_data = {
"sampled_pose": perturbed_x,
"t": random_t,
"main_feat": main_feat,
}
estimated_score = self.view_finder(input_data)
output = {
"estimated_score": estimated_score,
"target_score": target_score,
"std": std
}
return output
def forward_test(self,data):
main_feat = self.get_main_feat(data)
repeat_num = data.get("repeat_num", 50)
main_feat = main_feat.repeat(repeat_num, 1)
estimated_delta_rot_9d, in_process_sample = self.view_finder.next_best_view(
main_feat
)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
"in_process_sample": in_process_sample,
}
return result
def get_main_feat(self, data):
scanned_n_to_world_pose_9d_batch = data['scanned_n_to_world_pose_9d']
device = next(self.parameters()).device
feat_seq_list = []
for scanned_n_to_world_pose_9d in scanned_n_to_world_pose_9d_batch:
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device)
feat_seq_list.append(self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d))
main_feat = self.seq_encoder.encode_sequence(feat_seq_list)
combined_scanned_pts_batch = data['combined_scanned_pts']
global_scanned_feat = self.pts_encoder.encode_points(combined_scanned_pts_batch)
main_feat = torch.cat([main_feat, global_scanned_feat], dim=-1)
if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True)
return main_feat

View File

@ -0,0 +1,99 @@
import torch
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline_local")
class NBVReconstructionLocalPointsPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionLocalPointsPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pts_encoder"])
self.pose_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["pose_encoder"])
self.seq_encoder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["seq_encoder"])
self.view_finder = ComponentFactory.create(namespace.Stereotype.MODULE, self.module_config["view_finder"])
self.eps = float(self.config["eps"])
self.enable_global_scanned_feat = self.config["global_scanned_feat"]
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def pertube_data(self, gt_delta_9d):
bs = gt_delta_9d.shape[0]
random_t = torch.rand(bs, device=gt_delta_9d.device) * (1. - self.eps) + self.eps
random_t = random_t.unsqueeze(-1)
mu, std = self.view_finder.marginal_prob(gt_delta_9d, random_t)
std = std.view(-1, 1)
z = torch.randn_like(gt_delta_9d)
perturbed_x = mu + z * std
target_score = - z * std / (std ** 2)
return perturbed_x, random_t, target_score, std
def forward_train(self, data):
main_feat = self.get_main_feat(data)
''' get std '''
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data(best_to_world_pose_9d_batch)
input_data = {
"sampled_pose": perturbed_x,
"t": random_t,
"main_feat": main_feat,
}
estimated_score = self.view_finder(input_data)
output = {
"estimated_score": estimated_score,
"target_score": target_score,
"std": std
}
return output
def forward_test(self,data):
main_feat = self.get_main_feat(data)
repeat_num = data.get("repeat_num", 50)
main_feat = main_feat.repeat(repeat_num, 1)
estimated_delta_rot_9d, in_process_sample = self.view_finder.next_best_view(
main_feat
)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
"in_process_sample": in_process_sample,
}
return result
def get_main_feat(self, data):
scanned_pts_batch = data['scanned_pts']
scanned_n_to_world_pose_9d_batch = data['scanned_n_to_world_pose_9d']
device = next(self.parameters()).device
feat_seq_list = []
for scanned_pts,scanned_n_to_world_pose_9d in zip(scanned_pts_batch,scanned_n_to_world_pose_9d_batch):
scanned_pts = scanned_pts.to(device)
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d.to(device)
pts_feat = self.pts_encoder.encode_points(scanned_pts)
pose_feat = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d)
seq_feat = torch.cat([pts_feat, pose_feat], dim=-1)
feat_seq_list.append(seq_feat)
main_feat = self.seq_encoder.encode_sequence(feat_seq_list)
if self.enable_global_scanned_feat:
combined_scanned_pts_batch = data['combined_scanned_pts']
global_scanned_feat = self.pts_encoder.encode_points(combined_scanned_pts_batch)
main_feat = torch.cat([main_feat, global_scanned_feat], dim=-1)
if torch.isnan(main_feat).any():
Log.error("nan in main_feat", True)
return main_feat

27
core/loss.py Normal file
View File

@ -0,0 +1,27 @@
import torch
import PytorchBoot.stereotype as stereotype
@stereotype.loss_function("gf_loss")
class GFLoss:
def __init__(self, _):
pass
def compute(self, output, _):
estimated_score = output['estimated_score']
target_score = output['target_score']
std = output['std']
bs = estimated_score.shape[0]
loss_weighting = std ** 2
loss = torch.mean(torch.sum((loss_weighting * (estimated_score - target_score) ** 2).view(bs, -1), dim=-1))
return loss
@stereotype.loss_function("mse_loss")
class MSELoss:
def __init__(self,_):
pass
def compute(self, output, _):
pred_pose = output["pred"]
gt_pose = output["gt"]
loss = torch.mean(torch.sum((pred_pose - gt_pose) ** 2, dim=-1))
return loss

282
core/nbv_dataset.py Normal file
View File

@ -0,0 +1,282 @@
import numpy as np
from PytorchBoot.dataset import BaseDataset
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.config import ConfigManager
from PytorchBoot.utils.log_util import Log
import torch
import os
import sys
import time
sys.path.append(r"/data/hofee/project/nbv_rec/nbv_reconstruction")
from utils.data_load import DataLoadUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
@stereotype.dataset("nbv_reconstruction_dataset")
class NBVReconstructionDataset(BaseDataset):
def __init__(self, config):
super(NBVReconstructionDataset, self).__init__(config)
self.config = config
self.root_dir = config["root_dir"]
self.split_file_path = config["split_file"]
self.scene_name_list = self.load_scene_name_list()
self.datalist = self.get_datalist()
self.pts_num = config["pts_num"]
self.type = config["type"]
self.cache = config.get("cache")
self.load_from_preprocess = config.get("load_from_preprocess", False)
if self.type == namespace.Mode.TEST:
#self.model_dir = config["model_dir"]
self.filter_degree = config["filter_degree"]
if self.type == namespace.Mode.TRAIN:
scale_ratio = 1
self.datalist = self.datalist*scale_ratio
if self.cache:
expr_root = ConfigManager.get("runner", "experiment", "root_dir")
expr_name = ConfigManager.get("runner", "experiment", "name")
self.cache_dir = os.path.join(expr_root, expr_name, "cache")
# self.preprocess_cache()
def load_scene_name_list(self):
scene_name_list = []
with open(self.split_file_path, "r") as f:
for line in f:
scene_name = line.strip()
scene_name_list.append(scene_name)
return scene_name_list
def get_datalist(self):
datalist = []
for scene_name in self.scene_name_list:
seq_num = DataLoadUtil.get_label_num(self.root_dir, scene_name)
scene_max_coverage_rate = 0
max_coverage_rate_list = []
for seq_idx in range(seq_num):
label_path = DataLoadUtil.get_label_path(
self.root_dir, scene_name, seq_idx
)
label_data = DataLoadUtil.load_label(label_path)
max_coverage_rate = label_data["max_coverage_rate"]
if max_coverage_rate > scene_max_coverage_rate:
scene_max_coverage_rate = max_coverage_rate
max_coverage_rate_list.append(max_coverage_rate)
if max_coverage_rate_list:
mean_coverage_rate = np.mean(max_coverage_rate_list)
for seq_idx in range(seq_num):
label_path = DataLoadUtil.get_label_path(
self.root_dir, scene_name, seq_idx
)
label_data = DataLoadUtil.load_label(label_path)
if max_coverage_rate_list[seq_idx] > mean_coverage_rate - 0.1:
for data_pair in label_data["data_pairs"]:
scanned_views = data_pair[0]
next_best_view = data_pair[1]
datalist.append(
{
"scanned_views": scanned_views,
"next_best_view": next_best_view,
"seq_max_coverage_rate": max_coverage_rate,
"scene_name": scene_name,
"label_idx": seq_idx,
"scene_max_coverage_rate": scene_max_coverage_rate,
}
)
return datalist
def preprocess_cache(self):
Log.info("preprocessing cache...")
for item_idx in range(len(self.datalist)):
self.__getitem__(item_idx)
Log.success("finish preprocessing cache.")
def load_from_cache(self, scene_name, curr_frame_idx):
cache_name = f"{scene_name}_{curr_frame_idx}.txt"
cache_path = os.path.join(self.cache_dir, cache_name)
if os.path.exists(cache_path):
data = np.loadtxt(cache_path)
return data
else:
return None
def save_to_cache(self, scene_name, curr_frame_idx, data):
cache_name = f"{scene_name}_{curr_frame_idx}.txt"
cache_path = os.path.join(self.cache_dir, cache_name)
try:
np.savetxt(cache_path, data)
except Exception as e:
Log.error(f"Save cache failed: {e}")
def voxel_downsample_with_mapping(self, point_cloud, voxel_size=0.003):
voxel_indices = np.floor(point_cloud / voxel_size).astype(np.int32)
unique_voxels, inverse, counts = np.unique(voxel_indices, axis=0, return_inverse=True, return_counts=True)
idx_sort = np.argsort(inverse)
idx_unique = idx_sort[np.cumsum(counts)-counts]
downsampled_points = point_cloud[idx_unique]
return downsampled_points, inverse
def __getitem__(self, index):
data_item_info = self.datalist[index]
scanned_views = data_item_info["scanned_views"]
nbv = data_item_info["next_best_view"]
max_coverage_rate = data_item_info["seq_max_coverage_rate"]
scene_name = data_item_info["scene_name"]
(
scanned_views_pts,
scanned_coverages_rate,
scanned_n_to_world_pose,
) = ([], [], [])
#start_time = time.time()
start_indices = [0]
total_points = 0
for view in scanned_views:
frame_idx = view[0]
coverage_rate = view[1]
view_path = DataLoadUtil.get_path(self.root_dir, scene_name, frame_idx)
cam_info = DataLoadUtil.load_cam_info(view_path, binocular=True)
n_to_world_pose = cam_info["cam_to_world"]
target_point_cloud = (
DataLoadUtil.load_from_preprocessed_pts(view_path)
)
downsampled_target_point_cloud = PtsUtil.random_downsample_point_cloud(
target_point_cloud, self.pts_num
)
scanned_views_pts.append(downsampled_target_point_cloud)
scanned_coverages_rate.append(coverage_rate)
n_to_world_6d = PoseUtil.matrix_to_rotation_6d_numpy(
np.asarray(n_to_world_pose[:3, :3])
)
n_to_world_trans = n_to_world_pose[:3, 3]
n_to_world_9d = np.concatenate([n_to_world_6d, n_to_world_trans], axis=0)
scanned_n_to_world_pose.append(n_to_world_9d)
total_points += len(downsampled_target_point_cloud)
start_indices.append(total_points)
#end_time = time.time()
#Log.info(f"load data time: {end_time - start_time}")
nbv_idx, nbv_coverage_rate = nbv[0], nbv[1]
nbv_path = DataLoadUtil.get_path(self.root_dir, scene_name, nbv_idx)
cam_info = DataLoadUtil.load_cam_info(nbv_path)
best_frame_to_world = cam_info["cam_to_world"]
best_to_world_6d = PoseUtil.matrix_to_rotation_6d_numpy(
np.asarray(best_frame_to_world[:3, :3])
)
best_to_world_trans = best_frame_to_world[:3, 3]
best_to_world_9d = np.concatenate(
[best_to_world_6d, best_to_world_trans], axis=0
)
combined_scanned_views_pts = np.concatenate(scanned_views_pts, axis=0)
voxel_downsampled_combined_scanned_pts_np, inverse = self.voxel_downsample_with_mapping(combined_scanned_views_pts, 0.003)
random_downsampled_combined_scanned_pts_np, random_downsample_idx = PtsUtil.random_downsample_point_cloud(voxel_downsampled_combined_scanned_pts_np, self.pts_num, require_idx=True)
# all_idx_unique = np.arange(len(voxel_downsampled_combined_scanned_pts_np))
# all_random_downsample_idx = all_idx_unique[random_downsample_idx]
# scanned_pts_mask = []
# for idx, start_idx in enumerate(start_indices):
# if idx == len(start_indices) - 1:
# break
# end_idx = start_indices[idx+1]
# view_inverse = inverse[start_idx:end_idx]
# view_unique_downsampled_idx = np.unique(view_inverse)
# view_unique_downsampled_idx_set = set(view_unique_downsampled_idx)
# mask = np.array([idx in view_unique_downsampled_idx_set for idx in all_random_downsample_idx])
# #scanned_pts_mask.append(mask)
data_item = {
"scanned_pts": np.asarray(scanned_views_pts, dtype=np.float32), # Ndarray(S x Nv x 3)
"combined_scanned_pts": np.asarray(random_downsampled_combined_scanned_pts_np, dtype=np.float32), # Ndarray(N x 3)
#"scanned_pts_mask": np.asarray(scanned_pts_mask, dtype=np.bool), # Ndarray(N)
"scanned_coverage_rate": scanned_coverages_rate, # List(S): Float, range(0, 1)
"scanned_n_to_world_pose_9d": np.asarray(scanned_n_to_world_pose, dtype=np.float32), # Ndarray(S x 9)
"best_coverage_rate": nbv_coverage_rate, # Float, range(0, 1)
"best_to_world_pose_9d": np.asarray(best_to_world_9d, dtype=np.float32), # Ndarray(9)
"seq_max_coverage_rate": max_coverage_rate, # Float, range(0, 1)
"scene_name": scene_name, # String
}
return data_item
def __len__(self):
return len(self.datalist)
def get_collate_fn(self):
def collate_fn(batch):
collate_data = {}
''' ------ Varialbe Length ------ '''
collate_data["scanned_pts"] = [
torch.tensor(item["scanned_pts"]) for item in batch
]
collate_data["scanned_n_to_world_pose_9d"] = [
torch.tensor(item["scanned_n_to_world_pose_9d"]) for item in batch
]
# collate_data["scanned_pts_mask"] = [
# torch.tensor(item["scanned_pts_mask"]) for item in batch
# ]
''' ------ Fixed Length ------ '''
collate_data["best_to_world_pose_9d"] = torch.stack(
[torch.tensor(item["best_to_world_pose_9d"]) for item in batch]
)
collate_data["combined_scanned_pts"] = torch.stack(
[torch.tensor(item["combined_scanned_pts"]) for item in batch]
)
for key in batch[0].keys():
if key not in [
"scanned_pts",
"scanned_n_to_world_pose_9d",
"best_to_world_pose_9d",
"combined_scanned_pts",
"scanned_pts_mask",
]:
collate_data[key] = [item[key] for item in batch]
return collate_data
return collate_fn
# -------------- Debug ---------------- #
if __name__ == "__main__":
import torch
seed = 0
torch.manual_seed(seed)
np.random.seed(seed)
config = {
"root_dir": "/data/hofee/nbv_rec_part2_preprocessed",
"source": "nbv_reconstruction_dataset",
"split_file": "/data/hofee/data/sample.txt",
"load_from_preprocess": True,
"ratio": 0.5,
"batch_size": 2,
"filter_degree": 75,
"num_workers": 0,
"pts_num": 4096,
"type": namespace.Mode.TRAIN,
}
ds = NBVReconstructionDataset(config)
print(len(ds))
# ds.__getitem__(10)
dl = ds.get_loader(shuffle=True)
for idx, data in enumerate(dl):
data = ds.process_batch(data, "cuda:0")
print(data)
# ------ Debug Start ------
import ipdb
ipdb.set_trace()
# ------ Debug End ------

154
core/old_seq_dataset.py Normal file
View File

@ -0,0 +1,154 @@
import numpy as np
from PytorchBoot.dataset import BaseDataset
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.utils.log_util import Log
import torch
import os
import sys
sys.path.append(r"/home/data/hofee/project/nbv_rec/nbv_reconstruction")
from utils.data_load import DataLoadUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
@stereotype.dataset("old_seq_nbv_reconstruction_dataset")
class SeqNBVReconstructionDataset(BaseDataset):
def __init__(self, config):
super(SeqNBVReconstructionDataset, self).__init__(config)
self.type = config["type"]
if self.type != namespace.Mode.TEST:
Log.error("Dataset <seq_nbv_reconstruction_dataset> Only support test mode", terminate=True)
self.config = config
self.root_dir = config["root_dir"]
self.split_file_path = config["split_file"]
self.scene_name_list = self.load_scene_name_list()
self.datalist = self.get_datalist()
self.pts_num = config["pts_num"]
self.model_dir = config["model_dir"]
self.filter_degree = config["filter_degree"]
self.load_from_preprocess = config.get("load_from_preprocess", False)
def load_scene_name_list(self):
scene_name_list = []
with open(self.split_file_path, "r") as f:
for line in f:
scene_name = line.strip()
scene_name_list.append(scene_name)
return scene_name_list
def get_datalist(self):
datalist = []
for scene_name in self.scene_name_list:
seq_num = DataLoadUtil.get_label_num(self.root_dir, scene_name)
scene_max_coverage_rate = 0
scene_max_cr_idx = 0
for seq_idx in range(seq_num):
label_path = DataLoadUtil.get_label_path(self.root_dir, scene_name, seq_idx)
label_data = DataLoadUtil.load_label(label_path)
max_coverage_rate = label_data["max_coverage_rate"]
if max_coverage_rate > scene_max_coverage_rate:
scene_max_coverage_rate = max_coverage_rate
scene_max_cr_idx = seq_idx
label_path = DataLoadUtil.get_label_path(self.root_dir, scene_name, scene_max_cr_idx)
label_data = DataLoadUtil.load_label(label_path)
first_frame = label_data["best_sequence"][0]
best_seq_len = len(label_data["best_sequence"])
datalist.append({
"scene_name": scene_name,
"first_frame": first_frame,
"max_coverage_rate": scene_max_coverage_rate,
"best_seq_len": best_seq_len,
"label_idx": scene_max_cr_idx,
})
return datalist
def __getitem__(self, index):
data_item_info = self.datalist[index]
first_frame_idx = data_item_info["first_frame"][0]
first_frame_coverage = data_item_info["first_frame"][1]
max_coverage_rate = data_item_info["max_coverage_rate"]
scene_name = data_item_info["scene_name"]
first_cam_info = DataLoadUtil.load_cam_info(DataLoadUtil.get_path(self.root_dir, scene_name, first_frame_idx), binocular=True)
first_view_path = DataLoadUtil.get_path(self.root_dir, scene_name, first_frame_idx)
first_left_cam_pose = first_cam_info["cam_to_world"]
first_center_cam_pose = first_cam_info["cam_to_world_O"]
first_target_point_cloud = DataLoadUtil.load_from_preprocessed_pts(first_view_path)
first_pts_num = first_target_point_cloud.shape[0]
first_downsampled_target_point_cloud = PtsUtil.random_downsample_point_cloud(first_target_point_cloud, self.pts_num)
first_to_world_rot_6d = PoseUtil.matrix_to_rotation_6d_numpy(np.asarray(first_left_cam_pose[:3,:3]))
first_to_world_trans = first_left_cam_pose[:3,3]
first_to_world_9d = np.concatenate([first_to_world_rot_6d, first_to_world_trans], axis=0)
diag = DataLoadUtil.get_bbox_diag(self.model_dir, scene_name)
voxel_threshold = diag*0.02
first_O_to_first_L_pose = np.dot(np.linalg.inv(first_left_cam_pose), first_center_cam_pose)
scene_path = os.path.join(self.root_dir, scene_name)
model_points_normals = DataLoadUtil.load_points_normals(self.root_dir, scene_name)
data_item = {
"first_pts_num": np.asarray(
first_pts_num, dtype=np.int32
),
"first_pts": np.asarray([first_downsampled_target_point_cloud],dtype=np.float32),
"combined_scanned_pts": np.asarray(first_downsampled_target_point_cloud,dtype=np.float32),
"first_to_world_9d": np.asarray([first_to_world_9d],dtype=np.float32),
"scene_name": scene_name,
"max_coverage_rate": max_coverage_rate,
"voxel_threshold": voxel_threshold,
"filter_degree": self.filter_degree,
"O_to_L_pose": first_O_to_first_L_pose,
"first_frame_coverage": first_frame_coverage,
"scene_path": scene_path,
"model_points_normals": model_points_normals,
"best_seq_len": data_item_info["best_seq_len"],
"first_frame_id": first_frame_idx,
}
return data_item
def __len__(self):
return len(self.datalist)
def get_collate_fn(self):
def collate_fn(batch):
collate_data = {}
collate_data["first_pts"] = [torch.tensor(item['first_pts']) for item in batch]
collate_data["first_to_world_9d"] = [torch.tensor(item['first_to_world_9d']) for item in batch]
collate_data["combined_scanned_pts"] = torch.stack([torch.tensor(item['combined_scanned_pts']) for item in batch])
for key in batch[0].keys():
if key not in ["first_pts", "first_to_world_9d", "combined_scanned_pts"]:
collate_data[key] = [item[key] for item in batch]
return collate_data
return collate_fn
# -------------- Debug ---------------- #
if __name__ == "__main__":
import torch
seed = 0
torch.manual_seed(seed)
np.random.seed(seed)
config = {
"root_dir": "/home/data/hofee/project/nbv_rec/data/nbv_rec_data_512_preproc_npy",
"split_file": "/home/data/hofee/project/nbv_rec/data/OmniObject3d_train.txt",
"model_dir": "/home/data/hofee/project/nbv_rec/data/scaled_object_meshes",
"ratio": 0.005,
"batch_size": 2,
"filter_degree": 75,
"num_workers": 0,
"pts_num": 32684,
"type": namespace.Mode.TEST,
"load_from_preprocess": True
}
ds = SeqNBVReconstructionDataset(config)
print(len(ds))
#ds.__getitem__(10)
dl = ds.get_loader(shuffle=True)
for idx, data in enumerate(dl):
data = ds.process_batch(data, "cuda:0")
print(data)
# ------ Debug Start ------
import ipdb;ipdb.set_trace()
# ------ Debug End ------+

140
core/pipeline.py Normal file
View File

@ -0,0 +1,140 @@
import torch
import time
from torch import nn
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory.component_factory import ComponentFactory
from PytorchBoot.utils import Log
@stereotype.pipeline("nbv_reconstruction_pipeline")
class NBVReconstructionPipeline(nn.Module):
def __init__(self, config):
super(NBVReconstructionPipeline, self).__init__()
self.config = config
self.module_config = config["modules"]
self.pts_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["pts_encoder"]
)
self.pose_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["pose_encoder"]
)
self.seq_encoder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["seq_encoder"]
)
self.view_finder = ComponentFactory.create(
namespace.Stereotype.MODULE, self.module_config["view_finder"]
)
self.eps = float(self.config["eps"])
def forward(self, data):
mode = data["mode"]
if mode == namespace.Mode.TRAIN:
return self.forward_train(data)
elif mode == namespace.Mode.TEST:
return self.forward_test(data)
else:
Log.error("Unknown mode: {}".format(mode), True)
def pertube_data(self, gt_delta_9d):
bs = gt_delta_9d.shape[0]
random_t = (
torch.rand(bs, device=gt_delta_9d.device) * (1.0 - self.eps) + self.eps
)
random_t = random_t.unsqueeze(-1)
mu, std = self.view_finder.marginal_prob(gt_delta_9d, random_t)
std = std.view(-1, 1)
z = torch.randn_like(gt_delta_9d)
perturbed_x = mu + z * std
target_score = -z * std / (std**2)
return perturbed_x, random_t, target_score, std
def forward_train(self, data):
main_feat = self.get_main_feat(data)
""" get std """
best_to_world_pose_9d_batch = data["best_to_world_pose_9d"]
perturbed_x, random_t, target_score, std = self.pertube_data(
best_to_world_pose_9d_batch
)
input_data = {
"sampled_pose": perturbed_x,
"t": random_t,
"main_feat": main_feat,
}
estimated_score = self.view_finder(input_data)
output = {
"estimated_score": estimated_score,
"target_score": target_score,
"std": std,
}
return output
def forward_test(self, data):
main_feat = self.get_main_feat(data)
repeat_num = data.get("repeat_num", 1)
main_feat = main_feat.repeat(repeat_num, 1)
estimated_delta_rot_9d, in_process_sample = self.view_finder.next_best_view(
main_feat
)
result = {
"pred_pose_9d": estimated_delta_rot_9d,
"in_process_sample": in_process_sample,
}
return result
def get_main_feat(self, data):
scanned_n_to_world_pose_9d_batch = data[
"scanned_n_to_world_pose_9d"
] # List(B): Tensor(S x 9)
scanned_pts_mask_batch = data["scanned_pts_mask"] # List(B): Tensor(S x N)
scanned_pts_mask_batch = data["scanned_pts_mask"] # List(B): Tensor(N)
device = next(self.parameters()).device
embedding_list_batch = []
combined_scanned_pts_batch = data["combined_scanned_pts"] # Tensor(B x N x 3)
global_scanned_feat, per_point_feat_batch = self.pts_encoder.encode_points(
combined_scanned_pts_batch, require_per_point_feat=True
) # global_scanned_feat: Tensor(B x Dg)
batch_size = len(scanned_n_to_world_pose_9d_batch)
for i in range(batch_size):
seq_len = len(scanned_n_to_world_pose_9d_batch[i])
scanned_n_to_world_pose_9d = scanned_n_to_world_pose_9d_batch[i].to(device) # Tensor(S x 9)
scanned_pts_mask = scanned_pts_mask_batch[i] # Tensor(S x N)
per_point_feat = per_point_feat_batch[i] # Tensor(N x Dp)
partial_point_feat_seq = []
for j in range(seq_len):
partial_per_point_feat = per_point_feat[scanned_pts_mask[j]]
if partial_per_point_feat.shape[0] == 0:
partial_point_feat = torch.zeros(per_point_feat.shape[1], device=device)
else:
partial_point_feat = torch.mean(partial_per_point_feat, dim=0) # Tensor(Dp)
partial_point_feat_seq.append(partial_point_feat)
partial_point_feat_seq = torch.stack(partial_point_feat_seq, dim=0) # Tensor(S x Dp)
pose_feat_seq = self.pose_encoder.encode_pose(scanned_n_to_world_pose_9d) # Tensor(S x Dp)
seq_embedding = torch.cat([partial_point_feat_seq, pose_feat_seq], dim=-1)
embedding_list_batch.append(seq_embedding) # List(B): Tensor(S x (Dp))
seq_feat = self.seq_encoder.encode_sequence(embedding_list_batch) # Tensor(B x Ds)
main_feat = torch.cat([seq_feat, global_scanned_feat], dim=-1) # Tensor(B x (Ds+Dg))
if torch.isnan(main_feat).any():
for i in range(len(main_feat)):
if torch.isnan(main_feat[i]).any():
scanned_pts_mask = scanned_pts_mask_batch[i]
Log.info(f"scanned_pts_mask shape: {scanned_pts_mask.shape}")
Log.info(f"scanned_pts_mask sum: {scanned_pts_mask.sum()}")
import ipdb
ipdb.set_trace()
Log.error("nan in main_feat", True)
return main_feat

209
core/seq_dataset.py Normal file
View File

@ -0,0 +1,209 @@
import numpy as np
from PytorchBoot.dataset import BaseDataset
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.config import ConfigManager
from PytorchBoot.utils.log_util import Log
import torch
import os
import sys
sys.path.append(r"/media/hofee/data/project/python/nbv_reconstruction/nbv_reconstruction")
from utils.data_load import DataLoadUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
@stereotype.dataset("seq_reconstruction_dataset")
class SeqReconstructionDataset(BaseDataset):
def __init__(self, config):
super(SeqReconstructionDataset, self).__init__(config)
self.config = config
self.root_dir = config["root_dir"]
self.split_file_path = config["split_file"]
self.scene_name_list = self.load_scene_name_list()
self.datalist = self.get_datalist()
self.pts_num = config["pts_num"]
self.type = config["type"]
self.cache = config.get("cache")
self.load_from_preprocess = config.get("load_from_preprocess", False)
if self.type == namespace.Mode.TEST:
#self.model_dir = config["model_dir"]
self.filter_degree = config["filter_degree"]
if self.type == namespace.Mode.TRAIN:
scale_ratio = 1
self.datalist = self.datalist*scale_ratio
if self.cache:
expr_root = ConfigManager.get("runner", "experiment", "root_dir")
expr_name = ConfigManager.get("runner", "experiment", "name")
self.cache_dir = os.path.join(expr_root, expr_name, "cache")
# self.preprocess_cache()
def load_scene_name_list(self):
scene_name_list = []
with open(self.split_file_path, "r") as f:
for line in f:
scene_name = line.strip()
if not os.path.exists(os.path.join(self.root_dir, scene_name)):
continue
scene_name_list.append(scene_name)
return scene_name_list
def get_scene_name_list(self):
return self.scene_name_list
def get_datalist(self):
datalist = []
total = len(self.scene_name_list)
for idx, scene_name in enumerate(self.scene_name_list):
print(f"processing {scene_name} ({idx}/{total})")
scene_max_cr_idx = 0
frame_len = DataLoadUtil.get_scene_seq_length(self.root_dir, scene_name)
for i in range(10,frame_len):
path = DataLoadUtil.get_path(self.root_dir, scene_name, i)
pts = DataLoadUtil.load_from_preprocessed_pts(path, "npy")
print(pts.shape)
if pts.shape[0] == 0:
continue
else:
break
print(i)
datalist.append({
"scene_name": scene_name,
"first_frame": i,
"best_seq_len": -1,
"max_coverage_rate": 1.0,
"label_idx": scene_max_cr_idx,
})
return datalist
def preprocess_cache(self):
Log.info("preprocessing cache...")
for item_idx in range(len(self.datalist)):
self.__getitem__(item_idx)
Log.success("finish preprocessing cache.")
def load_from_cache(self, scene_name, curr_frame_idx):
cache_name = f"{scene_name}_{curr_frame_idx}.txt"
cache_path = os.path.join(self.cache_dir, cache_name)
if os.path.exists(cache_path):
data = np.loadtxt(cache_path)
return data
else:
return None
def save_to_cache(self, scene_name, curr_frame_idx, data):
cache_name = f"{scene_name}_{curr_frame_idx}.txt"
cache_path = os.path.join(self.cache_dir, cache_name)
try:
np.savetxt(cache_path, data)
except Exception as e:
Log.error(f"Save cache failed: {e}")
def seq_combined_pts(self, scene, frame_idx_list):
all_combined_pts = []
for i in frame_idx_list:
path = DataLoadUtil.get_path(self.root_dir, scene, i)
pts = DataLoadUtil.load_from_preprocessed_pts(path,"npy")
if pts.shape[0] == 0:
continue
all_combined_pts.append(pts)
all_combined_pts = np.vstack(all_combined_pts)
downsampled_all_pts = PtsUtil.voxel_downsample_point_cloud(all_combined_pts, 0.003)
return downsampled_all_pts
def __getitem__(self, index):
data_item_info = self.datalist[index]
max_coverage_rate = data_item_info["max_coverage_rate"]
best_seq_len = data_item_info["best_seq_len"]
scene_name = data_item_info["scene_name"]
(
scanned_views_pts,
scanned_coverages_rate,
scanned_n_to_world_pose,
) = ([], [], [])
view = data_item_info["first_frame"]
frame_idx = view
view_path = DataLoadUtil.get_path(self.root_dir, scene_name, frame_idx)
cam_info = DataLoadUtil.load_cam_info(view_path, binocular=True)
n_to_world_pose = cam_info["cam_to_world"]
target_point_cloud = (
DataLoadUtil.load_from_preprocessed_pts(view_path)
)
downsampled_target_point_cloud = PtsUtil.random_downsample_point_cloud(
target_point_cloud, self.pts_num
)
scanned_views_pts.append(downsampled_target_point_cloud)
n_to_world_6d = PoseUtil.matrix_to_rotation_6d_numpy(
np.asarray(n_to_world_pose[:3, :3])
)
first_left_cam_pose = cam_info["cam_to_world"]
first_center_cam_pose = cam_info["cam_to_world_O"]
first_O_to_first_L_pose = np.dot(np.linalg.inv(first_left_cam_pose), first_center_cam_pose)
n_to_world_trans = n_to_world_pose[:3, 3]
n_to_world_9d = np.concatenate([n_to_world_6d, n_to_world_trans], axis=0)
scanned_n_to_world_pose.append(n_to_world_9d)
frame_list = []
for i in range(DataLoadUtil.get_scene_seq_length(self.root_dir, scene_name)):
frame_list.append(i)
gt_pts = self.seq_combined_pts(scene_name, frame_list)
data_item = {
"first_scanned_pts": np.asarray(scanned_views_pts, dtype=np.float32), # Ndarray(S x Nv x 3)
"first_scanned_n_to_world_pose_9d": np.asarray(scanned_n_to_world_pose, dtype=np.float32), # Ndarray(S x 9)
"seq_max_coverage_rate": max_coverage_rate, # Float, range(0, 1)
"best_seq_len": best_seq_len, # Int
"scene_name": scene_name, # String
"gt_pts": gt_pts, # Ndarray(N x 3)
"scene_path": os.path.join(self.root_dir, scene_name), # String
"O_to_L_pose": first_O_to_first_L_pose,
}
return data_item
def __len__(self):
return len(self.datalist)
# -------------- Debug ---------------- #
if __name__ == "__main__":
import torch
from tqdm import tqdm
import pickle
import os
seed = 0
torch.manual_seed(seed)
np.random.seed(seed)
config = {
"root_dir": "/media/hofee/data/data/test_bottle/view",
"source": "seq_reconstruction_dataset",
"split_file": "/media/hofee/data/data/test_bottle/test_bottle.txt",
"load_from_preprocess": True,
"filter_degree": 75,
"num_workers": 0,
"pts_num": 8192,
"type": namespace.Mode.TEST,
}
output_dir = "/media/hofee/data/data/test_bottle/preprocessed_dataset"
os.makedirs(output_dir, exist_ok=True)
ds = SeqReconstructionDataset(config)
for i in tqdm(range(len(ds)), desc="processing dataset"):
output_path = os.path.join(output_dir, f"item_{i}.pkl")
item = ds.__getitem__(i)
for key, value in item.items():
if isinstance(value, np.ndarray):
item[key] = value.tolist()
#import ipdb; ipdb.set_trace()
with open(output_path, "wb") as f:
pickle.dump(item, f)

View File

@ -0,0 +1,82 @@
import numpy as np
from PytorchBoot.dataset import BaseDataset
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.config import ConfigManager
from PytorchBoot.utils.log_util import Log
import pickle
import torch
import os
import sys
sys.path.append(r"C:\Document\Local Project\nbv_rec\nbv_reconstruction")
from utils.data_load import DataLoadUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
@stereotype.dataset("seq_reconstruction_dataset_preprocessed")
class SeqReconstructionDatasetPreprocessed(BaseDataset):
def __init__(self, config):
super(SeqReconstructionDatasetPreprocessed, self).__init__(config)
self.config = config
self.root_dir = config["root_dir"]
self.real_root_dir = r"/media/hofee/repository/final_test_set/view"
self.item_list = os.listdir(self.root_dir)
def __getitem__(self, index):
data = pickle.load(open(os.path.join(self.root_dir, self.item_list[index]), "rb"))
data_item = {
"first_scanned_pts": np.asarray(data["first_scanned_pts"], dtype=np.float32), # Ndarray(S x Nv x 3)
"first_scanned_n_to_world_pose_9d": np.asarray(data["first_scanned_n_to_world_pose_9d"], dtype=np.float32), # Ndarray(S x 9)
"seq_max_coverage_rate": data["seq_max_coverage_rate"], # Float, range(0, 1)
"best_seq_len": data["best_seq_len"], # Int
"scene_name": data["scene_name"], # String
"gt_pts": np.asarray(data["gt_pts"], dtype=np.float32), # Ndarray(N x 3)
"scene_path": os.path.join(self.real_root_dir, data["scene_name"]), # String
"O_to_L_pose": np.asarray(data["O_to_L_pose"], dtype=np.float32),
}
return data_item
def __len__(self):
return len(self.item_list)
# -------------- Debug ---------------- #
if __name__ == "__main__":
import torch
seed = 0
torch.manual_seed(seed)
np.random.seed(seed)
'''
OmniObject3d_test:
root_dir: "H:\\AI\\Datasets\\packed_test_data"
model_dir: "H:\\AI\\Datasets\\scaled_object_meshes"
source: seq_reconstruction_dataset
split_file: "H:\\AI\\Datasets\\data_list\\OmniObject3d_test.txt"
type: test
filter_degree: 75
eval_list:
- pose_diff
- coverage_rate_increase
ratio: 0.1
batch_size: 1
num_workers: 12
pts_num: 8192
load_from_preprocess: True
'''
config = {
"root_dir": "/media/hofee/data/data/test_bottle/preprocessed_dataset",
"source": "seq_reconstruction_dataset",
"split_file": "H:\\AI\\Datasets\\data_list\\OmniObject3d_test.txt",
"load_from_preprocess": True,
"ratio": 1,
"filter_degree": 75,
"num_workers": 0,
"pts_num": 8192,
"type": "test",
}
ds = SeqReconstructionDataset(config)
print(len(ds))
print(ds.__getitem__(10))

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,6 @@
from modules.func_lib.samplers import (
cond_ode_sampler
)
from modules.func_lib.sde import (
init_sde
)

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,95 @@
import torch
import numpy as np
from scipy import integrate
from utils.pose import PoseUtil
def global_prior_likelihood(z, sigma_max):
"""The likelihood of a Gaussian distribution with mean zero and
standard deviation sigma."""
# z: [bs, pose_dim]
shape = z.shape
N = np.prod(shape[1:]) # pose_dim
return -N / 2.0 * torch.log(2 * np.pi * sigma_max**2) - torch.sum(
z**2, dim=-1
) / (2 * sigma_max**2)
def cond_ode_sampler(
score_model,
data,
prior,
sde_coeff,
atol=1e-5,
rtol=1e-5,
device="cuda",
eps=1e-5,
T=1.0,
num_steps=None,
pose_mode="quat_wxyz",
denoise=True,
init_x=None,
):
pose_dim = PoseUtil.get_pose_dim(pose_mode)
batch_size = data["main_feat"].shape[0]
init_x = (
prior((batch_size, pose_dim), T=T).to(device)
if init_x is None
else init_x + prior((batch_size, pose_dim), T=T).to(device)
)
shape = init_x.shape
def score_eval_wrapper(data):
"""A wrapper of the score-based model for use by the ODE solver."""
with torch.no_grad():
score = score_model(data)
return score.cpu().numpy().reshape((-1,))
def ode_func(t, x):
"""The ODE function for use by the ODE solver."""
x = torch.tensor(x.reshape(-1, pose_dim), dtype=torch.float32, device=device)
time_steps = torch.ones(batch_size, device=device).unsqueeze(-1) * t
drift, diffusion = sde_coeff(torch.tensor(t))
drift = drift.cpu().numpy()
diffusion = diffusion.cpu().numpy()
data["sampled_pose"] = x
data["t"] = time_steps
return drift - 0.5 * (diffusion**2) * score_eval_wrapper(data)
# Run the black-box ODE solver, note the
t_eval = None
if num_steps is not None:
# num_steps, from T -> eps
t_eval = np.linspace(T, eps, num_steps)
res = integrate.solve_ivp(
ode_func,
(T, eps),
init_x.reshape(-1).cpu().numpy(),
rtol=rtol,
atol=atol,
method="RK45",
t_eval=t_eval,
)
xs = torch.tensor(res.y, device=device).T.view(
-1, batch_size, pose_dim
) # [num_steps, bs, pose_dim]
x = torch.tensor(res.y[:, -1], device=device).reshape(shape) # [bs, pose_dim]
# denoise, using the predictor step in P-C sampler
if denoise:
# Reverse diffusion predictor for denoising
vec_eps = torch.ones((x.shape[0], 1), device=x.device) * eps
drift, diffusion = sde_coeff(vec_eps)
data["sampled_pose"] = x.float()
data["t"] = vec_eps
grad = score_model(data)
drift = drift - diffusion**2 * grad # R-SDE
mean_x = x + drift * ((1 - eps) / (1000 if num_steps is None else num_steps))
x = mean_x
num_steps = xs.shape[0]
xs = xs.reshape(batch_size*num_steps, -1)
xs[:, :-3] = PoseUtil.normalize_rotation(xs[:, :-3], pose_mode)
xs = xs.reshape(num_steps, batch_size, -1)
x[:, :-3] = PoseUtil.normalize_rotation(x[:, :-3], pose_mode)
return xs.permute(1, 0, 2), x

121
modules/func_lib/sde.py Normal file
View File

@ -0,0 +1,121 @@
import functools
import torch
import numpy as np
# ----- VE SDE -----
# ------------------
def ve_marginal_prob(x, t, sigma_min=0.01, sigma_max=90):
std = sigma_min * (sigma_max / sigma_min) ** t
mean = x
return mean, std
def ve_sde(t, sigma_min=0.01, sigma_max=90):
sigma = sigma_min * (sigma_max / sigma_min) ** t
drift_coeff = torch.tensor(0)
diffusion_coeff = sigma * torch.sqrt(torch.tensor(2 * (np.log(sigma_max) - np.log(sigma_min)), device=t.device))
return drift_coeff, diffusion_coeff
def ve_prior(shape, sigma_min=0.01, sigma_max=90, T=1.0):
_, sigma_max_prior = ve_marginal_prob(None, T, sigma_min=sigma_min, sigma_max=sigma_max)
return torch.randn(*shape) * sigma_max_prior
# ----- VP SDE -----
# ------------------
def vp_marginal_prob(x, t, beta_0=0.1, beta_1=20):
log_mean_coeff = -0.25 * t ** 2 * (beta_1 - beta_0) - 0.5 * t * beta_0
mean = torch.exp(log_mean_coeff) * x
std = torch.sqrt(1. - torch.exp(2. * log_mean_coeff))
return mean, std
def vp_sde(t, beta_0=0.1, beta_1=20):
beta_t = beta_0 + t * (beta_1 - beta_0)
drift_coeff = -0.5 * beta_t
diffusion_coeff = torch.sqrt(beta_t)
return drift_coeff, diffusion_coeff
def vp_prior(shape, beta_0=0.1, beta_1=20):
return torch.randn(*shape)
# ----- sub-VP SDE -----
# ----------------------
def subvp_marginal_prob(x, t, beta_0, beta_1):
log_mean_coeff = -0.25 * t ** 2 * (beta_1 - beta_0) - 0.5 * t * beta_0
mean = torch.exp(log_mean_coeff) * x
std = 1 - torch.exp(2. * log_mean_coeff)
return mean, std
def subvp_sde(t, beta_0, beta_1):
beta_t = beta_0 + t * (beta_1 - beta_0)
drift_coeff = -0.5 * beta_t
discount = 1. - torch.exp(-2 * beta_0 * t - (beta_1 - beta_0) * t ** 2)
diffusion_coeff = torch.sqrt(beta_t * discount)
return drift_coeff, diffusion_coeff
def subvp_prior(shape, beta_0=0.1, beta_1=20):
return torch.randn(*shape)
# ----- EDM SDE -----
# ------------------
def edm_marginal_prob(x, t, sigma_min=0.002, sigma_max=80):
std = t
mean = x
return mean, std
def edm_sde(t, sigma_min=0.002, sigma_max=80):
drift_coeff = torch.tensor(0)
diffusion_coeff = torch.sqrt(2 * t)
return drift_coeff, diffusion_coeff
def edm_prior(shape, sigma_min=0.002, sigma_max=80):
return torch.randn(*shape) * sigma_max
def init_sde(sde_mode):
# the SDE-related hyperparameters are copied from https://github.com/yang-song/score_sde_pytorch
if sde_mode == 'edm':
sigma_min = 0.002
sigma_max = 80
eps = 0.002
prior_fn = functools.partial(edm_prior, sigma_min=sigma_min, sigma_max=sigma_max)
marginal_prob_fn = functools.partial(edm_marginal_prob, sigma_min=sigma_min, sigma_max=sigma_max)
sde_fn = functools.partial(edm_sde, sigma_min=sigma_min, sigma_max=sigma_max)
T = sigma_max
elif sde_mode == 've':
sigma_min = 0.01
sigma_max = 50
eps = 1e-5
marginal_prob_fn = functools.partial(ve_marginal_prob, sigma_min=sigma_min, sigma_max=sigma_max)
sde_fn = functools.partial(ve_sde, sigma_min=sigma_min, sigma_max=sigma_max)
T = 1.0
prior_fn = functools.partial(ve_prior, sigma_min=sigma_min, sigma_max=sigma_max)
elif sde_mode == 'vp':
beta_0 = 0.1
beta_1 = 20
eps = 1e-3
prior_fn = functools.partial(vp_prior, beta_0=beta_0, beta_1=beta_1)
marginal_prob_fn = functools.partial(vp_marginal_prob, beta_0=beta_0, beta_1=beta_1)
sde_fn = functools.partial(vp_sde, beta_0=beta_0, beta_1=beta_1)
T = 1.0
elif sde_mode == 'subvp':
beta_0 = 0.1
beta_1 = 20
eps = 1e-3
prior_fn = functools.partial(subvp_prior, beta_0=beta_0, beta_1=beta_1)
marginal_prob_fn = functools.partial(subvp_marginal_prob, beta_0=beta_0, beta_1=beta_1)
sde_fn = functools.partial(subvp_sde, beta_0=beta_0, beta_1=beta_1)
T = 1.0
else:
raise NotImplementedError
return prior_fn, marginal_prob_fn, sde_fn, eps, T

167
modules/gf_view_finder.py Normal file
View File

@ -0,0 +1,167 @@
import torch
import torch.nn as nn
import PytorchBoot.stereotype as stereotype
from utils.pose import PoseUtil
import modules.module_lib as mlib
import modules.func_lib as flib
def zero_module(module):
"""
Zero out the parameters of a module and return it.
"""
for p in module.parameters():
p.detach().zero_()
return module
@stereotype.module("gf_view_finder")
class GradientFieldViewFinder(nn.Module):
def __init__(self, config):
super(GradientFieldViewFinder, self).__init__()
self.regression_head = config["regression_head"]
self.per_point_feature = config["per_point_feature"]
self.act = nn.ReLU(True)
self.sample_mode = config["sample_mode"]
self.pose_mode = config["pose_mode"]
pose_dim = PoseUtil.get_pose_dim(self.pose_mode)
self.prior_fn, self.marginal_prob_fn, self.sde_fn, self.sampling_eps, self.T = flib.init_sde(config["sde_mode"])
self.sampling_steps = config["sampling_steps"]
self.t_feat_dim = config["t_feat_dim"]
self.pose_feat_dim = config["pose_feat_dim"]
self.main_feat_dim = config["main_feat_dim"]
''' encode pose '''
self.pose_encoder = nn.Sequential(
nn.Linear(pose_dim, self.pose_feat_dim ),
self.act,
nn.Linear(self.pose_feat_dim , self.pose_feat_dim ),
self.act,
)
''' encode t '''
self.t_encoder = nn.Sequential(
mlib.GaussianFourierProjection(embed_dim=self.t_feat_dim ),
nn.Linear(self.t_feat_dim , self.t_feat_dim ),
self.act,
)
''' fusion tail '''
if self.regression_head == 'Rx_Ry_and_T':
if self.pose_mode != 'rot_matrix':
raise NotImplementedError
if not self.per_point_feature:
''' rotation_x_axis regress head '''
self.fusion_tail_rot_x = nn.Sequential(
nn.Linear(self.t_feat_dim + self.pose_feat_dim + self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
self.fusion_tail_rot_y = nn.Sequential(
nn.Linear(self.t_feat_dim + self.pose_feat_dim + self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
''' tranalation regress head '''
self.fusion_tail_trans = nn.Sequential(
nn.Linear(self.t_feat_dim + self.pose_feat_dim + self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
else:
raise NotImplementedError
else:
raise NotImplementedError
def forward(self, data):
"""
Args:
data, dict {
'main_feat': [bs, c]
'pose_sample': [bs, pose_dim]
't': [bs, 1]
}
"""
main_feat = data['main_feat']
sampled_pose = data['sampled_pose']
t = data['t']
t_feat = self.t_encoder(t.squeeze(1))
pose_feat = self.pose_encoder(sampled_pose)
if self.per_point_feature:
raise NotImplementedError
else:
total_feat = torch.cat([main_feat, t_feat, pose_feat], dim=-1)
_, std = self.marginal_prob_fn(total_feat, t)
if self.regression_head == 'Rx_Ry_and_T':
rot_x = self.fusion_tail_rot_x(total_feat)
rot_y = self.fusion_tail_rot_y(total_feat)
trans = self.fusion_tail_trans(total_feat)
out_score = torch.cat([rot_x, rot_y, trans], dim=-1) / (std+1e-7) # normalisation
else:
raise NotImplementedError
return out_score
def marginal_prob(self, x, t):
return self.marginal_prob_fn(x,t)
def sample(self, data, atol=1e-5, rtol=1e-5, snr=0.16, denoise=True, init_x=None, T0=None):
if self.sample_mode == 'ode':
T0 = self.T if T0 is None else T0
in_process_sample, res = flib.cond_ode_sampler(
score_model=self,
data=data,
prior=self.prior_fn,
sde_coeff=self.sde_fn,
atol=atol,
rtol=rtol,
eps=self.sampling_eps,
T=T0,
num_steps=self.sampling_steps,
pose_mode=self.pose_mode,
denoise=denoise,
init_x=init_x
)
else:
raise NotImplementedError
return in_process_sample, res
def next_best_view(self, main_feat):
data = {
'main_feat': main_feat,
}
in_process_sample, res = self.sample(data)
return res.to(dtype=torch.float32), in_process_sample
''' ----------- DEBUG -----------'''
if __name__ == "__main__":
config = {
"regression_head": "Rx_Ry_and_T",
"per_point_feature": False,
"pose_mode": "rot_matrix",
"sde_mode": "ve",
"sampling_steps": 500,
"sample_mode": "ode"
}
test_seq_feat = torch.rand(32, 2048).to("cuda:0")
test_pose = torch.rand(32, 9).to("cuda:0")
test_t = torch.rand(32, 1).to("cuda:0")
view_finder = GradientFieldViewFinder(config).to("cuda:0")
test_data = {
'seq_feat': test_seq_feat,
'sampled_pose': test_pose,
't': test_t
}
score = view_finder(test_data)
print(score.shape)
res, inprocess = view_finder.next_best_view(test_seq_feat)
print(res.shape, inprocess.shape)

View File

@ -0,0 +1,91 @@
import torch
import torch.nn as nn
import PytorchBoot.stereotype as stereotype
from utils.pose import PoseUtil
import modules.module_lib as mlib
import modules.func_lib as flib
def zero_module(module):
"""
Zero out the parameters of a module and return it.
"""
for p in module.parameters():
p.detach().zero_()
return module
@stereotype.module("mlp_view_finder")
class MLPViewFinder(nn.Module):
def __init__(self, config):
super(MLPViewFinder, self).__init__()
self.regression_head = 'Rx_Ry_and_T'
self.per_point_feature = False
self.act = nn.ReLU(True)
self.main_feat_dim = config["main_feat_dim"]
''' rotation_x_axis regress head '''
self.fusion_tail_rot_x = nn.Sequential(
nn.Linear(self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
self.fusion_tail_rot_y = nn.Sequential(
nn.Linear(self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
''' tranalation regress head '''
self.fusion_tail_trans = nn.Sequential(
nn.Linear(self.main_feat_dim, 256),
self.act,
zero_module(nn.Linear(256, 3)),
)
def forward(self, data):
"""
Args:
data, dict {
'main_feat': [bs, c]
}
"""
total_feat = data['main_feat']
rot_x = self.fusion_tail_rot_x(total_feat)
rot_y = self.fusion_tail_rot_y(total_feat)
trans = self.fusion_tail_trans(total_feat)
output = torch.cat([rot_x,rot_y,trans], dim=-1)
return output
def next_best_view(self, main_feat):
data = {
'main_feat': main_feat,
}
res = self(data)
return res.to(dtype=torch.float32), None
''' ----------- DEBUG -----------'''
if __name__ == "__main__":
config = {
"regression_head": "Rx_Ry_and_T",
"per_point_feature": False,
"pose_mode": "rot_matrix",
"sde_mode": "ve",
"sampling_steps": 500,
"sample_mode": "ode"
}
test_seq_feat = torch.rand(32, 2048).to("cuda:0")
test_pose = torch.rand(32, 9).to("cuda:0")
test_t = torch.rand(32, 1).to("cuda:0")
view_finder = GradientFieldViewFinder(config).to("cuda:0")
test_data = {
'seq_feat': test_seq_feat,
'sampled_pose': test_pose,
't': test_t
}
score = view_finder(test_data)
print(score.shape)
res, inprocess = view_finder.next_best_view(test_seq_feat)
print(res.shape, inprocess.shape)

View File

@ -0,0 +1,2 @@
from modules.module_lib.gaussian_fourier_projection import GaussianFourierProjection
from modules.module_lib.linear import Linear

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,17 @@
import torch
import numpy as np
import torch.nn as nn
class GaussianFourierProjection(nn.Module):
"""Gaussian random features for encoding time steps."""
def __init__(self, embed_dim, scale=30.):
super().__init__()
# Randomly sample weights during initialization. These weights are fixed
# during optimization and are not trainable.
self.W = nn.Parameter(torch.randn(embed_dim // 2) * scale, requires_grad=False)
def forward(self, x):
x_proj = x[:, None] * self.W[None, :] * 2 * np.pi
return torch.cat([torch.sin(x_proj), torch.cos(x_proj)], dim=-1)

View File

@ -0,0 +1,30 @@
import torch
import numpy as np
def weight_init(shape, mode, fan_in, fan_out):
if mode == 'xavier_uniform':
return np.sqrt(6 / (fan_in + fan_out)) * (torch.rand(*shape) * 2 - 1)
if mode == 'xavier_normal':
return np.sqrt(2 / (fan_in + fan_out)) * torch.randn(*shape)
if mode == 'kaiming_uniform':
return np.sqrt(3 / fan_in) * (torch.rand(*shape) * 2 - 1)
if mode == 'kaiming_normal':
return np.sqrt(1 / fan_in) * torch.randn(*shape)
raise ValueError(f'Invalid init mode "{mode}"')
class Linear(torch.nn.Module):
def __init__(self, in_features, out_features, bias=True, init_mode='kaiming_normal', init_weight=1, init_bias=0):
super().__init__()
self.in_features = in_features
self.out_features = out_features
init_kwargs = dict(mode=init_mode, fan_in=in_features, fan_out=out_features)
self.weight = torch.nn.Parameter(weight_init([out_features, in_features], **init_kwargs) * init_weight)
self.bias = torch.nn.Parameter(weight_init([out_features], **init_kwargs) * init_bias) if bias else None
def forward(self, x):
x = x @ self.weight.to(x.dtype).t()
if self.bias is not None:
x = x.add_(self.bias.to(x.dtype))
return x

View File

@ -0,0 +1,162 @@
import torch
import torch.nn as nn
import torch.nn.functional as F
from . import pointnet2_utils
from . import pytorch_utils as pt_utils
from typing import List
class _PointnetSAModuleBase(nn.Module):
def __init__(self):
super().__init__()
self.npoint = None
self.groupers = None
self.mlps = None
self.pool_method = 'max_pool'
def forward(self, xyz: torch.Tensor, features: torch.Tensor = None, new_xyz=None) -> (torch.Tensor, torch.Tensor):
"""
:param xyz: (B, N, 3) tensor of the xyz coordinates of the features
:param features: (B, N, C) tensor of the descriptors of the the features
:param new_xyz:
:return:
new_xyz: (B, npoint, 3) tensor of the new features' xyz
new_features: (B, npoint, \sum_k(mlps[k][-1])) tensor of the new_features descriptors
"""
new_features_list = []
xyz_flipped = xyz.transpose(1, 2).contiguous()
if new_xyz is None:
new_xyz = pointnet2_utils.gather_operation(
xyz_flipped,
pointnet2_utils.furthest_point_sample(xyz, self.npoint)
).transpose(1, 2).contiguous() if self.npoint is not None else None
for i in range(len(self.groupers)):
new_features = self.groupers[i](xyz, new_xyz, features) # (B, C, npoint, nsample)
new_features = self.mlps[i](new_features) # (B, mlp[-1], npoint, nsample)
if self.pool_method == 'max_pool':
new_features = F.max_pool2d(
new_features, kernel_size=[1, new_features.size(3)]
) # (B, mlp[-1], npoint, 1)
elif self.pool_method == 'avg_pool':
new_features = F.avg_pool2d(
new_features, kernel_size=[1, new_features.size(3)]
) # (B, mlp[-1], npoint, 1)
else:
raise NotImplementedError
new_features = new_features.squeeze(-1) # (B, mlp[-1], npoint)
new_features_list.append(new_features)
return new_xyz, torch.cat(new_features_list, dim=1)
class PointnetSAModuleMSG(_PointnetSAModuleBase):
"""Pointnet set abstraction layer with multiscale grouping"""
def __init__(self, *, npoint: int, radii: List[float], nsamples: List[int], mlps: List[List[int]], bn: bool = True,
use_xyz: bool = True, pool_method='max_pool', instance_norm=False):
"""
:param npoint: int
:param radii: list of float, list of radii to group with
:param nsamples: list of int, number of samples in each ball query
:param mlps: list of list of int, spec of the pointnet before the global pooling for each scale
:param bn: whether to use batchnorm
:param use_xyz:
:param pool_method: max_pool / avg_pool
:param instance_norm: whether to use instance_norm
"""
super().__init__()
assert len(radii) == len(nsamples) == len(mlps)
self.npoint = npoint
self.groupers = nn.ModuleList()
self.mlps = nn.ModuleList()
for i in range(len(radii)):
radius = radii[i]
nsample = nsamples[i]
self.groupers.append(
pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=use_xyz)
if npoint is not None else pointnet2_utils.GroupAll(use_xyz)
)
mlp_spec = mlps[i]
if use_xyz:
mlp_spec[0] += 3
self.mlps.append(pt_utils.SharedMLP(mlp_spec, bn=bn, instance_norm=instance_norm))
self.pool_method = pool_method
class PointnetSAModule(PointnetSAModuleMSG):
"""Pointnet set abstraction layer"""
def __init__(self, *, mlp: List[int], npoint: int = None, radius: float = None, nsample: int = None,
bn: bool = True, use_xyz: bool = True, pool_method='max_pool', instance_norm=False):
"""
:param mlp: list of int, spec of the pointnet before the global max_pool
:param npoint: int, number of features
:param radius: float, radius of ball
:param nsample: int, number of samples in the ball query
:param bn: whether to use batchnorm
:param use_xyz:
:param pool_method: max_pool / avg_pool
:param instance_norm: whether to use instance_norm
"""
super().__init__(
mlps=[mlp], npoint=npoint, radii=[radius], nsamples=[nsample], bn=bn, use_xyz=use_xyz,
pool_method=pool_method, instance_norm=instance_norm
)
class PointnetFPModule(nn.Module):
r"""Propigates the features of one set to another"""
def __init__(self, *, mlp: List[int], bn: bool = True):
"""
:param mlp: list of int
:param bn: whether to use batchnorm
"""
super().__init__()
self.mlp = pt_utils.SharedMLP(mlp, bn=bn)
def forward(
self, unknown: torch.Tensor, known: torch.Tensor, unknow_feats: torch.Tensor, known_feats: torch.Tensor
) -> torch.Tensor:
"""
:param unknown: (B, n, 3) tensor of the xyz positions of the unknown features
:param known: (B, m, 3) tensor of the xyz positions of the known features
:param unknow_feats: (B, C1, n) tensor of the features to be propigated to
:param known_feats: (B, C2, m) tensor of features to be propigated
:return:
new_features: (B, mlp[-1], n) tensor of the features of the unknown features
"""
if known is not None:
dist, idx = pointnet2_utils.three_nn(unknown, known)
dist_recip = 1.0 / (dist + 1e-8)
norm = torch.sum(dist_recip, dim=2, keepdim=True)
weight = dist_recip / norm
interpolated_feats = pointnet2_utils.three_interpolate(known_feats, idx, weight)
else:
interpolated_feats = known_feats.expand(*known_feats.size()[0:2], unknown.size(1))
if unknow_feats is not None:
new_features = torch.cat([interpolated_feats, unknow_feats], dim=1) # (B, C2 + C1, n)
else:
new_features = interpolated_feats
new_features = new_features.unsqueeze(-1)
new_features = self.mlp(new_features)
return new_features.squeeze(-1)
if __name__ == "__main__":
pass

View File

@ -0,0 +1,291 @@
import torch
from torch.autograd import Variable
from torch.autograd import Function
import torch.nn as nn
from typing import Tuple
import sys
import pointnet2_cuda as pointnet2
class FurthestPointSampling(Function):
@staticmethod
def forward(ctx, xyz: torch.Tensor, npoint: int) -> torch.Tensor:
"""
Uses iterative furthest point sampling to select a set of npoint features that have the largest
minimum distance
:param ctx:
:param xyz: (B, N, 3) where N > npoint
:param npoint: int, number of features in the sampled set
:return:
output: (B, npoint) tensor containing the set
"""
assert xyz.is_contiguous()
B, N, _ = xyz.size()
output = torch.cuda.IntTensor(B, npoint)
temp = torch.cuda.FloatTensor(B, N).fill_(1e10)
pointnet2.furthest_point_sampling_wrapper(B, N, npoint, xyz, temp, output)
return output
@staticmethod
def backward(xyz, a=None):
return None, None
furthest_point_sample = FurthestPointSampling.apply
class GatherOperation(Function):
@staticmethod
def forward(ctx, features: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
"""
:param ctx:
:param features: (B, C, N)
:param idx: (B, npoint) index tensor of the features to gather
:return:
output: (B, C, npoint)
"""
assert features.is_contiguous()
assert idx.is_contiguous()
B, npoint = idx.size()
_, C, N = features.size()
output = torch.cuda.FloatTensor(B, C, npoint)
pointnet2.gather_points_wrapper(B, C, N, npoint, features, idx, output)
ctx.for_backwards = (idx, C, N)
return output
@staticmethod
def backward(ctx, grad_out):
idx, C, N = ctx.for_backwards
B, npoint = idx.size()
grad_features = Variable(torch.cuda.FloatTensor(B, C, N).zero_())
grad_out_data = grad_out.data.contiguous()
pointnet2.gather_points_grad_wrapper(B, C, N, npoint, grad_out_data, idx, grad_features.data)
return grad_features, None
gather_operation = GatherOperation.apply
class ThreeNN(Function):
@staticmethod
def forward(ctx, unknown: torch.Tensor, known: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
Find the three nearest neighbors of unknown in known
:param ctx:
:param unknown: (B, N, 3)
:param known: (B, M, 3)
:return:
dist: (B, N, 3) l2 distance to the three nearest neighbors
idx: (B, N, 3) index of 3 nearest neighbors
"""
assert unknown.is_contiguous()
assert known.is_contiguous()
B, N, _ = unknown.size()
m = known.size(1)
dist2 = torch.cuda.FloatTensor(B, N, 3)
idx = torch.cuda.IntTensor(B, N, 3)
pointnet2.three_nn_wrapper(B, N, m, unknown, known, dist2, idx)
return torch.sqrt(dist2), idx
@staticmethod
def backward(ctx, a=None, b=None):
return None, None
three_nn = ThreeNN.apply
class ThreeInterpolate(Function):
@staticmethod
def forward(ctx, features: torch.Tensor, idx: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
"""
Performs weight linear interpolation on 3 features
:param ctx:
:param features: (B, C, M) Features descriptors to be interpolated from
:param idx: (B, n, 3) three nearest neighbors of the target features in features
:param weight: (B, n, 3) weights
:return:
output: (B, C, N) tensor of the interpolated features
"""
assert features.is_contiguous()
assert idx.is_contiguous()
assert weight.is_contiguous()
B, c, m = features.size()
n = idx.size(1)
ctx.three_interpolate_for_backward = (idx, weight, m)
output = torch.cuda.FloatTensor(B, c, n)
pointnet2.three_interpolate_wrapper(B, c, m, n, features, idx, weight, output)
return output
@staticmethod
def backward(ctx, grad_out: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
"""
:param ctx:
:param grad_out: (B, C, N) tensor with gradients of outputs
:return:
grad_features: (B, C, M) tensor with gradients of features
None:
None:
"""
idx, weight, m = ctx.three_interpolate_for_backward
B, c, n = grad_out.size()
grad_features = Variable(torch.cuda.FloatTensor(B, c, m).zero_())
grad_out_data = grad_out.data.contiguous()
pointnet2.three_interpolate_grad_wrapper(B, c, n, m, grad_out_data, idx, weight, grad_features.data)
return grad_features, None, None
three_interpolate = ThreeInterpolate.apply
class GroupingOperation(Function):
@staticmethod
def forward(ctx, features: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
"""
:param ctx:
:param features: (B, C, N) tensor of features to group
:param idx: (B, npoint, nsample) tensor containing the indicies of features to group with
:return:
output: (B, C, npoint, nsample) tensor
"""
assert features.is_contiguous()
assert idx.is_contiguous()
B, nfeatures, nsample = idx.size()
_, C, N = features.size()
output = torch.cuda.FloatTensor(B, C, nfeatures, nsample)
pointnet2.group_points_wrapper(B, C, N, nfeatures, nsample, features, idx, output)
ctx.for_backwards = (idx, N)
return output
@staticmethod
def backward(ctx, grad_out: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
"""
:param ctx:
:param grad_out: (B, C, npoint, nsample) tensor of the gradients of the output from forward
:return:
grad_features: (B, C, N) gradient of the features
"""
idx, N = ctx.for_backwards
B, C, npoint, nsample = grad_out.size()
grad_features = Variable(torch.cuda.FloatTensor(B, C, N).zero_())
grad_out_data = grad_out.data.contiguous()
pointnet2.group_points_grad_wrapper(B, C, N, npoint, nsample, grad_out_data, idx, grad_features.data)
return grad_features, None
grouping_operation = GroupingOperation.apply
class BallQuery(Function):
@staticmethod
def forward(ctx, radius: float, nsample: int, xyz: torch.Tensor, new_xyz: torch.Tensor) -> torch.Tensor:
"""
:param ctx:
:param radius: float, radius of the balls
:param nsample: int, maximum number of features in the balls
:param xyz: (B, N, 3) xyz coordinates of the features
:param new_xyz: (B, npoint, 3) centers of the ball query
:return:
idx: (B, npoint, nsample) tensor with the indicies of the features that form the query balls
"""
assert new_xyz.is_contiguous()
assert xyz.is_contiguous()
B, N, _ = xyz.size()
npoint = new_xyz.size(1)
idx = torch.cuda.IntTensor(B, npoint, nsample).zero_()
pointnet2.ball_query_wrapper(B, N, npoint, radius, nsample, new_xyz, xyz, idx)
return idx
@staticmethod
def backward(ctx, a=None):
return None, None, None, None
ball_query = BallQuery.apply
class QueryAndGroup(nn.Module):
def __init__(self, radius: float, nsample: int, use_xyz: bool = True):
"""
:param radius: float, radius of ball
:param nsample: int, maximum number of features to gather in the ball
:param use_xyz:
"""
super().__init__()
self.radius, self.nsample, self.use_xyz = radius, nsample, use_xyz
def forward(self, xyz: torch.Tensor, new_xyz: torch.Tensor, features: torch.Tensor = None) -> Tuple[torch.Tensor]:
"""
:param xyz: (B, N, 3) xyz coordinates of the features
:param new_xyz: (B, npoint, 3) centroids
:param features: (B, C, N) descriptors of the features
:return:
new_features: (B, 3 + C, npoint, nsample)
"""
idx = ball_query(self.radius, self.nsample, xyz, new_xyz)
xyz_trans = xyz.transpose(1, 2).contiguous()
grouped_xyz = grouping_operation(xyz_trans, idx) # (B, 3, npoint, nsample)
grouped_xyz -= new_xyz.transpose(1, 2).unsqueeze(-1)
if features is not None:
grouped_features = grouping_operation(features, idx)
if self.use_xyz:
new_features = torch.cat([grouped_xyz, grouped_features], dim=1) # (B, C + 3, npoint, nsample)
else:
new_features = grouped_features
else:
assert self.use_xyz, "Cannot have not features and not use xyz as a feature!"
new_features = grouped_xyz
return new_features
class GroupAll(nn.Module):
def __init__(self, use_xyz: bool = True):
super().__init__()
self.use_xyz = use_xyz
def forward(self, xyz: torch.Tensor, new_xyz: torch.Tensor, features: torch.Tensor = None):
"""
:param xyz: (B, N, 3) xyz coordinates of the features
:param new_xyz: ignored
:param features: (B, C, N) descriptors of the features
:return:
new_features: (B, C + 3, 1, N)
"""
grouped_xyz = xyz.transpose(1, 2).unsqueeze(2)
if features is not None:
grouped_features = features.unsqueeze(2)
if self.use_xyz:
new_features = torch.cat([grouped_xyz, grouped_features], dim=1) # (B, 3 + C, 1, N)
else:
new_features = grouped_features
else:
new_features = grouped_xyz
return new_features

View File

@ -0,0 +1,236 @@
import torch.nn as nn
from typing import List, Tuple
class SharedMLP(nn.Sequential):
def __init__(
self,
args: List[int],
*,
bn: bool = False,
activation=nn.ReLU(inplace=True),
preact: bool = False,
first: bool = False,
name: str = "",
instance_norm: bool = False,
):
super().__init__()
for i in range(len(args) - 1):
self.add_module(
name + 'layer{}'.format(i),
Conv2d(
args[i],
args[i + 1],
bn=(not first or not preact or (i != 0)) and bn,
activation=activation
if (not first or not preact or (i != 0)) else None,
preact=preact,
instance_norm=instance_norm
)
)
class _ConvBase(nn.Sequential):
def __init__(
self,
in_size,
out_size,
kernel_size,
stride,
padding,
activation,
bn,
init,
conv=None,
batch_norm=None,
bias=True,
preact=False,
name="",
instance_norm=False,
instance_norm_func=None
):
super().__init__()
bias = bias and (not bn)
conv_unit = conv(
in_size,
out_size,
kernel_size=kernel_size,
stride=stride,
padding=padding,
bias=bias
)
init(conv_unit.weight)
if bias:
nn.init.constant_(conv_unit.bias, 0)
if bn:
if not preact:
bn_unit = batch_norm(out_size)
else:
bn_unit = batch_norm(in_size)
if instance_norm:
if not preact:
in_unit = instance_norm_func(out_size, affine=False, track_running_stats=False)
else:
in_unit = instance_norm_func(in_size, affine=False, track_running_stats=False)
if preact:
if bn:
self.add_module(name + 'bn', bn_unit)
if activation is not None:
self.add_module(name + 'activation', activation)
if not bn and instance_norm:
self.add_module(name + 'in', in_unit)
self.add_module(name + 'conv', conv_unit)
if not preact:
if bn:
self.add_module(name + 'bn', bn_unit)
if activation is not None:
self.add_module(name + 'activation', activation)
if not bn and instance_norm:
self.add_module(name + 'in', in_unit)
class _BNBase(nn.Sequential):
def __init__(self, in_size, batch_norm=None, name=""):
super().__init__()
self.add_module(name + "bn", batch_norm(in_size))
nn.init.constant_(self[0].weight, 1.0)
nn.init.constant_(self[0].bias, 0)
class BatchNorm1d(_BNBase):
def __init__(self, in_size: int, *, name: str = ""):
super().__init__(in_size, batch_norm=nn.BatchNorm1d, name=name)
class BatchNorm2d(_BNBase):
def __init__(self, in_size: int, name: str = ""):
super().__init__(in_size, batch_norm=nn.BatchNorm2d, name=name)
class Conv1d(_ConvBase):
def __init__(
self,
in_size: int,
out_size: int,
*,
kernel_size: int = 1,
stride: int = 1,
padding: int = 0,
activation=nn.ReLU(inplace=True),
bn: bool = False,
init=nn.init.kaiming_normal_,
bias: bool = True,
preact: bool = False,
name: str = "",
instance_norm=False
):
super().__init__(
in_size,
out_size,
kernel_size,
stride,
padding,
activation,
bn,
init,
conv=nn.Conv1d,
batch_norm=BatchNorm1d,
bias=bias,
preact=preact,
name=name,
instance_norm=instance_norm,
instance_norm_func=nn.InstanceNorm1d
)
class Conv2d(_ConvBase):
def __init__(
self,
in_size: int,
out_size: int,
*,
kernel_size: Tuple[int, int] = (1, 1),
stride: Tuple[int, int] = (1, 1),
padding: Tuple[int, int] = (0, 0),
activation=nn.ReLU(inplace=True),
bn: bool = False,
init=nn.init.kaiming_normal_,
bias: bool = True,
preact: bool = False,
name: str = "",
instance_norm=False
):
super().__init__(
in_size,
out_size,
kernel_size,
stride,
padding,
activation,
bn,
init,
conv=nn.Conv2d,
batch_norm=BatchNorm2d,
bias=bias,
preact=preact,
name=name,
instance_norm=instance_norm,
instance_norm_func=nn.InstanceNorm2d
)
class FC(nn.Sequential):
def __init__(
self,
in_size: int,
out_size: int,
*,
activation=nn.ReLU(inplace=True),
bn: bool = False,
init=None,
preact: bool = False,
name: str = ""
):
super().__init__()
fc = nn.Linear(in_size, out_size, bias=not bn)
if init is not None:
init(fc.weight)
if not bn:
nn.init.constant(fc.bias, 0)
if preact:
if bn:
self.add_module(name + 'bn', BatchNorm1d(in_size))
if activation is not None:
self.add_module(name + 'activation', activation)
self.add_module(name + 'fc', fc)
if not preact:
if bn:
self.add_module(name + 'bn', BatchNorm1d(out_size))
if activation is not None:
self.add_module(name + 'activation', activation)

View File

@ -0,0 +1,149 @@
import torch
import torch.nn as nn
import os
import sys
path = os.path.abspath(__file__)
for i in range(2):
path = os.path.dirname(path)
PROJECT_ROOT = path
sys.path.append(PROJECT_ROOT)
import PytorchBoot.stereotype as stereotype
from modules.module_lib.pointnet2_modules import PointnetSAModuleMSG
ClsMSG_CFG_Dense = {
'NPOINTS': [512, 256, 128, None],
'RADIUS': [[0.02, 0.04], [0.04, 0.08], [0.08, 0.16], [None, None]],
'NSAMPLE': [[32, 64], [16, 32], [8, 16], [None, None]],
'MLPS': [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 512], [256, 384, 512]]],
'DP_RATIO': 0.5,
}
ClsMSG_CFG_Light = {
'NPOINTS': [512, 256, 128, None],
'RADIUS': [[0.02, 0.04], [0.04, 0.08], [0.08, 0.16], [None, None]],
'NSAMPLE': [[16, 32], [16, 32], [16, 32], [None, None]],
'MLPS': [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 512], [256, 384, 512]]],
'DP_RATIO': 0.5,
}
ClsMSG_CFG_Light_2048 = {
'NPOINTS': [512, 256, 128, None],
'RADIUS': [[0.02, 0.04], [0.04, 0.08], [0.08, 0.16], [None, None]],
'NSAMPLE': [[16, 32], [16, 32], [16, 32], [None, None]],
'MLPS': [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 1024], [256, 512, 1024]]],
'DP_RATIO': 0.5,
}
ClsMSG_CFG_Strong = {
'NPOINTS': [512, 256, 128, 64, None],
'RADIUS': [[0.02, 0.04], [0.04, 0.08], [0.08, 0.16],[0.16, 0.32], [None, None]],
'NSAMPLE': [[16, 32], [16, 32], [16, 32], [16, 32], [None, None]],
'MLPS': [[[16, 16, 32], [32, 32, 64]],
[[64, 64, 128], [64, 96, 128]],
[[128, 196, 256], [128, 196, 256]],
[[256, 256, 512], [256, 512, 512]],
[[512, 512, 2048], [512, 1024, 2048]]
],
'DP_RATIO': 0.5,
}
ClsMSG_CFG_Lighter = {
'NPOINTS': [512, 256, 128, 64, None],
'RADIUS': [[0.01], [0.02], [0.04], [0.08], [None]],
'NSAMPLE': [[64], [32], [16], [8], [None]],
'MLPS': [[[32, 32, 64]],
[[64, 64, 128]],
[[128, 196, 256]],
[[256, 256, 512]],
[[512, 512, 1024]]],
'DP_RATIO': 0.5,
}
def select_params(name):
if name == 'light':
return ClsMSG_CFG_Light
elif name == 'lighter':
return ClsMSG_CFG_Lighter
elif name == 'dense':
return ClsMSG_CFG_Dense
elif name == 'light_2048':
return ClsMSG_CFG_Light_2048
elif name == 'strong':
return ClsMSG_CFG_Strong
else:
raise NotImplementedError
def break_up_pc(pc):
xyz = pc[..., 0:3].contiguous()
features = (
pc[..., 3:].transpose(1, 2).contiguous()
if pc.size(-1) > 3 else None
)
return xyz, features
@stereotype.module("pointnet++_encoder")
class PointNet2Encoder(nn.Module):
def encode_points(self, pts, require_per_point_feat=False):
return self.forward(pts)
def __init__(self, config:dict):
super().__init__()
channel_in = config.get("in_dim", 3) - 3
params_name = config.get("params_name", "light")
self.SA_modules = nn.ModuleList()
selected_params = select_params(params_name)
for k in range(selected_params['NPOINTS'].__len__()):
mlps = selected_params['MLPS'][k].copy()
channel_out = 0
for idx in range(mlps.__len__()):
mlps[idx] = [channel_in] + mlps[idx]
channel_out += mlps[idx][-1]
self.SA_modules.append(
PointnetSAModuleMSG(
npoint=selected_params['NPOINTS'][k],
radii=selected_params['RADIUS'][k],
nsamples=selected_params['NSAMPLE'][k],
mlps=mlps,
use_xyz=True,
bn=True
)
)
channel_in = channel_out
def forward(self, point_cloud: torch.cuda.FloatTensor):
xyz, features = break_up_pc(point_cloud)
l_xyz, l_features = [xyz], [features]
for i in range(len(self.SA_modules)):
li_xyz, li_features = self.SA_modules[i](l_xyz[i], l_features[i])
l_xyz.append(li_xyz)
l_features.append(li_features)
return l_features[-1].squeeze(-1)
if __name__ == '__main__':
seed = 100
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
net = PointNet2Encoder(config={"in_dim": 3, "params_name": "strong"}).cuda()
pts = torch.randn(2, 2444, 3).cuda()
print(torch.mean(pts, dim=1))
pre = net.encode_points(pts)
print(pre.shape)

107
modules/pointnet_encoder.py Normal file
View File

@ -0,0 +1,107 @@
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.utils.data
from torch.autograd import Variable
import numpy as np
import torch.nn.functional as F
import PytorchBoot.stereotype as stereotype
@stereotype.module("pointnet_encoder")
class PointNetEncoder(nn.Module):
def __init__(self, config:dict):
super(PointNetEncoder, self).__init__()
self.out_dim = config["out_dim"]
self.in_dim = config["in_dim"]
self.feature_transform = config.get("feature_transform", False)
self.stn = STNkd(k=self.in_dim)
self.conv1 = torch.nn.Conv1d(self.in_dim , 64, 1)
self.conv2 = torch.nn.Conv1d(64, 128, 1)
self.conv3 = torch.nn.Conv1d(128, 512, 1)
self.conv4 = torch.nn.Conv1d(512, self.out_dim , 1)
if self.feature_transform:
self.f_stn = STNkd(k=64)
def forward(self, x):
trans = self.stn(x)
x = x.transpose(2, 1)
x = torch.bmm(x, trans)
x = x.transpose(2, 1)
x = F.relu(self.conv1(x))
if self.feature_transform:
trans_feat = self.f_stn(x)
x = x.transpose(2, 1)
x = torch.bmm(x, trans_feat)
x = x.transpose(2, 1)
point_feat = x
x = F.relu(self.conv2(x))
x = F.relu(self.conv3(x))
x = self.conv4(x)
x = torch.max(x, 2, keepdim=True)[0]
x = x.view(-1, self.out_dim)
return x, point_feat
def encode_points(self, pts, require_per_point_feat=False):
pts = pts.transpose(2, 1)
global_pts_feature, per_point_feature = self(pts)
if require_per_point_feat:
return global_pts_feature, per_point_feature.transpose(2, 1)
else:
return global_pts_feature
class STNkd(nn.Module):
def __init__(self, k=64):
super(STNkd, self).__init__()
self.conv1 = torch.nn.Conv1d(k, 64, 1)
self.conv2 = torch.nn.Conv1d(64, 128, 1)
self.conv3 = torch.nn.Conv1d(128, 1024, 1)
self.fc1 = nn.Linear(1024, 512)
self.fc2 = nn.Linear(512, 256)
self.fc3 = nn.Linear(256, k * k)
self.relu = nn.ReLU()
self.k = k
def forward(self, x):
batchsize = x.size()[0]
x = F.relu(self.conv1(x))
x = F.relu(self.conv2(x))
x = F.relu(self.conv3(x))
x = torch.max(x, 2, keepdim=True)[0]
x = x.view(-1, 1024)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
iden = (
Variable(torch.from_numpy(np.eye(self.k).flatten().astype(np.float32)))
.view(1, self.k * self.k)
.repeat(batchsize, 1)
)
if x.is_cuda:
iden = iden.to(x.get_device())
x = x + iden
x = x.view(-1, self.k, self.k)
return x
if __name__ == "__main__":
sim_data = Variable(torch.rand(32, 2500, 3))
config = {
"in_dim": 3,
"out_dim": 1024,
"feature_transform": False
}
pointnet = PointNetEncoder(config)
out = pointnet.encode_points(sim_data)
print("global feat", out.size())
out, per_point_out = pointnet.encode_points(sim_data, require_per_point_feat=True)
print("point feat", out.size())
print("per point feat", per_point_out.size())

21
modules/pose_encoder.py Normal file
View File

@ -0,0 +1,21 @@
from torch import nn
import PytorchBoot.stereotype as stereotype
@stereotype.module("pose_encoder")
class PoseEncoder(nn.Module):
def __init__(self, config):
super(PoseEncoder, self).__init__()
self.config = config
pose_dim = config["pose_dim"]
out_dim = config["out_dim"]
self.act = nn.ReLU(True)
self.pose_encoder = nn.Sequential(
nn.Linear(pose_dim, out_dim),
self.act,
nn.Linear(out_dim, out_dim),
self.act,
)
def encode_pose(self, pose):
return self.pose_encoder(pose)

View File

@ -0,0 +1,20 @@
from torch import nn
import PytorchBoot.stereotype as stereotype
@stereotype.module("pts_num_encoder")
class PointsNumEncoder(nn.Module):
def __init__(self, config):
super(PointsNumEncoder, self).__init__()
self.config = config
out_dim = config["out_dim"]
self.act = nn.ReLU(True)
self.pts_num_encoder = nn.Sequential(
nn.Linear(1, out_dim),
self.act,
nn.Linear(out_dim, out_dim),
self.act,
)
def encode_pts_num(self, num_seq):
return self.pts_num_encoder(num_seq)

View File

@ -0,0 +1,63 @@
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence
import PytorchBoot.stereotype as stereotype
@stereotype.module("transformer_seq_encoder")
class TransformerSequenceEncoder(nn.Module):
def __init__(self, config):
super(TransformerSequenceEncoder, self).__init__()
self.config = config
embed_dim = config["embed_dim"]
encoder_layer = nn.TransformerEncoderLayer(
d_model=embed_dim,
nhead=config["num_heads"],
dim_feedforward=config["ffn_dim"],
batch_first=True,
)
self.transformer_encoder = nn.TransformerEncoder(
encoder_layer, num_layers=config["num_layers"]
)
self.fc = nn.Linear(embed_dim, config["output_dim"])
def encode_sequence(self, embedding_list_batch):
lengths = []
for embedding_list in embedding_list_batch:
lengths.append(len(embedding_list))
embedding_tensor = pad_sequence(embedding_list_batch, batch_first=True) # Shape: [batch_size, max_seq_len, embed_dim]
max_len = max(lengths)
padding_mask = torch.tensor([([0] * length + [1] * (max_len - length)) for length in lengths], dtype=torch.bool).to(embedding_tensor.device)
transformer_output = self.transformer_encoder(embedding_tensor, src_key_padding_mask=padding_mask)
final_feature = transformer_output.mean(dim=1)
final_output = self.fc(final_feature)
return final_output
if __name__ == "__main__":
config = {
"embed_dim": 256,
"num_heads": 4,
"ffn_dim": 256,
"num_layers": 3,
"output_dim": 1024,
}
encoder = TransformerSequenceEncoder(config)
seq_len = [5, 8, 9, 4]
batch_size = 4
embedding_list_batch = [
torch.randn(seq_len[idx], config["embed_dim"]) for idx in range(batch_size)
]
output_feature = encoder.encode_sequence(
embedding_list_batch
)
print("Encoded Feature:", output_feature)
print("Feature Shape:", output_feature.shape)

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,43 @@
import os
import shutil
def clean_scene_data(root, scene):
# 清理目标点云数据
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.rmtree(pts_dir)
print(f"已删除 {pts_dir}")
# 清理法线数据
nrm_dir = os.path.join(root, scene, "nrm")
if os.path.exists(nrm_dir):
shutil.rmtree(nrm_dir)
print(f"已删除 {nrm_dir}")
# 清理扫描点索引数据
scan_points_indices_dir = os.path.join(root, scene, "scan_points_indices")
if os.path.exists(scan_points_indices_dir):
shutil.rmtree(scan_points_indices_dir)
print(f"已删除 {scan_points_indices_dir}")
# 删除扫描点数据文件
scan_points_file = os.path.join(root, scene, "scan_points.txt")
if os.path.exists(scan_points_file):
os.remove(scan_points_file)
print(f"已删除 {scan_points_file}")
def clean_all_scenes(root, scene_list):
for idx, scene in enumerate(scene_list):
print(f"正在清理场景 {scene} ({idx+1}/{len(scene_list)})")
clean_scene_data(root, scene)
if __name__ == "__main__":
root = r"c:\Document\Local Project\nbv_rec\nbv_reconstruction\temp"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"正在清理场景 {scene_list[from_idx:to_idx]}")
clean_all_scenes(root, scene_list[from_idx:to_idx])
print("清理完成")

View File

@ -0,0 +1,48 @@
import os
import shutil
def pack_scene_data(root, scene, output_dir):
scene_dir = os.path.join(output_dir, scene)
if not os.path.exists(scene_dir):
os.makedirs(scene_dir)
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.move(pts_dir, os.path.join(scene_dir, "pts"))
scan_points_indices_dir = os.path.join(root, scene, "scan_points_indices")
if os.path.exists(scan_points_indices_dir):
shutil.move(scan_points_indices_dir, os.path.join(scene_dir, "scan_points_indices"))
scan_points_file = os.path.join(root, scene, "scan_points.txt")
if os.path.exists(scan_points_file):
shutil.move(scan_points_file, os.path.join(scene_dir, "scan_points.txt"))
model_pts_nrm_file = os.path.join(root, scene, "points_and_normals.txt")
if os.path.exists(model_pts_nrm_file):
shutil.move(model_pts_nrm_file, os.path.join(scene_dir, "points_and_normals.txt"))
camera_dir = os.path.join(root, scene, "camera_params")
if os.path.exists(camera_dir):
shutil.move(camera_dir, os.path.join(scene_dir, "camera_params"))
scene_info_file = os.path.join(root, scene, "scene_info.json")
if os.path.exists(scene_info_file):
shutil.move(scene_info_file, os.path.join(scene_dir, "scene_info.json"))
def pack_all_scenes(root, scene_list, output_dir):
for idx, scene in enumerate(scene_list):
print(f"正在打包场景 {scene} ({idx+1}/{len(scene_list)})")
pack_scene_data(root, scene, output_dir)
if __name__ == "__main__":
root = r"H:\AI\Datasets\nbv_rec_part2"
output_dir = r"H:\AI\Datasets\scene_info_part2"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"正在打包场景 {scene_list[from_idx:to_idx]}")
pack_all_scenes(root, scene_list[from_idx:to_idx], output_dir)
print("打包完成")

View File

@ -0,0 +1,41 @@
import os
import shutil
def pack_scene_data(root, scene, output_dir):
scene_dir = os.path.join(output_dir, scene)
if not os.path.exists(scene_dir):
os.makedirs(scene_dir)
pts_dir = os.path.join(root, scene, "pts")
if os.path.exists(pts_dir):
shutil.move(pts_dir, os.path.join(scene_dir, "pts"))
camera_dir = os.path.join(root, scene, "camera_params")
if os.path.exists(camera_dir):
shutil.move(camera_dir, os.path.join(scene_dir, "camera_params"))
scene_info_file = os.path.join(root, scene, "scene_info.json")
if os.path.exists(scene_info_file):
shutil.move(scene_info_file, os.path.join(scene_dir, "scene_info.json"))
label_dir = os.path.join(root, scene, "label")
if os.path.exists(label_dir):
shutil.move(label_dir, os.path.join(scene_dir, "label"))
def pack_all_scenes(root, scene_list, output_dir):
for idx, scene in enumerate(scene_list):
print(f"packing {scene} ({idx+1}/{len(scene_list)})")
pack_scene_data(root, scene, output_dir)
if __name__ == "__main__":
root = r"H:\AI\Datasets\nbv_rec_part2"
output_dir = r"H:\AI\Datasets\upload_part2"
scene_list = os.listdir(root)
from_idx = 0
to_idx = len(scene_list)
print(f"packing {scene_list[from_idx:to_idx]}")
pack_all_scenes(root, scene_list[from_idx:to_idx], output_dir)
print("packing done")

185
preprocess/preprocessor.py Normal file
View File

@ -0,0 +1,185 @@
import os
import numpy as np
import time
import sys
np.random.seed(0)
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.reconstruction import ReconstructionUtil
from utils.data_load import DataLoadUtil
from utils.pts import PtsUtil
def save_np_pts(path, pts: np.ndarray, file_type="txt"):
if file_type == "txt":
np.savetxt(path, pts)
else:
np.save(path, pts)
def save_target_points(root, scene, frame_idx, target_points: np.ndarray, file_type="txt"):
pts_path = os.path.join(root,scene, "pts", f"{frame_idx}.{file_type}")
if not os.path.exists(os.path.join(root,scene, "pts")):
os.makedirs(os.path.join(root,scene, "pts"))
save_np_pts(pts_path, target_points, file_type)
def save_target_normals(root, scene, frame_idx, target_normals: np.ndarray, file_type="txt"):
pts_path = os.path.join(root,scene, "nrm", f"{frame_idx}.{file_type}")
if not os.path.exists(os.path.join(root,scene, "nrm")):
os.makedirs(os.path.join(root,scene, "nrm"))
save_np_pts(pts_path, target_normals, file_type)
def save_scan_points_indices(root, scene, frame_idx, scan_points_indices: np.ndarray, file_type="txt"):
indices_path = os.path.join(root,scene, "scan_points_indices", f"{frame_idx}.{file_type}")
if not os.path.exists(os.path.join(root,scene, "scan_points_indices")):
os.makedirs(os.path.join(root,scene, "scan_points_indices"))
save_np_pts(indices_path, scan_points_indices, file_type)
def save_scan_points(root, scene, scan_points: np.ndarray):
scan_points_path = os.path.join(root,scene, "scan_points.txt")
save_np_pts(scan_points_path, scan_points)
def get_world_points(depth, mask, cam_intrinsic, cam_extrinsic, random_downsample_N):
z = depth[mask]
i, j = np.nonzero(mask)
x = (j - cam_intrinsic[0, 2]) * z / cam_intrinsic[0, 0]
y = (i - cam_intrinsic[1, 2]) * z / cam_intrinsic[1, 1]
points_camera = np.stack((x, y, z), axis=-1).reshape(-1, 3)
sampled_target_points = PtsUtil.random_downsample_point_cloud(
points_camera, random_downsample_N
)
points_camera_aug = np.concatenate((sampled_target_points, np.ones((sampled_target_points.shape[0], 1))), axis=-1)
points_camera_world = np.dot(cam_extrinsic, points_camera_aug.T).T[:, :3]
return points_camera_world
def get_world_points_and_normal(depth, mask, normal, cam_intrinsic, cam_extrinsic, random_downsample_N):
z = depth[mask]
i, j = np.nonzero(mask)
x = (j - cam_intrinsic[0, 2]) * z / cam_intrinsic[0, 0]
y = (i - cam_intrinsic[1, 2]) * z / cam_intrinsic[1, 1]
points_camera = np.stack((x, y, z), axis=-1).reshape(-1, 3)
normal_camera = normal[mask].reshape(-1, 3)
sampled_target_points, idx = PtsUtil.random_downsample_point_cloud(
points_camera, random_downsample_N, require_idx=True
)
if len(sampled_target_points) == 0:
return np.zeros((0, 3)), np.zeros((0, 3))
sampled_normal_camera = normal_camera[idx]
points_camera_aug = np.concatenate((sampled_target_points, np.ones((sampled_target_points.shape[0], 1))), axis=-1)
points_camera_world = np.dot(cam_extrinsic, points_camera_aug.T).T[:, :3]
return points_camera_world, sampled_normal_camera
def get_scan_points_indices(scan_points, mask, display_table_mask_label, cam_intrinsic, cam_extrinsic):
scan_points_homogeneous = np.hstack((scan_points, np.ones((scan_points.shape[0], 1))))
points_camera = np.dot(np.linalg.inv(cam_extrinsic), scan_points_homogeneous.T).T[:, :3]
points_image_homogeneous = np.dot(cam_intrinsic, points_camera.T).T
points_image_homogeneous /= points_image_homogeneous[:, 2:]
pixel_x = points_image_homogeneous[:, 0].astype(int)
pixel_y = points_image_homogeneous[:, 1].astype(int)
h, w = mask.shape[:2]
valid_indices = (pixel_x >= 0) & (pixel_x < w) & (pixel_y >= 0) & (pixel_y < h)
mask_colors = mask[pixel_y[valid_indices], pixel_x[valid_indices]]
selected_points_indices = np.where((mask_colors == display_table_mask_label).all(axis=-1))[0]
selected_points_indices = np.where(valid_indices)[0][selected_points_indices]
return selected_points_indices
def save_scene_data(root, scene, scene_idx=0, scene_total=1,file_type="txt"):
''' configuration '''
target_mask_label = (0, 255, 0)
display_table_mask_label=(0, 0, 255)
random_downsample_N = 32768
voxel_size=0.003
filter_degree = 75
min_z = 0.2
max_z = 0.5
''' scan points '''
display_table_info = DataLoadUtil.get_display_table_info(root, scene)
radius = display_table_info["radius"]
scan_points = np.asarray(ReconstructionUtil.generate_scan_points(display_table_top=0,display_table_radius=radius))
''' read frame data(depth|mask|normal) '''
frame_num = DataLoadUtil.get_scene_seq_length(root, scene)
for frame_id in range(frame_num):
print(f"[scene({scene_idx}/{scene_total})|frame({frame_id}/{frame_num})]Processing {scene} frame {frame_id}")
path = DataLoadUtil.get_path(root, scene, frame_id)
cam_info = DataLoadUtil.load_cam_info(path, binocular=True)
depth_L, depth_R = DataLoadUtil.load_depth(
path, cam_info["near_plane"],
cam_info["far_plane"],
binocular=True
)
mask_L, mask_R = DataLoadUtil.load_seg(path, binocular=True)
normal_L = DataLoadUtil.load_normal(path, binocular=True, left_only=True)
''' target points '''
mask_img_L = mask_L
mask_img_R = mask_R
target_mask_img_L = (mask_L == target_mask_label).all(axis=-1)
target_mask_img_R = (mask_R == target_mask_label).all(axis=-1)
sampled_target_points_L, sampled_target_normal_L = get_world_points_and_normal(depth_L,target_mask_img_L,normal_L, cam_info["cam_intrinsic"], cam_info["cam_to_world"], random_downsample_N)
sampled_target_points_R = get_world_points(depth_R, target_mask_img_R, cam_info["cam_intrinsic"], cam_info["cam_to_world_R"], random_downsample_N)
has_points = sampled_target_points_L.shape[0] > 0 and sampled_target_points_R.shape[0] > 0
if has_points:
target_points, overlap_idx = PtsUtil.get_overlapping_points(
sampled_target_points_L, sampled_target_points_R, voxel_size, require_idx=True
)
sampled_target_normal_L = sampled_target_normal_L[overlap_idx]
if has_points:
has_points = target_points.shape[0] > 0
if has_points:
target_points, target_normals = PtsUtil.filter_points(
target_points, sampled_target_normal_L, cam_info["cam_to_world"], theta_limit = filter_degree, z_range=(min_z, max_z)
)
''' scan points indices '''
scan_points_indices_L = get_scan_points_indices(scan_points, mask_img_L, display_table_mask_label, cam_info["cam_intrinsic"], cam_info["cam_to_world"])
scan_points_indices_R = get_scan_points_indices(scan_points, mask_img_R, display_table_mask_label, cam_info["cam_intrinsic"], cam_info["cam_to_world_R"])
scan_points_indices = np.intersect1d(scan_points_indices_L, scan_points_indices_R)
if not has_points:
target_points = np.zeros((0, 3))
target_normals = np.zeros((0, 3))
save_target_points(root, scene, frame_id, target_points, file_type=file_type)
save_target_normals(root, scene, frame_id, target_normals, file_type=file_type)
save_scan_points_indices(root, scene, frame_id, scan_points_indices, file_type=file_type)
save_scan_points(root, scene, scan_points) # The "done" flag of scene preprocess
if __name__ == "__main__":
#root = "/media/hofee/repository/new_data_with_normal"
root = r"/media/hofee/data/data/test_bottle/view"
scene_list = os.listdir(root)
from_idx = 0 # 1000
to_idx = len(scene_list) # 1500
cnt = 0
import time
total = to_idx - from_idx
for scene in scene_list[from_idx:to_idx]:
start = time.time()
if os.path.exists(os.path.join(root, scene, "scan_points.txt")):
print(f"Scene {scene} has been processed")
cnt+=1
continue
save_scene_data(root, scene, cnt, total, file_type="npy")
cnt+=1
end = time.time()
print(f"Time cost: {end-start}")

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

57
runners/data_spliter.py Normal file
View File

@ -0,0 +1,57 @@
import os
import random
from PytorchBoot.runners.runner import Runner
from PytorchBoot.config import ConfigManager
from PytorchBoot.utils import Log
import PytorchBoot.stereotype as stereotype
from PytorchBoot.status import status_manager
@stereotype.runner("data_spliter")
class DataSpliter(Runner):
def __init__(self, config):
super().__init__(config)
self.load_experiment("data_split")
self.root_dir = ConfigManager.get("runner", "split", "root_dir")
self.type = ConfigManager.get("runner", "split", "type")
self.datasets = ConfigManager.get("runner", "split", "datasets")
self.datapath_list = self.load_all_datapath()
def run(self):
self.split_dataset()
def split_dataset(self):
random.shuffle(self.datapath_list)
start_idx = 0
for dataset_idx in range(len(self.datasets)):
dataset = list(self.datasets.keys())[dataset_idx]
ratio = self.datasets[dataset]["ratio"]
path = self.datasets[dataset]["path"]
split_size = int(len(self.datapath_list) * ratio)
split_files = self.datapath_list[start_idx:start_idx + split_size]
start_idx += split_size
self.save_split_files(path, split_files)
status_manager.set_progress("split", "data_splitor", "split dataset", dataset_idx, len(self.datasets))
Log.success(f"save {dataset} split files to {path}")
status_manager.set_progress("split", "data_splitor", "split dataset", len(self.datasets), len(self.datasets))
def save_split_files(self, path, split_files):
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "w") as f:
f.write("\n".join(split_files))
def load_all_datapath(self):
return os.listdir(self.root_dir)
def create_experiment(self, backup_name=None):
super().create_experiment(backup_name)
def load_experiment(self, backup_name=None):
super().load_experiment(backup_name)

View File

@ -0,0 +1,360 @@
import os
import json
from utils.render import RenderUtil
from utils.pose import PoseUtil
from utils.pts import PtsUtil
from utils.reconstruction import ReconstructionUtil
from beans.predict_result import PredictResult
import torch
from tqdm import tqdm
import numpy as np
import pickle
from PytorchBoot.config import ConfigManager
import PytorchBoot.namespace as namespace
import PytorchBoot.stereotype as stereotype
from PytorchBoot.factory import ComponentFactory
from PytorchBoot.dataset import BaseDataset
from PytorchBoot.runners.runner import Runner
from PytorchBoot.utils import Log
from PytorchBoot.status import status_manager
from utils.data_load import DataLoadUtil
@stereotype.runner("evaluate_uncertainty_guide")
class EvaluateUncertaintyGuide(Runner):
def __init__(self, config_path):
super().__init__(config_path)
self.script_path = ConfigManager.get(namespace.Stereotype.RUNNER, "blender_script_path")
self.output_dir = ConfigManager.get(namespace.Stereotype.RUNNER, "output_dir")
self.voxel_size = ConfigManager.get(namespace.Stereotype.RUNNER, "voxel_size")
self.min_new_area = ConfigManager.get(namespace.Stereotype.RUNNER, "min_new_area")
CM = 0.01
self.min_new_pts_num = self.min_new_area * (CM / self.voxel_size) ** 2
self.output_data_root = ConfigManager.get(namespace.Stereotype.RUNNER, "output_data_root")
self.output_data = dict()
for scene_name in os.listdir(self.output_data_root):
real_scene_name = scene_name[:-len("_poses.npy")]
self.output_data[real_scene_name] = np.load(os.path.join(self.output_data_root, scene_name))
#mport ipdb; ipdb.set_trace()
''' Experiment '''
self.load_experiment("nbv_evaluator")
self.stat_result_path = os.path.join(self.output_dir, "stat.json")
if os.path.exists(self.stat_result_path):
with open(self.stat_result_path, "r") as f:
self.stat_result = json.load(f)
else:
self.stat_result = {}
''' Test '''
self.test_config = ConfigManager.get(namespace.Stereotype.RUNNER, namespace.Mode.TEST)
self.test_dataset_name_list = self.test_config["dataset_list"]
self.test_set_list = []
self.test_writer_list = []
seen_name = set()
for test_dataset_name in self.test_dataset_name_list:
if test_dataset_name not in seen_name:
seen_name.add(test_dataset_name)
else:
raise ValueError("Duplicate test dataset name: {}".format(test_dataset_name))
test_set: BaseDataset = ComponentFactory.create(namespace.Stereotype.DATASET, test_dataset_name)
self.test_set_list.append(test_set)
self.print_info()
def run(self):
Log.info("Loading from epoch {}.".format(self.current_epoch))
self.inference()
Log.success("Inference finished.")
def inference(self):
#self.pipeline.eval()
with torch.no_grad():
test_set: BaseDataset
for dataset_idx, test_set in enumerate(self.test_set_list):
status_manager.set_progress("inference", "inferencer", f"dataset", dataset_idx, len(self.test_set_list))
test_set_name = test_set.get_name()
total=int(len(test_set))
for i in tqdm(range(total), desc=f"Processing {test_set_name}", ncols=100):
try:
data = test_set.__getitem__(i)
scene_name = data["scene_name"]
inference_result_path = os.path.join(self.output_dir, test_set_name, f"{scene_name}.pkl")
if os.path.exists(inference_result_path):
Log.info(f"Inference result already exists for scene: {scene_name}")
continue
status_manager.set_progress("inference", "inferencer", f"Batch[{test_set_name}]", i+1, total)
output = self.predict_sequence(data)
self.save_inference_result(test_set_name, data["scene_name"], output)
except Exception as e:
print(e)
Log.error(f"Error, {e}")
continue
status_manager.set_progress("inference", "inferencer", f"dataset", len(self.test_set_list), len(self.test_set_list))
def get_output_data(self, scene_name, idx):
pose_matrix = self.output_data[scene_name][idx]
pose_6d = PoseUtil.matrix_to_rotation_6d_numpy(pose_matrix[:3,:3])
pose_9d = np.concatenate([pose_6d, pose_matrix[:3,3]], axis=0).reshape(1,9)
import ipdb; ipdb.set_trace()
return {"pred_pose_9d": pose_9d}
def predict_sequence(self, data, cr_increase_threshold=0, overlap_area_threshold=25, scan_points_threshold=10, max_iter=50, max_retry = 10, max_success=3):
scene_name = data["scene_name"]
Log.info(f"Processing scene: {scene_name}")
status_manager.set_status("inference", "inferencer", "scene", scene_name)
''' data for rendering '''
scene_path = data["scene_path"]
O_to_L_pose = data["O_to_L_pose"]
voxel_threshold = self.voxel_size
filter_degree = 75
down_sampled_model_pts = data["gt_pts"]
first_frame_to_world_9d = data["first_scanned_n_to_world_pose_9d"][0]
first_frame_to_world = np.eye(4)
first_frame_to_world[:3,:3] = PoseUtil.rotation_6d_to_matrix_numpy(first_frame_to_world_9d[:6])
first_frame_to_world[:3,3] = first_frame_to_world_9d[6:]
''' data for inference '''
input_data = {}
input_data["combined_scanned_pts"] = torch.tensor(data["first_scanned_pts"][0], dtype=torch.float32).to(self.device).unsqueeze(0)
input_data["scanned_pts"] = [torch.tensor(data["first_scanned_pts"][0], dtype=torch.float32).to(self.device).unsqueeze(0)]
input_data["scanned_pts_mask"] = [torch.zeros(input_data["combined_scanned_pts"].shape[1], dtype=torch.bool).to(self.device).unsqueeze(0)]
input_data["scanned_n_to_world_pose_9d"] = [torch.tensor(data["first_scanned_n_to_world_pose_9d"], dtype=torch.float32).to(self.device)]
input_data["mode"] = namespace.Mode.TEST
input_pts_N = input_data["combined_scanned_pts"].shape[1]
root = os.path.dirname(scene_path)
display_table_info = DataLoadUtil.get_display_table_info(root, scene_name)
radius = display_table_info["radius"]
scan_points = np.asarray(ReconstructionUtil.generate_scan_points(display_table_top=0,display_table_radius=radius))
first_frame_target_pts, first_frame_target_normals, first_frame_scan_points_indices = RenderUtil.render_pts(first_frame_to_world, scene_path, self.script_path, scan_points, voxel_threshold=voxel_threshold, filter_degree=filter_degree, nO_to_nL_pose=O_to_L_pose)
scanned_view_pts = [first_frame_target_pts]
history_indices = [first_frame_scan_points_indices]
last_pred_cr, added_pts_num = self.compute_coverage_rate(scanned_view_pts, None, down_sampled_model_pts, threshold=voxel_threshold)
retry_duplication_pose = []
retry_no_pts_pose = []
retry_overlap_pose = []
retry = 0
pred_cr_seq = [last_pred_cr]
success = 0
last_pts_num = PtsUtil.voxel_downsample_point_cloud(data["first_scanned_pts"][0], voxel_threshold).shape[0]
#import time
for i in range(len(self.output_data[scene_name])):
#import ipdb; ipdb.set_trace()
Log.green(f"iter: {len(pred_cr_seq)}, retry: {retry}/{max_retry}, success: {success}/{max_success}")
combined_scanned_pts = np.vstack(scanned_view_pts)
voxel_downsampled_combined_scanned_pts_np, inverse = self.voxel_downsample_with_mapping(combined_scanned_pts, voxel_threshold)
output = self.get_output_data(scene_name, i)
pred_pose_9d = output["pred_pose_9d"]
import ipdb; ipdb.set_trace()
#pred_pose = torch.eye(4, device=pred_pose_9d.device)
# # save pred_pose_9d ------
# root = "/media/hofee/data/project/python/nbv_reconstruction/nbv_reconstruction/temp_output_result"
# scene_dir = os.path.join(root, scene_name)
# if not os.path.exists(scene_dir):
# os.makedirs(scene_dir)
# pred_9d_path = os.path.join(scene_dir,f"pred_pose_9d_{len(pred_cr_seq)}.npy")
# pts_path = os.path.join(scene_dir,f"combined_scanned_pts_{len(pred_cr_seq)}.txt")
# np_combined_scanned_pts = input_data["combined_scanned_pts"][0].cpu().numpy()
# np.save(pred_9d_path, pred_pose_9d.cpu().numpy())
# np.savetxt(pts_path, np_combined_scanned_pts)
# # ----- ----- -----
predict_result = PredictResult(pred_pose_9d, input_pts=input_data["combined_scanned_pts"][0].cpu().numpy(), cluster_params=dict(eps=0.25, min_samples=3))
# -----------------------
# import ipdb; ipdb.set_trace()
# predict_result.visualize()
# -----------------------
pred_pose_9d_candidates = predict_result.candidate_9d_poses
for pred_pose_9d in pred_pose_9d_candidates:
#import ipdb; ipdb.set_trace()
pred_pose_9d = torch.tensor(pred_pose_9d, dtype=torch.float32).to(self.device).unsqueeze(0)
pred_pose[:3,:3] = PoseUtil.rotation_6d_to_matrix_tensor_batch(pred_pose_9d[:,:6])[0]
pred_pose[:3,3] = pred_pose_9d[0,6:]
try:
new_target_pts, new_target_normals, new_scan_points_indices = RenderUtil.render_pts(pred_pose, scene_path, self.script_path, scan_points, voxel_threshold=voxel_threshold, filter_degree=filter_degree, nO_to_nL_pose=O_to_L_pose)
#import ipdb; ipdb.set_trace()
if not ReconstructionUtil.check_scan_points_overlap(history_indices, new_scan_points_indices, scan_points_threshold):
curr_overlap_area_threshold = overlap_area_threshold
else:
curr_overlap_area_threshold = overlap_area_threshold * 0.5
downsampled_new_target_pts = PtsUtil.voxel_downsample_point_cloud(new_target_pts, voxel_threshold)
overlap, _ = ReconstructionUtil.check_overlap(downsampled_new_target_pts, voxel_downsampled_combined_scanned_pts_np, overlap_area_threshold = curr_overlap_area_threshold, voxel_size=voxel_threshold, require_new_added_pts_num = True)
if not overlap:
Log.yellow("no overlap!")
retry += 1
retry_overlap_pose.append(pred_pose.cpu().numpy().tolist())
continue
history_indices.append(new_scan_points_indices)
except Exception as e:
Log.error(f"Error in scene {scene_path}, {e}")
print("current pose: ", pred_pose)
print("curr_pred_cr: ", last_pred_cr)
retry_no_pts_pose.append(pred_pose.cpu().numpy().tolist())
retry += 1
continue
if new_target_pts.shape[0] == 0:
Log.red("no pts in new target")
retry_no_pts_pose.append(pred_pose.cpu().numpy().tolist())
retry += 1
continue
pred_cr, _ = self.compute_coverage_rate(scanned_view_pts, new_target_pts, down_sampled_model_pts, threshold=voxel_threshold)
Log.yellow(f"{pred_cr}, {last_pred_cr}, max: , {data['seq_max_coverage_rate']}")
if pred_cr >= data["seq_max_coverage_rate"] - 1e-3:
print("max coverage rate reached!: ", pred_cr)
pred_cr_seq.append(pred_cr)
scanned_view_pts.append(new_target_pts)
input_data["scanned_n_to_world_pose_9d"] = [torch.cat([input_data["scanned_n_to_world_pose_9d"][0], pred_pose_9d], dim=0)]
combined_scanned_pts = np.vstack(scanned_view_pts)
voxel_downsampled_combined_scanned_pts_np = PtsUtil.voxel_downsample_point_cloud(combined_scanned_pts, voxel_threshold)
random_downsampled_combined_scanned_pts_np = PtsUtil.random_downsample_point_cloud(voxel_downsampled_combined_scanned_pts_np, input_pts_N)
input_data["combined_scanned_pts"] = torch.tensor(random_downsampled_combined_scanned_pts_np, dtype=torch.float32).unsqueeze(0).to(self.device)
input_data["scanned_pts"] = [torch.cat([input_data["scanned_pts"][0], torch.tensor(random_downsampled_combined_scanned_pts_np, dtype=torch.float32).unsqueeze(0).to(self.device)], dim=0)]
last_pred_cr = pred_cr
pts_num = voxel_downsampled_combined_scanned_pts_np.shape[0]
Log.info(f"delta pts num:,{pts_num - last_pts_num },{pts_num}, {last_pts_num}")
if pts_num - last_pts_num < self.min_new_pts_num and pred_cr <= data["seq_max_coverage_rate"] - 1e-2:
retry += 1
retry_duplication_pose.append(pred_pose.cpu().numpy().tolist())
Log.red(f"delta pts num < {self.min_new_pts_num}:, {pts_num}, {last_pts_num}")
elif pts_num - last_pts_num < self.min_new_pts_num and pred_cr > data["seq_max_coverage_rate"] - 1e-2:
success += 1
Log.success(f"delta pts num < {self.min_new_pts_num}:, {pts_num}, {last_pts_num}")
last_pts_num = pts_num
input_data["scanned_n_to_world_pose_9d"] = input_data["scanned_n_to_world_pose_9d"][0].cpu().numpy().tolist()
result = {
"pred_pose_9d_seq": input_data["scanned_n_to_world_pose_9d"],
"combined_scanned_pts": input_data["combined_scanned_pts"],
"target_pts_seq": scanned_view_pts,
"coverage_rate_seq": pred_cr_seq,
"max_coverage_rate": data["seq_max_coverage_rate"],
"pred_max_coverage_rate": max(pred_cr_seq),
"scene_name": scene_name,
"retry_no_pts_pose": retry_no_pts_pose,
"retry_duplication_pose": retry_duplication_pose,
"retry_overlap_pose": retry_overlap_pose,
"best_seq_len": data["best_seq_len"],
}
self.stat_result[scene_name] = {
"coverage_rate_seq": pred_cr_seq,
"pred_max_coverage_rate": max(pred_cr_seq),
"pred_seq_len": len(pred_cr_seq),
}
print('success rate: ', max(pred_cr_seq))
return result
def voxel_downsample_with_mapping(self, point_cloud, voxel_size=0.003):
voxel_indices = np.floor(point_cloud / voxel_size).astype(np.int32)
unique_voxels, inverse, counts = np.unique(voxel_indices, axis=0, return_inverse=True, return_counts=True)
idx_sort = np.argsort(inverse)
idx_unique = idx_sort[np.cumsum(counts)-counts]
downsampled_points = point_cloud[idx_unique]
return downsampled_points, inverse
def compute_coverage_rate(self, scanned_view_pts, new_pts, model_pts, threshold=0.005):
if new_pts is not None:
new_scanned_view_pts = scanned_view_pts + [new_pts]
else:
new_scanned_view_pts = scanned_view_pts
combined_point_cloud = np.vstack(new_scanned_view_pts)
down_sampled_combined_point_cloud = PtsUtil.voxel_downsample_point_cloud(combined_point_cloud,threshold)
return ReconstructionUtil.compute_coverage_rate(model_pts, down_sampled_combined_point_cloud, threshold)
def voxel_downsample_with_mapping(self, point_cloud, voxel_size=0.003):
voxel_indices = np.floor(point_cloud / voxel_size).astype(np.int32)
unique_voxels, inverse, counts = np.unique(voxel_indices, axis=0, return_inverse=True, return_counts=True)
idx_sort = np.argsort(inverse)
idx_unique = idx_sort[np.cumsum(counts)-counts]
downsampled_points = point_cloud[idx_unique]
return downsampled_points, inverse
def save_inference_result(self, dataset_name, scene_name, output):
dataset_dir = os.path.join(self.output_dir, dataset_name)
if not os.path.exists(dataset_dir):
os.makedirs(dataset_dir)
output_path = os.path.join(dataset_dir, f"{scene_name}.pkl")
pickle.dump(output, open(output_path, "wb"))
with open(self.stat_result_path, "w") as f:
json.dump(self.stat_result, f)
def get_checkpoint_path(self, is_last=False):
return os.path.join(self.experiment_path, namespace.Direcotry.CHECKPOINT_DIR_NAME,
"Epoch_{}.pth".format(
self.current_epoch if self.current_epoch != -1 and not is_last else "last"))
def load_checkpoint(self, is_last=False):
self.load(self.get_checkpoint_path(is_last))
Log.success(f"Loaded checkpoint from {self.get_checkpoint_path(is_last)}")
if is_last:
checkpoint_root = os.path.join(self.experiment_path, namespace.Direcotry.CHECKPOINT_DIR_NAME)
meta_path = os.path.join(checkpoint_root, "meta.json")
if not os.path.exists(meta_path):
raise FileNotFoundError(
"No checkpoint meta.json file in the experiment {}".format(self.experiments_config["name"]))
file_path = os.path.join(checkpoint_root, "meta.json")
with open(file_path, "r") as f:
meta = json.load(f)
self.current_epoch = meta["last_epoch"]
self.current_iter = meta["last_iter"]
def load_experiment(self, backup_name=None):
super().load_experiment(backup_name)
self.current_epoch = self.experiments_config["epoch"]
#self.load_checkpoint(is_last=(self.current_epoch == -1))
def create_experiment(self, backup_name=None):
super().create_experiment(backup_name)
def load(self, path):
state_dict = torch.load(path)
self.pipeline.load_state_dict(state_dict)
def print_info(self):
def print_dataset(dataset: BaseDataset):
config = dataset.get_config()
name = dataset.get_name()
Log.blue(f"Dataset: {name}")
for k,v in config.items():
Log.blue(f"\t{k}: {v}")
super().print_info()
table_size = 70
Log.blue(f"{'+' + '-' * (table_size // 2)} Pipeline {'-' * (table_size // 2)}" + '+')
#Log.blue(self.pipeline)
Log.blue(f"{'+' + '-' * (table_size // 2)} Datasets {'-' * (table_size // 2)}" + '+')
for i, test_set in enumerate(self.test_set_list):
Log.blue(f"test dataset {i}: ")
print_dataset(test_set)
Log.blue(f"{'+' + '-' * (table_size // 2)}----------{'-' * (table_size // 2)}" + '+')

Some files were not shown because too many files have changed in this diff Show More