Update example and notes
44
example/README.md
Normal file
@@ -0,0 +1,44 @@
# Example: HAI 21.03 Feature Split

This folder contains a small, reproducible example that inspects the HAI 21.03
CSV (train1) and produces a continuous/discrete split using a simple heuristic.

## Files
- analyze_hai21_03.py: reads a sample of the data and writes results.
- data_utils.py: CSV loading, vocab, normalization, and batching helpers.
- feature_split.json: column split for HAI 21.03.
- hybrid_diffusion.py: hybrid model + diffusion utilities.
- prepare_data.py: compute vocab and normalization stats.
- train_stub.py: end-to-end scaffold for loss computation.
- train.py: minimal training loop with checkpoints.
- sample.py: minimal sampling loop.
- model_design.md: step-by-step design notes.
- results/feature_split.txt: comma-separated feature lists.
- results/summary.txt: basic stats (rows sampled, column counts).

## Run
```
python /home/anay/Dev/diffusion/mask-ddpm/example/analyze_hai21_03.py
```

Prepare vocab + stats (writes to `example/results`):
```
python /home/anay/Dev/diffusion/mask-ddpm/example/prepare_data.py
```

Train a small run:
```
python /home/anay/Dev/diffusion/mask-ddpm/example/train.py
```

Sample from the trained model:
```
python /home/anay/Dev/diffusion/mask-ddpm/example/sample.py
```

## Notes
- Heuristic: numeric columns whose values are all integer-like and that take at
  most 10 distinct values are treated as discrete. All other numeric columns are
  continuous.
- The analysis script (`analyze_hai21_03.py`) reads only the first 5000 rows to stay fast.
- `prepare_data.py` runs without PyTorch, but `train.py` and `sample.py` require it.
- `train.py` and `sample.py` auto-select GPU if available; otherwise they fall back to CPU.
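
The split heuristic from the notes above can be sketched with the standard library alone. This is an illustrative reading of the rule, not the code in `analyze_hai21_03.py`: the function name `split_columns`, its parameters, and the choice to skip columns that do not parse as numbers are all assumptions.

```python
import csv
import io
from itertools import islice


def split_columns(rows, max_rows=5000, max_cardinality=10):
    """Split numeric columns into (continuous, discrete) lists of names.

    rows is an iterable of dicts mapping column name to cell text,
    for example a csv.DictReader. Only the first max_rows rows are read.
    """
    distinct = {}        # column name -> set of distinct float values
    non_numeric = set()  # columns with at least one cell that is not a number

    for row in islice(rows, max_rows):
        for name, text in row.items():
            try:
                distinct.setdefault(name, set()).add(float(text))
            except (TypeError, ValueError):
                non_numeric.add(name)

    continuous, discrete = [], []
    for name, values in distinct.items():
        if name in non_numeric:
            continue  # assumption: non-numeric columns are left out of the split
        integer_like = all(v.is_integer() for v in values)
        if integer_like and len(values) <= max_cardinality:
            discrete.append(name)
        else:
            continuous.append(name)
    return continuous, discrete
```

To try it on a file, one could pass `csv.DictReader(open(path, newline=""))` as `rows`, where `path` points at the CSV to inspect.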
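
The README does not spell out what "vocab" and "normalization stats" mean in `prepare_data.py`. One plausible reading, sketched below without PyTorch, is a value-to-index map for each discrete column and a mean and standard deviation for each continuous column. The helper names `build_vocab`, `mean_std`, and `normalize` are hypothetical and are not taken from `data_utils.py`.

```python
import math


def build_vocab(values):
    """Map each distinct discrete value to an integer id, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def mean_std(values):
    """Return (mean, population standard deviation) of a non-empty sequence."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def normalize(values, mean, std, eps=1e-8):
    """Standardize values; eps avoids division by zero for constant columns."""
    return [(v - mean) / (std + eps) for v in values]
```

Under this reading, the vocab would index the categories of a discrete column and the stats would rescale a continuous column to roughly zero mean and unit variance before training.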
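
The GPU fallback mentioned in the notes is commonly written with `torch.cuda.is_available()`. The sketch below shows that pattern. Whether `train.py` and `sample.py` do exactly this is an assumption, and `select_device` is a hypothetical name. The `ImportError` branch exists only so the snippet also runs on a machine without PyTorch.

```python
def select_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to select
    return "cuda" if torch.cuda.is_available() else "cpu"
```

With PyTorch installed, the returned string can be passed to `torch.device(...)` and then to `.to(device)` on the model and on each batch.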