Hyperparameters
- **Choose the size of the fixed-length representations** based on the distribution of the number of cloudy days in the training data: base length
- Augmentations
- masking of season or some blocks
- FFT on the pixels
- 16 timeslot sub-sample
- 25 timeslot sub-sample
- Representation dimension
- 64, **128**, or 256
- ~~Representation length for each dimension~~
- ~~FP8~~
- ~~INT8~~
- ~~Float16~~
- ~~Bfloat16~~
- **32 bits**
- look at the distribution of representations for each dimension to see if they can be reduced
- Matryoshka may change things
- Projector size
- 0, 256, 512, **1024**
- Loss function
- Barlow Twins (parameter lambda = 0.005)
- **MMCR (parameters alpha = 0.005, lambda = 0.005)**
- Learning rate
- **0.0001**
- others
- chosen by Frank
- depends on the data size
- Encoder type (each with its own parameters)
- MLP
- ResNet
- **Transformer**
- **8 attention heads**
- Q, K, V have the same dimension as the representation dimension (128)
- **3 layers**
- How many augmentation pairs to use for each pixel
- Training
- **1**, 2
- Testing (number of inferences for downstream task)
- **1**
- 10 (prioritise this)
- majority vote
- **average**
- Downstream classifier
- **MLP**
- Number of layers
- **3**
- Random Forest
- XGBoost
- Linear regression
- Logistic regression
- **Pixel**, not patch
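
As a rough illustration, the bold (selected) values above could be collected in a single configuration object, and the two test-time aggregation options (majority vote and average) could be compared on a toy input. This is only a sketch: the class name, field names, and both helper functions are hypothetical, and it assumes that "average" means averaging the predicted class probabilities over the repeated inferences, which the list does not state.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class SelectedHyperparameters:
    """Bold values from the list above. All field names are hypothetical."""
    representation_dim: int = 128
    representation_bits: int = 32
    projector_size: int = 1024
    loss: str = "MMCR"
    mmcr_alpha: float = 0.005
    mmcr_lambda: float = 0.005
    learning_rate: float = 1e-4
    encoder: str = "transformer"
    attention_heads: int = 8
    encoder_layers: int = 3
    train_augmentation_pairs: int = 1
    test_inferences: int = 1
    test_aggregation: str = "average"
    downstream_classifier: str = "MLP"
    downstream_layers: int = 3


def average_vote(prob_runs):
    """Average class probabilities over runs, then return the argmax class index.

    Assumption: "average" in the list refers to averaging class probabilities.
    """
    n_classes = len(prob_runs[0])
    mean_probs = [fmean(run[c] for run in prob_runs) for c in range(n_classes)]
    return max(range(n_classes), key=mean_probs.__getitem__)


def majority_vote(prob_runs):
    """Take the argmax class of each run, then return the most frequent class."""
    votes = [max(range(len(run)), key=run.__getitem__) for run in prob_runs]
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    cfg = SelectedHyperparameters()
    # Each attention head gets an equal slice of the representation.
    print("head dim:", cfg.representation_dim // cfg.attention_heads)

    # Three inferences for one pixel, two classes (toy numbers, not real results).
    runs = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
    print("average:", average_vote(runs))
    print("majority:", majority_vote(runs))
```

On the toy input the two rules disagree: one confident run outweighs two marginal runs under averaging, whereas majority voting counts each run equally. That difference is the practical reason to compare both options when using 10 inferences.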