Difference between revisions of "Hyperparameters"
Revision as of 11:02, 22 May 2025
- **Pixel** not patch
- How many timeslots to sub-sample when creating d-pixel
- 16 timeslots
- 25 timeslots
- 40 timeslots
- Representation dimension
- 64, **128**, or 256
- Representation length for each dimension
- ~~FP8~~
- ~~INT8~~
- ~~Float16~~
- ~~Bfloat16~~
- **32 bits**
- look at the distribution of representations for each dimension to see if they can be reduced
- Matryoshka may change things
- Projector size
- 0, 256, 512, **1024**
- Loss function
- Barlow Twins (parameter lambda = 0.005)
- **MMCR (parameters alpha = 0.005, lambda = 0.005)**
- Learning rate
- **0.0001**
- others
- Encoder type (each with its own parameters)
- MLP
- ResNet
- **Transformer**
- **8 attention heads**
- Q, K, V have the same dimension as the representation dimension (128)
- **3 layers**
- How many augmentation pairs to use for each pixel
- Training
- **1**, 2
- Testing (number of inferences for downstream task)
- **1**
- 10 (prioritise this)
- majority vote
- **average**
- Downstream classifier
- **MLP**
- Number of layers
- **3**
- Random Forest
- XGBoost
- Linear regression
- Logistic regression
- **Choose the size of the fixed-length representations** based on the distribution of the number of cloudy days in the training data: base length
- augmentations
- masking of season or some blocks
- FFT on the pixels
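
The bold entries in the list above can be gathered into a single configuration object, which makes the current default setup easier to see at a glance. This is only a sketch: the class name and every field name are hypothetical, and only the values come from the list. No timeslot count is included because none of the three options is marked as selected.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SelectedHyperparameters:
    """Bold (currently selected) values from the list; field names are hypothetical."""

    input_unit: str = "pixel"           # pixel, not patch
    representation_dim: int = 128       # alternatives listed: 64, 256
    representation_bits: int = 32       # FP8, INT8, Float16, Bfloat16 are struck out
    projector_size: int = 1024          # alternatives listed: 0, 256, 512
    loss: str = "MMCR"                  # alternative listed: Barlow Twins
    loss_alpha: float = 0.005
    loss_lambda: float = 0.005
    learning_rate: float = 1e-4
    encoder: str = "transformer"        # alternatives listed: MLP, ResNet
    attention_heads: int = 8
    qkv_dim: int = 128                  # same as representation_dim
    encoder_layers: int = 3
    train_augmentation_pairs: int = 1
    test_inferences: int = 1            # 10 is flagged as the next value to try
    test_aggregation: str = "average"   # alternative listed: majority vote
    downstream_classifier: str = "MLP"
    downstream_layers: int = 3

    def __post_init__(self):
        # The list states that Q, K, V share the representation dimension.
        if self.qkv_dim != self.representation_dim:
            raise ValueError("qkv_dim must equal representation_dim")


config = SelectedHyperparameters()
print(asdict(config))
```

Freezing the dataclass keeps a run's settings immutable once created; a variant (for example 10 test inferences) can be built with `dataclasses.replace(config, test_inferences=10)`.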
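
The two test-time aggregation options named above (majority vote and average) can disagree when several inferences are run per pixel. The sketch below illustrates the difference under one assumption that the list does not confirm: that each inference yields a vector of class probabilities and that "average" means averaging those probabilities. Averaging the representations before classification would be a different design. Function names are hypothetical.

```python
from collections import Counter


def aggregate_average(prob_runs):
    """Average class probabilities over all runs, then return the argmax class."""
    n_runs = len(prob_runs)
    n_classes = len(prob_runs[0])
    mean = [sum(run[c] for run in prob_runs) / n_runs for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def aggregate_majority(prob_runs):
    """Take the argmax class of each run, then return the most frequent class."""
    votes = [max(range(len(run)), key=run.__getitem__) for run in prob_runs]
    return Counter(votes).most_common(1)[0][0]


# Three inferences for one pixel, two classes (made-up numbers for illustration).
runs = [
    [0.45, 0.55],  # weakly favours class 1
    [0.45, 0.55],  # weakly favours class 1
    [0.95, 0.05],  # strongly favours class 0
]
print(aggregate_majority(runs), aggregate_average(runs))
```

Majority voting discards how confident each inference was, while averaging lets one confident inference outweigh several uncertain ones.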
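
The timeslot sub-sampling step at the top of the list (16, 25, or 40 timeslots per d-pixel) could look like the following. This is a hedged sketch, not the project's implementation: it assumes uniform sampling without replacement, assumes temporal order is preserved, and uses a hypothetical function name. How pixels with fewer valid observations than the target length are handled is not specified in the list, so the sketch simply raises an error in that case.

```python
import random


def subsample_timeslots(series, n_slots, rng):
    """Pick n_slots observations uniformly at random, keeping temporal order."""
    if n_slots > len(series):
        raise ValueError("series has fewer observations than n_slots")
    # Sample indices without replacement, then sort to preserve time order.
    idx = sorted(rng.sample(range(len(series)), n_slots))
    return [series[i] for i in idx]


rng = random.Random(0)             # seeded for reproducibility
observations = list(range(60))     # stand-in for 60 cloud-free acquisitions
for n_slots in (16, 25, 40):       # the three candidate values from the list
    sample = subsample_timeslots(observations, n_slots, rng)
    print(n_slots, len(sample))
```

Calling the function twice on the same pixel with different random draws gives two views of the same time series, which is one way an augmentation pair could be formed.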