Hyperparameters

- **Pixel** not patch

- How many timeslots to sub-sample when creating d-pixel

   - 16 timeslots
   - 25 timeslots
   - 40 timeslots
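A minimal sketch of the sub-sampling step, assuming a pixel's time series is a plain list of per-timeslot observations and that timeslots are drawn uniformly at random without replacement, then kept in chronological order. The function name `subsample_timeslots` and the sampling strategy are illustrative assumptions, not the confirmed method.

```python
import random

def subsample_timeslots(series, n_timeslots, seed=None):
    """Pick n_timeslots observations from a pixel time series, keeping time order."""
    if n_timeslots > len(series):
        raise ValueError("series has fewer observations than n_timeslots")
    rng = random.Random(seed)
    # Sample indices without replacement, then sort to preserve chronological order.
    indices = sorted(rng.sample(range(len(series)), n_timeslots))
    return indices, [series[i] for i in indices]

# Hypothetical pixel with 100 observations; try the three candidate sizes.
pixel_series = list(range(100))
for n in (16, 25, 40):
    kept_indices, kept_values = subsample_timeslots(pixel_series, n, seed=0)
```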

- Representation dimension

   - 64, **128**, or 256

- Representation length for each dimension

   - ~~FP8~~
   - ~~INT8~~
   - ~~Float16~~
   - ~~Bfloat16~~
   - **32 bits**
       - look at the distribution of representations for each dimension to see if they can be reduced
       - Matryoshka may change things
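One way to look at the per-dimension distribution is to report each dimension's value range and how much error a 16-bit float round-trip would introduce. The sketch below assumes representations are stored as lists of Python floats; the helper names are hypothetical, and float16 is used only as an example of a reduced format.

```python
import struct

def to_float16(x):
    """Round-trip a float through IEEE 754 half precision.

    struct raises OverflowError if x is outside the float16 range.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]

def per_dimension_summary(representations):
    """For each dimension, report the value range and the worst float16 round-trip error."""
    n_dims = len(representations[0])
    summary = []
    for d in range(n_dims):
        column = [rep[d] for rep in representations]
        max_err = max(abs(v - to_float16(v)) for v in column)
        summary.append({"min": min(column), "max": max(column), "max_fp16_error": max_err})
    return summary

# Two toy 2-dimensional representations (made-up values for illustration).
summary = per_dimension_summary([[0.5, 1000.1], [-0.25, -3.0]])
```

Dimensions with a narrow range and a small round-trip error are candidates for a shorter representation length.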

- Projector size

   - 0, 256, 512, **1024**

- Loss function

   - Barlow Twins (parameter lambda = 0.005)
   - **MMCR (parameters alpha=0.005, lambda=0.005)**
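For reference, the Barlow Twins objective can be sketched in plain Python as follows. It assumes two batches of embeddings (one per augmented view) given as lists of lists, standardizes each dimension with the population standard deviation, and weights the off-diagonal terms by `lam` (lambda = 0.005 as listed above). This is an illustrative sketch, not the training code, and MMCR is not shown.

```python
import math

def standardize(batch):
    """Zero-mean, unit-variance each dimension across the batch."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            out[i][j] = (col[i] - mean) / (std + 1e-12)  # epsilon avoids division by zero
    return out

def barlow_twins_loss(z_a, z_b, lam=0.005):
    """Sum of (1 - c_ii)^2 on the diagonal plus lam * c_ij^2 off the diagonal."""
    n, d = len(z_a), len(z_a[0])
    a, b = standardize(z_a), standardize(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            # Cross-correlation between dimension i of view A and dimension j of view B.
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2
            else:
                loss += lam * c_ij ** 2
    return loss

# Identical views with decorrelated dimensions give a loss close to zero.
z = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
loss_same = barlow_twins_loss(z, z)
```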

- Learning rate

   - **0.0001**
   - others

- Encoder type (each with its own parameters)

   - MLP
   - ResNet
   - **Transformer**
       - **8 attention heads**
       - Q, K, V same dimension as representation dimension = 128
       - **3 layers**
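The selected Transformer settings can be captured in a small config object. The sketch below assumes standard multi-head attention, where the 128-dimensional Q, K, V projections are split evenly across heads, so each of the 8 heads works on 128 / 8 = 16 dimensions. The class name and fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransformerEncoderConfig:
    representation_dim: int = 128  # Q, K, V share this dimension
    num_heads: int = 8
    num_layers: int = 3

    def __post_init__(self):
        # The even split across heads only works if the dimension is divisible.
        if self.representation_dim % self.num_heads != 0:
            raise ValueError("representation_dim must be divisible by num_heads")

    @property
    def head_dim(self):
        """Per-head dimension under the even-split assumption."""
        return self.representation_dim // self.num_heads

config = TransformerEncoderConfig()
```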

- How many augmentation pairs to use for each pixel

   - Training
       - **1**, 2
   - Testing (number of inferences for downstream task)
       - **1**
       - 10 (prioritise this)
           - majority vote
           - **average**
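The two ways of combining several test-time inferences can give different answers. The sketch below assumes each inference yields a vector of class probabilities, and that "average" means averaging those probabilities before taking the top class (averaging the representations before classification would be another reading). Function names are hypothetical.

```python
from collections import Counter

def majority_vote(predicted_labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predicted_labels).most_common(1)[0][0]

def average_probabilities(prob_vectors):
    """Average class probabilities over inferences, then pick the top class."""
    n = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    mean = [sum(p[c] for p in prob_vectors) / n for c in range(n_classes)]
    predicted = max(range(n_classes), key=lambda c: mean[c])
    return predicted, mean

# Three made-up inferences for one pixel: one confident vote for class 0,
# two weak votes for class 1.
probs = [[0.9, 0.1], [0.4, 0.6], [0.4, 0.6]]
labels = [max(range(2), key=lambda c: p[c]) for p in probs]
vote_label = majority_vote(labels)                 # class 1 wins the vote
avg_label, avg_probs = average_probabilities(probs)  # class 0 wins on average
```

Averaging keeps the confidence of each inference, whereas majority voting counts every inference equally.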

- Downstream classifier

   - **MLP**
       - Number of layers
           - **3**
   - Random Forest
   - XGBoost
   - Linear regression
   - Logistic regression

   - **Choose the size of the fixed-length representations** based on the distribution of the number of cloudy days in the training data: base length
       - augmentations
           - masking of season or some blocks
           - FFT on the pixels
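A minimal sketch of the block-masking augmentation, assuming a pixel's time series is a list of values and that a masked timeslot is replaced by a fill value. The function name, the fill value of 0.0, and the uniform choice of block position are assumptions; masking a whole season would correspond to fixing the start and length of the block.

```python
import random

def mask_block(series, block_length, mask_value=0.0, seed=None):
    """Return a copy of the series with one contiguous block of timeslots masked."""
    if not 0 < block_length <= len(series):
        raise ValueError("block_length must be between 1 and len(series)")
    rng = random.Random(seed)
    # randint is inclusive on both ends, so the block always fits in the series.
    start = rng.randint(0, len(series) - block_length)
    masked = list(series)  # copy, so the original series is left untouched
    masked[start:start + block_length] = [mask_value] * block_length
    return start, masked

# Hypothetical 12-step series with a 3-step block masked.
start, augmented = mask_block([1.0] * 12, block_length=3, seed=1)
```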
