-
Loss function for multi-label classification (2021. 5. 3. 20:01)
Binary cross entropy is used for multi-label classification, together with a sigmoid function on each output. (Note that cross entropy with a softmax function is used for multi-class classification.) (reference) Example of a simple neural network model for multi-label classification in Keras (link): the size of the output layer must equal the number of labels. [PyTorch web] BCEWithLogitsLoss (link)
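A minimal PyTorch sketch of the idea (the batch size and label count below are made up for illustration):

import torch
import torch.nn as nn

num_labels = 5                                           # hypothetical number of labels
logits = torch.randn(8, num_labels)                      # raw outputs, no sigmoid applied yet
targets = torch.randint(0, 2, (8, num_labels)).float()   # multi-hot ground-truth labels

criterion = nn.BCEWithLogitsLoss()   # fuses sigmoid + binary cross entropy for numerical stability
loss = criterion(logits, targets)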
-
Metrics for Multi-label classification (2021. 5. 3. 12:01)
AUC (Area Under the Curve) [1] developers.google.com/machine-learning/crash-course/classification/roc-and-auc?hl=ko AUC is the area under the ROC (receiver operating characteristic) curve. The ROC curve plots TPR (True Positive Rate) against FPR (False Positive Rate): $$ TPR = \frac{TP}{TP + FN} $$ $$ FPR = \frac{FP}{FP + TN} $$ The ROC curve can be drawn by adjusting a classification threshold from..
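A short sketch with scikit-learn on dummy data (for the multi-label case, y_true and y_score would be (N, num_labels) matrices and an average such as 'macro' would be passed):

import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])           # predicted probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # sweeps the threshold, returns FPR/TPR pairs
auc = roc_auc_score(y_true, y_score)                # area under that curve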
-
PyTorch Example of LSTM (2021. 4. 21. 15:19)
Architecture [3] The main components are: 1) the hidden and cell states, 2) the input, forget, and output gates. $$ i_1 = \sigma ( W_{i_1} \cdot (H_{t-1}, x_t) + b_{i_1} ) $$ $$ i_2 = \tanh ( W_{i_2} \cdot (H_{t-1}, x_t) + b_{i_2} ) $$ $$ i_{input} = i_1 * i_2 $$ $$ f = \sigma ( W_{forget} \cdot (H_{t-1}, x_t) + b_{forget} ) $$ $$ C_t = C_{t-1} * f + i_{input} $$ $$ O_1 = \sigma ( W_{output_1} \cdot ..
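A minimal runnable sketch of nn.LSTM (the dimensions are arbitrary toy values):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=1, batch_first=True)
x = torch.randn(4, 7, 10)    # (batch, seq_len, input_size)
h0 = torch.zeros(1, 4, 20)   # initial hidden state H_0: (num_layers, batch, hidden_size)
c0 = torch.zeros(1, 4, 20)   # initial cell state C_0

output, (hn, cn) = lstm(x, (h0, c0))
# output: (4, 7, 20), the hidden state H_t at every time step; cn is the final cell state C_t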
-
Log-bilinear Language Model (2021. 4. 20. 17:09)
It computes the probability of the next word $w_i$ given the previous words (the context) as follows: $$ P(w_i = w | w_{i-1}, ..., w_1) = \frac{ \exp\{ \phi(w)^T c \} }{ \sum_{w^\prime \in V} \exp\{ \phi(w^\prime)^T c \} } $$ Here $\phi(w)$ is a word vector and $c$ is the context for $w_i$, computed as $$ c = \sum_{n=1}^{i-1} \alpha_n \phi(w_n) $$ Thus, the log-bilinear language model computes a conte..
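A small sketch of these two formulas in PyTorch (the vocabulary size, embedding dimension, context indices, and uniform $\alpha_n$ are all invented for illustration):

import torch

V, d = 1000, 64                           # hypothetical vocabulary size and embedding dim
phi = torch.randn(V, d)                   # word vectors phi(w), one row per word
context_ids = torch.tensor([3, 17, 42])   # indices of w_1 ... w_{i-1}
alpha = torch.ones(len(context_ids))      # position weights alpha_n (uniform here)

c = (alpha.unsqueeze(1) * phi[context_ids]).sum(dim=0)   # c = sum_n alpha_n * phi(w_n)
probs = torch.softmax(phi @ c, dim=0)                    # P(w_i = w | context), shape (V,)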
-
Visualization of CNN (2021. 4. 15. 16:32)
There are two kinds of CNN visualization: 1) visualization of intermediate activation layers, and 2) visualization of a representative image or pattern that strongly activates a certain kernel. 1. Visualization of Intermediate Activation Layers You visualize the output $a$ of a certain activation layer, where $a \in \mathbb{R}^{B \times C_{in} \times H \times W}$ and $B$ refers to the batch siz..
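One common way to grab such an intermediate output in PyTorch is a forward hook; a sketch with a randomly initialized ResNet-18 (the layer choice and dummy input are arbitrary):

import torch
import torchvision.models as models

model = models.resnet18().eval()   # random weights are enough to show the mechanics
activations = {}

def hook(module, inputs, output):
    activations['layer1'] = output.detach()   # shape (B, C, H, W)

model.layer1.register_forward_hook(hook)
model(torch.randn(1, 3, 224, 224))   # dummy input image triggers the hook
fmap = activations['layer1'][0, 0]   # one (H, W) channel map, e.g. for plt.imshow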
-
What does .eval() do in PyTorch? (2021. 4. 14. 18:21)
.eval() is typically used during inference with models that contain BN and/or Dropout. When .eval() is set, a model with BN uses running_mean and running_var instead of the mean and variance computed from each mini-batch. Fine-tuning When fine-tuning, it is important to use the running_mean and running_var of the trained model, and they should be kept fixed during fine-tuning. This is because usually th..
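A sketch of the mode switch and of keeping only BN in eval mode while fine-tuning (the tiny model is a stand-in):

import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.BatchNorm1d(10), nn.Dropout(0.5))

model.eval()    # BN uses running_mean/running_var; Dropout is disabled
model.train()   # back to mini-batch statistics and active Dropout

# fine-tuning with frozen BN statistics: call this after model.train()
for m in model.modules():
    if isinstance(m, nn.BatchNorm1d):
        m.eval()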
-
Git Tips (2021. 4. 5. 09:54)
.gitignore The .gitignore file is a plain-text file where each line contains a pattern for files/directories to ignore. You may ignore them for the following reasons: 1) security, 2) size, 3) unrelated to the project. .gitignore should be located in the root directory. Syntax [1] www.pluralsight.com/guides/how-to-use-gitignore-file [2] programming119.tistory.com/105 Examples 1-InitialExperiment/devel..
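A small illustrative .gitignore covering those three reasons (all paths are made up):

# security: never commit credentials
secrets.yaml
*.pem
# size: large data and model checkpoints
data/
*.ckpt
# unrelated to the project
.idea/
__pycache__/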
-
W&B Tips (2021. 4. 4. 10:12)
[1] QuickStart: docs.wandb.ai/quickstart [2] wandb.init(...): docs.wandb.ai/library/init [3] wandb.log(...): docs.wandb.ai/library/log [4] PyTorch Integration: docs.wandb.ai/integrations/pytorch Quick Start Set your config with argparse. wandb.init(project='project_name_you_want', config=args) wandb.watch(model)  # automatically logs gradients and model parameters Set up a training pipeline and ..
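Putting those calls together, a minimal runnable sketch (the project name, stand-in model, and fake loss are placeholders; a prior wandb login is assumed):

import argparse
import torch.nn as nn
import wandb

parser = argparse.ArgumentParser()
parser.add_argument('--lr', type=float, default=1e-3)
args = parser.parse_args([])   # empty list so the sketch runs outside a CLI

model = nn.Linear(10, 1)       # stand-in model
wandb.init(project='project_name_you_want', config=args)
wandb.watch(model)             # log gradients and parameters automatically

for epoch in range(3):
    loss = 1.0 / (epoch + 1)   # placeholder metric
    wandb.log({'loss': loss, 'epoch': epoch})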