-
Evidence Lower Bound (ELBO) | Data/Machine learning | 2021. 8. 24. 18:21
In the variational autoencoder (VAE) (my blog posting link), the probability of the latent variable $z$ given an observation $x$ can be written as follows using Bayes' theorem: $$ p(z | x) = \frac{ p(x|z) p(z) }{ p(x) } $$ where $p(z|x)$ is the posterior. However, this posterior is difficult to compute because of its marginal likelihood $p(x)$. $p(x)$ can be expanded as below using the law of total probability: $$ p(x) = \int p(x|z) \, p(z) \, dz $$ ..
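Since the excerpt stops at the intractable marginal likelihood, here is a hedged sketch of where the standard derivation goes (the approximate posterior $q(z|x)$ and the name ELBO are conventional notation, not quoted from the truncated post). Introducing $q(z|x)$ and applying Jensen's inequality gives $$ \log p(x) = \log \mathbb{E}_{q(z|x)} \left[ \frac{ p(x|z) p(z) }{ q(z|x) } \right] \geq \mathbb{E}_{q(z|x)} [ \log p(x|z) ] - KL \left( q(z|x) \, \| \, p(z) \right) $$ The right-hand side is the evidence lower bound (ELBO), which is maximized in place of the intractable $\log p(x)$.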
-
YAML Tutorial Quick Start | Data | 2021. 8. 4. 10:18
https://www.cloudbees.com/blog/yaml-tutorial-everything-you-need-get-started
YAML Tutorial: Everything You Need to Get Started in Minutes. YAML Ain't Markup Language (YAML) has risen in popularity over the past few years; here's a YAML tutorial to get you started quickly. (www.cloudbees.com)
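To make the pointer concrete, a minimal sketch of reading YAML in Python, assuming the PyYAML package is installed; the config keys below are made up for illustration.

import yaml  # PyYAML; assumed installed via `pip install pyyaml`

# A made-up config snippet, for illustration only.
doc = """
model:
  name: lstm
  hidden_size: 128
training:
  epochs: 10
  lr: 0.001
"""

config = yaml.safe_load(doc)  # safe_load avoids executing arbitrary tags
print(config["model"]["name"])       # -> lstm
print(config["training"]["epochs"])  # -> 10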
-
Loss function for multi-label classification | Data/Machine learning | 2021. 5. 3. 20:01
Binary cross-entropy is used for multi-label classification, and it is paired with a sigmoid activation on each output unit. (Note that categorical cross-entropy, paired with a softmax, is used for multi-class classification.) (reference) Example of a simple neural network model for multi-label classification in Keras (link) - The size of the output layer must equal the number of labels. [PyTorch web] BCEWithLogitsLoss (link)
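A minimal sketch of the PyTorch side, assuming a toy 3-label setup (the model, sizes, and tensors below are made up for illustration); BCEWithLogitsLoss applies the sigmoid internally, so the model outputs raw logits.

import torch
import torch.nn as nn

num_labels = 3                     # hypothetical number of labels
model = nn.Linear(8, num_labels)   # toy model: 8 input features, one logit per label

criterion = nn.BCEWithLogitsLoss() # sigmoid + binary cross-entropy in one numerically stable op

x = torch.randn(4, 8)                                    # batch of 4 examples
targets = torch.randint(0, 2, (4, num_labels)).float()   # multi-hot label matrix

logits = model(x)                  # raw scores; no sigmoid here
loss = criterion(logits, targets)
loss.backward()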
-
Metrics for Multi-label classification | Data/Machine learning | 2021. 5. 3. 12:01
AUC (Area Under the Curve) [1] developers.google.com/machine-learning/crash-course/classification/roc-and-auc?hl=ko The area under the ROC (receiver operating characteristic) curve. The ROC curve plots the TPR (true positive rate) against the FPR (false positive rate): $$ TPR = \frac{TP}{TP + FN} $$ $$ FPR = \frac{FP}{FP + TN} $$ The ROC curve can be drawn by adjusting the classification threshold from..
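A minimal sketch of computing AUC in the multi-label case with scikit-learn, assuming sklearn is available; roc_auc_score accepts a multi-hot target matrix with per-label scores, and the toy arrays below are made up for illustration.

import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up multi-hot targets and predicted probabilities: 4 samples, 3 labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.7],
                    [0.1, 0.8, 0.3],
                    [0.8, 0.6, 0.2],
                    [0.3, 0.1, 0.9]])

# One AUC per label, then averaged; 'macro' weights every label equally.
print(roc_auc_score(y_true, y_score, average='macro'))
print(roc_auc_score(y_true, y_score, average=None))  # per-label AUCs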
-
Signal Resampling (downsampling / upsampling) | Data | 2021. 4. 27. 11:35
The most popular/common functions are scipy.signal.resample (FFT-based) and scipy.signal.resample_poly (polyphase filtering):

import numpy as np
from scipy import signal

x = np.linspace(0, 10, 20, endpoint=False)
y = np.cos(-x**2 / 6.0)

f_fft = signal.resample(y, 100)                            # FFT-based resampling to 100 samples
f_poly = signal.resample_poly(y, 100, 20, padtype='line')  # same as signal.resample_poly(y, 5, 1, padtype='line')

xnew = np.linspace(0, 10, 100, endpoint=False)

import matplotlib.pyplot as plt
plt.plot(x, y, 'ko', label='original')
plt.plot(xnew, f_fft, 'b.-', label='resample (FFT)')
plt.plot(xnew, f_poly, 'r.-', label='resample_poly')
plt.legend()
plt.show()
-
PyTorch Example of LSTM | Data/Machine learning | 2021. 4. 21. 15:19
Architecture [3] The main components are: 1) the hidden and cell states, and 2) the input, forget, and output gates. $$ i_1 = \sigma ( W_{i_1} \cdot (H_{t-1}, x_t) + b_{i_1} ) $$ $$ i_2 = \tanh ( W_{i_2} \cdot (H_{t-1}, x_t) + b_{i_2} ) $$ $$ i_{input} = i_1 * i_2 $$ $$ f = \sigma ( W_{forget} \cdot (H_{t-1}, x_t) + b_{forget} ) $$ $$ C_t = C_{t-1} * f + i_{input} $$ $$ O_1 = \sigma ( W_{output_1} \cdot ..
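Matching the post title, a minimal sketch of running an LSTM in PyTorch; the sizes (input_size=10, hidden_size=20, and so on) are made up for illustration.

import torch
import torch.nn as nn

# Toy dimensions, chosen for illustration only.
input_size, hidden_size, num_layers = 10, 20, 1
lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)

x = torch.randn(4, 7, input_size)  # (batch, sequence length, features)

# h_n and c_n are the final hidden and cell states from the equations above.
output, (h_n, c_n) = lstm(x)
print(output.shape)  # torch.Size([4, 7, 20]) -- hidden state at every time step
print(h_n.shape)     # torch.Size([1, 4, 20]) -- (num_layers, batch, hidden_size)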
-
Log-bilinear Language Model | Data/Machine learning | 2021. 4. 20. 17:09
It computes the probability of the next word $w_i$ given the previous words (context) as follows: $$ P(w_i = w | w_{i-1}, ..., w_1) = \frac{ \exp\{ \phi(w)^T c \} }{ \sum_{w^\prime \in V} \exp\{ \phi(w^\prime)^T c \} } $$ Here $\phi (w)$ is the word vector of $w$ and $c$ is the context for $w_i$, computed as $$ c = \sum_{n=1}^{i-1} \alpha_n \phi(w_n) $$ Thus, the log-bilinear language model computes a conte..
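A minimal numeric sketch of the two formulas above, assuming toy word vectors phi and weights alpha (all arrays below are made up); it builds the context c and turns the scores phi(w)^T c into probabilities via a softmax.

import numpy as np

rng = np.random.default_rng(0)

vocab_size, dim = 5, 4
phi = rng.normal(size=(vocab_size, dim))  # one vector phi(w) per word in V

# Previous words w_1 .. w_{i-1} (as vocabulary indices) and their weights alpha_n.
prev_words = [2, 0, 3]
alpha = np.array([0.2, 0.3, 0.5])

# c = sum_n alpha_n * phi(w_n)
c = (alpha[:, None] * phi[prev_words]).sum(axis=0)

# P(w_i = w | context) = exp(phi(w)^T c) / sum_{w'} exp(phi(w')^T c)
scores = phi @ c
probs = np.exp(scores - scores.max())  # subtract max for numerical stability
probs /= probs.sum()
print(probs, probs.sum())  # probabilities over the vocabulary, summing to 1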