Functions
Merlin.BiLSTM, Merlin.Conv1D, Merlin.LSTM, Merlin.Linear, Merlin.Swish, Base.:*, Base.:+, Base.:-, Base.:/, Base.:^, Base.broadcast, Base.getindex, Base.max, Base.reshape, Base.tanh, Base.transpose, Merlin.argmax, Merlin.concat, Merlin.crelu, Merlin.crossentropy, Merlin.dropout, Merlin.elu, Merlin.l2, Merlin.leaky_relu, Merlin.logsoftmax, Merlin.max_batch, Merlin.mse, Merlin.relu, Merlin.selu, Merlin.sigmoid, Merlin.softmax, Merlin.softmax_crossentropy
Activation
Merlin.crelu — Method.

crelu(x::Var)

Concatenated Rectified Linear Unit. The output is twice the size of the input.
References
Shang et al., "Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units", arXiv 2016.
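👉 Example: a minimal usage sketch following the signature above (which axis is doubled depends on the concatenation axis used by the implementation):

x = Var(rand(Float32,10,5))
y = crelu(x)    # concatenates relu(x) and relu(-x); output is twice the size of x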
Merlin.elu — Method.

elu(x::Var)

Exponential Linear Unit:

$f(x) = \begin{cases} x & (x > 0) \\ \alpha (e^{x} - 1) & (x \leq 0) \end{cases}$, where $\alpha=1$.

References

Clevert et al., "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)", arXiv 2015.
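👉 Example: a plain-Julia sketch of the formula above (not the Merlin API), to show the values it produces:

elu(z, alpha=1.0) = z > 0 ? z : alpha * (exp(z) - 1)
elu(2.0)     # 2.0: positive inputs pass through unchanged
elu(-1.0)    # ≈ -0.6321: negative inputs saturate toward -α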
Merlin.leaky_relu — Function.

leaky_relu(x::Var, alpha::Float64=0.2)

Leaky Rectified Linear Unit: $f(x) = \max(x, \alpha x)$.
References
Maas et al., "Rectifier Nonlinearities Improve Neural Network Acoustic Models", ICML 2013.
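👉 Example: a minimal usage sketch following the signature above; alpha defaults to 0.2:

x = Var(rand(Float32,10,5) .- 0.5f0)   # mix of negative and positive entries
y = leaky_relu(x)                      # negative entries are scaled by 0.2
y = leaky_relu(x, 0.1)                 # explicit slope for the negative part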
Merlin.relu — Method.

relu(x::Var)

Rectified Linear Unit: $f(x) = \max(0, x)$.
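👉 Example: a minimal usage sketch following the signature above:

x = Var(rand(Float32,10,5) .- 0.5f0)   # mix of negative and positive entries
y = relu(x)                            # elementwise max(0, x)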
Merlin.selu — Method.

selu(x::Var)

Scaled Exponential Linear Unit:

$f(x) = \lambda \begin{cases} x & (x > 0) \\ \alpha (e^{x} - 1) & (x \leq 0) \end{cases}$, where $\lambda=1.0507$ and $\alpha=1.6733$.
References
Klambauer et al., "Self-Normalizing Neural Networks", NIPS 2017.
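👉 Example: a plain-Julia sketch of the formula above (not the Merlin API):

selu(z; lambda=1.0507, alpha=1.6733) = lambda * (z > 0 ? z : alpha * (exp(z) - 1))
selu(1.0)     # 1.0507: positive inputs are scaled by λ
selu(-1.0)    # ≈ -1.1113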
Merlin.sigmoid — Method.

sigmoid(x)

Sigmoid logistic function: $\sigma(x) = (1 + e^{-x})^{-1}$.
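👉 Example: a plain-Julia sketch of the logistic formula (not the Merlin API):

sigmoid(z) = 1 / (1 + exp(-z))
sigmoid(0.0)    # 0.5
sigmoid(2.0)    # ≈ 0.8808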
Merlin.Swish — Type.

Swish

Swish activation function: $f(x) = x \cdot \sigma(\beta x)$, where $\beta$ is a learnable parameter.
References
Ramachandran et al. "Searching for Activation Functions", arXiv 2017.
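👉 Example: a plain-Julia sketch of the swish computation (the Swish type itself holds $\beta$ as a trainable parameter; its constructor is not shown here):

swish(z, beta) = z * (1 / (1 + exp(-beta * z)))
swish(1.0, 1.0)    # ≈ 0.7311, i.e. x · σ(βx)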
Base.tanh — Method.

tanh(x::Var)

Hyperbolic tangent function.
Convolution
Merlin.Conv1D — Type.

Conv1D(T, ksize, insize, outsize, pad, stride, [dilation=1, init_W=Xavier(), init_b=Fill(0)])

1-dimensional convolution function.

👉 Example
T = Float32
x = Var(rand(T,10,5))            # input: insize=10 features, sequence length 5
f = Conv1D(T, 5, 10, 3, 2, 1)    # ksize=5, insize=10, outsize=3, pad=2, stride=1
y = f(x)

Loss
Merlin.l2 — Function.

l2(x::Var, lambda::Float64)

L2 regularization.

👉 Example
x = Var(rand(Float32,10,5))
y = l2(x, 0.01)

Merlin.crossentropy — Function.

crossentropy(p, q)

Cross-entropy function between p and q: $f(p, q) = -\sum_{i} p_{i} \log q_{i}$.
p::Var: Var of Vector{Int} or Matrix{Float}. If p is Vector{Int} and p[i] == 0, the loss for that element is 0.
q::Var: Var of Matrix{Float}

👉 Example
p = Var(rand(0:10,5))
q = softmax(Var(rand(Float32,10,5)))
y = crossentropy(p, q)

Merlin.mse — Function.

mse(x1, x2)

Mean Squared Error function between x1 and x2. The mean is calculated over the minibatch. Note that the error is not scaled by 1/2.
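👉 Example: a minimal usage sketch following the signature above, with two same-sized minibatches:

x1 = Var(rand(Float32,10,5))
x2 = Var(rand(Float32,10,5))
y = mse(x1, x2)    # mean over the minibatch, not scaled by 1/2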
Merlin.softmax_crossentropy — Function.

softmax_crossentropy(p, x)

Cross-entropy function between p and $softmax(x)$:

$f(p, x) = -\sum_{i} p_{i} \log q_{i}$, where $q = softmax(x)$.
p: Var of Vector{Int} or Matrix{Float}
x: Var of Matrix{Float}

👉 Example
p = Var(rand(0:10,5))
x = Var(rand(Float32,10,5))
y = softmax_crossentropy(p, x)

Math
Base.broadcast — Function.

.+(x1::Var, x2::Var)
.-(x1::Var, x2::Var)
.*(x1::Var, x2::Var)

Base.:+ — Function.

+(x1::Var, x2::Var)
+(a::Number, x::Var)
+(x::Var, a::Number)

Base.:- — Function.

-(x1, x2)

Base.:* — Function.

*(A::Var, B::Var)

Base.:/ — Function.

/(x1::Var, a)

Base.:^ — Function.

^(x::Var, a::Number)

Base.transpose — Function.

transpose(x)
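👉 Example: a sketch exercising the documented signatures above:

x1 = Var(rand(Float32,10,5))
x2 = Var(rand(Float32,10,5))
y = x1 .+ x2         # elementwise addition
y = x1 .* x2         # elementwise multiplication
y = 2 + x1           # Number + Var
A = Var(rand(Float32,3,10))
y = A * x1           # matrix product: (3×10) * (10×5) → 3×5
y = transpose(x1)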
Random

Merlin.dropout — Function.

dropout(x::Var, rate::Float64, train::Bool)

If train is true, drops elements randomly with probability $rate$ and scales the other elements by factor $1 / (1 - rate)$. Otherwise, it just returns x.
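👉 Example: a minimal usage sketch following the signature above:

x = Var(rand(Float32,10,5))
y = dropout(x, 0.5, true)     # training: drop with probability 0.5, scale survivors by 2
y = dropout(x, 0.5, false)    # inference: returns x unchanged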
Recurrent
Merlin.BiLSTM — Type.

BiLSTM(::Type{T}, insize::Int, outsize::Int, [init_W=Uniform(0.001), init_U=Orthogonal()])

Bi-directional Long Short-Term Memory network. See LSTM for more details.
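👉 Example: a minimal sketch mirroring the LSTM example below; the constructor arguments follow the signature above:

T = Float32
x = Var(rand(T,100,10))
f = BiLSTM(T, 100, 100)
h = f(x)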
Merlin.LSTM — Type.

LSTM(::Type{T}, insize::Int, outsize::Int, [init_W=Uniform(0.001), init_U=Orthogonal()])

Long Short-Term Memory network:

$f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f)$
$i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i)$
$o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o)$
$c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c(W_c x_t + U_c h_{t-1} + b_c)$
$h_t = o_t \circ \sigma_h(c_t)$

where:
$x_t \in R^{d}$: input vector to the LSTM block
$f_t \in R^{h}$: forget gate's activation vector
$i_t \in R^{h}$: input gate's activation vector
$o_t \in R^{h}$: output gate's activation vector
$h_t \in R^{h}$: output vector of the LSTM block
$c_t \in R^{h}$: cell state vector
$W \in R^{h \times d}$, $U \in R^{h \times h}$ and $b \in R^{h}$: weight matrices and bias vectors
$\sigma_g$: sigmoid function
$\sigma_c$: hyperbolic tangent function
$\sigma_h$: hyperbolic tangent function
👉 Example
T = Float32
x = Var(rand(T,100,10))
f = LSTM(T, 100, 100)
h = f(x)

Reduction
Base.max — Function.

max(x::Var, dim::Int)

Returns the maximum value over the given dimension.
👉 Example
x = Var(rand(Float32,10,5))
y = max(x, 1)

Merlin.max_batch — Function.

max_batch(x::Var, dims::Vector{Int})

Misc
argmax
batchsort
concat
getindex
Linear
logsoftmax
lookup
reshape
softmax
standardize
window1d