Functions
Merlin.BiLSTM
Merlin.Conv1D
Merlin.LSTM
Merlin.Linear
Merlin.Swish
Base.:*
Base.:+
Base.:-
Base.:/
Base.:^
Base.broadcast
Base.getindex
Base.max
Base.reshape
Base.tanh
Base.transpose
Merlin.argmax
Merlin.concat
Merlin.crelu
Merlin.crossentropy
Merlin.dropout
Merlin.elu
Merlin.l2
Merlin.leaky_relu
Merlin.logsoftmax
Merlin.max_batch
Merlin.mse
Merlin.relu
Merlin.selu
Merlin.sigmoid
Merlin.softmax
Merlin.softmax_crossentropy
Activation
Merlin.crelu — Method
crelu(x::Var)
Concatenated Rectified Linear Unit. The output is twice the size of the input.
References
Shang et al., "Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units", arXiv 2016.
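A minimal usage sketch based on the signature above; the input shape is arbitrary, and the output is twice the size of the input as noted above:
👉 Example
x = Var(rand(Float32,10,5))
y = crelu(x)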
Merlin.elu — Method
elu(x::Var)
Exponential Linear Unit.
$f(x) = \begin{cases} x & (x > 0) \\ \alpha (e^{x} - 1) & (x \leq 0) \end{cases}$
where $\alpha = 1$.
References
Clevert et al., "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)", arXiv 2015.
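A minimal usage sketch based on the signature above (input shape arbitrary):
👉 Example
x = Var(rand(Float32,10,5))
y = elu(x)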
Merlin.leaky_relu — Function
leaky_relu(x::Var, alpha::Float64=0.2)
Leaky Rectified Linear Unit.
References
Maas et al., "Rectifier Nonlinearities Improve Neural Network Acoustic Models", ICML 2013.
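A minimal usage sketch based on the signature above, passing the default slope explicitly:
👉 Example
x = Var(rand(Float32,10,5))
y = leaky_relu(x, 0.2)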
Merlin.relu — Method
relu(x::Var)
Rectified Linear Unit.
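A minimal usage sketch based on the signature above:
👉 Example
x = Var(rand(Float32,10,5))
y = relu(x)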
Merlin.selu — Method
selu(x::Var)
Scaled Exponential Linear Unit.
$f(x) = \lambda \begin{cases} x & (x > 0) \\ \alpha (e^{x} - 1) & (x \leq 0) \end{cases}$
where $\lambda=1.0507$ and $\alpha=1.6733$.
References
Klambauer et al., "Self-Normalizing Neural Networks", NIPS 2017.
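A minimal usage sketch based on the signature above:
👉 Example
x = Var(rand(Float32,10,5))
y = selu(x)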
Merlin.sigmoid — Method
sigmoid(x)
Sigmoid logistic function.
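A minimal usage sketch, assuming the same Var-based call pattern as the other activations:
👉 Example
x = Var(rand(Float32,10,5))
y = sigmoid(x)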
Merlin.Swish — Type
Swish
Swish activation function.
$f(x) = x \cdot \sigma(\beta x)$
where $\beta$ is a learnable parameter.
References
Ramachandran et al. "Searching for Activation Functions", arXiv 2017.
Base.tanh — Method
tanh(x::Var)
Hyperbolic tangent function.
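A minimal usage sketch based on the signature above:
👉 Example
x = Var(rand(Float32,10,5))
y = tanh(x)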
Convolution
Merlin.Conv1D — Type
Conv1D(T, ksize, insize, outsize, pad, stride, [dilation=1, init_W=Xavier(), init_b=Fill(0)])
1-dimensional convolution function.
👉 Example
T = Float32
x = Var(rand(T,10,5))
f = Conv1D(T, 5, 10, 3, 2, 1)
y = f(x)
Loss
Merlin.l2 — Function
l2(x::Var, lambda::Float64)
L2 regularization.
👉 Example
x = Var(rand(Float32,10,5))
y = l2(x, 0.01)
Merlin.crossentropy — Function
crossentropy(p, q)
Cross-entropy function between p and q.
p::Var: Var of Vector{Int} or Matrix{Float}. If p is Vector{Int} and p[i] == 0, returns 0.
q::Var: Var of Matrix{Float}
👉 Example
p = Var(rand(0:10,5))
q = softmax(Var(rand(Float32,10,5)))
y = crossentropy(p, q)
Merlin.mse — Function
mse(x1, x2)
Mean Squared Error function between x1 and x2. The mean is calculated over the minibatch. Note that the error is not scaled by 1/2.
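A minimal usage sketch based on the signature above; both inputs are assumed to have the same shape:
👉 Example
x1 = Var(rand(Float32,10,5))
x2 = Var(rand(Float32,10,5))
y = mse(x1, x2)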
Merlin.softmax_crossentropy — Function
softmax_crossentropy(p, x)
Cross-entropy function between p and softmax(x):
$f(p, x) = -\sum_{i} p_{i} \log q_{i}$
where $q = \mathrm{softmax}(x)$.
p: Var of Vector{Int} or Matrix{Float}
x: Var of Matrix{Float}
👉 Example
p = Var(rand(0:10,5))
x = Var(rand(Float32,10,5))
y = softmax_crossentropy(p, x)
Math
Base.broadcast — Function
.+(x1::Var, x2::Var)
.-(x1::Var, x2::Var)
.*(x1::Var, x2::Var)
Base.:+ — Function
+(x1::Var, x2::Var)
+(a::Number, x::Var)
+(x::Var, a::Number)
Base.:- — Function
-(x1, x2)
Base.:* — Function
*(A::Var, B::Var)
Base.:/ — Function
/(x1::Var, a)
Base.:^ — Function
^(x::Var, a::Number)
Base.transpose — Function
transpose(x)
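A minimal sketch of the operators above (shapes chosen so the matrix product is defined):
👉 Example
x1 = Var(rand(Float32,10,5))
x2 = Var(rand(Float32,10,5))
y = x1 .+ x2
z = transpose(x1) * x2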
Random
Merlin.dropout — Function
dropout(x::Var, rate::Float64, train::Bool)
If train is true, drops elements randomly with probability $rate$ and scales the other elements by factor $1 / (1 - rate)$. Otherwise, it just returns x.
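A minimal usage sketch based on the signature above; with train=false the input is returned unchanged:
👉 Example
x = Var(rand(Float32,10,5))
y = dropout(x, 0.5, true)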
Recurrent
Merlin.BiLSTM — Type
BiLSTM(::Type{T}, insize::Int, outsize::Int, [init_W=Uniform(0.001), init_U=Orthogonal()])
Bi-directional Long Short-Term Memory network. See LSTM for more details.
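A minimal usage sketch, assuming BiLSTM is applied to a Var the same way as the LSTM example below (the call pattern is an assumption, not confirmed by this page):
👉 Example
T = Float32
x = Var(rand(T,100,10))
f = BiLSTM(T, 100, 100)
h = f(x)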
Merlin.LSTM — Type
LSTM(::Type{T}, insize::Int, outsize::Int, [init_W=Uniform(0.001), init_U=Orthogonal()])
Long Short-Term Memory network.
$f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f)$
$i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i)$
$o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o)$
$c_t = f_t \circ c_{t-1} + i_t \circ \sigma_c(W_c x_t + U_c h_{t-1} + b_c)$
$h_t = o_t \circ \sigma_h(c_t)$
where
$x_t \in R^{d}$: input vector to the LSTM block
$f_t \in R^{h}$: forget gate's activation vector
$i_t \in R^{h}$: input gate's activation vector
$o_t \in R^{h}$: output gate's activation vector
$h_t \in R^{h}$: output vector of the LSTM block
$c_t \in R^{h}$: cell state vector
$W \in R^{h \times d}$, $U \in R^{h \times h}$ and $b \in R^{h}$: weight matrices and bias vectors
$\sigma_g$: sigmoid function
$\sigma_c$: hyperbolic tangent function
$\sigma_h$: hyperbolic tangent function
👉 Example
T = Float32
x = Var(rand(T,100,10))
f = LSTM(T, 100, 100)
h = f(x)
Reduction
Base.max — Function
max(x::Var, dim::Int)
Returns the maximum value over the given dimension.
👉 Example
x = Var(rand(Float32,10,5))
y = max(x, 1)
Merlin.max_batch — Function
max_batch(x::Var, dims::Vector{Int})
Misc
argmax
batchsort
concat
getindex
Linear
logsoftmax
lookup
reshape
softmax
standardize
window1d