Optimizers
Optimizers provide a way to update the weights of a Merlin.Var.
x = zerograd(rand(Float32,5,4))   # a trainable Var whose gradient starts at zero
opt = SGD(0.001)                  # SGD optimizer with learning rate 0.001
opt(x)                            # updates x's weights in place using x.grad
println(x.grad)
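Because an optimizer is a callable applied to a single Var, updating a whole model amounts to calling it on each trainable parameter. A minimal sketch under that assumption (the params vector here is hypothetical; in practice the Vars come from the model's layers):

# Hypothetical sketch: one optimizer applied to several parameters.
params = [zerograd(rand(Float32,5,4)), zerograd(rand(Float32,10,5))]
opt = SGD(0.001)
for p in params
    opt(p)    # updates p's weights in place from p.grad
end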
Merlin.AdaGrad — Type.

AdaGrad

AdaGrad Optimizer.

References
- Duchi et al., "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization", JMLR 2011.
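For reference, the AdaGrad rule accumulates squared gradients and scales each element's step by the inverse square root of that sum. The sketch below illustrates the rule on plain arrays; it is not Merlin's implementation, and the names adagrad_update!, state, and eps are assumptions for illustration only.

# Illustrative AdaGrad update on plain arrays (not Merlin's implementation).
function adagrad_update!(w::Array{Float32}, g::Array{Float32},
                         state::Array{Float32}; rate=0.01f0, eps=1f-8)
    @. state += g * g                       # running sum of squared gradients
    @. w -= rate * g / (sqrt(state) + eps)  # per-element adaptive step
    return w
end

w, g, state = rand(Float32,5,4), randn(Float32,5,4), zeros(Float32,5,4)
adagrad_update!(w, g, state, rate=0.001f0)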
Merlin.Adam — Type.

Adam

Adam Optimizer.

Merlin.SGD — Type.

SGD

Stochastic Gradient Descent Optimizer.
Arguments
- rate: learning rate 
- [momentum=0.0]: momentum coefficient 
- [nesterov=false]: use Nesterov acceleration or not (see the sketch after this list)
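As noted above, here is a minimal sketch of the update rule these arguments describe, written against plain arrays: classical momentum keeps a velocity buffer, and Nesterov acceleration takes an extra look-ahead step. The names sgd_update! and v are assumptions for illustration; this is not Merlin's implementation.

# Illustrative SGD update with momentum and optional Nesterov acceleration
# (not Merlin's implementation).
function sgd_update!(w::Array{Float32}, g::Array{Float32}, v::Array{Float32};
                     rate=0.001f0, momentum=0.0f0, nesterov=false)
    @. v = momentum * v - rate * g       # velocity: decayed history minus the current step
    if nesterov
        @. w += momentum * v - rate * g  # look-ahead step
    else
        @. w += v                        # plain momentum step
    end
    return w
end

w, g, v = rand(Float32,5,4), randn(Float32,5,4), zeros(Float32,5,4)
sgd_update!(w, g, v, rate=0.001f0, momentum=0.9f0, nesterov=true)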