Key points
- Perceptron
- Sigmoid
- Cost function
- Gradient descent / stochastic gradient descent
- Back-propagation
- Chain rule
- Quadratic cost
- Cross-entropy cost
- Softmax + log-likelihood cost
- Overfitting
- Early stopping strategy
- Hold-out method
- Regularization
- Weight decay / L2 regularization
- L1 regularization
- Dropout
- Artificially increasing the training set size
- ...
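
As a rough illustration of how several of the points above fit together, here is a minimal sketch of a single sigmoid neuron trained by gradient descent on the cross-entropy cost, with optional L2 weight decay. All names (`sigmoid`, `quadratic_cost`, `cross_entropy_cost`, `train_neuron`) and all default values (`eta`, `lmbda`, `epochs`, the starting weight and bias) are assumptions made for this example and do not come from these notes.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def quadratic_cost(a, y):
    """Quadratic cost for one example: C = (a - y)^2 / 2."""
    return 0.5 * (a - y) ** 2


def cross_entropy_cost(a, y):
    """Cross-entropy cost for one example, with output a in (0, 1)."""
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def train_neuron(x, y, w=2.0, b=2.0, eta=0.5, lmbda=0.0, epochs=300):
    """Gradient descent on one sigmoid neuron (hypothetical helper).

    With the cross-entropy cost, the chain rule gives
        dC/dw = (a - y) * x   and   dC/db = a - y,
    where a = sigmoid(w * x + b). L2 regularization (weight decay)
    adds lmbda * w to the weight gradient; the bias is not regularized.
    """
    for _ in range(epochs):
        a = sigmoid(w * x + b)      # forward pass
        delta = a - y               # dC/dz for sigmoid + cross-entropy
        w -= eta * (delta * x + lmbda * w)
        b -= eta * delta
    return w, b


if __name__ == "__main__":
    # One training example: input 1.0 with target output 0.0.
    w, b = train_neuron(1.0, 0.0)
    a = sigmoid(w * 1.0 + b)
    print("output after training:", a)
    print("cross-entropy cost:", cross_entropy_cost(a, 0.0))
    print("quadratic cost:", quadratic_cost(a, 0.0))
```

Passing a positive `lmbda` (for example `train_neuron(1.0, 0.0, lmbda=0.1)`) shows the weight-decay effect: the learned weight stays closer to zero than in the unregularized run, while the bias takes over more of the fit.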