## Annoyances with machine learning, specifically neural network implementation

##### Thursday, 04 May 2017

I'm doing Andrew Ng's Machine Learning course on Coursera. It's nice that they use Octave/MATLAB so that a lot of the matrix operations do not have to be written out. However, I wish they would decide whether they want inputs and outputs to be column vectors or row vectors. The mathematical notation used throughout implies that they are dealing with column vectors, yet all the programming assignments deal with row vectors. The training set is given as a matrix in which each row is one training example, i.e. the training set is a "stack" of row vectors. The corresponding outputs are in rows as well, naturally. That's well and good; I can deal with switching representations.
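The two conventions can be sketched side by side. This is a minimal NumPy illustration (not the course's Octave code); the toy values of `X` and `theta` are my own, made up for the example:

```python
import numpy as np

# Hypothetical tiny training set: 3 examples, 2 features.
# The math notation treats each example x as a COLUMN vector, so the
# hypothesis for one example is theta^T x. The assignments stack
# examples as ROWS, so the whole batch is computed as X @ theta.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])   # shape (3, 2): one training example per row
theta = np.array([[0.5],
                  [-1.0]])   # shape (2, 1): parameters as a column vector

# Column-vector convention, one example at a time: theta^T x
h_first = theta.T @ X[0].reshape(-1, 1)   # shape (1, 1)

# Row-vector convention, whole training set at once: X theta
h_all = X @ theta                          # shape (3, 1)

# Same number, reached through two different conventions.
assert np.allclose(h_first, h_all[0])
```

The batch form `X @ theta` is why the assignments favor row stacking: one matrix product handles every training example at once.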

Until it comes to backpropagation, where they decide at the end that they really want things in row vectors. So all the Delta arrays I computed, which give the gradients, turned out to be transposed. After going over every single line of my code step by step, literally 20 times, comparing it with the notation (doing the transpose in my head as well as on paper) and keeping track of matrix dimensions, I finally just transposed the Delta arrays at the end (a line of code that was pre-written by the instructor), and the answer came out correct. </rant>
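The transpose headache above can be reproduced in a few lines. This is a hedged sketch in NumPy, not the assignment's actual code: a single hypothetical layer with made-up random activations `a` and error terms `delta`, both stored as rows the way the assignments store data:

```python
import numpy as np

# Hypothetical single layer: 3 inputs, 2 outputs.
rng = np.random.default_rng(0)
a = rng.standard_normal((1, 3))      # activation stored as a ROW vector
delta = rng.standard_normal((1, 2))  # error term, also stored as a row

# The column-vector formula in the notes accumulates Delta = delta * a^T,
# with delta and a as columns. Mimicking that with row-stored data means
# transposing both factors:
Delta_col = delta.T @ a              # shape (2, 3)

# Accumulating the "natural" way for row-stored data gives the transpose:
Delta_row = a.T @ delta              # shape (3, 2)

# The two accumulators hold the same numbers, just transposed --
# hence the single fix-up transpose at the end.
assert np.allclose(Delta_col, Delta_row.T)
```

In other words, whichever convention you accumulate in, one transpose at the end reconciles it with the other, which is exactly what the instructor's pre-written line did.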