In the last installment of this series, we constructed a Haskell Neural Network. In this installment, we write the functions we need to get that Neural Network to perform useful work.
The first function we’ll cover is how to calculate the index with the highest output value, in order to, for instance, choose the best output because that output is the single action that our Neural Network can take this turn. It looks something like this:
calculateHighestOutputIndex :: [Value] -> NeuralNetwork -> Int calculateHighestOutputIndex inputValues neural = maxIndex $ calculateOutputValues inputValues neural where maxIndex xs = snd $ maxOutputAndIndex xs maxOutputAndIndex xs = maximumBy (comparing fst) (zip xs [0..])
We provide the input values and the Neural Network we’re using for calculations and it returns an Int, which is the index of the highest output in the list of outputs. That’s basically what “maxIndex $ calculateOutputValues” means!
Of course, we need to define maxIndex. It’s the second return value (b in (a,b)) of the function maxOutputAndIndex defined below as taking the maximum by comparing the first return value (a above) of a zipped list of outputs and a list from zero to infinity.
Or in other words, I use zip to create a pairing of a list element and its index in the list, and then I get the maximum of the outputs by comparing the first and return the second. The “where” functions are just a slightly wordy way of expressing that.
“But wait, what about calculateOutputValues! Where is that coming from?”
Well, that’s a function that’s useful on its own, so I decided to make it a top-level function. Here’s the code for it:
calculateOutputValues :: [Value] -> NeuralNetwork -> [Value] calculateOutputValues inputValues (NeuralNetwork _ _ verticeGroups) = foldl' calculateLayerValues inputValues verticeGroups
calculateOutputValues takes inputValues (a list of values) and a NeuralNetwork and returns a list of values (that we can call for our own convenience “outputValues”)
This simple, easy two lines of code was actually a lot messier to think through than it looks, so I’m going to take the picture:
ISN’T IT OBVIOUS NOW!?
Let’s start by talking about what a fold is. A fold is a function that takes a function, a starting value, and a list of things and uses that function and starting value to reduce that list of things into a single thing. Here’s a friendly example from mathematics:
foldl' + 0 [1,2,3,4]
This takes the starting value, 0, adds 1 to it, then adds 2 to that, then adds 3 to that, then adds 4 to that. 0+1+2+3+4. In math this is the summation of a series.
Replace the + with * and you get 0*1*2*3*4, which is, of course, 0, so maybe you want to start with the accumulator (starting value) as the identity for multiplication, which is 1. Now back to our previous example:
foldl’ calculateLayerValues inputValues verticeGroups takes the inputValues as the accumulator, which is provided from outside of this module. It calculates the layer’s values for them across the list of verticeGroups, and returns a list, which is still technically a fold/reduce, because a list can be thought of as a single (composite) value. (If you’re confused by this, think about whether [1,2,3,4] = [1,2,3,4])
Or, we start at the left, and progressively fold the layers into it using the value generated from the last set of layers.
Okay, but where does calculateLayerValues come from? Well, I made that a top-level function too, although the module doesn’t export it, because I really wanted the type system to help me figure out how to solve this particular problem.
calculateLayerValues :: [Value] -> [[Value]] -> [Value] calculateLayerValues previousLayer verticeGroup = map (calculateNode previousLayer) verticeGroup where calculateNode previousLayer weights = squash $ sum $ zipWith (*) previousLayer weights where squash x = 1 / (1 + ((exp 1) ** (negate x)))
The best way to explain this is bottom up. In order to calculate a node, we need the values of the previous layer (because they all point into the node) along with the weights of each edge. First we multiply together the previousLayer and the weights (zipWith (*) previousLayer weights) then we sum up the list that that generated, then we squash it into range (where squash is defined just below that line).
To go from doing this for a single node to doing it for a list, we use a map, which applies a function to every element in a list. If you’re scratching your head at that, Learn You A Haskell is a friendly place to start, or if you’re feeling adventurous, the Wikipedia article on Functors.
Finally, I’ve defined a function that isn’t strictly necessary for a functioning Neural Network, but that I know I’ll use:
mutate :: (Value,Value) -> NeuralNetwork -> IO NeuralNetwork mutate verticeRange neural@(NeuralNetwork mutationRate nodeLayerSizes vdo g <- newStdGen let (chance,g') = randomR (0.0,1.0) g if chance <= mutationRate then mutateVertices g' else return neural where mutateVertices g = do return $ buildNeuralNetwork mutationRate nodeLayerSizes newValues where values = concat $ concat verticeGroups (verticeToReplace,g') = randomR (0,(length values) - 1) g (left,(_:right)) = splitAt verticeToReplace values (newValue,_) = randomR verticeRange g' newValues = left ++ [newValue] ++ right
Mutate takes an allowed range for an edge to take in that Neural Network (I use -8.0 to 8.0 for MAGICAL REASONS) and a Neural Network. It generates a random number between 0 and 1 and checks it against the Neural Network’s innate mutation rate. If r < m, the Neural Network mutates, that is, it changes only one edge’s value. If the check fails, the Neural Network doesn’t mutate.