Chris' Blog.

My occasional thoughts on iOS development, developer careers, trying to make an income from the App Store, and updates on life in general.

Previewable SwiftUI ViewModels

Hi all, I’d like to talk about a way to set up your ViewModels in SwiftUI to make previews easy:

  • A) Decouple your ViewModels from your Views.
  • B) Replace your ViewModel when previewing.
  • C) Easily inject any ViewState content when previewing.
  • D) Test your ViewModels without needing a View, instead testing their ViewState.

I’ve used a variant of this (I simplified it a little) with a big team before so I know it’s battle-proven. Of course, it may be most helpful to you as a starting point to adapt.

The general idea is this: Have a ‘ViewModel’ protocol, and make your Views have a generic constraint to accept any ViewModel that uses that view’s specific state/events, and use a preview viewmodel that adheres to the protocol.

One-time boilerplate

So here’s the generic ViewModel that every screen will re-use. ViewEvent is typically an enum, and used by the View to eg send button presses to the ViewModel. ViewState is the struct that is used to push the loaded/loading/error/whatever state to the View.

import SwiftUI

protocol ViewModel<ViewEvent, ViewState>: ObservableObject {
    associatedtype ViewEvent
    associatedtype ViewState

    // For communication in the VM -> View direction:
    var viewState: ViewState { get set }

    // For communication in the View -> VM direction:
    func handle(event: ViewEvent)
}

Somewhere you’ll have a ‘preview’ viewmodel. This is declared once and used by all screens you want to preview. I’m a fan of putting your preview code in a conditional compilation statement. Note that this allows you to inject any viewstate you like. Is ‘preview view’ a tautology? Should this be called PreviewModel or PreViewModel? Flip a coin to decide…

#if targetEnvironment(simulator)
class PreviewViewModel<ViewEvent, ViewState>: ViewModel {
    @Published var viewState: ViewState

    init(viewState: ViewState) {
        self.viewState = viewState
    }

    // Previews have no real logic, so events are only logged.
    func handle(event: ViewEvent) {
        print("Event: \(event)")
    }
}
#endif

Before I show the view, I’ll introduce the event and states. Firstly the event enum, this is the single ‘pipe’ via which the View calls through to the ViewModel (aspirationally… 2-way bindings sidestep this). You will likely have associated values on some of these, eg the id of which row was pressed, that kind of thing:

enum FooViewEvent {
    case hello
    case goodbye
    case present
}

Next is the ViewState. This controls what is displayed. Typically you might have a loading/loaded/error enum in here, among other things. Notice there’s an ‘xIsPresented’ var here that is used in a 2-way-binding later for modal presentation:

struct FooViewState: Equatable {
    var text: String
    var sheetIsPresented: Bool = false
}

Ok, now the state and event are out of the way, here’s how a view might look. Note the gnarly generic clause up the top, this is the trickiest part of this whole technique to be honest. Basically it’s saying ‘I can accept any ViewModel that uses this particular screen’s event/state’. Also note the 2-way binding for the modal sheet: even though this somewhat side-steps the idea of piping all input/output through the event/state concept, it’s very SwiftUI-idiomatic to use these bindings so I don’t want to be overly rigid and make life difficult: we want to avoid ‘cutting against the grain’ when working with SwiftUI. So, yeah, this isn’t architecturally pure, but it is productive!

struct FooView<VM: ViewModel>: View
where VM.ViewEvent == FooViewEvent,
      VM.ViewState == FooViewState
{
    @StateObject var viewModel: VM

    var body: some View {
        VStack {
            // Show whatever text the current state carries.
            Text(viewModel.viewState.text)
            Button("Hello") {
                viewModel.handle(event: .hello)
            }
            Button("Goodbye") {
                viewModel.handle(event: .goodbye)
            }
            Button("Present modal sheet") {
                viewModel.handle(event: .present)
            }
        }
        .sheet(isPresented: $viewModel.viewState.sheetIsPresented) {
            Text("This is a modal sheet!")
        }
    }
}

Last but not least is the ViewModel for this screen. Note that because viewState is @Published, and ViewModel is a @StateObject, any updates to viewState are magically automatically applied to the View. It’s really simple, no Combine required! Also note the xIsPresented is trivial to set to true to present something, far simpler than using some form of router which I fear can be convoluted.

class FooViewModel: ViewModel {
    @Published var viewState: FooViewState

    init() {
        viewState = FooViewState(
            text: "Nothing has happened yet."
        )
    }

    func handle(event: FooViewEvent) {
        switch event {
        case .hello:
            viewState.text = "👋"
        case .goodbye:
            viewState.text = "😢"
        case .present:
            viewState.sheetIsPresented = true
        }
    }
}

At the bottom of the view file you’ll want your previews. By using the PreviewViewModel you can inject whatever ViewState you like:

#if targetEnvironment(simulator)
#Preview {
    FooView(
        viewModel: PreviewViewModel(
            viewState: FooViewState(
                text: "This is a preview!"
            )
        )
    )
}
#endif

I hope this helps you use SwiftUI in a preview-friendly way! SwiftUI without previews is the pits…

The source for this is on this github gist here

Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!

Photo by Yahya Gopalani on Unsplash Font by Khurasan on Dafont

Training a single neuron

Hi all, here’s the third in my series on neural networks / machine learning / AI from scratch. In the previous articles (please read them first!), I explained how a single neuron works, and how to calculate the gradient of its weight and bias. In this article, I’ll explain how you can use those gradients to train the neuron.


I recommend opening this spreadsheet in a separate tab, and viewing it as you read this post which explains the maths: Single neuron training.

In case the linked spreadsheet is lost to posterity, here it is in slightly less well-formatted form (note: for brevity’s sake, I’ve shortened references such as B2 to simply ‘B’ when referring to a column in the same row):

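|   | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Learning rate |  | Training |  |  | Neuron |  |  |  |  |  |  |  | Outputs |  |  |  |
| 2 | 0.1 |  | In | Out |  | Input | Weight | Weight gradient | Bias | Bias gradient | Net | Output |  | Target | Attempt | Error | Loss |
| 3 |  |  | 0.01 | 0.1 (C*10) |  | 0.01 (C) | 0.5 | J * F | 0.5 | P * (1-L²) | F*G+I | Tanh(K) |  | 0.1 (D) | 1 | L-N | P² / 2 |
| 4 |  |  | 0.01 | 0.1 (C*10) |  | 0.01 (C) | G3 - H3 * LEARNING_RATE | J * F | I3 - J3 * LEARNING_RATE | P * (1-L²) | F*G+I | Tanh(K) |  | 0.1 (D) | 2 | L-N | P² / 2 |
| 5 |  |  | 0.01 | 0.1 (C*10) |  | 0.01 (C) | G4 - H4 * LEARNING_RATE | J * F | I4 - J4 * LEARNING_RATE | P * (1-L²) | F*G+I | Tanh(K) |  | 0.1 (D) | 3 | L-N | P² / 2 |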

High level explanation

Note: “Parameters” is the umbrella term for “weights and biases”.

  • Row 3 starts with any old values for the parameters.
  • Row 4 optimises the parameters a little to decrease the error.
  • Rows 5 to 1000 repeat this optimisation process, aka ‘gradient descent’.
  • Eventually the optimised parameters will produce the output we want!

Detailed explanation

A2 is the ‘learning rate’. This governs how much we ‘nudge’ our weight/bias each iteration. In this example it’s 0.1 (10%), which is higher than the more common 0.1% - 1%.

Columns C-D are the ‘training data’. In this example we want to train the neuron to multiply by 10.

Columns F-L are the neuron maths, as covered by my earlier articles. The two gradients in particular are tricky and important: They dictate which direction the bias/weight should respectively be ‘nudged’ to decrease the error.

Columns N-Q are the outputs, and useful for producing the neat graph you’ll hopefully see in the actual spreadsheet, which demonstrates how the error decreases over the iterations.
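As a rough sketch of what that graph plots, the snippet below records the loss for each attempt using the same formulas as the spreadsheet. The function name `loss_history` and the choice of 50 attempts are my own assumptions for illustration; the learning rate, training pair and starting parameters are taken from rows 2 and 3.

```rust
/// Runs `attempts` rounds of gradient descent on one tanh neuron and
/// returns the loss measured at the start of each attempt.
/// (Hypothetical helper, not part of the original spreadsheet.)
fn loss_history(learning_rate: f64, attempts: usize) -> Vec<f64> {
    let (input, target): (f64, f64) = (0.01, 0.1); // Columns C and D.
    let (mut weight, mut bias): (f64, f64) = (0.5, 0.5); // Row 3 starting values.
    let mut losses = Vec::with_capacity(attempts);

    for _ in 0..attempts {
        let output = (input * weight + bias).tanh(); // Columns K and L.
        let error = output - target; // Column P.
        losses.push(error * error / 2.0); // Column Q.

        let bias_gradient = error * (1.0 - output * output); // Column J.
        let weight_gradient = bias_gradient * input; // Column H.
        weight -= weight_gradient * learning_rate;
        bias -= bias_gradient * learning_rate;
    }
    losses
}

fn main() {
    for (attempt, loss) in loss_history(0.1, 50).iter().enumerate() {
        println!("Attempt {:>2}: loss {:.6}", attempt + 1, loss);
    }
}
```

Each printed loss should be smaller than the one before it, which is the downward curve the graph shows.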

Row 3 is the initial data. At this point in a real implementation we would typically choose random values for the initial bias and weight, however I’ve chosen 0.5 to start with because it’s a nice round number.

🧨💣💥 Rows 4+ are the same as row 3, except that the parameters have some of their gradient subtracted each time. (this is the important bit)
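To make that one step concrete, here is a minimal sketch that turns row 3 into row 4. The function names `loss` and `next_row` are hypothetical, chosen just for this example; the constants come from the spreadsheet.

```rust
const LEARNING_RATE: f64 = 0.1; // Cell A2.
const INPUT: f64 = 0.01; // Column C.
const TARGET: f64 = 0.1; // Column D.

/// Loss for the given parameters (columns K, L, P and Q).
fn loss(weight: f64, bias: f64) -> f64 {
    let error = (INPUT * weight + bias).tanh() - TARGET;
    error * error / 2.0
}

/// Turns one row's parameters into the next row's parameters by
/// subtracting a fraction of each gradient.
fn next_row(weight: f64, bias: f64) -> (f64, f64) {
    let output = (INPUT * weight + bias).tanh();
    let error = output - TARGET;
    let bias_gradient = error * (1.0 - output * output); // Column J.
    let weight_gradient = bias_gradient * INPUT; // Column H.
    (
        weight - weight_gradient * LEARNING_RATE, // Next row's column G.
        bias - bias_gradient * LEARNING_RATE,     // Next row's column I.
    )
}

fn main() {
    let (w3, b3) = (0.5, 0.5); // Row 3.
    let (w4, b4) = next_row(w3, b3); // Row 4.
    println!("Row 3: weight {:.5}, bias {:.5}, loss {:.5}", w3, b3, loss(w3, b3));
    println!("Row 4: weight {:.5}, bias {:.5}, loss {:.5}", w4, b4, loss(w4, b4));
}
```

Because the starting output is above the target, both gradients are positive, so both parameters get nudged down and the loss on row 4 comes out lower than on row 3.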

Incidentally, this might help explain why training a NN uses a lot more computation than using it: Because of all the gradient calculations and iterations over training data.

And there you have it, that’s how to use the gradients to train a single neuron. Next I’ll explain how to calculate the gradients for a network of them!

Rust demo

Because I’m a Rust tragic, here’s a demo:

const LEARNING_RATE: f64 = 0.01;
const TRAINING_INPUT: f64 = 0.01;
const TRAINING_OUTPUT: f64 = 0.1;

fn main() {
    // Initial parameters.
    let mut weight: f64 = 0.5;
    let mut bias: f64 = 0.5;

    // Train.
    for _ in 0..100_000 {
        let net = TRAINING_INPUT * weight + bias;
        let output = net.tanh();
        let error = output - TRAINING_OUTPUT;
        // Not needed for the update itself; kept to mirror the spreadsheet.
        let _loss = error * error / 2.;
        let bias_gradient = error * (1. - output * output);
        let weight_gradient = bias_gradient * TRAINING_INPUT;
        weight -= weight_gradient * LEARNING_RATE;
        bias -= bias_gradient * LEARNING_RATE;
    }

    // Use the trained parameters:
    let trained_net = TRAINING_INPUT * weight + bias;
    let trained_output = trained_net.tanh();
    println!("Trained output: {}", trained_output);
}

Which outputs:

Trained output: 0.1000000000000007

Which matches the training output nicely!

Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!

Photo by Eugene Golovesov on Unsplash

Gradients for a single neuron

Hi all, here’s the second in my series on neural networks / machine learning / AI from scratch. In the previous article (please read it first!), I explained how a single neuron works. In this article, I’ll explain how you can determine the ‘gradients’ of that neuron, in other words how much effect the weight and bias have on the final ‘loss’, using some high-school calculus. This is a prerequisite for training, which I’ll cover later.


I recommend opening this spreadsheet in a separate tab, and viewing it as you read this post which explains the maths: Single neuron gradients.

In case the linked spreadsheet is lost to posterity, here it is in slightly less well-formatted form (note: for brevity’s sake, I’ve shortened references such as B2 to simply ‘B’ when referring to a column in the same row):

|   | A | B | C | D | E | F | G | H | I | J | K |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 |  | Input | Weight | Bias | Net | Output | Target | Error | Loss |  |  |
| 2 | Neuron maths: | 0.4 | 0.5 | 0.6 | 0.8 (B*C+D) | 0.664 (tanh(E)) | 0.7 | -0.035963 (F-G) | 0.0006467 (H^2 / 2) |  |  |
| 3 | Real local gradients: | 0.5 (C2) | 0.4 (B2) | 1 | 0.5591 (1-F2^2) | -0.036 (H2) |  |  |  |  |  |
| 4 | Real global gradients: | -0.0101 (B3*E) | -0.0080 (C3*E) | -0.0201 (E) | -0.0201 (E3*F) | -0.036 (F3) |  |  |  |  |  |
| 5 |  |  |  |  |  |  |  |  |  |  | Faux gradient |
| 6 | Faux gradient of ‘output’: |  |  |  |  | 0.66414 (F2+Tiny) | 0.7 | -0.035863 (F-G) | 0.0006431 (H^2 / 2) |  | -0.0359 ((I - I2)/Tiny) |
| 7 | Faux gradient of ‘net’: |  |  |  | 0.8001 (E2+Tiny) | 0.66409 (tanh(E)) | 0.7 | -0.035907 (F-G) | 0.0006447 (H^2 / 2) |  | -0.0201 ((I - I2)/Tiny) |
| 8 | Faux gradient of ‘bias’: | 0.4 | 0.5 | 0.6001 (D2+Tiny) | 0.8001 (B*C+D) | 0.66409 (tanh(E)) | 0.7 | -0.035907 (F-G) | 0.0006447 (H^2 / 2) |  | -0.0201 ((I - I2)/Tiny) |
| 9 | Faux gradient of ‘weight’: | 0.4 | 0.5001 (C2+Tiny) | 0.6 | 0.80004 (B*C+D) | 0.66406 (tanh(E)) | 0.7 | -0.035941 (F-G) | 0.0006459 (H^2 / 2) |  | -0.0080 ((I - I2)/Tiny) |
| 10 | Faux gradient of ‘input’: | 0.4001 (B2+Tiny) | 0.5 | 0.6 | 0.80005 (B*C+D) | 0.66406 (tanh(E)) | 0.7 | -0.035935 (F-G) | 0.0006457 (H^2 / 2) |  | -0.0100 ((I - I2)/Tiny) |
|   | Tiny | 0.0001 | Moved down here to help with readability |  |  |  |  |  |  |  |  |

What is the gradient?

Firstly: what is the gradient? It is also known as the slope, derivative, or velocity of an equation.

For a simple example, consider tides in a river mouth:

  • At high tide (maximum position), the water is still (0 velocity).
  • Then, half-way from high to low tide (0 position), the water is rushing out (maximum negative velocity, because the level is falling fastest). This is the time when the waves are biggest and my friend almost drowned the other day on his jet ski, but that’s a story for another day!
  • Then, at low tide (minimum position), the water is still again (0 velocity).
  • Then, half-way from low to high tide (0 position again), the water is rushing in (maximum positive velocity, because the level is rising fastest).

In this analogy, the height of the water is the position (like the values for the weights, bias, net, output, or loss), and the velocity of the water is the gradient (or derivative, or slope). Figuring out that gradient is what this article is all about.
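Here is a small sketch of the analogy in code. Modelling the tide as a sine wave is purely my assumption for illustration, and `height` and `gradient` are hypothetical names; the point is only that the gradient is zero at high and low tide and largest in size at mid tide.

```rust
/// Water height over one tide cycle, modelled (as an assumption) by a sine wave.
fn height(t: f64) -> f64 {
    t.sin()
}

/// Gradient of the height: how fast the water level is changing at time `t`.
/// The derivative of sin(t) is cos(t).
fn gradient(t: f64) -> f64 {
    t.cos()
}

fn main() {
    let pi = std::f64::consts::PI;
    let moments = [
        ("high tide", pi / 2.0),
        ("mid tide", pi),
        ("low tide", 3.0 * pi / 2.0),
        ("mid tide", 2.0 * pi),
    ];
    for (label, t) in moments {
        println!("{:<9} height {:+.2}, gradient {:+.2}", label, height(t), gradient(t));
    }
}
```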

For a more thorough explanation of gradients, check out Wikipedia.

Why do we want to know the gradients?

The reason we want the gradients of a neuron’s weight(s) and bias, is that we can use them to figure out whether we need to nudge their values up or down a bit or leave them as-is, in order to get an output that’s closer to the target during training.

Faking a gradient

You can fake a gradient by comparing the result of an equation vs the result when adding a tiny amount to the input. These faux gradients are helpful for verifying our calculus later.

Here’s the general way to fake a gradient:

Faux gradient of f(x) = ( f(x + tiny) - f(x) ) / tiny
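The same formula as a tiny Rust helper, in case code reads easier than maths. The name `faux_gradient` and the x² test function are assumptions picked for this sketch.

```rust
/// Estimates the gradient of `f` at `x` by nudging `x` by `tiny`.
fn faux_gradient(f: impl Fn(f64) -> f64, x: f64, tiny: f64) -> f64 {
    (f(x + tiny) - f(x)) / tiny
}

fn main() {
    // x^2 has a true gradient of 2x, so at x = 3 we expect roughly 6.
    let estimate = faux_gradient(|x| x * x, 3.0, 0.0001);
    println!("Faux gradient of x^2 at 3: {:.4}", estimate);
}
```

The estimate is never exact, but the smaller `tiny` is (within reason), the closer it lands to the real gradient.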

To make it more specific to our neuron:

Faux gradient of how weight affects output = (
    tanh(input * (weight + tiny) + bias) -
    tanh(input * weight + bias)
) / tiny

Or the full kahuna on the loss function:

Faux gradient of how bias affects loss = (
    (tanh(input * weight + (bias + tiny)) - target)^2 / 2 -
    (tanh(input * weight + bias) - target)^2 / 2
) / tiny

Please note that the loss function changed vs the previous article (it now has a / 2) - this is to make the calculus simpler.

You can look at rows 6 through 10 in the spreadsheet to see how these faux gradients are calculated. In columns B to I, various things have the tiny value added to them, to see how this affects the final ‘loss’. For instance, on row 6, you can see I’m adding the tiny value to the output, then feeding that through to the loss function, and doing the (loss with tiny - loss without tiny) / tiny to calculate the faux gradient. The rest of these faux gradients are similar.
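As a hedged sketch of rows 6 and 8, the snippet below nudges first the output and then the bias, and feeds each through to the loss. The function names `loss_from_output` and `loss_from_bias` are hypothetical; the numbers are the ones from row 2 of the spreadsheet.

```rust
const TINY: f64 = 0.0001;
const INPUT: f64 = 0.4;
const WEIGHT: f64 = 0.5;
const BIAS: f64 = 0.6;
const TARGET: f64 = 0.7;

/// Loss calculated from an already-known output (columns H and I).
fn loss_from_output(output: f64) -> f64 {
    let error = output - TARGET;
    error * error / 2.0
}

/// Loss calculated all the way from the bias (columns E to I).
fn loss_from_bias(bias: f64) -> f64 {
    loss_from_output((INPUT * WEIGHT + bias).tanh())
}

fn main() {
    // Row 6: nudge the output, then feed it through to the loss.
    let output = (INPUT * WEIGHT + BIAS).tanh();
    let faux_output =
        (loss_from_output(output + TINY) - loss_from_output(output)) / TINY;

    // Row 8: nudge the bias, then feed it through net, output and loss.
    let faux_bias = (loss_from_bias(BIAS + TINY) - loss_from_bias(BIAS)) / TINY;

    println!("Faux gradient of 'output': {:.4}", faux_output);
    println!("Faux gradient of 'bias': {:.4}", faux_bias);
}
```

The two results should land close to the values in column K on rows 6 and 8.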

Real gradients with calculus

Let’s use calculus to calculate the real gradients. Firstly we need to calculate the ‘local’ gradients. See row 3 in the spreadsheet as you follow along.

What is a local gradient? Since all our calculations are performed in stages (eg net > output > error > loss), a local gradient is how much impact changes in one stage have on the next stage.

A better maths teacher than I would be able to explain how we arrive at the following, but here are the formulas below:

Local gradient equations

(Note when I say ‘the gradient of Y with respect to X’ it means that X is the input/earlier stage, Y is the output/later stage, and it roughly means ‘if you nudge X, what impact will that have on Y?’.)

  • Input (gradient of Net with respect to Input) = Weight (see B3)
  • Weight (gradient of Net with respect to Weight) = Input (see C3)
  • Bias (gradient of Net with respect to Bias) = 1 (see D3)
  • Net (gradient of Output with respect to Net) = 1 - Output^2 (see E3)
  • Output (gradient of Loss with respect to Output) = Error, because Error = Output - Target changes one-for-one with Output (see F3)
  • Error (gradient of Loss with respect to Error) = Error (this is where the / 2 in our loss helps) (see H3)
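The list above can be sketched in Rust as follows. The struct `LocalGradients` and function `local_gradients` are names I made up for this example, and the values passed in are the ones from row 2 of the spreadsheet.

```rust
/// Local gradients for one neuron, one field per bullet in the list.
struct LocalGradients {
    input: f64,  // Net with respect to Input (B3).
    weight: f64, // Net with respect to Weight (C3).
    bias: f64,   // Net with respect to Bias (D3).
    net: f64,    // Output with respect to Net (E3).
    output: f64, // The value shown in F3, which is the error.
    error: f64,  // Loss with respect to Error.
}

fn local_gradients(input: f64, weight: f64, bias: f64, target: f64) -> LocalGradients {
    let output = (input * weight + bias).tanh();
    let error = output - target;
    LocalGradients {
        input: weight,
        weight: input,
        bias: 1.0,
        net: 1.0 - output * output,
        output: error,
        error,
    }
}

fn main() {
    let g = local_gradients(0.4, 0.5, 0.6, 0.7);
    println!("Input:  {:.4}", g.input);
    println!("Weight: {:.4}", g.weight);
    println!("Bias:   {:.4}", g.bias);
    println!("Net:    {:.4}", g.net);
    println!("Output: {:.4}", g.output);
    println!("Error:  {:.4}", g.error);
}
```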

Global gradients

Next we need to combine the gradients using the calculus ‘chain rule’, so that we can get the impacts of each variable on the loss.

These are calculated in reverse order (this is why it is called _back_propagation) because most of these rely on the next step’s gradient.

  • Output (gradient of Loss with respect to Output) = Error (See F4)
  • Net (gradient of Loss with respect to Net) = (1 - Output^2) * Output global gradient (See E4)
  • Bias (gradient of Loss with respect to Bias) = Net global gradient (See D4)
  • Weight (gradient of Loss with respect to Weight) = Input * Net global gradient (See C4)
  • Input (gradient of Loss with respect to Input) = Weight * Net global gradient (See B4)
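For readers who prefer symbols, here is how I would write the chain rule behind that list. The shorthand is my own assumption, not notation from the spreadsheet: x is the input, w the weight, b the bias, n the net, o the output, t the target, e the error and L the loss.

```latex
\begin{align*}
n &= x w + b, \quad o = \tanh(n), \quad e = o - t, \quad L = \tfrac{1}{2} e^2 \\
\frac{\partial L}{\partial o} &= e \\
\frac{\partial L}{\partial n} &= \frac{\partial L}{\partial o} \cdot \frac{\partial o}{\partial n} = e \, (1 - o^2) \\
\frac{\partial L}{\partial b} &= \frac{\partial L}{\partial n} \cdot \frac{\partial n}{\partial b} = e \, (1 - o^2) \cdot 1 \\
\frac{\partial L}{\partial w} &= \frac{\partial L}{\partial n} \cdot \frac{\partial n}{\partial w} = e \, (1 - o^2) \, x \\
\frac{\partial L}{\partial x} &= \frac{\partial L}{\partial n} \cdot \frac{\partial n}{\partial x} = e \, (1 - o^2) \, w
\end{align*}
```

Each line reuses the line above it, which is why the calculation runs backwards from the loss towards the input.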

You may like to compare these with the respective faux gradients and see that they are (roughly) the same.
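If you would rather let the computer do that comparison, here is a rough sketch that calculates both kinds of gradient side by side. The function names are hypothetical and the inputs are row 2 of the spreadsheet.

```rust
const TINY: f64 = 0.0001;
const TARGET: f64 = 0.7;

/// Loss of the neuron for the given input, weight and bias.
fn loss(input: f64, weight: f64, bias: f64) -> f64 {
    let error = (input * weight + bias).tanh() - TARGET;
    error * error / 2.0
}

/// Chain-rule ("real") gradients of the loss, as (input, weight, bias).
fn real_gradients(input: f64, weight: f64, bias: f64) -> (f64, f64, f64) {
    let output = (input * weight + bias).tanh();
    let net_gradient = (1.0 - output * output) * (output - TARGET);
    (weight * net_gradient, input * net_gradient, net_gradient)
}

/// Nudge-based ("faux") gradients of the loss, as (input, weight, bias).
fn faux_gradients(input: f64, weight: f64, bias: f64) -> (f64, f64, f64) {
    let base = loss(input, weight, bias);
    (
        (loss(input + TINY, weight, bias) - base) / TINY,
        (loss(input, weight + TINY, bias) - base) / TINY,
        (loss(input, weight, bias + TINY) - base) / TINY,
    )
}

fn main() {
    let real = real_gradients(0.4, 0.5, 0.6);
    let faux = faux_gradients(0.4, 0.5, 0.6);
    println!("Input:  real {:.4}, faux {:.4}", real.0, faux.0);
    println!("Weight: real {:.4}, faux {:.4}", real.1, faux.1);
    println!("Bias:   real {:.4}, faux {:.4}", real.2, faux.2);
}
```

The pairs will not match digit for digit, because the faux gradient is only an approximation, but they should agree to around four decimal places.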

And there you have it, you have the gradients for a single neuron. Next I’ll explain how to use these gradients for training!

Unnecessary Rust implementation

Just for the hell of it, here’s an implementation in Rust:

struct Neuron {
    input: f32,
    weight: f32,
    bias: f32,
    target: f32,
}

impl Neuron {
    fn net(&self) -> f32 {
        self.input * self.weight + self.bias
    }
    fn output(&self) -> f32 {
        self.net().tanh()
    }
    fn error(&self) -> f32 {
        self.output() - self.target
    }
    #[allow(dead_code)] // Shown for completeness; the gradients don't call it.
    fn loss(&self) -> f32 {
        let e = self.error();
        e * e / 2.
    }
    // Gradient of loss with respect to output: simply the error.
    fn output_gradient(&self) -> f32 {
        self.error()
    }
    fn net_gradient(&self) -> f32 {
        let o = self.output();
        let net_local_derivative = 1. - o * o;
        net_local_derivative * self.output_gradient()
    }
    // The bias has a local gradient of 1, so it equals the net gradient.
    fn bias_gradient(&self) -> f32 {
        self.net_gradient()
    }
    fn weight_gradient(&self) -> f32 {
        self.input * self.net_gradient()
    }
}

fn main() {
    let neuron = Neuron {
        input: 0.4,
        weight: 0.5,
        bias: 0.6,
        target: 0.7,
    };
    println!("Weight gradient: {:.4}", neuron.weight_gradient());
    println!("Bias gradient: {:.4}", neuron.bias_gradient());
}

Which outputs:

Weight gradient: -0.0080
Bias gradient: -0.0201

Which matches the spreadsheet nicely!

Thanks for reading, hope you found this helpful, at least a tiny bit, God bless!

Photo by Chinnu Indrakumar on Unsplash

