“Because even neural networks sometimes forget what they learned yesterday.”


🎤 The ResNet Revolution

Let’s start with a confession: as neural networks got deeper… they got dumber. Adding more layers should help, but past a certain depth plain networks showed higher training error, not just worse test error, so this wasn’t simply overfitting. The model started forgetting how to learn, like a manager after too many PowerPoints.

Enter ResNet (Residual Network) — the network that looked at this chaos and said:

“What if… I just skip a few layers?”

🎯 The idea: instead of forcing every layer to learn new transformations, ResNet lets layers learn residuals — small tweaks to the existing knowledge.

Mathematically:

\[ y = F(x) + x \]

Where:

  • \( F(x) \) = the residual the block’s layers learn (the “small tweak”)

  • \( x \) = the block’s input, carried forward unchanged by the skip connection

So if the new layer doesn’t learn anything useful, F(x) can shrink toward zero and the block simply passes x through: the model keeps the old knowledge. Genius. Lazy. Efficient.
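A minimal sketch of that fallback behavior, assuming PyTorch. The `Residual` wrapper and the zeroed `nn.Linear` standing in for F are hypothetical names chosen for illustration; they are not part of ResNet itself. With F forced to output zeros, the block should hand back its input untouched:

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    """Hypothetical wrapper: returns fn(x) + x for any shape-preserving fn."""
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x  # y = F(x) + x

# A layer that has "learned nothing": all weights and biases are zero
layer = nn.Linear(4, 4)
nn.init.zeros_(layer.weight)
nn.init.zeros_(layer.bias)

block = Residual(layer)
x = torch.randn(2, 4)
y = block(x)

print(torch.equal(y, x))  # True: F(x) = 0, so the block is an identity
```

Note that this sketch leaves out the ReLU that a full ResNet block applies after the addition, so the identity here is exact.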


🧠 Intuition

| Layer Type | Analogy |
|---|---|
| Regular NN | Every intern tries to reinvent the process |
| ResNet | Intern just says, “Boss, it already works. I’ll just make it slightly better.” |

⚙️ PyTorch: Tiny ResNet Example

Here’s a small, ResNet-inspired block — no PhD required.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # 3x3 convs with padding=1 keep height, width, and channel count unchanged,
        # so the block's output can be added directly to its input
        self.conv1 = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.conv2 = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(in_channels)

    def forward(self, x):
        identity = x
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity  # Skip connection: y = F(x) + x
        return F.relu(out)

block = ResidualBlock(16)
print(block)
print(block(torch.randn(2, 16, 32, 32)).shape)  # torch.Size([2, 16, 32, 32])

✅ If a deep model stops improving as you add layers, don’t panic: skip connections are one of the first fixes to try. It’s like saying “I’ll circle back to this later,” but in math.


🔥 Why ResNet Rocks

  • Eases the vanishing gradient problem (gradients get a direct path back through the skips)

  • Enables very deep networks (100+ layers, e.g. ResNet-152)

  • Is modular and flexible

  • Works beautifully on images, text, audio, and business KPIs disguised as tensors
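To see why gradients flow through the skips, differentiate the block equation y = F(x) + x with respect to its input. This is a sketch that ignores the final ReLU in the code above; the symbols for the training loss and the identity matrix are notation introduced here, not taken from the text:

```latex
\[
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}\,\frac{\partial y}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( \frac{\partial F(x)}{\partial x} + I \right)
\]
```

The identity term passes the upstream gradient straight back to earlier layers, even when the gradient through F itself is tiny.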


🕰️ TCN – When CNNs Discover Time

Okay, so CNNs are great with space (images). But what if you want them to understand time — like sales trends, web traffic, or customer churn over weeks?

That’s where Temporal Convolutional Networks (TCNs) come in. They’re like ResNets that discovered calendars. 📅


⏳ The Core TCN Trick: Causal Convolutions

TCNs use 1D convolutions that only look backward in time, never forward — because predicting tomorrow’s sales using tomorrow’s data is cheating.

Visually:

t-3 → t-2 → t-1 → [ t ]

Each output at time t depends only on the inputs at time t and earlier.
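The "never look forward" rule can be checked by hand. Below is a plain-Python sketch; `causal_conv1d` is a hypothetical helper written for readability, and its weight ordering (index 0 applies to the current step) is an assumption that differs from how PyTorch lays out its kernels. Inputs before the start of the series are treated as zero:

```python
def causal_conv1d(x, weights, dilation=1):
    """y[t] = sum over k of weights[k] * x[t - k * dilation]; x is 0 before t = 0."""
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation  # always <= t, so we never peek ahead
            if idx >= 0:
                total += w * x[idx]
        y.append(total)
    return y

x = [1.0, 2.0, 3.0, 4.0, 5.0]
weights = [0.5, 0.3, 0.2]  # weights[0] multiplies the current step

y = causal_conv1d(x, weights)

# Change only the LAST input: every earlier output must stay the same
x_changed = x[:-1] + [100.0]
y_changed = causal_conv1d(x_changed, weights)

print(y[:-1] == y_changed[:-1])  # True: the past is unaffected by the future
print(y[-1] == y_changed[-1])    # False: only the final output moved
```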


🧩 TCN Architecture

| Component | What It Does | Analogy |
|---|---|---|
| Causal Convolution | Looks only at past inputs | Nostalgic data scientist |
| Dilation | Expands receptive field | Skips boring meetings (data points) |
| Residual Block | Adds stability and memory | Long-term planning |
| 1D Layers | Works on time steps | Because “time” isn’t 2D |
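How much does dilation expand the receptive field? A rough calculation, assuming every block contains two causal convolutions with the same kernel size and dilation (as in the PyTorch block in this article). The function name `receptive_field` is made up for this sketch:

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Number of time steps one output can see through a stack of causal blocks."""
    # Each dilated conv reaches (kernel_size - 1) * dilation steps further back
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)

# Four blocks, kernel size 3
print(receptive_field(3, [1, 1, 1, 1]))  # 17 steps without dilation
print(receptive_field(3, [1, 2, 4, 8]))  # 61 steps with doubling dilation
```

Same number of layers and parameters, but doubling the dilation lets each output see more than three times as much history.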

🔧 PyTorch Example: TCN Block

class TemporalBlock(nn.Module):
    def __init__(self, n_inputs, n_outputs, kernel_size, dilation):
        super().__init__()
        # Pad on the LEFT only: symmetric padding would let the conv see the
        # future and would lengthen the output, breaking the residual sum
        self.pad = (kernel_size - 1) * dilation
        self.conv1 = nn.Conv1d(n_inputs, n_outputs, kernel_size, dilation=dilation)
        self.bn1 = nn.BatchNorm1d(n_outputs)
        self.conv2 = nn.Conv1d(n_outputs, n_outputs, kernel_size, dilation=dilation)
        self.bn2 = nn.BatchNorm1d(n_outputs)
        # 1x1 conv matches channel counts so the input can be added to the output
        self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(F.pad(x, (self.pad, 0)))))
        out = self.bn2(self.conv2(F.pad(out, (self.pad, 0))))
        res = x if self.downsample is None else self.downsample(x)
        return F.relu(out + res)

# Example: TCN with dilation over time series data
# (reuses the torch, nn, and F imports from the ResNet example)
x = torch.randn(8, 1, 100)  # batch=8, 1 feature, 100 time steps
block = TemporalBlock(1, 8, kernel_size=3, dilation=2)
print(block(x).shape)  # torch.Size([8, 8, 100])

📈 Business Example: TCN for Forecasting

Imagine you’re predicting weekly revenue for multiple stores. A TCN can learn temporal dependencies like:

  • Seasonal trends

  • Promotions’ lag effects

  • Customer behavior waves

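…and do it without recurrence: the convolutions process every time step in parallel, so training is typically faster than with an RNN that walks through the sequence one step at a time.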
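Here is a minimal sketch of what that could look like in PyTorch. Everything specific is an assumption made for illustration: the data is random noise standing in for real revenue, the store count, feature count, and layer sizes are arbitrary, and `CausalConv1d` and `TinyForecaster` are hypothetical names rather than library classes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Conv1d padded on the left only, so week t never sees later weeks."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        return self.conv(F.pad(x, (self.pad, 0)))

class TinyForecaster(nn.Module):
    """Hypothetical model: three causal layers with dilations 1, 2, 4."""
    def __init__(self, n_features, hidden=16, kernel_size=3):
        super().__init__()
        chans = [n_features, hidden, hidden, hidden]
        self.layers = nn.ModuleList(
            [CausalConv1d(chans[i], chans[i + 1], kernel_size, dilation=2 ** i)
             for i in range(3)]
        )
        self.head = nn.Conv1d(hidden, 1, kernel_size=1)

    def forward(self, x):
        for layer in self.layers:
            x = F.relu(layer(x))
        return self.head(x)  # one prediction per week: (stores, 1, weeks)

# Made-up data: 32 stores, 2 features (say revenue and a promo flag), 52 weeks
torch.manual_seed(0)
history = torch.randn(32, 2, 52)
model = TinyForecaster(n_features=2)

# Predict next week's revenue (feature 0) from everything up to the current week
inputs, targets = history[:, :, :-1], history[:, :1, 1:]
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(5):  # a few steps only, to show the loop runs
    optimizer.zero_grad()
    loss = F.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()

print(model(inputs).shape)  # torch.Size([32, 1, 51])
```

All 51 input weeks for all 32 stores go through the network in a single forward pass, which is where the speed advantage over step-by-step recurrence comes from.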


🤯 ResNet vs. TCN Cheat Sheet

| Model | Best For | Key Trick | Business Example |
|---|---|---|---|
| ResNet | Images, tabular data | Skip connections | Product recognition, defect detection |
| TCN | Time series | Dilated causal convolutions | Revenue forecasting, churn over time |

💡 Real Talk: Why PyTorch Shines Here

Let’s address the TensorFlow elephant in the room 🐘.

TensorFlow is like a corporate PowerPoint — impressive but rigid. PyTorch is like a whiteboard brainstorming session — fast, flexible, and fun.

💬 “TensorFlow makes you feel like you’re configuring a rocket. PyTorch makes you feel like you’re building one.”

That’s why we’ll stick with PyTorch — it’s intuitive, Pythonic, and loved by researchers who occasionally sleep.


🧠 Mini Challenges

  1. Implement a 3-block ResNet for CIFAR-10 images.

  2. Use a TCN to predict synthetic sine-wave data.

  3. Compare training speeds between TCN and RNN.

  4. Add dropout and see if your model generalizes better.


🎯 Summary

| Concept | Essence |
|---|---|
| ResNet | Skip connections for deep stability |
| TCN | CNNs that understand time |
| PyTorch | Freedom with tensors |
| Business Value | Faster, stable models that don’t overfit meetings |

“ResNet skips layers. TCN skips days. You? Just skip TensorFlow.” 😎

