Neural networks are the foundation of modern artificial intelligence, but their principles are surprisingly simple. In this article, we’ll explain how they work and create our first functional model in Python with just a few lines of code.
What Are Neural Networks and How Do They Work?¶
Neural networks are mathematical models inspired by the functioning of the human brain. At their core lies an artificial neuron (perceptron), which receives input signals, processes them using weight coefficients and bias values, and produces output through an activation function.
Each neuron in the network performs a simple operation: output = activation_function(sum(inputs × weights) + bias). When we connect multiple neurons into layers and interconnect the layers, we get a neural network capable of solving complex problems.
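This single-neuron formula can be computed directly. The sketch below uses hand-picked toy values (the inputs, weights, and bias are illustrative, not from any trained model) with ReLU as the activation function:

```python
import torch

# Toy neuron: 3 inputs with hand-picked weights and bias (illustrative values)
inputs = torch.tensor([0.5, -1.0, 2.0])
weights = torch.tensor([0.4, 0.3, -0.2])
bias = 0.1

# output = activation_function(sum(inputs × weights) + bias)
weighted_sum = torch.dot(inputs, weights) + bias
output = torch.relu(weighted_sum)

print(f"weighted sum: {weighted_sum.item():.2f}")  # 0.2 - 0.3 - 0.4 + 0.1 = -0.40
print(f"neuron output: {output.item():.2f}")       # ReLU clips negatives: 0.00
```

Because the weighted sum is negative, ReLU outputs zero here, and the neuron stays "silent" for this input.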
Basic Components of a Neural Network¶
A neural network consists of three types of layers:
- Input Layer – receives data and passes it forward
- Hidden Layers – process data using weight transformations
- Output Layer – produces the final prediction
Each connection between neurons has its own weight, which determines how strongly one neuron influences another. During training, these weights are gradually adjusted using the backpropagation algorithm.
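In PyTorch, these per-connection weights live inside the layer objects themselves. A quick look at a fully connected layer (the same `nn.Linear` we use below) shows one weight per connection plus one bias per neuron:

```python
import torch.nn as nn

layer = nn.Linear(4, 10)  # 4 inputs fully connected to 10 neurons

# One weight per connection: a 10x4 matrix, plus one bias per neuron
print(layer.weight.shape)  # torch.Size([10, 4])
print(layer.bias.shape)    # torch.Size([10])

# requires_grad=True means backpropagation will compute gradients for these
print(layer.weight.requires_grad)  # True
```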
Implementing a Simple Neural Network in PyTorch¶
For practical demonstration, we’ll create a neural network that can classify data from the well-known Iris dataset. The network will have one hidden layer and use the ReLU activation function.
Data Preparation and Environment Setup¶
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
# Load Iris dataset
iris = load_iris()
X = iris.data # 4 features: sepal length/width, petal length/width
y = iris.target # 3 classes: setosa, versicolor, virginica
# Split into training and testing data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Data normalization
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Convert to PyTorch tensors
X_train_tensor = torch.FloatTensor(X_train_scaled)
X_test_tensor = torch.FloatTensor(X_test_scaled)
y_train_tensor = torch.LongTensor(y_train)
y_test_tensor = torch.LongTensor(y_test)
Neural Network Architecture Definition¶
class SimpleNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNeuralNetwork, self).__init__()
        # Define layers
        self.hidden = nn.Linear(input_size, hidden_size)
        self.output = nn.Linear(hidden_size, output_size)
        # Activation function for the hidden layer
        self.relu = nn.ReLU()

    def forward(self, x):
        # Forward pass - data flowing through the network
        x = self.hidden(x)  # Linear transformation
        x = self.relu(x)    # Apply ReLU activation
        x = self.output(x)  # Output layer - raw logits
        # No softmax here: nn.CrossEntropyLoss applies log-softmax internally
        return x
# Create model instance
model = SimpleNeuralNetwork(
    input_size=4,    # 4 features from Iris dataset
    hidden_size=10,  # 10 neurons in hidden layer
    output_size=3    # 3 classes for classification
)
print(f"Model architecture:\n{model}")
Training the Neural Network¶
For training, we need to define a loss function and an optimizer. For multi-class classification, we'll use CrossEntropyLoss and the Adam optimizer.
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
# Training loop
epochs = 1000
losses = []
for epoch in range(epochs):
    # Forward pass
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor)

    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()        # Backpropagation
    optimizer.step()       # Update weights

    losses.append(loss.item())

    # Print progress every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
print("Training completed!")
Model Evaluation¶
# Testing on test data
model.eval()  # Switch to evaluation mode
with torch.no_grad():
    test_outputs = model(X_test_tensor)
    _, predicted = torch.max(test_outputs, 1)

# Calculate accuracy
total = y_test_tensor.size(0)
correct = (predicted == y_test_tensor).sum().item()
accuracy = 100 * correct / total
print(f'Accuracy on test data: {accuracy:.2f}%')

# Detailed look at predictions
print("\nComparison of actual and predicted values:")
for i in range(min(10, len(y_test))):
    actual = iris.target_names[y_test[i]]
    predicted_class = iris.target_names[predicted[i]]
    print(f"Actual: {actual:12} | Predicted: {predicted_class:12}")
How Neural Networks “Learn”¶
The neural network learning process occurs in four steps:
- Forward Propagation – data flows through the network forward and creates a prediction
- Loss Calculation – comparing prediction with actual value
- Backpropagation – calculating gradients using the chain rule
- Weight Update – adjusting weights based on gradients
The optimizer plays a key role in determining how quickly and efficiently the network learns. The Adam optimizer combines the advantages of momentum and an adaptive learning rate, often leading to faster convergence.
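The four steps above can be traced on a single weight using autograd. This is a toy regression with made-up numbers (not the Iris model), using plain gradient descent instead of Adam so the weight update is easy to follow:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # one trainable weight
x, target = torch.tensor(3.0), torch.tensor(9.0)

# 1) Forward propagation
prediction = w * x  # 2.0 * 3.0 = 6.0

# 2) Loss calculation (squared error)
loss = (prediction - target) ** 2  # (6 - 9)^2 = 9.0

# 3) Backpropagation: d(loss)/dw = 2*(w*x - target)*x = 2*(-3)*3 = -18
loss.backward()
print(w.grad)  # tensor(-18.)

# 4) Weight update (plain gradient descent, learning rate 0.1)
with torch.no_grad():
    w -= 0.1 * w.grad
print(w)  # tensor(3.8000, requires_grad=True)
```

After one step the weight moved from 2.0 toward 3.0 (the value that would make the prediction exact); repeating the four steps is exactly what the training loop above does at scale.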
Activation Functions and Their Significance¶
ReLU (Rectified Linear Unit) is the most popular activation function for hidden layers. Its simplicity (max(0, x)) brings several advantages:
# Comparison of different activation functions
import matplotlib.pyplot as plt

x = torch.linspace(-5, 5, 100)

# ReLU: f(x) = max(0, x)
# Sigmoid: f(x) = 1 / (1 + e^(-x))
# Tanh: f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
plt.plot(x, torch.relu(x), label='ReLU')
plt.plot(x, torch.sigmoid(x), label='Sigmoid')
plt.plot(x, torch.tanh(x), label='Tanh')
plt.legend()
plt.show()

print("ReLU advantages:")
print("- Fast computation")
print("- Mitigates the vanishing gradient problem")
print("- Sparsity - many neurons output zero")
Extensions and Practical Tips¶
To improve neural network performance, we can use several techniques:
class ImprovedNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size, dropout_rate=0.2):
        super(ImprovedNeuralNetwork, self).__init__()
        layers = []
        prev_size = input_size

        # Create multiple hidden layers
        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(prev_size, hidden_size))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout_rate))  # Regularization
            prev_size = hidden_size

        # Output layer
        layers.append(nn.Linear(prev_size, output_size))
        self.network = nn.Sequential(*layers)

    def forward(self, x):
        return self.network(x)
# Using more advanced architecture
advanced_model = ImprovedNeuralNetwork(
    input_size=4,
    hidden_sizes=[16, 8],  # Two hidden layers
    output_size=3,
    dropout_rate=0.3
)
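Before training a deeper architecture like this, it is worth sanity-checking its shapes and size. The standalone sketch below rebuilds the same [16, 8] stack with `nn.Sequential` (so it runs on its own) and verifies the output shape and parameter count; the batch of 5 random samples is just dummy data:

```python
import torch
import torch.nn as nn

# Rebuild the same architecture inline: 4 -> 16 -> 8 -> 3 with ReLU + Dropout
layers = []
prev_size = 4
for hidden_size in [16, 8]:
    layers += [nn.Linear(prev_size, hidden_size), nn.ReLU(), nn.Dropout(0.3)]
    prev_size = hidden_size
layers.append(nn.Linear(prev_size, 3))
model = nn.Sequential(*layers)

# Forward pass with a fake batch: 5 samples, 4 features each
dummy = torch.randn(5, 4)
model.eval()  # disable dropout for a deterministic shape check
print(model(dummy).shape)  # torch.Size([5, 3]) - one raw score per class

# Trainable parameters: (4*16+16) + (16*8+8) + (8*3+3) = 243
print(sum(p.numel() for p in model.parameters()))  # 243
```

A check like this catches mismatched layer sizes immediately, before any time is spent on training.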
Summary¶
Neural networks are a powerful tool for solving complex machine learning problems. Understanding basic principles – from neuron structure through forward and backward propagation to optimization – is key to effective work with deep learning. PyTorch provides an intuitive interface for implementing and experimenting with various architectures. Starting with simple models like in our example is an ideal way to master the basics before moving on to more complex architectures like CNN or Transformer.