Understanding Tensors
Tensors are the primary structure for data in machine learning systems. They range from scalars (0D tensors) to higher-dimensional arrays.
Scalars are single numbers:
>>> import numpy as np
>>> x = np.array(10)
>>> x.ndim
0
Vectors (1D tensors) are arrays of numbers:
>>> x = np.array([12, 3, 6, 14])
>>> x.ndim
1
Matrices (2D tensors) have rows and columns:
>>> x = np.array([[5, 78, 2, 34, 0],
...               [6, 79, 3, 35, 1],
...               [7, 80, 4, 36, 2]])
>>> x.ndim
2
3D tensors and higher follow this pattern, stacking lower-dimensional structures.
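For example, stacking two matrices along a new axis produces a 3D tensor:
>>> x = np.array([[[5, 78, 2], [6, 79, 3]],
...               [[7, 80, 4], [8, 81, 5]]])
>>> x.ndim
3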
Key tensor attributes:
- Number of axes (ndim)
- Shape (number of elements along each axis)
- Data type
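These attributes can be inspected directly in NumPy:
>>> x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
>>> x.ndim
2
>>> x.shape
(2, 3)
>>> x.dtype
dtype('float64')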
Tensors are used in various applications:
- Vector data: (samples, features)
- Time-series data: (samples, timesteps, features)
- Images: (samples, height, width, channels)
- Videos: (samples, frames, height, width, channels)
These structures enable efficient data processing in machine learning tasks across various fields.
Real-World Examples of Data Tensors
Vector data (2D tensors):
- Customer Demographics: A dataset of 100 people with age, height, and gender would be stored in a tensor shaped (100, 3).
Time-series data (3D tensors):
- Stock Market Data: A year of trading data (roughly 250 trading days), with high, low, and close prices recorded for each of the 390 minutes in a trading session, could be represented as a tensor of shape (250, 390, 3).
Image data (4D tensors):
- Image Processing: 128 grayscale images sized 256×256 would use a tensor shaped (128, 256, 256, 1). For color images, the last dimension would be 3.
Video data (5D tensors):
- Video Processing: Four 60-second video clips at 4 frames per second and 144×256 resolution would be represented as (4, 240, 144, 256, 3).
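A quick NumPy sketch of these shapes, using zero-filled placeholder arrays in place of real data:
import numpy as np
demographics = np.zeros((100, 3))                      # (samples, features)
stock_data = np.zeros((250, 390, 3))                   # (samples, timesteps, features)
images = np.zeros((128, 256, 256, 1), dtype='uint8')   # (samples, height, width, channels)
videos = np.zeros((4, 240, 144, 256, 3), dtype='uint8')  # (samples, frames, height, width, channels)
print(demographics.ndim, stock_data.ndim, images.ndim, videos.ndim)  # 2 3 4 5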
These tensor representations facilitate efficient data processing and enable neural networks to recognize patterns, extract features, and make predictions in various domains.
Feature Learning and Representation Learning
Data representation learning has evolved from simple linear techniques to complex deep learning models. This progression has enhanced the ability of machine learning systems to process raw data effectively.
Early methods:
- Principal Component Analysis (PCA): Unsupervised technique for dimensionality reduction.
- Linear Discriminant Analysis (LDA): Supervised method for maximizing class separability.
Example of PCA implementation:
from sklearn.decomposition import PCA
import numpy as np
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
pca = PCA(n_components=1)  # Reduce the two features to a single component
transformed_data = pca.fit_transform(data)  # Shape (5, 1): one value per sample
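LDA works similarly but uses class labels to guide the projection. A minimal sketch with scikit-learn's LinearDiscriminantAnalysis, using hypothetical labels for the same points:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import numpy as np
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
y = np.array([1, 0, 1, 0, 1])  # Hypothetical class labels
lda = LinearDiscriminantAnalysis(n_components=1)
projected = lda.fit_transform(X, y)  # Projects onto the axis that best separates the two classes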
Manifold learning methods like Isometric Mapping (Isomap) and Locally Linear Embedding (LLE) emerged to handle more complex data structures.
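A rough sketch of both, using scikit-learn's implementations on a synthetic S-curve dataset:
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap, LocallyLinearEmbedding
X, _ = make_s_curve(n_samples=500, random_state=0)  # 3D points lying on a curved 2D manifold
X_isomap = Isomap(n_neighbors=10, n_components=2).fit_transform(X)  # Preserves geodesic distances
X_lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2).fit_transform(X)  # Preserves local neighborhoods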
Deep learning marked a significant advancement in representation learning. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) can learn intricate representations directly from raw data.
Example of a simple CNN structure:
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Sequential
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),  # Learn 32 local feature detectors
    MaxPooling2D(pool_size=(2, 2)),  # Downsample the feature maps
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')  # Output probabilities for 10 classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
These advancements in representation learning have enabled significant improvements in various AI applications, from healthcare to finance, enhancing our ability to process and understand complex data.
Graph Neural Networks
Graph Neural Networks (GNNs) are specialized neural network architectures designed for graph-structured data. Unlike traditional neural networks, GNNs can process data where relationships between elements are crucial, enabling advancements in social network analysis, chemical molecule discovery, and recommendation systems.
GNNs understand the interconnected nature of graph data, where nodes and edges represent entities and their relationships. Three main types of GNNs are:
- Recurrent Graph Neural Networks (Recurrent GNNs): These extend recurrent neural networks to graph data, iteratively propagating information across the graph until convergence.
- Spatial Convolutional Graph Neural Networks: These apply convolutions directly on the graph, aggregating node features from neighboring nodes.
- Spectral Convolutional Graph Neural Networks: These perform convolutions in the spectral domain using graph Fourier transforms.
Here's an example of a Recurrent GNN implementation:
import torch
import torch.nn as nn

class RecurrentGNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(RecurrentGNN, self).__init__()
        self.rnn = nn.RNN(input_size, hidden_size)  # Iteratively propagates node information
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x, hidden):
        out, hidden = self.rnn(x, hidden)
        out = self.fc(out[-1, :, :])  # Classify from the final propagation step
        return out

hidden_size = 20
model = RecurrentGNN(input_size=10, hidden_size=hidden_size, num_classes=5)
hidden = torch.zeros(1, 1, hidden_size)  # Initial hidden state: (num_layers, batch, hidden_size)
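For contrast, here is a minimal sketch of the spatial (message-passing) approach, in which each node averages its neighbors' features before a learned projection; the adjacency matrix and feature sizes are illustrative:
import torch
import torch.nn as nn

class SimpleSpatialGNNLayer(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x, adj):
        # x: (num_nodes, in_features); adj: (num_nodes, num_nodes) adjacency with self-loops
        neighbor_mean = adj @ x / adj.sum(dim=1, keepdim=True)  # Aggregate neighboring node features
        return torch.relu(self.linear(neighbor_mean))

adj = torch.tensor([[1., 1., 0.],
                    [1., 1., 1.],
                    [0., 1., 1.]])  # Toy 3-node graph with self-loops
features = torch.randn(3, 10)  # One 10-dimensional feature vector per node
layer = SimpleSpatialGNNLayer(in_features=10, out_features=8)
node_embeddings = layer(features, adj)  # Shape (3, 8)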
GNNs excel at capturing and processing dependencies within graph-structured data, making them useful for understanding complex systems in real-world scenarios.
Advanced Data Representation Techniques
Advanced data representation techniques like embeddings and auto-encoders offer new ways to handle and interpret complex data types.
Embeddings transform complex data into dense, fixed-size vector spaces, retaining meaningful relationships between elements. They're particularly useful in natural language processing:
from gensim.models import Word2Vec
sentences = [['machine', 'learning', 'is', 'fun'], ['deep', 'learning', 'requires', 'lots', 'of', 'data']]
model = Word2Vec(sentences, vector_size=10, window=5, min_count=1, workers=4)
vector = model.wv['learning'] # Access the word vector for 'learning'
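Once trained, the model can also be queried for nearest neighbors in the embedding space (on a toy corpus this small the results are not meaningful, but the call is the same):
similar_words = model.wv.most_similar('learning', topn=3)  # List of (word, similarity) pairs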
Auto-encoders are neural networks designed for unsupervised learning of efficient codings. They compress input into a latent-space representation and then reconstruct the output:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
input_dim = 784 # Example for an image with 28x28 pixels
input_img = Input(shape=(input_dim,))
encoded = Dense(64, activation='relu')(input_img) # Encoder
decoded = Dense(input_dim, activation='sigmoid')(encoded) # Decoder
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
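Training then uses the same array as both input and target. A sketch with random placeholder data standing in for real flattened 28x28 images scaled to [0, 1]:
import numpy as np
x_train = np.random.rand(1000, input_dim).astype('float32')  # Placeholder data in place of real images
autoencoder.fit(x_train, x_train, epochs=5, batch_size=32)  # Input and reconstruction target are identical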
These techniques have applications in various fields:
- Image Processing: Style transfer, where the style of one image is applied to another.
- Biology: Protein-protein interaction prediction. Models like CycleDNN, inspired by auto-encoders, can predict CETSA features across different cell lines.
These advanced data representation techniques enhance our ability to interpret and manipulate data across diverse fields, from image processing to biomedical research.
Tensors are fundamental in machine learning, enabling efficient data processing and complex computations. Understanding their structure and applications can drive advancements in various fields, from biology to artificial intelligence.