Ram Vikas Mishra (246101011)
PhD (CSE), IIT Guwahati
DA623: Computing with Signals (Even Semester, Batch 2025)
Instructor: Dr. Neeraj Sharma
Autoencoders are crucial precursors and components in multimodal systems.
(Generic diagram showing modality mapping via latent space)
This project focuses on visualizing the *unimodal* representation learning of an autoencoder, a building block for more complex multimodal architectures.
(Screenshot of Config Area & Chart)
(Screenshot of Canvas Input Area)
(Screenshot of Encoder Activations)
(Screenshot of Bottleneck Grid and Output of Decoder)
(Screenshot of Decoder Activations)
Simplified Keras code demonstrating the encoder structure:

from tensorflow.keras import layers, models, Input

def build_encoder(input_shape, filters1, filters2, latent_dim):
    encoder_inputs = Input(shape=input_shape, name='encoder_input')
    # Conv 1: downsample 28x28 -> 14x14
    x = layers.Conv2D(filters1, (3, 3), activation='relu', padding='same',
                      strides=2, name='encoder_conv1')(encoder_inputs)
    x = layers.BatchNormalization(name='encoder_bn1')(x)
    # Conv 2: downsample 14x14 -> 7x7
    x = layers.Conv2D(filters2, (3, 3), activation='relu', padding='same',
                      strides=2, name='encoder_conv2')(x)
    x = layers.BatchNormalization(name='encoder_bn2')(x)
    x = layers.Flatten(name='encoder_flatten')(x)
    # Bottleneck: compress to a latent_dim-dimensional code
    encoder_outputs = layers.Dense(latent_dim, name='bottleneck')(x)
    encoder = models.Model(encoder_inputs, encoder_outputs, name='encoder')
    return encoder

# Usage:
# encoder = build_encoder((28, 28, 1), filters1=8, filters2=4, latent_dim=9)
Key components: Convolutional layers for spatial reduction, Batch Normalization for stability, Flattening, and a Dense layer for the final bottleneck.
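The decoder mirrors the encoder structure in reverse, expanding the bottleneck code back to a 28x28x1 image. A minimal sketch under that assumption (layer names and the sigmoid output activation are illustrative, not taken from the project code):

```python
from tensorflow.keras import layers, models, Input

def build_decoder(latent_dim, filters1, filters2):
    # Expand the latent code to a 7x7 feature map, then upsample twice.
    decoder_inputs = Input(shape=(latent_dim,), name='decoder_input')
    x = layers.Dense(7 * 7 * filters2, activation='relu',
                     name='decoder_dense')(decoder_inputs)
    x = layers.Reshape((7, 7, filters2), name='decoder_reshape')(x)
    # Transposed Conv 1: upsample 7x7 -> 14x14
    x = layers.Conv2DTranspose(filters1, (3, 3), activation='relu',
                               padding='same', strides=2,
                               name='decoder_deconv1')(x)
    # Transposed Conv 2: upsample 14x14 -> 28x28, one output channel;
    # sigmoid keeps reconstructed pixel values in [0, 1]
    decoder_outputs = layers.Conv2DTranspose(1, (3, 3), activation='sigmoid',
                                             padding='same', strides=2,
                                             name='decoder_deconv2')(x)
    return models.Model(decoder_inputs, decoder_outputs, name='decoder')

# Usage (dimensions matching the encoder example):
# decoder = build_decoder(latent_dim=9, filters1=8, filters2=4)
```

Chaining the two models (decoder applied to the encoder's output) and compiling with a pixel-wise reconstruction loss such as MSE gives the full trainable autoencoder.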