MEDIA SIGNAL PROCESSING

TIME : Winter 2016

Programming Language : IPython (NumPy and SciPy libraries)

Processing multimedia data is at the core of today’s digital world. From image data mining and computer vision to audio processing and compression for Internet telephony, media signal processing is as ubiquitous as it is inconspicuous. This course serves as an introduction to processing sound, image and video from a theoretical perspective, firmly grounded in practical applications and demonstrations. It explores fundamental concepts in digital signal processing, multimedia signal processing and multimedia representations, and covers topics such as audio and image filtering, feature extraction, gestural input and computer vision.

Course Website


PART 1

Sonification of an Image

Here I take a spiral image and perform a series of operations on it to produce audio within the range of human hearing. A spiral was chosen because it helps produce a more pronounced waveform.

SOURCE CODE


PART 2

Sonification of an image, using the image data as the spectrum on which an inverse STFT is performed. To perform the inverse STFT, the image data is extracted and stored as 1D arrays. These 1D arrays are accessed one window at a time, and the FFT of each window is computed and stored in final_sig.
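The idea can be sketched as follows. This is a minimal illustration, not the code behind the SOURCE CODE link. It assumes that each image column is the magnitude spectrum of one frame, that the bottom row is the lowest frequency, and that the missing phase is filled with random values. The function name image_to_audio, the Hann window and the 50% overlap are also assumptions. Only final_sig comes from the description above.

```python
import numpy as np

def image_to_audio(image, hop=None, seed=0):
    """Overlap-add inverse STFT, treating each image column as one frame's magnitudes."""
    image = np.asarray(image, dtype=float)
    n_bins, n_frames = image.shape
    win_len = 2 * (n_bins - 1)            # frame length implied by the rfft bin count
    if hop is None:
        hop = win_len // 2                # assumed 50% overlap
    window = np.hanning(win_len)
    rng = np.random.default_rng(seed)
    final_sig = np.zeros(hop * (n_frames - 1) + win_len)

    for i in range(n_frames):
        magnitude = image[::-1, i]        # flip so the bottom row is the lowest bin
        phase = rng.uniform(0.0, 2.0 * np.pi, size=n_bins)  # assumed random phase
        frame = np.fft.irfft(magnitude * np.exp(1j * phase), n=win_len)
        start = i * hop
        final_sig[start:start + win_len] += window * frame   # overlap-add

    peak = np.max(np.abs(final_sig))
    return final_sig / peak if peak > 0 else final_sig
```

With a 65-row image the frames are 128 samples long, so a 65 x 10 image gives 704 output samples. The normalised result can then be written to a WAV file at whatever sample rate is chosen for playback.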

SOURCE CODE


PART 3

DETECTING ROADS USING SATELLITE IMAGES

I use cross-correlation between satellite images of a cityscape and a reference image of a rectangle placed centrally in a square to detect roadways. The drawback of this implementation is that only vertical roads are detected reliably: detecting a horizontal or inclined road requires the reference image to be rotated to match.
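A small sketch of this approach on synthetic data, under stated assumptions. The reference size, the rectangle dimensions, the FFT-based circular correlation and the names make_reference and cross_correlate are all illustrative choices and are not taken from the linked source code. The two test images are synthetic stand-ins for satellite data.

```python
import numpy as np

def make_reference(size=21, rect_h=17, rect_w=5):
    """Square of zeros with a bright upright rectangle placed in the centre."""
    ref = np.zeros((size, size))
    top = (size - rect_h) // 2
    left = (size - rect_w) // 2
    ref[top:top + rect_h, left:left + rect_w] = 1.0
    return ref

def cross_correlate(image, reference):
    """Circular cross-correlation via the FFT, same shape as the image."""
    image = np.asarray(image, dtype=float)
    image = image - image.mean()
    reference = reference - reference.mean()   # zero mean: flat areas score zero
    rh, rw = reference.shape
    padded = np.zeros_like(image)
    padded[:rh, :rw] = reference
    # Move the reference centre to the origin so peaks land on road centres.
    padded = np.roll(padded, (-(rh // 2), -(rw // 2)), axis=(0, 1))
    spec = np.fft.rfft2(image) * np.conj(np.fft.rfft2(padded))
    return np.fft.irfft2(spec, s=image.shape)

# Synthetic stand-ins for satellite images: one vertical and one horizontal road.
vertical = np.zeros((64, 64))
vertical[:, 30:35] = 1.0
horizontal = vertical.T

ref = make_reference()
resp_v = cross_correlate(vertical, ref)
resp_h = cross_correlate(horizontal, ref)
road_col = int(np.argmax(resp_v.max(axis=0)))   # column with the strongest response
```

On this toy input the strongest response sits on the centre column of the vertical road, while the horizontal road gives a much weaker peak. Correlating the horizontal image with ref.T restores the strong response, which illustrates the limitation described above.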

SOURCE CODE


PART 4

IDENTIFYING PROMINENT PEAKS IN AUDIO FILES

Three WAV files are given. I compute the DFT of each entire file using fft.rfft and then identify the index of the most prominent peak in its magnitude spectrum. The source code is commented to explain the workflow.
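The peak search can be sketched as below, with a synthetic tone standing in for the WAV data. The function name dominant_frequency, the mono-signal assumption and the choice to ignore the DC bin are mine and are not taken from the linked source code.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return (bin index, frequency in Hz) of the largest magnitude-spectrum peak."""
    signal = np.asarray(signal, dtype=float)   # assumes a mono signal
    magnitude = np.abs(np.fft.rfft(signal))
    magnitude[0] = 0.0                         # assumption: ignore the DC component
    peak_index = int(np.argmax(magnitude))
    # Bin k of an N-point DFT corresponds to k * sample_rate / N Hz.
    peak_hz = peak_index * sample_rate / len(signal)
    return peak_index, peak_hz

# Synthetic stand-in for a WAV file: one second of a 440 Hz sine at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
peak_index, peak_hz = dominant_frequency(tone, sample_rate)
```

For a real file, the sample rate and samples could be loaded with scipy.io.wavfile.read before calling the function. Because the transform covers the whole file, the frequency resolution is sample_rate / len(signal) Hz per bin.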

SOURCE CODE


PART 5

IDENTIFYING PROMINENT PEAKS IN AUDIO FILES

Three WAV files are given. The autocorrelation, a measure of how similar a signal is to itself at different lags, is computed for each. Apart from the trivial peak at zero lag, the autocorrelation is highest at the lags where the signal repeats, so the distance between two adjacent autocorrelation peaks gives the period of the original signal. I extract the frequency of each signal from the indices of the prominent peaks in its autocorrelation. I then use the DFT computation from the previous assignment to establish how accurate the autocorrelation-based frequency estimate is.
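A minimal sketch of the autocorrelation estimate, again on a synthetic tone. The FFT-based autocorrelation, the rule for skipping the zero-lag lobe (start searching at the first negative value) and the name autocorr_frequency are assumptions made for illustration and may differ from the linked source code.

```python
import numpy as np

def autocorr_frequency(signal, sample_rate):
    """Estimate the fundamental frequency from the first major autocorrelation peak."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    n = len(signal)
    # Zero-pad to 2n so the FFT product gives linear, not circular, autocorrelation.
    spec = np.fft.rfft(signal, n=2 * n)
    acf = np.fft.irfft(spec * np.conj(spec))[:n]
    # Skip the lobe around zero lag: start at the first lag where acf is negative.
    negative = np.where(acf < 0)[0]
    if len(negative) == 0:
        return None                     # no repetition found
    start = int(negative[0])
    period = start + int(np.argmax(acf[start:]))   # lag of the strongest repeat
    return sample_rate / period

# Synthetic stand-in for a WAV file: one second of a 400 Hz sine at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 400 * t)
acf_hz = autocorr_frequency(tone, sample_rate)

# Reference value from the DFT peak, for comparison with the estimate above.
fft_hz = int(np.argmax(np.abs(np.fft.rfft(tone)))) * sample_rate / len(tone)
```

The period is found as a whole number of samples, so the estimate is limited to frequencies of the form sample_rate / lag. The test tone has a period of exactly 20 samples, so both methods agree here. A real recording will usually show a small difference.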

SOURCE CODE