Research Projects

Question Answering using KG and Web Corpus

Guided by: Prof. Soumen Chakrabarti | Machine Learning, IR | IIT Bombay

Developed AQQUCN, a QA system which gracefully combines KG and corpus evidence, improving the state-of-the-art performance by 5–16% on four public query sets. Accepted at Information Retrieval Journal, 2019.


_______________________________________________________________________________________________________________________________

Language Modeling in code-switched speech

Guided by: Prof. Preethi Jyothi | Machine Learning, NLP | IIT Bombay

Proposed D-RNNLM, a novel language modeling approach for code-switched text. Explored techniques to effectively train RNNLMs in low-resource scenarios. Accepted as a short paper at EMNLP, 2018. Formulated a framework for combining two monolingual language models using a probabilistic model. Accepted at Interspeech, 2018.


_______________________________________________________________________________________________________________________________

Markov Chain Monte Carlo Sampling

Guided by: Prof. Suyash Awate | Statistical Machine Learning, MIP | IIT Bombay

Proposed an exact MCMC algorithm for sampling from MRF models on label images and showed its efficacy for uncertainty estimation in medical MRI images. Accepted at MICCAI, 2018. Extended the above work to more generic MRF models, including GF and SBM. Submitted a detailed version to MIA Journal which involves more theoretical analysis and validation.


_______________________________________________________________________________________________________________________________

Multiword Expressions and Error Tracking in IR

Guided by: Prof. Pushpak Bhattacharyya | Machine Learning, IR | IIT Bombay

Designed and implemented a framework for an error detection and correction tool that can perform pseudo-error-correction in a large-scale search engine system. Deployed the tool on Sandhan, a cross-lingual search engine for Indian languages. Studied WordNet-based features and word-embedding approaches to detect Multiword Expressions in IR.


_______________________________________________________________________________________________________________________________

Methods for labeling unsegmented sequence data

Guided by: Prof. Preethi Jyothi | Machine Learning, ASR | IIT Bombay

Surveyed the literature on state-of-the-art methods in deep learning for the problem of labeling unsegmented data. Explored models based on a combination of RNNs and the CTC objective function, and models based on segmental RNNs. Implemented an end-to-end speech recognition system using the CTC objective function. Studied attention-based and memory-augmented RNN models for tasks including machine translation, handwriting synthesis, and speech recognition.


_______________________________________________________________________________________________________________________________

Approximation algorithms for weighted b-Matching

Guided by: Prof. Alex Pothen | Algorithms | Purdue University, US

Worked on designing an approximation algorithm for a variant of the stable fixtures problem, with a high degree of concurrency so that it can be implemented efficiently on shared- and distributed-memory multiprocessors. Implemented the algorithm to analyze its performance on various graph structures and compared it against a collection of previously proposed algorithms.


_______________________________________________________________________________________________________________________________