Who am I?

Hi, I'm Bajibabu Bollepalli! I'm currently a PhD student at Aalto University, working under the supervision of Prof. Paavo Alku.

I worked as an Applied Scientist intern at Amazon in Gdansk. I completed a Licentiate degree at TMH, KTH Royal Institute of Technology, Stockholm, Sweden. I also worked as a research intern at NII in Tokyo, Japan, under the supervision of Prof. Junichi Yamagishi.


Work Experience


Education

  • Ph.D. in Text-To-Speech (in progress).
    Department of Signal Processing and Acoustics, Aalto University, Finland
    Thesis (tentative): Speaking Style Adaptation in Text-To-Speech Synthesis using Deep Neural Networks.
  • Licentiate Degree in Text-To-Speech
    KTH Royal Institute of Technology, Sweden.
    Thesis: Towards Conversational Speech Synthesis - Experiments with Data Quality, Prosody Modification, and Non-verbal Signals.
  • Bachelor and Masters Dual Degree
    IIIT Hyderabad, India.
    Thesis: Voice conversion using articulatory features.

Publications

[2019]
  1. Bajibabu Bollepalli, Lauri Juvela, and Paavo Alku.
    Lombard Speech Synthesis Using Transfer Learning in a Tacotron Text-to-Speech System.
    Proc. Interspeech 2019. [PDF]
  2. Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, and Paavo Alku.
    GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram.
    Proc. Interspeech 2019. [PDF]
  3. Bajibabu Bollepalli, Lauri Juvela, Manu Airaksinen, Cassia Valentini-Botinhao, and Paavo Alku.
    Normal-to-Lombard adaptation of speech synthesis using long short-term memory recurrent neural networks.
    Speech Communication. [SPECOM]
  4. Lauri Juvela, Bajibabu Bollepalli, Vassilis Tsiaras, and Paavo Alku.
    GlotNet—A raw waveform model for the glottal excitation in statistical parametric speech synthesis.
    IEEE/ACM Transactions on Audio, Speech, and Language Processing. [IEEE]
  5. Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, and Paavo Alku.
    Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks.
    Proc. ICASSP 2019. [arXiv:1810.12598]
[2018]
  1. Lauri Juvela, Vassilis Tsiaras, Bajibabu Bollepalli, Manu Airaksinen, Junichi Yamagishi, and Paavo Alku.
    Speaker-independent raw waveform model for glottal excitation.
    Proc. Interspeech 2018. [PDF][arXiv:1804.09593]
  2. Lauri Juvela, Bajibabu Bollepalli, Xin Wang, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku.
    Speech waveform synthesis from MFCC sequences with generative adversarial networks.
    Proc. ICASSP 2018. [IEEE][arXiv:1804.00920]
  3. Manu Airaksinen, Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku.
    A Comparison between STRAIGHT, glottal, and sinusoidal vocoding in statistical parametric speech synthesis.
    IEEE/ACM Transactions on Audio, Speech, and Language Processing. September 2018 [IEEE]
[2017]
  1. Bajibabu Bollepalli, Lauri Juvela, Paavo Alku.
    Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis.
    Proc. Interspeech 2017. [PDF]
  2. Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku.
    Reducing mismatch in training of DNN-based glottal excitation models in a statistical parametric text-to-speech system.
    Proc. Interspeech 2017. [PDF]
  3. Manu Airaksinen, Bajibabu Bollepalli, Jouni Pohjalainen, Paavo Alku.
    Frequency-warped time-weighted linear prediction for glottal vocoding.
    Proc. ICASSP 2017. [IEEE]
  4. Bajibabu Bollepalli, Manu Airaksinen, Paavo Alku.
    Lombard speech synthesis using long short-term memory recurrent neural networks.
    Proc. ICASSP 2017. [IEEE]
  5. Manu Airaksinen, Bajibabu Bollepalli, Jouni Pohjalainen, Paavo Alku.
    Glottal vocoding with frequency-warped time-weighted linear prediction.
    IEEE Signal Processing Letters. April 2017. [IEEE]
  6. Bajibabu Bollepalli.
    Towards conversational speech synthesis: Experiments with data quality, prosody modification, and non-verbal signals.
    Licentiate Thesis. KTH Royal Institute of Technology, Sweden, 2017. [PDF]
[2016]
  1. Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King.
    DNN-based Speech Synthesis for Indian Languages from ASCII text.
    Proc. Speech Synthesis Workshop (SSW9) 2016. [PDF]
  2. Manu Airaksinen, Bajibabu Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, Paavo Alku.
    GlottDNN - A full-band glottal vocoder for statistical parametric speech synthesis.
    Proc. Interspeech 2016. [PDF] (Best Student Paper Award)
  3. Lauri Juvela, Bajibabu Bollepalli, Manu Airaksinen, Paavo Alku.
    High-pitched excitation generation for glottal vocoding in statistical parametric speech synthesis using a deep neural network.
    Proc. ICASSP 2016. [IEEE] (Best Student Paper Award)
[2014]
  1. Bajibabu Bollepalli, Tuomo Raitio.
    Effect of MPEG audio compression on vocoders used in statistical parametric speech synthesis.
    Proc. 22nd European Signal Processing Conference (EUSIPCO) 2014. [IEEE]
  2. Maria Koutsombogera, Samer Al Moubayed, Bajibabu Bollepalli, Ahmed Hussen Abdelaziz, Martin Johansson, Jose David Aguas Lopes, Jekaterina Novikova, Catharine Oertel, Kalin Stefanov, Gul Varol.
    The Tutorbot Corpus - A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue.
    Proc. LREC 2014. [PDF]
  3. Bajibabu Bollepalli, Jerome Urbain, Tuomo Raitio, Joakim Gustafson, Huseyin Cakmak.
    A Comparative Evaluation of Vocoding Techniques for HMM-based Laughter Synthesis.
    Proc. ICASSP 2014. [IEEE]
  4. Samer Al Moubayed, Jonas Beskow, Bajibabu Bollepalli, Joakim Gustafson, Ahmed Hussen-Abdelaziz, Martin Johansson, Maria Koutsombogera, Jose David Lopes, Jekaterina Novikova, Catharine Oertel, Gabriel Skantze, Kalin Stefanov, Gul Varol.
    Human-robot collaborative tutoring using multiparty multimodal spoken dialogue.
    Proc. HRI 2014. [PDF]
[2013]
  1. Bajibabu Bollepalli, Tuomo Raitio, Paavo Alku.
    Effect of MPEG Audio Compression on HMM-based Speech Synthesis.
    Proc. Interspeech 2013. [PDF]
  2. Bajibabu Bollepalli, Jonas Beskow, Joakim Gustafson.
    Non-Linear Pitch Modification in Voice Conversion using Artificial Neural Networks.
    Proc. NOLISP 2013. [PDF]
[2012]
  1. Bajibabu Bollepalli, Jonas Beskow, Joakim Gustafson.
    HMM based speech synthesis system for Swedish Language.
    Proc. 4th Swedish Language Technology Conference 2012. [PDF]
  2. Bajibabu Bollepalli.
    Voice conversion using articulatory features.
    Masters Thesis. IIIT Hyderabad, India, 2012. [PDF]
  3. Bajibabu Bollepalli, Alan W Black, Kishore Prahallad.
    Modelling a noisy-channel for voice conversion using articulatory features.
    Proc. Interspeech 2012. [PDF]
  4. Sathya Adithya Thati, Bajibabu Bollepalli, Peri Bhaskararao, B. Yegnanarayana.
    Analysis of breathy voice based on excitation characteristics of speech production.
    Proc. SPCOM 2012. [IEEE]
  5. Srikanth Ronanki, Bajibabu Bollepalli, Kishore Prahallad.
    Duration modelling in voice conversion using artificial neural networks.
    Proc. IWSSIP 2012. [IEEE]
[2011]
  1. Bajibabu Bollepalli, Ronanki Srikanth, Sathya Adithya Thati, Bhiksha Raj, B. Yegnanarayana, Kishore Prahallad.
    A comparison of prosody modification using instants of significant excitation and mel-cepstral vocoder.
    Proc. Centenary Conference of the Indian Institute of Science 2011. [PDF]
  2. Gautam Verma Mantena, Bajibabu Bollepalli, Kishore Prahallad.
    SWS task: Articulatory phonetic units and sliding DTW.
    Proc. MediaEval, Satellite Events in Interspeech 2011. [PDF]