Rahim Saeidi, PhD

Research Fellow,
Speech Communication Technology,
Department of Signal Processing and Acoustics,
School of Electrical Engineering, Aalto University

rahim.saeidi (at) aalto.fi

Address: Otakaari 5I
00076 Aalto
Finland

Research interests

I have a broad interest in speech signal processing. In my research I develop machine learning algorithms and digital signal processing techniques for speech technology, including speaker recognition, speech recognition and speech enhancement. I did my PhD studies at the University of Eastern Finland (my PhD thesis in pdf). I previously worked as a Marie Curie Post-doctoral Fellow at the Centre for Language and Speech Technology in the Faculty of Arts, Radboud University Nijmegen, The Netherlands, with Prof. David van Leeuwen. I now work as a research fellow with Prof. Paavo Alku.


Teaching:

Spring 2009: TA for Digital Speech Processing Course (Exercise page)
Fall 2009: TA for Pattern Recognition Course (Exercise page)
Spring 2010: Scientific Writing
Spring 2012: TA for Forensic Linguistics (with David van Leeuwen)
Spring 2012: TA for Machine Learning in Practice (with Tom Heskes)
Fall 2013: Location-aware mobile applications development
Spring 2014: Web Design and Development

Talks:

Patents:

P. Mowlaee, R. Saeidi, and G. Kubin, "Iterative closed-loop speech enhancement", European Patent, filed on August 23rd, 2013.

My Google Scholar profile

Peer-Reviewed Journal Publications

  1. M. I. Mandasari, M. Gunther, R. Wallace, R. Saeidi, S. Marcel and D. A. van Leeuwen, Score Calibration in Face Recognition, IET Biometrics, 3(4), pp. 246-256, December 2014 (pdf) (Source Code).
  2. P. Mowlaee and R. Saeidi, Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement, IEEE Signal Processing Letters, 20(12), pp. 1235-1239, December 2013 (pdf) (Audio Examples).
  3. M. I. Mandasari, R. Saeidi, M. McLaren and D. A. van Leeuwen, Quality Measure Functions for Calibration of Speaker Recognition System in Various Duration Conditions, IEEE Transactions on Audio, Speech and Language Processing, 21(11), pp. 2425-2438, November 2013 (pdf).
  4. D. Kolossa, S. Zeiler, R. Saeidi and R. F. Astudillo, Noise-Adaptive LDA: A New Approach for Speech Recognition Under Observation Uncertainty, IEEE Signal Processing Letters, 20(11), pp. 1018-1021, November 2013 (pdf).
  5. R. F. Astudillo, D. Kolossa, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J. P. da Silva Neto and R. Martin, Integration of Beamforming and Uncertainty-of-Observation Techniques for Robust ASR in Multi-Source Environments, Computer Speech and Language, 27(3), pp. 837-850, May 2013.
  6. P. Mowlaee, R. Saeidi, Z.-H. Tan, M. G. Christensen, T. Kinnunen, P. Franti and S. H. Jensen, A Joint Approach for Single-Channel Speaker Identification and Speech Separation, IEEE Transactions on Audio, Speech and Language Processing, 20(9), pp. 2586-2601, November 2012.
  7. T. Kinnunen, R. Saeidi, F. Sedlak, K.A. Lee, J. Sandberg, M. Hansson-Sandsten and H. Li, Low-Variance Multitaper MFCC Features: a Case Study in Robust Speaker Verification, IEEE Transactions on Audio, Speech and Language Processing, 20(7), pp. 1990-2001, September 2012.
  8. C. Hanilci, T. Kinnunen, F. Ertas, R. Saeidi, J. Pohjalainen and P. Alku, Regularized All-Pole Models for Speaker Verification Under Noisy Environments, IEEE Signal Processing Letters, 19(3), pp. 163-166, March 2012.
  9. R. Saeidi, J. Pohjalainen, T. Kinnunen and P. Alku, Temporally Weighted Linear Prediction Features for Tackling Additive Noise in Speaker Verification, IEEE Signal Processing Letters, 17(6), pp. 599-602, June 2010.
  10. J. Sandberg, M. Hansson-Sandsten, T. Kinnunen, R. Saeidi, P. Flandrin and P. Borgnat, Multitaper Estimation of Frequency-Warped Cepstra With Application to Speaker Verification, IEEE Signal Processing Letters, 17(4), pp. 343-346, April 2010.
  11. R. Saeidi, H. R. Sadegh Mohammadi, T. Ganchev and R. D. Rodman, Particle Swarm Optimization for Sorted Adapted Gaussian Mixture Models, IEEE Transactions on Audio, Speech and Language Processing, 17(2), pp. 344-353, February 2009.

Peer-Reviewed Conference Publications:

  1. F. Kelly, R. Saeidi, N. Harte and D. A. van Leeuwen, Effect of long-term ageing on i-vector speaker verification, in Proc. INTERSPEECH 2014, pp. 86-90, Singapore, September 2014 (pdf).
  2. P. Mowlaee, R. Saeidi and Y. Stylianou, INTERSPEECH 2014 Special Session on Phase Importance in Speech Processing Applications, in Proc. INTERSPEECH 2014, pp. 1623-1627, Singapore, September 2014 (pdf).
  3. P. Mowlaee and R. Saeidi, Time-Frequency Constraints for Phase Estimation in Single-Channel Speech Enhancement, in Proc. The 14th International Workshop on Acoustic Signal Enhancement (IWAENC 2014), pp. 338-342, Antibes, France, September 2014 (pdf).
  4. R. Mariescu-Istodor, A. Tabarcea, R. Saeidi and P. Franti, Low complexity spatial similarity measure of GPS trajectories, in Proc. Int. Conf. on Web Information Systems & Technologies (WEBIST'14), pp. 62-69, Barcelona, Spain, April 2014.
  5. M. I. Mandasari, R. Saeidi and D. A. van Leeuwen, Calibration based on duration quality measure function in noise robust speaker recognition for NIST SRE'12, in Proc. Biometric Technologies in Forensic Science, pp. 1-5, Nijmegen, Netherlands, October 2013 (pdf).
  6. R. Saeidi, K. A. Lee, T. Kinnunen, T. Hasan, B. Fauve, P.-M. Bousquet, E. Khoury, P. L. Sordo Martinez, J. M. K. Kua, C. H. You, H. Sun, A. Larcher, P. Rajan, V. Hautamaki, C. Hanilci, B. Braithwaite, R. Gonzales-Hautamaki, S. O. Sadjadi, G. Liu, H. Boril, N. Shokouhi, D. Matrouf, L. El Shafey, P. Mowlaee, J. Epps, T. Thiruvaran, D. A. van Leeuwen, B. Ma, H. Li, J. H. L. Hansen, J.-F. Bonastre, S. Marcel, J. Mason, E. Ambikairajah, I4U submission to NIST SRE 2012: A large-scale collaborative effort for noise-robust speaker verification, in Proc. INTERSPEECH 2013, pp. 1986-1990, Lyon, France, August 2013. (Link)
  7. V. Hautamaki, K.A. Lee, D. A. van Leeuwen, R. Saeidi, A. Larcher, T. Kinnunen, T. Hasan, S.O. Sadjadi, G. Liu, H. Boril, J.H.L. Hansen, B. Fauve, Automatic regularization of cross-entropy cost for speaker recognition fusion, in Proc. INTERSPEECH 2013, pp. 1609-1613, Lyon, France, August 2013.
  8. E. Khoury, B. Vesnicer, J. Franco-Pedroso, R. Violato, Z. Boulkenafet, L-M. M. Fernandez, M. Diez, J. Kosmala, H. Khemiri, T. Cipr, R. Saeidi, M. Gunther, J. Zganec-Gros, R. Z. Candil, F. Simoes, M. Bengherabi, A. A. Marquina, M. Penagarikano, A. Abad, M. Boulayemen, P. Schwarz, D. A. van Leeuwen, J. Gonzalez-Dominguez, M. U. Neto, E. Boutellaa, P. G. Vilda, A. Varona, D. Petrovska-Delacretaz, P. Matejka, J. Gonzalez-Rodriguez, T. de F. Pereira, F. Harizi, L. J. Rodriguez-Fuentes, L. El Shafey, M. Angeloni, G. Bordel, G. Chollet and S. Marcel, The 2013 Speaker Recognition Evaluation in Mobile Environment, in Proc. The 6th IAPR International Conference on Biometrics, pp. 1-8, Madrid, Spain, June 2013.
  9. D. A. van Leeuwen, R. Saeidi, Knowing the non-target speakers: the effect of the i-vector population for PLDA training in speaker recognition, in Proc. ICASSP 2013, pp. 6778-6782, Vancouver, Canada, May 2013.
  10. T. Hasan, R. Saeidi, J. H. L. Hansen, and D. A. van Leeuwen, Duration mismatch compensation for i-vector based speaker recognition systems, in Proc. ICASSP 2013, pp. 7663-7667, Vancouver, Canada, May 2013.
  11. M. H. Bahari, R. Saeidi, H. van Hamme, and D. A. van Leeuwen, Accent recognition using i-vector, Gaussian mean supervector and Gaussian posterior probability supervector for spontaneous telephone speech, in Proc. ICASSP 2013, pp. 7344-7348, Vancouver, Canada, May 2013.
  12. P. Mowlaee, R. Saeidi, On phase importance in parameter estimation in single-channel speech enhancement, in Proc. ICASSP 2013, pp. 7462-7466, Vancouver, Canada, May 2013.
  13. P. Mowlaee, R. Saeidi, Target speaker separation in multisource environment using speaker-dependent postfilter and noise estimation, in Proc. ICASSP 2013, pp. 7254-7258, Vancouver, Canada, May 2013.
  14. C. Hanilci, T. Kinnunen, R. Saeidi, J. Pohjalainen, P. Alku, and F. Ertas, Speaker identification from shouted speech: analysis and compensation, in Proc. ICASSP 2013, pp. 8027-8031, Vancouver, Canada, May 2013.
  15. A. Hurmalainen, R. Saeidi and T. Virtanen, Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition, in Proc. Interspeech 2012, pp. 2138-2141, Portland, September 2012.
  16. P. Mowlaee, R. Saeidi, R. Martin, Phase estimation for signal reconstruction in single-channel source separation, in Proc. Interspeech 2012, pp. 1548-1551, Portland, US, September 2012.
  17. R. Saeidi, A. Hurmalainen, T. Virtanen and D. A. van Leeuwen, Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification, in Proc. Odyssey: the speaker and language recognition workshop, pp. 248-255, Singapore, June 2012.
  18. T. Kinnunen, R. Saeidi, J. Leppanen, J.P. Saarinen, Audio context recognition in variable mobile environments from short segments using speaker and language recognizers, in Proc. Odyssey: the speaker and language recognition workshop, pp. 304-311, Singapore, June 2012.
  19. C. Hanilci, T. Kinnunen, R. Saeidi, J. Pohjalainen, P. Alku, F. Ertas, Regularization of all-pole models for speaker verification under additive noise, in Proc. Odyssey: the speaker and language recognition workshop, pp. 236-242, Singapore, June 2012.
  20. P. Mowlaee, R. Saeidi, M. G. Christensen, R. Martin, Subjective and Objective Quality Assessment of Single-channel Speech Separation Algorithms, in Proc. ICASSP 2012, pp. 69-72, Kyoto, Japan, March 2012.
  21. C. Hanilci, T. Kinnunen, R. Saeidi, J. Pohjalainen, P. Alku, F. Ertas, J. Sandberg, M. Hansson-Sandsten, Comparing Spectrum Estimators in Speaker Verification Under Additive Noise Degradation, in Proc. ICASSP 2012, pp. 4769-4772, Kyoto, Japan, March 2012.
  22. P. Mowlaee, R. Saeidi, R. Martin, Model-driven speech enhancement for multisource reverberant environment (Signal Separation Evaluation Campaign SiSEC 2011), in Proc. International Conference on Latent Variable Analysis and Source Separation LVA/ICA 2012, Springer LNCS 7191, pp. 454-461, Tel Aviv, Israel, March 2012.
  23. J. Rodriguez-Fuentes, M. Penagarikano, A. Varona, M. Diez, G. Bordel, D. Martinez, J. Villalba, A. Miguel, A. Ortega, E. Lleida, A. Abad, O. Koller, I. Trancoso, P. Lopez-Otero, L. Docio-Fernandez, C. Garcia-Mateo, R. Saeidi, M. Soufifar, T. Kinnunen, T. Svendsen, P. Franti, Multi-Site Heterogeneous System Fusions for the Albayzin 2010 Language Recognition Evaluation, in Proc. ASRU 2011, pp. 377-382, Hawaii, US, December 2011.
  24. P. Mowlaee, R. Saeidi, Z.-H. Tan, M. G. Christensen, T. Kinnunen, S. H. Jensen, P. Franti, Sinusoidal-based Approach for Single-Channel Speech Separation Challenge, in Proc. Interspeech 2011, pp. 677-680, Florence, Italy, August 2011.
  25. D. Kolossa, R. F. Astudillo, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J. P. da Silva Neto, R. Martin, CHiME Challenge: Approaches to Robustness using Beamforming and Uncertainty-of-Observation Techniques, in Proc. CHiME 2011 - Workshop on Machine Listening in Multisource Environments, Interspeech 2011 satellite event, pp. 6-11, Florence, Italy, September 2011.
  26. R. Saeidi, P. Mowlaee, T. Kinnunen, Z.-H. Tan, M. G. Christensen, S. H. Jensen and P. Franti, Improving Monaural Speaker Identification by Double-Talk Detection, in Proc. Interspeech 2010, pp. 1069-1072, Makuhari, Japan, September 2010.
  27. J. Pohjalainen, R. Saeidi, T. Kinnunen, and P. Alku, Extended Weighted Linear Prediction (XLP) Analysis of Speech and its Application to Speaker Verification in Adverse Conditions, in Proc. Interspeech 2010, pp. 1477-1480, Makuhari, Japan, September 2010.
  28. T. Kinnunen, R. Saeidi, J. Sandberg, and M. Hansson-Sandsten, What Else is New Than the Hamming Window? Robust MFCCs for Speaker Recognition via Multitapering, in Proc. Interspeech 2010, pp. 2734-2737, Makuhari, Japan, September 2010.
  29. R. Saeidi, J. Pohjalainen, T. Kinnunen, and P. Alku, Temporally Weighted Linear Prediction Features for Speaker Verification in Additive Noise, in Proc. Odyssey 2010: The Speaker and Language Recognition Workshop, pp. 40-46, Brno, Czech Republic, June 2010.
  30. R. Saeidi, P. Mowlaee, T. Kinnunen, Z.-H. Tan, M. G. Christensen, S. H. Jensen and P. Franti, Signal-to-signal ratio independent speaker identification for co-channel speech signals, in Proc. ICPR 2010, pp. 4565-4568, Turkey, August 2010.
  31. R. Saeidi, T. Kinnunen, H. R. Sadegh Mohammadi, R. D. Rodman and P. Franti, Joint Frame and Gaussian Selection for Text Independent Speaker Verification, in Proc. ICASSP 2010, pp. 4530-4533, Dallas, US, 2010.
  32. P. Mowlaee, R. Saeidi, Z.-H. Tan, M. G. Christensen, P. Franti, and S. H. Jensen, Joint Single-Channel Speech Separation and Speaker Identification, in Proc. ICASSP 2010, pp. 4430-4433, Dallas, US, 2010.
  33. R. Saeidi, H. R. Sadegh Mohammadi, T. Ganchev, R. D. Rodman, Effects of Feature Domain Normalizations on Text Independent Speaker Verification Using Sorted Adapted GMMs, International CSI Computer Conference CSICC, CCIS 6, pp. 493-500, Springer-Verlag Berlin Heidelberg, 2008.
  34. R. Saeidi, H. R. Sadegh Mohammadi, T. Ganchev, R. D. Rodman, Hierarchical Mixture Clustering and its Application to GMM Based Text Independent Speaker Identification, in Proc. International Symposium on Telecommunications IST 2008, pp. 770-773, Tehran, Iran, 2008.
  35. R. Saeidi, T. Ganchev, H. R. Sadegh Mohammadi, Text Independent Speaker Verification Using enhanced Sorted Gaussian Mixture Model, in Proc. ICSPC 2007, vol. I, pp. 1191-1194, Dubai, UAE, 2007.
  36. H. R. Sadegh Mohammadi, R. Saeidi, Speaker Identification Performance Enhancement Using Gaussian Mixture Model with GMM Classification Post-Processor, in Proc. ICSPC 2007, vol. I, pp. 501-504, Dubai, UAE, 2007.
  37. R. Saeidi, H. R. Sadegh Mohammadi, R. D. Rodman, and T. Kinnunen, A new segmentation algorithm combined with transient frames power for text independent speaker verification, in Proc. ICASSP 2007, vol. IV, pp. 305-308, Las Vegas, USA, 2007.
  38. H. R. Sadegh Mohammadi, R. Saeidi, M. R. Rohani, and R. D. Rodman, Combined inter-frame and intra-frame fast scoring methods for efficient implementation of GMM-based Speaker Verification Systems, in Proc. ICASSP 2007, vol. IV, pp. 309-312, Las Vegas, USA, 2007.
  39. H. R. Sadegh Mohammadi, R. Saeidi, Efficient Implementation of GMM Based Speaker Verification Using Sorted Gaussian Mixture Model, in Proc. EUSIPCO 2006, Florence, Italy, 2006.
  40. R. Saeidi, H. R. Sadegh Mohammadi, M. Khalaj Amirhosseini, An Efficient GMM Classification Post-Processing Method For Structural Gaussian Mixture Model Based Speaker Verification, in Proc. ICASSP 2006, vol. I, pp. 909-912, Toulouse, France, 2006.

Other Publications
  • P. Mowlaee, M. K. Watanabe, R. Saeidi, Show & Tell: Iterative Refinement of Amplitude and Phase in Single-channel Speech Enhancement, in Proc. Show and Tell Interspeech 2014, Singapore, September 2014 (pdf).
  • P. Mowlaee, M. K. Watanabe, R. Saeidi, Phase-Aware Single-Channel Speech Enhancement, in Proc. Show and Tell Interspeech 2013, Lyon, August 2013 (pdf).
  • R. Saeidi, D. A. van Leeuwen, The Radboud University Nijmegen submission to NIST SRE-2012, in Proc. SRE12 workshop, Orlando, December 2012 (pdf).
  • M. I. Mandasari, R. Saeidi and D. A. van Leeuwen, A Study Of Likelihood Ratio Calibration In High Vocal Effort Speech For A Modern Automatic Speaker Recognition System, in Proc. International Association of Forensic Phonetics and Acoustics 2012 Annual Conference (IAFPA 2012), Santander, Spain, August 2012 (pdf).
  • R. Saeidi, T. Kinnunen, Alternative Spectrum Estimation in MFCCs for Speaker Verification: NIST SRE'08 and SRE'10 corpora, in Proc. SRE11 analysis workshop, Atlanta, December 2011 (pdf).

Copyright Notice

IEEE-Copyrighted Material: Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: +Intl. 908-562-3966.