Birla Institute of Technology, Mesra
Dr. Sujan Kumar Saha
Associate Professor, Computer Science and Engg
Associate Dean (Undergraduate Studies): adugs@bitmesra.ac.in. Ph.D. (IIT Kharagpur, 2010), M.Tech (IIT Delhi, 2005), B.Tech (Kalyani Govt. Engg. College, 2003)
Contact Address
Permanent Address Burdwan, West Bengal
Local Address BIT Campus, Mesra, Ranchi - 835215
Phone (Office) 00
Phone Residence 00
Email Id sujan.kr.saha@gmail.com
Joined Institute on : 5-Mar-2010

  Work Experience
 
Teaching : 12 Years

Research : 16 Years

  Research Areas
 

Natural Language Processing, Education Technology, Health Informatics

Ph.D. Students & Their Area:

Degree Awarded

1. Dr. Mukta Majumder (Awarded on 27 February 2018). Area: Named Entity Recognition. [currently: Assistant Professor, CSE, University of North Bengal]

2. Dr. Rakesh Patra (Awarded on 01 November 2019). Area: ML for Efficient Information Extraction. [currently: Assistant Professor, NIST Berhampur, Odisha]

3. Dr. Dhawaleswar Rao CH. (Awarded on 08 July 2020). Area: Automatic Question Generation from Text, Answer evaluation, and Learning Platform Design. [currently: Assistant Professor, CSE, KL University, Guntur, AP]

4. Dr. Amit Prakash (Awarded on 12 January 2021). Area: Information Retrieval. Title: Development of IR Systems in Cross-lingual, Mixed-Script, and Remedy Finding Applications. [currently: CSE, BIT Mesra]

5. Dr. Ankur Priyadarshi (Awarded on 04 October 2021). Area: POS tagging, NER and Stemmer in Maithili Language. [currently: CV Raman Global Univ., Bhubaneswar]

 

Ongoing Students

6. Mr. Shubhojeet Paul. (On-going since July 2019). Area: Processing (Speech) of Jharkhand Tribal Languages.

7. Mr. Gaurav Kumar Pandey. (Ongoing since MO 2021). Area: Handwritten Character Recognition.
  Publications
 

 

In International Journals [SCI papers – SCIE or SSCI] [Cumulative Impact Factor: 64.034]

  1. Ankur Priyadarshi, Sujan Kumar Saha. (2022 – online) A study on the performance of Recurrent Neural Network based models in Maithili Part of Speech Tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP). https://dl.acm.org/doi/10.1145/3540260. [IF – 1.471]
  2. Amit Prakash, Niraj Kumar Singh, Sujan Kumar Saha. (2022) Automated analysis of Hindi poetry for Rasa based classification. ETRI Journal, Wiley. DOI: 10.4218/etrij.2019-0396. [IF 1.622]
  3. Sujan Kumar Saha and Dhawaleswar Rao CH (2022). Development of a Practical System for Computerized Evaluation of Descriptive Answers of Middle School Level Students. Interactive Learning Environments, Taylor & Francis. Volume 30, Issue 2, Pages 215-228. DOI: 10.1080/10494820.2019.1651743. [IF 4.965 – Q1]
  4. Ankur Priyadarshi and Sujan Kumar Saha. (2021). The First Named Entity Recognizer in Maithili: Resource Creation and System Development. Journal of Intelligent and Fuzzy Systems, IOS Press. Vol. 41 (2021) 1083–1095. DOI: 10.3233/JIFS-210051. [SCI 1.737]
  5. Ankur Priyadarshi, Sujan Kumar Saha. (2020) Towards the First Maithili Part of Speech Tagger: Resource Creation and System Development. Computer Speech & Language, Elsevier, Vol. 62 (July 2020), 101054. ISSN: 0885-2308. DOI: https://doi.org/10.1016/j.csl.2019.101054 [IF 3.252 – Q3]
  6. Dhawaleswar Rao CH, Sujan Kumar Saha. (2020). Automatic Multiple Choice Question Generation from Text: A Survey. IEEE Transactions on Learning Technologies. Vol. 13, Issue 1, pp. 14-25. ISSN: 1939-1382. DOI: 10.1109/TLT.2018.2889100 [SCIE SSCI 4.433 – Q1]
  7. Sujan Kumar Saha and Rushali Gupta. (2020). Adopting computer-assisted assessment in evaluation of handwritten answer books: An experimental study. Education and Information Technologies, Springer. Vol. 25, Issue 6, pages 4845–4860 (Nov. 2020). ISSN: 1360-2357. DOI: 10.1007/s10639-020-10192-6. [IF 3.666 – Q1]
  8. Sujan Kumar Saha, Amit Prakash, Mukta Majumder. (2019). "Similar Query was Answered Earlier": Processing of Patient Authored Text for Retrieving Relevant Contents from Health Discussion Forum. Health Information Science and Systems, Springer, 7: 4, DOI: 10.1007/s13755-019-0067-3  [SCOPUS, SCIE]. [IF 5.017 – Q2]
  9. Dhawaleswar Rao CH, Sujan Kumar Saha. (2019). A Computer Assisted Learning Platform for Efficient Biology Learning of Secondary School Level Students. Journal of Educational Computing Research. Special Issue on “Augmented & Virtual Reality in Education: Immersive Learning Research”. Volume 57, Number 7, Pages 1671-1694. [IF 4.345 – Q1]
  10. Amit Prakash, Sujan Kumar Saha. (2019). A Study on Use of the Web for Automatic Answering of Remedy Finding Questions of Common Users. Technology and Health Care, IOS Press. Vol. 27, No. 1, pp. 23-35, published 24 January 2019, ISSN: 0928-7329. [SCIE IF 1.205]
  11. Dhawaleswar Rao CH, Sujan Kumar Saha. (2019) “RemedialTutor: A Blended Learning Platform for Weak Students and Study its Efficiency in Social Science Learning of Middle School Students in India”. Special issue on “Blended Learning in South Asia”, Education and Information Technologies, Springer, Volume 24, Issue 3, pp 1925–1941. ISSN: 1360-2357 (Print) 1573-7608 (Online). [IF 3.666 – Q1].
  12. Rakesh Patra, Sujan Kumar Saha. (2019) A Hybrid Approach for Automatic Generation of Named Entity Distractors for Multiple Choice Questions. Education and Information Technologies, Springer. ISSN: 1360-2357 (Print) 1573-7608 (Online). Volume 24, Issue 2, pp 973–993. [IF 3.666 – Q1].
  13. Sujan Kumar Saha, Mukta Majumder. 2018. Development of a Hindi Named Entity Recognition System without Using Manually Annotated Training Corpus. International Arab Journal of Information Technology. Vol. 15, No. 6. Pages 1088 – 1098. [SCIE IF 0.967]
  14. Rakesh Patra, Sujan Kumar Saha. 2013. A Kernel Based Approach for Biomedical Named Entity Recognition. The Scientific World Journal, Hindawi, Article id: 950796, 7 pages.  ISSN: 2356-6140. [2013 JCR Impact Factor 1.73]
  15. Mukta Majumder, N Das, Sujan Kumar Saha. 2013. A Novel Technique for Multiple Faults and Their Location Detection and Start Electrode Selection in Microfluidic Digital BioChip. Journal of Innovative Optical Health Sciences, World Scientific Press. Vol. 6, No. 4. Pages 1350032-1 to 1350032-8. [SCIE Q3 2.396].
  16. Sujan Kumar Saha, Pabitra Mitra and Sudeshna Sarkar. 2012. A Comparative Study on Feature Reduction Approaches in Hindi and Bengali named entity recognition. Knowledge-Based Systems, Elsevier, Vol. 27, pages 322 – 332. ISSN 0950-7051. [IF 8.139 – Q1]
  17. Sujan Kumar Saha, Shashi Narayan, Sudeshna Sarkar and Pabitra Mitra. 2010.  A Composite Kernel for Named Entity Recognition. Pattern Recognition Letters, Elsevier, Vol. 31, No. 12, pages 1591 – 1597. ISSN: 0167-8655. [IF 4.757 – Q2].
  18. Sujan Kumar Saha, Sudeshna Sarkar and Pabitra Mitra. 2009. Feature Selection Techniques for Maximum Entropy Based Biomedical Named Entity Recognition. Journal of Biomedical Informatics, Elsevier, Vol. 42, No. 5, pages 905 – 911. ISSN 1532-0464. [IF 8.0 – Q1] 

 

In International Journals [SCOPUS unpaid / ESCI papers]

  1. Sujan Kumar Saha. (2021). Towards Development of a System for Automatic Assessment of a Question Paper. Smart Learning Environments (Springer, SCOPUS). 8: 4(2021). https://doi.org/10.1186/s40561-021-00148-9. 
  2. Rakesh Patra and Sujan Kumar Saha. (2020). Utilizing External Corpora Through Kernel Function: Application in Biomedical Named Entity Recognition. Progress in Artificial Intelligence, Springer. DOI: 10.1007/s13748-020-00208-0.
  3. Ankur Priyadarshi, Sujan Kumar Saha. (2020). Web Information Extraction for Finding Remedy Based on a Patient-Authored Text: A Study on Homeopathy. Network Modeling Analysis in Health Informatics and Bioinformatics, Springer. Volume 9, Article number: 9 (2020). DOI: 10.1007/s13721-019-0216-2.
  4. Rakesh Patra, Sujan Kumar Saha. (2019). A Novel Word Clustering and Cluster Merging Technique for Named Entity Recognition. Journal of Intelligent Systems. Vol. 28, Issue 1, pages 15-30. ISSN 2191-026X [Web of Science, SCOPUS]
  5. Mukta Majumder, Sujan Kumar Saha. 2014. Automatic Selection of Informative Sentences: The Sentences That Can Generate Multiple Choice Questions. Knowledge Management & E-Learning: An International Journal. Vol. 6, No. 4, pages 377-391. ISSN: 2073-7904. [ESCI, SCOPUS]
  6. Mukta Majumder, Sujan Kumar Saha. 2014. Use of Global Context for Handling Noisy Names in Discussion Texts of a Homeopathy Discussion Forum. Knowledge Management & E-Learning: An International Journal. Vol. 6, No. 1, pages 18 – 29. ISSN: 2073-7904. [ESCI, SCOPUS]
  7. Mukta Majumder, Utsav Barman, Rahul Prasad, Kumar Saurabh, Sujan Kumar Saha. 2012. A novel technique for name identification from homeopathy diagnosis discussion forum. Elsevier Procedia Technology. Vol 6. pages 379-386. [SCOPUS]

 

In International Conferences

  1. Shail Bala and Sujan Kumar Saha. 2022. Comparative Study of Maithili Pre-Trained Embeddings on Named Entity Recognition Task. Fifth International Conference on IoT, Cloud Computing and Data Science (IRCICD2022). SRM University, Chennai, India. May 6-7, 2022. (Best Paper Award)
  2. Shubhojeet Paul, Sujan Kumar Saha and Vandana Bhattacharjee. 2022. A Deep Learning based system for Continuous Speech Recognition in Hindi for Healthcare Domain. 7th International Conference on Emerging Applications of Information Technology (EAIT 2022). Kolkata, March 29-31, 2022.
  3. Ankur Priyadarshi and Sujan Kumar Saha. 2019. A study on the Importance of Linguistic Suffixes in Maithili POS Tagger Development. MIKE 2019: Seventh International Conference on Mining Intelligence and Knowledge Exploration. 19-22 December, NIT Goa.
  4. Ankur Priyadarshi and Sujan Kumar Saha. 2019. A Hybrid Approach to Develop the First Stemmer in Maithili. FIRE 2019: 11th meeting of the Forum for Information Retrieval Evaluation (FIRE, 19) 12-15 December, ISI Kolkata.
  5. Shashank Srigiri, Sujan Kumar Saha. 2018. Spelling Correction of OCR Generated Hindi Text Using Word Embedding and Levenshtein Distance. 4th International Conference on Nano-electronics, Circuits & Communication Systems (NCCS-2018) on 3-4th Nov-2018, at ARTTC, BSNL, Ranchi.
  6. Ambuj Mishra, Rakesh Ranjan, Neelansh and Sujan Kr. Saha. 2017. BIT Mesra @ SAIL CodeMixed-2017: Majority Voting of SVM and NB classifiers for Sentiment Analysis of Hindi-English Code-Mixed Text. In International Conference on Natural Language Processing (ICON 2017). Jadavpur University. Dec 18-21, 2017. [2nd Prize in NLP tool contest]
  7. Sumeet Pannu, Aishwarya Krishna, Shiwani Kumari, Rakesh Patra and Sujan Kumar Saha. 2017.  Automatic Generation of Fill-in-the-blank Questions from History Books for School Level Evaluation. In International Conference on Computing Analytics and Networking (ICCAN 2017), KIIT Bhubaneswar. In: Pattnaik P., Rautaray S., Das H., Nayak J. (eds) Progress in Computing, Analytics and Networking. Advances in Intelligent Systems and Computing, vol 710. Springer, Singapore, pages 461-469. Dec 15-16, 2017.
  8. Rakesh Patra and Sujan Kumar Saha. 2017. Automatic Generation of Named Entity Distractors of Multiple Choice Questions using Web Information. In International Conference on Computing Analytics and Networking (ICCAN 2017), KIIT Bhubaneswar. Book Chapter in “Progress in Computing, Analytics and Networking”, pages 511-518. Dec 15-16, 2017.
  9. Harsh Ranjan, Sumit Agarwal, Amit Prakash and Sujan Kumar Saha. 2017. Automatic Labelling of Important Terms and Phrases from Medical Discussions. In Conference on Information and Communication Technology (CICT 2017), ABV-IITM Gwalior, Nov 3-5, 2017.         
  10. Dheeraj Kumar Paras, Sudhanshu Nandan, Vikash Singh, Sujan Kumar Saha. 2016. An Automatic Domain Specific Speech Recognition System in Maithili. RegICON 2016. IIT BHU, India. December 16, 2016.
  11. Shlok Kumar Mishra, Pranav Kumar, and Sujan Kumar Saha. 2015. A Support Vector Machine Based System for Technical Question Classification. Third International Conference on Mining Intelligence and Knowledge Exploration (MIKE 2015). Springer LNCS Volume 9468, pp 640-649. IIIT Hyderabad, India. December 9 - 11, 2015.
  12. Amit Prakash, Sujan Kumar Saha. 2015. A Comparative Study on Different Translation Approaches for Query Formation in the Source Retrieval Task. 7th meeting of Forum for Information Retrieval Evaluation (FIRE 2015).  DAIICT, Gandhinagar. December 4 - 6, 2015.
  13. Nimesh Ghelani, Sujan Kumar Saha, Amit Prakash. 2015. Mixed Script Ad hoc Retrieval using back transliteration and phrase matching through bigram indexing: Shared Task report by BIT, Mesra. 7th meeting of Forum for Information Retrieval Evaluation (FIRE 2015). Pages: 61-64. DAIICT, Gandhinagar. December 4 - 6, 2015.
  14. Mukta Majumder, Sujan Kumar Saha. 2015. A System for Generating Multiple Choice Questions: With a Novel Approach for Sentence Selection. Proceedings of The 2nd ACL Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA), pages 64–72, Beijing, China, July 31, 2015.
  15. Amit Prakash, Sujan Kumar Saha. 2014. A Relevance Feedback Based Approach for Mixed Script Transliterated Text Search. FIRE 2014 Workshop on Transliteration Search. 6th meeting of Forum for Information Retrieval Evaluation (FIRE 2014). Indian Statistical Institute, Bangalore, December 5 – 7, 2014.
  16. Amit Prakash, Sujan Kumar Saha. 2014. Experiments on Document Chunking and Query Formation for Plagiarism Source Retrieval. Notebook for PAN at CLEF 2014. Proceedings of the CLEF2014 Working Notes. Pages 990-996. ISSN 1613-0073. Sheffield, UK, September 15-18, 2014. [Highest recall in the shared task competition on Plagiarism Detection 2014]
  17. Satyam, Anand, A. K. Dawn, and S. K. Saha. 2014. A Statistical Analysis Approach to Author Identification Using Latent Semantic Analysis. Notebook for PAN at CLEF 2014. Proceedings of the CLEF2014 Working Notes. Pages 1143-1147. ISSN 1613-0073. Sheffield, UK, September 15-18, 2014. [Highest accuracy in Dutch reviews and 2nd highest accuracy in English essays]
  18. Mukta Majumder, Sujan Kumar Saha. 2014. Development of NER System for Wikipedia without using Wikipedia text as training data: Sports (Cricket) a case study. Proceedings in EIIC: The 3rd Electronic International Interdisciplinary Conference. ISSN: 1338-7871.
  19. Arjun Singh Bhatia, Manas Kirti, and Sujan Kumar Saha. 2013. Automatic Generation of Multiple Choice Questions using Wikipedia. Proc. of Pattern Recognition and Machine Intelligence (PReMI-13), LNCS Vol. 8251, pages 733 – 738. ISI Kolkata, December 10-14, 2013.
  20. Mukta Majumder, Utsav Barman, Rahul Prasad, Kumar Saurabh, Sujan Kumar Saha. 2012. A novel technique for name identification from homeopathy diagnosis discussion forum. 2nd International Conference on Communication, Computing & Security. National Institute of Technology Rourkela, October 6-8, 2012. Elsevier Procedia Technology. Vol 6. pages 379-386.  
  21. Sujan Kumar Saha, Pabitra Mitra and Sudeshna Sarkar. 2009. A Semi-supervised Approach for Maximum Entropy Based Hindi Named Entity Recognition. Proc. of Pattern Recognition and Machine Intelligence (PReMI-09), IIT Delhi, December 16–20, 2009. LNCS Vol. 5909/2009, pages 225 – 230, New Delhi, India.
  22. Sujan Kumar Saha, Pabitra Mitra and Sudeshna Sarkar. 2008. Word Clustering and Word Selection Based Feature Reduction for MaxEnt Based NER in Hindi. Proc. of the Association for Computational Linguistics (ACL-08:HLT), pages 488 – 495, Columbus, Ohio. June 15-20, 2008.
  23. Sujan Kumar Saha, Sudeshna Sarkar and Pabitra Mitra. 2008. A Hybrid Feature Set Based Maximum Entropy Hindi Named Entity Recognition. Proc. of the 3rd International Joint Conference on Natural Language Processing (IJCNLP 2008), pages 343 – 349, January 7-12, 2008, IIIT Hyderabad, India.
  24. Sujan Kumar Saha, Sanjay Chatterjee, Sandipan Dandapat, Sudeshna Sarkar and Pabitra Mitra. 2008. A Hybrid Named Entity Recognition System for South and South East Asian Languages. Proc. of the IJCNLP 2008 Workshop on Named Entity Recognition for South and South East Asian Languages (NERSSAL 2008), pages 17 – 24, Hyderabad, India. [Best result in the NERSSAL 2008 shared task]
  25. Sujan Kumar Saha, Sudeshna Sarkar and Pabitra Mitra. 2008. Gazetteer Preparation for Named Entity Recognition in Indian Languages. Proc. of the IJCNLP-08 Workshop on Asian Language Resources (ALR6 2008), pages 9 – 16, Hyderabad, India.
  26. Sujan Kumar Saha, Sudeshna Sarkar and Pabitra Mitra. 2008. A Novel Semi-supervised Method for Named Entity Detection. Proc. of the 6th International Conference on Natural Language Processing (ICON 2008), CDAC Pune, India, December 21-22, 2008.
  27. Sujan Kumar Saha, Sudeshna Sarkar and Pabitra Mitra. 2007. Transliteration Based Gazetteer Preparation for Named Entity Recognition in Hindi. Proc. of the 7th International Symposium on Natural Language Processing (SNLP 2007), Pattaya, Thailand. December 13-15, 2007.
  28. Sujan Kumar Saha, Lipika De. 2006. Ontology Based Text Information Retrieval. Proc. of the International Conference on Emerging Applications of IT (EAIT 2006),  February 10-11, 2006, Elsevier, pages 45 – 48, Kolkata, India. [One of the selected best papers]

 

  Current Sponsored Projects
 

 

Funded Research Projects:

 

  1. Title:   Automatic Speech Recognition System in Santali and Nagpuri Languages: Resource Creation and System Development.

Role:               Principal Investigator (PI).

Co-PI:             NIL.

Funded By:     SERB (Science and Engg. Research Board), Govt. of India.

Duration:         3 years (Started in March 2022).

Budget:          ~30 Lakhs.

 

  2. Title:   System for Automatic Evaluation of Handwritten Answer Books in Bengali.

Role:               Principal Investigator (PI).

Co-PI:             NIL.

Funded By:     SERB (Science and Engg. Research Board), Govt. of India.

Duration:         3 years (Started in March 2022)

Budget:      ~20 Lakhs.

 

Completed Projects:

  1. Title:   Automatic Question Generation and Evaluation Based System for Instant Assessment of Learning in School Level.

             Role:               Principal Investigator (PI).

             Co-PI:             NIL.

             Funded By:     SERB (Science and Engg. Research Board), Govt. of India.

             Duration:         3 years (Completed on 1st May 2019).

 

  2.    Title:  Development of Basic Natural Language Processing Tools and Resources for Maithili.

              Role:               Principal Investigator (PI).

              Co-PI:             NIL.

              Funded By:     SERB (Science and Engg. Research Board), Govt. of India.

              Duration:         3 years (Completed on  15 March, 2020).

       

  Text and Reference Books
 

Book Title: Advances in Computational Intelligence (ISSN: 2194-5357)

Editors: Sudip Kr Sahana and Sujan Kr Saha

Publisher: Springer

URL: http://link.springer.com/book/10.1007/978-981-10-2525-9