Dr. Prasenjit Mitra

Principal Scientist
Social Computing

Storing, managing, retrieving, and mining Big Data is one of the most difficult computing challenges of our times. Along with my colleagues in the Data Analytics Group at QCRI, I am interested in enabling end-users to utilize large datasets to the fullest by designing infrastructure and algorithms, and applying data and text mining techniques.

Research Focus at QCRI

At QCRI, Prasenjit will focus on infrastructure for scalable data mining and machine learning techniques, with applications in various domains such as digital libraries and health informatics.

Previous Experience

Prior to joining QCRI, Prasenjit was a Professor of Information Sciences and Technology at The Pennsylvania State University. He was a member of the graduate faculty in Computer Science and Engineering and an affiliate faculty member in the Department of Industrial and Manufacturing Engineering. Before his time at Penn State, Prasenjit worked as a senior member of the technical staff at Oracle Corporation in their Server Technologies group on Massively Parallel Processing and Languages and Relational Technologies.

Prasenjit’s research has been funded by a CAREER award from the U.S. National Science Foundation, as well as by the Department of Homeland Security, the National Geospatial-Intelligence Agency, Microsoft, Dow, and Lockheed Martin. He has co-authored 17 journal papers and over 125 conference papers in top venues. He has also supervised over 10 PhD students and has served as co-chair, area chair, and senior PC member at top conferences including IEEE Society, CIKM, and IJCAI, respectively.


Professional Experience

  • Professor, College of Information Sciences and Technology, The Pennsylvania State University - 2014
  • Associate Professor, College of Information Sciences and Technology, The Pennsylvania State University - 2010 - 2014
  • Assistant Professor, College of Information Sciences and Technology, The Pennsylvania State University - 2003 - 2010
  • Graduate Faculty, Department of Computer Science and Engineering, The Pennsylvania State University - 2003
  • Chief Scientist (Consulting), Global IDs, New York - 2008 - 2012
  • Senior Software Engineer, DBWizards, California - 2002
  • Senior Software Engineer, Narus, California - 2001 - 2002
  • Research Assistant, Department of Computer Science, Stanford University - 1999 - 2003
  • Technical Staff, Oracle Corporation - 1995 - 2000

Professional Associations and Awards

  • Member, ACM
  • Member, IEEE

  • NSF CAREER Award - 2009 - 2014
  • IEEE VAST Challenge - 2008


Education

  • PhD in Electrical Engineering, Stanford University, California - 2004
  • MS in Computer Science, University of Texas at Austin, Texas - 1994
  • BS in Computer Science and Engineering, Indian Institute of Technology - 1993

Selected Research

  • Anuj R. Jaiswal, David J. Miller, Prasenjit Mitra: Schema matching and embedded value mapping for databases with opaque column names and mixed continuous and discrete-valued data fields. ACM Trans. Database Syst. (TODS) 38(1):2 (2013)
  • Dayu Yuan, Prasenjit Mitra: Lindex: a lattice-based index for graph databases. VLDB J. (VLDB) 22(2):229-252 (2013)
  • Sumit Bhatia, Cornelia Caragea, Hung-Hsuan Chen, Jian Wu, Pucktada Treeratpituk, Zhaohui Wu, Madian Khabsa, Prasenjit Mitra, C. Lee Giles: Specialized Research Datasets in the CiteSeerx Digital Library. D-Lib Magazine 18(7/8) (2012)
  • Sumit Bhatia, Prasenjit Mitra: Summarizing figures, tables, and algorithms in scientific publications to augment search results. ACM Trans. Inf. Syst. 30(1): 3 (2012)

Peer-reviewed conference proceedings

  • Prakhar Biyani, Cornelia Caragea, Prasenjit Mitra: Predicting Subjectivity Orientation of Online Forum Threads. CICLing 2013:109-120.
  • Cornelia Caragea, Adrian Silvescu, Prasenjit Mitra: Combining Hashing and Abstraction in Sparse High Dimensional Feature Spaces. AAAI 2012.
  • Jing Fang, Prasenjit Mitra, Zhi Tang, C. Lee Giles: Table Header Detection and Classification. AAAI 2012.
  • Wenyi Huang, Saurabh Kataria, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles, Lior Rokach: Recommending citations: translating papers into references. CIKM 2012:1910-1914.
  • Prakhar Biyani, Cornelia Caragea, Amit Singh, Prasenjit Mitra: I want what i need!: analyzing subjectivity of online forum threads. CIKM 2012:2495-2498.
  • George H. L. Fletcher, Prasenjit Mitra: WIDM 2012: the 12th international workshop on web information and data management. CIKM 2012:2778-2779.
  • Prakhar Biyani, Sumit Bhatia, Cornelia Caragea, Prasenjit Mitra: Thread Specific Features are Helpful for Identifying Subjectivity Orientation of Online Forum Threads. COLING 2012:295-310.
  • Xiao Zhang, Baojun Qiu, Prasenjit Mitra, Sen Xu, Alexander Klippel, Alan M. MacEachren: Disambiguating Road Names in Text Route Descriptions using Exact-All-Hop Shortest Path Algorithm. ECAI 2012: 876-881.
  • Sujatha Das, Prasenjit Mitra, C. Lee Giles: Phrase Pair Classification for Identifying Subtopics. ECIR 2012: 489-493.
  • Dayu Yuan, Prasenjit Mitra, Huiwen Yu, C. Lee Giles: Iterative Graph Feature Mining for Graph Indexing. ICDE 2012: 198-209.
  • Yan Zhang, Prasenjit Mitra, John Yen, Wai-Tat Fu: Panel on social media for consumer health. IHI 2012: 889-890.
  • Sujatha Das Gollapalli, Prasenjit Mitra, C. Lee Giles: Similar researcher search in academic environments. JCDL 2012: 167-170.
  • Suppawong Tuarob, Prasenjit Mitra, C. Lee Giles: Improving algorithm search using the algorithm co-citation network. JCDL 2012: 277-280.
  • Sooyoung Oh, Zhen Lei, Prasenjit Mitra, John Yen: Evaluating and ranking patents using weighted citations. JCDL 2012: 281-284.
  • Jian Wu, Pradeep B. Teregowda, Juan Pablo Fernández Ramírez, Prasenjit Mitra, Shuyi Zheng, C. Lee Giles: The evolution of a crawling strategy for an academic document search engine: whitelists and blacklists. WebSci 2012: 340-343.
  • Jian Wu, Pradeep B. Teregowda, Madian Khabsa, Stephen Carman, Douglas Jordan, Jose San Pedro Wandelmer, Xin Lu, Prasenjit Mitra, C. Lee Giles: Web crawler middleware for search engine digital libraries: a case study for citeseerX. WIDM 2012: 57-64.
  • George H. L. Fletcher, Prasenjit Mitra (Eds.): Proceedings of the Twelfth International Workshop on Web Information and Data Management, WIDM 2012, Maui, HI, USA, November 02, 2012. ACM 2012, isbn 978-1-4503-1720-7.
