Projects

Can Large Language Models (LLMs) Replicate Human Counselor Responses?
  • Over 50% of the global population grapples with mental illnesses or disorders. Earlier efforts to build Artificial Intelligence (AI)-powered chatbots for treating mental health issues relied on algorithms and frameworks that fell short of replicating the nuances of counseling as practiced by human counselors. This project explores the potential of AI and Large Language Models to provide counseling support, with the ultimate objective of making counseling services more accessible. Hallmarks of good counselors (e.g., lengthier responses and a high proportion of positive language) are identified through a literature review, and some of them are used as points of comparison between human counselor responses and LLM-generated responses (in our case, from Dolly) to mental health patient questions in the CounselChat Data.
  • Full Code
  • Write-Up
Comparing Depression Levels of People during the 2008 Subprime Mortgage Crisis vs. the 2020 COVID-19 Pandemic
  • This project was for the 2020 Vassar College DataFest sponsored by the college and the American Statistical Association.
  • Did people feel more depressed during the COVID-19 pandemic than during the 2008-2009 Subprime Mortgage Crisis? Did the U.S. government's responses help alleviate people's anxiety/depression? If so, which policies / responses? We explore these questions using four datasets: 2009 Tweets Data, Tweets from the early Covid period plus Tweets from April 2020, Reddit Comments Data from the depression and AskReddit sections, and the Oxford Covid-19 Government Response Dataset.
  • Write-Up
  • Slides
  • Full Code
Do players in certain positions (e.g. Center, Guard, Forward) have more Instagram followers, on average, than those in other positions?
  • MATH347 (Bayesian Inference) Final Project
  • Do players in certain positions (e.g. Center, Guard, Forward) have more Instagram followers, on average, than those in other positions? How do the prior and posterior distributions compare to each other?
  • How do we choose priors? Which model should we use? If we use the hierarchical model, for which position is the shrinkage / pooling effect the most prominent? What does this mean?
  • Slides
  • Full Code
  • MATH301 Final Project
  • Build a model that outperforms regression at predicting Airbnb prices in NYC.
  • Offer insights into how various characteristics of New York City Airbnb listings, especially gentrification trends, changed between 2017 and 2020.
  • Final Report
  • Slides
  • Full Code
Real, Fake and Sarcastic News Detection in a Class Imbalance Setting
  • CS366 Computational Linguistics Final Project
  • Aim to build a model that achieves similar or better classification performance using just the titles / headlines
  • Final Report
  • Full Code