The rich textual data contained in years' worth of fellowship applications provides a unique opportunity to analyze the pool of applications and perhaps gain insights into trends and into what makes applicants successful in achieving social impact. We believe such insights could help the broader community of social entrepreneurs better direct their efforts and magnify the collective impact they can achieve. Some questions that help address this need are: What do the applications focus on, and how has this focus changed over time? What factors differentiate successful applications? Do they contain cues about what it takes to achieve social impact? Do individuals of different demographics and traits tend to focus on different topics?

To explore such questions, this summer, a group from the IBM Science for Social Good Program teamed with Echoing Green to use machine learning and natural language processing techniques to extract explanatory cues from this unique collection of anonymized application data. Though much of Echoing Green’s work is to help dismantle barriers to opportunity for its Fellows, they recognize the importance of regularly and rigorously evaluating their own search and selection processes as a way to help dismantle structural barriers to entry for emerging entrepreneurs across the globe.

The effort was led by our IBM Social Good Summer Fellow, Aditya Garg, a graduate student at Columbia University, and included several data science researchers from IBM Research. Our team's focus was on distilling the traits that are predictive of successful applications and on running an exploratory analysis to identify trends in the data. Some initial results of the project are below.
