Introduction

VeggieTB, short for Vegetarian Tacos and Burritos, is a short project I did at the TAMU Datathon in October 2019. I was given a dataset of tacos and burritos on restaurant menus across the United States, and I used it to extract some useful information about the state of vegetarian Mexican meals throughout the country. This site presents my findings. Please enjoy them the same way I would enjoy a burrito.

Chart Overview

This is a bubble map of the number of vegetarian options available in the US, by city. Unsurprisingly, big hubs such as LA, San Francisco, and New York have the largest number of accommodating options. Interestingly, none of the largest bubbles are in Texas, even though we are so proud of our Tex-Mex food style.

Legend

The legend on the right is the output of the K Means algorithm described below, which is how I grouped the cities into clusters, shown by bubble color. The largest spots are indicated both by the color green and by their size on the map, and each bubble is centered at the barycenter of its city.

Data Filtration

Since the original dataset features all taco and burrito data for the country, I had to filter out entries that contained some sort of meat. I did this by checking both the name of the item and the menu's description, using whichever of the two were present. The default assumption was that an item was not vegetarian, since most are not.
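As a rough illustration of this kind of filter, here is a minimal Python sketch. The keyword lists, the function names, and the rule that an item needs a positive vegetarian keyword are all assumptions made for the example, not the exact rules used in the project.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
MEAT_WORDS = {"beef", "chicken", "pork", "steak", "carnitas",
              "chorizo", "fish", "shrimp", "bacon"}
VEGGIE_WORDS = {"vegetarian", "veggie", "vegan", "bean", "tofu"}


def tokens(text):
    """Lowercase a text field and split it into a set of words.

    A missing field (None or an empty string) yields an empty set.
    """
    return set(re.findall(r"[a-z]+", text.lower())) if text else set()


def is_vegetarian(name, description):
    """Classify a menu item from its name and/or description.

    The default answer is False: an item only counts as vegetarian
    when no meat keyword appears and at least one vegetarian keyword does.
    """
    words = tokens(name) | tokens(description)
    if words & MEAT_WORDS:
        return False
    return bool(words & VEGGIE_WORDS)


menu = [
    ("Veggie Burrito", None),
    ("Chicken Taco", "with black beans"),
    ("Bean Burrito", "black beans and rice"),
    ("House Taco", ""),
]
vegetarian_items = [name for name, desc in menu if is_vegetarian(name, desc)]
```

Matching whole words rather than substrings avoids some false hits, but a simple keyword list like this will still miss items and misclassify others, so the real lists would need tuning against the data.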

K Means Algorithm

I used the K Means clustering algorithm provided by scikit-learn to come up with good labels for partitioning the bubble map by color. Since there was too much data to choose categories manually, I used an unsupervised algorithm to find the partitions, and emphasized the larger cities for maximum usefulness.
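For readers who have not used it, here is a minimal sketch of clustering per-city counts with scikit-learn's KMeans. The counts, the choice of three clusters, and the cluster_tiers helper are made up for the example. The features and settings actually used in the project may differ.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_tiers(counts, n_clusters=3, seed=0):
    """Group cities into tiers by their number of vegetarian options.

    Returns one tier per city, where tier 0 is the cluster with the
    smallest center and the highest tier has the largest center.
    """
    # KMeans expects a 2D array of shape (n_samples, n_features).
    X = np.asarray(counts, dtype=float).reshape(-1, 1)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    raw_labels = model.fit_predict(X)

    # K Means label ids are arbitrary, so renumber them by cluster center.
    order = np.argsort(model.cluster_centers_.ravel())
    rank = np.empty_like(order)
    rank[order] = np.arange(n_clusters)
    return rank[raw_labels].tolist()


# Made-up counts of vegetarian items per city, for illustration only.
example_counts = [1, 2, 2, 3, 40, 45, 50, 200, 220]
tiers = cluster_tiers(example_counts)
```

Each tier can then be mapped to a bubble color. Renumbering the labels by cluster center keeps the color order stable from run to run, since K Means itself does not guarantee which cluster gets which label id.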