Document Type


Lead Author Type

CIS Masters Student


Dr. Christian Trefftz

Embargo Period



clustering, data, grand rapids, coffee shop, geographic data, machine learning


Many businesses suffer losses after opening because they did not research the location properly before committing to it. The method proposed in this paper aims to find the best possible location for a new establishment. A target list of Grand Rapids neighbourhoods is scraped from the web using the BeautifulSoup library and passed to the geocoder library to retrieve a list of geographical coordinates. Each coordinate is then sent as a parameter in a call to the Foursquare API, which returns JSON output listing the venues around that point. After several pre-processing stages, such as data cleaning, normalization and feature engineering, the data is fed to a clustering algorithm such as K-means, an unsupervised learning technique that chooses its centroids to minimize the inertia (the within-cluster sum of squared distances) of the given data. The number of centroids in K-means clustering is determined using two methods, the silhouette method and the elbow method. The best location is determined by examining the frequency of coffee shops, and hence the competition and demand for coffee shops in the area, and suggesting the best possible spot for a new coffee shop. Grand Rapids is chosen as the location for this project. Of course, as with any other business decision, opening a new coffee shop requires various other factors to be considered, such as the audience in the area or any schools nearby. Nevertheless, choosing a location is the first step that anyone opening a new establishment would think of.
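The clustering step described above, including the choice of the number of centroids, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the feature vectors are synthetic stand-ins for normalized venue frequencies per neighbourhood, all function names are hypothetical, and a deterministic farthest-first initialization is swapped in for random initialization so that the run is reproducible. Only the Python standard library is used; in practice a library such as scikit-learn provides `KMeans` and `silhouette_score` for the same purpose.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, inertia)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Inertia: within-cluster sum of squared distances.
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (quadratic time)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic data: three groups of 20 hypothetical neighbourhoods each,
# described by two made-up normalized venue frequencies.
rng = random.Random(42)
group_centers = [(0.1, 0.1), (0.5, 0.1), (0.3, 0.5)]
points = [
    (rng.gauss(cx, 0.01), rng.gauss(cy, 0.01))
    for cx, cy in group_centers
    for _ in range(20)
]

results = {}
for k in range(2, 7):
    _, labels, inertia = kmeans(points, k)
    results[k] = (inertia, silhouette(points, labels))
    print(f"k={k}  inertia={results[k][0]:.4f}  silhouette={results[k][1]:.3f}")

# Silhouette method: keep the k with the highest mean silhouette.
best_k = max(results, key=lambda k: results[k][1])
print("k chosen by silhouette:", best_k)
```

The printed inertia values are what the elbow method inspects: the preferred k is where the curve stops dropping sharply. On well-separated synthetic data like this, both criteria should point to three clusters; real venue data is rarely this clean, which is presumably why the two methods are used together.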