Document Type

Project

Lead Author Type

CIS Masters Student

Advisors

Dr. Christian Trefftz, trefftzc@gvsu.edu

Embargo Period

3-8-2021

Keywords

clustering, data, grand rapids, coffee shop, geographic data, machine learning

Abstract

Many businesses suffer losses because they choose a location for a new establishment without adequate research. The method proposed in this project identifies the best candidate location in three stages. First, a list of Grand Rapids neighbourhoods is scraped from the web using the Beautiful Soup library and passed to the geocoder library to retrieve their geographical coordinates. Second, each coordinate is sent as a parameter to the Foursquare API, which returns JSON output listing the venues around that point. Third, after pre-processing steps such as data cleaning, normalization, and feature engineering, the data is fed to K-means clustering, an unsupervised learning technique that chooses its centroids to minimize inertia, the within-cluster sum of squared distances. The number of centroids is determined using two methods: the Silhouette method and the Elbow method. The best location is determined by examining the frequency of coffee shops in each area, which indicates the competition and demand there, and suggesting the most suitable spot for a new coffee shop. Grand Rapids is chosen as the location for this project. As with any business decision, opening a new coffee shop requires considering other factors as well, such as the audience in the area or nearby schools. Nevertheless, determining a location is the primary step for anyone planning a new establishment.
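
The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the project's code: the feature vectors are hypothetical (coffee shop frequency, restaurant frequency) values invented for illustration, the function names are assumptions, and a deterministic farthest-point seeding is swapped in for the random or k-means++ initialization a library would normally use. It runs K-means for several values of k, records inertia for an Elbow-style comparison, and picks the k with the highest mean silhouette score.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, inertia)."""
    # Assumption: deterministic farthest-point seeding keeps the sketch reproducible.
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(tuple(far))
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, inertia

def silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        mean_to = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            mean_to[c] = sum(d) / len(d) if d else None
        a = mean_to[labels[i]]
        if a is None:  # a single-point cluster scores 0 by convention
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_to.items() if c != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical per-neighbourhood features: (coffee shop freq, restaurant freq).
neighbourhoods = [
    (0.02, 0.10), (0.03, 0.12), (0.01, 0.11),
    (0.20, 0.30), (0.22, 0.28), (0.21, 0.31),
    (0.45, 0.05), (0.47, 0.06), (0.44, 0.04),
]

results = {}
for k in range(2, 6):
    labels, _, inertia = kmeans(neighbourhoods, k)
    results[k] = (inertia, silhouette(neighbourhoods, labels), labels)
    print(f"k={k} inertia={inertia:.4f} silhouette={results[k][1]:.3f}")

# Silhouette method: choose the k with the highest mean score.
best_k = max(results, key=lambda k: results[k][1])

# Mean coffee shop frequency per cluster, the quantity the abstract examines.
best_labels = results[best_k][2]
coffee_by_cluster = {
    c: sum(p[0] for p, l in zip(neighbourhoods, best_labels) if l == c)
       / best_labels.count(c)
    for c in set(best_labels)
}
print("best k:", best_k, "coffee frequency by cluster:", coffee_by_cluster)
```

The printed inertia values give the Elbow view of the same runs: inertia falls sharply up to the chosen k and only marginally afterwards. How the per-cluster coffee shop frequency translates into a recommendation (low competition versus proven demand) is left open here, as it is in the abstract.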
