Discovering Associations Between Demography and Residential Electric Usage Patterns: A Data Mining Approach

Document Type

Thesis Proposal


Dr. Jamal Alsabbagh, alsabbaj@gvsu.edu

Committee Members

Dr. Jerry Scripps, scrippsj@gvsu.edu; Dr. Yonglei Tao, taoy@gvsu.edu


This thesis aims to discover associations between demography and electric usage patterns among residential customers. The research will draw on electric consumption data from customers within the Holland Board of Public Works (HBPW) service area in Holland, Michigan, and on household demographic data from Directories USA. Data compilation and initial pre-processing will take place in SQL Server 2005. Additional pre-processing and the subsequent machine learning algorithms will be run in WEKA, an open-source data mining tool from the University of Waikato. Algorithms under consideration include, but are not limited to, Kohonen Self-Organizing Maps (SOMs), decision tree analysis, and k-means clustering.

While overall system loads are automatically stored within various Supervisory Control and Data Acquisition (SCADA) systems at granularity down to the hour, consumption by individual customers was historically collected monthly by manual meter reads. If meter readers could not obtain a reading for a customer in a given month, an estimate was used instead.

In 1999, the HBPW began transitioning to an Automatic Meter Reading (AMR) system. By September 2006, the bulk of the electric meters in the service area were communicating hourly to a SQL Server database cluster. These regular transmissions include usage, demand, and alarm data that are used throughout the organization for processes such as billing, outage management, and transmission system load analysis.

The future of this system includes a move to bidirectional, on-demand interaction with consuming assets within the service area. This emerging technology, known as the Smart Grid, allows utilities to offer more reliable transmission to their customers by closely monitoring and maintaining the balance between supply and demand. Likewise, real-time granularity would allow the utility to offer new rate structures that give customers an incentive to conserve during peak load periods.

This research will attempt to discover correlations that may exist between demography and residential electricity usage. The intent is to discretize hourly load patterns into usage pattern categories using Kohonen SOMs, and then to feed the demographic data, together with the resulting categories, into various classification algorithms. The goal of this two-phase approach is to provide insight into the various demographic groups that the electric utility serves.
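
The two-phase approach can be illustrated with a minimal Python sketch. It is not the WEKA workflow proposed for the thesis, and every name and number in it is an assumption made for illustration: the load profiles are synthetic, the `household_size` attribute is hypothetical, and nothing is drawn from HBPW or Directories USA data. Phase one trains a very small one-dimensional Kohonen map (two units, initialised from evenly spaced training samples) and assigns each 24-hour profile to its best-matching unit. Phase two uses a simple one-attribute majority-vote rule as a stand-in for the decision tree and other classifiers under consideration.

```python
import math
from collections import Counter, defaultdict


def make_profile(peak_hours, peak_kw, base_kw=1.0):
    """Build a synthetic 24-value hourly load profile (kW per hour)."""
    return [peak_kw if h in peak_hours else base_kw for h in range(24)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(units, profile):
    """Index of the SOM unit whose weights are closest to the profile."""
    return min(range(len(units)), key=lambda j: sq_dist(units[j], profile))


def train_som(profiles, n_units=2, epochs=30, lr0=0.5, sigma0=1.0):
    """Train a 1-D Kohonen map; returns one weight vector per unit."""
    n = len(profiles)
    # Assumption: initialise units from evenly spaced training samples.
    units = [list(profiles[(i * (n - 1)) // max(n_units - 1, 1)])
             for i in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs      # shrinks linearly, never reaches 0
        lr = lr0 * frac                  # learning rate for this epoch
        sigma = max(sigma0 * frac, 0.1)  # neighbourhood width for this epoch
        for p in profiles:
            b = best_unit(units, p)
            for j, w in enumerate(units):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((j - b) ** 2) / (2 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (p[k] - w[k])
    return units


def majority_rule(attribute_values, categories):
    """Map each demographic value to its most common usage category."""
    votes = defaultdict(Counter)
    for value, cat in zip(attribute_values, categories):
        votes[value][cat] += 1
    return {value: c.most_common(1)[0][0] for value, c in votes.items()}


# Synthetic data, invented for illustration only.
profiles = [
    make_profile({6, 7, 8}, 3.0),         # morning-peak households
    make_profile({6, 7, 8}, 3.5),
    make_profile({6, 7, 8}, 2.5),
    make_profile({17, 18, 19, 20}, 3.0),  # evening-peak households
    make_profile({17, 18, 19, 20}, 3.5),
    make_profile({17, 18, 19, 20}, 2.5),
]
household_size = ["1-2", "1-2", "3+", "3+", "3+", "3+"]  # hypothetical attribute

# Phase 1: discretize load patterns into usage pattern categories.
units = train_som(profiles)
categories = [best_unit(units, p) for p in profiles]

# Phase 2: relate the demographic attribute to the usage category.
rule = majority_rule(household_size, categories)
print(categories)
print(rule)
```

In a real study the map would have more units, the profiles would typically be normalized before training, and the second phase would use several demographic attributes with a proper classifier and a held-out evaluation set.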
