akoutsop1909/bsc-thesis

Big data analytics for electric vehicles in the smart grid

As plug-in electric vehicle (PEV) adoption accelerates, uncoordinated charging during peak hours can cause power losses, overloads, and voltage fluctuations in smart grids. This thesis simulates success rates (10%–50%) for a hypothetical campaign encouraging consumers to shift their PEV charging from peak to off-peak hours to help alleviate grid strain. Data analytics tools and descriptive statistics are used to process data and visualize the potential impact. The findings show that even a modest shift in charging behavior can lead to a more balanced distribution of power demand.

Note

This thesis was completed in 2020. Since then, I've made several improvements, including implementing dplyr pipelines to streamline the code, enhancing the simulation model's sampling method, and adopting the color-blind-friendly RColorBrewer Dark2 palette for visualizations.

⚙️ System Requirements

  • R and an R IDE (e.g., RStudio) are required. You can download both from the official RStudio website.
  • You will also need the following R packages: ggplot2, RColorBrewer, lubridate, dplyr, and glue. You can install them by running this command in your R console:
install.packages(c("ggplot2", "RColorBrewer", "lubridate", "dplyr", "glue"))
  • To ensure correct date handling, your computer's display language must be set to English (United States) in the language settings.
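If changing the system display language is not practical, a session-level locale setting might have a similar effect. This is only a sketch: it assumes the locale sensitivity comes from day and month names produced by date formatting, which the repository does not state, so the scripts may still need the system setting described above.

```r
# Assumption: the locale sensitivity comes from day/month names produced by
# date formatting. This changes LC_TIME for the current R session only and
# does not touch any system setting.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")  # the "C" locale uses English day/month names

# Weekday name for 26 April 2010, a date that appears in the sample data
day_name <- format(as.Date("2010-04-26"), "%A")
print(day_name)

# Restore the previous setting when done
Sys.setlocale("LC_TIME", old_locale)
```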

Note

If you prefer to view the generated graphs without setting up the coding environment or running your own simulations, you can access a Colab notebook linked in the About section of this repository. This notebook summarizes Chapter 2 of the thesis, presenting key findings and relevant graphs generated from pre-executed code. It also provides links to additional notebooks for other chapters of the thesis, where you can explore further insights and visualizations.

📊 Data Sources

The three primary datasets were sourced from publicly available data provided by the National Laboratory of the Rockies (NLR). These were generated by a model simulating realistic electricity consumption patterns for the Midwest region of the United States and have been processed for use in this research.

The first dataset contains electricity demand profiles (in watts) for 200 households, recorded in 10-minute intervals throughout 2010 in the DD/MM/YYYY H:MM format. The second and third datasets contain PEV charging profiles for 348 vehicles associated with these households. One dataset uses Level 1 (1920 W) charging, while the other uses Level 2 (6600 W) charging, both recorded in 10-minute intervals over the same period in the same format as the first dataset.
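As a rough illustration of the timestamp format, the sketch below parses a few DD/MM/YYYY H:MM strings with lubridate. The small data frame stands in for Household.csv and reuses values from the sample table; the column name `Household.1` and the in-memory data are assumptions made to keep the example self-contained.

```r
library(lubridate)

# A few rows mimicking the layout of Household.csv (values from the sample table).
# In practice the data would come from read.csv(); the delimiter used by the
# real files is not stated here, so check it before loading.
household <- data.frame(
  Time = c("1/1/2010 0:00", "26/4/2010 18:20", "31/12/2010 23:50"),
  Household.1 = c(274.16, 818.34, 1625.00),
  stringsAsFactors = FALSE
)

# dmy_hm() parses day-month-year followed by hours and minutes
household$Time <- dmy_hm(household$Time, tz = "UTC")

# A full non-leap year at 10-minute resolution should have 365 * 24 * 6 rows
expected_rows <- 365 * 24 * 6

print(household$Time)
print(expected_rows)
```

Comparing `expected_rows` with `nrow()` of the loaded file is one quick way to check that no interval is missing.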

The Time Zones structure was derived by processing the PEV charging profiles. It categorizes each weekday charge into one of four time zones: Shoulder 1 (7:00–13:50), Peak (14:00–19:50), Shoulder 2 (20:00–21:50), and Off-Peak (22:00–6:50). Weekend charges are categorized into two zones: Shoulder (7:00–21:50) and Off-Peak (22:00–6:50). This structure is a key component of the research, helping to confirm patterns, relationships, and anomalies observed during data analysis. It also serves as input for the simulation model.
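The zone boundaries above can be expressed as a small classifier. This is a minimal sketch, not the repository's own code: `classify_zone()` is a hypothetical helper, and it assigns a zone to a single timestamp rather than to a whole charge.

```r
library(lubridate)

# classify_zone() is a hypothetical helper, not a function from the repository.
# It maps a timestamp to the zone whose time range contains it.
classify_zone <- function(time) {
  h <- hour(time)
  # wday() with week_start = 1 returns 1 (Monday) to 7 (Sunday)
  weekend <- wday(time, week_start = 1) >= 6
  # Off-Peak covers 22:00-6:50 on weekdays and weekends alike
  off_peak <- h >= 22 | h < 7
  ifelse(
    off_peak, "Off-Peak",
    ifelse(
      weekend, "Shoulder",
      ifelse(h < 14, "Shoulder 1",
             ifelse(h < 20, "Peak", "Shoulder 2"))
    )
  )
}

times <- dmy_hm(c("1/1/2010 11:30", "2/1/2010 11:30",
                  "31/12/2010 18:50", "31/12/2010 22:00",
                  "4/1/2010 21:50"))
zones <- classify_zone(times)
print(zones)
```

Because readings are taken every 10 minutes, comparing on the hour alone is enough: the last reading of each zone falls at :50.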

All datasets are complete with no missing values for any time interval.

Sample Tables

  1. Household Electricity Demand Profiles (Household.csv)

| Time | Household 1 | Household 2 | Household 3 | ... | Household 200 |
|---|---|---|---|---|---|
| 1/1/2010 0:00 | 274.16 | 576.44 | 1523.90 | ... | 664.39 |
| ... | ... | ... | ... | ... | ... |
| 26/4/2010 18:20 | 818.34 | 1845.10 | 1421.30 | ... | 819.74 |
| 26/4/2010 18:30 | 513.47 | 1810.40 | 996.30 | ... | 819.74 |
| 26/4/2010 18:40 | 531.81 | 534.71 | 1721.30 | ... | 819.74 |
| ... | ... | ... | ... | ... | ... |
| 31/12/2010 23:50 | 1625.00 | 1013.30 | 420.61 | ... | 919.74 |

  2. PEV Charging Profiles Using Level 1 Charging (PEV_L1.csv)

| Time | H001.V001 | H002.V002 | H002.V003 | ... | H200.V348 |
|---|---|---|---|---|---|
| 1/1/2010 0:00 | 0 | 0 | 0 | ... | 0 |
| ... | ... | ... | ... | ... | ... |
| 26/4/2010 18:20 | 0 | 1920 | 1920 | ... | 0 |
| 26/4/2010 18:30 | 0 | 1920 | 0 | ... | 0 |
| 26/4/2010 18:40 | 1920 | 1920 | 0 | ... | 0 |
| ... | ... | ... | ... | ... | ... |
| 31/12/2010 23:50 | 0 | 0 | 0 | ... | 0 |

  3. PEV Charging Profiles Using Level 2 Charging (PEV_L2.csv)

| Time | H001.V001 | H002.V002 | H002.V003 | ... | H200.V348 |
|---|---|---|---|---|---|
| 1/1/2010 0:00 | 0 | 0 | 0 | ... | 0 |
| ... | ... | ... | ... | ... | ... |
| 26/4/2010 18:20 | 0 | 6600 | 0 | ... | 0 |
| 26/4/2010 18:30 | 0 | 6600 | 0 | ... | 0 |
| 26/4/2010 18:40 | 6600 | 0 | 0 | ... | 0 |
| ... | ... | ... | ... | ... | ... |
| 31/12/2010 23:50 | 0 | 0 | 0 | ... | 0 |

  4. Time Zones (TimeZones_full.csv)

| Charge_Type | Date | Day_Type | PEV_Code | Charge_Duration | Start_Time | Stop_Time | Time_Zone | KWh | Spans_Zones |
|---|---|---|---|---|---|---|---|---|---|
| L1 | 1/1/2010 | Weekday | H001.V001 | 12 | 11:30 | 13:20 | Shoulder 1 | 3,84 | FALSE |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| L1 | 31/12/2010 | Weekday | H200.V348 | 1 | 22:00 | 22:00 | Off-Peak | 0,32 | TRUE |
| L2 | 1/1/2010 | Weekday | H001.V001 | 3 | 11:30 | 11:50 | Shoulder 1 | 3,3 | FALSE |
| L2 | 2/1/2010 | Weekend | H001.V001 | 1 | 11:30 | 11:30 | Shoulder | 1,1 | FALSE |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| L2 | 31/12/2010 | Weekday | H200.V348 | 6 | 18:50 | 19:40 | Peak | 6,6 | FALSE |

Data Clarifications

  • PEV column names: Each label corresponds to a unique household-vehicle combination. For example, H001.V001 refers to Vehicle 1 (V001) in Household 1 (H001). This naming pattern applies to all 348 vehicles.
  • PEV charging: A value of zero means the vehicle is not charging at that time, while any non-zero value (1920 for Level 1 or 6600 for Level 2) indicates the charging power in watts.
  • Charging types: Level 1 (L1) charging uses a standard 120 V household outlet and draws 1920 W, making it the slowest and most affordable option. Level 2 (L2) charging uses a 240 V outlet and draws 6600 W. It is faster than Level 1 but more expensive, and it requires special equipment to connect to the 240 V outlet.
  • Spans_Zones: This Time Zones column indicates whether a charge spans more than one zone (e.g., starts in Shoulder 1 and stops in Peak). If so, the value is TRUE; otherwise, it is FALSE, meaning the entire charge occurs within a single zone.
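The KWh values in the Time Zones sample rows are consistent with a simple calculation: the number of 10-minute intervals multiplied by the constant charging power. The sketch below assumes that Charge_Duration counts 10-minute intervals, which matches the sample rows but is not stated explicitly; `charge_kwh()` is a hypothetical helper. Note that the KWh column uses a decimal comma, so 3,84 in the table is 3.84 here.

```r
# Energy per charge, assuming each Charge_Duration unit is one 10-minute
# interval at constant power. charge_kwh() is a hypothetical helper.
charge_kwh <- function(intervals, power_watts) {
  hours <- intervals * 10 / 60   # 10-minute intervals to hours
  power_watts * hours / 1000     # watt-hours to kilowatt-hours
}

l1 <- charge_kwh(12, 1920)  # Level 1 sample row: 12 intervals
l2 <- charge_kwh(3, 6600)   # Level 2 sample row: 3 intervals
print(c(l1, l2))
```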

🚀 Getting Started

Clone this repository to your local machine following the instructions from GitHub. Inside, you will find the /data and /scripts folders:

  • The /data folder contains:
    • The processed datasets used in the research, including household electricity demand (Household.csv) and PEV charging profiles (PEV_L1.csv and PEV_L2.csv).
    • The Time Zones data, including the full yearly dataset (TimeZones_full.csv) and a subset of one week from January (TimeZones.csv).
    • Datasets generated after applying load shifting (a technique that shifts a percentage of PEV charges from peak to off-peak hours to balance grid load) to the Time Zones data. These datasets cover two main simulation cases, each with five subcases that vary the percentage of kWh shifted between zones.
  • The /scripts folder contains the necessary R scripts for data processing, analysis, and simulation.
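To make the load-shifting idea concrete, here is a minimal sketch that relabels a random share of Peak charges as Off-Peak and compares energy per zone before and after. It is not the thesis's simulation model: `shift_peak_charges()` is a hypothetical helper, the rows are made-up toy data that only borrow column names from TimeZones_full.csv, and the share is applied to the number of charges as a simplification.

```r
library(dplyr)

set.seed(42)  # fixed seed so the sketch is reproducible

# Toy Time Zones rows; column names follow TimeZones_full.csv, values are made up
charges <- data.frame(
  PEV_Code  = sprintf("H%03d.V%03d", 1:10, 1:10),
  Time_Zone = c(rep("Peak", 6), rep("Off-Peak", 2), rep("Shoulder 1", 2)),
  KWh       = c(6.6, 3.3, 1.1, 2.2, 4.4, 5.5, 3.3, 1.1, 2.2, 6.6),
  stringsAsFactors = FALSE
)

# shift_peak_charges() is a hypothetical helper: it relabels a random share of
# Peak charges as Off-Peak. The thesis model may select charges differently.
shift_peak_charges <- function(df, success_rate) {
  peak_rows <- which(df$Time_Zone == "Peak")
  n_shift <- round(length(peak_rows) * success_rate)
  # sample.int() avoids sample()'s special case for a single number
  shifted <- peak_rows[sample.int(length(peak_rows), n_shift)]
  df$Time_Zone[shifted] <- "Off-Peak"
  df
}

shifted <- shift_peak_charges(charges, success_rate = 0.5)

# Compare total energy per zone before and after the shift
before <- charges %>% group_by(Time_Zone) %>% summarise(KWh = sum(KWh))
after  <- shifted %>% group_by(Time_Zone) %>% summarise(KWh = sum(KWh))
print(before)
print(after)
```

Calling the helper with rates from 0.1 to 0.5 would mirror the 10%–50% success rates simulated in the thesis. Total energy is unchanged by the shift; only its distribution across zones moves.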

Running the Scripts

Each script begins with instructions, loads the required packages, and sets the working directory to the /data folder. The scripts are pre-configured to load data from subfolders, but you can modify the paths if needed. Some scripts process data and save the results as .csv files in the specified directory, while others generate visualizations that you can view in your R IDE's plot window or save to your preferred location.

⚠️The rest of the readme is under construction⚠️

1. First plot and Exploratory Data Analysis (EDA)

2. The Time Zones Structure

3. Load Shifting

⌨️ Demo Run


📂 Folder Structure