In today’s class I started analyzing a dataset of fatal police shootings in the US. I began by closely examining the dataset’s columns. Of particular interest is the numeric “age” column, which records the age of each person involved in these tragic incidents. The dataset also includes geospatial information, namely latitude and longitude, which makes it possible to plot the location of each shooting precisely.
During this initial evaluation I found an “id” column that appeared to have little analytical use, so I thought it might be best to exclude it. I also checked thoroughly for missing values, which turned up null or missing data in a number of variables, including “name,” “armed,” “age,” “gender,” “race,” “flee,” “longitude,” and “latitude.” Finally, I discovered one duplicate entry in the dataset, which stood out because it has no “name” value.
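The checks described above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions: the tiny DataFrame below is made up to mimic some of the column names mentioned, and it is not the real dataset or necessarily the exact code used in class.

```python
import pandas as pd

# Made-up sample rows that reuse some of the column names mentioned above.
# The values are illustrative only and are NOT taken from the real dataset.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["A. Example", None, "B. Example", None, "C. Example"],
    "age": [34.0, 27.0, None, 27.0, 51.0],
    "armed": ["gun", "knife", None, "knife", "unarmed"],
    "latitude": [47.2, 33.5, 39.1, 33.5, None],
    "longitude": [-123.1, -112.0, -94.6, -112.0, None],
})

# Drop the identifier column, which adds nothing to the analysis.
df = df.drop(columns=["id"])

# Count missing (null) values in each remaining column.
missing_counts = df.isna().sum()

# Flag rows that exactly repeat an earlier row; missing values in the
# same column are treated as equal, so a nameless row can be a duplicate.
duplicate_mask = df.duplicated()
n_duplicates = int(duplicate_mask.sum())

# Keep only the first occurrence of each row.
deduped = df.drop_duplicates()

print(missing_counts)
print("duplicate rows:", n_duplicates)
```

Dropping “id” before looking for duplicates matters here: two otherwise identical records with different identifiers would not be flagged as duplicates if the identifier were still present.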
GeoHistograms and geospatial analysis were presented as essential methods for examining and visualizing geographic data. With these techniques, we can identify geographic hotspots, clusters, and spatial trends within the dataset.
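To make the idea concrete, here is a minimal sketch of the binning behind a geo-histogram, assuming it simply counts incidents per latitude/longitude grid cell. The coordinates, the `geo_histogram` helper, and the one-degree cell size are all hypothetical choices for illustration, not values from the class dataset or from any particular library.

```python
import math
from collections import Counter

# Made-up (latitude, longitude) pairs, for illustration only.
points = [
    (33.45, -112.07),
    (33.51, -112.10),
    (33.49, -112.25),
    (40.71, -74.00),
    (40.73, -73.99),
    (47.61, -122.33),
]

def geo_histogram(coords, cell_deg=1.0):
    """Count points per grid cell of cell_deg x cell_deg degrees.

    Each cell is keyed by the integer index of its south-west corner,
    i.e. (floor(lat / cell_deg), floor(lon / cell_deg)).
    """
    counts = Counter()
    for lat, lon in coords:
        # Skip records with missing coordinates.
        if lat is None or lon is None:
            continue
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts

hist = geo_histogram(points, cell_deg=1.0)

# The cell with the highest count is a candidate hotspot.
hotspot_cell, hotspot_count = hist.most_common(1)[0]
print(hotspot_cell, hotspot_count)
```

Smaller cells give finer spatial detail but sparser counts, so the cell size is worth experimenting with. Equal-degree cells also cover less ground area at higher latitudes, which a more careful analysis would need to account for.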