27th OCT

In the latest step of my data science project for MTH522, I applied a K-means clustering algorithm to a comprehensive dataset detailing fatal police shootings. Before clustering, I carefully handled missing values, normalized the numerical features, and encoded the categorical variables so they could be fed into the model. The number of clusters was chosen analytically, using the Elbow method to identify the most natural grouping structure within the data. The findings from this clustering should reveal patterns and correlations that contribute to understanding the underlying trends in fatal police encounters.
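The pipeline above can be sketched with scikit-learn. This is a minimal illustration on a toy stand-in frame; the column names (age, race, armed) are assumptions for the example, not the actual schema of the shootings dataset.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.cluster import KMeans

# Toy stand-in for the fatal police shootings data (assumed columns)
df = pd.DataFrame({
    "age": [23, 35, 41, 29, 52, 37],
    "race": ["W", "B", "H", "W", "B", "A"],
    "armed": ["gun", "knife", "unarmed", "gun", "gun", "knife"],
})

# Normalize numeric features, one-hot encode categoricals
prep = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["race", "armed"]),
])
X = prep.fit_transform(df)

# Elbow method: fit K-means for a range of k and record the inertia
# (within-cluster sum of squares); the "elbow" in this curve suggests k
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 5)
}
```

Inertia always decreases as k grows, so the point where the decrease levels off, rather than the minimum, is what guides the choice of k.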

25th OCT

In our latest analysis of the police shooting dataset, we focused on understanding the age distribution among different races of individuals involved in shooting incidents. We formulated a null hypothesis stating there is no significant age difference between individuals of different races shot by police. Conversely, our alternative hypothesis proposed that such a difference does exist. Through statistical testing, we sought to determine whether the observed data could reject the null hypothesis and support the alternative, thereby providing insights into demographic disparities within these critical incidents.
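The test described above can be sketched with a two-sample t-test from SciPy. The ages here are synthetic stand-ins for two racial groups, generated only to show the mechanics of testing the null hypothesis of equal mean ages.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical ages for two groups (synthetic, for illustration only)
ages_group_a = rng.normal(38, 12, 200)
ages_group_b = rng.normal(33, 11, 200)

# H0: the mean ages are equal; H1: they differ (two-sided test).
# Welch's variant (equal_var=False) does not assume equal variances.
t_stat, p_value = stats.ttest_ind(ages_group_a, ages_group_b, equal_var=False)
reject_h0 = p_value < 0.05
```

A small p-value would lead us to reject the null hypothesis of no age difference between the groups.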

23rd OCT

Today, I started working with a dataset about police shootings. My job is to clean up the data, so I checked for any missing or incomplete information. I found gaps in important details like names, what the person was carrying, their age, gender, race, and whether they were fleeing. For the simpler details, I just marked the gaps as ‘unknown’. But for key information like where the incident happened and how old the person was, I had to be more careful: I removed any records that didn’t have a location, and for age, I filled in the blanks with the most common age in the dataset. Now that the data is cleaned up, I’m ready to dig deeper into the analysis.
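The cleaning steps above can be sketched in pandas. The column names and values here are assumed for illustration; the real dataset's schema may differ.

```python
import pandas as pd

# Toy stand-in with the kinds of gaps described above
df = pd.DataFrame({
    "name":  ["A. Doe", None, "C. Roe", None],
    "armed": ["gun", None, "knife", "gun"],
    "age":   [34, None, 34, 27],
    "city":  ["Dallas", "Mesa", None, "Reno"],
})

# Simple categorical gaps become 'unknown'
for col in ["name", "armed"]:
    df[col] = df[col].fillna("unknown")

# Records with no location are removed entirely
df = df.dropna(subset=["city"])

# Missing ages get the most common age (the mode)
df["age"] = df["age"].fillna(df["age"].mode()[0])
```

Dropping rows versus imputing is a judgment call: location is hard to guess, so those rows go, while age has a reasonable central value to fall back on.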

21st OCT

Hello,
Today I performed a t-test on the fatal police shootings data, examined it with a few regression methods, and raised a few questions about the dataset.
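The two techniques mentioned can be sketched together with SciPy; the numbers below are synthetic and stand in only to show the mechanics of a two-sample t-test and a simple least-squares fit.

```python
import numpy as np
from scipy import stats

# Two small synthetic samples (for illustration only)
sample_a = np.array([31.0, 25.0, 40.0, 28.0, 36.0])
sample_b = np.array([45.0, 38.0, 50.0, 42.0, 47.0])

# Two-sample t-test: are the group means plausibly equal?
t_stat, p_val = stats.ttest_ind(sample_a, sample_b)

# Simple linear regression: fit y = slope * x + intercept
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0          # exact line, so the fit recovers it exactly
res = stats.linregress(x, y)
```

With a small p-value, we would conclude the two group means differ; the regression object exposes the slope, intercept, and goodness-of-fit statistics.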

18th Oct

Today I learned about logistic regression. Logistic regression is a widely used statistical method for predicting the probability of a binary outcome, that is, one with only two possible categories or events. Unlike linear regression, which predicts continuous outcomes, logistic regression computes the probability of an event’s occurrence while keeping predictions within the 0 to 1 range, thanks to the S-shaped logistic function. The model’s coefficients represent the change in the log odds of the outcome for every one-unit change in a predictor variable: positive coefficients indicate an increase in the likelihood of the event, while negative ones suggest a decrease. Understanding these coefficients, and by extension the odds ratio, is essential for proper interpretation and is crucial in sectors such as finance, healthcare, and political campaigning.
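A minimal sketch of these ideas with scikit-learn, on toy data where a larger x makes the event more likely, shows the coefficient as a log-odds change and its exponential as the odds ratio:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary outcome: the event switches on as x grows
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression().fit(X, y)

log_odds_per_unit = model.coef_[0][0]   # change in log odds per unit of x
odds_ratio = np.exp(log_odds_per_unit)  # multiplicative change in the odds
probs = model.predict_proba(X)[:, 1]    # predictions stay within (0, 1)
```

Here the coefficient is positive, so the odds ratio exceeds 1: each unit increase in x multiplies the odds of the event by that factor.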

16th Oct

After a deep analysis of the Washington Post data, we gathered detailed information about when and where shootings happen in the United States. By looking closely at this information, we can make maps that show us which parts of the country have more of these shootings. We can see if certain areas have more shootings than others and check if this is because of how many people live there or other reasons. We can also look at how things change over time – do more shootings happen in summer, for example, or after certain laws are passed?

By comparing the places where shootings happen with information about who lives there, we can figure out if some kinds of people are more likely to be involved in these shootings. It’s important to make sure the addresses in the data are correct so we don’t make mistakes when we’re looking at the map. People who are good with computers and maps can use geographic and visualization tools to help us understand the information better. This can help police departments, city leaders, and communities make better decisions to hopefully prevent these shootings in the future.
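The kind of aggregation behind those maps and seasonal comparisons can be sketched in pandas. The state codes and dates below are made up for illustration; the real analysis would read the Washington Post data instead.

```python
import pandas as pd

# Toy stand-in for incident records (assumed columns)
incidents = pd.DataFrame({
    "state": ["TX", "CA", "TX", "AZ", "CA", "TX"],
    "date": pd.to_datetime([
        "2022-01-05", "2022-06-10", "2022-06-21",
        "2022-07-02", "2022-12-30", "2023-03-14",
    ]),
})

# Incidents per state: the raw counts behind a choropleth map
by_state = incidents["state"].value_counts()

# Incidents per calendar month: a first look at seasonal patterns
by_month = incidents.groupby(incidents["date"].dt.month).size()
```

Per-state counts would normally be divided by population to get a rate before mapping, so densely populated states don't dominate the picture.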

13th Oct

There are 12 columns in the dataset. The ‘Date’ column shows when shootings occur, the ‘Age’ column reveals which age groups are most frequently involved, and the ‘City’ column identifies areas with high or low incident rates. We learn about patterns of violence and potential police bias from the ‘Gender’ and ‘Race’ columns. The ‘Armed’ column indicates whether the person may have intended to cause harm. Combining the ‘State’ and ‘City’ data lets us visualize the overall geographic picture of where incidents occur. The ‘Body camera’ column provides insights into additional trends, while the ‘Flee’ column records whether the person was trying to escape. The ‘Signs of mental illness’ column hints at the reach of mental health services, and the column listing the police departments involved lets us compare different departments.
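A few of those column-level summaries can be sketched in pandas. The frame below is a tiny stand-in with assumed column names (the actual Washington Post column names may differ slightly).

```python
import pandas as pd

# Tiny stand-in frame with a few of the columns described above
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-05", "2022-06-10", "2022-06-21"]),
    "age": [23, 41, 35],
    "race": ["W", "B", "H"],
    "armed": ["gun", "unarmed", "knife"],
    "body_camera": [True, False, False],
    "flee": ["not", "car", "not"],
})

# Which age groups appear most often (binned from the 'Age' column)
age_bins = pd.cut(df["age"], bins=[0, 30, 50, 120],
                  labels=["<30", "30-50", "50+"])
age_counts = age_bins.value_counts()

# How often a body camera was recording
camera_rate = df["body_camera"].mean()
```

Each column supports a different summary: categorical columns lend themselves to counts, numeric ones to binning and averages, and boolean ones to simple rates.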

11th Oct

I examined what the dataset is about, worked to understand its main points, and raised a few questions from the data.

Oct-07

Hello, today I worked on the concluding analysis, added those points to the report in the punchline format, and made a few changes to the source code for the model. We also discussed the questions we had raised from the CDC data with our group members, along with the title of our project.