
NBA dataset Kaggle


As fall exam season reaches its climax, I did what any other university student would do: make a side project! From any NBA match, you can tell that many basketball players have a unique style of play. Avid NBA fans pick up on these characteristics by instinct after watching the NBA for weeks, months, maybe even years. I wanted to find a concrete method of arriving at these conclusions.

This led me to create a Python program which analyzes any current NBA player's gameplay to find the areas of the court where they have the most shooting success and the probability of their shooting from certain spots.

This analysis lets coaches and players learn their opponents' gameplay and shows them which areas of the court to prioritize on defense. Using datasets from Kaggle formatted with Pandas, we can use Matplotlib to illustrate our data analysis.

This dataset includes every shot taken in the Regular Season, so any player who played a game during this season can be analyzed. By incorporating machine learning through Python's Scikit-Learn, using a K-Nearest Neighbours Classifier, we can also simulate a player's shooting from every position on the court.

In this plot, the green dots represent scoring shots while the red dots represent missed shots. The black dots represent spots that Curry is most likely to shoot from.

The size of each dot represents the relative probability of Curry shooting from that position, and the darkness of the dot illustrates his shot accuracy from that spot, with darker shades representing higher accuracy. However, the black dots are a bit hard to see because of the green and red dots, so let's simplify the plot.

So how did we do? Our plot says that Curry is most likely to shoot from very close range, at layup distance, or from 3-point range. Any NBA fan can verify the accuracy of this prediction, since Curry is known for his 3-point shooting and his ability to rush past defenders for layups.
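The K-Nearest Neighbours simulation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the shot log is synthetic, and the coordinate layout, the labelling rule, and names such as `shot_xy` and `make_probability` are assumptions made for the example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for one player's shot log (NOT real data).
# Assumed layout: x, y court coordinates in feet with the hoop at (0, 0).
rng = np.random.default_rng(0)
shot_xy = rng.uniform(low=[-25.0, 0.0], high=[25.0, 30.0], size=(200, 2))

# Hypothetical labelling rule for the toy data: closer shots go in more often.
distance = np.hypot(shot_xy[:, 0], shot_xy[:, 1])
make_chance = np.clip(1.0 - distance / 40.0, 0.1, 0.9)
made = (rng.uniform(size=200) < make_chance).astype(int)

# Fit a K-Nearest Neighbours classifier on (x, y) -> made (1) or missed (0).
knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(shot_xy, made)

# Evaluate the model on a regular grid covering the half court.
grid_x, grid_y = np.meshgrid(np.linspace(-25, 25, 11), np.linspace(0, 30, 7))
grid_xy = np.column_stack([grid_x.ravel(), grid_y.ravel()])

# Estimated probability of a make at every grid position.
make_probability = knn.predict_proba(grid_xy)[:, 1]
```

With real data, the `shot_xy` and `made` arrays would come from the Kaggle shot log rather than a random generator, and the grid probabilities could drive the shading of the dots in a Matplotlib scatter plot.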

For our next analysis let's change things up and take a look at another player - Curry's teammate, Kevin Durant.


In this analysis we are incorporating machine learning by using a Decision Tree to simulate Kevin Durant's shooting throughout the court. This gives a more uniform distribution of simulated shots across the court, so we can predict how Durant will shoot based on his past shooting habits.

In this plot, the green dots represent scoring shots and the red dots represent missed shots.


From this analysis we can determine that Durant is more dominant on the right side of the court, which is often the case for right-handed players. He is also predicted to score with high consistency throughout the key and around the 3-point perimeter, another prediction which is confirmed by his known playing style.
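The Decision Tree simulation used for Durant could look roughly like the sketch below. It is an illustration under stated assumptions, not the original implementation: the shots are synthetic, the right-side bias is built into the toy data on purpose, and the tree depth and all variable names are hypothetical choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)

# Toy shot log (NOT real data): x, y court coordinates in feet, hoop at (0, 0).
shot_xy = rng.uniform(low=[-25.0, 0.0], high=[25.0, 30.0], size=(300, 2))

# Hypothetical labelling rule: this toy player converts more often when x > 0.
make_chance = np.where(shot_xy[:, 0] > 0, 0.65, 0.35)
made = (rng.uniform(size=300) < make_chance).astype(int)

# Fit a shallow decision tree on (x, y) -> made (1) or missed (0).
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(shot_xy, made)

# Simulate one shot from every cell of a uniform grid over the half court,
# which is what gives the evenly spread dots in the plot.
grid_x, grid_y = np.meshgrid(np.linspace(-25, 25, 11), np.linspace(0, 30, 7))
grid_xy = np.column_stack([grid_x.ravel(), grid_y.ravel()])
predicted_made = tree.predict(grid_xy)  # 1 = predicted make, 0 = predicted miss
```

Plotting `grid_xy` coloured by `predicted_made` would reproduce the green and red layout described above, with the caveat that a real analysis would fit the tree on the player's actual shot history.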

But we can take this one step further: let's compare Kevin Durant's shooting to that of the average NBA player. Similar to how we simulated Kevin Durant's shooting with machine learning, we can use the data for all NBA players in the Regular Season to find the shooting habits of the average NBA player.


In this plot, green dots represent shots that Durant made but the average NBA player missed, meaning Durant is an above-average shooter at these positions.
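The comparison amounts to lining up two sets of predictions on the same grid and marking the cells where the player is predicted to score and the league-average model is not. A minimal sketch, assuming both models have already produced made/missed predictions (the small arrays here are invented for illustration):

```python
import numpy as np

# Hypothetical predictions on the same 2 x 3 grid of court positions:
# 1 = predicted make, 0 = predicted miss.
player_pred = np.array([[1, 1, 0],
                        [0, 1, 0]])
league_pred = np.array([[0, 1, 0],
                        [1, 0, 0]])

# Cells where the player scores but the average player misses.
above_average = (player_pred == 1) & (league_pred == 0)

# Grid indices of those cells, ready to be drawn as green dots.
rows, cols = np.nonzero(above_average)
```

In this toy grid, the flagged cells are the ones at row 0, column 0 and row 1, column 1.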






In short, data analysis is about finding answers that could help a business. In this tutorial, we will see how to get started with data analysis in Python. The Python packages that we use in this notebook are numpy, pandas, matplotlib, and seaborn. Since such tutorials are usually based on built-in datasets like iris, it becomes harder for the learner to connect with the analysis, and hence learning becomes difficult.

IPL is one of the most popular cricket tournaments in the world, so the problems we try to solve and the questions we try to answer should be familiar to anyone who knows cricket. To make our plots look nice, let us set a theme for our seaborn (sns) plots and define the size in which we would like to print the plot figures.
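One way to set the theme and figure size is sketched below. The specific style name and the 9 by 5 inch size are illustrative assumptions, not values taken from the original notebook, and `sns.set_theme` requires seaborn 0.11 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Apply a seaborn theme to all subsequent plots (style name is an assumption).
sns.set_theme(style="darkgrid")

# Default size, in inches, for every new figure (values are an assumption).
plt.rcParams["figure.figsize"] = (9, 5)
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped, since the notebook provides its own inline backend.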

This makes sure that the path is stored in a string first, which is then concatenated with the file name to read the input CSV with pandas. To begin humbly, let us check the basic information of the dataset. The final level of this basic information retrieval is to see a couple of actual rows of the input dataset. Now, with a basic understanding of the input dataset, we are prompted to answer our questions with basic data analysis.
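The loading and first-look steps described above might be sketched like this. The file name `matches.csv`, the column names, and the helper `load_matches` are hypothetical, and a tiny temporary file stands in for the real IPL data; `os.path.join` is used here as a safer equivalent of plain string concatenation.

```python
import os
import tempfile

import pandas as pd


def load_matches(data_dir: str, file_name: str) -> pd.DataFrame:
    """Join the directory path and file name, then read the CSV."""
    path = os.path.join(data_dir, file_name)
    return pd.read_csv(path)


# Write a tiny stand-in CSV (NOT real IPL data) so the sketch is runnable.
with tempfile.TemporaryDirectory() as data_dir:
    with open(os.path.join(data_dir, "matches.csv"), "w") as handle:
        handle.write("id,winner,win_by_runs\n1,Team A,35\n2,Team B,0\n")
    matches = load_matches(data_dir, "matches.csv")

# Basic information: column types and non-null counts, then the shape.
matches.info()
print(matches.shape)

# A couple of actual rows.
print(matches.head(2))
```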

So getting the number of matches in our dataset is the same as getting the number of rows in the dataset, or the maximum value of the variable id.
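Both ways of counting matches can be sketched on a toy table. The data below is invented, and the two counts agree only under the assumption that `id` runs from 1 to the number of rows without gaps.

```python
import pandas as pd

# Toy stand-in for the matches table (NOT real IPL data).
matches = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "winner": ["Team A", "Team B", "Team A", "Team C"],
})

# Option 1: one row per match, so count the rows.
matches_by_rows = matches.shape[0]

# Option 2: ids are assumed to be consecutive, so take the largest one.
matches_by_id = int(matches["id"].max())
```

If rows have been filtered out or ids skip values, the row count is the more reliable of the two.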

Introduction to Data Analysis in Python with IPL Dataset

To answer this question, we can divide it logically: first we need to find the maximum runs, then we can find the row (winning team) with this maximum runs, which would indeed be the team that won by maximum runs.
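The two-step lookup described above (find the maximum, then find its row) can be sketched as follows. The column names `winner` and `win_by_runs` are assumptions about the dataset's schema, and the rows are invented.

```python
import pandas as pd

# Toy stand-in for the matches table (NOT real IPL data).
matches = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "winner": ["Team A", "Team B", "Team A", "Team C"],
    "win_by_runs": [35, 0, 146, 0],
})

# Step 1: the largest winning margin in runs.
max_runs = int(matches["win_by_runs"].max())

# Step 2: the row holding that margin; idxmax returns its index label.
biggest_win = matches.loc[matches["win_by_runs"].idxmax()]
team_with_biggest_win = biggest_win["winner"]
```

Here `max_runs` is 146 and `team_with_biggest_win` is `"Team A"`.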

Each match is won either by runs (by the team that batted first) or by wickets (by the chasing team), so the other margin is recorded as 0. Hence, the minimum win by runs will always be 0, and the minimum win by wickets will also always be 0 in a tournament. To overcome this caveat, we just have to apply a simple workaround.
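One plausible form of that workaround, offered here as an assumption rather than the author's exact code, is to drop the zero margins before taking the minimum. Column names and rows are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the matches table (NOT real IPL data).
matches = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team B"],
    "win_by_runs": [35, 0, 146, 0, 3],
    "win_by_wickets": [0, 6, 0, 9, 0],
})

# The naive minimum is 0, because matches won by wickets store 0 runs.
naive_min = int(matches["win_by_runs"].min())

# Workaround: keep only matches that were actually won by runs.
won_by_runs = matches[matches["win_by_runs"] > 0]
narrowest_win = won_by_runs.loc[won_by_runs["win_by_runs"].idxmin()]
```

The same filter on `win_by_wickets` gives the narrowest win by wickets.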

To advance further in our quest to understand the process of data analysis in Python, let us answer further questions with data visualization. The most successful IPL team is the team that has won the most number of times. For those who follow IPL, you might be wondering about the irony now. Having solved those not-so-tough questions above, we are now here to extract a critical insight: has winning the toss actually helped in winning the match?
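Both questions, the most successful team and the effect of the toss, reduce to simple counting. The sketch below uses invented rows and assumes columns named `winner` and `toss_winner`; it computes the numbers only and leaves the plotting aside.

```python
import pandas as pd

# Toy stand-in for the matches table (NOT real IPL data).
matches = pd.DataFrame({
    "toss_winner": ["Team A", "Team B", "Team C", "Team A", "Team B"],
    "winner":      ["Team A", "Team A", "Team C", "Team A", "Team C"],
})

# Wins per team, sorted from most to fewest.
wins_per_team = matches["winner"].value_counts()
most_successful_team = wins_per_team.idxmax()

# True where the team that won the toss also won the match.
toss_helped = matches["toss_winner"] == matches["winner"]
matches_won_with_toss = int(toss_helped.sum())
share_won_with_toss = float(toss_helped.mean())
```

A bar chart of `wins_per_team` or of the counts in `toss_helped` would give the visual version of the same answers.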

Before visualizing the outcome, let us first see how the numbers look.

Hope this post helps you in starting your journey of data analysis in Python.

The complete code used here is available on my GitHub.

Abdul Majed Raja



About the Author: Dizahn
