Qnt 351 - Deficit and Debt - Summarizing and Presenting Data
Author: sochoa • March 3, 2016 • Presentation or Speech
Summarizing and Presenting Data
Michael Robinson, Jessie Meeker, Stephen Ochoa, Alex Morrow
QNT/351
January 14, 2016
Peter Miedzinski
Summarizing and Presenting Data
To further our analysis of the data collected for the 2008 MLB season, we focus on descriptive statistics. As mentioned earlier, we discovered that attendance has the strongest correlation with salary. Using MegaStat and the data from Appendix A.2 of the textbook Basic Statistics for Business and Economics, 7th edition (Lind et al., 2011), we determined the mean, median, mode, and standard deviation. We then used histograms, normal curve plots, box plots, and scatterplots to create graphical displays of the data for further analysis and presentation.
To define the scope of the salary numbers for the league, we use descriptive statistics: the mean, median, mode, and standard deviation. For the 2008 season, the league has a mean, or average, team salary of 89.913 million. This average is pulled upward by an outlier. Salaries range from 21.8 to 209.1 million, and the salary of 209.1 million is the outlier for the league. The next-highest salary is 138.7 million, so the salaries of all the remaining teams fall between 21.8 and 138.7 million. The net effect of the outlier is to skew the distribution to the right, which places the mean above the median. The median fell on exactly 80 million because, with 30 teams, it is the average of the two central data points, 79 and 81. The median is informative because comparing it with the higher mean of 89.913 million reveals the right skew that the outlier causes. The data has no mode because no value is repeated: every team has a different salary. The standard deviation of the salaries is 37.843 million. Standard deviation is a numerical measure of how spread out the data is around the mean. With a range of 187.3 million and only 30 data points, the data is spread over a relatively wide interval. Again, the outlier inflates the standard deviation because it lies so far from the salaries of all the other teams.
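The calculations described above can be sketched in a few lines of Python. This is only an illustration, not the MegaStat procedure we used: the function name describe and the six-value sample are hypothetical, the sample is not the 30-team data set, and the sketch assumes the sample standard deviation (dividing by n - 1). The sample was chosen so that its two central values are 79 and 81, mirroring how the median of 80 arises.

```python
from collections import Counter
from statistics import mean, median, stdev


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    counts = Counter(values)
    top = max(counts.values())
    # Report a mode only when at least one value actually repeats.
    modes = sorted(v for v, c in counts.items() if c == top) if top > 1 else []
    return {
        "mean": mean(values),
        "median": median(values),  # averages the two central values when n is even
        "modes": modes,            # empty list means "no mode"
        "stdev": stdev(values),    # sample standard deviation (n - 1); an assumption
        "range": max(values) - min(values),
    }


# Hypothetical six-value sample in millions, NOT the real 30-team data.
# Only the minimum, the maximum, and the two central values echo the text.
sample = [21.8, 50.0, 79.0, 81.0, 138.7, 209.1]
summary = describe(sample)

for name, value in summary.items():
    print(name, value)

# A mean above the median is the signature of a right-skewed distribution.
print("right skewed:", summary["mean"] > summary["median"])
```

With an even number of observations, as with the 30 teams, the median is the average of the two middle values, which is why it can land on a number that no team actually has.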
We found a strong positive correlation between attendance and salary. The correlation matrix we ran showed a correlation of +0.837 between attendance and team salary. The strength of this correlation supports the reasonable conclusion that salary is directly related to attendance; because attendance generates revenue, salary is likely related to revenue as well. Interestingly, the correlation held true even in the case of the outlier. Although the outlier's salary was greater than that of any other team, its attendance was also the highest in the league, which offers a not entirely unreasonable justification for that salary. That being said, it is reasonable to presume that increasing attendance, or the revenue generated from attendance, will result in increased salary. This correlation raises several questions. What produces attendance? Are there other influential factors that our limited data set does not reveal? What part, if any, does player performance play in determining team salary? These questions demonstrate a need for further research into the factors that influence attendance, including win/loss record and individual player performance.
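For readers who want to see how a single entry of a correlation matrix is computed, the following is a minimal sketch of the Pearson correlation coefficient. The function name pearson_r and the attendance and salary values are hypothetical placeholders, not figures from the data set, so the result will not reproduce the +0.837 reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical placeholder values (both in millions), NOT the real team data.
attendance = [1.5, 2.0, 2.5, 3.0, 4.0]
salary = [40.0, 55.0, 80.0, 95.0, 150.0]

r = pearson_r(attendance, salary)
print(round(r, 3))
```

A coefficient near +1 means the two variables rise together; it does not by itself show which one drives the other, which is why the questions raised above call for further research.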
...