> setwd('D:/Training_Material/Book/Files')
> getwd()
[1] "D:/Training_Material/Book/Files"
> Agedf <- read.csv("PlayersAge.csv", header = TRUE)
> AgePlr <- as.numeric(Agedf$Age)
> library(MASS)
> fitdistr(AgePlr,"Normal")
      mean          sd
  28.9014025    4.6278804
 ( 0.1733155) ( 0.1225526)
> a.teo<-rnorm(n=713,mean=29,sd=4.6)
> qqplot(AgePlr, a.teo, main="QQ-plot for Normal distribution")
> abline(0,1)
From the Q-Q plot it can be observed that the ages of the cricketers approximately follow a normal distribution.
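As a complementary check, base R can draw the sample directly against theoretical normal quantiles, without simulating a comparison sample. The following is only a sketch: the vector `ages` is simulated placeholder data standing in for `AgePlr`, since the CSV file is not available here.

```r
# Placeholder data (an assumption): simulated ages standing in for AgePlr
set.seed(1)
ages <- rnorm(713, mean = 29, sd = 4.6)

# Q-Q plot of the sample against theoretical normal quantiles;
# qqnorm() also returns the plotted coordinates invisibly
qq <- qqnorm(ages, main = "Normal Q-Q plot of player ages")

# Reference line through the first and third quartiles
qqline(ages)
```

With the book's data, `ages` would simply be replaced by `AgePlr`. Points lying close to the reference line support the normality assumption.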
Lower Tail Test of Population Mean with Known Variance
Problem: Suppose it is claimed that the mean age of a player is more than 32. In a sample of 713 players, the mean age is found to be 29. Assume the population standard deviation is 5. At the .05 significance level, can we reject the claim that the average age is more than 32?
Solution: The null hypothesis is that mu >= 32.
> xbar = 29            # sample mean
> mu0 = 32             # hypothesized value
> sigma = 5            # population standard deviation
> n = 713              # sample size
> z = (xbar-mu0)/(sigma/sqrt(n))
> z                    # test statistic
[1] -16.02124
We then compute the critical value at the .05 significance level.
> alpha = .05
> z.alpha = qnorm(1-alpha)
> -z.alpha             # critical value
[1] -1.644854
The test statistic -16.02124 is less than the critical value of -1.6449. Hence, at the .05 significance level, we reject the claim that the "mean age of a player is more than 32".
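Alternatively, instead of using the critical value, we can compute the lower tail p-value of the test statistic. As it is less than the .05 significance level, we reject the null hypothesis.
> pval = pnorm(z)
> pval                 # lower tail p-value
[1] 4.541344e-58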
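The steps of this test can be collected into a small helper. The following is a minimal sketch rather than part of the original session: the function name `z_test` and its `alternative` argument are illustrative names chosen here, and the sketch assumes the population standard deviation is known.

```r
# Hypothetical helper (not from the original session): one-sample z-test
# for a population mean when the population standard deviation is known.
z_test <- function(xbar, mu0, sigma, n, alpha = 0.05,
                   alternative = c("less", "greater", "two.sided")) {
  alternative <- match.arg(alternative)
  z <- (xbar - mu0) / (sigma / sqrt(n))        # test statistic
  if (alternative == "less") {
    critical <- -qnorm(1 - alpha)              # reject if z < critical
    pval <- pnorm(z)
  } else if (alternative == "greater") {
    critical <- qnorm(1 - alpha)               # reject if z > critical
    pval <- pnorm(z, lower.tail = FALSE)
  } else {
    critical <- c(-1, 1) * qnorm(1 - alpha / 2)  # reject if z is outside
    pval <- 2 * pnorm(-abs(z))
  }
  list(statistic = z, critical = critical, p.value = pval,
       reject = pval < alpha)
}

# Lower tail test with the values used in this section
res <- z_test(xbar = 29, mu0 = 32, sigma = 5, n = 713, alternative = "less")
res$statistic
res$reject
```

Returning the statistic, the critical value and the p-value together makes it easy to confirm that the critical-value rule and the p-value rule lead to the same decision.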
Two-Tailed Test of Population Mean with Known Variance
Problem: Suppose it is claimed that the mean age of a player is 30. In a sample of 713 players, the mean age is found to be 28.9. Assume the population standard deviation is 5. At the .05 significance level, can we reject the claim that the average age is 30?
Solution: The null hypothesis is that mu = 30.
> xbar = 28.9          # sample mean
> mu0 = 30             # hypothesized value
> sigma = 5            # population standard deviation
> n = 713              # sample size
> z = (xbar-mu0)/(sigma/sqrt(n))
> z                    # test statistic
[1] -5.874453
We then compute the critical values at the .05 significance level.
> alpha = .05
> z.half.alpha = qnorm(1-alpha/2)
> c(-z.half.alpha, z.half.alpha)
[1] -1.959964  1.959964
The test statistic -5.874453 is not between the critical values -1.9600 and 1.9600. Hence, at the .05 significance level, we reject the null hypothesis that the "mean age of a player is equal to 30".
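Alternatively, we can compute the two-tailed p-value of the test statistic. It doubles the lower tail p-value, because the sample mean is less than the hypothesized value. As it is less than the .05 significance level, we reject the null hypothesis.
> pval = 2 * pnorm(z)  # lower tail
> pval                 # two-tailed p-value
[1] 4.242414e-09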
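The expression 2 * pnorm(z) gives the two-tailed p-value only because z is negative here. A sketch that works for either sign, and that checks the critical-value and p-value decisions against each other, might look as follows. The variable names `reject_by_critical` and `reject_by_pvalue` are illustrative and not part of the original session.

```r
xbar  <- 28.9   # sample mean
mu0   <- 30     # hypothesized value
sigma <- 5      # population standard deviation
n     <- 713    # sample size
alpha <- 0.05   # significance level

z <- (xbar - mu0) / (sigma / sqrt(n))   # test statistic

# Decision by critical value: reject if |z| exceeds the upper critical value
reject_by_critical <- abs(z) > qnorm(1 - alpha / 2)

# Decision by p-value: -abs(z) always picks the smaller tail,
# so the same line is valid whether z is negative or positive
pval <- 2 * pnorm(-abs(z))
reject_by_pvalue <- pval < alpha

c(critical = reject_by_critical, pvalue = reject_by_pvalue)
```

Both rules are equivalent by construction, so the two logical values always agree.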
Two-Tailed Test of Population Mean with Unknown Variance
Problem: Suppose it is claimed that the mean age of a player is 30, and the population standard deviation is unknown. Using the sample of 713 players, can we reject the claim that the average age is 30 at the .05 significance level?
Solution: The null hypothesis is that mu = 30.
> getwd()
[1] "D:/Training_Material/Book/Files"
> Agedf <- read.csv("PlayersAge.csv", header = TRUE)
> AgePlr <- as.numeric(Agedf$Age)
> library(fBasics)
> basicStats(AgePlr)
                  AgePlr
nobs          713.000000
NAs             0.000000
Minimum        17.200000
Maximum        45.400000
1. Quartile    25.500000
3. Quartile    31.800000
Mean           28.901403
Median         28.600000
Sum         20606.700000
SE Mean         0.173437
LCL Mean       28.560893
UCL Mean       29.241912
Variance       21.447358
Stdev           4.631129
Skewness        0.389330
Kurtosis        0.173905
> xbar = 28.901403     # sample mean
> mu0 = 30             # hypothesized value
> s = 4.631129         # sample standard deviation
> n = 713              # sample size
> t = (xbar-mu0)/(s/sqrt(n))
> t                    # test statistic
[1] -6.334266
> alpha = .05
> t.half.alpha = qt(1-alpha/2, df=n-1)
> c(-t.half.alpha, t.half.alpha)
[1] -1.963301  1.963301
The test statistic -6.334266 is not between the critical values -1.9633 and 1.9633. Hence, at the .05 significance level, we reject the null hypothesis that the "mean age of a player is equal to 30".
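Alternatively, we can compute the two-tailed p-value of the test statistic from the Student t distribution with n-1 degrees of freedom. As it is less than the .05 significance level, we reject the null hypothesis.
> pval = 2 * pt(t, df=n-1)   # lower tail
> pval                       # two-tailed p-value
[1] 4.225286e-10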
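When the raw observations are available, the same test can be run with the built-in `t.test` function from the stats package. The sketch below compares the manual computation with `t.test` on simulated placeholder data: the vector `ages` is an assumption standing in for `AgePlr`, so its numbers will not match the output above.

```r
# Placeholder data (an assumption): simulated ages standing in for AgePlr,
# which is only available from PlayersAge.csv
set.seed(1)
ages <- rnorm(713, mean = 29, sd = 4.6)

mu0 <- 30   # hypothesized value

# Manual computation, following the same steps as the session above
t_manual <- (mean(ages) - mu0) / (sd(ages) / sqrt(length(ages)))
p_manual <- 2 * pt(-abs(t_manual), df = length(ages) - 1)

# Built-in one-sample, two-sided t-test
tt <- t.test(ages, mu = mu0, alternative = "two.sided")

# Both checks should report TRUE
all.equal(unname(tt$statistic), t_manual)
all.equal(tt$p.value, p_manual)
```

With the book's data, the call would be `t.test(AgePlr, mu = 30)`, which also reports a confidence interval for the mean.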