Correlation
Introduction
·
Correlation is a statistical measure of the relationship between two variables.
·
Computing a correlation tells you whether, and how strongly, the two variables are related.
·
The correlation coefficient is generally represented by the symbol r and ranges from -1 to +1.
·
When the coefficient is close to -1, there is a strong negative relationship between the two variables.
·
When the coefficient is close to +1, there is a strong positive relationship between the two variables.
Scatter Diagram
·
A scatter diagram is used to examine the relationship between two variables by plotting one on the X axis and the other on the Y axis.
·
If all the points fall on one straight line, the correlation is perfect.
·
If the points are scattered widely, the correlation is low.
·
If the points lie on or near a straight line, the correlation is linear.
Karl Pearson’s Coefficient
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
·
A method that gives a numerical measure of the strength of the relationship between linearly related variables.
·
When the correlation coefficient is +1, one variable increases in exact proportion as the other increases, much as shoe size increases with the length of the foot.
·
When the correlation coefficient is -1, one variable decreases in exact proportion as the other increases, much as the quantity of gas in a gas tank decreases as the car is driven.
·
When the correlation coefficient is 0, there is no linear relationship between the two variables: neither tends to increase or decrease as the other increases.
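The Pearson formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the sample data (foot lengths, shoe sizes, fuel readings) are assumptions made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from the raw-sums formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical, exactly linear data: size = 1.5 * length + 2
foot_length_cm = [22, 24, 26, 28, 30]
shoe_size = [35, 38, 41, 44, 47]
print(pearson_r(foot_length_cm, shoe_size))   # perfect positive: 1.0

# Hypothetical fuel readings that fall steadily as distance grows
distance_km = [0, 100, 200, 300]
fuel_litres = [40, 30, 20, 10]
print(pearson_r(distance_km, fuel_litres))    # perfect negative: -1.0
```

Real measurements rarely fall exactly on a line, so values strictly between -1 and +1 are the normal case.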
Spearman’s Rank Correlation Coefficient
$$R_s = 1 - \frac{6\sum d^2}{n^3 - n}$$
where
Rs = Spearman's rank correlation coefficient
d = difference between the two ranks of each observation
n = number of observations
The Spearman's Rank Correlation Coefficient is used to
discover the strength of a link between two sets of data. This example looks at
the strength of the link between the price of a convenience item (a 50cl bottle
of water) and distance from the Contemporary Art Museum in El Raval, Barcelona.
Example: The hypothesis tested is that prices should decrease with distance from the key area of gentrification surrounding the Contemporary Art Museum. The line followed is Transect 2 in the map below, with continuous sampling of the price of a 50cl bottle of water at every convenience store.
Spearman's Rank correlation coefficient is a technique which
can be used to summarise the strength and direction (negative or positive) of a
relationship between two variables.
The result will always be between 1 and minus 1.
Method - calculating the coefficient
·
Create a table from your data.
·
Rank the two data sets. Ranking is achieved by giving the rank '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column will get the lowest ranking (the largest rank number). This should be done for both sets of measurements.
·
Tied scores are given the mean (average) rank. For example, the three tied scores of 1 euro in the example below are ranked fifth in order of price, but occupy three positions (fifth, sixth and seventh) in a ranking hierarchy of ten. The mean rank in this case is calculated as (5+6+7) ÷ 3 = 6.
·
Find the difference in the ranks (d): this is the difference between the ranks of the two values on each row of the table. The rank of the second value (price) is subtracted from the rank of the first (distance from the museum).
·
Square the differences (d²) to remove negative values, and then sum them (∑d²).
Convenience Store | Distance from CAM (m) | Rank distance | Price of 50cl bottle (€) | Rank price | Difference between ranks (d) | d²
1 | 50 | 10 | 1.80 | 2 | 8 | 64
2 | 175 | 9 | 1.20 | 3.5 | 5.5 | 30.25
3 | 270 | 8 | 2.00 | 1 | 7 | 49
4 | 375 | 7 | 1.00 | 6 | 1 | 1
5 | 425 | 6 | 1.00 | 6 | 0 | 0
6 | 580 | 5 | 1.20 | 3.5 | 1.5 | 2.25
7 | 710 | 4 | 0.80 | 9 | -5 | 25
8 | 790 | 3 | 0.60 | 10 | -7 | 49
9 | 890 | 2 | 1.00 | 6 | -4 | 16
10 | 980 | 1 | 0.85 | 8 | -7 | 49
·
Calculate the coefficient (Rs) using the formula Rs = 1 - (6∑d²) / (n³ - n). The answer will always be between 1.0 (a perfect positive correlation) and -1.0 (a perfect negative correlation).
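The ranking rule from the method, including mean ranks for tied scores, can be sketched in Python as follows. The function name `mean_ranks_descending` is an assumption made for this illustration; the prices and distances are taken from the table.

```python
def mean_ranks_descending(values):
    """Rank values so the largest gets rank 1; ties share the mean of their positions."""
    # Indices ordered from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # The group occupies 1-based positions pos+1 .. end+1; use their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Columns from the table, in convenience-store order 1..10
distances = [50, 175, 270, 375, 425, 580, 710, 790, 890, 980]
prices = [1.80, 1.20, 2.00, 1.00, 1.00, 1.20, 0.80, 0.60, 1.00, 0.85]

print(mean_ranks_descending(distances))
# [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(mean_ranks_descending(prices))
# [2.0, 3.5, 1.0, 6.0, 6.0, 3.5, 9.0, 10.0, 6.0, 8.0]
```

The three 1.00 euro prices occupy positions five to seven and so each receives rank 6, matching the "Rank price" column.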
Now to put all these values into the formula.
·
Find the sum of all the d² values by adding up all the values in the d² column. In our example this is 285.5. Multiplying this by 6 gives 1713.
·
Now for the bottom line of the equation. The value n is the number of sites at which you took measurements. In our example this is 10. Substituting this value into n³ - n we get 1000 - 10 = 990.
·
We now have the formula: Rs = 1 - (1713/990), which gives a value for Rs:
1 - 1.73 = -0.73
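The arithmetic above can be checked with a short Python sketch. The function name `spearman_rs` is an assumption for this example; the rank columns are copied from the table. Note that this shortcut formula is exact only when there are no tied ranks; with ties, as here, it is an approximation.

```python
def spearman_rs(rank_a, rank_b):
    """Spearman's Rs from two rank columns: 1 - 6 * sum(d^2) / (n^3 - n)."""
    if len(rank_a) != len(rank_b) or len(rank_a) < 2:
        raise ValueError("need two equal-length rank columns with at least 2 rows")
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


# Rank columns from the table (distance rank minus price rank gives d)
rank_distance = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
rank_price = [2, 3.5, 1, 6, 6, 3.5, 9, 10, 6, 8]

sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_distance, rank_price))
print(sum_d2)                                          # 285.5
print(round(spearman_rs(rank_distance, rank_price), 2))  # -0.73
```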
What does this Rs value of -0.73 mean?
The closer Rs is to +1 or -1, the stronger the likely correlation. A perfect positive correlation is +1 and a perfect negative correlation is -1. The Rs value of -0.73 suggests a fairly strong negative relationship, which is consistent with the hypothesis that prices decrease with distance from the museum.