GSEB Class 12 Statistics Solutions Chapter 2 Linear Correlation Exercise 2

Get the most accurate GSEB Solutions for Class 12 Statistics Chapter 02 Linear Correlation here. Updated for the 2026-27 academic session, these solutions are based on the latest GSEB textbooks for Class 12 Statistics. Our expert-created answers for Class 12 Statistics are available for free download in PDF format.

Detailed Chapter 02 Linear Correlation GSEB Solutions for Class 12 Statistics

For Class 12 students, solving GSEB textbook questions is the most effective way to build a strong conceptual foundation. Our Class 12 Statistics solutions follow a detailed, step-by-step approach to ensure you understand the logic behind every answer. Practicing these Chapter 02 Linear Correlation solutions will improve your exam performance.

Section A

Answer the following questions by selecting a correct option from the given options:

 

Question 1. In context with correlation, what do you call the graph, if the points of paired observations (x, y) are shown in a graph?
(a) Histogram
(b) Circle diagram
(c) Scatter diagram
(d) Frequency curve
Answer: (c) Scatter diagram
In simple words: When you plot two sets of data points, like X and Y, on a graph to see their relationship, that graph is called a scatter diagram.

🎯 Exam Tip: Understanding the visual representation of data through a scatter diagram is key for quickly identifying correlation types. Always know the basic definitions.

 

Question 2. Which kind of correlation exists if the following scatter diagram is of two variables X and Y?
ℹ️ Diagram Explanation: This figure shows a scatter diagram of perfect positive correlation. All the points lie on a straight line that rises from left to right, which indicates a perfect positive relationship between the variables X and Y.
(a) Perfect Positive Correlation
(b) Partial Positive Correlation
(c) Perfect Negative Correlation
(d) Partial Negative Correlation
Answer: (a) Perfect Positive Correlation
In simple words: When all the data points on a graph fall perfectly on a straight line that goes upwards from left to right, it shows a perfect positive relationship between the two variables.

🎯 Exam Tip: For perfect correlations, points align perfectly on a straight line. Upward slope means positive, downward means negative. This helps score well in graphical analysis questions.

 

Question 3. Which kind of correlation exists if the following scatter diagram is of two variables X and Y?
ℹ️ Diagram Explanation: This figure shows a scatter diagram of partial negative correlation. The points do not lie on a straight line, but they are scattered around a line that falls from left to right, which indicates a partial negative relationship between the variables X and Y.
(a) Perfect Positive Correlation
(b) Partial Positive Correlation
(c) Perfect Negative Correlation
(d) Partial Negative Correlation
Answer: (d) Partial Negative Correlation
In simple words: If data points are spread out but generally follow a downward trend from left to right, it means there is a partial negative correlation, indicating an inverse relationship that is not perfectly linear.

🎯 Exam Tip: Partial correlations show a general trend but with some scattering of points. Negative means one variable increases as the other decreases. Differentiate between 'perfect' and 'partial' for accurate answers.

 

Question 4. What is the value of r, if all the points plotted in a scatter diagram lie on a single line only?
(a) 0
(b) 1 or - 1
(c) 0.5
(d) - 0.5
Answer: (b) 1 or - 1
In simple words: If all the data points form a perfectly straight line on a graph, the correlation coefficient 'r' will be either 1 (for a perfect upward line) or -1 (for a perfect downward line).

🎯 Exam Tip: A value of 'r' exactly equal to 1 or -1 signifies a perfect linear relationship, with no scatter around the line. This is a fundamental concept for correlation coefficient interpretation.

 

Question 5. What is the range of the correlation coefficient r?
(a) - 1 < r < 1
(b) 0 to 1
(c) - 1 ≤ r ≤ 1
(d) - 1 to 0
Answer: (c) - 1 ≤ r ≤ 1
In simple words: The correlation coefficient 'r' always falls between -1 and 1, including -1 and 1 themselves, showing the strength and direction of a linear relationship.

🎯 Exam Tip: Remembering the exact range of the correlation coefficient \((r)\) is crucial. Values outside this range indicate a calculation error. This is a basic knowledge point for all correlation problems.

 

Question 6. The measurement unit of a variable 'Weight' is kg and that of 'Height' is cm. What can you say about the measurement unit of the correlation coefficient between them?
(a) kg
(b) cm
(c) km
(d) does not have any unit
Answer: (d) does not have any unit
In simple words: The correlation coefficient is a pure number that shows how two variables relate, so it does not have any unit of measurement like kg or cm.

🎯 Exam Tip: The correlation coefficient is a dimensionless quantity. This means its value is not affected by the units in which the variables are measured, making it universally applicable.

 

Question 7. Which kind of correlation is obtained if the two variables vary in opposite directions in constant proportion?
(a) Partial Positive Correlation
(b) Perfect Negative Correlation
(c) Perfect Positive Correlation
(d) Partial Negative Correlation
Answer: (b) Perfect Negative Correlation
In simple words: When two variables change in exactly opposite directions and in a fixed proportion, so that every increase in one is matched by a proportional decrease in the other, they have a perfect negative correlation.

🎯 Exam Tip: "Constant proportion" implies linearity and "opposite direction" implies negativity. Combining these leads to "perfect negative correlation," a concept often tested.

 

Question 8. What does the numerator indicate in the formula for calculating the correlation coefficient by Karl Pearson's method?
(a) Product of variance of X and Y
(b) Covariance of X and Y
(c) Variance of X
(d) Variance of Y
Answer: (b) Covariance of X and Y
In simple words: In Karl Pearson's formula for correlation, the top part (numerator) shows how much X and Y change together, which is called their covariance.

🎯 Exam Tip: The numerator in Karl Pearson's correlation formula specifically represents the covariance, which measures the directional relationship between two variables. Knowing the components of the formula is vital.
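The structure of the formula can be sketched in Python. This is a minimal illustration, assuming the population (divide by n) definitions of covariance and standard deviation; the function name `pearson_r` and the data values are made up for the example and do not come from the textbook.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r written as covariance / (S_x * S_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator of the formula: covariance of X and Y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Denominator of the formula: product of the two standard deviations
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov_xy / (s_x * s_y)

# Illustrative data (hypothetical values)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

Because the standard deviations in the denominator are always positive, the sign of r is decided entirely by the covariance in the numerator.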

 

Question 9. Which of the following values is not possible as a value of r?
(a) 0.99
(b) - 1.07
(c) -0.85
(d) 0
Answer: (b) - 1.07
In simple words: Since the correlation coefficient 'r' must always be between -1 and 1, a value like -1.07 is impossible because it falls outside this allowed range.

🎯 Exam Tip: Any calculated value of 'r' outside the \([-1, 1]\) interval indicates an error in calculation. This is a quick check to validate your answers.

 

Question 10. If \(u = \frac{x-A}{C_{x}}\) and \(v = \frac{y-B}{C_{y}}\), \(C_x > 0\), \(C_y > 0\), which of the following statements is correct?
(a) r (x, y) \(\neq\) r (u, v)
(b) r (x, y) > r (u, v)
(c) r(x, y) = r (u, v)
(d) r (x, y) < r(u, v)
Answer: (c) r(x, y) = r (u, v)
In simple words: The correlation coefficient stays the same even if you shift (add or subtract a constant) or scale (multiply or divide by a constant) the variables, as long as the scaling constant is positive.

🎯 Exam Tip: The correlation coefficient is independent of the change of origin and scale (provided the scale factor is positive). This property is fundamental for simplifying calculations in complex problems.
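The property can be checked numerically. The sketch below is only an illustration: the data and the constants A, B, C_x, C_y are hypothetical, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical observations
xs = [110, 120, 130, 140, 150]
ys = [1005, 1015, 1010, 1025, 1030]

# Change of origin (A, B) and scale (C_x, C_y), both scale constants positive
A, C_x = 130, 10
B, C_y = 1010, 5
us = [(x - A) / C_x for x in xs]
vs = [(y - B) / C_y for y in ys]

r_xy = pearson_r(xs, ys)
r_uv = pearson_r(us, vs)
print(r_xy, r_uv)
```

The two printed values agree up to floating-point rounding, which is why the smaller numbers u and v are often used to shorten hand calculations.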

 

Question 11. If r(x, y) = 0.7, then what is the value of r (x + 0.2, y+ 0.2)?
(a) 0.7
(b) 0.9
(c) 1.1
(d) - 0.7
Answer: (a) 0.7
In simple words: Adding a constant to both variables (x and y) does not change their correlation coefficient because their relationship and spread remain the same.

🎯 Exam Tip: This question tests the property of independence from change of origin. Adding a constant to variables shifts their origin but doesn't affect their correlation, which is crucial for quick problem-solving.

 

Question 12. If r(-x, y) = - 0.5, then what is the value of r(x, -y)?
(a) 0.5
(b) - 0.5
(d) 0
Answer: (b) - 0.5
In simple words: Changing the sign of one variable flips the sign of the correlation coefficient. Since r(-x, y) = -0.5, we get r(x, y) = 0.5. Changing the sign of y instead of x flips the sign in the same way, so r(x, -y) = -r(x, y) = -0.5.

🎯 Exam Tip: Remember how sign changes affect 'r': changing the sign of one variable changes the sign of 'r', while changing the sign of both variables leaves 'r' unchanged. Here r(-x, y) and r(x, -y) each reverse exactly one variable, so both equal -r(x, y) = -0.5.
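The sign rules can be verified with a short Python sketch. The data are hypothetical and `pearson_r` is a helper defined only for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
neg_xs = [-x for x in xs]
neg_ys = [-y for y in ys]

r_xy = pearson_r(xs, ys)              # original coefficient
r_negx_y = pearson_r(neg_xs, ys)      # sign of x reversed
r_x_negy = pearson_r(xs, neg_ys)      # sign of y reversed
r_negx_negy = pearson_r(neg_xs, neg_ys)  # both signs reversed
```

Reversing one variable gives `-r_xy` in both cases, and reversing both gives `r_xy` again.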

 

Question 13. What is the value of the rank correlation coefficient if \(\Sigma d^2 = 0\)?
(a) 0
(b) - 1
(c) 1
(d) 0.5
Answer: (c) 1
In simple words: If the sum of squared differences in ranks (\(\Sigma d^2\)) is zero, it means there are no differences in ranks, showing a perfect agreement and thus a rank correlation coefficient of 1.

🎯 Exam Tip: A sum of squared differences of ranks equal to zero means perfect positive rank correlation. This is a direct application of Spearman's rank correlation formula.

 

Question 14. In the method of rank correlation, in usual notations if \(R_x = R_y\) for each pair of observations, then what is the value of the r?
(a) 0
(b) - 1
(c) 1
(d) 0.1
Answer: (c) 1
In simple words: If the ranks for each pair of observations are exactly the same, it means there is perfect agreement, leading to a rank correlation coefficient of 1.

🎯 Exam Tip: When \(R_x = R_y\), the difference in ranks (\(d\)) is always 0, resulting in \(\Sigma d^2 = 0\). Plugging this into the formula gives a rank correlation coefficient of 1, indicating perfect positive rank correlation.

 

Question 15. In the method of rank correlation, what is the sum of differences of the ranks of two variables?
(a) 0
(b) - 1
(c) 1
(d) Any real number
Answer: (a) 0
In simple words: When calculating rank correlation, the total sum of the differences between the ranks of two variables will always add up to zero.

🎯 Exam Tip: This is a key property in rank correlation: \(\Sigma d = 0\). This serves as a useful check during manual calculations, ensuring accuracy before proceeding to calculate \(\Sigma d^2\).
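The check \(\Sigma d = 0\) can be sketched in Python. The marks below are hypothetical, the helper `ranks` assumes there are no repeated values, and rank 1 is given to the largest value (ranking from the smallest value would work equally well).

```python
def ranks(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

# Hypothetical marks of five students in two subjects
marks_a = [78, 65, 90, 52, 70]
marks_b = [60, 72, 85, 40, 66]

rx = ranks(marks_a)
ry = ranks(marks_b)
d = [a - b for a, b in zip(rx, ry)]   # rank differences
sum_d = sum(d)                         # should be 0 as a check
sum_d2 = sum(v ** 2 for v in d)        # used in the rank correlation formula
print(rx, ry, d, sum_d, sum_d2)
```

Both rank lists are arrangements of 1 to n, so each has the same total and the differences must cancel.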

 

Question 16. In the method of rank correlation, if the ranks of two variables are exactly in reverse order, then what is the value of r?
(a) r = 0
(b) r = - 1
(c) r = 1
(d) r = 0.1
Answer: (b) r = - 1
In simple words: If the ranks of two variables are completely opposite for every item, it shows a perfect disagreement, meaning the rank correlation coefficient is -1.

🎯 Exam Tip: "Exactly in reverse order" means one variable's ranks are perfectly inverse to the other's. This directly translates to a perfect negative rank correlation, or \(r = -1\).
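The two extreme cases of the rank correlation formula, identical ranks and exactly reversed ranks, can be sketched as follows. The function name `spearman_from_ranks` is an illustrative choice, and the formula used is the no-ties version \(r = 1 - \frac{6 \Sigma d^{2}}{n(n^{2}-1)}\).

```python
def spearman_from_ranks(rx, ry):
    """Spearman's rank correlation for ranks without ties."""
    n = len(rx)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

rx = [1, 2, 3, 4, 5]

# Same ranks for every pair: every d is 0, so the sum of d^2 is 0
r_same = spearman_from_ranks(rx, rx)

# Ranks in exactly reverse order: 5, 4, 3, 2, 1
r_reverse = spearman_from_ranks(rx, rx[::-1])
print(r_same, r_reverse)
```

For the reversed case with n = 5, \(\Sigma d^2 = 40\), so \(r = 1 - \frac{240}{120} = -1\).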

 

Question 17. In usual notations, which term is added in \(\Sigma d^2\) for each repeated observation in the rank correlation?
(a) \(\frac{m^{2}-1}{12}\)
(b) \(\frac{m^{3}-m}{12}\)
(c) \(\frac{6 m^{3}-m}{12}\)
(d) \(n(n^2 - 1)\)
Answer: (b) \(\frac{m^{3}-m}{12}\)
In simple words: When some observations have the same rank (ties), a correction factor, which is calculated as \(\frac{m^{3}-m}{12}\) for each tie, is added to \(\Sigma d^2\) to adjust the rank correlation formula.

🎯 Exam Tip: The correction factor for tied ranks is an important adjustment for Spearman's rank correlation. Be sure to correctly identify \(m\) (the number of times an observation is repeated) for each tie to apply this term accurately.
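A minimal Python sketch of the tie correction is given below. The data are hypothetical, the helper names are illustrative, and tied observations are assumed to receive the average of the ranks they would otherwise occupy.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 for the largest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(ordered, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    cf = tie_correction(xs) + tie_correction(ys)  # added to the sum of d^2
    return 1 - 6 * (sum_d2 + cf) / (n * (n ** 2 - 1))

# Hypothetical data: the value 60 is repeated twice in xs (m = 2)
xs = [50, 60, 60, 70, 80]
ys = [12, 15, 14, 18, 20]
cf_x = tie_correction(xs)
r = spearman_with_ties(xs, ys)
print(average_ranks(xs), cf_x, r)
```

Here the two tied values share the rank 3.5, and the single tie with m = 2 contributes \(\frac{2^3 - 2}{12} = 0.5\) to \(\Sigma d^2\).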

 

Question 18. Which kind of correlation will you get between the number of units sold and its revenue at constant price?
(a) Perfect Positive
(b) Partial Positive
(c) Perfect Negative
(d) Partial Negative
Answer: (a) Perfect Positive
In simple words: If the selling price stays the same, selling more units will always bring in more money, showing a perfect positive relationship between units sold and total income.

🎯 Exam Tip: "Constant price" is a critical condition here. It implies a direct and perfectly proportional relationship between sales volume and revenue, leading to a perfect positive correlation.

Section B

Answer the following questions in one sentence:

 

Question 1. Define correlation.
Answer: Correlation exists between variables when changes in their values happen together because of a direct or indirect cause and effect connection.
In simple words: Correlation means two things change together, either because one causes the other or they are linked in some way.

🎯 Exam Tip: A concise definition of correlation should highlight the simultaneous change between two related variables and the underlying cause-effect relationship (direct or indirect).

 

Question 2. Define correlation coefficient.
Answer: The correlation coefficient is a number that shows how strong and in what direction a straight-line relationship is between two variables.
In simple words: It's a number that tells you how strongly two things are linked in a straight line and if they move in the same or opposite ways.

🎯 Exam Tip: Emphasize "numerical measure," "strength or degree," and "linear correlation" in the definition. These are key terms for a complete understanding.

 

Identify, whether there is a positive correlation or negative correlation between the following pairs of variables (Question 3 to Question 6).

 

Question 3. The age of an adult person and life insurance premium at the time of taking an insurance under a plan.
Answer: There is a 'Positive correlation' between an adult's age and the life insurance premium paid when taking out a plan.
In simple words: As people get older, the cost of their life insurance usually goes up, showing a positive link.

🎯 Exam Tip: When both variables increase or decrease together, it's a positive correlation. This type of reasoning is essential for real-world application questions.

 

Question 4. The sales and profit of last five years for a mostly accepted product of a company.
Answer: There is a 'Positive correlation' between the sales and profit of a company's popular product over the past five years.
In simple words: When a company sells more of a popular product, it usually earns more profit, which is a positive relationship.

🎯 Exam Tip: In business, higher sales generally lead to higher profits (assuming costs are managed), illustrating a strong positive correlation. This is a common economic example.

 

Question 5. The rate of inflation and the purchase power of common man of a country when income of the common man is stable.
Answer: There is a 'Negative correlation' between the rate of inflation and the purchasing power of an average person in a country when their income stays the same.
In simple words: If prices go up (inflation) but your money stays the same, you can buy less, showing a negative link.

🎯 Exam Tip: When one variable increases while the other decreases (or vice-versa), it's a negative correlation. Economic examples like inflation and purchasing power are frequently used to demonstrate this.

 

Question 6. Altitude and amount of Oxygen in air.
Answer: There is a 'Negative correlation' between altitude and the amount of oxygen in the air.
In simple words: As you go higher up a mountain (more altitude), there is less oxygen in the air, meaning they are negatively related.

🎯 Exam Tip: This is a classic scientific example of negative correlation. Higher altitude means lower atmospheric pressure, leading to a decrease in oxygen concentration.

 

Question 7. What can be said about the correlation between the annual import of crude oil and the number of marriages during the same time period?
Answer: There is a 'Nonsense correlation' between the annual import of crude oil and the number of marriages during the same time period.
In simple words: Even if two things seem to change together, if there's no logical reason for them to be connected, it's called a nonsense correlation.

🎯 Exam Tip: Nonsense (or spurious) correlation occurs when two variables appear related statistically but have no logical cause-and-effect link. Identifying such correlations is important for avoiding misleading conclusions.

 

Question 8. The correlation coefficient between X and Y is 0.4. What will be the value of correlation coefficient if 5 is added in each observation of X and 10 is subtracted from each observation of Y?
Answer: The correlation coefficient r(X, Y) is 0.4. If 5 is added to each observation of X and 10 is subtracted from each observation of Y, then r(x + 5, y-10) will still be 0.4, because the value of r does not change with a change of origin.
In simple words: Adding or subtracting a constant number from the data points does not change the correlation between two variables; it just shifts the data.

🎯 Exam Tip: The correlation coefficient is invariant to a change of origin (adding or subtracting a constant). This means transformations like \(x' = x + a\) and \(y' = y + b\) do not alter the value of \(r\).

 

Question 9. What is the main limitation of scatter diagram method?
Answer: The primary drawback of the scatter diagram method is that it does not provide an exact measure of the degree of correlation between two variables.
In simple words: A scatter diagram can show you if two things are linked and how, but it can't tell you an exact number for how strong that link is.

🎯 Exam Tip: While scatter diagrams are good for visual inspection of correlation type and rough strength, they lack the precision of numerical methods (like Pearson's or Spearman's coefficient) for quantifying the degree of correlation.

 

Question 10. If the value of \(n(n^2 - 1)\) is six times the value of \(\Sigma d^2\), then what is the value of r?
Answer: Given that \(n(n^2 - 1) = 6\Sigma d^2\). The formula for rank correlation coefficient is \(r = 1 - \frac{6 \Sigma d^{2}}{n(n^{2}-1)}\). When we substitute the given relationship into the formula, we get \(r = 1 - \frac{6 \Sigma d^{2}}{6 \Sigma d^{2}}\), which simplifies to \(r = 1 - 1 = 0\). Therefore, the value of r is '0'.
In simple words: If the numbers in the rank correlation formula cancel each other out in a certain way, it means there is no correlation, and 'r' equals 0.

🎯 Exam Tip: This question tests your understanding of the rank correlation formula and algebraic manipulation. When \(\Sigma d^2 = \frac{n(n^2-1)}{6}\), it results in \(r=0\), indicating no rank correlation.

 

Question 11. What will be the sign of r if the value of the covariance is negative?
Answer: If the covariance is negative, then the sign of the correlation coefficient (r) will also be negative.
In simple words: If two things tend to move in opposite directions, the covariance is negative, and so is their correlation.

🎯 Exam Tip: The sign of the correlation coefficient directly follows the sign of the covariance. A negative covariance implies a negative linear relationship, hence \(r < 0\).

Section C

Answer the following questions as required:

 

Question 1. Explain the meaning of positive correlation with an illustration.
Answer: **Meaning of positive correlation:** If two related variables change in the same direction, meaning if one variable (X) increases, the other variable (Y) also increases, or if X decreases, Y also decreases, then they are said to have a positive correlation.
**Illustration:**
The table below shows how the sale of an item relates to the profit earned:

| Sale of an item (units) | 30 | 35 | 40 | 43 | 45 |
|---|---|---|---|---|---|
| Profit (in '00 Rs.) | 9 | 13 | 16 | 25 | 30 |

From this table, it is clear that as the number of items sold goes up, the profit earned also goes up. This demonstrates a positive correlation between the sale of an item and its profit.
In simple words: When two things move in the same direction—both go up or both go down—they have a positive correlation, like more sales leading to more profit.

🎯 Exam Tip: To illustrate positive correlation effectively, provide a clear, real-world example where an increase in one variable consistently corresponds to an increase in another, and vice-versa. Tabular data enhances the illustration.
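As a rough numerical check of the illustration, the correlation coefficient for the sale and profit figures in the table can be computed in Python. The helper `pearson_r` is written for this example; the textbook answer itself only argues from the direction of change.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Values taken from the table in the illustration
sales = [30, 35, 40, 43, 45]     # units
profit = [9, 13, 16, 25, 30]     # in '00 Rs.

r = pearson_r(sales, profit)
print(round(r, 2))
```

The result is positive and close to 1 but not equal to 1, so the data show a strong partial positive correlation rather than a perfect one.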

 

Question 2. Explain the meaning of negative correlation with an illustration.
Answer: **Meaning of negative correlation:** If two related variables change in opposite directions, meaning if one variable (X) increases, the other variable (Y) decreases, or if X decreases, Y increases, then they are said to have a negative correlation.
**Illustration:**
The table below shows the relationship between monthly expenditure and monthly saving:

| Monthly expenditure (Rs.) | 3000 | 3500 | 3800 | 4000 |
|---|---|---|---|---|
| Monthly saving (Rs.) | 1000 | 800 | 700 | 600 |

From this table, it is clear that as monthly expenditure increases, the monthly saving decreases. This shows a negative correlation between monthly expenditure and monthly saving.
In simple words: When two things move in opposite directions—one goes up while the other goes down—they have a negative correlation, such as spending more leading to saving less.

🎯 Exam Tip: For negative correlation, select an example where an increase in one variable consistently leads to a decrease in the other. Clear tabular data makes the concept easy to grasp and scores well.
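The same kind of check can be made for the expenditure and saving figures. This sketch uses the equivalent computational form \(r = \frac{n \Sigma xy - \Sigma x \Sigma y}{\sqrt{n \Sigma x^2 - (\Sigma x)^2} \sqrt{n \Sigma y^2 - (\Sigma y)^2}}\); the function name is an illustrative choice.

```python
from math import sqrt

def pearson_r_from_sums(xs, ys):
    """Pearson's r using raw sums instead of deviations from the means."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Values taken from the table in the illustration
expenditure = [3000, 3500, 3800, 4000]   # Rs.
saving = [1000, 800, 700, 600]           # Rs.

r = pearson_r_from_sums(expenditure, saving)
print(round(r, 3))
```

The coefficient is negative and very close to -1, which agrees with the observation that saving falls as expenditure rises.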

 

Question 3. Write the assumptions of Karl Pearson's method.
Answer: The assumptions for Karl Pearson's method are:

  • There must be a straight-line relationship (linear correlation) between the two variables.
  • The two variables must have a cause-effect relationship.

In simple words: Karl Pearson's method works best if the two things you're comparing are linked in a straight line, and one causes the other (or affects it in some way).

🎯 Exam Tip: Listing these two core assumptions is crucial when discussing Karl Pearson's method. Without them, the calculated coefficient may not accurately reflect the relationship between variables.

 

Question 4. Define: Scatter Diagram
Answer: A scatter diagram is a graph created by plotting 'n' ordered pairs \( (X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n) \) of two variables, X and Y, using a suitable scale, where X values are shown on the X-axis and Y values on the Y-axis. The resulting graph of these plotted points is called a scatter diagram.
In simple words: A scatter diagram is a graph where you put dots for each pair of data points (like X and Y) to see how they spread out and relate to each other.

🎯 Exam Tip: A good definition includes the purpose (showing relationship), the elements (paired observations), and the axes. Precision in terminology is important.

 

Question 5. What is spurious correlation?
Answer: Spurious correlation occurs when two variables appear to be correlated but are not actually linked by any cause-effect relationship. For instance, the yearly import of crude oil and the number of marriages during the same time period may appear to move together, but any such correlation is spurious.
In simple words: Spurious correlation is when two things seem connected on a graph, but there's no real reason why they should be, like oil imports and marriages.

🎯 Exam Tip: The key to defining spurious correlation is emphasizing the *lack* of a true cause-and-effect relationship despite a statistical association. Providing a clear example reinforces the concept.

 

Question 6. Explain the cause and effect relationship.
Answer: In many situations, variables change simultaneously. When a change in one variable causes a change in another, this is a cause-effect relationship. The variable that initiates the change is the 'cause', and the resulting change in the other variable is the 'effect'. This relationship can be direct or indirect. For example, yearly rainfall (cause) and rice yield (effect) often show a cause-effect relationship: more rain usually leads to more rice, and less rain leads to less rice in a region.
In simple words: A cause and effect relationship means a change in one thing makes another thing change, directly or indirectly, like more rain leading to more rice.

🎯 Exam Tip: Clearly differentiate between the 'cause' (independent variable) and 'effect' (dependent variable). A strong example, like rainfall and crop yield, helps illustrate this fundamental concept effectively.

 

Question 7. Explain: Perfect positive correlation
Answer: Perfect positive correlation occurs when two correlated variables change simultaneously in the exact same direction and proportion. This means if one variable increases or decreases by a specific amount, the other variable also increases or decreases by a constant, proportional amount.

  • In a scatter diagram for perfect positive correlation, all points lie exactly on a straight line that slopes upwards from left to right.
  • The correlation coefficient, r, for perfect positive correlation is exactly 1.
  • For example, the correlation between the number of cinema tickets bought and the total price paid (e.g., 1 ticket = Rs. 150, 4 tickets = Rs. 600) is a perfect positive correlation.

In simple words: When two things always change together in the same direction and in a fixed proportion, like each extra ticket adding exactly the same amount to the total price, it's a perfect positive correlation.

🎯 Exam Tip: Focus on "same direction," "constant proportion," a scatter plot showing a perfect upward line, and \(r = 1\). These are the critical elements for defining and identifying perfect positive correlation.

 

Question 8. Explain: Perfect negative correlation
Answer: Perfect negative correlation occurs when two correlated variables change simultaneously in exactly opposite directions and in constant proportion. This means if one variable increases or decreases by a specific amount, the other variable decreases or increases by a constant, proportional amount.

  • In a scatter diagram for perfect negative correlation, all points lie exactly on a straight line that slopes downwards from left to right.
  • The correlation coefficient for perfect negative correlation is exactly -1.
  • For example, there is a perfect negative correlation between a place's height above sea level and the amount of oxygen in the air; as altitude increases, oxygen decreases proportionally.

In simple words: When two things always change in opposite directions and in a fixed proportion, like going higher up a mountain always meaning proportionally less oxygen, it's a perfect negative correlation.

🎯 Exam Tip: Key aspects include "opposite direction," "constant proportion," a scatter plot showing a perfect downward line, and \(r = -1\). These details are essential for a complete explanation.

 

Question 9. When is it necessary to use rank correlation?
Answer: Rank correlation is needed in these situations:

  • When the variables are qualitative traits (like honesty, punctuality, beauty) that cannot be measured directly but can be ranked based on their skill or quality.
  • When the measured values of the variables are very large or widely spread out, making traditional methods complex.

In simple words: You use rank correlation when you can't measure things with numbers but can put them in order, or when the numbers are too big or spread out to easily work with.

🎯 Exam Tip: Highlight the use of rank correlation for qualitative data or when quantitative data has extreme values. This distinguishes it from Pearson's method and is a common conceptual question.

 

Question 10. In which situation, the values of Karl Pearson's correlation coefficient and Spearman's rank correlation coefficient are equal?
Answer: The values of Karl Pearson's correlation coefficient and Spearman's rank correlation coefficient are equal when the two variables are arranged as a sequence of the first 'n' natural numbers.
In simple words: Both correlation methods give the same result when the data for the two things you're comparing are simply a list of counting numbers in some order.

🎯 Exam Tip: This specific condition—variables being an arrangement of natural numbers—is a unique case where both coefficients align. It's a point of distinction often tested in conceptual questions.
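This equality can be checked with a small Python sketch. The particular arrangement of the numbers 1 to 5 is a hypothetical example, and both helper functions are written only for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_r(rx, ry):
    """Spearman's formula for ranks without ties."""
    n = len(rx)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Both variables are arrangements of the first 5 natural numbers,
# so each value is also its own rank
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r_pearson = pearson_r(xs, ys)
r_spearman = spearman_r(xs, ys)
print(r_pearson, r_spearman)
```

Both methods give 0.8 for this arrangement, up to floating-point rounding.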

 

Question 11. Find the value of r if \(Cov(x, y) = 120\), \(S_x = 12\), \(S_y = 15\).
Answer: Here, we are given \(Cov(x, y) = 120\), \(S_x = 12\), and \(S_y = 15\).
Using the formula for the correlation coefficient \(r\):
\(r = \frac{Cov(x, y)}{S_x \cdot S_y}\)
\( = \frac{120}{12 \times 15}\)
\( = \frac{120}{180} \)
\( \approx 0.67\)
In simple words: If you know how much two things change together (covariance) and how much each changes on its own (standard deviation), you can calculate their correlation. Here, it's 0.67.

🎯 Exam Tip: Always remember the fundamental formula for Pearson's correlation coefficient using covariance and standard deviations. Correct substitution of values is key to getting the right answer.

 

Question 12. Find the value of r if \(\Sigma(x - \bar{x})(y - \bar{y}) = - 65\), \(S_x = 3\), \(S_y = 4\) and \(n = 10\).
Answer: We are given \(\Sigma(x - \bar{x})(y - \bar{y}) = - 65\), \(S_x = 3\), \(S_y = 4\), and \(n = 10\).
Using the formula for the correlation coefficient \(r\):
\(r = \frac{\Sigma(x-\bar{x})(y-\bar{y})}{n \cdot S_{x} \cdot S_{y}}\)
\( = \frac{-65}{10 \times 3 \times 4}\)
\( = \frac{-65}{120}\)
\( \approx - 0.54\)
In simple words: Using the given sums of differences, number of observations, and standard deviations, we can calculate the correlation coefficient, which comes out to be -0.54.

🎯 Exam Tip: Ensure you correctly apply the formula for 'r' when provided with sum of products of deviations, sample size, and standard deviations. Pay attention to the negative signs in calculations.

 

Question 13. Here, \(n = 10\); \(\Sigma d^2 = 120\). Find the value of the rank correlation coefficient.
Answer: We have \(n = 10\) and \(\Sigma d^2 = 120\).
The formula for the rank correlation coefficient \(r\) is:
\(r = 1 - \frac{6 \Sigma d^{2}}{n(n^{2}-1)}\)
\( = 1 -\frac{6(120)}{10(100-1)}\)
\( = 1 - \frac{720}{10 \times 99}\)
\( = 1 - \frac{720}{990}\)
\( \approx 1 - 0.73\)
\( = 0.27\)
\( \implies r = 0.27\)
In simple words: By using the number of items and the total of squared differences in ranks in the formula, we find the rank correlation coefficient to be 0.27.

🎯 Exam Tip: Accurately substituting \(n\) and \(\Sigma d^2\) into Spearman's rank correlation formula is crucial. Remember to follow the order of operations carefully to avoid calculation errors.
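The arithmetic in Questions 11 to 13 can be re-checked with a few lines of Python. The function names are illustrative; each one is a direct transcription of the formula used in the corresponding answer.

```python
def r_from_covariance(cov_xy, s_x, s_y):
    """r = Cov(x, y) / (S_x * S_y)"""
    return cov_xy / (s_x * s_y)

def r_from_deviation_sum(sum_dev_products, n, s_x, s_y):
    """r = sum((x - mean_x)(y - mean_y)) / (n * S_x * S_y)"""
    return sum_dev_products / (n * s_x * s_y)

def rank_correlation(sum_d2, n):
    """r = 1 - 6 * sum(d^2) / (n * (n^2 - 1))"""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

r_q11 = r_from_covariance(120, 12, 15)        # Question 11
r_q12 = r_from_deviation_sum(-65, 10, 3, 4)   # Question 12
r_q13 = rank_correlation(120, 10)             # Question 13

print(round(r_q11, 2), round(r_q12, 2), round(r_q13, 2))
```

Rounded to two decimal places, the three values match the answers 0.67, -0.54 and 0.27.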

Section D

Answer the following questions as required:

 

Question 1. Explain scatter diagram method.
Answer: The scatter diagram method is a widely used way to figure out the type of relationship (correlation) between two variables.

  • To create a scatter diagram, 'n' pairs of data points for two variables, X and Y, are plotted on graph paper using a suitable scale. The X values go on the X-axis, and Y values go on the Y-axis. The picture formed by these plotted points is called a scatter diagram.
  • The way these points are spread out on the scatter diagram helps us see the kind of correlation (like positive, negative, or no correlation) and roughly how strong that correlation is.

ℹ️ Diagram Explanation: This figure shows a scatter diagram of perfect positive correlation. All the points lie on a straight line that rises from left to right, which indicates a perfect positive relationship between the variables X and Y.
If all the points of a scatter diagram lie on one line going in the upward direction from left to right, then it shows perfect positive correlation between two variables X and Y.

ℹ️ Diagram Explanation: This figure shows a scatter diagram of perfect negative correlation. All the points lie on a straight line that falls from left to right, which indicates a perfect negative relationship between the variables X and Y.
If all the points of a scatter diagram lie on one line going in the downward direction from left to right, then it shows perfect negative correlation between two variables X and Y.

ℹ️ Diagram Explanation: This figure shows a scatter diagram of partial positive correlation. The points do not lie on one straight line, but they are scattered around a line rising from left to right, which indicates a partial positive relationship between the variables X and Y.
If all the points of a scatter diagram are not on one line but lie around a line going in upward direction from left to right, then it shows partial positive correlation between two variables X and Y.

ℹ️ Diagram Explanation: This figure shows a scatter diagram of partial negative correlation. The points do not lie on one straight line, but they are scattered around a line falling from left to right, which indicates a partial negative relationship between the variables X and Y.
If all the points of a scatter diagram are not on one line but lie around a line going in downward direction from left to right, then it shows partial negative correlation between two variables X and Y.

ℹ️ Diagram Explanation: This figure shows a scatter diagram with no linear correlation. The points are spread randomly without any particular pattern, which indicates that there is no linear correlation between the variables X and Y.
If the points of a scatter diagram lie randomly without forming a specific pattern, then it shows lack (absence) of linear correlation between two variables X and Y.
In simple words: A scatter diagram plots data points on a graph to visually show if two things are related, what kind of relation it is (like positive or negative), and how strong that relation seems.

🎯 Exam Tip: When explaining the scatter diagram method, include its purpose, how it's constructed, and how different patterns of points (upward, downward, scattered) reveal different types of correlation. Visual examples strengthen the explanation.

 

Question 2. Write merits and limitations of scatter diagram method.
Answer: The scatter diagram method is a simple way to understand the type of relationship between two variables. It needs only basic knowledge of plotting points, not advanced math skills. This method also gives some idea about how strong the connection is between the variables. The pattern of points helps to see if the relationship is straight or not. Even very unusual data points do not make it difficult to understand the basic nature of the relationship.
The main limitation of the scatter diagram method is that it does not show the exact strength of the correlation between variables. Also, it is not a good method for studying data that is already grouped into categories.
In simple words: This method is easy to use for seeing relationships, but it doesn't give an exact number for how strong the link is.

🎯 Exam Tip: Focus on understanding the core concept of correlation and how graphical representation aids initial data assessment, especially for qualitative understanding over precise quantification.

 

Question 3. Write the properties of correlation coefficient.
Answer: The correlation coefficient `r` has the following properties:

  • The value of `r` always falls within the range of -1 to 1, inclusive (i.e., \( -1 \le r \le 1 \)).
  • The coefficient does not have any units of measurement.
  • The correlation between variable X and variable Y is the same as the correlation between variable Y and variable X (i.e., \( r(x, y) = r(y, x) \)).
  • The value of `r` remains unchanged even if the origin or scale of the variables is changed. This means if \( u = \frac{x-A}{C_x} \) and \( v = \frac{y-B}{C_y} \), where \( C_x > 0 \), \( C_y > 0 \), and A, B, \( C_x \), \( C_y \) are constants, then \( r(x, y) = r(u, v) \).
  • The correlation coefficient `r` is an absolute measurement.
  • If the sign of only one of the two variables is changed, the sign of the correlation coefficient also changes (e.g., \( r(-x, y) = -r(x, y) \); \( r(x, -y) = -r(x, y) \)). However, if the signs of both variables are changed, the sign of the correlation coefficient stays the same (e.g., \( r(-x, -y) = r(x, y) \)).

In simple words: The correlation value is always between -1 and 1, has no units, and doesn't change if you shift or stretch the data. If you flip the sign of one variable, the correlation sign flips too.

🎯 Exam Tip: Memorize the range and unit-free nature of 'r'. Understand the invariance property under change of origin and scale, as this is a frequent conceptual check in exams.
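
These properties can be checked numerically. The sketch below uses made-up data (not from the textbook) and a hypothetical helper `pearson_r`; the constants A = 5, B = 10, \( C_x = 2 \), \( C_y = 4 \) are arbitrary choices for illustration.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (an assumption, not textbook data).
x = [2, 4, 5, 7, 9]
y = [10, 14, 13, 20, 22]
r = pearson_r(x, y)

# Change of origin and scale: u = (x - A)/C_x, v = (y - B)/C_y with C_x, C_y > 0.
u = [(xi - 5) / 2 for xi in x]
v = [(yi - 10) / 4 for yi in y]
neg_x = [-xi for xi in x]
neg_y = [-yi for yi in y]

print(-1 <= r <= 1)                              # range property
print(math.isclose(r, pearson_r(y, x)))          # symmetry
print(math.isclose(r, pearson_r(u, v)))          # invariance under origin/scale change
print(math.isclose(-r, pearson_r(neg_x, y)))     # one sign changed
print(math.isclose(r, pearson_r(neg_x, neg_y)))  # both signs changed
```

Each line should print `True`; comparisons use `math.isclose` because floating-point results can differ in the last digits.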

 

Question 4. Write the merits and limitations of Karl Pearson's method.
Answer: Karl Pearson's method has several benefits. It helps to understand both the type and the exact strength of the relationship between two variables. It is the most commonly used method for measuring straight-line connections. This method provides a single number that summarizes the degree of correlation.
However, there are also limitations. This method assumes that the relationship between the variables is a straight line. The calculated value of `r` can be greatly affected by extreme or unusual data points. It is very important to be careful when interpreting the `r` value, as a wrong understanding can lead to incorrect conclusions.
In simple words: Karl Pearson's method is good for finding the exact strength of straight-line relationships, but it can be sensitive to extreme data and assumes a straight line.

🎯 Exam Tip: Remember that Pearson's method is best for linear relationships and can be skewed by outliers. Its primary strength is providing a precise numerical measure.

 

Question 5. Interpret r = 1, r = - 1 and r = 0.
Answer: When \( r = 1 \), it means there is a perfect positive correlation between two variables. This happens when an increase or decrease in one variable leads to a proportional increase or decrease in the other variable. In a scatter diagram, all the data points would form a straight line that goes upwards from the left to the right.
When \( r = -1 \), it means there is a perfect negative correlation between two variables. This occurs when an increase in one variable leads to a proportional decrease in the other, or vice versa. In a scatter diagram, all the data points would form a straight line that goes downwards from the left to the right.
When \( r = 0 \), it means there is no linear correlation between the two variables. The data points in a scatter diagram would appear scattered randomly, showing no clear straight-line pattern. It is important to note that \( r = 0 \) only indicates the absence of a linear relationship; there might still be a non-linear (curved) relationship between the variables.
In simple words: \( r = 1 \) means perfect upward line, \( r = -1 \) means perfect downward line, and \( r = 0 \) means no straight-line pattern.

🎯 Exam Tip: Clearly distinguish between perfect positive, perfect negative, and no linear correlation. Emphasize that \( r=0 \) does not rule out non-linear relationships.
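
The three cases can be illustrated with small made-up data sets. In this sketch (the helper `pearson_r` and all numbers are assumptions chosen for illustration), the last data set follows the curve \( y = x^2 \) exactly, yet its linear correlation is zero.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
rising = [2 * xi + 1 for xi in x]     # points on an upward straight line
falling = [-3 * xi + 4 for xi in x]   # points on a downward straight line
parabola = [xi ** 2 for xi in x]      # perfect non-linear (curved) relation

print(pearson_r(x, rising))    # perfect positive correlation
print(pearson_r(x, falling))   # perfect negative correlation
print(pearson_r(x, parabola))  # no linear correlation despite an exact relation
```

The third result shows why \( r = 0 \) should be read as "no linear correlation" rather than "no relationship".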

 

Question 6. Explain Spearman's rank correlation method.
Answer: Spearman's rank correlation method is used for variables that cannot be measured numerically but can be assigned ranks, such as honesty, beauty, or punctuality. The method calculates a correlation coefficient by using these ranks. It can also be applied to numerical data by first ranking the observations based on their values.
When ranking `n` pairs of observations (\( R_x \), \( R_y \)), the formula for the rank correlation coefficient is: \[ r = 1 - \frac{6 \Sigma d^{2}}{n(n^{2}-1)} \]
Here, \( d = R_x - R_y \) is the difference between the ranks of each pair of observations, and \( n \) is the total number of pairs.
If some observations have the same value (a tie), they are assigned the average of the ranks they would have received. In such cases, a correction factor (CF) is added to the formula: \[ r = 1 - \frac{6(\Sigma d^{2}+CF)}{n(n^{2}-1)} \]
The correction factor is calculated as \( CF = \Sigma\left(\frac{m^{3}-m}{12}\right) \), where \( m \) is the number of times an observation's value is repeated. If the values of the two variables are the first 'n' natural numbers, then the correlation coefficients obtained by Karl Pearson's method and Spearman's method will be the same.
In simple words: This method finds how well two sets of ranks match up. It's used for things you can't measure with numbers directly, like beauty, and it also works for numerical data once the values are ranked. Tied values need a small correction.

🎯 Exam Tip: Understand when to apply Spearman's method (qualitative data, ranks, or when outliers are significant). Be prepared to calculate it with and without a correction factor for tied ranks.
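
The remark that Karl Pearson's and Spearman's methods agree when both variables take the values 1 to n can be checked on a small example. The ranks below are made up for illustration, and both function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_r(rx, ry):
    """Spearman's formula 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid without ties."""
    n = len(rx)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Made-up ranks: each list is a rearrangement of 1..5, so there are no ties.
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

print(spearman_r(rank_x, rank_y))
print(pearson_r(rank_x, rank_y))
```

Both calls should give the same value (up to floating-point rounding), because Spearman's formula is Pearson's formula specialised to untied ranks.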

 

Question 7. Write merits and limitations of Spearman's rank correlation method.
Answer: Spearman's rank correlation method has several advantages. It is straightforward to understand and apply. Calculating the correlation coefficient using this method is often simpler than using Karl Pearson's method. It is the only method available for finding the measure of correlation when the data is qualitative (based on qualities rather than numbers). This method is preferred when data points are widely spread out or when there are extreme values, as it is less affected by them.
However, there are also limitations. This method does not provide as precise a measure of the correlation coefficient compared to Karl Pearson's method. Assigning ranks can become difficult and time-consuming when there is a large number of observations. Additionally, this method cannot be used for data presented in a bivariate frequency distribution (a table showing frequencies of two variables together).
In simple words: This method is easy and works for ranked data or spread-out data, but it's not as exact as other methods and can be hard with many observations.

🎯 Exam Tip: Highlight Spearman's applicability to qualitative data and its robustness to outliers. Note its drawback in precision compared to Pearson's and its limitations with large datasets or frequency distributions.

 

Question 8. How would you interpret partial correlation ?
Answer: Partial (or imperfect) correlation occurs when the absolute value of the correlation coefficient lies strictly between 0 and 1, i.e. \( 0 < |r| < 1 \) (so `r` is between 0 and 1, or between -1 and 0).
If the absolute value of `r` is closer to 1, it means there is a strong and close-to-linear relationship between the two variables. In such cases, we can get a good estimate of how changes in one variable correspond to changes in the other.
If the absolute value of `r` is closer to 0, it means the straight-line relationship is very weak, almost indicating a lack of linear correlation between the variables. In this situation, it is hard to get a reliable estimate of how one variable changes in response to another.
In simple words: Partial correlation means the relationship is not perfect. If the correlation number is near 1 or -1, the link is strong; if it's near 0, the link is weak.

🎯 Exam Tip: Understand that partial correlation indicates a degree of relationship less than perfect. The closer `|r|` is to 1, the stronger the linear association.

 

Question 9. State the necessary precautions to be taken while interpreting the value of correlation coefficient.
Answer: When interpreting the value of the correlation coefficient, several precautions are necessary.
1. Nonsense or Spurious Correlation: A high correlation coefficient does not always mean there is a direct cause-and-effect relationship or a meaningful connection between two variables. For example, if the correlation coefficient between the number of doctors in a hospital and the number of patient deaths seems high, it doesn't mean more doctors cause more deaths. Other factors might be at play, making the correlation misleading. Therefore, it is important to consider if a logical cause-effect relationship exists.
2. Influence of Other Factors: Sometimes, the value of `r` might be close to 1 even if the two variables are not directly correlated, due to the influence of other hidden factors.
3. Lack of Correlation: If the value of `r` is zero, it only indicates an absence or lack of *linear* correlation between the two variables. It does not mean there is no correlation at all; a non-linear relationship might still exist. So, saying there is no correlation at all when \( r=0 \) is incorrect.
4. Contextual Limitation: If the correlation coefficient is calculated for data specific to a certain region, class, or time period, its interpretation should be limited only to that context. Applying it broadly to other situations might lead to incorrect conclusions.
In simple words: Be careful! High correlation doesn't always mean cause and effect. Hidden factors can fake a strong link. Zero correlation just means no straight-line link, not no link at all. And apply results only to the specific data context.

🎯 Exam Tip: Always analyze correlation in context. Avoid assuming causation from correlation, acknowledge the impact of extraneous factors, and remember that \( r=0 \) only implies no *linear* relationship.

 

Question 10. The following data is available for two variables rainfall in mm (X) and yield of crop Qtl/Hectare (Y): \( n = 10 \), \( \bar{x} \) = 120, \( \bar{y} \) = 150, \( S_x \) = 30, \( S_y \) = 40 and \( \Sigma xy \) = 189000. Find the correlation coefficient.
Answer: We are given the following information: the number of observations \( n = 10 \), the mean of X is \( \bar{x} = 120 \), the mean of Y is \( \bar{y} = 150 \), the standard deviation of X is \( S_x = 30 \), the standard deviation of Y is \( S_y = 40 \), and the sum of the products of X and Y is \( \Sigma xy = 189000 \).
To find the correlation coefficient, we use the formula: \[ r = \frac{\Sigma xy - n \bar{x} \bar{y}}{n S_x S_y} \]
Substituting the given values into the formula: \[ r = \frac{189000 - 10 \times 120 \times 150}{10 \times 30 \times 40} \] \[ r = \frac{189000 - 180000}{12000} \] \[ r = \frac{9000}{12000} \] \[ r = 0.75 \]
Therefore, the correlation coefficient is 0.75.
In simple words: Given all the averages and sums, we put them into the correlation formula to get a positive link of 0.75.

🎯 Exam Tip: When mean and standard deviation are provided directly, use the simplified correlation coefficient formula. Ensure correct substitution and calculation for full marks.
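
The calculation above can be reproduced as follows. This is a sketch under the assumption that \( S_x \) and \( S_y \) are standard deviations computed with divisor \( n \), which is what the formula used in the answer requires; the function name `r_from_summary` is illustrative.

```python
def r_from_summary(n, mean_x, mean_y, s_x, s_y, sum_xy):
    """r = (sum(xy) - n*mean_x*mean_y) / (n * s_x * s_y)."""
    return (sum_xy - n * mean_x * mean_y) / (n * s_x * s_y)

r = r_from_summary(n=10, mean_x=120, mean_y=150, s_x=30, s_y=40, sum_xy=189000)
print(r)  # 0.75
```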

 

Question 11. The following information is obtained for 9 pairs of observations: \( \Sigma x = 51 \), \( \Sigma y = 72 \), \( \Sigma x^2 = 315 \), \( \Sigma y^2 = 582 \), \( \Sigma xy = 408 \). Find the correlation coefficient.
Answer: We are given the following information for 9 pairs of observations: \( n = 9 \), sum of X is \( \Sigma x = 51 \), sum of Y is \( \Sigma y = 72 \), sum of X squared is \( \Sigma x^2 = 315 \), sum of Y squared is \( \Sigma y^2 = 582 \), and sum of products of X and Y is \( \Sigma xy = 408 \).
To find the correlation coefficient, we use the formula: \[ r = \frac{n \Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{n \Sigma x^2 - (\Sigma x)^2} \cdot \sqrt{n \Sigma y^2 - (\Sigma y)^2}} \]
Substituting the given values: \[ r = \frac{9 \times 408 - (51)(72)}{\sqrt{9 \times 315 - (51)^2} \cdot \sqrt{9 \times 582 - (72)^2}} \] \[ r = \frac{3672 - 3672}{\sqrt{2835 - 2601} \cdot \sqrt{5238 - 5184}} \] \[ r = \frac{0}{\sqrt{234} \cdot \sqrt{54}} \] \[ r = \frac{0}{15.297 \times 7.348} \] \[ r = \frac{0}{112.41} \] \[ r = 0 \]
Thus, the correlation coefficient is 0.
In simple words: With the sums of X, Y, X-squared, Y-squared, and XY, we put them into the correlation formula and find that the correlation is 0.

🎯 Exam Tip: Pay close attention to the formula for `r` when sums of X, Y, \( X^2 \), \( Y^2 \), and XY are given. A numerator of zero immediately implies a zero correlation coefficient.
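
When only the five sums are given, the same formula can be wrapped in a small function. The name `r_from_sums` and its parameter names are hypothetical choices for this sketch.

```python
import math

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """r = (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x**2)
                   * math.sqrt(n * sum_y2 - sum_y**2))
    return numerator / denominator

r = r_from_sums(n=9, sum_x=51, sum_y=72, sum_x2=315, sum_y2=582, sum_xy=408)
print(r)  # numerator is 3672 - 3672 = 0, so r is zero
```

Computing the numerator first is a useful habit: once it is zero, the square roots in the denominator do not need to be evaluated by hand.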

 

Question 12. The information obtained on the basis of ranks given by two judges to eight contestants of a dance competition is given below: \( \Sigma(R_x - R_y)^2 = 126 \). Where, \( R_x \) and \( R_y \) are the ranks given to a contestant by the two judges respectively. Find Spearman's rank correlation coefficient.
Answer: We have the following information: the number of contestants \( n = 8 \), and the sum of the squared differences between the ranks given by the two judges is \( \Sigma d^2 = \Sigma(R_x - R_y)^2 = 126 \).
To find Spearman's rank correlation coefficient, we use the formula: \[ r = 1 - \frac{6 \Sigma d^2}{n(n^2-1)} \]
Substituting the given values into the formula: \[ r = 1 - \frac{6 \times 126}{8(8^2-1)} \] \[ r = 1 - \frac{756}{8(64-1)} \] \[ r = 1 - \frac{756}{8 \times 63} \] \[ r = 1 - \frac{756}{504} \] \[ r = 1 - 1.5 \] \[ r = -0.5 \]
Therefore, Spearman's rank correlation coefficient is -0.5.
In simple words: Given the number of contestants and the sum of squared rank differences, we use Spearman's formula to find the rank correlation, which comes out to be -0.5.

🎯 Exam Tip: Remember Spearman's formula for rank correlation. Be sure to calculate \( n^2-1 \) correctly before dividing, and pay attention to the sign of the final answer.

 

Question 13. The ranks given by two experts on the basis of interviews of five candidates for a job are (3, 5), (5, 4), (1, 2), (2, 3) and (4, 1). Find the rank correlation coefficient from this data.
Answer: We have \( n = 5 \) candidates. The ranks given by the first expert are \( R_x \), and by the second expert are \( R_y \). We will create a table to calculate the differences in ranks (\( d = R_x - R_y \)) and their squares (\( d^2 \)).

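| Candidate | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|
| 1 | 3 | 5 | -2 | 4 |
| 2 | 5 | 4 | 1 | 1 |
| 3 | 1 | 2 | -1 | 1 |
| 4 | 2 | 3 | -1 | 1 |
| 5 | 4 | 1 | 3 | 9 |
| Total | \( n = 5 \) | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 16 \) |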

From the table, we find \( \Sigma d^2 = 16 \).
To find Spearman's rank correlation coefficient, we use the formula: \[ r = 1 - \frac{6 \Sigma d^2}{n(n^2-1)} \]
Substituting \( n = 5 \) and \( \Sigma d^2 = 16 \) into the formula: \[ r = 1 - \frac{6 \times 16}{5(5^2-1)} \] \[ r = 1 - \frac{96}{5(25-1)} \] \[ r = 1 - \frac{96}{5 \times 24} \] \[ r = 1 - \frac{96}{120} \] \[ r = 1 - 0.8 \] \[ r = 0.2 \]
Therefore, the rank correlation coefficient is 0.2.
In simple words: We list the ranks from two experts, find the differences, square them, and sum them up. Then, using Spearman's formula, we calculate the rank correlation as 0.2.

🎯 Exam Tip: When given paired ranks, always construct a table to systematically calculate 'd' and 'd-squared'. Double-check the summation of 'd-squared' as it's critical for the final result.
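
The table-based calculation for the five candidates can be sketched in Python as below. The variable names are illustrative, and the formula is used on the assumption that no ranks are tied, which holds for this data.

```python
from fractions import Fraction

# (R_x, R_y) for each candidate, as given in the question.
rank_pairs = [(3, 5), (5, 4), (1, 2), (2, 3), (4, 1)]

d = [rx - ry for rx, ry in rank_pairs]
d_squared = [di**2 for di in d]
n = len(rank_pairs)

# Print the rows of the working table: candidate, R_x, R_y, d, d^2.
for i, ((rx, ry), di, d2) in enumerate(zip(rank_pairs, d, d_squared), start=1):
    print(i, rx, ry, di, d2)

assert sum(d) == 0  # self-check: rank differences must sum to zero
sum_d2 = sum(d_squared)
r = 1 - Fraction(6 * sum_d2, n * (n**2 - 1))
print(sum_d2, float(r))
```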

Section E

 

Question 1. The following information is obtained to study the relation between the selling price of nose mask and its demand during an epidemic:

| Price (Rs.) | 38 | 45 | 40 | 42 | 35 |
|---|---|---|---|---|---|
| Demand (units) | 103 | 92 | 97 | 98 | 100 |

Find the correlation coefficient between the price and demand of mask by Karl Pearson's method.
Answer: We want to find the correlation coefficient using Karl Pearson's method for the selling price (X) and demand (Y) of nose masks. We have \( n = 5 \) observations.
First, calculate the means: Mean of X: \( \bar{x} = \frac{\Sigma x}{n} = \frac{38+45+40+42+35}{5} = \frac{200}{5} = 40 \) Rs. Mean of Y: \( \bar{y} = \frac{\Sigma y}{n} = \frac{103+92+97+98+100}{5} = \frac{490}{5} = 98 \) units
Since \( \bar{x} \) and \( \bar{y} \) are integers, we will create a table to calculate the correlation coefficient using deviations from the mean:

| Price X (Rs.) | Demand Y (units) | \( (x-\bar{x}) \) (X-40) | \( (y-\bar{y}) \) (Y-98) | \( (x-\bar{x})(y-\bar{y}) \) | \( (x-\bar{x})^2 \) | \( (y-\bar{y})^2 \) |
|---|---|---|---|---|---|---|
| 38 | 103 | -2 | 5 | -10 | 4 | 25 |
| 45 | 92 | 5 | -6 | -30 | 25 | 36 |
| 40 | 97 | 0 | -1 | 0 | 0 | 1 |
| 42 | 98 | 2 | 0 | 0 | 4 | 0 |
| 35 | 100 | -5 | 2 | -10 | 25 | 4 |
| \( \Sigma x = 200 \) | \( \Sigma y = 490 \) | \( \Sigma(x-\bar{x}) = 0 \) | \( \Sigma(y-\bar{y}) = 0 \) | \( \Sigma(x-\bar{x})(y-\bar{y}) = -50 \) | \( \Sigma(x-\bar{x})^2 = 58 \) | \( \Sigma(y-\bar{y})^2 = 66 \) |

Now, we calculate the correlation coefficient `r` using the formula: \[ r = \frac{\Sigma(x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma(x - \bar{x})^2} \cdot \sqrt{\Sigma(y - \bar{y})^2}} \]
Substitute the values from the table: \[ r = \frac{-50}{\sqrt{58} \cdot \sqrt{66}} \] \[ r = \frac{-50}{\sqrt{3828}} \] \[ r = \frac{-50}{61.8708} \] \[ r \approx -0.81 \]
Hence, the correlation coefficient between the price and demand of the mask is approximately -0.81.
In simple words: We calculated the average price and demand, then made a table to find how each point varied from the average. Using these values in Karl Pearson's formula, we found a negative correlation of -0.81.

🎯 Exam Tip: For Karl Pearson's method, if means are integers, using the deviation method \( (x-\bar{x}) \) and \( (y-\bar{y}) \) simplifies calculations. Ensure accurate sums for products and squares of deviations.
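
The deviation method for the price and demand data can be sketched as follows. Variable names such as `dx` and `dy` are illustrative; the data are the five pairs given in the question.

```python
import math

price = [38, 45, 40, 42, 35]       # X (Rs.)
demand = [103, 92, 97, 98, 100]    # Y (units)

n = len(price)
mean_x = sum(price) / n
mean_y = sum(demand) / n

dx = [x - mean_x for x in price]   # deviations x - x_bar
dy = [y - mean_y for y in demand]  # deviations y - y_bar

sum_dxdy = sum(a * b for a, b in zip(dx, dy))
sum_dx2 = sum(a * a for a in dx)
sum_dy2 = sum(b * b for b in dy)

r = sum_dxdy / (math.sqrt(sum_dx2) * math.sqrt(sum_dy2))
print(sum_dxdy, sum_dx2, sum_dy2, round(r, 2))
```

The three sums should match the totals in the working table, and the rounded value of `r` should match the answer.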

 

Question 2. In order to study the relationship between the abilities in the subjects of Human Resource Management and Personality Development for the students of a post graduate level course, a sample of 5 students is taken and the following information is obtained:

| Student | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Marks in HRM | 45 | 25 | 40 | 20 | 45 |
| Marks in PD | 47 | 23 | 17 | 35 | 48 |

Calculate the Karl Pearson's correlation coefficient between the marks of both the subjects.
Answer: We want to find Karl Pearson's correlation coefficient between marks in Human Resource Management (X) and Personality Development (Y) for 5 students. We have \( n = 5 \) observations.
First, calculate the means: Mean of X: \( \bar{x} = \frac{\Sigma x}{n} = \frac{45+25+40+20+45}{5} = \frac{175}{5} = 35 \) marks Mean of Y: \( \bar{y} = \frac{\Sigma y}{n} = \frac{47+23+17+35+48}{5} = \frac{170}{5} = 34 \) marks
Since \( \bar{x} \) and \( \bar{y} \) are integers, we will create a table to calculate the correlation coefficient using deviations from the mean:

| Student | HRM X | PD Y | \( (x-\bar{x}) \) (X-35) | \( (y-\bar{y}) \) (Y-34) | \( (x-\bar{x})(y-\bar{y}) \) | \( (x-\bar{x})^2 \) | \( (y-\bar{y})^2 \) |
|---|---|---|---|---|---|---|---|
| 1 | 45 | 47 | 10 | 13 | 130 | 100 | 169 |
| 2 | 25 | 23 | -10 | -11 | 110 | 100 | 121 |
| 3 | 40 | 17 | 5 | -17 | -85 | 25 | 289 |
| 4 | 20 | 35 | -15 | 1 | -15 | 225 | 1 |
| 5 | 45 | 48 | 10 | 14 | 140 | 100 | 196 |
| Total | \( \Sigma x = 175 \) | \( \Sigma y = 170 \) | \( \Sigma(x-\bar{x}) = 0 \) | \( \Sigma(y-\bar{y}) = 0 \) | \( \Sigma(x-\bar{x})(y-\bar{y}) = 280 \) | \( \Sigma(x-\bar{x})^2 = 550 \) | \( \Sigma(y-\bar{y})^2 = 776 \) |

Now, we calculate the correlation coefficient `r` using the formula: \[ r = \frac{\Sigma(x - \bar{x})(y - \bar{y})}{\sqrt{\Sigma(x - \bar{x})^2} \cdot \sqrt{\Sigma(y - \bar{y})^2}} \]
Substitute the values from the table: \[ r = \frac{280}{\sqrt{550} \cdot \sqrt{776}} \] \[ r = \frac{280}{\sqrt{426800}} \] \[ r = \frac{280}{653.30} \] \[ r \approx 0.43 \]
Hence, the correlation coefficient between the marks in HRM and PD is approximately 0.43.
In simple words: We calculated average marks for both subjects. Then, using a table for deviations and Karl Pearson's formula, we found a positive correlation of 0.43 between their marks.

🎯 Exam Tip: Always verify that the sums of deviations from the mean are zero; this is a quick check for calculation accuracy. Show all steps for mean, deviations, and final formula application.

 

Question 3. A vendor wants to display lipsticks of different brands according to their popularity. For that, he invites two experts Preyal and Nishi to rank the lipsticks of different brands:

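| Lipstick | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Rank by Preyal | 5 | 6 | 7 | 1 | 3 | 2 | 4 |
| Rank by Nishi | 5 | 7 | 6 | 2 | 1 | 4 | 3 |
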
Find the rank correlation coefficient to know the similarity in the decision of both the experts.
Answer: We need to find the similarity in rankings given by two experts, Preyal and Nishi, for 7 lipstick brands. Here, \( n = 7 \). Let \( R_x \) be the rank by Preyal and \( R_y \) be the rank by Nishi. We will create a table to calculate the differences in ranks (\( d = R_x - R_y \)) and their squares (\( d^2 \)).
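
| Lipstick | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|
| A | 5 | 5 | 0 | 0 |
| B | 6 | 7 | -1 | 1 |
| C | 7 | 6 | 1 | 1 |
| D | 1 | 2 | -1 | 1 |
| E | 3 | 1 | 2 | 4 |
| F | 2 | 4 | -2 | 4 |
| G | 4 | 3 | 1 | 1 |
| Total | \( n = 7 \) | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 12 \) |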

From the table, we find \( \Sigma d^2 = 12 \).
To find Spearman's rank correlation coefficient, we use the formula: \[ r = 1 - \frac{6 \Sigma d^2}{n(n^2-1)} \]
Substitute \( n = 7 \) and \( \Sigma d^2 = 12 \) into the formula: \[ r = 1 - \frac{6 \times 12}{7(7^2-1)} \] \[ r = 1 - \frac{72}{7(49-1)} \] \[ r = 1 - \frac{72}{7 \times 48} \] \[ r = 1 - \frac{72}{336} \] \[ r = 1 - 0.214 \] \[ r \approx 0.79 \]
Hence, the rank correlation coefficient between the experts' decisions is approximately 0.79.
In simple words: We took the ranks given by two experts for 7 brands, found the difference in ranks, squared them, and summed them up. Using Spearman's formula, the rank correlation is found to be 0.79.

🎯 Exam Tip: Always set up a clear table to calculate 'd' and 'd-squared' for each pair of ranks. Ensure accurate summation, as this directly affects the final rank correlation coefficient.

 

Question 4. A merchant wants to study the relation between prices of tea and coffee in Ahmedabad city. He obtains the following information about prices of tea and coffee of the last six months:

| Price per kg for tea (Rs.) | 340 | 370 | 450 | 320 | 300 | 360 |
|---|---|---|---|---|---|---|
| Price per 100 grams for coffee (Rs.) | 190 | 215 | 200 | 180 | 163 | 175 |

Calculate the rank correlation coefficient between the price of tea and coffee.
Answer: We want to find the rank correlation coefficient between the price of tea (X) and the price of coffee (Y) over six months. Here, \( n = 6 \). We will assign ranks \( R_x \) to the tea prices and \( R_y \) to the coffee prices, with the highest price receiving rank 1.
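
| Price per kg for Tea (X) (Rs.) | Price per 100 grams for Coffee (Y) (Rs.) | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|---|
| 340 | 190 | 4 | 3 | 1 | 1 |
| 370 | 215 | 2 | 1 | 1 | 1 |
| 450 | 200 | 1 | 2 | -1 | 1 |
| 320 | 180 | 5 | 4 | 1 | 1 |
| 300 | 163 | 6 | 6 | 0 | 0 |
| 360 | 175 | 3 | 5 | -2 | 4 |
| Total | \( n = 6 \) | - | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 8 \) |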

From the table, we find \( \Sigma d^2 = 8 \).
To find Spearman's rank correlation coefficient, we use the formula: \[ r = 1 - \frac{6 \Sigma d^2}{n(n^2-1)} \]
Substitute \( n = 6 \) and \( \Sigma d^2 = 8 \) into the formula: \[ r = 1 - \frac{6 \times 8}{6(6^2-1)} \] \[ r = 1 - \frac{48}{6(36-1)} \] \[ r = 1 - \frac{48}{6 \times 35} \] \[ r = 1 - \frac{48}{210} \] \[ r = 1 - 0.2286 \] \[ r \approx 0.77 \]
Hence, the rank correlation coefficient between the price of tea and coffee is approximately 0.77.
In simple words: We ranked tea and coffee prices, found the differences, squared them, and summed them up. Using Spearman's formula, the rank correlation is calculated as 0.77.

🎯 Exam Tip: When raw data is given, ensure consistent ranking (e.g., highest value gets rank 1, or lowest gets rank 1) for both variables. Carefully calculate ranks before proceeding to find 'd' and 'd-squared'.
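
Assigning ranks to raw values, with the highest value ranked 1, can be sketched as below. The helper `rank_descending` is hypothetical and deliberately simple: it assumes all values in a list are distinct, which is true for the tea and coffee prices here but would not handle tied values.

```python
from fractions import Fraction

def rank_descending(values):
    """Rank 1 for the largest value; assumes all values are distinct (no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

tea = [340, 370, 450, 320, 300, 360]      # Rs. per kg
coffee = [190, 215, 200, 180, 163, 175]   # Rs. per 100 grams

r_x = rank_descending(tea)
r_y = rank_descending(coffee)
n = len(tea)

sum_d2 = sum((a - b) ** 2 for a, b in zip(r_x, r_y))
r = 1 - Fraction(6 * sum_d2, n * (n**2 - 1))
print(r_x, r_y, sum_d2, round(float(r), 2))
```

Ranking both variables in the same direction matters: ranking one list from the highest value and the other from the lowest would reverse the sign of the result.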

 

Question 5. The demand of an imported fruit in a local market is very uncertain. To know the relation between the price of the fruit and its supply, a vendor collects the information about the average price and supply for last ten months:

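| Average price per unit (Rs.) | 65 | 68 | 43 | 38 | 77 | 48 | 35 | 30 | 25 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| Supply (hundred units) | 52 | 53 | 42 | 60 | 45 | 41 | 37 | 38 | 25 | 27 |
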
Find the rank correlation between the average price and the supply.
Answer: We want to find the rank correlation between the average price (X) and supply (Y) of an imported fruit over ten months. Here, \( n = 10 \). We will assign ranks \( R_x \) to the average prices and \( R_y \) to the supply values, with the highest value receiving rank 1.
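
| Average price per unit (X) (Rs.) | Supply (hundred units) (Y) | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|---|
| 65 | 52 | 3 | 3 | 0 | 0 |
| 68 | 53 | 2 | 2 | 0 | 0 |
| 43 | 42 | 6 | 5 | 1 | 1 |
| 38 | 60 | 7 | 1 | 6 | 36 |
| 77 | 45 | 1 | 4 | -3 | 9 |
| 48 | 41 | 5 | 6 | -1 | 1 |
| 35 | 37 | 8 | 8 | 0 | 0 |
| 30 | 38 | 9 | 7 | 2 | 4 |
| 25 | 25 | 10 | 10 | 0 | 0 |
| 50 | 27 | 4 | 9 | -5 | 25 |
| Total | \( n = 10 \) | - | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 76 \) |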

From the table, we find \( \Sigma d^2 = 76 \).
To find Spearman's rank correlation coefficient, we use the formula: \[ r = 1 - \frac{6 \Sigma d^2}{n(n^2-1)} \]
Substitute \( n = 10 \) and \( \Sigma d^2 = 76 \) into the formula: \[ r = 1 - \frac{6 \times 76}{10(10^2-1)} \] \[ r = 1 - \frac{456}{10(100-1)} \] \[ r = 1 - \frac{456}{10 \times 99} \] \[ r = 1 - \frac{456}{990} \] \[ r = 1 - 0.4606 \] \[ r \approx 0.54 \]
Hence, the rank correlation coefficient between the average price and the supply is approximately 0.54.
In simple words: We ranked the average prices and supplies for 10 months. Then, we calculated the squared differences in ranks and used Spearman's formula to get a rank correlation of 0.54.

🎯 Exam Tip: For problems with more data points, careful ranking is essential. Always double-check that \( \Sigma d \) sums to zero as a self-check for rank assignments.

 

Question 6. To know the relation between the results of the Tests taken in a span of short time, a teacher has conducted two Tests in last two weeks and the ranks obtained by seven students are as follows:

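| Student | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Rank in Test 1 | 5 | 1 | 2 | 3.5 | 3.5 | 7 | 6 |
| Rank in Test 2 | 7 | 1 | 4 | 6 | 5 | 3 | 2 |
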
Find the rank correlation coefficient to know the similarity between the results of two examinations.
Answer: We want to find the rank correlation coefficient between the results of two tests for 7 students. Here, \( n = 7 \). Let \( R_x \) be the rank in Test 1 and \( R_y \) be the rank in Test 2.
We observe that in Test 1, the rank 3.5 appears twice. This means there are tied ranks, so we need to calculate a correction factor (CF). For \( m=2 \) (because rank 3.5 is repeated twice), the correction factor is: \[ CF = \frac{m^3-m}{12} = \frac{2^3-2}{12} = \frac{8-2}{12} = \frac{6}{12} = 0.5 \]
Now, we create a table to calculate the differences in ranks (\( d = R_x - R_y \)) and their squares (\( d^2 \)).
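
| Student | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|
| A | 5 | 7 | -2 | 4 |
| B | 1 | 1 | 0 | 0 |
| C | 2 | 4 | -2 | 4 |
| D | 3.5 | 6 | -2.5 | 6.25 |
| E | 3.5 | 5 | -1.5 | 2.25 |
| F | 7 | 3 | 4 | 16 |
| G | 6 | 2 | 4 | 16 |
| Total | \( n = 7 \) | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 48.50 \) |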

From the table, we find \( \Sigma d^2 = 48.50 \).
To find Spearman's rank correlation coefficient with a correction factor, we use the formula: \[ r = 1 - \frac{6(\Sigma d^2 + CF)}{n(n^2-1)} \]
Substitute \( n = 7 \), \( \Sigma d^2 = 48.50 \), and \( CF = 0.5 \) into the formula: \[ r = 1 - \frac{6(48.50 + 0.5)}{7(7^2-1)} \] \[ r = 1 - \frac{6(49)}{7(49-1)} \] \[ r = 1 - \frac{294}{7 \times 48} \] \[ r = 1 - \frac{294}{336} \] \[ r = 1 - 0.875 \] \[ r = 0.125 \] \[ r \approx 0.13 \]
Hence, the rank correlation coefficient between the results of the two tests is approximately 0.13.
In simple words: We calculated the sum of squared rank differences and a correction factor for tied ranks. Using Spearman's formula with this correction, we found the rank correlation between test results to be about 0.13.

🎯 Exam Tip: For tied ranks, always calculate the correction factor (CF) using the formula \( \frac{m^3-m}{12} \) for each set of tied ranks and add them up. Include this CF in Spearman's formula for accurate results.
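
The correction factor and the adjusted formula can be sketched in Python as follows. The helper `correction_factor` is a hypothetical name; it relies on the fact that tied observations share the same averaged rank, so each group of ties can be found by counting repeated rank values.

```python
from collections import Counter

def correction_factor(*rank_lists):
    """Sum of (m^3 - m)/12 over every group of m tied ranks in each list."""
    cf = 0.0
    for ranks in rank_lists:
        for m in Counter(ranks).values():
            if m > 1:  # only repeated ranks contribute
                cf += (m**3 - m) / 12
    return cf

test1 = [5, 1, 2, 3.5, 3.5, 7, 6]  # ranks in Test 1 (3.5 is tied twice)
test2 = [7, 1, 4, 6, 5, 3, 2]      # ranks in Test 2 (no ties)

n = len(test1)
sum_d2 = sum((a - b) ** 2 for a, b in zip(test1, test2))
cf = correction_factor(test1, test2)
r = 1 - 6 * (sum_d2 + cf) / (n * (n**2 - 1))
print(sum_d2, cf, r)
```

Passing both rank lists to the helper keeps the calculation correct if ties also occur in the second variable.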

Section F

Question 1. The information of fertilizer used (in tons) and productivity (in tons) of eight districts is given below:

| Fertilizer (tons) | 15 | 18 | 20 | 25 | 29 | 35 | 40 | 38 |
|---|---|---|---|---|---|---|---|---|
| Productivity (tons) | 85 | 93 | 95 | 105 | 115 | 130 | 140 | 145 |

Calculate the correlation coefficient by Karl Pearson's method.
Answer: Here, we have 8 observations, so \( n = 8 \). Let fertilizer used be \( X \) and productivity be \( Y \).
First, find the mean of \( X \) and \( Y \):
\( \Sigma X = 220 \) tons
\( \Sigma Y = 908 \) tons
\( \bar{X} = \frac{\Sigma X}{n} = \frac{220}{8} = 27.5 \) tons
\( \bar{Y} = \frac{\Sigma Y}{n} = \frac{908}{8} = 113.5 \) tons
Since the means are not whole numbers and the values are large, we will use the shortcut method by creating new variables \( u \) and \( v \).
Let \( u = X - A \) where \( A = 27 \)
Let \( v = Y - B \) where \( B = 113 \)
The table for calculation is as follows:

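| Fertilizer (tons) X | Productivity (tons) Y | \( u = X - 27 \) | \( v = Y - 113 \) | \( u^2 \) | \( v^2 \) | \( uv \) |
|---|---|---|---|---|---|---|
| 15 | 85 | -12 | -28 | 144 | 784 | 336 |
| 18 | 93 | -9 | -20 | 81 | 400 | 180 |
| 20 | 95 | -7 | -18 | 49 | 324 | 126 |
| 25 | 105 | -2 | -8 | 4 | 64 | 16 |
| 29 | 115 | 2 | 2 | 4 | 4 | 4 |
| 35 | 130 | 8 | 17 | 64 | 289 | 136 |
| 40 | 140 | 13 | 27 | 169 | 729 | 351 |
| 38 | 145 | 11 | 32 | 121 | 1024 | 352 |
| \( \Sigma X = 220 \) | \( \Sigma Y = 908 \) | \( \Sigma u = 4 \) | \( \Sigma v = 4 \) | \( \Sigma u^2 = 636 \) | \( \Sigma v^2 = 3618 \) | \( \Sigma uv = 1501 \) |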

Now, use the Karl Pearson's correlation coefficient formula for \( u \) and \( v \):
\( r = \frac{n \Sigma uv - (\Sigma u)(\Sigma v)}{\sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot \sqrt{n \Sigma v^2 - (\Sigma v)^2}} \)
Substitute the values:
\( r = \frac{8 \times 1501 - (4)(4)}{\sqrt{8 \times 636 - (4)^2} \cdot \sqrt{8 \times 3618 - (4)^2}} \)
\( r = \frac{12008 - 16}{\sqrt{5088 - 16} \cdot \sqrt{28944 - 16}} \)
\( r = \frac{11992}{\sqrt{5072} \cdot \sqrt{28928}} \)
\( r = \frac{11992}{71.218 \times 170.082} \)
\( r = \frac{11992}{12112.92} \)
\( r \approx 0.99 \)
The correlation coefficient between fertilizer use and productivity is approximately 0.99.
In simple words: We calculated how strongly fertilizer use and productivity are related. Since the value is close to 1, it means they have a very strong positive relationship, meaning as fertilizer use increases, productivity also tends to increase a lot.

🎯 Exam Tip: Remember to choose the appropriate method (direct, shortcut, or step-deviation) based on whether the mean is an integer and the size of the data for easier calculations. Always show the formula and all calculation steps clearly.
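
The shortcut calculation above can be checked with a short Python sketch. The function name `pearson_from_deviations` is an illustrative choice and not part of the textbook; the assumed means 27 and 113 are the ones used in the answer.

```python
import math

def pearson_from_deviations(xs, ys, a, b):
    """Karl Pearson's r via the shortcut method with u = x - a, v = y - b."""
    n = len(xs)
    u = [x - a for x in xs]
    v = [y - b for y in ys]
    su, sv = sum(u), sum(v)
    suu = sum(ui * ui for ui in u)
    svv = sum(vi * vi for vi in v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    num = n * suv - su * sv
    den = math.sqrt(n * suu - su ** 2) * math.sqrt(n * svv - sv ** 2)
    return (su, sv, suu, svv, suv), num / den

fertilizer = [15, 18, 20, 25, 29, 35, 40, 38]
productivity = [85, 93, 95, 105, 115, 130, 140, 145]
sums, r = pearson_from_deviations(fertilizer, productivity, 27, 113)
print(sums)         # (4, 4, 636, 3618, 1501)
print(round(r, 2))  # 0.99
```

The returned tuple reproduces the totals row of the table, which makes it easy to spot an arithmetic slip before substituting into the formula.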

Question 2. Find the Karl Pearson's correlation coefficient from the following information of the average weekly hours spent on Video games and the grade points obtained in an examination by 6 children of a big city.

| Weekly average hours spent for Video games | 43 | 47 | 45 | 50 | 40 | 51 |
|---|---|---|---|---|---|---|
| Grade points obtained in an examination | 5.2 | 4.9 | 5.0 | 4.7 | 5.4 | 4.3 |


Answer: Here, we have 6 children, so \( n = 6 \). Let weekly hours be \( X \) and grade points be \( Y \).
First, find the mean of \( X \) and \( Y \):
\( \Sigma X = 276 \) hours
\( \Sigma Y = 29.5 \) grade points
\( \bar{X} = \frac{\Sigma X}{n} = \frac{276}{6} = 46 \) hours
\( \bar{Y} = \frac{\Sigma Y}{n} = \frac{29.5}{6} \approx 4.92 \) grade points
Since \( \bar{Y} \) is not a whole number and the \( Y \) values are decimals, we will use the shortcut method by creating new variables \( u \) and \( v \).
Let \( u = X - A \) where \( A = 45 \) (a convenient value near \( \bar{X} \))
Let \( v = \frac{Y - B}{C_y} \) where \( B = 5 \) and \( C_y = 0.1 \) (to simplify Y values with decimals)
The table for calculation is as follows:

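| \( X \) | \( Y \) | \( u = X - 45 \) | \( v = \frac{Y - 5}{0.1} \) | \( u^2 \) | \( v^2 \) | \( uv \) |
|---|---|---|---|---|---|---|
| 43 | 5.2 | -2 | 2 | 4 | 4 | -4 |
| 47 | 4.9 | 2 | -1 | 4 | 1 | -2 |
| 45 | 5.0 | 0 | 0 | 0 | 0 | 0 |
| 50 | 4.7 | 5 | -3 | 25 | 9 | -15 |
| 40 | 5.4 | -5 | 4 | 25 | 16 | -20 |
| 51 | 4.3 | 6 | -7 | 36 | 49 | -42 |
| \( \Sigma X = 276 \) | \( \Sigma Y = 29.5 \) | \( \Sigma u = 6 \) | \( \Sigma v = -5 \) | \( \Sigma u^2 = 94 \) | \( \Sigma v^2 = 79 \) | \( \Sigma uv = -83 \) |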

Now, use Karl Pearson's correlation coefficient formula for \( u \) and \( v \):
\( r = \frac{n \Sigma uv - (\Sigma u)(\Sigma v)}{\sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot \sqrt{n \Sigma v^2 - (\Sigma v)^2}} \)
Substitute the values:
\( r = \frac{6 \times (-83) - (6)(-5)}{\sqrt{6 \times 94 - (6)^2} \cdot \sqrt{6 \times 79 - (-5)^2}} \)
\( r = \frac{-498 + 30}{\sqrt{564 - 36} \cdot \sqrt{474 - 25}} \)
\( r = \frac{-468}{\sqrt{528} \cdot \sqrt{449}} \)
\( r = \frac{-468}{22.978 \times 21.190} \)
\( r = \frac{-468}{486.9} \)
\( r \approx -0.96 \)
The correlation coefficient between weekly hours spent on video games and grade points is approximately -0.96.
In simple words: We found a very strong negative connection between time spent on video games and exam grades. This means that as children play more video games, their grades tend to go down significantly.

🎯 Exam Tip: When dealing with decimal values or large numbers, using the shortcut method with assumed mean and scaling (if applicable) can greatly simplify calculations. Be extra careful with negative signs in the formula.
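
When the coding \( v = \frac{Y - 5}{0.1} \) is done on a computer, binary floating point can turn a value such as 2 into something very slightly different. The sketch below is one possible way to avoid that, assuming the grade points are entered as strings and converted with Python's `decimal` module; the variable names are illustrative.

```python
from decimal import Decimal
import math

hours = [43, 47, 45, 50, 40, 51]
grades = ["5.2", "4.9", "5.0", "4.7", "5.4", "4.3"]  # strings keep the decimals exact

u = [x - 45 for x in hours]
# Decimal arithmetic keeps (Y - 5) / 0.1 an exact whole number
v = [int((Decimal(y) - Decimal("5")) / Decimal("0.1")) for y in grades]

n = len(u)
num = n * sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v)
den = math.sqrt(n * sum(a * a for a in u) - sum(u) ** 2) * math.sqrt(
    n * sum(b * b for b in v) - sum(v) ** 2
)
r = num / den
print(v)            # [2, -1, 0, -3, 4, -7]
print(round(r, 2))  # -0.96
```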

Question 3. Find Karl Pearson's correlation coefficient between density of population (per square km) and death rate (per thousand) from the following data:

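| City | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Density (per sq. km) | 750 | 600 | 350 | 500 | 200 | 700 | 850 |
| Death rate (per thousand) | 30 | 20 | 15 | 20 | 10 | 25 | 50 |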


Answer: Here, we have 7 cities, so \( n = 7 \). Let density be \( X \) and death rate be \( Y \).
We calculate the correlation coefficient using the shortcut method, creating new variables \( u \) and \( v \).
Let \( u = \frac{X - A}{C_x} \) where \( A = 500 \) and \( C_x = 50 \)
Let \( v = \frac{Y - B}{C_y} \) where \( B = 20 \) and \( C_y = 5 \)
The table for calculation is as follows:

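| City | \( X \) | \( Y \) | \( u = \frac{X - 500}{50} \) | \( v = \frac{Y - 20}{5} \) | \( u^2 \) | \( v^2 \) | \( uv \) |
|---|---|---|---|---|---|---|---|
| A | 750 | 30 | 5 | 2 | 25 | 4 | 10 |
| B | 600 | 20 | 2 | 0 | 4 | 0 | 0 |
| C | 350 | 15 | -3 | -1 | 9 | 1 | 3 |
| D | 500 | 20 | 0 | 0 | 0 | 0 | 0 |
| E | 200 | 10 | -6 | -2 | 36 | 4 | 12 |
| F | 700 | 25 | 4 | 1 | 16 | 1 | 4 |
| G | 850 | 50 | 7 | 6 | 49 | 36 | 42 |
| Total | - | - | \( \Sigma u = 9 \) | \( \Sigma v = 6 \) | \( \Sigma u^2 = 139 \) | \( \Sigma v^2 = 46 \) | \( \Sigma uv = 71 \) |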

Now, use Karl Pearson's correlation coefficient formula for \( u \) and \( v \):
\( r = \frac{n \Sigma uv - (\Sigma u)(\Sigma v)}{\sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot \sqrt{n \Sigma v^2 - (\Sigma v)^2}} \)
Substitute the values:
\( r = \frac{7 \times 71 - (9)(6)}{\sqrt{7 \times 139 - (9)^2} \cdot \sqrt{7 \times 46 - (6)^2}} \)
\( r = \frac{497 - 54}{\sqrt{973 - 81} \cdot \sqrt{322 - 36}} \)
\( r = \frac{443}{\sqrt{892} \cdot \sqrt{286}} \)
\( r = \frac{443}{29.866 \times 16.911} \)
\( r = \frac{443}{505.09} \)
\( r \approx 0.88 \)
The correlation coefficient between population density and death rate is approximately 0.88.
In simple words: We found a strong positive connection between how dense a city's population is and its death rate. This suggests that as population density increases, the death rate also tends to rise.

🎯 Exam Tip: When the values are large and share a common factor, the shortcut method with step deviation (scaling) reduces them to small whole numbers, making calculations easier and less prone to error. Choose the constants for scaling (\( C_x, C_y \)) so that every \( u \) and \( v \) is a whole number.
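
As a quick check that step deviation does not change the answer, the sketch below computes \( r \) once from the raw values and once from the coded values \( u \) and \( v \). The helper name `pearson_r` is an assumption made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r from the raw-sum form of the formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sx * sy
    den = math.sqrt(n * sum(x * x for x in xs) - sx ** 2) * math.sqrt(
        n * sum(y * y for y in ys) - sy ** 2
    )
    return num / den

density = [750, 600, 350, 500, 200, 700, 850]
death_rate = [30, 20, 15, 20, 10, 25, 50]

u = [(x - 500) // 50 for x in density]   # exact: every x - 500 is a multiple of 50
v = [(y - 20) // 5 for y in death_rate]  # exact: every y - 20 is a multiple of 5

r_raw = pearson_r(density, death_rate)
r_coded = pearson_r(u, v)
print(round(r_raw, 2), round(r_coded, 2))  # 0.88 0.88
```

Both routes give the same coefficient; the coded route simply keeps the intermediate numbers small enough to handle by hand.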

Question 4. The following information is obtained to study the relationship between the advertisement cost and the sales of electric fans of the companies manufacturing electric fans. Find the correlation coefficient between advertisement cost and the sales by Karl Pearson's method:

| Company | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| Advertisement Cost (lakh Rs.) | 140 | 120 | 80 | 100 | 80 | 180 |
| Sales of electric fans (crore Rs.) | 35 | 45 | 15 | 40 | 20 | 50 |


Answer: Here, we have 6 companies, so \( n = 6 \). Let advertisement cost be \( X \) and sales of electric fans be \( Y \).
We calculate the correlation coefficient using the shortcut method, creating new variables \( u \) and \( v \).
Let \( u = \frac{X - A}{C_x} \) where \( A = 100 \) and \( C_x = 20 \)
Let \( v = \frac{Y - B}{C_y} \) where \( B = 35 \) and \( C_y = 5 \)
The table for calculation is as follows:

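| Company | \( X \) | \( Y \) | \( u = \frac{X - 100}{20} \) | \( v = \frac{Y - 35}{5} \) | \( u^2 \) | \( v^2 \) | \( uv \) |
|---|---|---|---|---|---|---|---|
| A | 140 | 35 | 2 | 0 | 4 | 0 | 0 |
| B | 120 | 45 | 1 | 2 | 1 | 4 | 2 |
| C | 80 | 15 | -1 | -4 | 1 | 16 | 4 |
| D | 100 | 40 | 0 | 1 | 0 | 1 | 0 |
| E | 80 | 20 | -1 | -3 | 1 | 9 | 3 |
| F | 180 | 50 | 4 | 3 | 16 | 9 | 12 |
| Total | \( n = 6 \) | - | \( \Sigma u = 5 \) | \( \Sigma v = -1 \) | \( \Sigma u^2 = 23 \) | \( \Sigma v^2 = 39 \) | \( \Sigma uv = 21 \) |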

Now, use Karl Pearson's correlation coefficient formula for \( u \) and \( v \):
\( r = \frac{n \Sigma uv - (\Sigma u)(\Sigma v)}{\sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot \sqrt{n \Sigma v^2 - (\Sigma v)^2}} \)
Substitute the values:
\( r = \frac{6 \times 21 - (5)(-1)}{\sqrt{6 \times 23 - (5)^2} \cdot \sqrt{6 \times 39 - (-1)^2}} \)
\( r = \frac{126 + 5}{\sqrt{138 - 25} \cdot \sqrt{234 - 1}} \)
\( r = \frac{131}{\sqrt{113} \cdot \sqrt{233}} \)
\( r = \frac{131}{10.630 \times 15.264} \)
\( r = \frac{131}{162.26} \)
\( r \approx 0.81 \)
The correlation coefficient between advertisement cost and sales of electric fans is approximately 0.81.
In simple words: We found a strong positive relationship between the money spent on advertising and the sales of electric fans. This means that companies that spend more on advertising generally have higher sales.

🎯 Exam Tip: When using the shortcut method with both an assumed mean and scaling, remember that \( C_x \) and \( C_y \) are used only to define \( u \) and \( v \); they do not appear in the final correlation coefficient formula because they cancel out. Focus on correct summation values.
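
The cancellation can be sketched as follows, assuming \( C_x > 0 \) and \( C_y > 0 \) and writing \( X = A + C_x u \), \( Y = B + C_y v \). This is an added illustration, not a step from the textbook answer.

\( n \Sigma XY - (\Sigma X)(\Sigma Y) = C_x C_y \left[ n \Sigma uv - (\Sigma u)(\Sigma v) \right] \)
\( n \Sigma X^2 - (\Sigma X)^2 = C_x^2 \left[ n \Sigma u^2 - (\Sigma u)^2 \right] \)
\( n \Sigma Y^2 - (\Sigma Y)^2 = C_y^2 \left[ n \Sigma v^2 - (\Sigma v)^2 \right] \)
\( r_{XY} = \frac{C_x C_y \left[ n \Sigma uv - (\Sigma u)(\Sigma v) \right]}{C_x \sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot C_y \sqrt{n \Sigma v^2 - (\Sigma v)^2}} = r_{uv} \)

For the data above, \( C_x C_y = 20 \times 5 = 100 \): the raw numerator is \( 6 \times 26100 - 700 \times 205 = 13100 = 100 \times 131 \), and the raw denominator is \( 100 \times \sqrt{113} \cdot \sqrt{233} \), so the factor of 100 cancels and \( r \approx 0.81 \) either way.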

Question 5. A doctor obtains the following information for the weights of seven mothers and their children from a maternity home for his research to know the relation between the weights of mothers and weights of their children at the time of birth:

| Weight of mother (kg) | 59 | 72 | 66 | 64 | 77 | 66 | 60 |
|---|---|---|---|---|---|---|---|
| Weight of child (kg) | 2.5 | 3.4 | 3.1 | 2.7 | 2.8 | 2.3 | 3.0 |

Find rank correlation coefficient between the weights of mother and child.
Answer: Here, we have 7 pairs of observations, so \( n = 7 \).
Let \( X \) be the weight of the mother and \( Y \) be the weight of the child.
We assign ranks to \( X \) (denoted as \( R_x \)) and \( Y \) (denoted as \( R_y \)). When values are repeated, we assign the average rank.
For \( X \): The value 66 kg appears twice. The ranks would be 3 and 4, so the average rank is \( \frac{3+4}{2} = 3.5 \).
The table for calculating the rank correlation coefficient is prepared as follows:

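| Weight of mother (kg) \( X \) | Weight of child (kg) \( Y \) | Rank of X \( R_x \) | Rank of Y \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|---|
| 59 | 2.5 | 7 | 6 | 1 | 1 |
| 72 | 3.4 | 2 | 1 | 1 | 1 |
| 66 | 3.1 | 3.5 | 2 | 1.5 | 2.25 |
| 64 | 2.7 | 5 | 5 | 0 | 0 |
| 77 | 2.8 | 1 | 4 | -3 | 9 |
| 66 | 2.3 | 3.5 | 7 | -3.5 | 12.25 |
| 60 | 3.0 | 6 | 3 | 3 | 9 |
| Total | \( n = 7 \) | - | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 34.50 \) |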

Since there are repeated ranks, we need to calculate the Correction Factor (CF).
For \( X \): The value 66 is repeated 2 times (\( m = 2 \)).
\( CF = \Sigma \frac{m^3 - m}{12} \)
\( CF = \frac{2^3 - 2}{12} = \frac{8 - 2}{12} = \frac{6}{12} = 0.5 \)
Now, use Spearman's rank correlation coefficient formula with CF:
\( r = 1 - \frac{6 \left( \Sigma d^2 + CF \right)}{n(n^2 - 1)} \)
Substitute the values:
\( r = 1 - \frac{6 (34.5 + 0.5)}{7(7^2 - 1)} \)
\( r = 1 - \frac{6 (35)}{7(49 - 1)} \)
\( r = 1 - \frac{210}{7(48)} \)
\( r = 1 - \frac{210}{336} \)
\( r = 1 - 0.625 \)
\( r = 0.375 \approx 0.38 \)
The rank correlation coefficient between mother's weight and child's weight is approximately 0.38.
In simple words: We found a weak positive relationship between the weight of mothers and the weight of their babies at birth. This means that heavier mothers tend to have slightly heavier babies, but the connection is not very strong.

🎯 Exam Tip: When ranks are tied, calculating the Correction Factor (CF) is crucial. Ensure you apply the correct formula for CF for each repeated value and sum them up before adding to \( \Sigma d^2 \).
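
Assigning average ranks by hand is where most slips happen, so a small Python sketch of the ranking step may help. The helper `average_ranks` is a hypothetical name; it gives rank 1 to the largest value, as in the table above.

```python
def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, val in enumerate(ordered, start=1):
        positions.setdefault(val, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

mother = [59, 72, 66, 64, 77, 66, 60]
child = [2.5, 3.4, 3.1, 2.7, 2.8, 2.3, 3.0]

rx = average_ranks(mother)
ry = average_ranks(child)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
cf = (2 ** 3 - 2) / 12            # one tie group of size m = 2 (the value 66)
n = len(mother)
r = 1 - 6 * (d2 + cf) / (n * (n ** 2 - 1))
print(rx)     # [7.0, 2.0, 3.5, 5.0, 1.0, 3.5, 6.0]
print(d2, r)  # 34.5 0.375
```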

Question 6. The following data is obtained to know the relation between maximum day temperature and the sale of ice cream in Ahmedabad city:

| Maximum Temperature (Celsius) | 35 | 42 | 40 | 39 | 44 | 40 | 45 | 40 |
|---|---|---|---|---|---|---|---|---|
| Sale of ice cream (kg) | 600 | 680 | 750 | 630 | 920 | 750 | 900 | 720 |

Calculate the rank correlation coefficient.
Answer: Here, we have 8 observations, so \( n = 8 \).
Let \( X \) be the maximum temperature and \( Y \) be the sale of ice cream.
We assign ranks to \( X \) (denoted as \( R_x \)) and \( Y \) (denoted as \( R_y \)). When values are repeated, we assign the average rank.
For \( X \): The value 40 Celsius appears three times. The ranks would be 4, 5, 6, so the average rank is \( \frac{4+5+6}{3} = 5 \).
For \( Y \): The value 750 kg appears twice. The ranks would be 3 and 4, so the average rank is \( \frac{3+4}{2} = 3.5 \).
The table for calculating the rank correlation coefficient is prepared as follows:

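| Maximum Temperature (Celsius) \( X \) | Sale of ice cream (kg) \( Y \) | Rank of X \( R_x \) | Rank of Y \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|---|
| 35 | 600 | 8 | 8 | 0 | 0 |
| 42 | 680 | 3 | 6 | -3 | 9 |
| 40 | 750 | 5 | 3.5 | 1.5 | 2.25 |
| 39 | 630 | 7 | 7 | 0 | 0 |
| 44 | 920 | 2 | 1 | 1 | 1 |
| 40 | 750 | 5 | 3.5 | 1.5 | 2.25 |
| 45 | 900 | 1 | 2 | -1 | 1 |
| 40 | 720 | 5 | 5 | 0 | 0 |
| Total | \( n = 8 \) | - | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 15.50 \) |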

Since there are repeated ranks for both X and Y, we need to calculate the Correction Factor (CF).
For \( X \): The value 40 is repeated 3 times (\( m = 3 \)).
\( CF_x = \frac{3^3 - 3}{12} = \frac{27 - 3}{12} = \frac{24}{12} = 2 \)
For \( Y \): The value 750 is repeated 2 times (\( m = 2 \)).
\( CF_y = \frac{2^3 - 2}{12} = \frac{8 - 2}{12} = \frac{6}{12} = 0.5 \)
Total \( CF = CF_x + CF_y = 2 + 0.5 = 2.5 \)
Now, use Spearman's rank correlation coefficient formula with CF:
\( r = 1 - \frac{6 \left( \Sigma d^2 + CF \right)}{n(n^2 - 1)} \)
Substitute the values:
\( r = 1 - \frac{6 (15.5 + 2.5)}{8(8^2 - 1)} \)
\( r = 1 - \frac{6 (18)}{8(64 - 1)} \)
\( r = 1 - \frac{108}{8(63)} \)
\( r = 1 - \frac{108}{504} \)
\( r = 1 - 0.214 \)
\( r \approx 0.79 \)
The rank correlation coefficient between maximum temperature and ice cream sales is approximately 0.79.
In simple words: We found a strong positive connection between the highest daily temperature and how much ice cream is sold. This means that on hotter days, people tend to buy a lot more ice cream.

🎯 Exam Tip: When multiple values are repeated in *either* variable, calculate a separate Correction Factor (\( CF \)) for each set of tied ranks and then sum them up to get the total CF. Remember the formula \( \Sigma \frac{m^3 - m}{12} \).
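
The correction factor depends only on how many times each value repeats, so it can be counted mechanically. The following is a minimal sketch using `collections.Counter`; the function name `correction_factor` is an assumption for illustration, and \( \Sigma d^2 = 15.5 \) is taken from the table above.

```python
from collections import Counter

def correction_factor(values):
    """Sum of (m^3 - m)/12 over every value that occurs m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

temperature = [35, 42, 40, 39, 44, 40, 45, 40]
sales = [600, 680, 750, 630, 920, 750, 900, 720]

cf_x = correction_factor(temperature)   # 40 occurs three times
cf_y = correction_factor(sales)         # 750 occurs twice
n, d2 = 8, 15.5                         # sum of d^2 from the worked table
r = 1 - 6 * (d2 + cf_x + cf_y) / (n * (n ** 2 - 1))
print(cf_x, cf_y)   # 2.0 0.5
print(round(r, 2))  # 0.79
```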

Question 7. An entrance test required to study abroad is conducted online. The marks obtained in Reasoning Ability and English Speaking in this online test (having negative marking system for wrong answer) by 5 students selected in a sample are given below:

| Student | A | B | C | D | E |
|---|---|---|---|---|---|
| Marks in Reasoning Ability | 5 | 5 | 5 | 5 | 5 |
| Marks in English Speaking | 2 | -2 | -2 | 0 | 2 |

Find the rank correlation coefficient between Reasoning Ability and ability in English Speaking.
Answer: Here, we have 5 students, so \( n = 5 \).
Let \( X \) be marks in Reasoning Ability and \( Y \) be marks in English Speaking.
We assign ranks to \( X \) (denoted as \( R_x \)) and \( Y \) (denoted as \( R_y \)). When values are repeated, we assign the average rank.
For \( X \): The value 5 is repeated 5 times. The ranks would be 1, 2, 3, 4, 5, so the average rank is \( \frac{1+2+3+4+5}{5} = 3 \).
For \( Y \):
The value 2 is repeated 2 times. The ranks would be 1 and 2, so the average rank is \( \frac{1+2}{2} = 1.5 \).
The value -2 is repeated 2 times. The ranks would be 4 and 5, so the average rank is \( \frac{4+5}{2} = 4.5 \).
The value 0 has rank 3.
The table for calculating the rank correlation coefficient is prepared as follows:

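| Student | Marks in Reasoning Ability \( X \) | Marks in English Speaking \( Y \) | Rank of X \( R_x \) | Rank of Y \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|---|---|
| A | 5 | 2 | 3 | 1.5 | 1.5 | 2.25 |
| B | 5 | -2 | 3 | 4.5 | -1.5 | 2.25 |
| C | 5 | -2 | 3 | 4.5 | -1.5 | 2.25 |
| D | 5 | 0 | 3 | 3 | 0 | 0 |
| E | 5 | 2 | 3 | 1.5 | 1.5 | 2.25 |
| Total | \( n = 5 \) | - | - | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 9.0 \) |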

Since there are repeated ranks for both X and Y, we need to calculate the Correction Factor (CF).
For \( X \): The value 5 is repeated 5 times (\( m = 5 \)).
\( CF_x = \frac{5^3 - 5}{12} = \frac{125 - 5}{12} = \frac{120}{12} = 10 \)
For \( Y \): The value 2 is repeated 2 times (\( m = 2 \)).
\( CF_{y1} = \frac{2^3 - 2}{12} = \frac{6}{12} = 0.5 \)
For \( Y \): The value -2 is repeated 2 times (\( m = 2 \)).
\( CF_{y2} = \frac{2^3 - 2}{12} = \frac{6}{12} = 0.5 \)
Total \( CF = CF_x + CF_{y1} + CF_{y2} = 10 + 0.5 + 0.5 = 11 \)
Now, use Spearman's rank correlation coefficient formula with CF:
\( r = 1 - \frac{6 \left( \Sigma d^2 + CF \right)}{n(n^2 - 1)} \)
Substitute the values:
\( r = 1 - \frac{6 (9 + 11)}{5(5^2 - 1)} \)
\( r = 1 - \frac{6 (20)}{5(25 - 1)} \)
\( r = 1 - \frac{120}{5(24)} \)
\( r = 1 - \frac{120}{120} \)
\( r = 1 - 1 \)
\( r = 0 \)
The rank correlation coefficient between Reasoning Ability and English Speaking ability is 0.
In simple words: Every student scored the same marks in Reasoning Ability, so that variable cannot separate the students, and the rank correlation with English Speaking comes out as 0. There is no trend where higher marks in one subject go with higher or lower marks in the other.

🎯 Exam Tip: Always identify every repeated value in each variable and count its repetitions to correctly calculate the Correction Factor (CF). A zero rank correlation coefficient (\( r = 0 \)) means the two sets of ranks show no consistent increasing or decreasing pattern, but it does not rule out other types of relationships.
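
One way to see why the result is exactly 0 here: when all \( n \) values of \( X \) are tied, every \( R_x \) equals the mean rank \( \frac{n+1}{2} \), so \( \Sigma d^2 \) is just the sum of squared deviations of \( R_y \) from the mean rank. The lines below sketch that check using the standard identity for average ranks, with \( CF_y \) denoting the total correction for \( Y \); this is an added illustration, not part of the textbook answer.

\( \Sigma d^2 = \Sigma \left( R_y - \frac{n+1}{2} \right)^2 = \frac{n^3 - n}{12} - CF_y \)
\( CF_x = \frac{n^3 - n}{12} \)
\( \Sigma d^2 + CF_x + CF_y = \frac{n^3 - n}{6} \)
\( r = 1 - \frac{6 \cdot \frac{n^3 - n}{6}}{n(n^2 - 1)} = 1 - 1 = 0 \)

With \( n = 5 \): \( \frac{n^3 - n}{12} = 10 \) and \( CF_y = 0.5 + 0.5 = 1 \), so \( \Sigma d^2 = 10 - 1 = 9 \) and \( \Sigma d^2 + CF = 9 + 10 + 1 = 20 = \frac{120}{6} \), matching the values used above.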

Question 8. Six dancers A, B, C, D, E and F in a dance competition were judged by two dance Gurus. The ranks assigned to the dancers are as follows:

| Rank | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| By Guru 1 | B | F | A | C | D | E |
| By Guru 2 | F | A | C | B | E | D |

Find the rank correlation coefficient between the judgement of the two Gurus.
Answer: Here, we have 6 dancers, so \( n = 6 \).
Let \( R_x \) be the ranks given by Guru 1 and \( R_y \) be the ranks given by Guru 2.
First, we list the ranks for each dancer from both gurus:
Dancer A: \( R_x = 3, R_y = 2 \)
Dancer B: \( R_x = 1, R_y = 4 \)
Dancer C: \( R_x = 4, R_y = 3 \)
Dancer D: \( R_x = 5, R_y = 6 \)
Dancer E: \( R_x = 6, R_y = 5 \)
Dancer F: \( R_x = 2, R_y = 1 \)
The table for calculating the rank correlation coefficient is prepared as follows:

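| Dancer | \( R_x \) | \( R_y \) | \( d = R_x - R_y \) | \( d^2 \) |
|---|---|---|---|---|
| A | 3 | 2 | 1 | 1 |
| B | 1 | 4 | -3 | 9 |
| C | 4 | 3 | 1 | 1 |
| D | 5 | 6 | -1 | 1 |
| E | 6 | 5 | 1 | 1 |
| F | 2 | 1 | 1 | 1 |
| Total | \( n = 6 \) | - | \( \Sigma d = 0 \) | \( \Sigma d^2 = 14 \) |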

Now, use Spearman's rank correlation coefficient formula (since there are no tied ranks, CF = 0):
\( r = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)} \)
Substitute the values:
\( r = 1 - \frac{6 \times 14}{6(6^2 - 1)} \)
\( r = 1 - \frac{84}{6(36 - 1)} \)
\( r = 1 - \frac{84}{6(35)} \)
\( r = 1 - \frac{84}{210} \)
\( r = 1 - 0.4 \)
\( r = 0.6 \)
The rank correlation coefficient between the judgements of the two Gurus is 0.6.
In simple words: We found a moderate positive agreement between the two judges' rankings of the dancers. This means they largely agree on which dancers are better, but there are some differences.

🎯 Exam Tip: When ranks are already provided (not raw data), directly use them to calculate \( d \) and \( d^2 \). Always check for tied ranks, but if none exist, the Correction Factor (CF) is zero, simplifying the formula.
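
The only tricky step in this question is turning "who got rank 1, 2, 3, ..." into "which rank did each dancer get". A short Python sketch of that conversion is shown below, assuming each Guru's list is written in rank order as in the question; the variable names are illustrative.

```python
guru1 = "BFACDE"   # dancers listed from rank 1 to rank 6
guru2 = "FACBED"

# Position in the string is the rank that Guru gave the dancer
rank1 = {dancer: pos for pos, dancer in enumerate(guru1, start=1)}
rank2 = {dancer: pos for pos, dancer in enumerate(guru2, start=1)}

n = len(guru1)
d2 = sum((rank1[d] - rank2[d]) ** 2 for d in sorted(rank1))
r = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(rank1["A"], rank2["A"])  # 3 2
print(d2, round(r, 2))         # 14 0.6
```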

Question 9. The following data is obtained for two variables, inflation (X) and interest rate (Y):
\( n = 50, \Sigma x = 500, \Sigma y = 300, \Sigma x^2 = 5450, \Sigma y^2 = 2000, \Sigma xy = 3090 \)
Later on, it was found that one pair of observations (10, 6) had been included additionally by mistake. Find the correlation coefficient after excluding this pair of observations.
Answer: First, we need to correct the given sums by excluding the mistaken observation (10, 6).
Mistaken observation: \( x_{mistake} = 10, y_{mistake} = 6 \)
Original values: \( n = 50, \Sigma x = 500, \Sigma y = 300, \Sigma x^2 = 5450, \Sigma y^2 = 2000, \Sigma xy = 3090 \)
Corrected values:
\( n_{corrected} = n - 1 = 50 - 1 = 49 \)
\( \Sigma x_{corrected} = \Sigma x - x_{mistake} = 500 - 10 = 490 \)
\( \Sigma y_{corrected} = \Sigma y - y_{mistake} = 300 - 6 = 294 \)
\( \Sigma x^2_{corrected} = \Sigma x^2 - x_{mistake}^2 = 5450 - 10^2 = 5450 - 100 = 5350 \)
\( \Sigma y^2_{corrected} = \Sigma y^2 - y_{mistake}^2 = 2000 - 6^2 = 2000 - 36 = 1964 \)
\( \Sigma xy_{corrected} = \Sigma xy - (x_{mistake} \times y_{mistake}) = 3090 - (10 \times 6) = 3090 - 60 = 3030 \)
Now, use Karl Pearson's correlation coefficient formula with the corrected values:
\( r = \frac{n \Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{n \Sigma x^2 - (\Sigma x)^2} \cdot \sqrt{n \Sigma y^2 - (\Sigma y)^2}} \)
Substitute the corrected values:
\( r = \frac{49 \times 3030 - (490)(294)}{\sqrt{49 \times 5350 - (490)^2} \cdot \sqrt{49 \times 1964 - (294)^2}} \)
\( r = \frac{148470 - 144060}{\sqrt{262150 - 240100} \cdot \sqrt{96236 - 86436}} \)
\( r = \frac{4410}{\sqrt{22050} \cdot \sqrt{9800}} \)
\( r = \frac{4410}{148.492 \times 98.995} \)
\( r = \frac{4410}{14700} \)
\( r = 0.3 \)
The correlation coefficient after excluding the erroneous observation is 0.3.
In simple words: We first removed a wrong data point from the totals. Then, using these corrected totals, we calculated the correlation. The result of 0.3 shows a weak positive link between inflation and interest rates.

🎯 Exam Tip: When correcting data, remember to adjust \( n \) as well as the individual sums (\( \Sigma x \), \( \Sigma y \), \( \Sigma x^2 \), \( \Sigma y^2 \), \( \Sigma xy \)). A common mistake is to adjust only the sums without changing \( n \).
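
The correction step is just a set of subtractions, which the following Python sketch makes explicit. The variable names are illustrative; the totals are the ones given in the question.

```python
import math

n, sx, sy, sxx, syy, sxy = 50, 500, 300, 5450, 2000, 3090
bad_x, bad_y = 10, 6   # the pair included by mistake

# Remove the pair's contribution from every total, including n
n -= 1
sx -= bad_x
sy -= bad_y
sxx -= bad_x ** 2
syy -= bad_y ** 2
sxy -= bad_x * bad_y

num = n * sxy - sx * sy
den = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
r = num / den
print(num, den, round(r, 2))  # 4410 14700.0 0.3
```

Taking one square root of the product shows that the denominator is exactly 14700, so \( r = 0.3 \) without any rounding of intermediate square roots.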

Question 10. The information regarding sales (X) and expenses (Y) of 10 firms is given below:
\( \bar{x} = 58, \bar{y} = 14, \Sigma(x - 65)^2 = 850, \Sigma(y - 13)^2 = 32, \Sigma(x - 65)(y - 13) = 0 \)
Find the correlation coefficient.
Answer: We are given:
\( n = 10 \)
\( \bar{x} = 58, \bar{y} = 14 \)
\( \Sigma(x - 65)^2 = 850 \)
\( \Sigma(y - 13)^2 = 32 \)
\( \Sigma(x - 65)(y - 13) = 0 \)
Let \( u = x - 65 \) and \( v = y - 13 \).
Then, \( \Sigma u^2 = \Sigma(x - 65)^2 = 850 \)
And \( \Sigma v^2 = \Sigma(y - 13)^2 = 32 \)
And \( \Sigma uv = \Sigma(x - 65)(y - 13) = 0 \)
Now, we need to find \( \Sigma u \) and \( \Sigma v \).
We know \( \bar{x} = 58 \).
\( \Sigma u = \Sigma(x - 65) = \Sigma x - 65n \)
Since \( \bar{x} = \frac{\Sigma x}{n} \implies \Sigma x = n \bar{x} \)
\( \Sigma u = n \bar{x} - 65n = 10 \times 58 - 65 \times 10 = 580 - 650 = -70 \)
We know \( \bar{y} = 14 \).
\( \Sigma v = \Sigma(y - 13) = \Sigma y - 13n \)
Since \( \bar{y} = \frac{\Sigma y}{n} \implies \Sigma y = n \bar{y} \)
\( \Sigma v = n \bar{y} - 13n = 10 \times 14 - 13 \times 10 = 140 - 130 = 10 \)
Now, use Karl Pearson's correlation coefficient formula for \( u \) and \( v \):
\( r = \frac{n \Sigma uv - (\Sigma u)(\Sigma v)}{\sqrt{n \Sigma u^2 - (\Sigma u)^2} \cdot \sqrt{n \Sigma v^2 - (\Sigma v)^2}} \)
Substitute the values:
\( r = \frac{10 \times 0 - (-70)(10)}{\sqrt{10 \times 850 - (-70)^2} \cdot \sqrt{10 \times 32 - (10)^2}} \)
\( r = \frac{0 + 700}{\sqrt{8500 - 4900} \cdot \sqrt{320 - 100}} \)
\( r = \frac{700}{\sqrt{3600} \cdot \sqrt{220}} \)
\( r = \frac{700}{60 \times 14.832} \)
\( r = \frac{700}{889.92} \)
\( r \approx 0.79 \)
The correlation coefficient obtained is approximately 0.79.
In simple words: Given some calculated sums and means, we adjusted the data to simpler variables and then used the correlation formula. The result of 0.79 shows a strong positive link between sales and expenses, meaning as sales increase, expenses tend to increase as well.

🎯 Exam Tip: When given sums of squared deviations from values other than the mean, remember that \( \Sigma(x-A)^2 = \Sigma x^2 - 2A \Sigma x + nA^2 \). If \( u = x-A \), then \( \Sigma u = \Sigma x - nA \) and \( \Sigma u^2 = \Sigma(x-A)^2 \). Carefully calculate \( \Sigma u \) and \( \Sigma v \) using the given means before applying the formula.
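
The key step, recovering \( \Sigma u \) and \( \Sigma v \) from the means, can be sketched in a few lines of Python. The names `a` and `b` stand for the reference values 65 and 13 and are chosen only for illustration.

```python
import math

n = 10
mean_x, mean_y = 58, 14
a, b = 65, 13                 # reference values used in the given sums
suu, svv, suv = 850, 32, 0    # given sums of u^2, v^2 and uv

su = n * (mean_x - a)         # sum of u = n * mean_x - n * a
sv = n * (mean_y - b)         # sum of v = n * mean_y - n * b

num = n * suv - su * sv
den = math.sqrt(n * suu - su ** 2) * math.sqrt(n * svv - sv ** 2)
r = num / den
print(su, sv, num)   # -70 10 700
print(round(r, 2))   # 0.79
```

Even though \( \Sigma uv = 0 \), the numerator is not zero, because the deviations were taken from 65 and 13 rather than from the actual means.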

Question 11. The rank correlation coefficient calculated from the ranks of two variables X and Y for 10 persons is 0.6. On subsequent verification, it was noticed that the difference of ranks of X and Y for one of the persons was taken as 2 instead of 4. Find the correct value of rank correlation coefficient.
Answer: We are given:
\( n = 10 \)
Initial rank correlation coefficient \( r = 0.6 \)
Wrong difference of ranks \( d_{wrong} = 2 \)
Correct difference of ranks \( d_{correct} = 4 \)
First, find the initial \( \Sigma d^2 \) using the given \( r \):
\( r = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)} \)
\( 0.6 = 1 - \frac{6 \Sigma d^2}{10(10^2 - 1)} \)
\( 0.6 = 1 - \frac{6 \Sigma d^2}{10(100 - 1)} \)
\( 0.6 = 1 - \frac{6 \Sigma d^2}{10 \times 99} \)
\( 0.6 = 1 - \frac{6 \Sigma d^2}{990} \)
\( \frac{6 \Sigma d^2}{990} = 1 - 0.6 \)
\( \frac{6 \Sigma d^2}{990} = 0.4 \)
\( 6 \Sigma d^2 = 0.4 \times 990 \)
\( 6 \Sigma d^2 = 396 \)
\( \Sigma d^2 = \frac{396}{6} = 66 \)
Now, correct \( \Sigma d^2 \):
\( \Sigma d^2_{corrected} = \Sigma d^2_{initial} - (d_{wrong})^2 + (d_{correct})^2 \)
\( \Sigma d^2_{corrected} = 66 - (2)^2 + (4)^2 \)
\( \Sigma d^2_{corrected} = 66 - 4 + 16 \)
\( \Sigma d^2_{corrected} = 78 \)
Finally, calculate the corrected rank correlation coefficient:
\( r_{corrected} = 1 - \frac{6 \Sigma d^2_{corrected}}{n(n^2 - 1)} \)
\( r_{corrected} = 1 - \frac{6 \times 78}{10(10^2 - 1)} \)
\( r_{corrected} = 1 - \frac{468}{10(99)} \)
\( r_{corrected} = 1 - \frac{468}{990} \)
\( r_{corrected} = 1 - 0.4727 \)
\( r_{corrected} \approx 0.53 \)
The corrected rank correlation coefficient is approximately 0.53.
In simple words: We first used the wrong correlation value to find the sum of squared differences in ranks. Then, we adjusted this sum by removing the square of the wrong difference and adding the square of the correct difference. Finally, we recalculated the correlation with the corrected sum, getting a new value of about 0.53.

🎯 Exam Tip: When correcting \( \Sigma d^2 \), remember the formula: \( \Sigma d^2_{new} = \Sigma d^2_{old} - (d_{wrong})^2 + (d_{correct})^2 \). This is a common type of error correction question.
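
The three steps above (recover \( \Sigma d^2 \), correct it, recompute \( r \)) can be sketched in Python as follows. Both function names are hypothetical helpers, and the formula without a correction factor is assumed, as in the answer.

```python
def sum_d2_from_r(r, n):
    """Invert r = 1 - 6*sum_d2 / (n*(n^2 - 1)) to recover sum_d2 (no ties assumed)."""
    return (1 - r) * n * (n ** 2 - 1) / 6

def spearman(sum_d2, n):
    """Spearman's rank correlation coefficient without a correction factor."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

n = 10
old_d2 = round(sum_d2_from_r(0.6, n))   # rounded to absorb floating-point noise
new_d2 = old_d2 - 2 ** 2 + 4 ** 2       # swap the wrong d for the correct one
r_new = spearman(new_d2, n)
print(old_d2, new_d2, round(r_new, 2))  # 66 78 0.53
```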

Question 12. The information of health index x and life expectancy y is obtained for 10 people. These data are ranked to find the rank correlation coefficient and the sum of squares of the differences of the ranks was found to be 42.5. It was also observed that health index 70 was repeated three times and life expectancy 45 was repeated twice in the data. Find the rank correlation coefficient using this information.
Answer: We are given:
\( n = 10 \)
\( \Sigma d^2 = 42.5 \)
Information about tied ranks:
For health index \( X \): The value 70 is repeated 3 times (\( m = 3 \)).
For life expectancy \( Y \): The value 45 is repeated 2 times (\( m = 2 \)).
First, calculate the Correction Factor (CF) for tied ranks.
For \( X \): \( CF_x = \frac{m^3 - m}{12} = \frac{3^3 - 3}{12} = \frac{27 - 3}{12} = \frac{24}{12} = 2.0 \)
For \( Y \): \( CF_y = \frac{m^3 - m}{12} = \frac{2^3 - 2}{12} = \frac{8 - 2}{12} = \frac{6}{12} = 0.5 \)
Total \( CF = CF_x + CF_y = 2.0 + 0.5 = 2.5 \)
Now, use Spearman's rank correlation coefficient formula with CF:
\( r = 1 - \frac{6 \left( \Sigma d^2 + CF \right)}{n(n^2 - 1)} \)
Substitute the values:
\( r = 1 - \frac{6 (42.5 + 2.5)}{10(10^2 - 1)} \)
\( r = 1 - \frac{6 (45)}{10(100 - 1)} \)
\( r = 1 - \frac{270}{10(99)} \)
\( r = 1 - \frac{270}{990} \)
\( r = 1 - 0.2727 \)
\( r \approx 0.73 \)
The rank correlation coefficient is approximately 0.73.
In simple words: We were given the sum of squared differences for ranks and information about repeated values. We used this to calculate a "correction factor" for the repeated ranks. Then, we applied Spearman's formula with this correction to find the rank correlation coefficient, which turned out to be around 0.73, indicating a fairly strong positive relationship.

🎯 Exam Tip: When given \( \Sigma d^2 \) directly, always check for information about tied ranks. If ties are mentioned, calculate the total Correction Factor (CF) and include it in the numerator of the Spearman's rank correlation formula.
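
To see how much the correction factor matters in a question like this, the sketch below computes the coefficient with and without it. The function `spearman_from_summary` and its `tie_sizes` parameter are assumptions made for illustration; `tie_sizes` lists the number of repetitions \( m \) of each tied value.

```python
def spearman_from_summary(n, sum_d2, tie_sizes=()):
    """Spearman's r from n, the sum of d^2 and the size m of each tie group."""
    cf = sum((m ** 3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + cf) / (n * (n ** 2 - 1))

r_corrected = spearman_from_summary(10, 42.5, tie_sizes=[3, 2])
r_uncorrected = spearman_from_summary(10, 42.5)
print(round(r_corrected, 2), round(r_uncorrected, 2))  # 0.73 0.74
```

Leaving out the correction factor would give about 0.74 instead of 0.73, a small but avoidable error.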
