This post shares the Gauhati University Business Statistics question paper solution for 2023 (B.Com 3rd Semester, GU), which can be useful for your upcoming exam preparation. Read the post from top to bottom to get familiar with the solutions.
Gauhati University B.Com 3rd Semester
GU Business Statistics Solved Question paper 2023
COMMERCE (Honours Generic)
Paper: COM-HG-3016 (Business Statistics)
The figures in the margin indicate full marks for the questions.
Answer Question Nos. 1, 2, 3 and any four from the rest.
1. (A) Select the correct answer: 1x4=4
(i) What is the empirical relationship between mean, median, and mode?
(a) Mean - Mode = 3 (Mean - Median)
(b) Mean - Median = 3 (Mean - Mode)
(c) Mode - Mean = 3 (Mean - Median)
(d) None of the above
Answer: (a) Mean - Mode = 3 (Mean - Median)
(ii) Standard Deviation (S.D.) is independent of the change of:
(a) origin
(b) scale
(c) origin and scale
(d) None of the above
Answer: (a) origin
(iii) Which of the following is a unitless measure?
(a) Median
(b) Standard Deviation
(c) Mean Deviation
(d) Coefficient of Variation
Answer: (d) Coefficient of Variation
(iv) What are the salient features responsible for seasonal variation?
(a) weather
(b) social custom
(c) festival
(d) All of the above
Answer: (d) All of the above
(B) Fill in the blanks: 1x3=3
(i) The coefficient of correlation lies between -1 and _______.
Answer: The coefficient of correlation lies between -1 and 1.
(ii) Variance is denoted by _______.
Answer: Variance is denoted by σ².
(iii) If A and B are two independent events, then P(A/B) = _______.
Answer: If A and B are two independent events, then P(A/B) = P(A).
(C) Write True or False: 1x3=3
(i) Mean of the binomial distribution is less than variance.
Answer: False
(ii) Coefficient of Standard Deviation is σ/x̄.
Answer: True
(iii) Normal distribution is a continuous distribution.
Answer: True
2. Answer the following questions: 2×5=10
(a) Write two properties of regression coefficients.
Answer:
The product of the two regression coefficients (of X on Y and of Y on X) is less than or equal to 1: bxy ⋅ byx ≤ 1.
The regression coefficients are independent of the change in origin but depend on the change of scale.
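Both properties can be checked numerically. The sketch below uses a small set of hypothetical paired observations (the data and the helper name `regression_coefficients` are illustrative assumptions, not part of the question paper) and computes byx and bxy before and after a change of origin and a change of scale.

```python
# Hypothetical paired observations, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def regression_coefficients(x, y):
    """Return (byx, bxy, byx * bxy) for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    byx = sxy / sxx  # regression coefficient of Y on X
    bxy = sxy / syy  # regression coefficient of X on Y
    return byx, bxy, byx * bxy

byx, bxy, product = regression_coefficients(x, y)

# Change of origin: subtract constants from every observation.
shifted = regression_coefficients([a - 3 for a in x], [b - 4 for b in y])

# Change of scale: multiply the observations by constants.
scaled = regression_coefficients([a * 2 for a in x], [b * 10 for b in y])

print(byx, bxy, product)  # coefficients and their product (at most 1)
print(shifted)            # same coefficients as the original data
print(scaled)             # byx and bxy change, their product does not
```

The product byx ⋅ bxy equals r², which is why it can never exceed 1.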
(b) Distinguish between Parameter and Statistic.
Answer:
(c) Find E(X) for the following probability distribution of X:
Solution:
(d) State the Factor Reversal Test (FRT) of Index Numbers.
Answer:
The Factor Reversal Test states that the product of a price index and a quantity index should equal the value index. Mathematically:
P⋅Q=V
Where P is the price index, Q is the quantity index, and V is the value index.
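As an illustration, the sketch below checks the test on hypothetical base-year and current-year prices and quantities (all figures are assumptions made for this example). It uses Fisher's ideal index, which is known to satisfy the Factor Reversal Test, and contrasts it with Laspeyres' index, which generally does not. Indices are kept as ratios rather than multiplied by 100.

```python
from math import sqrt, isclose

# Hypothetical prices (p) and quantities (q) for three commodities.
p0 = [2, 5, 4]    # base-year prices
q0 = [10, 6, 8]   # base-year quantities
p1 = [3, 6, 5]    # current-year prices
q1 = [12, 5, 10]  # current-year quantities

def total(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

# Value index V = sum(p1*q1) / sum(p0*q0)
value_index = total(p1, q1) / total(p0, q0)

# Fisher's ideal price and quantity indices.
fisher_price = sqrt(total(p1, q0) / total(p0, q0) * total(p1, q1) / total(p0, q1))
fisher_quantity = sqrt(total(p0, q1) / total(p0, q0) * total(p1, q1) / total(p1, q0))

# Laspeyres' price and quantity indices (base-year weights).
laspeyres_price = total(p1, q0) / total(p0, q0)
laspeyres_quantity = total(p0, q1) / total(p0, q0)

print(isclose(fisher_price * fisher_quantity, value_index))        # True
print(isclose(laspeyres_price * laspeyres_quantity, value_index))  # False
```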
(e) Three coins are tossed. Write down the sample space.
Answer:
When three coins are tossed, the possible outcomes are:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Where H represents heads and T represents tails.
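The same sample space can be enumerated programmatically. This is a minimal sketch using Python's standard library; the variable name `sample_space` is chosen here for illustration.

```python
from itertools import product

# Every ordered outcome of three tosses, each toss being H or T.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

print(sample_space)       # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(len(sample_space))  # 8, that is 2 ** 3
```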
3. Answer any four of the following questions: 5×4=20
(a) Write down any five essential characteristics of an ideal questionnaire.
Ans:- An ideal questionnaire is a well-designed tool used to collect accurate and relevant information from respondents. It is clear, concise, unbiased, logically organized, and ensures ease of understanding and answering, making it effective for achieving research objectives.
Five Essential Characteristics of an Ideal Questionnaire
Clarity: Questions should be clearly worded, easy to understand, and free from ambiguity.
Relevance: Each question should be directly related to the research objectives and avoid unnecessary details.
Conciseness: Questions should be concise and to the point, avoiding overly complex or lengthy phrasing.
Neutrality: Questions should not lead or bias respondents; they should allow for honest and independent responses.
Logical Flow: The questions should be arranged in a logical order, transitioning smoothly from one topic to another.
(b) The regression lines have the equations 3x+2y=6 and 7x+5y=12. Find x and y.
Ans:-
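One way to approach this, assuming the question asks for the means x̄ and ȳ: the two regression lines always intersect at the point (x̄, ȳ), so the means are found by solving the two equations simultaneously. The sketch below does this with Cramer's rule; the helper name `solve_2x2` is an assumption made for this example.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the lines are parallel or identical")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# 3x + 2y = 6 and 7x + 5y = 12
mean_x, mean_y = solve_2x2(3, 2, 6, 7, 5, 12)
print(mean_x, mean_y)  # 6 -6
```

Substituting back confirms the result: 3(6) + 2(−6) = 6 and 7(6) + 5(−6) = 12.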
(c) Describe the procedures of testing a hypothesis.
Ans:- Procedures of Testing a Hypothesis
Formulate the Hypotheses:
Null Hypothesis (H0): The default assumption or claim.
Alternative Hypothesis (Ha): The claim to be tested against H0.
Select the Significance Level (α): Choose a probability threshold (e.g., 0.05) for rejecting H0.
Choose the Test Statistic: Decide on a statistical method (e.g., t-test, z-test, chi-square test) based on the data and research question.
Compute the Test Statistic: Calculate the value of the chosen statistic using sample data.
Determine the Critical Value or p-Value: Compare the computed statistic to a critical value or interpret the p-value.
Make a Decision:
If the test statistic falls in the critical region (for example, its absolute value exceeds the critical value in a two-tailed test) or the p-value is less than α, reject H0.
Otherwise, fail to reject H0.
Interpret the Results: Summarize the findings in the context of the research question.
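The steps above can be sketched for one simple case: a two-tailed z-test for a population mean with known standard deviation. The function name `one_sample_z_test` and all the numbers are hypothetical, chosen only to illustrate the procedure.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 against Ha: mu != mu0."""
    standard_normal = NormalDist()
    # Compute the test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Determine the critical value and the p-value.
    critical = standard_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    # Make the decision.
    reject = p_value < alpha
    return z, critical, p_value, reject

# Hypothetical data: sample mean 52.5 from n = 64, testing mu0 = 50 with sigma = 10.
z, critical, p_value, reject = one_sample_z_test(52.5, 50, 10, 64)
print(z)       # 2.0
print(reject)  # True: |z| exceeds the critical value of about 1.96
```

Comparing |z| with the critical value and comparing the p-value with α always lead to the same decision; they are two views of the same rule.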
(d) What do you mean by a "scatter diagram"? How can the correlation between two variables be studied with the help of this diagram?
Ans:- Scatter Diagram and Its Use for Correlation Study
A scatter diagram is a graphical representation of the relationship between two variables. Each point on the diagram represents an observation, with one variable plotted on the x-axis and the other on the y-axis.
How Correlation Can Be Studied Using a Scatter Diagram:
Positive Correlation: Points are clustered upward, indicating that as one variable increases, the other also increases.
Negative Correlation: Points are clustered downward, indicating that as one variable increases, the other decreases.
No Correlation: Points are scattered randomly, showing no relationship between the variables.
Strength of Correlation: The closer the points are to forming a straight line, the stronger the correlation.
(e) Determine the mode for the following distribution:
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
(f) Write a note on the advantages of sample survey over census method.
Ans:- Advantages of Sample Survey over Census Method
A sample survey offers several advantages over the census method, including:
Cost-Effective: Sample surveys are typically less expensive as they involve collecting data from only a subset of the population, unlike the census method which requires data from every individual.
Time-Efficient: Since a smaller number of people are surveyed, data collection and analysis can be completed much faster compared to a census, which may take years to process.
Practicality: In situations where it's difficult or impossible to reach the entire population, sample surveys offer a feasible solution. This can be especially relevant in large or geographically dispersed populations.
Accuracy: A well-designed sample survey can provide more accurate and reliable data in a shorter time frame than a census, which is prone to human errors, omissions, and misreporting when covering a large number of people.
Flexibility: Sample surveys can be more easily adapted to focus on specific groups or characteristics, allowing for more targeted analysis, whereas a census gathers data from every individual, which can be less flexible.
Lower Administrative Burden: Managing a sample survey requires less administrative effort than conducting a full census, which often involves significant coordination and resource allocation.
Reduced Complexity: In terms of data processing, sample surveys are less complex than censuses, since they deal with fewer records, making analysis and interpretation more straightforward.
4. (a) Find Q₁ and D₄ from the following data: 6
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
(b) Define coefficient of variation. What are the special uses of this measure? 4
Ans:- Coefficient of Variation (CV)
The coefficient of variation (CV) is a statistical measure that expresses the relative variability of a data set compared to its mean. It is calculated as the ratio of the standard deviation (σ) to the mean (μ), often expressed as a percentage:
CV = (σ/μ) × 100
Where:
σ is the standard deviation,
μ is the mean of the data.
Special Uses of Coefficient of Variation:
Comparing Distributions: The CV allows comparison of the relative variability of two or more data sets, even when their means differ. A higher CV indicates greater variability in relation to the mean.
Risk Assessment: In finance and economics, the CV is used to compare the risk (volatility) of different investments relative to their expected returns. A higher CV implies a higher risk.
Consistency Measurement: In quality control and manufacturing, a lower CV indicates more consistency or uniformity in a process or product, while a higher CV suggests variability.
Data Normalization: The CV helps to standardize the data, particularly in cases where data sets have different units or scales, making it easier to compare them.
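A short sketch of the consistency comparison, using hypothetical scores for two batsmen (the data and names are assumptions for illustration). Both have the same mean, so only the CV separates them.

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """CV as a percentage, using the population standard deviation."""
    return pstdev(data) / mean(data) * 100

# Hypothetical scores of two batsmen over five innings.
batsman_a = [40, 50, 60, 50, 50]
batsman_b = [10, 90, 50, 20, 80]

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)

print(round(cv_a, 2))  # 12.65
print(round(cv_b, 2))  # 63.25
```

Batsman A has the lower CV and is therefore the more consistent of the two, even though the averages are identical.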
5.(a) (i) State the multiplication law of probability.
Ans:- Multiplication Law of Probability
The multiplication law of probability states that the probability of two independent events occurring together is the product of their individual probabilities. Mathematically:
P(A∩B)=P(A)×P(B)
Where:
P(A∩B) is the probability of both events A and B happening.
P(A) is the probability of event A occurring.
P(B) is the probability of event B occurring.
If the events are dependent, the law is modified to:
P(A∩B)=P(A)×P(B/A)
Where P(B/A) is the conditional probability of event B given that event A has occurred.
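Both forms of the law can be illustrated with exact fractions. The two scenarios below (tossing a fair coin twice, and drawing two cards without replacement from a standard 52-card pack) are examples chosen here, not taken from the question paper.

```python
from fractions import Fraction

# Independent events: two heads in two tosses of a fair coin.
p_head = Fraction(1, 2)
p_two_heads = p_head * p_head  # P(A) x P(B)

# Dependent events: two aces drawn without replacement.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # P(B/A): 3 aces left among 51 cards
p_two_aces = p_first_ace * p_second_ace_given_first  # P(A) x P(B/A)

print(p_two_heads)  # 1/4
print(p_two_aces)   # 1/221
```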
(ii) The probability that a person travels by plane is 1/5 and that he travels by train is 2/3. Find the probability of his travelling by plane or train. Also find the probability of not travelling either by plane or train. 2+3+1=6
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
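A sketch of one way to approach this question, under the assumption that the two modes of travel are mutually exclusive (the person cannot use both for the same journey), so the addition law P(A∪B) = P(A) + P(B) applies. This is an illustration and not a reproduction of the PDF solution.

```python
from fractions import Fraction

p_plane = Fraction(1, 5)
p_train = Fraction(2, 3)

# Addition law for mutually exclusive events (assumed here).
p_plane_or_train = p_plane + p_train

# Complement: travelling by neither plane nor train.
p_neither = 1 - p_plane_or_train

print(p_plane_or_train)  # 13/15
print(p_neither)         # 2/15
```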
(b) Define the following terms with one example: 2+2=4
(i) Mutually exclusive events
(ii) Equally likely events
Ans:- Definitions and Examples:
(i) Mutually Exclusive Events
Mutually exclusive events are events that cannot occur at the same time. In other words, if one event occurs, the other cannot. The occurrence of one event excludes the possibility of the other event happening.
Example:
Tossing a fair coin: The events "getting a head" and "getting a tail" are mutually exclusive because you cannot get both outcomes on a single toss of the coin. If the coin lands on heads, it cannot land on tails at the same time.
(ii) Equally Likely Events
Equally likely events are events that have the same probability of occurring. Each outcome has an identical chance of happening.
Example:
Rolling a fair six-sided die: The events "rolling a 1", "rolling a 2", "rolling a 3", and so on are equally likely, as each face of the die has a 1/6 probability of landing face up.
Summary:
Mutually exclusive events: Events that cannot happen together (e.g., heads or tails in a coin toss).
Equally likely events: Events that have the same probability of occurring (e.g., rolling any one of the six faces of a fair die).
6. (a) Write the definition of Spearman's rank coefficient. Find the rank correlation coefficient for the following data of marks obtained by 10 students in Mathematics and Statistics. 2+5=7
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
(b) Interpret the values of the correlation coefficient (r).
r = 0, r = +1, r = −1
Ans:- Interpretation of the Correlation Coefficient
The correlation coefficient r, like Spearman's rank correlation coefficient rs, can take values between −1 and +1. The interpretation of these values is as follows:
r = 0:
No correlation: When r = 0, there is no linear relationship between the two variables. For rank correlation, rs = 0 means that the ranks of the two variables show no consistent tendency to move together or in opposite directions. A value of 0 does not rule out a non-linear relationship.
r = +1:
Perfect positive correlation: When r = +1, there is a perfect positive correlation between the two variables. As one variable increases, the other increases in a perfectly consistent manner. For Karl Pearson's r, all the data points lie on a straight line with a positive slope; for rs, the two sets of ranks agree exactly.
r = −1:
Perfect negative correlation: When r = −1, there is a perfect negative correlation between the two variables. As one variable increases, the other decreases in a perfectly consistent manner. For Karl Pearson's r, all the data points lie on a straight line with a negative slope; for rs, the two sets of ranks are exactly reversed.
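The three cases can be reproduced with Spearman's formula rs = 1 − 6Σd² / (n(n² − 1)), which holds when there are no tied ranks. The ranks below are hypothetical and the function name is an assumption made for this sketch.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's rs from two lists of ranks (assumes no tied ranks)."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

ranks = [1, 2, 3, 4, 5]

identical = spearman_rank_correlation(ranks, ranks)            # ranks agree exactly
reversed_ = spearman_rank_correlation(ranks, [5, 4, 3, 2, 1])  # ranks exactly reversed
mixed = spearman_rank_correlation(ranks, [2, 1, 4, 3, 5])      # mostly in agreement

print(identical)  # 1.0
print(reversed_)  # -1.0
print(round(mixed, 2))  # 0.8
```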
7. (a) Given below the bivariate data:
(i) Fit a regression line of Y on X and estimate Y when X = 5.8.
(ii) Fit a regression line of X on Y and estimate X when Y = 9.5.
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
(b) Explain the concept of Type I error and Type II error associated with Testing of Statistical hypothesis. 4
Ans:- Type I and Type II Errors in Hypothesis Testing
In hypothesis testing, two types of errors can occur when making a decision based on sample data:
Type I Error (False Positive):
Definition: Type I error occurs when the null hypothesis (H₀) is rejected when it is actually true.
Probability: The probability of making a Type I error is denoted as α (alpha), also known as the significance level. It represents the risk of concluding that there is an effect when, in reality, there is none.
Example: A court finds an innocent person guilty.
Type II Error (False Negative):
Definition: Type II error occurs when the null hypothesis is not rejected when it is actually false.
Probability: The probability of making a Type II error is denoted as β (beta). This represents the risk of failing to detect an effect that actually exists.
Example: A court finds a guilty person innocent.
For a fixed sample size, the two errors move in opposite directions: lowering α (making the test more stringent) increases β (the probability of failing to detect a true effect). Both can be reduced together only by increasing the sample size.
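The trade-off between α and β can be illustrated with a right-tailed z-test of H0: μ = μ0 against a specific alternative μ = μ1 > μ0. The function name `type_ii_error` and all numbers are hypothetical; the sketch assumes a normal population with known σ.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for a right-tailed z-test when the true mean is mu1 (mu1 > mu0)."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha)
    # How far the true mean sits from mu0, in standard-error units.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # Probability that the test statistic stays below the critical value.
    return standard_normal.cdf(critical_z - shift)

# Hypothetical setting: mu0 = 50, true mean 53, sigma = 10, n = 25.
beta_at_5_percent = type_ii_error(0.05, 50, 53, 10, 25)
beta_at_1_percent = type_ii_error(0.01, 50, 53, 10, 25)

print(round(beta_at_5_percent, 3))
print(round(beta_at_1_percent, 3))
print(beta_at_1_percent > beta_at_5_percent)  # True: a smaller alpha gives a larger beta
```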
8. (a) Write down the mathematical form of the normal distribution. What are the properties of normal distribution? 2+5=7
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
(b) A random variable X follows Poisson law such that P(X=k) = P(X= k+1). Find mean and variance. 3
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
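A sketch of the reasoning, writing m for the Poisson parameter (the symbol m is a notational choice made here). Equating the two probabilities and cancelling the common factors gives m directly:

```latex
P(X = k) = \frac{e^{-m} m^{k}}{k!}, \qquad
P(X = k+1) = \frac{e^{-m} m^{k+1}}{(k+1)!}

\frac{e^{-m} m^{k}}{k!} = \frac{e^{-m} m^{k+1}}{(k+1)!}
\;\Longrightarrow\; 1 = \frac{m}{k+1}
\;\Longrightarrow\; m = k + 1
```

Since the mean and the variance of a Poisson distribution are both equal to its parameter, this gives mean = variance = k + 1.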
9. (a) The following table gives the index numbers for different groups of items with their respective weights for the year 2005 (base year 2000).
Calculate the cost of living Index number and interpret the result. 4+1=5
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
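The table for this question is not reproduced here, so the sketch below uses hypothetical group indices and weights to show the weighted-average method, cost of living index = ΣIW / ΣW. The group names, figures, and function name are all assumptions for illustration.

```python
def cost_of_living_index(groups):
    """Weighted average of group index numbers: sum(I * W) / sum(W)."""
    weighted_sum = sum(index * weight for index, weight in groups.values())
    total_weight = sum(weight for _, weight in groups.values())
    return weighted_sum / total_weight

# Hypothetical data: group -> (index number, weight)
groups = {
    "Food": (150, 50),
    "Clothing": (120, 10),
    "Fuel and lighting": (130, 10),
    "House rent": (140, 20),
    "Miscellaneous": (110, 10),
}

index = cost_of_living_index(groups)
print(index)  # 139.0
```

With these hypothetical figures, an index of 139 would mean that the cost of living rose by 39% compared with the base year.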
(b) Mention any five properties of binomial distribution. 5
Ans:- The binomial distribution is a probability distribution that describes the number of successes in a fixed number of independent trials, where each trial has two possible outcomes: success or failure. The distribution is characterized by two parameters:
n: The number of trials.
p: The probability of success on a single trial.
Properties of Binomial Distribution:
Discrete Distribution: The binomial distribution is a discrete probability distribution, meaning it deals with discrete outcomes (e.g., number of successes in a fixed number of trials).
Fixed Number of Trials (n): The distribution is based on a fixed number of trials or experiments, denoted as n.
Two Possible Outcomes: Each trial has exactly two possible outcomes, typically labeled as success (S) and failure (F).
Constant Probability of Success (p): The probability of success, p, remains constant for each trial, and the probability of failure is 1 − p.
Independence of Trials: The trials are independent, meaning the outcome of one trial does not affect the outcomes of other trials.
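These properties lead to the probability function P(X = k) = C(n, k) p^k (1 − p)^(n − k), with mean np and variance np(1 − p). The sketch below checks this numerically for hypothetical values n = 10 and p = 0.3.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # hypothetical parameters
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probabilities)
mean = sum(k * pk for k, pk in enumerate(probabilities))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(probabilities))

print(round(total, 4))     # 1.0, the probabilities sum to one
print(round(mean, 4))      # 3.0, equal to n * p
print(round(variance, 4))  # 2.1, equal to n * p * (1 - p)
```

The variance is the mean multiplied by 1 − p, so for 0 < p < 1 the mean of a binomial distribution is always greater than its variance.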
10. (a) What do you mean by "time series"? Explain various components of time series. 6
Ans:- Time Series and Its Components:
A time series is a sequence of data points or observations collected at regular intervals over time. It is used in various fields like economics, finance, and statistics to analyze trends, patterns, and behaviors over time.
The main components of a time series are:
Trend: The long-term movement or direction in the data, which can either be upward, downward, or constant. It reflects the overall trajectory over time.
Seasonality: Regular, repeating fluctuations or patterns in the data that occur at specific intervals, such as monthly, quarterly, or yearly. These are often related to seasonal effects like weather or holidays.
Cyclic Variations: Long-term oscillations in data, typically over periods greater than one year, that are not regular like seasonality but are influenced by economic cycles or other long-term factors.
Irregular/Random Fluctuations: The random or unpredictable variations that do not follow any recognizable pattern and cannot be explained by trend, seasonality, or cycles. They may be caused by rare events such as natural disasters or political upheaval, and are often described as the "noise" in a time series.
(b) From the following data find the trend values by 5 yearly moving average method:
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
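The data table for this question is not reproduced here, so the sketch below applies the 5-yearly moving average method to a hypothetical series. Each trend value is the mean of five consecutive observations and is placed against the middle year; the first two and last two years get no trend value.

```python
def moving_average(values, window=5):
    """Centred moving average for an odd window; None where it is undefined."""
    if window % 2 == 0:
        raise ValueError("this sketch handles odd windows only")
    half = window // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        trend[i] = sum(values[i - half : i + half + 1]) / window
    return trend

# Hypothetical yearly figures.
years = list(range(2001, 2010))
output = [10, 12, 11, 13, 14, 16, 15, 17, 18]

trend = moving_average(output, window=5)
for year, value in zip(years, trend):
    print(year, value)
# 2003 -> 12.0, 2004 -> 13.2, 2005 -> 13.8, 2006 -> 15.0, 2007 -> 16.0
```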
11. (a) (i) Explain null hypothesis and alternative hypothesis.
Ans:- Null Hypothesis and Alternative Hypothesis
Null Hypothesis (H₀): The null hypothesis is a statement or assumption that there is no effect, no difference, or no relationship between variables in the population. It is the hypothesis that the researcher tries to test and possibly reject. In most cases, the null hypothesis assumes that any observed effect is due to random chance. Example: "There is no difference in the average test scores between male and female students."
Alternative Hypothesis (H₁ or Ha): The alternative hypothesis is the opposite of the null hypothesis. It suggests that there is a significant effect, difference, or relationship between variables in the population. If the null hypothesis is rejected based on the sample data, the alternative hypothesis is accepted. Example: "There is a difference in the average test scores between male and female students."
The null hypothesis is typically tested using statistical methods to determine whether the sample data provides enough evidence to reject it in favor of the alternative hypothesis.
(ii) A random sample of size 5 is drawn without replacement from a finite population consisting of 41 units. If the population standard deviation is 6.25, what is the standard error of sample mean? 4+2=6
Ans:- DOWNLOAD PDF FOR COMPLETE SOLUTION
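A sketch of the calculation: for sampling without replacement from a finite population, the standard error of the sample mean is (σ/√n) × √((N − n)/(N − 1)), where the second factor is the finite population correction. The function name below is an assumption made for illustration.

```python
from math import sqrt

def standard_error_without_replacement(sigma, n, population_size):
    """Standard error of the mean with the finite population correction."""
    fpc = sqrt((population_size - n) / (population_size - 1))
    return sigma / sqrt(n) * fpc

# Values from the question: sigma = 6.25, n = 5, N = 41.
se = standard_error_without_replacement(6.25, 5, 41)
print(round(se, 4))  # 2.6517
```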
(b) Write short notes on any two of the following:
(i) Sampling error
(ii) Non-sampling error
(iii) Level of significance
Ans:- (i) Sampling Error
Sampling error refers to the natural variation or discrepancy that occurs when a sample is taken from a population. It is the difference between the sample statistic (e.g., sample mean) and the actual population parameter (e.g., population mean). Sampling error occurs because only a subset of the population is used to make inferences about the whole population. Even with a random sample, the sample's characteristics are unlikely to perfectly represent those of the population, leading to sampling error.
Example:
If you randomly select 100 students from a school of 1,000 and find that the average height is 5'6", but the true average height of the entire school is 5'8", the difference of 2 inches is the sampling error.
Sampling error can be reduced by increasing the sample size or using more sophisticated sampling techniques.
(ii) Non-sampling Error
Non-sampling errors refer to all errors that are not related to the process of sampling but still affect the accuracy of the data collection or analysis. These errors can arise due to issues such as misreporting, incorrect data entry, biases in the survey design, non-response, or errors in measurement. Non-sampling errors are often more difficult to quantify and control compared to sampling errors.
Examples of non-sampling errors:
Measurement errors: Errors that arise due to inaccurate instruments or improper procedures.
Non-response bias: Occurs when individuals chosen for the sample do not respond, and their absence affects the sample’s representativeness.
Data entry errors: Mistakes made during the transfer of data from one form to another (e.g., incorrect typing of numbers).
Non-sampling errors can lead to systematic biases in survey results and can be minimized by careful design and proper data validation.
(iii) Level of Significance
The level of significance (denoted as α) is the probability threshold used to decide whether to reject the null hypothesis in statistical hypothesis testing. It represents the maximum allowable probability of making a Type I error (false positive), which occurs when the null hypothesis is incorrectly rejected when it is actually true. Commonly used levels of significance are 0.05 (5%), 0.01 (1%), and 0.10 (10%).
Example:
If the level of significance is set at α = 0.05, this means that there is a 5% chance of rejecting the null hypothesis when it is actually true. In other words, you are willing to accept a 5% probability of making a Type I error.
The level of significance is determined before conducting the test and is crucial in interpreting the p-value. If the p-value of a test is less than the level of significance, the null hypothesis is rejected. If the p-value is greater than the level of significance, the null hypothesis is not rejected.