Exploring Two-Variable Data
How is the direction of a scatterplot determined?
By identifying outliers
By calculating the correlation coefficient
By analyzing the trend from left to right
By comparing the clusters
If the residuals from a linear regression display a clear pattern when plotted against the predicted values, what can be concluded about model fit?
A visible pattern among residuals indicates an exceptionally high correlation between variables modeled linearly.
The absence of randomness in residuals implies that this model fits perfectly for prediction purposes.
Patterns within residuals mean that individual predictions will all be equally accurate or inaccurate across datasets modeled.
The presence of patterns suggests that there is non-linearity not captured by the linear model applied to these data.
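The idea behind this question can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed method: the data are made up (an exact quadratic relationship), and NumPy's polyfit is used for the straight-line fit.

```python
import numpy as np

# Hypothetical data: y follows a curved (quadratic) relationship in x.
x = np.linspace(0, 10, 21)
y = 2.0 + 0.5 * x**2

# Fit a straight line by least squares.
slope, intercept = np.polyfit(x, y, 1)
predicted = slope * x + intercept
residuals = y - predicted

# With a curved relationship, the residuals are positive at both ends and
# negative in the middle: a U-shaped pattern rather than random scatter.
ends_positive = bool(residuals[0] > 0 and residuals[-1] > 0)
middle_negative = bool(residuals[len(x) // 2] < 0)
print(ends_positive, middle_negative)
```

Plotting residuals against predicted would show the same U shape that the two checks detect numerically.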
What value indicates no linear association between two quantitative variables in a correlation analysis?
A researcher is testing if the time spent on homework (X) predicts scores on a math test (Y) but suspects a quadratic relationship; which analysis would most appropriately test this prediction?
Performing a Pearson correlation coefficient analysis between X and Y.
Conducting a regression with both X and X² as predictor variables.
Applying a t-test comparing high scorers' and low scorers' average time on homework.
Utilizing a one-way ANOVA with different time categories of homework.
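A regression that includes a squared term could look roughly like the sketch below. The hours and scores are invented for illustration, and the fit uses NumPy's least-squares solver. It only estimates the coefficients; a formal test of the quadratic term would also need its standard error.

```python
import numpy as np

# Hypothetical homework hours (X) and test scores (Y); the values are made up.
hours = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
scores = np.array([52.0, 63.0, 72.0, 78.0, 81.0, 80.0, 76.0, 69.0])

# Design matrix with an intercept column, X, and X squared as predictors.
design = np.column_stack([np.ones_like(hours), hours, hours**2])
coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
intercept, b_linear, b_quadratic = coef

# A negative coefficient on X squared means scores rise and then fall.
print(b_quadratic < 0)
```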
What is an influential point in a scatterplot?
A data point that is significantly different from the rest of the data
A data point that has a significant impact on the regression line or fitted model
A data point that breaks the overall trend
A data point with high leverage
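One rough way to see what "influential" means is to fit the line with and without a suspect point and compare slopes. The data below are hypothetical, with the last point placed far to the right and well below the trend of the others.

```python
import numpy as np

# Hypothetical data: five points near y = 2x, plus one point far out in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.0])

# Least-squares slope using all points, then with the last point removed.
slope_all, _ = np.polyfit(x, y, 1)
slope_without, _ = np.polyfit(x[:-1], y[:-1], 1)

# A large change in slope suggests the removed point is influential.
print(round(slope_all, 2), round(slope_without, 2))
```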
If the variability in a data set decreases while maintaining a constant sample size and mean, how will this affect the width of a 95% confidence interval for estimating the population mean?
The width of the confidence interval will decrease.
The change in variability does not affect confidence intervals.
The width of the confidence interval will remain unchanged.
The width of the confidence interval will increase.
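The link between variability and interval width can be checked numerically. The sketch below assumes a z-based interval for simplicity (a t-based interval responds to the standard deviation in the same way); ci_width is a hypothetical helper written for this example, not a library function.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(std_dev, n, confidence=0.95):
    """Width of a z-based confidence interval for a mean: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * std_dev / sqrt(n)

# Same sample size, two different standard deviations.
wide = ci_width(std_dev=10.0, n=25)
narrow = ci_width(std_dev=5.0, n=25)
print(narrow < wide)
```

Because the standard deviation enters the formula as a multiplier, halving it halves the width when the sample size and confidence level stay fixed.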
What could be a reason why a student finds a very high p-value during their analysis?
Misinterpreting the code used for the calculation, giving incorrect results.
Not collecting data properly, leading to skewed distributions and higher results.
Lacking strong evidence to support the alternative hypothesis formulated before conducting the study.
Confusing terms like standard deviation and variance, which caused errors in computation.

What inference can be made if simple random sampling yields nearly identical means but vastly different standard deviations for two samples of student heights from different schools?
Sampling error is likely the cause of the discrepancies observed in standard deviation, rather than true population variability.
Standard deviations are unaffected by sampling methods, thus reflecting the exact variability within the populations being compared.
Means are unreliable indicators of variation, so the difference in standard deviations is irrelevant.
While both schools might have similar average height, students vary more in their heights at one school than the other.
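A small numeric illustration of equal means with unequal spread, using invented height samples (in centimeters) and Python's standard statistics module:

```python
from statistics import mean, stdev

# Hypothetical height samples (cm) from two schools; the values are made up.
school_a = [160, 162, 164, 166, 168]
school_b = [140, 152, 164, 176, 188]

mean_a, mean_b = mean(school_a), mean(school_b)
sd_a, sd_b = stdev(school_a), stdev(school_b)

# Both samples center on the same value, but school B is far more spread out.
print(mean_a == mean_b, sd_a < sd_b)
```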
If a linear regression model has a high R-squared value for one dataset, which of the following is most likely to be true when applying the same model to a new dataset with a different variance in the explanatory variable?
Variance in the explanatory variable does not affect the model's predictive accuracy on new data.
The new dataset will automatically conform to the original model's predictions.
The model may not fit well if the relationship between variables differs.
The predictive accuracy will remain consistently high due to the high R-squared value.
How is the form of a scatterplot described?
The general shape of the plotted points
The presence of outliers
The overall strength of the data
The direction of the scatter