Help with correlation formula in Numbers for data analysis
Hi,
I’m working on a project in Apple Numbers that involves analyzing weather forecast data and comparing it with observed values. I use different weather prediction models to calculate minimum and maximum temperatures, and I would like to calculate the correlation coefficient between the predicted and observed values. However, I am encountering issues with a formula that uses the INDIRECT function to dynamically reference cells from different sheets.
Here’s an overview of my project:
- Data: I have weather prediction models (e.g., "SwissHD 4x4") and corresponding observed data (minimum and maximum temperatures) on separate sheets in my Numbers file.
- Objective: I want to calculate the correlation coefficient between the predicted and observed values for various models. This needs to be done for a fixed cell range (from cell 3 to 258, i.e., from April 20th to December 31st), without the cell ranges changing based on the selected model.
- Formula: I am trying to use the INDIRECT function to reference forecast and observed data dynamically based on the model name found in a specific cell (B1).
The formula I’ve tried is this:
CORREL(
INDIRECT("'" & 'Previsioni e dati reali'::Tabella 1::B$1 & "'!B3:B258"),
INDIRECT("'" & 'Previsioni e dati reali'::Tabella 1::B$1 & "'!R3:R258")
)
Problem:
This formula is giving me a syntax error. It seems like the references are not being evaluated correctly and the INDIRECT function is not returning the expected ranges. The B$1 cell (which contains the model name) works for other purposes, but not in this case for building the dynamic range. I have tried several variations, but nothing seems to work.
What I’ve tried:
- I’ve used INDIRECT to dynamically reference the cells based on the model name in B$1.
- I’ve ensured that the sheet names are correct and properly enclosed in quotes (e.g., 'Previsioni e dati reali').
- I’ve checked that the cell ranges (e.g., B3:B258 and R3:R258) are correctly specified.
Previous formula:
I initially used a simpler formula to calculate the correlation between the forecast and actual data:
CORREL('Previsioni e dati reali'::Tabella 1::B3:B258, 'Previsioni e dati reali'::Tabella 1::R3:R258)
Unfortunately, this formula did not return any value because 99% of the cells in the selected range were empty or contained errors due to formulas, so the correlation couldn’t be properly calculated. The formula I am seeking should ignore empty cells or cells containing errors, and focus only on valid values.
It would be very helpful to receive advice on how to correctly format this INDIRECT function to calculate the correlation for a fixed cell range, independent of the selected model, and to ignore empty or error-containing cells.
Thank you so much in advance for your help!
(I don't know if the formulas I wrote to you were translated well from Italian, but I assure you that in my language I wrote the name of the formulas well, in case you were thinking)
MacBook Air 13″, macOS 14.6