(e.g. New York City has higher confidence levels than New York State)?
- When calculating confidence, we primarily consider sample size and the distribution of the data. In some cases, data for New York City has a smaller sample size but a less variable distribution than data for New York State. In those cases, the tighter distribution for New York City increases its confidence relative to New York State.
- For instance, if a compensation data point is labeled as being in New York, it could be in New York City (where cost of labor is generally higher) or elsewhere in the state (where cost of labor is generally lower). Compensation for the state of New York as a whole therefore has a broader, less normal distribution. Despite its larger sample, the state may be labeled as lower confidence because multiple regions with different pay levels contribute to the distribution.
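
The effect described above can be sketched with a small Python example. This is only an illustration, not the actual confidence calculation: the salary figures are made up, and `relative_standard_error` is a hypothetical stand-in for a score that depends on both sample size and spread.

```python
import math
import statistics


def relative_standard_error(samples):
    """Standard error of the mean divided by the mean (lower means tighter).

    Hypothetical stand-in for a confidence input; not the real formula.
    """
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return std_err / mean


# Made-up salaries in thousands of dollars, for illustration only.
city = [150, 152, 148, 150]
rest_of_state = [90, 92, 88, 90]
state = city + rest_of_state  # larger sample, but two pay levels mixed together

city_rse = relative_standard_error(city)
state_rse = relative_standard_error(state)

print(f"city:  n={len(city)}, relative SE={city_rse:.4f}")
print(f"state: n={len(state)}, relative SE={state_rse:.4f}")
```

In this sketch the state sample is twice as large as the city sample, yet its relative standard error is larger because it mixes two distinct pay levels. A score built on a measure like this would rate the city data as higher confidence.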