[Figure: A fish labeled 'Accurate Data' swimming in a sea labeled 'Context'.]

Levels of Aggregation and Disaggregation

Tip: Use crosstabs to break down the demographic characteristics of participants and help determine at which levels disaggregation is feasible. If almost all of the participants are from one demographic subcategory (for example, the middle class), then it is neither necessary nor, perhaps, appropriate to disaggregate data for analysis by socioeconomic status. However, the results could not then be generalized to other socioeconomic groups.
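As a minimal sketch of the crosstab idea, the following Python counts participants in each gender-by-socioeconomic-status cell and flags a variable dominated by a single category. The records, the category labels, and the 0.7 cutoff are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical participant records: (gender, socioeconomic status).
participants = [
    ("woman", "middle"), ("man", "middle"), ("woman", "middle"),
    ("man", "middle"), ("woman", "low"), ("man", "middle"),
    ("woman", "middle"), ("man", "high"),
]

def crosstab(records):
    """Count participants in each (gender, status) cell."""
    return Counter(records)

def category_shares(records, position):
    """Share of participants falling in each category of one variable."""
    counts = Counter(record[position] for record in records)
    total = len(records)
    return {category: n / total for category, n in counts.items()}

cells = crosstab(participants)
ses_shares = category_shares(participants, 1)

# Print one line per non-empty cell of the crosstab.
for (gender, ses), n in sorted(cells.items()):
    print(f"{gender:6s} {ses:7s} {n}")

# Flag categories that hold most of the sample. The 0.7 cutoff is an
# assumed, illustrative threshold, not a rule from the text.
dominant = {ses: share for ses, share in ses_shares.items() if share >= 0.7}
print(dominant)
```

When one category dominates, as "middle" does in this made-up sample, disaggregating by that variable leaves the remaining cells too small to analyze.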

Rationale: Cell size is an important factor in determining areas of disaggregation. If you are disaggregating by gender and by race/ethnicity, then the number of White women would be one cell, the number of Black men would be another cell, and so on. If some cell sizes are very small at a specific level of disaggregation, then statistical analysis at that level may not be appropriate. For example, a nonparametric statistical test such as a chi-square test requires a minimum expected count of 5 in each cell in order to be valid. For parametric statistical tests such as t-tests, a cell size of 30 is considered the minimum. In addition, in parametric statistics, the smaller the sample or cell size, the more difficult it is to obtain statistical significance.1 The probability that a significant result will be obtained if a real difference exists (which is called the power of the test) depends largely on the total sample size. Based on the size of the sample, you can compute the power of the test. Alternatively, if you know what power you want the test to have, you can compute the sample size needed to achieve that power.2
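The two checks described above can be sketched in Python. The first function computes expected cell counts under independence (row total times column total, divided by the grand total) and lists cells that fall below 5. The other two use a normal approximation for a two-sided, two-sample comparison of means, which is one common simplification and not necessarily the method in the cited sources. The table, the effect size of 0.5, and the defaults of alpha = 0.05 and power = 0.80 are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_cells(table, minimum=5):
    """Positions (row, column) whose expected count falls below `minimum`."""
    expected = expected_counts(table)
    return [(i, j) for i, row in enumerate(expected)
            for j, value in enumerate(row) if value < minimum]

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of two
    means (normal approximation; effect_size is a standardized difference)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def approx_power(effect_size, n, alpha=0.05):
    """Approximate power with n participants per group (same approximation,
    ignoring the negligible contribution of the opposite tail)."""
    z = NormalDist()
    return z.cdf(effect_size * sqrt(n / 2) - z.inv_cdf(1 - alpha / 2))

# Hypothetical 2 x 3 table: rows are gender, columns are race/ethnicity.
table = [[40, 6, 4],
         [45, 3, 2]]

print(expected_counts(table))
print(small_cells(table))          # cells too sparse for a chi-square test
print(n_per_group(0.5))            # per-group n for a medium effect
print(round(approx_power(0.5, 30), 2))
```

In this made-up table, four of the six cells have expected counts below 5, so a chi-square test at this level of disaggregation would not be appropriate without combining categories.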

Tip: If there are known or expected differences by subgroup that could skew the overall findings, then disaggregate by those subgroups.
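A small worked example, using invented counts, shows how subgroup differences can distort an aggregated finding. Here a hypothetical program raises retention in one subgroup and lowers it in another, so the pooled comparison shows no difference at all. The subgroup names, study arms, and numbers are assumptions made for the sketch.

```python
# Hypothetical counts: (retained, total) by subgroup and by study arm.
data = {
    "subgroup A": {"participants": (90, 100), "comparison": (70, 100)},
    "subgroup B": {"participants": (50, 100), "comparison": (70, 100)},
}

def rate(retained, total):
    """Retention rate as a proportion."""
    return retained / total

def difference(arms):
    """Participant retention rate minus comparison retention rate."""
    return rate(*arms["participants"]) - rate(*arms["comparison"])

def aggregate(data):
    """Pool counts across subgroups within each study arm."""
    pooled = {}
    for arms in data.values():
        for arm, (retained, total) in arms.items():
            r, t = pooled.get(arm, (0, 0))
            pooled[arm] = (r + retained, t + total)
    return pooled

overall = difference(aggregate(data))
by_subgroup = {name: difference(arms) for name, arms in data.items()}

print(f"overall difference: {overall:+.2f}")
for name, diff in by_subgroup.items():
    print(f"{name}: {diff:+.2f}")
```

The pooled result would suggest the program has no effect, while the disaggregated results show opposite effects of equal size in the two subgroups.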

Tip: Be aware that there can be heterogeneity within subgroups. For example, while people who are visually impaired, hearing impaired, and learning disabled are all classified as having disabilities, the differences among them are very large and it might be appropriate to disaggregate by different categories of disability.

Tip: Do preliminary analysis of subgroup differences in areas of importance to the study. This can help inform disaggregation and aggregation decisions.

Rationale: There may be some cases where disaggregation is not needed. As Jolly explained, “when you mix ammonia and Clorox in a room, everyone gets sick. At that level it is not necessary to disaggregate.”3 There are other cases where project/program impact may vary for different subgroups and disaggregation is needed. For example, since women students tend to exhibit lower skills in some spatial areas, such as 3-D rotation, that are important to success in engineering,4 projects/programs tying improved spatial skills to increased retention in engineering should disaggregate their data by sex to determine whether there are statistically significant interactions between sex and the impact of participating in a spatial skills training program. Other possible areas that need to be considered in decisions about disaggregation include:

Tip: Provide a rationale for the decisions made regarding which demographic categories are aggregated and which are disaggregated.

Rationale: For statistical reasons, and often for confidentiality, some aggregation of data needs to be done even though information is lost with each aggregation. For example, when aggregation is done across disability groups, the ability to determine whether a project/program has different impacts on people with learning disabilities and people with mobility impairments is lost. As another example, aggregating Native Americans with other groups because of their small numbers means that little is known about Native Americans and STEM.
While evaluators must assume responsibility for capturing and correctly interpreting within-group variability for the groups under study,9 types of disaggregation must be both meaningful and viable. If, for example, there is an interest in trend data, aggregation across years is not appropriate. If there is reason to think there might be different trends for different subgroups, it is important to disaggregate by those subgroups. Since there are gender differences in some spatial skills, if you are interested in the impact of a project/program to improve spatial skills, then it is important to disaggregate by gender. Depending on the questions to be answered, it might be more appropriate to aggregate across subdisciplines, across institutions, across years, or across some racial/ethnic or disability categories.
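One way to make aggregation decisions explicit is to collapse categories below a stated minimum and keep a record of what was merged, so the rationale can be reported. The sketch below does this in Python; the category counts, the minimum of 10, and the label for the combined group are hypothetical choices, not recommendations from the text.

```python
def collapse_small(counts, minimum, label="aggregated categories"):
    """Merge categories with fewer than `minimum` members into one group.

    Returns the new counts and the sorted names of the merged categories,
    so the aggregation decision can be documented.
    """
    kept = {name: n for name, n in counts.items() if n >= minimum}
    merged = sorted(name for name, n in counts.items() if n < minimum)
    if merged:
        kept[label] = sum(counts[name] for name in merged)
    return kept, merged

# Hypothetical participant counts by disability category.
counts = {
    "learning disability": 42,
    "hearing impairment": 35,
    "mobility impairment": 7,
    "visual impairment": 4,
}

collapsed, merged = collapse_small(counts, minimum=10)
print(collapsed)
print("merged:", merged)
```

The merged list makes the information loss visible: after collapsing, any difference in impact between the mobility and visual impairment groups can no longer be examined.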

1 Mehta, C. R., & Patel, N. R. (2011). IBM SPSS Exact tests.
2 Lachin, J. M. (1981). Introduction to sample size determination and power analysis for clinical trials. Controlled Clinical Trials, 2, 93-113.
3 Jolly, E. J. (9/07/12). Personal communication.
4 Metz, S. S., Donohue, S. & Moore, C. (2012). Spatial skills: A focus on gender and engineering. In B. Bogue & E. Cady (Eds.), Apply Research to Practice (ARP) Resources.
5 Centers for Disease Control and Prevention. (2011). Percentage of children aged 5-17 years ever receiving a diagnosis of learning disability, by race/ethnicity and family income group - National Health Interview Survey, United States, 2007-2009.
6 Science and Engineering Indicators 2012. (2012). Arlington, VA: National Science Foundation.
National Academy of Sciences (NAS). (2011). Expanding Underrepresented Minority Participation: America's Science and Technology Talent at the Crossroads. Washington, DC: National Academies Press.
7 National Student Clearinghouse Research Center. (Spring 2012). Snapshot report: Mobility.
National Student Clearinghouse Research Center. (Spring 2012). Snapshot report: Degree attainment.
8 Bell, N. (2010, March). Research report on data sources: Time-to-degree for doctorate recipients. Communicator, 1-3. Washington, DC: Council of Graduate Schools.
Huang, G., Taddese, N., & Walter, E. (2000). Entry and persistence of women and minorities in college science and engineering education (No. NCES 2000601). Washington, DC: National Center for Education Statistics.
9 Nelson-Barber, S., LaFrance, J., Trumbull, E., & Aburto, S. (2005). Culturally-responsive program evaluation. In S. Hood, R. Hopson, & H. Frierson (Eds.), The role of culture and cultural context: A mandate for inclusion, the discovery of truth and understanding in evaluative theory and practice, (pp. 61-85). Greenwich, CT: Information Age Publishing.