Statistics for categorical data - a list of ideas
Below is a quick list of things that could be added to or changed in the statistics for categorical data:
STATISTICS TABLE
- Add the count of missing values
- Frequencies: The number of observations for a particular category
- Proportions: The share of the whole that each category accounts for, as a percentage
- Confidence intervals for frequencies and proportions
- Cumulative frequency
- Cumulative proportion
- The total by column
- Option: Sort by frequency
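The table columns listed above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the application's implementation: the function name `frequency_table`, the dictionary keys, and the convention that `None` marks a missing value are all assumptions made for this example.

```python
from collections import Counter


def frequency_table(values, sort_by_frequency=True):
    """Build frequency-table rows for one categorical column.

    Assumption of this sketch: a missing value is represented by None.
    """
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    counts = Counter(present)
    total = len(present)

    if sort_by_frequency:
        # Descending frequency; ties are broken by label for a stable order.
        items = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    else:
        # Counter keeps the order in which categories first appeared.
        items = list(counts.items())

    rows = []
    running = 0
    for category, freq in items:
        running += freq
        rows.append({
            "category": category,
            "frequency": freq,
            "proportion": freq / total,
            "cumulative_frequency": running,
            "cumulative_proportion": running / total,
        })
    return {"rows": rows, "total": total, "missing": missing}


table = frequency_table(["a", "b", "a", None, "c", "a", "b"])
for row in table["rows"]:
    print(row)
print("total:", table["total"], "missing:", table["missing"])
```

Missing values are excluded from the total here, so proportions are relative to the non-missing observations. Whether the real table should do the same is an open design choice.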
STATISTICAL PLOTS
- Add bar plots for proportions
- Option: Add confidence intervals to bars (error bars)
- Add a tick at (0,0) on the x-axis, for consistency with the tick that is already drawn at the other end of the x-axis
- Show cumulative percentage at each data point on the cumulative percentage line
- A stacked bar chart with segments sorted by proportion and the category labels shown as vertical text alongside
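As a rough sketch of the numbers behind two of the plot ideas above (error bars on proportion bars and labels on the cumulative percentage line), the following Python computes a confidence interval per bar and the label text per data point. The Wilson score interval is one common choice for a proportion and is used here as an assumption; the list does not say which interval is intended. The function names are hypothetical.

```python
import math


def wilson_interval(count, total, z=1.96):
    """Wilson score interval for the proportion count/total.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if total == 0:
        return (0.0, 0.0)
    p = count / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return (max(0.0, center - half), min(1.0, center + half))


def cumulative_percent_labels(frequencies):
    """Label text for each point of a cumulative percentage line."""
    total = sum(frequencies)
    if total == 0:
        return []
    labels = []
    running = 0
    for f in frequencies:
        running += f
        labels.append(f"{100 * running / total:.1f}%")
    return labels


frequencies = [3, 2, 1]  # bar heights, already sorted by size
n = sum(frequencies)
for f in frequencies:
    low, high = wilson_interval(f, n)
    print(f"proportion {f / n:.2f}, error bar from {low:.2f} to {high:.2f}")
print(cumulative_percent_labels(frequencies))
```

The Wilson interval is not symmetric around the observed proportion unless the proportion is 0.5, so the error bars would need separate lower and upper lengths.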
THREE TYPES OF CATEGORICAL DATA
- The sort order of bars for ordinal and interval data could in principle differ from that for nominal data (currently all bars are sorted by size), but it is not clear how the user could define the order of the data other than by choosing one of those three types in the properties explorer
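One possible reading of this idea, sketched in Python: nominal data keeps the current sort by size, while ordinal and interval data follow an explicit level order if one is available. The `level_order` argument is purely hypothetical; as noted above, it is not clear where the user would define such an order, and the type names are assumed to be the strings "nominal", "ordinal" and "interval".

```python
def order_bars(counts, data_type, level_order=None):
    """Return category labels in the order the bars would be drawn.

    counts      -- dict mapping category label to frequency
    data_type   -- "nominal", "ordinal" or "interval" (assumed names)
    level_order -- hypothetical user-defined order for ordinal/interval data
    """
    if data_type == "nominal" or level_order is None:
        # Current behaviour: largest bar first, ties broken by label.
        return sorted(counts, key=lambda c: (-counts[c], str(c)))
    # Follow the defined order; categories not listed go to the end.
    known = [c for c in level_order if c in counts]
    extra = [c for c in counts if c not in level_order]
    return known + extra


counts = {"low": 2, "high": 5, "medium": 3}
print(order_bars(counts, "nominal"))
print(order_bars(counts, "ordinal", ["low", "medium", "high"]))
```

Without a defined order the sketch falls back to sorting by size, which matches the current behaviour for all three types.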
An example of an information-dense table:
An example of a stacked bar chart:
Examples of inappropriate and appropriate uses of lines in a graph (Stephen Few)
Links:
- https://openintro-ims.netlify.app/explore-categorical.html
- https://planspace.org/20210907-try_a_table_instead_of_a_pareto_chart/
- https://www.qimacros.com/pareto-chart-excel/pareto-charts-common-problems/
- https://www.perceptualedge.com/articles/dmreview/quant_vs_cat_data.pdf
- http://peltiertech.com/excel-category-axis-types/
- https://en.wikipedia.org/wiki/List_of_analyses_of_categorical_data
Edited by Alexander Semke