A single categorical variable is most commonly analyzed using a Frequency Distribution.

A Frequency Distribution is a table or graph that displays how often each outcome occurs. The commonly used tabular and graphical methods for frequency distribution analysis are:

- **Frequency Table** is used to show the frequency distribution in tabular form.
- **Bar Plot** is used to show the frequency distribution in a visual format.
- **Pie Chart** is used to show the distribution in proportions. **Proportions help us compare parts of a whole.**

Let’s take the example of the “**MBA Students**” data to learn more about frequency distribution analysis (see the previous blog in this series).

**Data Import**

**Frequency Distribution Analysis of the MBA Specialization field**

We would like to know the **frequency distribution of the MBA Students** by their choice of **Specialization**. This information is captured in the variable “mba_specialization”, which is categorical. More specifically, mba_specialization is a Nominal Variable: its categories have no natural order.

We can summarize “mba_specialization” using a **Frequency Distribution Table, Proportions, a Bar Plot, and a Pie Chart.**

**Tabular Methods: Frequency Distribution Table**
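A frequency distribution table of this kind could be built as in the sketch below. It assumes Python with pandas, and it rebuilds the column from the figures quoted in this post rather than from the original data file. The total of 200 students is an inference (80 students being 40% of the class), and the names `counts` and `freq_table` are illustrative.

```python
import pandas as pd

# Counts reconstructed from the figures quoted in this post.
# Assumption: 200 students in total, so 35% Marketing means 70 students.
counts = {"Finance": 80, "Marketing": 70, "HR": 25, "Business Analytics": 25}

# Expand the counts into one row per student, like the raw column would be.
mba_specialization = pd.Series(
    [spec for spec, n in counts.items() for _ in range(n)],
    name="mba_specialization",
)

# Absolute frequencies and proportions for each category.
freq = mba_specialization.value_counts()
prop = mba_specialization.value_counts(normalize=True)

freq_table = pd.DataFrame({"frequency": freq, "percent": (prop * 100).round(1)})
print(freq_table)
```

`value_counts(normalize=True)` returns proportions that sum to 1, which is why the percent column is simply that result multiplied by 100.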

**Numerical Methods: Mode**

The Mode is the only measure of central tendency that can be used for Nominal Variables. From the above table, the Mode is **Finance**, the category with the highest frequency (80 students). Note that 80 is the frequency of the modal category, not the mode itself.
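As a minimal sketch, the mode and its frequency could be computed with pandas as shown here. The data is again rebuilt from the counts quoted in this post (assuming 200 students in total), and the variable names are illustrative.

```python
import pandas as pd

# Counts reconstructed from the figures quoted in this post (assumed total: 200).
counts = {"Finance": 80, "Marketing": 70, "HR": 25, "Business Analytics": 25}
mba_specialization = pd.Series(
    [spec for spec, n in counts.items() for _ in range(n)],
    name="mba_specialization",
)

# Series.mode() returns all most-frequent values; here there is a single one.
mode_category = mba_specialization.mode().iloc[0]

# The frequency of the mode is the number of rows in that category.
mode_frequency = int((mba_specialization == mode_category).sum())

print(f"Mode: {mode_category} (frequency: {mode_frequency})")
```

`mode()` returns a Series because a variable can have more than one mode, so it is worth checking its length before taking the first element on real data.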

**Graphical Methods: Bar Plot**
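One way to draw such a bar plot is sketched below, assuming matplotlib is available. The counts come from the figures quoted in this post (with Marketing inferred as 70 students out of an assumed 200), and the output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Counts reconstructed from the figures quoted in this post (assumed total: 200).
counts = {"Finance": 80, "Marketing": 70, "HR": 25, "Business Analytics": 25}

fig, ax = plt.subplots(figsize=(6, 4))

# One bar per category; bar height is the number of students.
bars = ax.bar(list(counts.keys()), list(counts.values()))

ax.set_xlabel("MBA specialization")
ax.set_ylabel("Number of students")
ax.set_title("Frequency distribution of MBA specialization")
fig.tight_layout()

# Hypothetical output file name.
fig.savefig("mba_specialization_bar.png")
```

For a nominal variable the order of the bars carries no meaning, so sorting them by frequency usually makes the plot easier to read.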

**Graphical Methods: Pie Chart**
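A pie chart of the same distribution could look like the following sketch, again assuming matplotlib and the counts reconstructed from this post (assumed total of 200 students). The output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Counts reconstructed from the figures quoted in this post (assumed total: 200).
counts = {"Finance": 80, "Marketing": 70, "HR": 25, "Business Analytics": 25}

fig, ax = plt.subplots(figsize=(5, 5))

# autopct prints each slice's share of the whole as a percentage.
wedges, texts, autotexts = ax.pie(
    list(counts.values()),
    labels=list(counts.keys()),
    autopct="%1.1f%%",
    startangle=90,
)
ax.axis("equal")  # keep the pie circular
ax.set_title("MBA specialization (share of students)")

# Hypothetical output file name.
fig.savefig("mba_specialization_pie.png")
```

The pie chart works here because there are only four categories; with many categories a bar plot is generally easier to compare.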

**Interpretation / Take away:**

- **40%** of the students are specializing in Finance.
- **35%** of the students have chosen Marketing as their field of specialization.
- HR and Business Analytics have **25** students each.

**Graduation Degree**

Let’s say we wish to know the graduation background of the students pursuing the MBA course. This information is captured in the variable “grad_degree”, which is categorical.

**Tabular Methods: Frequency Distribution Table**

The analysis of grad_degree shows that there are 45 distinct values. Because there are so many graduation degree categories, we should recategorize them. **Recategorization** is the process of assigning values to a new, usually broader, set of categories. E.g., the **B.E – Mechanical** and **B.E – Computers** categories in our data can be recategorized as **B.E.**

*Note: We could have also recategorized the graduation degree as Science, Commerce, and Arts.*
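The recategorization step could be sketched with pandas as follows. Only the two "B.E – ..." labels and the target column name `grad_deg_recat` come from this post; the other sample values and the `recat_map` dictionary are hypothetical stand-ins, since the full list of 45 raw categories is not reproduced here.

```python
import pandas as pd

# Tiny illustrative sample. Only the two "B.E – ..." labels are taken from
# the post; the remaining values are hypothetical placeholders.
df = pd.DataFrame({
    "grad_degree": [
        "B.E – Mechanical",
        "B.E – Computers",
        "B.E – Mechanical",
        "B.Com",
        "B.Sc.",
    ]
})

# Map detailed categories to broader ones; unmapped values stay unchanged.
recat_map = {
    "B.E – Mechanical": "B.E.",
    "B.E – Computers": "B.E.",
}
df["grad_deg_recat"] = df["grad_degree"].replace(recat_map)

# Frequency table of the recategorized variable.
print(df["grad_deg_recat"].value_counts())
```

An explicit mapping keeps every recategorization decision visible and easy to review, which matters when dozens of raw labels are being merged.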

**Interpretation / Take away:**

- **41%** of the students are from a B.Com or Accounting & Finance background.
- **31.5%** of the students have a B.E. / B.Tech background.
- **B.M.S.** is the third major category with 15.5% of students.
- Another 6% are from B.Sc.

**Practice Exercise**

- Create the Bar Plot and Pie Chart for Graduation using the “**grad_deg_recat**” column.
- Draw inferences for the “**gender**” variable.
- Recategorize the “**ten_plus_2_stream**” variable and analyze.

**Next Blog**

In the next blog, we will learn to analyze a single continuous variable using Histogram and Density Plots.
