The data munging process in Python: An overview-0.1


In my previous article, we learned the first parts of the data munging process: what a DataFrame is and how to create one. This article continues from there and gives a brief overview of the next part of the munging process.

Step 1 (continued): Inspect data

  • Checking DataFrame attributes:

Let's look at some ways to access the data in a DataFrame. We already did a little of this in an earlier lesson, in the broad overview of pandas data structures, so let's go over it quickly. DataFrame attributes are useful when we want to fetch information about a particular DataFrame. The important attributes are index, columns, dtypes, shape, size and T (transpose), and alongside them we will use the methods info(), count() and astype(). Let's go through each of them in turn.




index: This returns the index (the row labels) of the DataFrame.
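As a minimal sketch, assuming a small made-up DataFrame (the names, ages and row labels are hypothetical and not from the original example), reading the index looks like this:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]},
    index=["a", "b", "c"],  # custom row labels
)

row_labels = df.index      # an Index object holding the row labels
print(list(row_labels))    # ['a', 'b', 'c']
```

If no index is passed when the DataFrame is created, pandas uses a default integer index starting at 0.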


columns: This gives the column labels of the DataFrame.
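A short sketch of reading the column labels, again on a hypothetical DataFrame invented for illustration:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]})

column_labels = df.columns     # an Index object holding the column labels
print(list(column_labels))     # ['name', 'age']
```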


shape: This gives a tuple of the form (number of rows, number of columns) that represents the dimensionality of the DataFrame.
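For instance, on a hypothetical DataFrame with three rows and two columns (sample values are assumptions), the shape can be read and unpacked like this:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]})

n_rows, n_cols = df.shape   # shape is a (rows, columns) tuple
print(df.shape)             # (3, 2)
```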


dtypes: This returns the data types in the DataFrame, as a Series object with the column names as the index labels and the corresponding data types as the values.
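A small sketch of inspecting the data types, assuming a made-up DataFrame with one integer and one float column (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"age": [28, 35, 41], "score": [7.5, 8.0, 9.25]})

column_types = df.dtypes        # Series: column name -> dtype
print(column_types)
print(column_types["age"])      # dtype of a single column
```

Columns of text are reported with a string or object dtype, depending on the pandas version in use.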

info(): This method prints a concise summary of the DataFrame, including the index dtype, the columns with their dtypes and non-null counts, and the memory usage.
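As a rough illustration on hypothetical data (one missing age is included on purpose so that the non-null counts differ between columns):

```python
import pandas as pd

# Hypothetical sample data; None marks a missing value
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, None, 41]})

# info() prints the summary directly and returns None,
# so there is no need to wrap it in print()
df.info()
```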


size: For a Series, this returns the number of rows. For a DataFrame, it returns the number of rows multiplied by the number of columns.
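The difference between the two cases can be sketched as follows, using a hypothetical 3 x 2 DataFrame:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]})

total_cells = df.size          # 3 rows * 2 columns
series_len = df["age"].size    # a single column is a Series with 3 rows

print(total_cells)  # 6
print(series_len)   # 3
```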


count(): This counts the non-NA cells for each column (the default) or for each row.
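A brief sketch, assuming a made-up DataFrame in which one age is missing:

```python
import pandas as pd

# Hypothetical sample data; None marks a missing value
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, None, 41]})

per_column = df.count()        # non-NA cells in each column
per_row = df.count(axis=1)     # non-NA cells in each row

print(per_column)   # name: 3, age: 2
print(per_row)      # rows 0, 1, 2: 2, 1, 2
```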


astype(): This converts the data in the DataFrame to the data types we specify. It returns a new DataFrame rather than modifying the original one.
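As a sketch, assuming a hypothetical column of ages that was read in as text and needs to become integers:

```python
import pandas as pd

# Hypothetical sample data: ages stored as strings
df = pd.DataFrame({"age": ["28", "35", "41"], "score": [7.5, 8.0, 9.25]})

# Pass a {column: dtype} mapping to convert only selected columns;
# the result is a new DataFrame and df itself is left unchanged
converted = df.astype({"age": "int64"})

print(converted.dtypes)
print(converted["age"].sum())   # 104, now a numeric sum
```

Passing a single dtype, as in df.astype("float64"), would attempt to convert every column instead.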


T (transpose): This attribute is used to transpose the DataFrame, that is, to swap its rows and columns.
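A minimal sketch of transposing a hypothetical 3 x 2 DataFrame:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]})

transposed = df.T           # columns become rows and rows become columns

print(df.shape)             # (3, 2)
print(transposed.shape)     # (2, 3)
print(transposed)
```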

You can read about more attributes in the pandas DataFrame documentation.

  • Checking for values and missing values:

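To check whether an item exists, use the 'in' keyword. Keep in mind that on a DataFrame, 'in' tests the column labels, and on a Series it tests the index labels, not the stored values.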
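The sketch below shows what 'in' checks, on a hypothetical DataFrame. Using isin() to test the stored values is one common option and an addition here, not something the original text prescribes:

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, 35, 41]})

has_age_column = "age" in df            # True: 'in' checks column labels
label_check = "Ben" in df["name"]       # False: on a Series, 'in' checks index labels
index_check = 1 in df["name"]           # True: 1 is one of the index labels

# To test the stored values instead, isin() can be used
value_check = bool(df["name"].isin(["Ben"]).any())   # True

print(has_age_column, label_check, index_check, value_check)
```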

To detect missing data: missing data in pandas appears as NaN (Not a Number), and to detect it we use the isnull() and notnull() functions.
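A short sketch of detecting missing values, assuming a made-up DataFrame with one missing age:

```python
import pandas as pd

# Hypothetical sample data; None becomes NaN in the numeric column
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "age": [28, None, 41]})

missing_mask = df.isnull()      # True where a cell is missing
present_mask = df.notnull()     # True where a cell holds a value

missing_per_column = df.isnull().sum()          # count of missing cells per column
has_missing = bool(df.isnull().values.any())    # is anything missing at all?

print(missing_mask)
print(missing_per_column)
print(has_missing)   # True
```

Summing the boolean mask works because True counts as 1 and False as 0.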

All right, so in the next lesson, we'll talk about pandas descriptive statistics on numerical and categorical data, and then move on to step two: cleaning and manipulating the data.