CT week 5

Data Types and Sanity of Data


Sanity of Data

Sanity checking is the process of confirming that data is accurate, consistent, and reliable before it is used. Common checks include:

  • Range Checks: Ensuring numerical values fall within a specified range. For example, ages should be between 0 and 120.
  • Format Checks: Verifying that data follows a specific format, such as dates being in the format YYYY-MM-DD.
  • Consistency Checks: Ensuring related data is consistent. For example, the end date of an event should not be before the start date.
  • Uniqueness Checks: Ensuring that unique fields, like email addresses, are not duplicated in the dataset.
  • Completeness Checks: Ensuring that all required fields are filled. For instance, a form submission should not have any mandatory fields left blank.
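As a rough illustration, the five checks above can be combined into one validation function. This is only a sketch: the record layout (name, email, age, start, end), the function names check_record and parse_date, and the sample values are assumptions made for this example, not part of the course datasets.

```python
import re
from datetime import date

# Pattern for the YYYY-MM-DD format mentioned in the format check.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Hypothetical list of mandatory fields for this example.
REQUIRED_FIELDS = ["name", "email", "age", "start", "end"]


def parse_date(text):
    """Return a date if text is a valid YYYY-MM-DD string, otherwise None."""
    if not isinstance(text, str) or not DATE_PATTERN.match(text):
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:  # right shape, impossible date such as 2024-13-45
        return None


def check_record(record, seen_emails):
    """Return a list of problems found in one record (empty list = sane)."""
    # Completeness check: every mandatory field must be present and non-empty.
    problems = [f"missing {field}" for field in REQUIRED_FIELDS
                if record.get(field) in (None, "")]
    if problems:
        return problems  # the remaining checks need all fields

    # Range check: age is assumed to be an integer between 0 and 120.
    if not 0 <= record["age"] <= 120:
        problems.append("age out of range")

    # Format check: both dates must be valid YYYY-MM-DD strings.
    start = parse_date(record["start"])
    end = parse_date(record["end"])
    if start is None:
        problems.append("start is not YYYY-MM-DD")
    if end is None:
        problems.append("end is not YYYY-MM-DD")

    # Consistency check: the end date must not be before the start date.
    if start is not None and end is not None and end < start:
        problems.append("end before start")

    # Uniqueness check: the email must not have been seen already.
    if record["email"] in seen_emails:
        problems.append("duplicate email")
    seen_emails.add(record["email"])

    return problems
```

Calling check_record on each row with a shared seen_emails set lets the uniqueness check work across the whole dataset, while the other checks look at one row at a time.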

Classroom Dataset

Data collected from a classroom setting, including student names, grades, attendance, etc.

Shopping Bill Dataset

Data from shopping bills, including items purchased, quantities, prices, and total amounts.

Words Dataset

A collection of words, often used for natural language processing tasks.

Trains Dataset

Data related to trains, including schedules, routes, and passenger information.

Expressions Dataset

A dataset containing various expressions, which could be mathematical, logical, or linguistic.

Rectangles Dataset

Data involving rectangles, including their dimensions, areas, and other properties.

Basic Datatypes


A Boolean data type has exactly two possible values: true or false. It is often used in conditional statements to control the flow of a program. For example, in a login system, a Boolean can indicate whether the user credentials are valid (true) or not (false).
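The login example can be sketched in Python as follows. The stored username and password and the function name credentials_valid are made up for illustration; a real system would not compare plain-text passwords like this.

```python
# Hypothetical stored credentials, for illustration only.
STORED_USER = "asha"
STORED_PASSWORD = "s3cret"


def credentials_valid(username, password):
    """Return True if both values match the stored ones, otherwise False."""
    return username == STORED_USER and password == STORED_PASSWORD


# The Boolean result decides which branch of the program runs.
is_valid = credentials_valid("asha", "s3cret")
if is_valid:
    message = "Welcome!"
else:
    message = "Login failed."
print(message)
```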


A Character data type represents a single character, such as a letter, digit, or symbol. For example, 'A', '1', and '$' are characters. In programming, characters are often used to build strings or to represent individual elements in text processing.


An Integer data type represents whole numbers without any fractional part. Examples include -3, 0, and 42. Integers are used in various calculations, such as counting iterations in loops or performing arithmetic operations.
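A small sketch of integers used for counting and arithmetic is shown below. The marks are invented values, and the variable names are chosen only for this example.

```python
marks = [72, 85, 0, 91, 64]  # made-up integer marks

count = 0  # integer counter: how many marks have been seen
total = 0  # integer accumulator: sum of the marks so far
for mark in marks:
    count += 1
    total += mark

whole_average = total // count  # floor division keeps the result an integer
remainder = total % count       # what is left over after that division
```

Here the ordinary division operator (/) would give a fractional result, so floor division (//) is used to stay within integers.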

Compound Datatypes


A String is a sequence of characters used to represent text. For example, "Hello, World!" is a string. Strings are used in many applications, such as storing user input, displaying messages, and manipulating text.


A List is an ordered collection of elements, which can be of different types. For example, a list can contain integers, strings, and other lists. Lists are used to store and manipulate collections of data, such as a list of student names or a series of numbers.
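In Python, this idea corresponds to the built-in list type. The sketch below uses invented names and numbers to show ordering, mixed element types, and a simple calculation over a list.

```python
# A list of student names; the order of elements is preserved.
students = ["Asha", "Ravi", "Meena"]
students.append("Kiran")  # add a new element at the end
first = students[0]       # positions are counted from 0

# A list may mix types and may even contain another list.
mixed = [42, "hello", [1, 2, 3]]

# A series of numbers can be processed as a whole.
numbers = [3, 1, 4, 1, 5]
total = sum(numbers)
```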


A Record is a collection of fields, each with a name and a value, used to represent structured data. For example, a record can represent a student with fields for name, age, and grade. Records are used to group related data together, making it easier to manage and access.
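One way to model the student record in Python is a dataclass, with a plain dictionary as a lighter alternative. Both are sketches: the class name Student and the sample values are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """A record with three named fields."""
    name: str
    age: int
    grade: str


# Each field is accessed by its name rather than by position.
student = Student(name="Asha", age=19, grade="A")
student.age += 1  # update a single field

# A dictionary can hold the same kind of name/value pairs.
as_dict = {"name": "Ravi", "age": 20, "grade": "B"}
```

The dataclass fixes the set of fields in advance, while a dictionary accepts any keys, so the dataclass is usually the safer choice when every record should have the same shape.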


Subtypes of Character

Different categories of characters, such as letters, digits, and symbols. For example, 'A' is a letter, '1' is a digit, and '$' is a symbol.
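A minimal sketch of this classification, using Python's built-in string methods, is given below. The function name classify_char is hypothetical, and treating everything that is neither a letter nor a digit (including spaces) as a symbol is a simplifying assumption.

```python
def classify_char(ch):
    """Classify a single character as 'letter', 'digit', or 'symbol'."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    # Assumption: anything else, including whitespace, counts as a symbol.
    return "symbol"
```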

Subtypes of Integer

Different categories of integers, such as positive, negative, and zero. For example, 5 is a positive integer, -3 is a negative integer, and 0 is zero.
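The three categories can be told apart by comparing with zero, as in this short sketch (the function name classify_int is chosen for illustration).

```python
def classify_int(n):
    """Return 'positive', 'negative', or 'zero' for an integer n."""
    if n > 0:
        return "positive"
    if n < 0:
        return "negative"
    return "zero"
```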

Subtypes of String

Different categories of strings, grouped by the characters they contain, such as alphanumeric, numeric, and special-character strings. For example, "abc123" is an alphanumeric string, "12345" is a numeric string, and "!@#$%" consists only of special characters.
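These categories can be checked with Python's built-in string methods. The sketch below is one possible reading: the function name classify_string and the extra "empty" and "mixed" labels are assumptions added so that every input gets an answer.

```python
def classify_string(text):
    """Classify a string by the kinds of characters it contains."""
    if text == "":
        return "empty"
    # Checked first, because an all-digit string is also alphanumeric.
    if text.isdigit():
        return "numeric"
    # Only letters and digits, with at least one letter.
    if text.isalnum():
        return "alphanumeric"
    # No letters or digits at all.
    if not any(ch.isalnum() for ch in text):
        return "special characters"
    # Letters or digits combined with other characters, e.g. "abc!".
    return "mixed"
```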



