Sanity of Data
Sanity of data refers to the process of confirming that data is accurate, consistent, and reliable before it is used. Common checks include:
- Range Checks: Ensuring numerical values fall within a specified range. For example, ages should be between 0 and 120.
- Format Checks: Verifying that data follows a specific format, such as dates being in the format YYYY-MM-DD.
- Consistency Checks: Ensuring related data is consistent. For example, the end date of an event should not be before the start date.
- Uniqueness Checks: Ensuring that unique fields, like email addresses, are not duplicated in the dataset.
- Completeness Checks: Ensuring that all required fields are filled. For instance, a form submission should not have any mandatory fields left blank.
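The five checks above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the field names (`email`, `age`, `start`, `end`), the sample records, and the helpers `parse_iso`, `check_record`, and `validate` are all made up for the example.

```python
import re
from datetime import date

# Hypothetical sample data; the field names are assumptions for illustration.
records = [
    {"email": "a@example.com", "age": 34, "start": "2024-01-10", "end": "2024-01-12"},
    {"email": "b@example.com", "age": 150, "start": "2024-02-01", "end": "2024-01-15"},
    {"email": "a@example.com", "age": 28, "start": "2024/03/01", "end": "2024-03-02"},
    {"email": "", "age": 40, "start": "2024-04-01", "end": "2024-04-01"},
]

REQUIRED = ("email", "age", "start", "end")


def parse_iso(text):
    """Return a date if text is a valid YYYY-MM-DD string, otherwise None."""
    if not isinstance(text, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:  # right shape but impossible date, e.g. 2024-13-40
        return None


def check_record(rec, seen_emails):
    """Return a list of problems found in one record."""
    problems = []
    # Completeness check: every required field must be filled.
    for field in REQUIRED:
        if rec.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Range check: age must lie between 0 and 120.
    age = rec.get("age")
    if isinstance(age, int) and not 0 <= age <= 120:
        problems.append("age out of range")
    # Format check: both dates must be in YYYY-MM-DD format.
    start, end = parse_iso(rec.get("start")), parse_iso(rec.get("end"))
    if start is None or end is None:
        problems.append("bad date format")
    # Consistency check: the end date must not be before the start date.
    elif end < start:
        problems.append("end before start")
    # Uniqueness check: an email address may appear only once.
    email = rec.get("email")
    if email:
        if email in seen_emails:
            problems.append("duplicate email")
        seen_emails.add(email)
    return problems


def validate(all_records):
    """Check every record, sharing one set of already-seen emails."""
    seen = set()
    return [check_record(rec, seen) for rec in all_records]


for rec, problems in zip(records, validate(records)):
    print(rec["email"] or "(no email)", "->", problems or "ok")
```

Returning a list of problems per record, rather than stopping at the first failure, lets one pass report everything that is wrong with a dataset.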
Classroom Dataset
Data collected from a classroom setting, including student names, grades, attendance, etc.
Shopping Bill Dataset
Data from shopping bills, including items purchased, quantities, prices, and total amounts.
Words Dataset
A collection of words, often used for natural language processing tasks.
Trains Dataset
Data related to trains, including schedules, routes, and passenger information.
Expressions Dataset
A dataset containing various expressions, which could be mathematical, logical, or linguistic.
Rectangles Dataset
Data involving rectangles, including their dimensions, areas, and other properties.
Basic Datatypes
Boolean
A Boolean data type has exactly two possible values: true or false. It is often used in conditional statements to control the flow of a program. For example, in a login system, a Boolean can indicate whether the user credentials are valid (true) or not (false).
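The login example might look like this in Python. It is only a sketch: the `STORED` dictionary and the `credentials_valid` function are hypothetical names, and the credentials are kept in plain text purely to keep the example short.

```python
# Hypothetical stored credentials, for illustration only.
STORED = {"alice": "s3cret"}


def credentials_valid(username, password):
    """Return True if the username exists and the password matches, else False."""
    return username in STORED and STORED[username] == password


# The Boolean result decides which branch of the program runs.
is_valid = credentials_valid("alice", "s3cret")
if is_valid:
    message = "Welcome"
else:
    message = "Access denied"

print(is_valid, message)
```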
Character
A Character data type represents a single character, such as a letter, digit, or symbol. For example, 'A', '1', and '$' are characters. In programming, characters are often used to build strings or to represent individual elements in text processing.
Integer
An Integer data type represents whole numbers without any fractional part. Examples include -3, 0, and 42. Integers are used in various calculations, such as counting iterations in loops or performing arithmetic operations.
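As a small illustration of integers used for counting and arithmetic, here is a Python sketch. The variable names are chosen for the example only.

```python
values = [-3, 0, 42]

# Counting iterations of a loop with an integer counter.
count = 0
for v in values:
    count += 1

# Arithmetic on integers produces integers for +, -, * and //.
total = -3 + 0 + 42
quotient = 42 // 5     # integer division drops the fractional part
remainder = 42 % 5     # what is left over after that division

print(count, total, quotient, remainder)
```

Note that in Python the `/` operator returns a fractional (floating-point) result even for two integers, so `//` is the operator that stays within the Integer type.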
Compound Datatypes
Strings
A String is a sequence of characters used to represent text. For example, "Hello, World!" is a string. Strings are used in many applications, such as storing user input, displaying messages, and manipulating text.
Lists
A List is an ordered collection of elements, which can be of different types. For example, a list can contain integers, strings, and other lists. Lists are used to store and manipulate collections of data, such as a list of student names or a series of numbers.
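A short Python sketch of the list behaviour described above. The student names and numbers are made-up sample values.

```python
# A list may hold elements of different types, including another list.
mixed = [42, "hello", [1, 2, 3]]

# A list of (hypothetical) student names; order is preserved.
names = ["Asha", "Ravi", "Meena"]
names.append("Kiran")      # add an element at the end
first = names[0]           # positions start at 0

# A series of numbers that can be processed as a whole.
numbers = [4, 8, 15]
total = sum(numbers)
doubled = [n * 2 for n in numbers]

print(mixed, names, first, total, doubled)
```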
Records
A Record is a collection of fields, each with a name and a value, used to represent structured data. For example, a record can represent a student with fields for name, age, and grade. Records are used to group related data together, making it easier to manage and access.
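One way to sketch the student record in Python is with a dataclass. The class name `Student` and the sample values are assumptions for illustration; a plain dictionary would serve equally well.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """A record with three named fields."""
    name: str
    age: int
    grade: str


student = Student(name="Asha", age=15, grade="A")

# Fields are accessed and updated by name.
student.age += 1
print(student.name, student.age, student.grade)
```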
Subtypes
Subtypes of Character
Characters can be divided into categories such as letters, digits, and symbols. For example, 'A' is a letter, '1' is a digit, and '$' is a symbol.
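These character categories can be told apart with a small Python function. Python has no separate character type, so the sketch treats a string of length 1 as a character; `classify_char` is a hypothetical helper, and lumping everything that is neither a letter nor a digit under "symbol" is a simplifying assumption.

```python
def classify_char(ch):
    """Return 'letter', 'digit', or 'symbol' for a single character."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    # Assumption: anything else (punctuation, spaces, ...) counts as a symbol.
    return "symbol"


for ch in ["A", "1", "$"]:
    print(repr(ch), "is a", classify_char(ch))
```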
Subtypes of Integer
Integers can be divided into positive integers, negative integers, and zero. For example, 5 is a positive integer, -3 is a negative integer, and 0 is zero.
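The same three-way split can be written as a Python function. `classify_int` is a hypothetical name used only for this sketch.

```python
def classify_int(n):
    """Return 'positive', 'negative', or 'zero' for an integer."""
    if n > 0:
        return "positive"
    if n < 0:
        return "negative"
    return "zero"


for n in [5, -3, 0]:
    print(n, "is", classify_int(n))
```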
Subtypes of String
Strings can be divided into categories such as alphanumeric strings, numeric strings, and strings of special characters. For example, "abc123" is an alphanumeric string, "12345" is a numeric string, and "!@#$%" consists only of special characters.
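A possible Python sketch of this classification follows. The function `classify_string` and the extra labels "empty" and "mixed" are assumptions added so that every input gets an answer; the order of the tests matters, because a numeric string is also alphanumeric.

```python
def classify_string(s):
    """Label a string as numeric, alphanumeric, special characters, or mixed."""
    if s == "":
        return "empty"
    if s.isdigit():
        return "numeric"          # digits only
    if s.isalnum():
        return "alphanumeric"     # letters and/or digits, nothing else
    if not any(c.isalnum() or c.isspace() for c in s):
        return "special characters"  # no letters, digits, or spaces at all
    return "mixed"


for s in ["abc123", "12345", "!@#$%", "Hello, World!"]:
    print(repr(s), "->", classify_string(s))
```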