Foundations of Data Science : TYBCS : SPPU : Syllabus - BCS Guruji

Sunday, August 27, 2023

Course Contents


Chapter 1 Introduction to Data Science 


Introduction to data science, The 3 V’s: Volume, Velocity, Variety

Why learn Data Science?

Applications of Data Science

The Data Science Lifecycle

Data Scientist’s Toolbox

Types of Data

Structured, Semi-structured, Unstructured Data, Problems with unstructured data

Data sources

Open Data, Social Media Data, Multimodal Data, standard datasets

Data Formats

Integers, Floats, Text Data, Text Files, Dense Numerical Arrays, Compressed or Archived Data, CSV Files, JSON Files, XML Files, HTML Files, Tar Files, GZip Files, Zip Files, Image Files: Rasterized, Vectorized, and/or Compressed


Chapter 2 Statistical Data Analysis 


2.1. Role of statistics in data science

2.2. Descriptive statistics

Measuring the Frequency

Measuring the Central Tendency: Mean, Median, and Mode

Measuring the Dispersion: Range, Standard deviation, Variance, Interquartile Range
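As an illustration of the measures listed in 2.2, here is a minimal sketch using Python's standard `statistics` module. The function name `describe` and the sample data are hypothetical, and the sketch assumes population (not sample) variance and the module's default quartile method; a course text may define quartiles differently.

```python
import statistics

def describe(values):
    """Return central tendency and dispersion measures for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
    # (default "exclusive" method; other conventions give slightly different values).
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "variance": statistics.pvariance(ordered),  # population variance (assumption)
        "std_dev": statistics.pstdev(ordered),      # population standard deviation
        "iqr": q3 - q1,
    }

# Hypothetical sample data, for illustration only.
summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
```

Replacing `pvariance` and `pstdev` with `variance` and `stdev` would give the sample versions instead.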

2.3. Inferential statistics

Hypothesis testing, Multiple hypothesis testing, Parameter Estimation methods

2.4. Measuring Data Similarity and Dissimilarity

Data Matrix versus Dissimilarity Matrix, Proximity Measures for Nominal Attributes, Proximity Measures for Binary Attributes, Dissimilarity of Numeric Data: Euclidean, Manhattan, and Minkowski distances, Proximity Measures for Ordinal Attributes
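The three numeric distances named in 2.4 are closely related: Manhattan and Euclidean distance are the Minkowski distance with order p = 1 and p = 2 respectively. A minimal sketch, with hypothetical function names and plain Python sequences standing in for data objects:

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def manhattan(x, y):
    """Manhattan (city-block) distance: Minkowski with p = 1."""
    return minkowski(x, y, 1)

def euclidean(x, y):
    """Euclidean (straight-line) distance: Minkowski with p = 2."""
    return minkowski(x, y, 2)

# Hypothetical points, for illustration only.
d_manhattan = manhattan((0, 0), (3, 4))
d_euclidean = euclidean((0, 0), (3, 4))
```

Computing such a distance for every pair of objects in a data matrix yields the corresponding dissimilarity matrix.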

2.5. Concept of Outlier, types of outliers, outlier detection methods


Chapter 3 Data Preprocessing 


Data Objects and Attribute Types: What Is an Attribute?, Nominal, Binary, Ordinal Attributes, Numeric Attributes, Discrete versus Continuous Attributes

Data Quality: Why Preprocess the Data?

3.3. Data munging/wrangling operations

Cleaning Data - Missing Values, Noisy Data (Duplicate Entries, Multiple Entries for a Single Entity, Missing Entries, NULLs, Huge Outliers, Out-of-Date Data, Artificial Entries, Irregular Spacings, Formatting Issues - Irregular Formatting between Different Tables/Columns, Extra Whitespace, Irregular Capitalization, Inconsistent Delimiters, Irregular NULL Format, Invalid Characters, Incompatible Datetimes)

Data Transformation - Rescaling, Normalizing, Binarizing, Standardizing, Label and One-Hot Encoding

Data reduction

Data discretization
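Several of the data transformation operations listed in this chapter can be sketched in a few lines of plain Python. The function names below are hypothetical, and the sketch makes some assumptions: rescaling means min-max scaling to [0, 1], standardizing means z-scores using the population standard deviation, and the input has at least two distinct values. Normalizing is omitted because its definition varies between texts.

```python
import statistics

def rescale(values):
    """Min-max rescaling to the range [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score standardization: subtract the mean, divide by the std deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population std deviation (assumption)
    return [(v - mu) / sigma for v in values]

def binarize(values, threshold):
    """Map each value to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in values]

def label_encode(categories):
    """Replace each category with an integer code (codes follow sorted order)."""
    labels = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [labels[c] for c in categories], labels

def one_hot(categories):
    """Replace each category with a 0/1 vector holding a single 1."""
    codes, labels = label_encode(categories)
    return [[1 if code == k else 0 for k in range(len(labels))] for code in codes]

# Hypothetical inputs, for illustration only.
scaled = rescale([10, 20, 30])
z_scores = standardize([10, 20, 30])
flags = binarize([1, 5, 9], threshold=4)
codes, mapping = label_encode(["red", "green", "red"])
vectors = one_hot(["red", "green", "red"])
```

Label encoding implies an order among the codes, which is why one-hot encoding is often preferred for nominal attributes.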


Chapter 4 Data Visualization 


Introduction to Exploratory Data Analysis

Data visualization and visual encoding

Data visualization libraries

Basic data visualization tools

Histograms, Bar charts/graphs, Scatter plots, Line charts, Area plots, Pie charts, Donut charts

Specialized data visualization tools

Boxplots, Bubble plots, Heat map, Dendrogram, Venn diagram, Treemap, 3D scatter plots

Advanced data visualization tools - Wordclouds

Visualization of geospatial data

Data Visualization types

