Course Details

Data Science Foundation

Duration: 5 weeks (2 hours every day)

New Batch: 07-Nov-2018

About Course

These state-of-the-art programs are designed by Prime Classes, promoted by a team of data science practitioners with 15+ years of experience in the IT industry. They are experiential, application-driven programs that help highly motivated working professionals become data science practitioners.

Curriculum

The concept of a data set
  • Understanding the properties of an attribute: central tendencies (mean, median, mode)
  • Measures of spread (range, variance, standard deviation)
  • Basics of probability distributions; expectation and variance of a variable

Probability distributions and differences between discrete and continuous distributions
  • Discrete probability distributions: Binomial, Poisson
  • Continuous probability distributions: Normal distribution, t-distribution

Procedure for gaining inference about populations from samples
  • Understanding the data attributes, distributions, sample vs. population

Procedure for statistical testing
  • Extending the understanding to analyze relationships between variables
  • How to conduct statistical hypothesis testing, with an introduction to methods such as the chi-square test, t-test, z-test, F-test and ANOVA
  • Covariance and correlation, a precursor to regression
  • Hands-on implementation in R
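To make the first topics concrete, here is a minimal sketch of the central tendency and spread measures listed above. It is written in Python with the standard library only, which is an assumption of this example (the course's own hands-on sessions use R), and the sample values are hypothetical.

```python
import statistics

# Hypothetical sample: values of a single attribute.
values = [2, 3, 3, 5, 7, 10]

# Central tendencies
mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle of the sorted values
mode = statistics.mode(values)      # most frequent value

# Measures of spread
value_range = max(values) - min(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # square root of the sample variance

print(mean, median, mode, value_range, variance, std_dev)
```

The sample (n - 1) versions of variance and standard deviation are used here because the curriculum emphasises inference from samples; `statistics.pvariance` and `statistics.pstdev` give the population versions.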


Data preprocessing techniques
  • R basics
  • Understanding of data structures, functions, control structures, data manipulations, date and string manipulations
  • Pre-processing techniques: binning, filling missing values, standardization and normalization, type conversions, train-test data split, ROCR1
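Three of the pre-processing techniques named above (filling missing values, normalization and standardization) can be sketched as follows. This is an illustrative Python example, not the course's R material; the function names and the mean-imputation strategy are assumptions made for this sketch.

```python
import statistics

def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]

def normalize(values):
    """Min-max normalization: rescale values to the range [0, 1].

    Assumes the values are not all identical (max > min).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score standardization: zero mean, unit sample standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mu) / sigma for v in values]

raw = [10, None, 20, 30]         # hypothetical attribute with one missing value
filled = fill_missing(raw)       # None is replaced by the mean of 10, 20, 30
scaled = normalize(filled)       # values rescaled to [0, 1]
z_scores = standardize(filled)   # values centred on 0
```

Normalization keeps the original shape of the data inside a fixed range, while standardization expresses each value as a number of standard deviations from the mean; which one fits depends on the model being trained.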


  • Need for visualizations
  • How to tell a data story
  • A case highlighting the transition from a simple chart to a powerful visualization, complete with storytelling


  • Frameworks to analyze a data science problem
  • How to choose an error metric
  • Efficient ways to present the results of data science and data analytics work
  • The different forms in which data is available


Fundamentals of linear regression
  • Linear regression
  • Relationship between multiple variables: regression (linear, multivariate linear regression) in prediction
  • Understanding the summary output of linear regression

Introduction and deep dive into logistic regression and the important concept of ROC curves
  • Logistic regression
  • ROC curves
  • Hands-on logistic regression

Time series analysis
  • Decomposition of time series
  • Trend and seasonality detection and forecasting
  • Understanding ACF and PACF plots
  • ARIMA modeling

Principles and ideas in the field of data mining
  • Rule patterns, construction of a rule-based classifier from data, turning trees into rules, rule-growing strategy, rule evaluation and stopping criteria, and several business metrics such as actionability and explicability; the module then turns to association rules and covers them in detail
  • Indirect: from decision trees
  • Direct: sequential covering
  • Market basket analysis, Apriori, recommendation engines, association rules
  • How to combine clustering and classification
  • How to measure the quality of clustering; outlier analysis
  • Association analysis
  • Hands-on with R

Decision trees
  • Top-Down Induction of Decision Trees (TDIDT)
  • Attribute selection based on an information theory approach
  • ID3, C4.5, C5.0 for pattern recognition problems, avoiding overfitting, converting trees to rules
  • Hands-on with R

Distance-based classifiers

Neural networks
  • Perceptron and single-layer neural network
  • Backpropagation algorithm and a typical feed-forward neural net
  • Hands-on with R with a case

Unsupervised learning algorithm: clustering
  • Different clustering methods; review of several distance measures
  • Iterative distance-based clustering
  • Dealing with continuous and categorical values in K-Means
  • Constructing hierarchical clusters; K-medoids, K-modes and density-based clustering to handle different data types in practice
  • Test for stability check of clusters
  • Hands-on implementation of each of these methods in R

Naïve Bayes in R; popular techniques to handle overfitting and underfitting
  • Hands-on Naïve Bayes in R
  • How to avoid overfitting and underfitting

Relevance ranking
  • Relevance ranking
  • Need for relevance ranking
  • TF and IDF
  • Thinking about the math behind the text; properties of words; vector space model
  • Evaluation metrics for ranking
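As an illustration of the TF and IDF topic in the relevance ranking module, the sketch below computes TF-IDF weights for a few short documents. It is a Python example under stated assumptions, not the course's R implementation: the function name `tf_idf`, the whitespace tokenizer, the unsmoothed formula tf × log(N / df), and the sample documents are all choices made for this example.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document.

    TF is the term's share of the document's tokens; IDF is log(N / df),
    where N is the number of documents and df is the number of documents
    that contain the term.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Count each term once per document to get its document frequency.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical mini-corpus
docs = [
    "data science with R",
    "data mining with rules",
    "time series forecasting",
]
scores = tf_idf(docs)
```

In this weighting, a term that appears in only one document (such as "science") scores higher than one shared across documents (such as "data"), and a term present in every document gets a weight of zero. Libraries often use smoothed variants of IDF, so their numbers will differ from this plain formula.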

