Duration: 16 weeks (2 hours every day) | New Batch: 05-Nov-2018
These are state-of-the-art programs designed by Prime Classes, promoted by a team of data science practitioners with 15+ years of experience in the IT industry. The programs are experiential and application-driven, aimed at highly motivated working professionals who want to become data science practitioners.
This module prepares you for the essential skill of thinking like a statistician. It will change your analytical thinking process, and you will begin looking at data and numbers from a different perspective. This is a fundamental module, and strong concepts in this area will help you differentiate yourself as a Data Scientist.

This module covers
• Probability theory and related algorithms
• Descriptive statistical methods
• Inferential statistical methods
From a tools perspective, you will gain confidence with tools like R and Excel.

Fundamentals of Probability
• Introduction to random variables
• Probability theory
• Conditional probability
• Bayes' Theorem

The Concept of a data set
• Understanding the properties of an attribute: Central tendencies (Mean, Median, Mode)
• Measures of spread (Range, Variance, Standard Deviation)
• Basics of Probability Distributions; Expectation and Variance of a variable

Probability distributions and differences between discrete and continuous distributions
• Discrete probability distributions: Binomial, Poisson
• Continuous probability distributions: Normal distribution; t-distribution

Procedure for gaining inference about populations from samples. Understand the data attributes, distributions, sample vs population.

Procedure for statistical testing
• Extend the understanding to analyze relationships between variables
• How to conduct statistical hypothesis testing and introduction to various methods such as chi-square test, t-test, z-test, F-test and ANOVA
• Covariance and Correlation as a Precursor to Regression
• Hands-on Implementation in R
• Vectors, Matrices, Eigenvalues, Eigenvectors, Orthogonality, etc.
• Kernel tricks, kernel functions, PCA, SVD, LSA
• Hands-on implementation in R
Data preprocessing techniques
• Python and R basics
• Database Concepts
• String and list objects
• Exception handling
• Understanding of data structures, functions, control structures, data manipulations, date and string manipulations
• Pre-processing techniques: Binning, Filling missing values, Standardization and Normalization, Type conversions, Train-test data split, ROCR
• Need for Visualizations
• How to tell a Data Story
• Communicating with data: Issues and guiding principles; Primary ingredients of data visualization; How to pick visual encodings such as color, shape, size; Which chart to use when; How to accommodate more than 2 dimensions
• A case highlighting the transition from a simple chart to a powerful visualization, complete with storytelling
• Using R ggplot and Qlik Sense for visualizations
Introduction to Planning and Architecting Data Science Solutions
• Frameworks to analyze a data science problem
• How to choose an error metric
• Efficient ways to present the results of data science and data analytics
• The different forms in which data is available
Fundamentals of Linear Regression
• Relationships between multiple variables: regression (Linear, Multivariate Linear Regression) in prediction
• Understanding the summary output of Linear Regression
• Residual Analysis
• Identifying significant features, feature reduction using AIC, multicollinearity check, observing influential points
• Non-normality and Heteroscedasticity
• Hypothesis testing of the regression model
• Confidence intervals of the slope
• R-squared and goodness of fit
• Influential observations and leverage in Multiple Linear Regression
• Polynomial Regression
• Categorical Variables in Regression
• Hands-on Linear Regression

Introduction and deep dive into logistic regression and the important concept of ROC curves
• Logistic Regression
• ROC curves
• Logistic regression in classification; output interpretations
• Hands-on Logistic Regression

Time Series Analysis
• Decomposition of Time Series
• Trend and Seasonality detection and forecasting
• Smoothing Techniques
• Understanding ACF & PACF plots
• ARIMA Modeling
• Holt-Winters Method

Principles and ideas in the field of Data Mining
• Rule patterns, construction of a rule-based classifier from data, turning trees into rules, rule-growing strategy, rule evaluation and stopping criteria, and several business metrics such as actionability and explicability; the module then turns to association rules and covers them in detail
• Indirect: from decision trees
• Direct: Sequential covering
• Market Basket Analysis, Apriori, Recommendation engines, Association Rules
• How to combine clustering and classification
• How to measure the quality of clustering - outlier analysis
• Association Analysis
• FP Trees
• Hands-on with R

Decision trees
• Top-Down Induction of Decision Trees (TDIDT)
• Attribute selection based on an information theory approach
• Recursive partitioning (binary search)
• ID3, C4.5, C5.0 for pattern recognition problems, avoiding overfitting, converting trees to rules
• Hands-on with R

Distance-based classifiers
• K-Nearest Neighbor algorithm
• Aspects to consider while designing K-Nearest Neighbor
• Hands-on example of K-Nearest Neighbor using R
• Collaborative filtering

Neural networks
• Perceptron and Single Layer Neural Network
• Back Propagation algorithm and a typical Feed Forward Neural Net
• Hands-on with R with a Case

Support vector machines (SVM)
• Linear learning machines and kernel space, making kernels and working in feature space
• SVM algorithm and comparison with Neural Nets
• Demonstration of SVM on classification problems using a business case in R

Ensemble methods
• Bagging and boosting and their impact on bias and variance
• C5.0 boosting
• Random Forest
• AdaBoost
• Gradient boosting machines

Unsupervised learning algorithms: Clustering
• Different clustering methods; review of several distance measures
• Iterative distance-based clustering
• Dealing with continuous and categorical values in K-Means
• Constructing a hierarchical cluster; K-medoids, k-modes and density-based clustering to handle different data types in practice
• Test for stability check of clusters
• Hands-on implementation of each of these methods will be conducted in R
Bayesian belief nets, Naïve Bayes, popular techniques to handle Overfitting and Underfitting
• Introduction to generative techniques
• Bayesian belief nets (BBN)
• Naïve Bayes: a special case of BBN
• Hands-on Naïve Bayes in R
• How to avoid Overfitting and Underfitting
• Refresher on all the machine learning algorithms
Text processing algorithms

Basics of search engines
• Introduction to the fundamentals of information retrieval; Language modeling
• N-gram models of language

Smoothing and probabilistic language models
• Query likelihood model
• 2-stage smoothing
• Text Indexing and Crawling
• Inverted Indexes
• Boolean query processing
• Handling phrase queries
• Proximity queries
• Crawling

Relevance Ranking
• Need for Relevance Ranking
• TF and IDF
• Thinking about the math behind the text
• Properties of words; Vector Space Model
• Evaluation metrics for Ranking

Link Analysis Algorithms
• PageRank
• HITS
• Topic-sensitive PageRank
• Spam Detection Algorithms

Natural Language Processing
• Stemming, phrase identification, word sense disambiguation
• POS tagging
• Parsing and semantic structures
• Coreference resolution

Named Entity Recognition
• What is NER?
• Possible applications of NER
• Evaluation and testing
• NER methods
• Basics of neural network
• Linear algebra
• Implementation of neural network in Vanilla
• Basics of TensorFlow
• Convolutional neural networks (CNNs)
• Recurrent neural networks (RNNs)
• Generative models
• Semi-supervised learning using GAN
• Seq-to-seq model
• Encoder and decoder