Machine Learning Fundamentals

Description

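This course introduces developers to machine learning with scikit-learn. It covers data representation and preprocessing, unsupervised learning (clustering with k-means, mean-shift, and DBSCAN), supervised learning (Naïve Bayes, decision trees, support vector machines, and artificial neural networks), model evaluation and error analysis, and saving and loading a trained model for use in a program.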

Objectives

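• Representing and preprocessing data with scikit-learn
• Clustering data with the k-means, mean-shift, and DBSCAN algorithms
• Training, validating, and evaluating supervised learning models
• Applying artificial neural networks
• Saving, loading, and interacting with a trained model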

Course Outline

Lesson 1: Introduction to scikit-learn

• scikit-learn
• Data Representation
• Data Preprocessing
• scikit-learn API
• Supervised and Unsupervised Learning

Lesson 2: Unsupervised Learning: Real-life Applications

• Clustering
• Exploring a Dataset: Wholesale Customers Dataset
• Data Visualization
• k-means Algorithm
• Mean-Shift Algorithm
• DBSCAN Algorithm
• Evaluating the Performance of Clusters

Lesson 3: Supervised Learning: Key Steps

• Model Validation and Testing
• Evaluation Metrics
• Error Analysis
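
As a rough illustration of the evaluation metrics covered in this lesson, the sketch below computes accuracy, precision, and recall for a binary classifier in plain Python. The function name `binary_metrics` and the sample labels are hypothetical, made up for this example; scikit-learn provides equivalent functions in `sklearn.metrics`.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells for the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a denominator is empty.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical true labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Precision and recall are reported separately because a model can score well on one and poorly on the other, which accuracy alone would hide.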

Lesson 4: Supervised Learning Algorithms: Predict Annual Income

• Exploring the Dataset
• Naïve Bayes Algorithm
• Decision Tree Algorithm
• Support Vector Machine Algorithm
• Error Analysis

Lesson 5: Artificial Neural Networks: Predict Annual Income

• Artificial Neural Networks
• Applying an Artificial Neural Network
• Performance Analysis

Lesson 6: Building Your Own Program

• Program Definition
• Saving and Loading a Trained Model
• Interacting with a Trained Model
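
The idea of saving and loading a trained model can be sketched with Python's standard `pickle` module. This is a minimal illustration, not the course's own material: the toy "model" (a threshold learned as the mean of the training values), the helper names `train` and `predict`, and the file name `model.pkl` are all assumptions made for this example. Fitted scikit-learn estimators can generally be serialized with `pickle.dump` and `pickle.load` in the same way.

```python
import os
import pickle
import tempfile


def train(values):
    """Hypothetical training step: learn a threshold equal to the mean."""
    return {"threshold": sum(values) / len(values)}


def predict(model, values):
    """Predict 1 for values above the learned threshold, otherwise 0."""
    return [1 if v > model["threshold"] else 0 for v in values]


model = train([10, 20, 30, 40])

# Save the trained model to disk (a temporary directory is used here).
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Later, possibly in another program, load it back and use it.
with open(path, "rb") as f:
    restored = pickle.load(f)

predictions = predict(restored, [5, 25, 50])
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.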

Target Audience

This course is well suited to beginners in machine learning. No prior knowledge of scikit-learn or machine learning algorithms is required, but students must have prior experience with Python programming.

Categories: Data Science