This course introduces developers to the Hadoop ecosystem, focusing on multiple programming models including MapReduce, Pig, Hive, and Apache Spark.
• Developing MapReduce programs in Java
• Analyzing data using Pig and Hive
• Distributed programming with Spark
• Using machine learning algorithms
• scikit-learn
• Data Representation
• Data Preprocessing
• scikit-learn API
• Supervised and Unsupervised Learning
• Clustering
• Exploring a Dataset: Wholesale Customers Dataset
• Data Visualization
• k-means Algorithm
• Mean-Shift Algorithm
• DBSCAN Algorithm
• Evaluating the Performance of Clusters
• Model Validation and Testing
• Evaluation Metrics
• Error Analysis
• Exploring the Dataset
• Naïve Bayes Algorithm
• Decision Tree Algorithm
• Support Vector Machine Algorithm
• Error Analysis
• Artificial Neural Networks
• Applying an Artificial Neural Network
• Performance Analysis
• Program Definition
• Saving and Loading a Trained Model
• Interacting with a Trained Model
This course is well suited to beginners in machine learning. No prior knowledge of scikit-learn or machine learning algorithms is required, but students must have prior experience with Python programming.