Data Science

Data scientists are building machine learning models, driving business strategy, and commanding some of the highest salaries in tech. This is where you build the expertise to join them.

Duration

22 weeks at 20 hours/week

Level

Intermediate

Start Date

Jul 20, 2026

Format

Learn at your own pace.

Register Interest

Where our Data Science graduates work

People are landing roles at leading companies after completing this programme. You could be next.

Catherine O.
Lead Editor
FW Africa

Dela K.
Facilitator
Angels Specialist School International

Hope A.
Investigation Officer II
Public Complaints Commission

“Other than technical lessons, one thing I learned from ALX is that success comes from stepping out of your comfort zone.”

Dorine M

Data Scientist

Mastercard Foundation

Why people choose ALX for their career in AI & Data

“I started as a high school student with an idea. Today, I’m a tech professional while building my own startup, Cheemba: a journey shaped through ALX programmes.”

Gisele

What you’ll learn

Data and AI Literacy Foundations

Equip learners with the foundational mindset and technical literacy needed to solve complex problems using structured reasoning, programmatic logic, and the EGAD framework.

Outcomes

Develop the analytical mindset and technical foundation to solve complex problems using structured reasoning, programmatic logic, and the EGAD framework.

This course enables learners to transform raw data into reliable business insights by mastering data cleaning, governance, and statistical reasoning within a spreadsheet environment. Learners will develop the technical proficiency to source and prepare datasets, apply descriptive analytics, and use AI-powered tools to visualize patterns and validate assumptions for data-driven decision-making.

Outcomes

Transform raw data into reliable business insights by mastering data cleaning, statistical analysis, and AI-powered visualization within a spreadsheet environment.

This course provides a comprehensive foundation in relational database management, focusing on the ability to design, query, and optimize complex data structures using SQL. Learners will master everything from basic data retrieval to advanced analytical functions, window functions, and database normalization, all while applying best practices within production-grade notebook environments.

Outcomes

Design, query, and optimize relational databases using SQL—from basic data retrieval to advanced analytical functions and database normalization.

This course focuses on transforming complex datasets into impactful visual stories by mastering data modeling, Power Query transformations, and DAX expressions within Power BI. Learners will develop the skills to design interactive dashboards and accessible reports that effectively communicate insights to both technical and non-technical stakeholders.

Outcomes

Build interactive Power BI dashboards and reports that turn complex data into compelling visual stories using data modeling, Power Query, and DAX.

Master Python fundamentals, from data structures and control flow to modular functions, while integrating Git/GitHub and AI-assisted coding (Copilot/Claude) into a professional development workflow.

Outcomes

Master Python fundamentals (data structures, control flow, and modular functions) while building a professional workflow with Git, GitHub, and AI coding tools.

Explore software architecture by mastering Object-Oriented Programming (OOP), algorithmic complexity (Big O), and advanced functional techniques. Learn to build professional-grade, scalable codebases and leverage NumPy and pandas for high-performance data manipulation, all while using AI to optimize class hierarchies and complex data transformations.

Outcomes

Write scalable, professional-grade Python code using Object-Oriented Programming, algorithmic complexity analysis, and advanced functional techniques, with NumPy, pandas, and AI-assisted optimization for high-performance data manipulation.

Transition from data analysis to predictive modeling by mastering the supervised machine learning workflow. Build, diagnose, and optimize Linear Regression models (Simple and Multiple) using scikit-learn, handle high-dimensional data with Ridge and LASSO regularization, and deploy your models via object serialization (pickling). Use AI to scaffold ML pipelines and translate complex error metrics (RMSE/MAE) into actionable business insights.

Outcomes

Develop predictive models: build, evaluate, and deploy supervised machine learning models using scikit-learn, apply regularization techniques, and translate model performance into actionable business insights.

Master the art of predicting categories and managing complex model trade-offs. Build non-linear models like Decision Trees and Random Forests, solve classification problems with Logistic Regression, and learn to handle “needle-in-a-haystack” scenarios using SMOTE for imbalanced data. Use AI to trace complex decision logic and translate technical metrics like Precision and Recall into high-stakes business strategies.

Outcomes

Predict categories and navigate model trade-offs using Decision Trees, Random Forests, and Logistic Regression—including strategies for handling imbalanced datasets in high-stakes business scenarios.

Use the most powerful tools in the classifier’s arsenal, moving beyond basics to Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Naive Bayes. Bridge the gap to deep learning by architecting Neural Networks with TensorFlow/Keras and learn the rigorous science of Model Selection—using cross-validation and automated Grid Search to systematically prove which algorithm is truly the “best fit” for your data.

Outcomes

Apply advanced classifiers (SVMs, KNN, Naive Bayes, and Neural Networks) and use cross-validation and Grid Search to rigorously identify the best model for your data.

Uncover the hidden architecture of unlabeled data by mastering Clustering and Dimensionality Reduction. Simplify massive datasets using Principal Component Analysis (PCA), visualize high-dimensional structures with t-SNE and UMAP, and segment populations using K-Means and Hierarchical Clustering. Use AI to detect anomalies and transform abstract cluster centroids into vivid, actionable business personas.

Outcomes

Uncover hidden patterns in unlabeled data using clustering and dimensionality reduction techniques (PCA, t-SNE, K-Means, and more) and translate results into actionable business personas.

Explore the science of personalization and spatial analysis by implementing Gaussian Mixture Models (GMMs) for probabilistic “soft clustering” and building advanced Recommendation Engines. Process geographic data using GeoPandas, implement both Content-Based and Collaborative Filtering, and leverage AI zero-shot classification to solve the “Cold-Start” problem for new items.

Outcomes

Build recommendation engines and analyze geographic data using probabilistic clustering, content-based and collaborative filtering, and AI-powered zero-shot classification to solve cold-start challenges.

Transform messy, unstructured text into actionable intelligence by mastering Natural Language Processing (NLP). Build robust text-cleaning pipelines covering regex, tokenization, and lemmatization, and convert language into numerical features using TF-IDF and N-Grams. Step into modern AI by deploying Hugging Face Transformers for high-accuracy text classification without the need for manual training.

Outcomes

Transform unstructured text into actionable intelligence using NLP pipelines, TF-IDF, and modern Hugging Face Transformers for high-accuracy text classification.

Have any questions?

What is data science?

Data science is the discipline of using data, statistical methods, and machine learning to build models that predict outcomes, classify information, and uncover patterns that are not visible through standard analysis. Data scientists work at the intersection of mathematics, programming, and business strategy, turning data into decisions at scale.

Data analytics focuses on examining existing data to understand what has happened and why. Data science goes further. It builds predictive models and machine learning systems to forecast what will happen, automate decisions, and find structure in large, complex datasets. Data science requires deeper programming and mathematical skills, and the programme reflects that.

The programme takes you from data literacy and spreadsheet analysis through Python programming, exploratory data analysis, statistical reasoning, supervised and unsupervised machine learning, natural language processing, and recommendation systems. Every stage is built around real-world projects, including work on agricultural productivity, public health risk prediction, humanitarian aid allocation, and disaster relief.

Data scientists work in roles including Data Scientist, Machine Learning Engineer, AI Engineer, Research Scientist, and Senior Data Analyst. They are among the most in-demand professionals globally, with applications across healthcare, finance, agriculture, logistics, and technology.

You do not need prior programming experience. The programme builds Python from the foundations up, teaching data structures, control flow, and modular functions before moving into data manipulation and machine learning. What you need is persistence. This is one of the more demanding programmes in the portfolio, and the projects reflect that.

The first intake launches on 20 July 2026. The programme covers significantly more ground than most programmes in the portfolio. Following the recommended pace gives you a clear timeline. You can also move faster if your background allows.

The programme is deeply rooted in African contexts. Its projects are deliberately built around them, including water access in rural communities, food security for agricultural organisations, humanitarian aid allocation across African countries, and environmental monitoring. The skills are global. The application is grounded.

When you complete your first Data Science short course, you are automatically enrolled in Professional Foundations. From that point you complete both in parallel. Professional Foundations is required for your Data Science Programme Certificate, but you do not need to finish it before you can continue your Data Science short courses. If you have already completed Professional Foundations through a previous programme, the system will recognise that and you will not be asked to repeat it.