National statistics offices increasingly need to process larger, more complex datasets while producing results that are transparent, reproducible, and easy to review. Under the SADC Regional Statistics Project, rowsquared designed and delivered a hands-on Python training programme for statisticians and analysts across the region, helping national teams move from fragmented, manual workflows towards scalable, code-based production of official statistics.

Python is well suited to this transition. As an open-source language, it removes licensing barriers and avoids dependence on a single vendor. Its large global community provides durable skills, extensive learning resources, and well-maintained tools. One language can cover the full workflow, from data preparation and analysis to reporting, while integrating smoothly with spreadsheets, databases, GIS tools, dashboards, and web services that national statistics offices already use. It also creates a practical foundation for modern methods such as machine learning and natural language processing.

rowsquared developed a two-course curriculum delivered through in-country workshops. The beginner course introduces Python fundamentals, data handling, data transformation, exploratory analysis, visualisation, and descriptive statistics for participants without prior programming experience. The intermediate course builds on this foundation with reproducible pipelines, code structuring and testing, API-based data access, statistical modelling, applied machine learning, and geospatial analysis. Both levels run as intensive five-day workshops with sustained hands-on practice.

The programme is built around modular content that rowsquared tailors to each national statistics office through pre-course engagement, participant assessments, and on-course adjustments. Lessons draw on analogies with the tools participants already use, such as Excel, SPSS, or Stata, helping them see not only how to write code, but also how Python can replace or improve tasks they already perform. Exercises use real datasets from the host office, grounding the training in participants’ professional context, increasing engagement, and supporting immediate transfer of skills back to the workplace.
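As an illustration of this kind of analogy, the sketch below reproduces what a spreadsheet user might do with an AVERAGEIF formula or a pivot table: averaging one column within each group of another. It is not taken from the course materials. The records, the column names `region` and `household_size`, and the helper `average_by_group` are all hypothetical, and the example uses only the Python standard library so that it runs without any additional packages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records (illustrative only). In a spreadsheet these
# would be rows with the columns "region" and "household_size".
records = [
    {"region": "North", "household_size": 4},
    {"region": "South", "household_size": 6},
    {"region": "North", "household_size": 5},
    {"region": "South", "household_size": 3},
    {"region": "South", "household_size": 3},
]


def average_by_group(rows, group_key, value_key):
    """Return the mean of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for row in rows:
        # Collect every value under its group, like filtering a column by region.
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


averages = average_by_group(records, "region", "household_size")
for region, value in sorted(averages.items()):
    print(region, value)
```

Because the calculation lives in a named function rather than in cell formulas, the same step can be rerun unchanged on a new data file and reviewed line by line.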

To remove technical setup as a barrier, rowsquared delivers the courses through a cloud-hosted JupyterHub environment, supported by Moodle for learning materials, quizzes, and evaluation. Participants log in to individual workspaces from day one, with no local installation required. This creates a consistent training environment across different devices and network conditions, allowing the workshop to focus on learning rather than troubleshooting. All course materials remain available online after the training, giving participants a resource they can return to as they apply the skills in their own work.