More and more organizations are realizing the importance of processing and analyzing data effectively. Alongside investing in infrastructure, platforms, and software, organizations also need to train and upskill staff in the art of data analysis. A versatile toolset that has gained immense popularity in this regard is the Python programming language.
This blog post offers some advice to organizations that are new to the world of data science and have staff members who are eager to harness the power of Python. To be crystal clear, I am not talking about software developers here. Instead, I am referring to non-software professionals who want to use Python as a handy complement to their existing tools like Excel.
First, I’ll share a bit about my own experience and how I’ve gained insights into this subject. Then, I’ll walk you through some key decisions you’ll need to make when it comes to providing your staff with appropriate Python environments.
If you’re a senior manager, decision maker, or involved in any way with rolling out Python training in your organization, then this post will give you some food for thought.
I am a senior technical instructor at Neueda and have been delivering training in software development for quite a few years now. The companies I deliver to are generally in financial services, and the attendees I teach range from interns just beginning their careers up to global heads of product and senior IT managers.
Over the last 4 or 5 years, the focus of my training has shifted from traditional software engineering courses to data science, analytics, and machine learning courses.
There has also been a significant shift in the day jobs of people who attend my courses. Five or more years ago, trainees were almost exclusively junior software developers. Today, around half my time is spent teaching Python to non-software developers, i.e., people whose roles vary widely, from trading and risk management to HR, from sales to controls and compliance, from confirmation and settlement to security.
Irrespective of job function, seniority, organization size, or geographical location, there are a few common pieces of infrastructure that companies should have in place so that staff can put their training to good use once they have completed a Python for data analysis course.
3 Things for a Successful Python Roll-out
Again, bearing in mind that I am talking about giving non-developers access to basic programming tools and environments, the three extensions I would recommend are:
Jupyter Lab Debugger
Sometimes people will need to debug a cell of code, inspect a variable, or step over a for loop, whether to fix a bug, to satisfy their curiosity, or to clarify to themselves exactly what is going on. This debugger for Jupyter notebooks is a must.
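To make that concrete, here is a minimal sketch of the kind of cell an analyst might want to step through. The trade data and the `running_exposure` function are invented for illustration and are not taken from any real course material. With a breakpoint set on the marked line, a user could watch `total` change on each pass through the loop.

```python
# Hypothetical example data: (trade id, notional) pairs, invented for illustration.
trades = [("T1", 1_000_000.0), ("T2", -250_000.0), ("T3", 400_000.0)]


def running_exposure(trades):
    """Return the cumulative notional after each trade."""
    total = 0.0
    history = []
    for trade_id, notional in trades:
        total += notional  # a useful line for a breakpoint: watch `total` change
        history.append((trade_id, total))
    return history


exposure = running_exposure(trades)
print(exposure)
```

Stepping through a loop like this one line at a time is often the quickest way for someone coming from spreadsheets to see how a running total is built up.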
Jupyter Lab Variable Inspector
Closely related to the debugger, this feature allows users to inspect and change variables in their notebook.
Jupyter Lab Spreadsheet
Your target audience will no doubt know spreadsheets. Jupyter does not come with built-in support for viewing spreadsheets, so you will need something to stop your users from chopping and changing between different tools such as MS Office, Google Sheets, etc. This add-in satisfies that need and keeps users in a single place, under the Jupyter roof.
That’s it for now, just a quick tour of what companies can do to help support their staff immediately before or after an often-stressful upskilling program. In summary:
- Choose a well-tested and popular Python distribution – I would recommend Anaconda.
- Use a centrally located server as opposed to local installs – here I recommend JupyterHub.
- Use a small number of simple but extremely useful extensions – here I am suggesting a debugger, a variable inspector, and a spreadsheet viewer.
Keep an eye out for my next posts on the topic of training for analysts. I’m going to give a few hints and tips on how to plan and manage a training course for your staff. I’ll be giving advice and food for thought on topics such as:
- Delivery format – in-house, virtual, hybrid
- Frequency of training and follow-up
- Types of data to use – time series, geospatial, CSV, RDBMS
- Lab exercises and project work
About the author
Pat McKillen is our most experienced fintech and data professional with a winning mix of financial markets, application development and data analytics experience. He is passionate about education and about sharing both his capital markets and his technology expertise.