Cambridge Spark
Data Science Training for Professionals

www.cambridgespark.com
https://blog.cambridgespark.com
@cambridgespark


What are your objectives?

- Hire Data Science talent

- Close Data Science skills gaps

- Build Data Science skills at scale

Our offerings:

- Team Training Courses

- Data Science Seminars for Executives

- Data Science Apprenticeship Programmes

- Data Science Graduate Recruitment and Assessment

- Data Science Graduate Training

- K.A.T.E.®

About Cambridge Spark

- Help professionals become more employable in the tech industry

- Specialise in Applied Data Science and Python for industry needs

- Track record of training thousands of professionals through our bootcamps and corporate training

- Provide bespoke Data Science & Machine Learning consulting engagements

- Facilitate networking and community development through conferences across the UK, meetups and webinars

- Developed proprietary AI-powered technology training and assessment platform: K.A.T.E.®

- Cambridge Spark’s team are specialists in their fields and come from a range of backgrounds, including PhDs and Researchers in Machine Learning from Cambridge University, Oxford University and UCL as well as developers with industry experience from leading companies including Google, Microsoft Research, Amazon, Morgan Stanley and Oracle.


Selection of Clients and Engagements

- University Tech Recruitment Talent Pipeline

- JPMC Data Science Academy Internal Skills Assessment

- Applied Data Science Part-time Bootcamp

- Two week Internal Data Science Bootcamp

- Applied Data Science Bootcamp Project Partner

- Data Science Training for Executives

- Introduction to Data Science in Python Private Training Course

- University Tech Recruitment Talent Pipeline

- Applied Data Science Bootcamp Project Partner

- Data Science Graduate Recruitment

Our Process

Training
We offer in-person and online Data Science training courses for professionals. Our modularised curriculum includes comprehensive lectures led by industry experts, interactive Python notebooks with exercises and solutions, and industry projects to complete using K.A.T.E.®.

Assessment
K.A.T.E.® is a proprietary adaptive assessment learning system. When training with Cambridge Spark, participants use K.A.T.E.® to conduct continuous learning projects and receive immediate feedback on their code submissions. This ensures your employees get the personalised support they need to maintain their momentum.

Analytics
K.A.T.E.® features an analytics and monitoring dashboard for management and stakeholders to track employee progression, performance and scores on project assignments to ensure ROI along the learning journey.


Augment your team’s Data Science skills


Short, Intensive Onsite Training Courses

Quickly upskill employees and target specific skill gaps in your team

To support efficient Data Science skills acquisition, we offer short, intensive training programmes delivered at your office. Clients use this approach:

- to stay ahead and employ new Data Science methods on client projects

- to improve team productivity and data-driven decision-making

Programme Summary:

- In-person training delivered at your office by an industry expert

- Highly practical course materials focused on the workflow of a Data Scientist, including:
  - Interactive Python notebooks with exercises and solutions
  - Continuous learning projects to complete using K.A.T.E.®

- Covers the most industry-relevant skills and cutting-edge techniques in Data Science that can be applied to:
  - Make better decisions faster
  - Lower costs with increased operational efficiency
  - Increase revenue by developing new product offerings

Python Courses

Pragmatic Python Programming (2 day course)
Python fundamentals, interacting with APIs, data processing and code architecture

Python for Analysts (2 day course)
Intensive conversion training courses for developers writing and maintaining Python projects in Production.

Python for Engineers (2 day course)
Intensive Python conversion training course for developers familiar with C

Data Science Courses

Core Data Science (3 day course)
Exploratory Data Analysis and visualisation, Feature Engineering, Unsupervised Learning and Supervised Learning

Machine Learning and Deep Learning (2 day course)
Model evaluation, decision trees, random forests, logistic regression, SVMs, neural networks and deep learning

Apache Spark in Practice (2 day course)
Learn how to work with Spark (RDDs, Dataframes and SQL) and Parquet

Pragmatic Model Interpretation (1 day course)
LIME, SHAP, ELI5 and best practices for interpreting and evaluating machine learning models

Regression and Time Series Analysis (2 day course)
Linear regression, nonlinear regression, auto-regressive models, time series analysis and regularisation

Bespoke courses are designed according to your needs and specific training requirements


Specialised Data Science Courses

Cloud Computing and Databases (2 day course)
The course covers working with SQL to query structured data, and with NoSQL databases. The NoSQL part focuses on document-oriented storage and graph databases.

Neural Networks and Deep Learning (2 day course)
This course is designed to ensure your team gains a strong understanding of how neural networks are constructed and how to apply them using Keras. Participants will learn about neural network architectures, ranging from simple single-layer architectures to deep learning architectures including CNNs and RNNs.

Natural Language Processing (2 day course)
This course is designed to provide your team with hands-on training in text processing, semantic analysis and sophisticated Machine Learning approaches.

Recommender Systems, Interpretability and Evaluation (2 day course)
This course is split into three core sections: building Recommender Systems, followed by model evaluation and interpretability. We cover different types of Recommender Systems such as collaborative, content-based and hybrid recommender systems.

Bespoke courses are designed according to your needs and specific training requirements

Data Science Executive Seminars

Quickly gain the skills you need to effectively manage your Data Science team

With companies becoming increasingly data-driven, Data Science Executive Seminars prepare executives and business leaders to make informed decisions about where and how to leverage Data Science. This approach was used by Cambridge Spark client McKinsey & Company.

Programme Summary:

- Intensive, on-site seminars

- Designed to support executives and business leaders in adapting and updating their skills to meet the demands of the business

- Topics include:
  - The practical application of these techniques translated into concrete value
  - How businesses must adapt and become data-driven to survive
  - Key actions you can take to implement effective data-driven transformation
  - Crucial questions to tackle as a leader in this technological landscape, covering culture, strategy, infrastructure and talent

- Centre around use cases, industry case studies and roundtable discussions to explore ideas and challenges

- Bespoke seminars are designed according to your needs and specific training requirements

Data Science for Executives


This intensive, half-day seminar focuses on demystifying what Data Science is and its application and benefits in industry through a mixture of case studies, practical demonstration and interactive exercises.

Big Data
- Context
- Defining Big Data: Volume, Velocity, Variety, Veracity, Value
- Analytical Capabilities: Descriptive, Diagnostic, Predictive, Prescriptive
- Cloud Computing

Data Science and Machine Learning
- Why you should care and what is it?
- How to be smarter and more productive than Excel using Python
- Data Science Life Cycle
- Data Science and Machine Learning Technologies

Artificial Intelligence
- Deep Learning in a nutshell: how does it work?
- Case Studies and demo

Natural Language Processing
- Data & user analytics
- Language modelling
- Topic modelling
- Summarisation
- Question generation

Becoming a Data-driven Organisation
- Data-driven Products
- Strategy
- Infrastructure
- Culture
- Talent
- Privacy

Build Data Science skills at scale


About Apprenticeship Levy Funding

Background
- “Free” training for companies, offset from tax (“Apprenticeship Levy”)
- The apprenticeship levy is paid by organisations with a staff bill of over £3m/annum
- 0.5% of staff bill is paid monthly into the company's levy account
- Money can be used for approved apprenticeship training programmes
- Funding expires 24 months after being paid to HMRC if not used for training
- Only valid within the UK, with separate schemes in each devolved nation
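As a rough illustration of the figures above, the levy arithmetic can be sketched in a few lines of Python. This is a simplified sketch, not an official calculation: it applies the flat 0.5% rate to the whole staff bill exactly as summarised here and ignores any allowances or adjustments in the official rules. The function names `monthly_levy` and `expiry_month` are hypothetical and chosen only for this example.

```python
from decimal import Decimal

LEVY_RATE = Decimal("0.005")         # 0.5% of the staff bill, as stated above
LEVY_THRESHOLD = Decimal("3000000")  # levy applies above a £3m annual staff bill
EXPIRY_MONTHS = 24                   # funds expire 24 months after payment


def monthly_levy(annual_staff_bill):
    """Rough monthly levy payment in pounds for a given annual staff bill.

    Simplification (assumption): the flat rate is applied to the whole
    bill, and any allowances in the official rules are ignored.
    """
    bill = Decimal(annual_staff_bill)
    if bill <= LEVY_THRESHOLD:
        return Decimal("0.00")
    return (bill * LEVY_RATE / 12).quantize(Decimal("0.01"))


def expiry_month(paid_year, paid_month):
    """Return the (year, month) in which funds paid in a given month expire."""
    months = paid_year * 12 + (paid_month - 1) + EXPIRY_MONTHS
    return months // 12, months % 12 + 1
```

Under these assumptions, an organisation with a £6m annual staff bill would pay `monthly_levy(6000000)`, i.e. £2,500.00 per month, and funds paid in January 2020 would expire in January 2022.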

Curriculum standards
- Highly regulated, with many restrictions and rules to follow on both curriculum and delivery (20% time)
- There is a curriculum standard to follow, the “Data Analyst Apprenticeship standard”: https://www.instituteforapprenticeships.org/apprenticeship-standards/data-analyst

We are using this standard to match the Apprenticeship Levy and provide additional optional advanced training and assessment to deliver the most relevant curriculum possible.


Data Science Apprenticeship Curriculum

Achieve wider organisational transformation by building an internal Data Science Academy

If you’re an organisation paying the apprenticeship levy, and you need to bring new data skills into the organisation, then the levy funding represents an opportunity to create an internal data training academy to upskill cohorts of relevant staff. This approach is used by Cambridge Spark client JPMorgan to level up their current analysts.

Programme Summary:

- 13-month flexible programme with continuous support, designed to build Data Science skills and capabilities at scale

- 6 months Core Data Science (Level 1); 6 months optional advanced topics (Level 2)
  - Online platform with live-coded video lectures, Python notebooks with exercises and solutions, practical projects and reading materials
  - Each online content module requires approximately 10 hours per week to complete
  - Personalised, immediate feedback on assignments via K.A.T.E.
  - Bi-weekly diagnostic group sessions
  - Slack channels for ongoing communication, collaboration and peer-to-peer learning

- Follows a modularised structure of levels that allows individuals with different experience to obtain the training best suited to their background

- Participants gain the Data Science Associate certification from EMC

Apprenticeship Curriculum

Level 1: Compulsory Modules
- Module 1 - Building Foundations: Tools, Mathematics, and Best Practices for Data Scientists (4 wks)
- Module 2 - Exploratory Data Analysis, Feature Engineering and Linear Regression (4 wks)
- Module 3 - Classification, Hyperparameters Tuning and Model Evaluation (4 wks)
- Module 4 - Time Series Analysis and Clustering (4 wks)
- Module 5 - Cloud Computing and Databases (3 wks)
- Module 6 - Processing Big Data (3 wks)
- Capstone Project (4 wks)

Level 2: Optional Specialised Modules
- Module 7 - Ensemble Methods and Machine Learning for Production (6 wks)
- Module 8 - Neural Networks and Deep Learning (6 wks)
- Module 9 - Natural Language Processing (6 wks)
- Module 10 - Recommender Systems, Pragmatic Evaluation and Model Interpretability (6 wks)
- End Assessment and Certification

Prerequisites:
- Prior Python programming experience
- Mathematical aptitude including fundamentals of linear algebra and statistics
- Individuals should plan to apply these skills in their job

An initial skill level assessment will be carried out by using K.A.T.E.® to baseline knowledge.

Hire Data Science Talent


Graduate Recruitment

Find and hire the best candidates amid increasing competition for top technical talent

Through bespoke workshops, courses and competitions, we help you attract, assess and select high quality candidates who have a specific interest in graduate positions or internship opportunities within your company. This is a talent acquisition process used by Cambridge Spark clients IMC Financial Trading and HSBC. Our academic network includes leading universities.

Programme Summary:

- Cambridge Spark will implement an innovative talent pipeline strategy to attract talent from top UK universities. This comprises:
  - Designing a graduate workshop experience aligned with your hiring requirements
  - Promoting and marketing the workshop to potential candidates at top tier UK universities of your choice
  - Selecting students that fit the candidate profile to attend the workshop
  - Delivering an in-house workshop, which immerses students in your company culture
  - Concluding the day with a competition to identify the top candidates

- K.A.T.E.® will be used to assess and evaluate skills and aptitude in Python programming and mathematics fundamentals

- K.A.T.E.® offers a real-time monitoring dashboard to track progress and top performing candidates


Case Study: IMC Trading

Problem:

IMC Trading was looking for a solution to efficiently assess and evaluate technical candidates as part of an innovative programming competition. The competition required potential candidates to implement quantitative finance strategies. However, manually reviewing code to ensure correctness, assessing code performance and providing suitable technical infrastructure for students is complex and hugely time-consuming.

Solution:

K.A.T.E.® provided the solution to this problem by making the whole process of submitting and testing solutions smooth and robust. Using K.A.T.E.®, IMC launched a Quantitative Finance challenge, allowing candidates to experience training their own option pricing models on real-world data. The challenge drew 350+ applicants, 50 of whom were selected to take part in the competition.

Results:

“The K.A.T.E. dashboard was useful for us. We were able to keep an eye on the general progress of all the teams. Using the dashboard, we could also look into the code submissions directly to spot talented students,” said Heiko Schäfer, Quantitative Developer, IMC Trading. “We have been running workshops for students for 3 years now. Culmination is a coding competition to implement a simple numerical option pricing model in Python. This year we used K.A.T.E. for the first time and the result was very positive: we saw more teams finish with a working model than the years before.”

“The instant feedback from K.A.T.E. enabled the students to iterate more frequently in their development. It was also noticeable that K.A.T.E. helped the less computer savvy of the students a lot,” said Heiko Schäfer. “Technical problems with Python versions and installations did not impede the competition as development with K.A.T.E. does not depend on locally running Python. Overall K.A.T.E. helped significantly to make our competition a success.”

Case Study: HSBC

Problem:

HSBC was looking to attract students from top UK universities to fill new Data Science graduate roles and establish a Data Science talent pipeline. However, when it came to acquiring top technical talent, traditional recruitment methods such as sponsorship and careers fairs proved to be inefficient.

Solution:

To attract the most qualified graduates, Cambridge Spark worked with HSBC Global Banking and Markets to create an innovative graduate recruitment experience showcasing HSBC’s Data Science initiatives and career opportunities. This approach comprised three parts:

1. Marketing and engagement
2. Applicant screening and technical assessment
3. Target candidate identification

Results:

Cambridge Spark designed a bespoke one-day seminar on "Machine Learning in Finance: Time Series Forecasting in Python". The seminar generated 300+ applications from HSBC’s chosen universities.

To manage the influx of applications, K.A.T.E.® was used to handle candidate prospecting and selection. Candidates were required to complete Python challenges on K.A.T.E.® to assess and evaluate their skills and aptitude in Python programming and mathematics fundamentals.

80 students were selected to take part in the onsite Machine Learning in Finance seminar and hackathon-style competition. K.A.T.E.® was used to conduct the competition and evaluate candidate code submissions. As the event took place, HSBC stakeholders could identify top-performing students in real-time via the analytics dashboard.

Graduate Assessment

Automate assessment to improve productivity and find the best candidates possible

Help your learning and development team become more efficient by eliminating time spent on technical assessment and marking. K.A.T.E.® automatically evaluates each individual’s capabilities aligned with industry software development practices and Machine Learning methodologies. This software solution is used by Cambridge Spark client, JPMorgan, to manage assessment at scale.

Using K.A.T.E.® for Assessment and Evaluation, key features:

- Assesses the accuracy and quality of each code submission, focusing on production ready Data Science code standards
- Provides a real-time Monitoring and Analytics Dashboard to measure both employee engagement and performance
- Produces immediate feedback reports on PEP8, Pylint, Cyclomatic Complexity, Unit Tests and Machine Learning metrics
- Supports Python (scikit-learn, keras, pandas, numpy), SQL, Spark
- Supports pluggable evaluation metrics to provide you with custom diagnostics

“The K.A.T.E.® environment provides progressive exercises with immediate feedback and scoring. Our experience is this really helps the participants to stay motivated and focussed.”
Dr. Stephen Simmons, Athena Research, JPMorgan


Graduate Engagement: Run an Innovative Hackathon

Stand out in a competitive Data Science talent landscape with cutting-edge events

Looking to run programming-based hackathons to inspire university graduates? K.A.T.E.®’s assessment and evaluation functionalities will help you create the best possible student experience.

To support your technical hackathons, K.A.T.E.® provides an innovative solution to:

- enable students to learn how to design algorithms, run tests and see their results
- remove technical challenges such as version and installation problems with Python
- identify top talents, in real-time, based on their coding skills and accuracy of their models

“Every student team used K.A.T.E. to develop their pricing model. I think K.A.T.E. made the whole process of submitting and testing solutions very smooth. Previously, we had struggled to help all the teams but now they could figure a lot out themselves, immediately. The vast majority of groups had a working implementation in the end and this was to a large part due to K.A.T.E.”

“K.A.T.E. helped significantly to make our option pricing hackathon competition using Python a big success”
Heiko Schäfer, Quantitative Developer, IMC Trading

Graduate Training Programmes

Minimise time-to-productivity for new graduates with practical training and project-based assessments

Cambridge Spark onboarding provides tailored training experiences that help graduates quickly get up to speed and build confidence in the relevant Data Science tools, techniques and practices for their role.

Programme Summary:

- Two onboarding pathways tailored to different backgrounds and learning objectives: Data Analyst and Data Scientist

- Modularised and practical learning with continuous support
  - In-person training workshop led by industry experts
  - Online platform with continuous learning video lectures, tailored to each individual's learning objectives
  - Interactive Python notebooks with exercises and solutions
  - Personalised, immediate feedback on projects via K.A.T.E.®
  - Mentorship and coaching from the Cambridge Spark team
  - Integrated support via our ticketing system to allow students to ask for specific feedback

- Graduates gain industry experience on projects throughout, e.g.:
  - Kickstarter campaign success predictor
  - Bitcoin time series forecasting
  - Customer segmentation
  - Exploratory Data Analysis using PySpark


Data Analyst Pathway

Prerequisites:
- Prior programming experience
- Mathematical aptitude including fundamentals of linear algebra and statistics

An initial skill level assessment will be carried out by using K.A.T.E.® to baseline knowledge.

Duration: Customisable or a recommended duration of 8 weeks

Each module is followed by a K.A.T.E. Project-based Assessment.

Core
- Python Module 1 - Python Fundamentals
- Python Module 2 - Structuring Python Code: OOP, Testing, Debugging
- Python Module 3 - Advanced Python: Code Quality, Functional Programming, Packages, Modules
- Python Module 4 - Python for Engineers: Web APIs, Type Checking, SDLC

Level 1
- Module 1 - Building Foundations: Tools, Mathematics, and Best Practices for Data Scientists
- Module 2 - Exploratory Data Analysis, Feature Engineering and Linear Regression
- Module 3 - Classification, Hyperparameters Tuning, and Model Evaluation
- Module 4 - Cloud Computing and Databases
- Module 5 - Processing Big Data
- End Assessment and Certification

Data Science Pathway

Duration: Customisable or a recommended duration of 12 weeks

Each module is followed by a K.A.T.E. Project-based Assessment.

Core
- Python Module 1 - Python Fundamentals
- Python Module 2 - Structuring Python Code: OOP, Testing, Debugging
- Python Module 3 - Advanced Python: Code Quality, Functional Programming, Packages, Modules
- Python Module 4 - Python for Engineers: Web APIs, Type Checking, SDLC

Level 1
- Module 1 - Building Foundations: Tools, Mathematics, and Best Practices for Data Scientists
- Module 2 - Exploratory Data Analysis, Feature Engineering and Linear Regression
- Module 3 - Classification, Hyperparameters Tuning, and Model Evaluation
- Module 4 - Cloud Computing and Databases
- Module 5 - Processing Big Data

Level 2
- Module 6 - Time Series Analysis and Clustering
- Module 7 - Ensemble Methods and Machine Learning for Production
- Module 8 - Neural Networks and Deep Learning
- Module 9 - Natural Language Processing
- Module 10 - Recommender Systems, Pragmatic Evaluation and Model Interpretability
- End Assessment and Certification

Ensuring ROI


Get the most out of your training using K.A.T.E.®

We invest in ensuring your training investments are effective, placing importance on:

Inspiring and helping your employees want to learn
- K.A.T.E.® (Knowledge Assessment Teaching Engine) provides immediate and personalised feedback to students to accelerate progress
- Offers personalised sets of exercises to help bridge skill gaps quicker and more effectively
- Generates personalised further reading recommendations to support continuous learning

Alignment with the core business needs and objectives
- K.A.T.E.® supports custom content to match projects to your training requirements
- Provides a fully integrated solution supporting git for version control
- Performs source code analysis covering correctness, quality, convention and machine learning metrics
- Reports relevant learning analytics and diagnostics to benchmark code submissions

The transfer of knowledge into a Production environment
- K.A.T.E.® is uniquely focused on production-ready Data Science code rather than "toy" learning examples and exercises
- Immerses participants in a simulated work environment

Further information about K.A.T.E.® can be viewed at: https://edukate.ai

K.A.T.E.® for Learning and Development

Upskill, reskill and onboard team members quickly and effectively

K.A.T.E.® provides structured, expert-led data science and coding projects, enabling you to build internal Data Science capability and close technical skill gaps at scale.

Using K.A.T.E.® for Learning and Development, key features:

- Customisable content supports your organisation’s needs and objectives

- Live-coded online videos, interactive exercises and project-based assignments offer an interactive, blended learning experience

- Instant personalised feedback and content recommendations to address each individual’s skill gaps

- Industry-simulated learning environment ensures the use of version control tools and coding best practices

- Monitoring and Analytics Dashboard enabling you to track progress and quantify improvement in technical proficiency


Training Consulting Assessment Talent Research

Dr. Raoul-Gabriel [email protected]