Data Science: Data Analysis Boot Camp
What is Data Science?

Chuck Cartledge, PhD

7 February 2020



Intro. What is DS? How about math? How about programming? How about domain expertise? Q & A Conclusion References

Table of contents (1 of 1)

1 Intro.

2 What is DS?

3 How about math?

4 How about programming?

5 How about domain expertise?

6 Q & A

7 Conclusion

8 References



What are we going to cover?

We’re going to talk about:

What is Data Science (DS)?

How much math does a DS person need?

How much programming expertise does a DS person need?

How much domain expertise does a DS person need?



Data science in a picture

Data science lives at the intersection of:

Math,

Domain expertise, and

Computer Science.

You need to be comfortable in all areas.

Image from [4].



Sexiest Job in the 21st Century? [1]

“. . . It’s a high-ranking professional with the training and curiosity to make discoveries in the world of big data. . . . Much of the current enthusiasm for big data focuses on technologies that make taming it possible, including Hadoop . . . and related open-source tools, cloud computing, and data visualization. . . .”

T. H. Davenport, and D. J. Patil [1]



Hammer and nails . . .

“. . . it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.”

Abraham H. Maslow [3]

Image from [6].



Claiming to be a data scientist implies mastery of different tools and technologies

The areas are wide and deep. Mastery is not necessary in all subareas.

Image from [5].



Same image.

Image from [5].



Difference between theory and application

Applicable techniques:

Machine learning

Statistical modeling

Experiment design

Bayesian inference

Supervised learning

Unsupervised learning


Need to know how to apply theory to real life.
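As one small illustration of taking a technique from the list above (Bayesian inference) and applying it to a practical question, the sketch below computes how likely a condition is given a positive test. The function name and all of the numbers are hypothetical, chosen only for illustration; they do not come from the course material.

```python
def posterior_given_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """Return P(condition | positive test) using Bayes' rule.

    prior        P(condition)
    sensitivity  P(positive | condition)
    specificity  P(negative | no condition)
    """
    # Probability mass of positives that really have the condition.
    true_positive = sensitivity * prior
    # Probability mass of positives that do not have the condition.
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Hypothetical inputs, made up for this example.
    p = posterior_given_positive(prior=0.01, sensitivity=0.95, specificity=0.90)
    print(f"P(condition | positive test) = {p:.3f}")
```

With these made-up inputs the posterior comes out a little under 9%, far lower than the 95% sensitivity might suggest. Knowing the formula is the theory; recognizing that the low prior dominates the answer is the application.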



Any language will work; some work better than others.

Computer science fundamentals

Scripting language (Python)

Statistical language (R)

Databases (SQL and NoSQL)

Relational algebra

Parallel databases and parallel query processing

MapReduce concepts

Hadoop and Hive/Pig

Custom reducers

Experience with XaaS (anything as a service) like AWS

Image from [2].

It all comes down to “hammers and nails.”
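The "MapReduce concepts" and "Custom reducers" items above can be sketched in a few lines of plain Python. This is a single-process stand-in for the idea, not the Hadoop API: the function names (`map_words`, `shuffle`, `reduce_sum`, `map_reduce`) and the sample input are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_sum(key: str, values: List[int]) -> Tuple[str, int]:
    """A simple custom reducer: total the values seen for one key."""
    return key, sum(values)


def map_reduce(
    lines: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within one process."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    counts = map_reduce(["a hammer and a nail", "a nail"], map_words, reduce_sum)
    print(counts)
```

Swapping `reduce_sum` for another function with the same signature (for example, one that returns `max(values)`) changes the aggregation without touching the map or shuffle steps, which is the appeal of a custom reducer.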



Subject matter experts (SMEs) may be hard to find.

Common characteristics:

Passionate about the business

Curious about data

Influence without authority

Hacker mindset

Problem solver

Strategic, proactive, creative, innovative, and collaborative

Often more interested in bettering the system than self.



Q & A time.

Q: Name two families whose kids won’t join the Marines.

A: The Halls of Montezuma and the Shores of Tripoli.



What have we covered?

Data Science is a wide and growing field

DS is multidisciplinary (math, programming, domain expertise)

A data scientist is interested in “teasing” understanding from the data

Next: What is R?



References (1 of 2)

[1] Thomas H. Davenport and D.J. Patil, Data scientist: The sexiest job of the 21st century,

sexiest-job-of-the-21st-century, 2012.

[2] Pedro Marcelino, Transfer learning from pre-trained models,

from-pre-trained-models-f2393f124751, 2018.

[3] Abraham H. Maslow, The Psychology of Science, Henry Regency, 1966.

[4] Syed Sadat Nazrul, Data science interview guide,

interview-guide-4ee9f5dc778, 2018.



References (2 of 2)

[5] CISC Staff, Data science & machine learning, 2019.

[6] Happiness Staff, Abraham Maslow,

happiness/abraham-maslow/, 2016.
