LECTURE 1: INTRODUCTION
(DATA-DRIVEN MODELING)
SCIENTIFIC DATA COMPUTING MTAT.08.042
Prepared by:
Amnir Hadachi
Institute of Computer Science, University of Tartu
LECTURE 1: INTRODUCTION
OUTLINE
▸ Course Syllabus
▸ General overview
▸ Modelling
▸ Types of modelling
▸ data-driven modelling
▸ statistical models
▸ soft computing model
▸ Spatio-temporal complexity
▸ Type of data
▸ Probability distributions
1. COURSE SYLLABUS
▸ Rules and regulations
▸ Attendance at the lectures and labs is not mandatory.
▸ However, attending the labs and respecting the deadlines for the lab tasks is recommended (one week per task).
▸ Lecture slides and video recordings will be made available during the week in which the lecture is scheduled.
▸ Course website: https://courses.cs.ut.ee/2016/SDC/spring/Main/HomePage
▸ Topics covered:
▸ Statistical methods and their applications
▸ Linear algebra and singular value decomposition
▸ Basic optimization
▸ Image processing and analysis
▸ Compressed sensing
▸ Text processing
▸ Time series analysis and wavelets
▸ Course instructor:
▸ Lecturer Amnir Hadachi
▸ Office Ulikooli 17, Room 327.
▸ Office hours:
▸ Friday from 10h till 14h.
▸ Tuesday from 9h till 12h.
2. GENERAL OVERVIEW
▸ General approach
DEFINITION:
The term “modeling” usually refers to the process of representing a real-world object or phenomenon as a set of mathematical equations.
A system is modeled for one of two purposes: estimating or predicting the system's behavior and its response to changing factors.
[Diagram: modelling approach. Elements: input data X, process to be modeled, model, observed output variable, predicted output variable.]
3. MODELLING
▸ Types of modeling
[Diagram: models divide into physical and mathematical; a further level lists analytical, conceptual and data-driven models.]
▸ Example “Electro-Dynamic Vibration Exciter”
[Figure: physical system]
[Figure: physical model, with the labels "flexure springs" and "spring"]
Observed effects (two electromechanical effects):
Motor effect: passing a current through the coil causes it to experience a magnetic force proportional to the current.
Generator effect: motion of the coil inside the magnetic field induces in the coil a voltage proportional to the velocity.
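The two effects can be written as proportionality relations. This is only a sketch: the symbols $F$ (force on the coil), $i$ (coil current), $e$ (induced voltage), $v$ (coil velocity) and the constants $K_f$ and $K_e$ are notation assumed here for illustration, not taken from the slides.

```latex
F = K_f \, i \qquad \text{(motor effect)}
\\
e = K_e \, v \qquad \text{(generator effect)}
```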
Mathematical model (System equation)
▸ Data-driven modeling
▸ Role:
▸ It lets us map causal factors to their outcomes by observing patterns in experimental data, without having to understand the complex underlying physical process.
DEFINITION: A model that simulates a system using experimental data from that system is known as a data-driven model.
▸ Purposes of data-driven modeling:
▸ Data clustering and classification
▸ Estimating the outcome
▸ Forecasting or Predicting the outcome
▸ Optimisation
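As one possible illustration of estimating and predicting an outcome purely from data, the sketch below fits a straight line to a handful of measurements by ordinary least squares and uses it to predict a new value. The data values and the names `fit_line` and `predict` are hypothetical, chosen only for this example; a straight line is just one of many model forms that could be used.

```python
# Hypothetical measurements (illustrative values only): input x, observed output y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """Predict the output for a new input using the fitted line."""
    return slope * x + intercept


print(f"y = {slope:.3f} * x + {intercept:.3f}")
print(f"prediction at x = 6: {predict(6.0):.2f}")
```

No knowledge of the process that produced the data is used here; the relationship is learned from the observations alone.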
▸ Characteristics:
▸ inexpensive
▸ accurate
▸ precise
▸ flexible (compared to physical models or analytical models)
▸ There are two groups of data-driven models:
▸ Statistical modeling
▸ Soft computing (also known as artificial intelligence)
▸ A statistical model comprises:
▸ Deterministic variables
▸ defined by a mathematical model and synthesized data
▸ Random variables
▸ defined by a probabilistic model
▸ parametric (described by a fixed set of parameters, e.g. mean and standard deviation)
▸ non-parametric (no fixed set of distribution parameters is assumed)
▸ Soft computing model
▸ The principal constituents of soft computing are neuro-computing, fuzzy logic and genetic algorithms.
▸ It tolerates imprecision, partial truth, approximation and uncertainty.
EXAMPLE:
Soft computing can help answer the question: what is a suitable room temperature to make people feel comfortable?
source: http://www.newyorker.com/tech/elements/is-your-thermostat-sexist (illustration by Tomi Um)
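To make the idea of partial truth concrete, here is a minimal fuzzy-logic sketch for the room temperature question. The triangular membership functions and every breakpoint temperature are assumptions made up for illustration; they do not come from the slides or from the cited article.

```python
def triangular(x, left, peak, right):
    """Triangular fuzzy membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def membership(temp_c):
    """Degree (0 to 1) to which a temperature in Celsius belongs to each set.

    The breakpoints below are hypothetical.
    """
    return {
        "cold": triangular(temp_c, 5.0, 14.0, 20.0),
        "comfortable": triangular(temp_c, 18.0, 22.0, 26.0),
        "hot": triangular(temp_c, 24.0, 30.0, 40.0),
    }


for t in (16.0, 19.0, 22.0, 25.0):
    degrees = membership(t)
    print(t, {name: round(d, 2) for name, d in degrees.items()})
```

A temperature such as 19 °C is partly "cold" and partly "comfortable" at the same time, which is the kind of imprecision a crisp true/false rule cannot express.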
▸ Spatio-temporal complexity
▸ Model complexity can be defined in space and time.
▸ The model is defined by two characteristics:
▸ space
▸ time
▸ This is very important when studying natural phenomena or any event that changes dynamically in space and time.
▸ Example: Travel Time Estimation
Problem statement: estimate the travel time per section.
Knowing:
▸ Two GPS positions, plus the speed and the heading
▸ The positions of all detected crossroads
Steps to follow:
▸ Moment of passage
▸ Conclude the travel time estimation per section
[Figure: road section between points A and B]
The state equation:
First objective: estimate the moment of passage
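The state equation from the slide is not reproduced here. As a much simpler stand-in, the sketch below estimates the moment of passage at a crossroad by linear interpolation between two GPS fixes, assuming constant speed between them, and then takes the travel time of a section as the difference between two passage moments. The function name, its parameters and all numbers are hypothetical; speed and heading measurements are ignored in this simplification.

```python
def passage_time(t_a, t_b, dist_a_to_b, dist_a_to_cross):
    """Moment of passage at a crossroad lying between GPS fixes A and B.

    Assumes constant speed between the two fixes (linear interpolation).
    Times are in seconds, distances in metres measured along the road.
    """
    if dist_a_to_b <= 0.0:
        raise ValueError("the two fixes must be a positive distance apart")
    if not 0.0 <= dist_a_to_cross <= dist_a_to_b:
        raise ValueError("the crossroad must lie between the two fixes")
    fraction = dist_a_to_cross / dist_a_to_b
    return t_a + fraction * (t_b - t_a)


# Hypothetical fixes around the crossroad at the start of a section ...
t_enter = passage_time(t_a=0.0, t_b=10.0, dist_a_to_b=100.0, dist_a_to_cross=40.0)
# ... and around the crossroad at the end of the same section.
t_exit = passage_time(t_a=30.0, t_b=40.0, dist_a_to_b=120.0, dist_a_to_cross=90.0)

section_travel_time = t_exit - t_enter
print(t_enter, t_exit, section_travel_time)
```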
▸ Type of data
▸ discrete data
▸ continuous data
▸ spatial data
▸ temporal data
[Figure: example charts of the data types, one with a time axis running from April to July]
▸ How to develop data-driven models
▸ General approach (flowchart):
▸ Define the problem
▸ Model formulation
▸ Solving the model
▸ Check and validation of the model solution
▸ Give feedback on the model
▸ Decision on the model: OK, or NOT OK followed by corrections and adjustment
▸ Supporting activities: collect data, build assumptions, define variables, establish relationships, define functions and formulas
4. PROBABILITY DISTRIBUTIONS
DEFINITION: Let X be a continuous random variable. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with a ≤ b:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
source:http://philschatz.com/statistics-book/contents/m46965.html
REMARK: For f(x) to be a legitimate probability density function, it must satisfy the following conditions:
▸ $f(x) \ge 0$ for all x
▸ $\int_{-\infty}^{\infty} f(x)\,dx = 1$
QUESTION:
If you toss a die, what is the probability that you roll a 3 or less?
1. 1/6
2. 1/3
3. 1/2
4. 5/6
5. 1
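One way to check the answer is to count outcomes directly. The short sketch below assumes a fair six-sided die, so that every face is equally likely; the variable names are chosen only for this example.

```python
from fractions import Fraction

faces = range(1, 7)  # faces of a fair six-sided die
favourable = [f for f in faces if f <= 3]  # rolls of 3 or less

p_three_or_less = Fraction(len(favourable), len(faces))
print(p_three_or_less)  # 1/2
```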
QUESTION:
Two dice are rolled and the sum of the face values is equal to 6. What is the probability that at least one of the dice came up with a 3?
1. 2/3
2. 1/3
3. 5/6
4. 1/5
5. 1
Outcomes with a sum of 6: 1-5, 5-1, 2-4, 4-2, 3-3. Only 3-3 contains a 3, so the probability is 1/5.
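The same conditional probability can be obtained by enumerating all rolls of two dice and keeping only those whose sum is 6. This sketch assumes two fair, distinguishable dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))      # 36 equally likely outcomes
sum_is_six = [r for r in rolls if sum(r) == 6]    # the conditioning event
with_a_three = [r for r in sum_is_six if 3 in r]  # at least one die shows a 3

p = Fraction(len(with_a_three), len(sum_is_six))
print(sum_is_six)
print(p)  # 1/5
```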
EXAMPLE:
Suppose we have a clock that indicates the time. However, the clock has a malfunction that makes it stop at a random time during the day.
Suppose X is the time at which the clock stops. Can you define the pdf of X?
EXAMPLE: Suppose we have a computer running a computation and showing results at the same time. However, the computer has a malfunction that makes it stop at a random time during the day.
Suppose X is the time at which the computer stops. Can you define the pdf of X?
SOLUTION:
With x measured in hours since midnight, $f(x) = \frac{1}{24}$ for $0 \le x \le 24$, and $f(x) = 0$ otherwise.
What is the probability that the computer will stop between 9:00am and 9:45am?
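SOLUTION:
With time in hours and a stopping time that is uniform over the 24-hour day:
$P(9 \le X \le 9.75) = \int_{9}^{9.75} \frac{1}{24}\,dx = \frac{0.75}{24} = \frac{1}{32} \approx 0.031$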
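For a uniform stopping time, the probability of an interval is simply its length divided by the length of the whole day. The sketch below computes this with exact fractions; the function name `uniform_interval_probability` and the choice of minutes since midnight as the time unit are assumptions made for this example.

```python
from fractions import Fraction


def uniform_interval_probability(a, b, low, high):
    """P(low <= X <= high) for X uniform on [a, b].

    Computed as (length of the overlap with [a, b]) / (b - a).
    """
    if b <= a:
        raise ValueError("need a < b")
    overlap = max(0, min(high, b) - max(low, a))
    return Fraction(overlap) / Fraction(b - a)


# Time in minutes since midnight; one day has 24 * 60 minutes.
p_stop = uniform_interval_probability(0, 24 * 60, 9 * 60, 9 * 60 + 45)
print(p_stop, float(p_stop))  # 1/32 0.03125
```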
DEFINITION:
If X is a continuous random variable, X is said to have a uniform distribution on the interval [A, B] if the pdf of X is:
$f(x; A, B) = \frac{1}{B - A}$ for $A \le x \le B$, and $f(x; A, B) = 0$ otherwise.
▸ Random variable
▸ Discrete random variables have a countable number of outcomes
▸ Continuous random variables have an infinite continuum of possible values.
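▸ Discrete variable
▸ The probability distribution for a discrete random variable X consists of:
Possible values: $x_1, x_2, \dots, x_n$
Corresponding probabilities: $p_1, p_2, \dots, p_n$
with the interpretation that: $P(X = x_i) = p_i$
▸ Where,
▸ X is the variable
▸ $x_1, \dots, x_n$ are the values
▸ Each $p_i \ge 0$ and $\sum_i p_i = 1$
[Figure: bar chart of a discrete probability distribution]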
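▸ Summary statistics for a discrete random variable with values $x_i$ and probabilities $p_i$:
▸ Mean value or expected value: $\mu = E[X] = \sum_i x_i p_i$
▸ Variance: $\sigma^2 = \sum_i (x_i - \mu)^2 p_i$
▸ Standard deviation: $\sigma = \sqrt{\sigma^2}$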
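The three summary statistics can be computed directly from a table of values and probabilities. The distribution below is hypothetical, chosen only so that the numbers are easy to check by hand.

```python
import math

# Hypothetical discrete distribution: possible values and their probabilities.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

# A valid distribution has non-negative probabilities that sum to 1.
assert all(p >= 0 for p in probs)
assert math.isclose(sum(probs), 1.0)

# Mean: probability-weighted average of the values.
mean = sum(x * p for x, p in zip(values, probs))
# Variance: probability-weighted average squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)
```

For this table the mean is 1.7, the variance 0.81 and the standard deviation 0.9, up to floating-point rounding.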
▸ Continuous variable
▸ A continuous random variable can take any value in some interval.
▸ Thus, the probability that X takes any exact or single value is equal to zero.
▸ The probability for a continuous random variable X is computed over a range of values or interval:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
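To see why a single value has probability zero, one can integrate a pdf over ever narrower intervals around that value. The sketch below does this numerically with the midpoint rule, using a uniform pdf over a 24-hour day as the example density; the function names and the step count are assumptions made for this illustration.

```python
def pdf(x):
    """Uniform density on [0, 24] hours, used here only as an example pdf."""
    return 1.0 / 24.0 if 0.0 <= x <= 24.0 else 0.0


def interval_probability(f, a, b, steps=1000):
    """Approximate P(a <= X <= b) by integrating f with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# Shrink an interval centred on x = 9: the probability shrinks with it.
for half_width in (1.0, 0.1, 0.01, 0.0):
    p = interval_probability(pdf, 9.0 - half_width, 9.0 + half_width)
    print(half_width, p)
```

When the interval collapses to the single point x = 9, the computed probability is exactly zero.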