38
Fabio Ricciato, EUROSTAT Project Meeting on Measuring Human Mobility 27 - 29 March 2019, Tbilisi, Georgia Overview of current activities in MNO data at EUROSTAT

Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Fabio Ricciato, EUROSTAT Project Meeting on Measuring Human Mobility 27 - 29 March 2019, Tbilisi, Georgia

Overview of current activities in MNO data at EUROSTAT

Page 2: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

About EUROSTAT and ESS

•  Eurostat is the statistical office of the European Union, and is part of the European Commission

•  The European Statistical System (ESS) is the partnership between Eurostat, the national statistical institutes (NSIs) and other national authorities responsible in each Member State for the development, production and dissemination of European statistics.

Page 3: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

About me •  MS’99 Electrical Engineering, PhD’03 in Telecommunications,

from Univ. La Sapienza, Rome •  2004-2008 Senior researcher and project manager in FTW, Vienna

•  METAWIN & DARWIN projects on 3G data traffic monitoring •  2007-2013 Assistant Professor, Univ. of Salento, Lecce, Italy

•  Teaching Telecommunication Systems (2G/3G/4G networks) •  2013-2014 Head of Business Unit at AIT, Vienna

•  JRC study on density estimation from mobile network data •  2015-2017 Professor at Univ. of Ljubljana, Slovenia •  Since January 2018 – Statistical offices in EUROSTAT

Unit B1 “Innovation; Methodologies in Official Statistics”

Page 4: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Current activities on MNO data in Eurostat

•  Developing Reference Methodological Framework (RFM) for using MNO data for Official Statistics •  focus on presence & mobility patterns •  CDR and signalling data

•  Developing and comparing different methodological

variants for the density estimation problem

•  Exploiting Secure Multi-party Computation (SMC) for multi-MNO data fusion •  capacity building, technical and legal aspects

Page 5: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Ongoing Collaborations

•  Cooperation with MNO •  proximus (belgium)

•  Cooperation with NSI in the framework of ESSnet on Big Data •  Italy, Netherlands, France, Spain,…

Page 6: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Current activities on MNO data in Eurostat

•  Developing Reference Methodological Framework (RFM) for using MNO data for Official Statistics •  focus on presence & mobility patterns •  CDR and signalling data

•  Developing and comparing different methodological

variants for the density estimation problem

•  Exploiting Secure Multi-party Computation (SMC) for multi-MNO data fusion •  capacity building, technical and legal aspects

Page 7: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

controller

Core Network (CN)

Billing System

Radio Access Network (RAN)

RAN-SIG

CN-SIG

CDR

LBS

LBS

Page 8: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

MNO data

More informative higher spatial resolution, higher temporal frequency, better coverage

More costly to extract for MNO to interpret for statisticans

CDR

CN signalling

RAN signalling

tower locations

detailed radio configuration parameters

BSA maps

cell type & basic parameters

Transactional data

Auxiliary data

CN: Core Network RAN: Radio Access Network BSA: Best Service Area

Page 9: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Statistics S-Layer

MNO Data D-Layer

Convergence C-layer

Embeds raw MNO data §  Data Heterogeneity §  Diversity of data collection methods §  Complexity of data semantics §  Multiple MNOs

Definitions of statistical indicators §  Heterogeneity of applications & use-cases §  Diversity of statistical definitions §  Complexity of statistical objects §  Multiple NSIs

Few common definitions

Domain of expertise: statisticians, NSI

Domain of expertise: telecom engineers, MNO 6

RMF - hourglass model

Page 10: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Statistics S-Layer

MNO Data D-Layer

Convergence C-layer

Few common definitions

Multiple data sources: MNO#1, MNO#2… Different data types: CDR, signalling data, RAN data, LBS, …

Multiple data users: EUROSTAT, NSI#1, NSI#2… Different subject matter experts & use-cases: tourism, population, transport, …

raw microdata

standardised/uniformed microdata

aggregates, macrodata

RMF - hourglass model

Page 11: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Benefits of layering •  Decouples the complexity & heterogeneity of the 2 domains

ð  hides complexity & heterogeneity of MNO data to statisticians ð  hides complexity & heterogeneity of statistical concepts to

MNO engineers

•  Decoupling allows for independent development, adoption and evolution at each domain

•  C-layer = abstract “knowledge interface” between domains

= "common language" for the different actors across the two domains during the algorithm design phase

≠ physical interface for data export!

Decouple the DESIGN from the EXECUTION of algorithm: roles and interfaces in design phase ≠ roles and interfaces in execution phase

Page 12: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Algorithm design versus execution

C-path

Interpolated C-location at time t*

Density Map

Estimated max velocity in [t*,t**]

Living Location Mode-of-transport Road Congestion

… …

… Statistics S-Layer

Convergence C-Layer

MNO Data D-Layer

External data (e.g., land use maps, road network maps, admin registry etc.)

algorithms designed by telecom engineers

algorithms executed by MNO

algorithms designed

by statisticians

algorithms executed

by NSI

designed by statisticians & executed by MNO

Page 13: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

CN signalling

CDR

RAN signalling

LBS data … …

Tower locations

Statistics S-Layer

MNO Data D-Layer

C-location C-path Convergence

C-Layer

D2C Mapping functions how to produce C-trajectories &

C-locations from MNO data

C2S Processing functions how to extract statistics from the

C-trajectories

Cell type & configuration

Population density Usual place of living Tourism trip

Page 14: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

C-layer as a common substratum for MNO data users

MNO 1

NSI A

MNO 2 MNO 3

NSI B

MNO 4

NSI C

Convergence Layer

commercial analytic product

research institution

scrutiny validation

improvement innovation

private company

Users & Providers of Methodologies (NSO, researchers, private companies,…)

open-source code

non-personal data

processing components

Providers of Data (MNO)

Page 15: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

C-location

x

y

t

x,y spatial dimensions (latitude, longitude) t* reference time a*(j) best-guessed bounding area at time t* for

mobile user j

t*

a*(j)

Extended region (low spatial resolution) typically defined over a grid (rasterization)

Spatial (low) resolution: you do not have a point, but an extended area that is likely to contain the actual point

Page 16: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

C-path

x

y

t

x,y spatial dimensions (latitude, longitude) t1(j), t2(j) observation instants for mobile user j a1(j), a2(j) best-guessed bounding area at time t1,t2

for mobile user j

t1(j)

a1(j)

t2(j) t*

Observed or at sparse sampling times (low temporal resolution)

a2(j)

t1(j) and t2(j) are associated to individual mobile user j. They are dictated by the MNO network configuration, types of MNO data etc. logically they refer to the lower D-layer

t* is the reference time set by the S-layer expert (statistician), common for all mobile users (not depending on j)

Temporal (low) resolution: the desired reference time t* might not be included in the set of available observation times

Page 17: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

C-path

x

y

t

B1

B2

t1(j)

a1(j)

t2(j) t*

a2(j)

x,y spatial dimensions (latitude, longitude) t1(j), t2(j) observation instants for mobile user j a1(j), a2(j) best-guessed bounding area at time t1,t2

for mobile user j

B1(j) outer bounding area during the interval [t1,t2) (àLocation Area in GSM)

B2(j) outer bounding area during the interval [t2,t3)

However, if MNO signalling data are available at the D-layer, we can identify an outer region bounding the /unknown) trajectory of the mobile users between two consecutive observation times (this is given by the Location Area, Routing Area, Paging Area in 2G/3G/4G…).

Page 18: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

C-path

x

y

t

x,y spatial dimensions (latitude, longitude) t time a1,a2 best-guessed bounding area at time t1,t2

B1 outer bounding area during the

interval [t1,t2) (àLocation Area in GSM)

Examples of possible trajectory

B1

B2

t1(j)

a1(j)

t2(j) t*

a2(j)

Page 19: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Which spatial mapping approach ?

•  Options •  Voronoi tessellation LL

as in most published literature

•  Better variant of Voronoi tessellation K proximus “Technology-Agnostic Cell (TAC) area”

•  Best Service Area (BSA) tessellation K ISTAT-WIND research

•  Nominal coverage area (overlapping areas) J •  uniform •  non-uniform (mobloc) device ID u1

timestamp t1 cell/sector ID s1 [optional data] h1

micro-records C-locations

spatial mapping

tessellations (non overlapping)

Page 20: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Tower position qk

Nominal cell radius

C-location radius rk

C-location ak

Nominal cell coverage area

circular cell 120° sector cell

120° sector cell + range data (Time Advance)

Triangulation from LBS system

Page 21: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Current activities on MNO data in Eurostat

•  Developing Reference Methodological Framework (RFM) for using MNO data for Official Statistics •  focus on presence & mobility patterns •  CDR and signalling data

•  Developing and comparing different methodological

variants for the density estimation problem

•  Exploiting Secure Multi-party Computation (SMC) for multi-MNO data fusion •  capacity building, technical and legal aspects

Page 22: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Design choices •  1) Which spatial mapping approach ?

•  Voronoi tessellation LL •  Variant of Voronoi tessellation K •  Best Service Area (BSA) tessellation K •  Overlapping coverage area JJ

uniform or non-uniform ?

•  2) Which space/time interpolation method?

•  Zero-order interpolation •  …

•  3) Which inference method? •  Area Proportional •  Maximum Likelihood •  Hierarchical Bayesian •  …

spatial mapping

space/time interpolation

statistical inference

raw data from real world

output estimate

Page 23: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

x

y

t t1

a1

t2

a2

t*

B1

y

C-path

Interpolated C-location at time t*

Spatio/Temporal Interpolation

Page 24: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

device ID u1 timestamp t1 cell/sector ID s1 [optional data] h1

micro-records C-locations

device ID u2 timestamp t2 cell/sector ID s2 [optional data] h2

Density map

spatial mapping

+ interpolation

inference method

Page 25: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Current activities on MNO data in Eurostat

•  Developing Reference Methodological Framework (RFM) for using MNO data for Official Statistics •  focus on presence & mobility patterns •  CDR and signalling data

•  Developing and comparing different methodological

variants for the density estimation problem

•  Exploiting Secure Multi-party Computation (SMC) for multi-MNO data fusion •  capacity building, technical and legal aspects

Page 26: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Stage 1 scenario

MNO #1

NSI

Raw micro-data (D-layer)

Standardised micro-data (C-layer)

aggregate data

(S-layer)

final statistics (S-layer)

Page 27: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

NSI

Stage 2 scenario

MNO #2

MNO #1

MNO #3

Raw micro-data (D-layer)

Standardised micro-data (C-layer)

aggregate data

(S-layer)

final statistics (S-layer)

Page 28: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

NSI

Stage 3 scenario

MNO #2

MNO #1

MNO #3

Raw micro-data (D-layer)

Standardised micro-data (C-layer)

Secure Multiparty Computation (SMC) platform for privacy-preseving cross-domain data linkage

Page 29: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Problem statement

•  Two or more input parties held confidential data x1, x2, … •  They agree that another output party (e.g., the Stat. Office)

computes the result of a statistical function y=f(x1,x2,…) on their input data (e.g., regression coefficients)

•  Each input party does not want to disclose its input data with the other input/output parties

x2

x1

y = f x1,x2 ,...( )

Input parties

Output party

Page 30: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Trusted Third Party (TTP)

•  data sharing still occurs (with TTP) •  risk concentration at TTP •  a single entity trusted by all input/output parties

might not exist

x2

x1

Input parties

Output party y = f x1,x2 ,...( )Trusted Third Party

this is sharing! all data are here!

NOT A SOLUTION!

Page 31: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Fully Homomorphic Encryption (FHE) •  Encrypt the input data xn into pn (non-reversible transformation) •  The computation on encrypted data eturns the same output value that

would be obtained on the input data (homomorfism) •  à along the process the original inputs are never reconstructed

(no “decryption” takes place) - only the output is revealed •  theoretically possible, but practically unfeasible in most cases

x2

x1

Input parties

Output party

y = f p p1, p2( ) = f x1,x2( )NOT

SCALABLE !

Page 32: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

•  Each element of secret input xn is transformed into K “shares” pn,1, pn,2 … pn,k that are distributed to different computing parties

•  The computation is distributed (shared) among the computing parties •  The computation on secret shares returns the same output value that would be

obtained from the input data (homomorfism)

pn,1

pn,2

pn,3

Input parties

Computing Parties

x2

x1

y

Output party

y = fs p1,1, p1,2 , p1,3 , p2,1, p2,2 , p2,3( ) = f x1,x2( )

Secure Multi-party Computation via Secret Sharing

Page 33: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

•  Individual shares reveal nothing about the secret input •  à no single party holds “data”

à “passing shares” ≠ “sharing data”

•  Computing parties need to be trusted collectively, not individually

pn,1

y pn,2

pn,3

Input parties

Output party

Computing Parties

x2

x1

Secure Multi-party Computation via Secret Sharing

Page 34: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Main points of interest for this project

•  Develop novel methodologies, explore use-cases, with special focus on multi-MNO aspects

•  [optional] benchmark algorithm implementation in centralized vs. SMC settings •  verify correctness of output •  assess computation load and delay of SMC implementation

Page 35: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

MNO #1

MNO #2

MNO #3

Input parties

Centralized Computation

reference output

GNCC environment

Page 36: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

MNO #1

MNO #2

MNO #3

Computing parties Input parties

Output party

Secret Sharing

output from SMC

emulated instances within GNCC

GNCC environment

Page 37: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

MNO #1

MNO #2

MNO #3

Computing parties Input parties

Output party

Centralized Computation

Secret Sharing

output from SMC

reference output

compare for correctness

computation load computation delay

GNCC environment

Page 38: Overview of current activities in MNO data at EUROSTAT · • JRC study on density estimation from mobile network data • ... tower locations detailed radio configuration parameters

Thanks for your attention

For follow-up: [email protected]