
Big Data in Agriculture - Setting the scene for the CGIAR


Leveraging CGIAR big data: bring big data to agriculture, and agriculture to big data


Moore’s Law in Space


Internet usage: 40% of global population – 2.26 billion

Developing countries: from 0 to 30% in 16 years. On a linear trend, 100% in just 22 years. The UN goal was 50% by 2015; 34% was achieved.

Philippines ranked above US in 2015

A game changer?


But is the Big Data revolution democratic?


Democratizing Big Data… CGIAR's mission: propose ANOTHER BUSINESS MODEL for the use of these techniques.

Google, Monsanto and John Deere have all entered the business of big data in agriculture, but with the same business model: a subscription service for commercial farmers. Smallholders also have much to gain from big data, but cannot always pay for the service.

How do we close equity gaps instead of widening them?


The Vision: starting point for discussion

The data revolution is changing the role, reach and modus operandi of research and development organizations such as CGIAR. It represents an unprecedented opportunity to find new ways of reducing hunger and poverty, but also has its risks: unequal access to and use of information could widen social inequity, and exacerbate yield gaps in agriculture. CGIAR is uniquely positioned to be a thought leader on the use of big data and information technology to drive equitable rural development, ensuring that the data revolution is democratic, and reaches the poor and marginalized.


Overview

Goal: to harness the capabilities of Big Data to accelerate and enhance the impact of international agricultural research, and solve development problems faster, better and at greater scale

Organise: Make CGIAR data truly open and available; revolutionise how agricultural data is collected and managed
Convene: Bring big data to agriculture and agriculture to big data by partnering the CGIAR with 42 Big Data powerhouse partners
Inspire: Solve development problems with big data; generate new international public goods around big data in agricultural development


Theory of change for Big Data in Agriculture
• Unless our data is organized, we cannot use it effectively -> open access/open data (OA/OD) is a critical factor for success
• Big data is as much a threat to global development and equality as it is an opportunity: CGIAR can and should play a role on the boundary between “Silicon Valley” and poor rural regions
• New partnerships are needed; CGIAR needs to build a foundation (human, infrastructure, social) to lead the dialogue on big data in agriculture in developing countries
• Harness capacity to do CGIAR research and development smarter and faster
• We need to inspire: show how it can be done, and attract private sector investment and sustainable businesses providing big data based services to rural communities


Big Data: A behavior change
• YES, big data requires large amounts of data and therefore big servers, BUT it is much more than that:
• REUSING the data: extracting embedded knowledge from existing datasets to answer questions unrelated to the purpose for which the data was originally captured.
• COMBINING datasets that were never meant to meet, making it possible to relate more variables and uncover useful correlations.
• ANALYZING with CREATIVITY: the data scientist needs to be innovative in how the data is used. Who would have guessed that Google search queries could help fight flu?
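As a minimal sketch of COMBINING, assume two hypothetical datasets that share a plot identifier. The field names and values below are illustrative and do not come from the slides. A simple join makes it possible to relate yield to a variable that was collected for a different purpose:

```python
# Hypothetical records: neither dataset was collected with the other in mind.
farm_records = [
    {"plot_id": "A1", "yield_kg_ha": 5200},
    {"plot_id": "B7", "yield_kg_ha": 3100},
    {"plot_id": "C3", "yield_kg_ha": 4400},
]
weather_records = [
    {"plot_id": "A1", "rain_mm": 410},
    {"plot_id": "B7", "rain_mm": 190},
]


def combine(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


combined = combine(farm_records, weather_records, "plot_id")
for row in combined:
    print(row["plot_id"], row["rain_mm"], row["yield_kg_ha"])
```

Plots without a match in the second dataset (here `C3`) are dropped, which is one reason harmonized identifiers matter when datasets are combined.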


Many partners: central to achieving breakthrough big data science


An incredible opportunity to change the course of agricultural development!

CGIAR

Internal Data Housekeeping

CGIAR as a Boundary partner for development

Development impacts

Big Data Science and Discovery

Research

IBM supports CGIAR on data management and archiving

IRRI and IBM work together to develop new business opportunities that co-generate agricultural development impacts

IRRI provides IBM with new datasets and data problems that contribute to scientific discovery

IRRI provides decision makers with novel, actionable information to aid decision making


Hey Cigi, when should I plant my maize?

Real-time decision support system for farmers

Easy natural language as an interface

Smart artificial intelligence trained by CGIAR and partners

Leveraging open, harmonized and interoperable multiple databases


A complementary bottom-up approach: information from commercial fields, taking advantage of modern information technologies!

Climate, Soil, Crop management, Productivity/Quality

Site-specific information

Yield and quality limiting factors

favorable/unfavorable Climatic patterns

Optimal site-specific management practices

Massively exciting, transformational science

“The most magical aspect of big data is Smart Data: the application of statistical analytics and machine learning to data sets to find interesting connections and signals in all the noise.” Philip Brittan. http://tmsnrt.rs/1EmFXTT


238 production events, 2013 to 2016
www.open-aeps.org

From zero to hero: New insights in 4 slides


Variables

VARIABLE | MEANING | TYPE | UNIT
TIPO_SIEMBRA | Mechanized or manual sowing | Categorical | NA
SEM_TRATADAS | Seed treatment | Boolean | NA
DIST_SURCOS | Distance between rows | Quantitative | m
DIST_PLANTAS | Distance between plants | Quantitative | m
COLOR_ENDOSPERMO | Maize color | Categorical | NA
CULT_ANT | Previous crop | Categorical | NA
DRENAJE | Whether the plot is drained | Boolean | NA
POBLACION_20DIAS | Number of live plants per hectare 20 days after germination | Quantitative | plants/ha
METODO_COSECHA | Mechanized or manual harvest | Categorical | NA
ALMACENAMIENTO_FINCA | Whether the harvest is stored | Boolean | NA
CONTENFQUI | Count of chemical treatments against diseases | Quantitative | NA
CONTMALQUI | Count of chemical treatments against weeds | Quantitative | NA
CONTPLAQUI | Count of chemical treatments against pests | Quantitative | NA
CANFERQUI | Count of chemical fertilizer applications | Quantitative | NA
PENDIENTE | Average slope of the plot | Quantitative | degrees
PH | Soil pH | Quantitative | NA
ESTRUCTURA_RASTA | Soil structure | Categorical | NA
MAT_ORGANICA | Organic matter content | Categorical | NA
DRE_INTERN | Internal drainage capacity of the soil | Categorical | NA
DREN_EXTERN | External drainage capacity of the soil | Categorical | NA
PROF_EFEC | Effective soil depth | Quantitative | cm
MATERIAL_GENETICO1 | Cultivar | Categorical | NA
TEMP_MAX_AVG_VEG | Average maximum temperature in the vegetative phase | Quantitative | °C
TEMP_MIN_AVG_VEG | Average minimum temperature in the vegetative phase | Quantitative | °C
TEMP_AVG_VEG | Average temperature in the vegetative phase | Quantitative | °C
DIURNAL_RANGE_AVG_VEG | Average diurnal temperature range in the vegetative phase | Quantitative | °C
SOL_ENER_ACCU_VEG | Accumulated solar energy in the vegetative phase | Quantitative | cal/cm²
RAIN_ACCU_VEG | Accumulated precipitation in the vegetative phase | Quantitative | mm
RAIN_10_FREQ_VEG | Frequency of days with more than 10 mm of rain in the vegetative phase | Quantitative | NA
TEMP_MIN_15_FREQ_VEG | Frequency of days with minimum temperature below 15 °C in the vegetative phase | Quantitative | NA
RHUM_AVG_VEG | Average relative humidity in the vegetative phase | Quantitative | %
RHUM_SD_VEG | Standard deviation of relative humidity in the vegetative phase | Quantitative | NA
TEMP_MAX_AVG_FOR | Average maximum temperature in the formation phase | Quantitative | °C
TEMP_MIN_AVG_FOR | Average minimum temperature in the formation phase | Quantitative | °C
TEMP_AVG_FOR | Average temperature in the formation phase | Quantitative | °C
DIURNAL_RANGE_AVG_FOR | Average diurnal temperature range in the formation phase | Quantitative | °C
SOL_ENER_ACCU_FOR | Accumulated solar energy in the formation phase | Quantitative | cal/cm²
RAIN_ACCU_FOR | Accumulated precipitation in the formation phase | Quantitative | mm
RAIN_10_FREQ_FOR | Frequency of days with more than 10 mm of rain in the formation phase | Quantitative | NA
TEMP_MIN_15_FREQ_FOR | Frequency of days with minimum temperature below 15 °C in the formation phase | Quantitative | NA
RHUM_AVG_FOR | Average relative humidity in the formation phase | Quantitative | %
RHUM_SD_FOR | Standard deviation of relative humidity in the formation phase | Quantitative | NA
TEMP_MAX_AVG_MAD | Average maximum temperature in the maturation phase | Quantitative | °C
TEMP_MIN_AVG_MAD | Average minimum temperature in the maturation phase | Quantitative | °C
TEMP_AVG_MAD | Average temperature in the maturation phase | Quantitative | °C
DIURNAL_RANGE_AVG_MAD | Average diurnal temperature range in the maturation phase | Quantitative | °C
SOL_ENER_ACCU_MAD | Accumulated solar energy in the maturation phase | Quantitative | cal/cm²
RAIN_ACCU_MAD | Accumulated precipitation in the maturation phase | Quantitative | mm
RAIN_10_FREQ_MAD | Frequency of days with more than 10 mm of rain in the maturation phase | Quantitative | NA
TEMP_MIN_15_FREQ_MAD | Frequency of days with minimum temperature below 15 °C in the maturation phase | Quantitative | NA
RHUM_AVG_MAD | Average relative humidity in the maturation phase | Quantitative | %
RHUM_SD_MAD | Standard deviation of relative humidity in the maturation phase | Quantitative | NA
TOTN | Total nitrogen applied | Quantitative | kg
TOTP | Total phosphorus applied | Quantitative | kg
TOTK | Total potassium applied | Quantitative | kg
TEXTURA | Soil texture | Categorical | NA
RDT | Yield | Quantitative | kg/ha
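The phase-level climate variables listed above can be derived from daily weather records. The sketch below shows one possible way to compute a few of them for a single growth phase. The daily record layout (`tmax`, `tmin`, `rain`) and the sample values are assumptions for illustration. The slides do not say whether the frequency variables are counts or proportions, so a proportion of days is assumed here.

```python
from statistics import mean


def phase_summary(days, suffix):
    """Aggregate daily weather records for one growth phase.

    `days` is a list of dicts with hypothetical keys `tmax`, `tmin` (°C)
    and `rain` (mm); `suffix` is the phase tag used in the table
    (VEG, FOR or MAD).
    """
    n = len(days)
    return {
        f"TEMP_MAX_AVG_{suffix}": mean(d["tmax"] for d in days),
        f"TEMP_MIN_AVG_{suffix}": mean(d["tmin"] for d in days),
        # Diurnal range: average of the daily (max - min) temperature.
        f"DIURNAL_RANGE_AVG_{suffix}": mean(d["tmax"] - d["tmin"] for d in days),
        f"RAIN_ACCU_{suffix}": sum(d["rain"] for d in days),
        # Assumed to be the proportion of days in the phase meeting the condition.
        f"RAIN_10_FREQ_{suffix}": sum(d["rain"] > 10 for d in days) / n,
        f"TEMP_MIN_15_FREQ_{suffix}": sum(d["tmin"] < 15 for d in days) / n,
    }


# Made-up daily records for a four-day vegetative phase.
veg_days = [
    {"tmax": 32.0, "tmin": 22.0, "rain": 0.0},
    {"tmax": 30.0, "tmin": 20.0, "rain": 12.0},
    {"tmax": 28.0, "tmin": 14.0, "rain": 25.0},
    {"tmax": 34.0, "tmin": 24.0, "rain": 3.0},
]
veg = phase_summary(veg_days, "VEG")
for name, value in veg.items():
    print(name, value)
```

Calling the same function with the `FOR` and `MAD` suffixes on the corresponding date ranges would produce the other two blocks of variables.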

Data and Analysis

Farmers record production data and send it through an app

Data geeks mine it to death:
• Conditional Inference Forest (CIF) [1, 2]
• Partial dependence plots [3]
• ……..

1 Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15(3): 651–74.
2 Strobl, Carolin, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, and Achim Zeileis. 2008. “Conditional Variable Importance for Random Forests.” BMC Bioinformatics 9: 307.
3 Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning. Springer: 337–87. http://www.springerlink.com/index/10.1007/b94608.
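Partial dependence, one of the techniques listed above, shows how a fitted model's average prediction changes as one variable is varied while the other variables keep their observed values. The sketch below computes such a curve in plain Python. The `toy_model` function is a purely hypothetical stand-in for a fitted conditional inference forest, and the sample rows and response shape are made up. Only the variable names (TOTP, POBLACION_20DIAS) come from the variables table.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted forest: returns a yield in kg/ha."""
    base = 3000.0
    # Made-up response: yield rises with phosphorus up to 25, then plateaus.
    base += 40.0 * min(row["TOTP"], 25)
    # Made-up bonus for denser stands.
    if row["POBLACION_20DIAS"] >= 65000:
        base += 500.0
    return base


# Made-up production events.
rows = [
    {"TOTP": 0, "POBLACION_20DIAS": 60000},
    {"TOTP": 10, "POBLACION_20DIAS": 70000},
    {"TOTP": 30, "POBLACION_20DIAS": 72000},
    {"TOTP": 0, "POBLACION_20DIAS": 50000},
]

pd_curve = partial_dependence(toy_model, rows, "TOTP", [0, 25, 50])
for value, avg in pd_curve:
    print(value, avg)
```

Plotting the pairs in `pd_curve` gives the partial dependence plot for TOTP. A flat segment, such as the one between 25 and 50 here, is how a threshold would show up in such a plot.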


Results


R2 = 45.79

Slope (> 3°) and external drainage (at least slow) are associated with high yield.

25 kg/ha is the minimum phosphorus needed to exploit the plant's potential. Of the 238 events, only 23 (10%) applied more than 25 kg/ha of phosphorus, and 198 were not fertilized.

Changing the harvest method from manual to mechanized can gain 100 kg/ha. However, only 59 events (25%) are harvested with the combined method.

The plant population 20 days after germination should be above 65,000 plants/ha. Currently, 158 plots (66%) have fewer than 70,000 plants/ha at 20 days.
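The percentages quoted above follow from the counts over the 238 production events. A quick check in Python, using only the counts stated in the slide (the share for unfertilized events is computed here and is not stated in the slide):

```python
TOTAL_EVENTS = 238

# Counts as stated in the results.
counts = {
    "apply more than 25 kg/ha of phosphorus": 23,
    "not fertilized": 198,
    "harvested with the combined method": 59,
    "fewer than 70,000 plants/ha at 20 days": 158,
}

# Share of all events, rounded to the nearest whole percent.
shares = {label: round(100 * n / TOTAL_EVENTS) for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {TOTAL_EVENTS} = {pct}%")
```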


Impact

Farmer gets a personalized “Fenalcheck” report

Five basic farming principles identified (CropCheck):
• Favor plots with a slope > 2°.
• Farmers with plots without external drainage should adapt them.
• Apply a minimum of around 25 kg/ha of phosphorus.
• Harvest using a combined method.
• Ensure a plant population of at least 65,000 plants/ha 20 days after germination.
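The five principles can be read as a per-plot checklist. The sketch below is one hypothetical way to encode them and is not the actual Fenalcheck implementation. The field names, the coded values (such as `"none"` or `"combined"`) and the sample plot are assumptions for illustration. Only the thresholds come from the slide.

```python
def cropcheck(plot):
    """Return which of the five principles a plot record satisfies."""
    return {
        "slope > 2 degrees": plot["slope_deg"] > 2,
        "has external drainage": plot["external_drainage"] != "none",
        "phosphorus >= 25 kg/ha": plot["p_kg_ha"] >= 25,
        "combined harvest": plot["harvest_method"] == "combined",
        "population >= 65000 plants/ha at 20 days": plot["plants_ha_20d"] >= 65000,
    }


# Made-up plot record for illustration.
plot = {
    "slope_deg": 4.0,
    "external_drainage": "slow",
    "p_kg_ha": 0.0,
    "harvest_method": "manual",
    "plants_ha_20d": 68000,
}

report = cropcheck(plot)
for principle, ok in report.items():
    print("OK  " if ok else "TODO", principle)

# Number of principles this plot already follows.
met = sum(report.values())
```

A personalized report could list the unmet principles first, since those are the practices the farmer can change.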

Yield distributions for the three agronomic management groups observed in Córdoba. Vertical lines correspond to the average yield of each group; the red and blue arrows represent the yield gap for members of groups B and N.


#1 Open Access Open Data

Credit: Tim Berners-Lee

Source: cgiar.org/open