01101100 01100101 01110110 01100101 01110010 01100001 01100111 01101001 01101110 01100111 00100000 01100011 01100111 01101001 01100001 01110010 00100000 01100010 01101001 01100111 00100000 01100100 01100001 01110100 01100001 00100000 01100010 01110010 01101001 01101110 01100111 00100000 01100010 01101001 01100111 00100000 01100100 01100001 01110100 01100001 00100000 01110100 01101111 00100000 01100001 01100111 01110010 01101001 01100011 01110101 01101100 01110100 01110101 01110010 01100101 00101100 00100000 01100001 01101110 01100100 00100000 01100001 01100111 01110010 01101001 01100011 01110101 01101100 01110100 01110101 01110010 01100101 00100000 01110100 01101111 00100000 01100010 01101001 01100111 00100000
01100100 01100001 01110100 01100001
Moore’s Law in Space
Internet usage: 40% of global population (2.26 billion)
Developing countries: from 0 to 30% in 16 years
On linear trend, 100% in just 22 years.
Goal of UN to have 50% by 2015. Achieved 34%.
Philippines ranked above US in 2015
A game changer?
But is the Big Data revolution democratic?
Democratizing Big Data
About the CGIAR mission: propose ANOTHER BUSINESS MODEL for the use of these techniques.
Google, Monsanto and John Deere have all entered the business of big data in agriculture, but with the same business model: subscription services for commercial farmers. Smallholders also have much to gain from big data, but can’t always pay for the service.
How do we close equity gaps instead of widening them?
The Vision: starting point for discussion
The data revolution is changing the role, reach and modus operandi of research and development organizations such as CGIAR. It represents an unprecedented opportunity to find new ways of reducing hunger and poverty, but also has its risks: unequal access to and use of information could widen social inequity, and exacerbate yield gaps in agriculture. CGIAR is uniquely positioned to be a thought leader on the use of big data and information technology to drive equitable rural development, ensuring that the data revolution is democratic, and reaches the poor and marginalized.
Overview
Goal: to harness the capabilities of Big Data to accelerate and enhance the impact of international agricultural research, and solve development problems faster, better and at greater scale
Organise: Make CGIAR data truly open and available, revolutionise how agricultural data is collected and managed
Convene: Bring big data to agriculture and agriculture to big data by partnering the CGIAR with 42 Big Data powerhouse partners
Inspire: Solve development problems with big data; generate new international public goods around big data in agricultural development
Theory of change for Big Data in Agriculture
• Unless our data is organized, we cannot use it effectively -> OA/OD critical factor for success
• Big data is as much a threat to global development and equality as it is an opportunity: CGIAR can and should play a role on the boundary between “Silicon Valley” and poor rural regions
• New partnerships are needed; CGIAR needs to build a foundation (human, infrastructure, social) to lead the dialogue on big data in agriculture in developing countries
• Harness capacity to do CGIAR research and development smarter and faster
• We need to inspire: show how it can be done, and attract private sector investment and sustainable businesses providing big-data-based services to rural communities
Big Data: A behavior change
• YES, big data requires large amounts of data and therefore big servers, BUT it is much more than that:
• REUSING the data: extracting embedded knowledge from existing datasets to answer questions unrelated to the purpose for which the data was originally captured.
• COMBINING datasets that were never meant to meet makes it possible to relate more variables and uncover useful correlations.
• ANALYZING with CREATIVITY: data scientists need to be innovative in how they use the data. Who would have guessed that Google search queries could help fight flu?
Many partners: central to achieving breakthrough big data science
An incredible opportunity to change the course of agricultural development! CGIAR
Internal Data Housekeeping
CGIAR as a Boundary partner for development
Development impacts
Big Data Science and Discovery
Research
IBM supports CGIAR on data management and archiving
IRRI and IBM work together to develop new business opportunities that co-generate agricultural development impacts
IRRI provides IBM with new datasets and data problems that contribute to scientific discovery
IRRI provides decision makers with novel, actionable information to aid decision making
Hey Cigi, when should I plant my maize?
Real-time decision support system for farmers
Easy natural language as an interface
Smart artificial intelligence trained by CGIAR and partners
Leveraging open, harmonized and interoperable multiple databases
A complementary bottom-up approach: Information from commercial fields - Taking advantage of modern information technologies !!!
Climate, soil, crop management, productivity/quality
Site-specific information
Yield and quality limiting factors
favorable/unfavorable Climatic patterns
Optimal site-specific management practices
Massively exciting, transformational science
“The most magical aspect of big data is Smart Data: the application of statistical analytics and machine learning to data sets to find interesting connections and signals in all the noise.” Philip Brittan. http://tmsnrt.rs/1EmFXTT
238 production events, 2013 to 2016
www.open-aeps.org
From zero to heroes: New insights in 4 slides
Variables
VARIABLE | MEANING | TYPE | UNIT
TIPO_SIEMBRA | Mechanized or manual sowing | Categorical | NA
SEM_TRATADAS | Seed treatment | Boolean | NA
DIST_SURCOS | Distance between rows | Quantitative | m
DIST_PLANTAS | Distance between plants | Quantitative | m
COLOR_ENDOSPERMO | Maize color | Categorical | NA
CULT_ANT | Previous crop | Categorical | NA
DRENAJE | Plot is drained | Boolean | NA
POBLACION_20DIAS | Number of live plants per hectare 20 days after germination | Quantitative | plants.ha-1
METODO_COSECHA | Mechanized or manual harvest | Categorical | NA
ALMACENAMIENTO_FINCA | Harvest is stored | Boolean | NA
CONTENFQUI | Count of chemical treatments against diseases | Quantitative | NA
CONTMALQUI | Count of chemical treatments against weeds | Quantitative | NA
CONTPLAQUI | Count of chemical treatments against pests | Quantitative | NA
CANFERQUI | Count of chemical fertilizer applications | Quantitative | NA
PENDIENTE | Average slope of the plot | Quantitative | degrees
PH | Soil pH | Quantitative | NA
ESTRUCTURA_RASTA | Soil structure | Categorical | NA
MAT_ORGANICA | Organic matter content | Categorical | NA
DRE_INTERN | Internal drainage capacity of the soil | Categorical | NA
DREN_EXTERN | External drainage capacity of the soil | Categorical | NA
PROF_EFEC | Effective soil depth | Quantitative | cm
MATERIAL_GENETICO1 | Cultivar | Categorical | NA
TEMP_MAX_AVG_VEG | Average maximum temperature in the vegetative phase | Quantitative | °C
TEMP_MIN_AVG_VEG | Average minimum temperature in the vegetative phase | Quantitative | °C
TEMP_AVG_VEG | Average temperature in the vegetative phase | Quantitative | °C
DIURNAL_RANGE_AVG_VEG | Average diurnal temperature range in the vegetative phase | Quantitative | °C
SOL_ENER_ACCU_VEG | Accumulated solar energy in the vegetative phase | Quantitative | cal.cm-2
RAIN_ACCU_VEG | Accumulated precipitation in the vegetative phase | Quantitative | mm
RAIN_10_FREQ_VEG | Frequency of days with more than 10 mm of rain in the vegetative phase | Quantitative | NA
TEMP_MIN_15_FREQ_VEG | Frequency of days with minimum temperature below 15°C in the vegetative phase | Quantitative | NA
RHUM_AVG_VEG | Average relative humidity in the vegetative phase | Quantitative | %
RHUM_SD_VEG | Standard deviation of relative humidity in the vegetative phase | Quantitative | NA
TEMP_MAX_AVG_FOR | Average maximum temperature in the formation phase | Quantitative | °C
TEMP_MIN_AVG_FOR | Average minimum temperature in the formation phase | Quantitative | °C
TEMP_AVG_FOR | Average temperature in the formation phase | Quantitative | °C
DIURNAL_RANGE_AVG_FOR | Average diurnal temperature range in the formation phase | Quantitative | °C
SOL_ENER_ACCU_FOR | Accumulated solar energy in the formation phase | Quantitative | cal.cm-2
RAIN_ACCU_FOR | Accumulated precipitation in the formation phase | Quantitative | mm
RAIN_10_FREQ_FOR | Frequency of days with more than 10 mm of rain in the formation phase | Quantitative | NA
TEMP_MIN_15_FREQ_FOR | Frequency of days with minimum temperature below 15°C in the formation phase | Quantitative | NA
RHUM_AVG_FOR | Average relative humidity in the formation phase | Quantitative | %
RHUM_SD_FOR | Standard deviation of relative humidity in the formation phase | Quantitative | NA
TEMP_MAX_AVG_MAD | Average maximum temperature in the maturation phase | Quantitative | °C
TEMP_MIN_AVG_MAD | Average minimum temperature in the maturation phase | Quantitative | °C
TEMP_AVG_MAD | Average temperature in the maturation phase | Quantitative | °C
DIURNAL_RANGE_AVG_MAD | Average diurnal temperature range in the maturation phase | Quantitative | °C
SOL_ENER_ACCU_MAD | Accumulated solar energy in the maturation phase | Quantitative | cal.cm-2
RAIN_ACCU_MAD | Accumulated precipitation in the maturation phase | Quantitative | mm
RAIN_10_FREQ_MAD | Frequency of days with more than 10 mm of rain in the maturation phase | Quantitative | NA
TEMP_MIN_15_FREQ_MAD | Frequency of days with minimum temperature below 15°C in the maturation phase | Quantitative | NA
RHUM_AVG_MAD | Average relative humidity in the maturation phase | Quantitative | %
RHUM_SD_MAD | Standard deviation of relative humidity in the maturation phase | Quantitative | NA
TOTN | Total amount of nitrogen applied | Quantitative | kg
TOTP | Total amount of phosphorus applied | Quantitative | kg
TOTK | Total amount of potassium applied | Quantitative | kg
TEXTURA | Soil texture | Categorical | NA
RDT | Yield | Quantitative | kg.ha-1
Data and Analysis
Farmers record production data and send it through an app
Data geeks mine it to death:
• Conditional Inference Forest (CIF) [1, 2]
• Partial dependence plots [3]
• ……..
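To make the partial dependence idea concrete, here is a minimal sketch in plain Python. It is not the analysis pipeline used for these slides: the toy_predict function is a hypothetical stand-in for a fitted model (the deck names a Conditional Inference Forest, which is not reproduced here), and the field names and all numbers are illustrative assumptions, not results. The curve is obtained by forcing one variable to each grid value for every record and averaging the model's predictions.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Copy each record, overwrite the feature of interest, then predict.
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, mean(preds)))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model; coefficients are made up."""
    base = 3000.0
    # Assumed response that plateaus at 25 kg/ha of phosphorus.
    p_gain = 20.0 * min(row["phosphorus_kg_ha"], 25.0)
    # Assumed bonus for plots steeper than 3 degrees.
    slope_gain = 200.0 if row["slope_deg"] > 3.0 else 0.0
    return base + p_gain + slope_gain


# Illustrative records only; field names are assumptions, not the survey schema.
rows = [
    {"phosphorus_kg_ha": 0.0, "slope_deg": 1.0},
    {"phosphorus_kg_ha": 10.0, "slope_deg": 4.0},
    {"phosphorus_kg_ha": 30.0, "slope_deg": 5.0},
    {"phosphorus_kg_ha": 0.0, "slope_deg": 2.0},
]

curve = partial_dependence(toy_predict, rows, "phosphorus_kg_ha", [0.0, 12.5, 25.0, 50.0])
for value, avg_yield in curve:
    print(f"{value:5.1f} kg/ha -> {avg_yield:.0f} kg/ha average predicted yield")
```

Plotting the (value, average prediction) pairs gives the partial dependence plot for that variable; a flat stretch in the curve suggests that, on average, further increases in the variable do not change the prediction.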
[1] Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15(3): 651–74.
[2] Strobl, Carolin, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, and Achim Zeileis. 2008. “Conditional Variable Importance for Random Forests.” BMC Bioinformatics 9: 307.
[3] Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning, 337–87. http://www.springerlink.com/index/10.1007/b94608.
Results
R2 = 45.79
Slope (>3°) and external drainage (at least slow) are associated with high yield.
25 kg/ha is the minimum phosphorus needed to exploit the plant's potential. Of the 238 events, only 23 (10%) applied more than 25 kg/ha of phosphorus, and 198 were not fertilized.
Changing the harvest method from manual to mechanized can gain 100 kg/ha. However, only 59 events (25%) are harvested with the combined method.
The plant population 20 days after germination should be above 65,000 plants/ha. Currently, 158 plots (66%) have fewer than 70,000 plants/ha at 20 days.
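The percentages quoted in these results can be checked against the 238 production events with a few lines of Python. This is only an arithmetic sketch; the dictionary labels are paraphrases chosen for illustration, and the counts are the ones stated on the slides.

```python
TOTAL_EVENTS = 238  # production events, 2013 to 2016

# Counts as stated in the results; labels are illustrative paraphrases.
counts = {
    "more than 25 kg/ha phosphorus": 23,
    "not fertilized": 198,
    "combined harvest method": 59,
    "fewer than 70,000 plants/ha at 20 days": 158,
}


def share_percent(count, total):
    """Share of events as a whole-number percentage."""
    return round(100 * count / total)


shares = {label: share_percent(n, TOTAL_EVENTS) for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {TOTAL_EVENTS} ({pct}%)")
```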
Impact
Farmer gets personalized “Fenalcheck” report
Five basic farming principles identified (CropCheck):
• Favor plots with a slope > 2°.
• Farmers with plots without external drainage should adapt them.
• Apply a minimum amount of phosphorus, around 25 kg.
• Harvest using a combined method.
• Ensure the plant population is at least 65,000 plants/ha 20 days after germination.
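The five principles lend themselves to a simple per-plot checklist. The sketch below is a hypothetical illustration of how such a check might be coded, not the actual Fenalcheck report logic: the record keys, the "combined" label for the harvest method, and reading the phosphorus threshold as kg/ha are all assumptions.

```python
def check_principles(plot):
    """Return a pass/fail flag for each of the five principles.

    `plot` is a dict with hypothetical keys; thresholds follow the slide,
    with phosphorus assumed to be expressed in kg/ha.
    """
    return {
        "slope > 2 degrees": plot["slope_deg"] > 2.0,
        "external drainage": plot["has_external_drainage"],
        "phosphorus >= 25 kg/ha": plot["phosphorus_kg_ha"] >= 25.0,
        "combined harvest": plot["harvest_method"] == "combined",
        "population >= 65,000 plants/ha": plot["plants_per_ha_20d"] >= 65000,
    }


# Illustrative plot record; values are made up for the example.
example_plot = {
    "slope_deg": 3.5,
    "has_external_drainage": True,
    "phosphorus_kg_ha": 0.0,
    "harvest_method": "manual",
    "plants_per_ha_20d": 60000,
}

report = check_principles(example_plot)
unmet = [name for name, ok in report.items() if not ok]
print("Principles to work on:", ", ".join(unmet))
```

Keeping each principle as a named boolean makes it straightforward to list only the unmet ones in a farmer-facing report.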
Yield distributions for the three agronomic management groups observed in Córdoba. Vertical lines mark the average yield of each group; the red and blue arrows represent the yield gap for the members of groups B and N.
#1 Open Access Open Data
Credit: Tim Berners-Lee
Source: cgiar.org/open