Unveiling the web, making the implicit explicit: how new technologies will do your networking for you, and what you can do to take advantage of that. Ian Mulvany, VP New Product Development, Mendeley.com (image: http://www.flickr.com/photos/sminor/)

Unveiling the web, making the implicit explicit

DESCRIPTION

This talk was given on the 9th of August 2010 at the American Phytopathological Society's annual conference in Charlotte, North Carolina. I talk about how the commoditisation of emerging tools on the web, such as the semantic web and scalable architectures, may have an effect on the communication and practice of science.

Page 1: Unveiling the web, making the implicit explicit

Unveiling the web, making the implicit explicit: how new technologies will do your networking for you, and what you can do to take advantage of that.

Ian Mulvany, VP New Product Development, Mendeley.com

image: http://www.flickr.com/photos/sminor/

Page 2: Unveiling the web, making the implicit explicit

Hi, I’m Ian. I started out in science studying astrophysics.

Then I worked as an editor for Springer.

While doing that job I got really interested in how the web could help with scientific communication.

That led me to Nature where I spent three years building web applications for scientists.

For the last 5 weeks I’ve been working with a great little startup called Mendeley.

Page 3: Unveiling the web, making the implicit explicit

Humans

Machines

Academic

Public

Science Blogging/Tweeting/Social Communities

You are all familiar with social media tools like blogs, twitter and social networks.

They are great for connecting professionals, and for reaching out to the general public.

But in a way these tools are really just at the surface of the internet.

There are a lot of interesting emerging technologies that lie beneath that are starting to have an impact on science: technologies like semantic markup, the commoditisation of scalable web architectures, and easier-to-implement machine learning tools.

Today I’m going to talk about some of these tools.

Page 4: Unveiling the web, making the implicit explicit

The future is already here. It's just not very evenly distributed.

William Gibson, 1999

image: flickr fredarmitage

I love this quote, and what I want to do today is mainly show you what some other fields of science have been doing with some of these technologies.

Page 5: Unveiling the web, making the implicit explicit

It required no brilliance for people to foresee the fabulous growth that awaited such industries as ... aircraft (in 1930) and television sets (in 1950). But the future then also included competitive dynamics that would decimate almost all of the companies entering those industries.

Warren Buffett

But I do have a very important warning for you all.

This is Warren Buffett. He is one of the most successful investors of all time, and he doesn’t invest in internet companies.

This quote is taken from his annual letter to investors, and there he explains that the internet is a disruptive technology, and that makes it hard to predict what is going to succeed.

Just like it was hard to predict who would do really well at the dawn of the aeroplane, or the TV.

Page 6: Unveiling the web, making the implicit explicit

Google Wave

image: flickr prgibbs

For the last year I was convinced that Google Wave was going to be the next big thing, and I told a lot of people that.

Last week Google stopped development work on Google Wave.

Oops.

So instead of telling you which tools you should use, I just want you to concentrate on how the tools I will talk about are making changes happen.

I don’t know if they will be the ones that will be around in five years, but I know things will be different.

Page 7: Unveiling the web, making the implicit explicit

images: wikimedia commons

This is what things that you could fly in looked like about 100 years ago.

Page 8: Unveiling the web, making the implicit explicit

images: wikimedia commons

I’m very glad to say that I flew in on something that looked like this.

Technology does mature, and when it matures all the complicated bits get abstracted away from you.

They get hidden.

You sit in your seat, and a few hours later you are somewhere else.

It’s like magic.

Page 9: Unveiling the web, making the implicit explicit

image: server room, http://www.flickr.com/photos/tuxstorm/

Ethernet

TCP/IP

HTTP

The internet is very complicated.

It’s a big distributed mess of cables and protocols.

Page 10: Unveiling the web, making the implicit explicit

But you don’t see that any more.

You just see some very easy to use interfaces.

All the complexity has been abstracted away and hidden from you.

In the last five years I bet everyone in this room has become a content creator on the internet.

All because it’s just so easy nowadays.

Page 11: Unveiling the web, making the implicit explicit

image flickr: robbie73

Peer Review at Nature, 43 years old

Mendeley, ~1.5 years old

Google, 12 years old

Nature, 141 years old

The Royal Society, 350 years old

Facebook, 7 years old

Twitter, 4 years old

So even though some of the tools that I’m going to tell you about today may not be very mature yet, that will happen.

Time goes really fast on the internet!

Page 12: Unveiling the web, making the implicit explicit

[Diagram: a network of people in which a handful of nodes are labelled "person I know" and many more are labelled "no idea"]

Some of the companies on the internet have started taking the content you have created.

The digital trails that you leave behind you every day.

And they have used that to recommend new friends for you.

And stuff for you to buy for all of these new friends of yours.

Page 13: Unveiling the web, making the implicit explicit

Why no recommendation engine for science, especially multi-disciplinary science?

I want a jet pack, but I also want a really good recommendation engine for science.

Page 14: Unveiling the web, making the implicit explicit

doi/10.1371/journal.pone.0004803.g007

Bollen, J. et al., 2009. A principal component analysis of 39 scientific impact measures. Methods, 1-19.

This shows how journals are related by the reading patterns of scientists.

Science is so richly interconnected, it’s a shame that we don’t have great recommendation engines yet.

(by the way, if you don’t like the impact factor, go and read this paper, it’s awesome, and Johan is a really great guy!)

Page 15: Unveiling the web, making the implicit explicit

time

Citations

Of course much of the rich interlinking comes from citations.

Citations link papers together, but there is a problem with these links.

You can never tell whether the link is a good link or ...

Page 16: Unveiling the web, making the implicit explicit

time

Citations

a bad link.

Page 17: Unveiling the web, making the implicit explicit

RDF

There is a way to turn links into relationships on the web.

It adds meaning to links.

It adds semantics to the web.

RDF is a popular way of doing this.

RDF stands for Resource Description Framework, but at its heart, it’s just a way of adding information about what a connection means.
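As a rough sketch of that idea, a link can be written down as a subject, predicate, object triple, where the predicate says what the link means. The paper identifiers and the `supports` and `disputes` predicate names below are made up for illustration; they do not come from any real RDF vocabulary.

```python
# A plain citation only says "A links to B".
# A triple also says what kind of link it is.
# All identifiers and predicate names here are hypothetical.
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

triples = [
    Triple("paper:A", "cites", "paper:B"),     # untyped: good link or bad link?
    Triple("paper:A", "supports", "paper:B"),  # typed: a "good" link
    Triple("paper:C", "disputes", "paper:B"),  # typed: a "bad" link
]

def links_to(target, predicate):
    """Return the subjects connected to `target` by `predicate`."""
    return [t.subject for t in triples
            if t.object == target and t.predicate == predicate]

supporters = links_to("paper:B", "supports")
critics = links_to("paper:B", "disputes")
print(supporters, critics)
```

Once the predicate is part of the data, a machine can tell the two kinds of citation apart without reading the paper.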

Page 18: Unveiling the web, making the implicit explicit

image: flickr fturmog

Semantic Web Applications in Neuromedicine

Researchers at Harvard Medical School and Massachusetts General Hospital are using RDF in Alzheimer’s research.

Their system is called SWAN.

Page 19: Unveiling the web, making the implicit explicit

[Diagram: a Research Narrative linked to several Research Statements, with the links labelled "consistent", "inconsistent", "discusses" and "alternativeTo"]

Every scientific paper is really a story.

It tells us about the nature of the world, and it draws on the works of other people to convince us that new claims about the world are true.

Using SWAN the author of a paper adds the context to each citation and statement in a paper.

They let us know whether the claims in a paper are consistent, inconsistent or an alternative to another claim elsewhere.

It takes a lot of effort to mark up a paper like this. It’s expensive.

Page 20: Unveiling the web, making the implicit explicit

http://hypothesis.alzforum.org/swan/

But when you do it, you get an amazing overview of the literature.

You can use a machine to find the most controversial claims very quickly.

You can use that information to decide which experiment will shed the most light on our ignorance.
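To give a feel for why the markup pays off: once every statement carries typed relations, finding the contested claims is little more than counting. This is only a sketch. The statement IDs are invented, the relation names are borrowed from the slide, and it is not SWAN's actual data model or query interface.

```python
from collections import Counter

# (source statement, relation, target statement); all IDs are invented.
annotations = [
    ("s1", "consistent", "s9"),
    ("s2", "inconsistent", "s9"),
    ("s3", "inconsistent", "s9"),
    ("s4", "consistent", "s7"),
    ("s5", "alternativeTo", "s7"),
    ("s6", "consistent", "s8"),
]

def most_controversial(annotations):
    """Rank target statements by how many others are inconsistent with them."""
    disputes = Counter(target for _, relation, target in annotations
                       if relation == "inconsistent")
    return disputes.most_common()

ranking = most_controversial(annotations)
print(ranking)
```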

Page 21: Unveiling the web, making the implicit explicit

There is a growing number of sites and data silos that support RDF. This is the semantic web.

Page 22: Unveiling the web, making the implicit explicit

2,300,000,000 assertions in BioRDF

There are a huge number of statements about biological systems.

But what happens if you have plain vanilla HTML, or a naked CSV data set?

Page 23: Unveiling the web, making the implicit explicit

Let’s take an example from plant science.

On http://sbr.ipmpipe.org/cgi-bin/sbr/public.cgi you can get a map of the spread of soybean rust.

When you click on the link you get the information as an HTML table.

Page 24: Unveiling the web, making the implicit explicit

This is like much of the information on the web. Let’s have a look at the HTML.

Page 25: Unveiling the web, making the implicit explicit

This HTML is plain, without much explanatory markup.

Page 26: Unveiling the web, making the implicit explicit

But we could fix that pretty simply.

Page 27: Unveiling the web, making the implicit explicit

And then we could use a tool like Yahoo Query Language (http://developer.yahoo.com/yql/) to filter the information on the table.
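Here is a rough Python stand-in for that kind of filtering, using only the standard library. It is not YQL, and the table below is invented: the `report`, `state`, `county` and `status` class names and the rows themselves are assumptions about what a nicely marked-up version of the page could look like, not the real markup or data.

```python
from html.parser import HTMLParser

# Hypothetical marked-up table; the real soybean rust page is not reproduced here.
PAGE = """
<table id="sbr-reports">
  <tr class="report"><td class="state">Florida</td><td class="county">Gadsden</td><td class="status">confirmed</td></tr>
  <tr class="report"><td class="state">Georgia</td><td class="county">Decatur</td><td class="status">scouted</td></tr>
  <tr class="report"><td class="state">Florida</td><td class="county">Alachua</td><td class="status">confirmed</td></tr>
</table>
"""

class ReportParser(HTMLParser):
    """Collect one dict per <tr class="report">, keyed by each cell's class."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # class of the <td> we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr" and attrs.get("class") == "report":
            self.rows.append({})
        elif tag == "td" and self.rows:
            self._field = attrs.get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None

    def handle_data(self, data):
        if self._field and data.strip():
            self.rows[-1][self._field] = data.strip()

parser = ReportParser()
parser.feed(PAGE)

# With named cells, a filter becomes a one-liner.
confirmed = [r for r in parser.rows if r["status"] == "confirmed"]
print(confirmed)
```

The point is the class names: because each cell says what it holds, the filter can ask for `status` by name instead of guessing at column positions.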

Page 28: Unveiling the web, making the implicit explicit

And we can create an RSS feed.

With a little effort in creating nice HTML, we can go from a plain piece of content to a filtered alerting service.
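The last step, turning filtered rows into a feed, can be sketched with Python's standard XML library. The rows and the feed title are invented for illustration, and this builds only a bare-bones RSS 2.0 document rather than whatever YQL itself would emit.

```python
import xml.etree.ElementTree as ET

# Invented rows standing in for the filtered table.
reports = [
    {"state": "Florida", "county": "Gadsden", "status": "confirmed"},
    {"state": "Florida", "county": "Alachua", "status": "confirmed"},
]

def to_rss(reports, title, link):
    """Build a minimal RSS 2.0 document with one <item> per report."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Filtered reports"
    for r in reports:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = (
            f"{r['county']}, {r['state']}: {r['status']}"
        )
    return ET.tostring(rss, encoding="unicode")

feed = to_rss(
    reports,
    "Soybean rust alerts",  # hypothetical feed title
    "http://sbr.ipmpipe.org/cgi-bin/sbr/public.cgi",
)
print(feed)
```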

The web is soooo cool.

Page 29: Unveiling the web, making the implicit explicit

[Diagram: HTML goes into YQL, which produces JSON, RSS and RDF]

YQL takes input from HTML sources, and allows you to manipulate that input in interesting ways.

Page 30: Unveiling the web, making the implicit explicit

The entire conversion can be called at a single URL.

[Diagram: HTML and CSV go into YQL, which produces JSON, RSS, RDF and HTML]

YQL can also take data from CSV files or XML files on the web.

It can merge data.

The entire pipeline can be mapped onto one URL, making it transferable, open and very sharable.
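Merging is the part that is easiest to picture in code. The sketch below joins two small CSV data sets on a shared column in plain Python. Both data sets and their column names are invented, and this shows the general idea of a join rather than how YQL does it.

```python
import csv
import io

# Two invented data sets that share a "county" column.
SIGHTINGS_CSV = """county,status
Gadsden,confirmed
Decatur,scouted
"""

WEATHER_CSV = """county,rain_mm
Gadsden,12
Decatur,3
"""

def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))

def merge_on(key, left, right):
    """Inner join two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

merged = merge_on("county", read_rows(SIGHTINGS_CSV), read_rows(WEATHER_CSV))
print(merged)
```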

YQL is a tool that has come out of the hacker community.

It has great potential for science.

Just remember, put your data on the web.

<div id="important">Be nice about how you put it there ;)</div>

Page 31: Unveiling the web, making the implicit explicit

Citizen Science

Ok, we have looked at how emerging tools can help us join data together.

How they can help us add meaning and insight to the literature.

And how they can be used to make it easier to put our data onto the web in interesting ways.

Another emerging trend is the way in which we can connect people to that data.

And by people, I mean EVERYONE !!

Page 32: Unveiling the web, making the implicit explicit

BOINC based science

http://www.allprojectstats.com

> 2,000,000 people

> 5,500,000 CPUs

Systems that analyse data on a user’s computer while the computer is in screen saver mode have been around for a long time.

SETI@home is the most famous.

They have been adopted by millions of people.

Millions of computers have been used for doing science at home.

But this is a somewhat passive way to engage people.

Page 33: Unveiling the web, making the implicit explicit

10 000 sheep, Aaron Koblin, 2006

Tools like the Mechanical Turk (https://www.mturk.com/mturk/welcom) allow you to get people to do real world tasks for you.

Like drawing sheep (baaaa!).

Page 34: Unveiling the web, making the implicit explicit

image: Sloan Digital Sky Survey

Or classifying galaxies.

Page 35: Unveiling the web, making the implicit explicit

The Galaxy Zoo project created an intuitive web interface that allowed members of the general public to classify galaxies from the Sloan Digital Sky Survey.

They had a lot of galaxies that were too fuzzy for a computer to classify.

And they had too many for even a grad student to classify.

Page 36: Unveiling the web, making the implicit explicit

17 papers

1,000,000 galaxies

50,000,000 classified

150,000 people

In one year 150,000 people classified the one million fuzzy galaxies in the survey.

They did a lot of classification.

And Galaxy Zoo published a lot of papers as a result.
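With 50,000,000 classifications of 1,000,000 galaxies, each galaxy was looked at many times. One simple way to combine repeated answers is a majority vote, sketched below. The votes are invented, and this is an assumption about how such answers could be pooled, not a description of Galaxy Zoo's own analysis.

```python
from collections import Counter

# Invented votes: galaxy id -> labels given by different volunteers.
votes = {
    "galaxy-1": ["spiral", "spiral", "elliptical", "spiral"],
    "galaxy-2": ["elliptical", "elliptical", "spiral"],
}

def consensus(labels):
    """Return the most common label and the share of votes it received."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

results = {gid: consensus(labels) for gid, labels in votes.items()}
print(results)
```

The vote share is worth keeping alongside the label, because a galaxy that splits the crowd is interesting in its own right.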

Page 37: Unveiling the web, making the implicit explicit

Cooper, S. et al., 2010. Predicting protein structures with a multiplayer online game. Nature, 466(7307), 756-760.

The Foldit project turned protein folding into a game.

You get more points if you get your molecule into a lower energy state.

For many molecules this is too hard for computers to figure out.

After two years of people playing the game, the players had found solutions for a bunch of molecules that were not known before.

Page 38: Unveiling the web, making the implicit explicit

The last two examples were examples of data analysis.

You can also get people to collect data for you.

The Great Backyard Bird Count gets bird watchers to count birds.

Page 39: Unveiling the web, making the implicit explicit

They can make the best survey of bird populations, all across the US.

Page 40: Unveiling the web, making the implicit explicit

NoiseTube turns a mobile phone into a sensing device for measuring noise pollution.

Page 41: Unveiling the web, making the implicit explicit

The noise profiles of a bunch of cities have been mapped out by people using this software in ambient mode.

As more people get more powerful phones, what they will be able to measure will only be limited by the ingenuity of those looking for data.

We can already use a phone to record sound, time, location, images and motion.

(Some phones can even be used to make phone calls.)

Page 42: Unveiling the web, making the implicit explicit

image: flickr sybrenstuvel

But all of the things I’ve been talking about are not easy to do yet.

You need to really invest in building a platform, annotating your documents, or engaging with a community of people.

I believe that the tools that make these platforms possible will become easier to use.

The complexity will get abstracted away.

Tools will make it easy for us to engage people with our data, with each other, all helping science.

Page 43: Unveiling the web, making the implicit explicit

At Mendeley we want to build just such a tool.

Page 44: Unveiling the web, making the implicit explicit

Mendeley Desktop

We have built a tool that works on your computer to help you manage your research library.

Page 45: Unveiling the web, making the implicit explicit

Manage your research papers

It’s really good (you should check it out at Mendeley.com). We want it to be the best tool that is possible for helping you.

(actually that’s my job, I’m in charge of making the product better, so let me know what you think at [email protected] :P)

Page 46: Unveiling the web, making the implicit explicit

Mendeley aggregates research data in the cloud

But what is really cool is that we mirror your activity in the cloud.

We have a tool that is useful to you as an individual.

But when lots of you use it we can find out in real time what science is interesting!

Page 47: Unveiling the web, making the implicit explicit

By doing this, Mendeley makes science more collaborative and transparent

We want to make it easy for everyone to find out what the experts think are the important papers.

Page 48: Unveiling the web, making the implicit explicit

Real-time data on 28m research papers:

Thomson Reuters’ Web of Knowledge

Mendeley after 16 months:

And we already have information on lots of papers.

Page 49: Unveiling the web, making the implicit explicit

We can tell you what kind of people are reading a paper, and where they are from.

Page 50: Unveiling the web, making the implicit explicit

And just like Amazon can recommend books to you based on your behaviour, and the behaviour of everyone else, we have started making recommendations about research.

We are trying to make crowd-sourced recommendations for science easy, and we have an API, so we are trying to make it easy for you too.
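To show the flavour of behaviour-based recommendation, here is a toy sketch that suggests papers saved by readers whose libraries overlap with yours. The readers and paper IDs are invented, and this is a generic co-occurrence approach chosen for illustration. It is not Mendeley's algorithm and it does not use the Mendeley API.

```python
from collections import Counter

# Invented libraries: reader -> set of paper ids they have saved.
libraries = {
    "ana": {"p1", "p2", "p3"},
    "ben": {"p1", "p2", "p4"},
    "cat": {"p2", "p4"},
    "dan": {"p5"},
}

def recommend(reader, libraries, n=3):
    """Suggest papers saved by readers whose libraries overlap with `reader`'s."""
    mine = libraries[reader]
    scores = Counter()
    for other, theirs in libraries.items():
        if other == reader:
            continue
        overlap = len(mine & theirs)  # shared papers act as a similarity weight
        for paper in theirs - mine:
            scores[paper] += overlap
    # Drop papers that only come from readers with nothing in common.
    return [paper for paper, score in scores.most_common(n) if score > 0]

suggestions = recommend("ana", libraries)
print(suggestions)
```

A reader with nothing in common with anyone gets no suggestions, which is the usual cold-start problem with this kind of approach.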

We have BIG ideas, and we are really excited.

Come and help us make science easier to do at mendeley.com, I’d love to see you there.

Page 51: Unveiling the web, making the implicit explicit

image: flickr daviddmuir

In the future, I don’t think you will be asking yourself “how” can you use tools and platforms like the ones I’ve been describing.

They will become easy to use, and easy to utilise.

You will be asking yourself “why” should you use these things.

So let’s look at the benefits.

Page 52: Unveiling the web, making the implicit explicit

Costs of research Source: Research Information Network

This Research Information Network report from 2008 shows that a lot of time is spent looking for what to read.

And time is money.

If we can build a way for you to find what you need faster, we all save money :)

Page 53: Unveiling the web, making the implicit explicit

Lazer, D. et al., 2009. Social science. Computational social science. Science (New York, N.Y.), 323(5915),

Huang, Y., Contractor, N. & Yao, Y., 2008. CI-KNOW: recommendation based on social networks. In Proceedings of the 2008 international conference on Digital government research. Digital Government Society of North America, pp. 27-33.

If we can recommend people to each other as well as papers we can save on redundancy in research.

That’s what the tool that Huang and Contractor describe can help you do.

It’s helped people in cancer research get their work done faster.

Page 54: Unveiling the web, making the implicit explicit

CrystalEye: http://wwmm.ch.cam.ac.uk/crystaleye/

CrystalEye is a tool that extracts the crystallographic bond lengths reported in the literature.

You can compare your results with every other result.

If yours is very different, have you found something really interesting?

Or have you found an error?

By quickly being able to see the context of the information you have, you can more quickly understand it.
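One simple way to make "very different" concrete is to ask how many standard deviations your value sits from the reported ones. The bond lengths below are invented, and the three-sigma threshold is an arbitrary assumption. This is a sketch of the comparison, not something CrystalEye itself provides.

```python
from statistics import mean, stdev

# Invented bond lengths in angstroms, standing in for values mined from the literature.
reported = [1.52, 1.53, 1.54, 1.53, 1.52, 1.54, 1.53]

def z_score(value, sample):
    """How many sample standard deviations `value` lies from the sample mean."""
    return (value - mean(sample)) / stdev(sample)

def looks_unusual(value, sample, threshold=3.0):
    """Flag values further than `threshold` standard deviations from the mean."""
    return abs(z_score(value, sample)) > threshold

print(looks_unusual(1.53, reported), looks_unusual(1.60, reported))
```

A flag like this does not say whether you have a discovery or an error. It only tells you that it is worth a second look.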

(http://wwmm.ch.cam.ac.uk/crystaleye/summary/acs/inocaj/2009/10/index.html)

Page 55: Unveiling the web, making the implicit explicit

image: flickr matthewfield

But for me the most exciting thing is these people.

Page 56: Unveiling the web, making the implicit explicit

image: flickr matthewfield

We can make them into scientists.

Look at the last author on the foldit paper.

I wish I had a paper in Nature.

I wish I’d played that foldit game, don’t you?

Page 57: Unveiling the web, making the implicit explicit

Humans

Machines

Professional

Amateur

Science Blogging/Tweeting/Social Communities

Data Collection

Analysis

Academic Papers/Annotation

Reading Academic Papers

Data Mining/Linking

Data Processing

So you see, there are lots of ways to connect people.

Page 58: Unveiling the web, making the implicit explicit

The Future

I wanted to end with a few more thoughts about future trends.

The first one I want to talk about is that we are going to need to be more open about science.

Page 59: Unveiling the web, making the implicit explicit

GISTEMP

Global Temperature Anomaly

(and we match this)

slide from: clear climate code

When the Intergovernmental Panel on Climate Change reported their results ...

Page 60: Unveiling the web, making the implicit explicit

Motivation

xkcd.com

slide from: clear climate code

Lots of people said that it was a fix-up, that the data could not be reproduced, and that the old Fortran code that produced that graph could never be run.

Page 61: Unveiling the web, making the implicit explicit

Code Metrics

GISS ccc-gistemp

slide from: clear climate code

Indeed, the code was a mess. That’s the composition of the code on the left.

Some interested computer programmers (NOT SCIENTISTS, JUST NORMAL PEOPLE WHO WERE INTERESTED) rewrote the code in Python.

Sorry for shouting just there, but that’s so important. Not scientists, not the custodians of reproducibility.

And the reason is that you don’t get credit in science for rewriting code.

But these computer programmers thought it was an important enough issue, the potential destruction of mankind, and they were not looking for scientific accreditation.

So they proved you could run the original code.

And they vastly improved it (that’s their code in the middle).

You can go and tell them how awesome you think they are over at http://clearclimatecode.org/

Page 62: Unveiling the web, making the implicit explicit

Independent Analyses

Graphic courtesy Zeke Hausfather

slide from: clear climate code

And here is the proof.

So if you make your data open, you also really have to make the methods and the code and all the nitty gritty open too.

Otherwise you steal away the context.

And we will forget.

And the knowledge that you know is so important.

Will be lost.

Page 63: Unveiling the web, making the implicit explicit

image: flickr doug88888

I think another interesting trend will be that the world will start talking to us.

London Bridge talks to us.

(Hi Tom, ~waves~).

Page 64: Unveiling the web, making the implicit explicit

image: flickr flyingsinger

Asteroids are talking to us.

Page 65: Unveiling the web, making the implicit explicit

image: flickr scottkinmartin

From botanicalls.com you can even get something to put into your plant pot that will make your plant talk to you.

Page 66: Unveiling the web, making the implicit explicit

King, R. D., Rowland, J., Oliver, S. G., Young, M., Aubrey, W., Byrne, E., Liakata, M., et al. (2009). The automation of science. Science, 324(5923), 85-89. AAAS. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/19342587

With all of this data available, machines like the one King et al. created will get more powerful.

You feed it data, and it doesn’t just analyze the data.

It creates hypotheses.

And they are correct.

Computers are going to start doing science.

I hope we can be friends.

Page 67: Unveiling the web, making the implicit explicit

image: flickr simon

Bradley W. Schenck

The last idea I have for you is a 3-d printer that can print itself.

It’s slow, but the internet used to be slow too.

In 1982 it would take 400 hours to transmit 1 song.

In 1990 it still took 1 hour.

Right now it takes a week to print all the bits you need to make another 3-d printer.

But imagine a future where you could email your lab to someone.

And they could print it.

Page 68: Unveiling the web, making the implicit explicit

image: http://www.flickr.com/people/marcelgermain/

The End

Thank you very much for taking the time to read through my ideas.

I’m Ian Mulvany and you can follow me @ianmulvany.