An Empirical Study of Out-dated Third-party Code in Open Source Software
Pei Xia, Inoue Lab, Department of Computer Science, Graduate School of Information Science & Technology, Osaka University
2013/02/12


Page 1

Department of Computer Science, Graduate School of Information Science & Technology, Osaka University

An Empirical Study of Out-dated Third-party Code in Open Source Software

Pei Xia
Inoue Lab

2013/02/12

Page 2

Third-party Code in OSS

Developers reuse 3rd-party code from existing open source projects [1]

[1] S. Haefliger, G. von Krogh, and S. Spaeth. "Code Reuse in Open Source Software", Management Science, Vol. 54, No. 1, Jan. 2008.

[Figure: user projects reuse code from third-party projects such as zlib, libpng, libjpeg, libxml2, openssl, ……]

Page 3

Out-dated Third-Party Code

Third-party code from an older version that contains known defects, such as software vulnerabilities, which should be fixed by upgrading to a newer version

[Figure: timeline of a 3rd-party project (v1.0, v1.1, v1.2, v2.0, v2.1) with bugs reported along the timeline; user projects reuse code from different versions]

No Existing Research

Page 4

Research Questions

What is the proportion of out-dated 3rd-party code reused in open source software?

What are the potential defects caused by such reuse?

How do user projects manage this out-dated 3rd-party code?

The answers should help in understanding OSS reuse activities, evaluating the quality of OSS, and predicting some potential defects

Page 5

Study Approach Overview

[Figure: timeline of a 3rd-party project (v1.0, v1.1, v1.2, v2.0, v2.1) with reported bugs, and user project repositories, annotated with the four steps below]

1. Defects Information Collection
2. Projects Searching
3. Version Identifying
4. Management Information Collection

Page 6

Step 1 : Defects Information Collection

Home page announcement
National Vulnerability Database [2]
  • The U.S. Government repository of standards-based vulnerability management data

[2] National Vulnerability Database, http://nvd.nist.gov/

Page 7

Step 2 : Projects Searching

Using OpenCCFinder [3] to search

[3] P. Xia, Y. Manabe, N. Yoshida, and K. Inoue. Development of a code clone search tool for open source repositories. IPSJ SIG Technical Reports, Vol. 2011-SE-174, No. 2, pp. 1-8, 2011.

Page 8

Step 3 : Version Identifying

[Figure: every file in each release of the third-party project (v1.0, v1.1, v1.2, v2.0, v2.1) is tokenized and hashed. The reused files in each user project's repository are tokenized and hashed the same way, and the release whose file hashes match gives the reused version (v1.1 and v2.0 in the example), which is compared with the latest version.]

Tokenization
    // some comment
    public static void main(){ int a=0; a=a+1;}
  becomes
    publicstaticvoid$(){int$=$;$=$+$;}

Hashing
    publicstaticvoid$(){int$=$;$=$+$;}
  becomes
    197770261
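
The tokenize-and-hash matching on this slide can be sketched roughly as follows. This is only an illustration under assumptions, not the tool used in the study: the function names (`tokenize`, `file_hash`, `identify_versions`), the small keyword set, and the use of SHA-1 are all hypothetical choices, and the slide does not say which hash function produced values such as 197770261.

```python
import hashlib
import re

# Small illustrative keyword set (an assumption); a real tokenizer would use
# the full keyword list of the target language.
KEYWORDS = {"public", "static", "void", "int", "return", "if", "else", "for", "while"}


def tokenize(source):
    """Drop comments and whitespace; replace identifiers and literals with '$'."""
    source = re.sub(r"//[^\n]*|/\*.*?\*/", "", source, flags=re.S)
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return "".join(
        tok if tok in KEYWORDS or not re.fullmatch(r"\w+", tok) else "$"
        for tok in tokens
    )


def file_hash(source):
    """Hash the tokenized form of one file (SHA-1 is an assumed stand-in)."""
    return hashlib.sha1(tokenize(source).encode("utf-8")).hexdigest()


def identify_versions(user_files, releases):
    """Return the release names that share the most file hashes with user_files.

    user_files: list of file contents found in the user project.
    releases:   dict mapping a release name to a list of file contents.
    Several releases can tie when the matching files did not change between them.
    """
    user_hashes = {file_hash(src) for src in user_files}
    scores = {
        version: len(user_hashes & {file_hash(src) for src in files})
        for version, files in releases.items()
    }
    best = max(scores.values(), default=0)
    return sorted(v for v, s in scores.items() if s == best and s > 0)
```

Because identifiers and literals are normalized to `$`, a copy with renamed variables or added comments still hashes to the same value, while a structural change between releases produces a different hash.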

Page 9

Step 4 : Management Information Collection

Questions on reused 3rd-party code
  • Modified or Copy&Paste?
  • Keep updating?
  • Well managed?

Manual investigation
  • Directory structure and file name
  • Repository commit history
  • readme.txt
  • changelog.txt

Page 10

Case study

Subject

Project Name    Domain              Project History
zlib            Data compression    1995-current
libcurl         File transfer       1999-current
libpng          Graphics            1995-current

Page 11

Case Study Result (1/5)

What is the proportion of out-dated 3rd-party code reused in open source software?

[Figure: three bar charts, zlib (45), libcurl (28), and libpng (50). x-axis: Reused Versions of 3rd-party code; y-axis: # Projects using 3rd-party code. Legend: Vulnerabilities reported; Warning from homepage; No defects reported]

Page 12

Case Study Result (2/5)

What is the proportion of out-dated 3rd-party code reused in open source software?

Project    # investigated projects    # projects containing out-dated 3rd-party code    Out-dated code percentage
zlib       45                         14                                                31.11%
libcurl    28                         24                                                85.71%
libpng     50                         46                                                92.00%
total      123                        84                                                68.29%
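
The percentages in this table follow directly from the two count columns. A minimal sketch that recomputes them is shown here; the names `COUNTS`, `outdated_percentage`, and `summarize` are hypothetical, and the counts are copied from the table. The total works out to 84/123 ≈ 68.29%, which rounds to the 68.3% quoted in the conclusion.

```python
# (investigated projects, projects containing out-dated 3rd-party code),
# copied from the table above.
COUNTS = {"zlib": (45, 14), "libcurl": (28, 24), "libpng": (50, 46)}


def outdated_percentage(investigated, outdated):
    """Share of investigated projects that contain out-dated code, in percent."""
    return round(100.0 * outdated / investigated, 2)


def summarize(counts):
    """Per-library percentages plus the percentage over all projects."""
    rows = {name: outdated_percentage(n, k) for name, (n, k) in counts.items()}
    total_n = sum(n for n, _ in counts.values())
    total_k = sum(k for _, k in counts.values())
    rows["total"] = outdated_percentage(total_n, total_k)
    return rows
```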

Page 13

Case Study Result (3/5)

What are the potential defects caused by such reuse?

zlib version      Reported defects
v1.1.3            CVE-2002-0059, VU#368819, CA-2002-07
v1.1.4            CVE-2003-0107, VU#142121
v1.2.1, v1.2.2    CVE-2004-0797, VU#238687
v1.2.1, v1.2.2    CVE-2005-2096, VU#680620
v1.2.2            CVE-2005-1849
v1.2.4            Bug fixed; update suggested on the project homepage

• Example CVE-2005-1849: inftrees.h in zlib 1.2.2 allows remote attackers to cause a denial of service (application crash) via an invalid file that causes a large dynamic tree to be produced.
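
Once a reused version has been identified, checking it against the reported defects is a simple lookup. The sketch below is only illustrative: `REPORTED_DEFECTS`, `reported_defects`, and `is_outdated` are hypothetical names, the entries are copied from the zlib table on this slide, and treating "any reported defect" as the out-dated criterion is an assumption based on the definition given earlier.

```python
# Reported defects per zlib version, copied from the table on this slide.
REPORTED_DEFECTS = {
    "v1.1.3": ["CVE-2002-0059", "VU#368819", "CA-2002-07"],
    "v1.1.4": ["CVE-2003-0107", "VU#142121"],
    "v1.2.1": ["CVE-2004-0797", "VU#238687", "CVE-2005-2096", "VU#680620"],
    "v1.2.2": ["CVE-2004-0797", "VU#238687", "CVE-2005-2096", "VU#680620",
               "CVE-2005-1849"],
    "v1.2.4": ["Update suggested on the project homepage"],
}


def reported_defects(version):
    """Return the defects reported for a reused version (empty list if none)."""
    return REPORTED_DEFECTS.get(version, [])


def is_outdated(version):
    """Assumed criterion: a version is out-dated if any defect is reported for it."""
    return bool(reported_defects(version))
```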

Page 14

Case Study Result (4/5)

How do user projects manage this out-dated 3rd-party code?

[Figure: pie chart "Whether well managed". Labels: keep updating 15%; reverted 2%; other 16%; no version info 28%; have version info 72%]

Page 15

Case Study Result (5/5)

How do user projects manage this out-dated 3rd-party code?
  • 96 (78.0%) of user projects reused the third-party code by copy and paste
  • 6 (4.9%) of user projects changed directory names or mixed the third-party code with other code

Page 16

Conclusion

In this study, 68.3% of the investigated open source projects (84 of 123) reuse out-dated third-party code that contains critical defects.

More than half of the open source projects did not manage the third-party code very well.

Page 17

Future work

Develop a 3rd-party code management system
  • Version identifying
  • Defects prediction
  • Automatic updating

Page 18

Q&A