
Appendix-I Sample List of Spam words
shodhganga.inflibnet.ac.in/bitstream/10603/94131/14/14_appendix.pdf


Appendix-I: Sample of Spam words


Appendix-I

Sample List of Spam words

___________________________________________________________

The sample of shortlisted words and phrases is:

‘NOKIA MOBILE LOTTREY DRAW’, ‘Promo Enlargement’, ‘BBC NATIONAL LOTTERY’, ‘UNITED KINGDOM LOTTERY’, ‘COCA COLA DRAW’, ‘Free Trial Men’s Supplement’, ‘WON POUNDS’, ‘job offers’, ‘UK-LOTTERY’, ‘huge stick’, ‘increase your length’, ‘desired proportion and size’, ‘Customer Survey’, ‘WON 500,000GBP’, ‘LOAN OFFER!!’, ‘WINNING NOTIFICATION..!!’, ‘making money’, ‘income going down’, ‘LOTTERY DRAW’, ‘Weight Loss’, ‘Diet’, ‘WON £750,000.GBP’, ‘SEX PILL’, ‘Buy Viagra at Half Price’, ‘Winner’, ‘MyDailyFlog!’, ‘HasDonated (£,,500,000.GBP)’.

UK-LOTTERY ORGANIZATION

Microsoft HasDonated (£ 1,500,000 GBP) To You.

C0NGRATULATION++ YOUR INTERNET EMAIL ID HAS WON SOME PRIZE FROM REEBOK AWARD PROMO 2011

URGENT PAYMENT INFORMATION FROM WESTERN UNION

NOKIA MOBILE PRODUCTION COMPANY INTERNATIONAL ONLINE

HCG Diet program burns FAT Fast. Weight loss is simple and easy with the HCG Diet Drops. Look for specials and save big $ on your next HCG Diet purchase
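A phrase list like the one above could be applied to incoming mail in many ways; the appendix does not describe the matching procedure, so the following is only a minimal sketch under stated assumptions. It assumes case-insensitive substring matching after whitespace normalization, and uses a small hypothetical subset of the listed phrases. The names `SPAM_PHRASES`, `normalize`, `matched_phrases`, and `spam_score` are illustrative and do not come from the source.

```python
import re

# Hypothetical subset of the phrases listed in this appendix (assumption:
# the real filter may use a different list and a different matching rule).
SPAM_PHRASES = [
    "BBC NATIONAL LOTTERY",
    "UNITED KINGDOM LOTTERY",
    "COCA COLA DRAW",
    "LOAN OFFER!!",
    "Weight Loss",
    "Buy Viagra at Half Price",
]


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def matched_phrases(message: str, phrases=SPAM_PHRASES) -> list:
    """Return the listed phrases that occur in the message, ignoring case."""
    haystack = normalize(message)
    return [p for p in phrases if normalize(p) in haystack]


def spam_score(message: str, phrases=SPAM_PHRASES) -> float:
    """Fraction of the listed phrases found in the message (0.0 to 1.0)."""
    if not phrases:
        return 0.0
    return len(matched_phrases(message, phrases)) / len(phrases)
```

Plain substring matching is easy to audit but is defeated by obfuscated spellings such as "C0NGRATULATION", which is one reason a keyword list is usually combined with other signals rather than used alone.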


Appendix-III: Blacklisted IP/Domain Names


Appendix-III

List of Black-listed IP or Domain Names

___________________________________________________________

The Black-listed IP or Domain names are:

['[email protected]',
'sgtjasoncampbell.campbell597@gmail.com',
'[email protected]',
'cocacola2011_awardpromo@rediffmail.com',
'[email protected]',
'mandy-[email protected]',
'[email protected]',
'lzh-[email protected]',
'[email protected]',
'lnx-@hotmail.com',


'mrleungcheung@hangsengonlinebank.tk',
'[email protected]',
'coca-[email protected]',
'[email protected]',
'pyrometerlaurentian@correctionscorp.com',
'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', 'dr-

[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', 'micro-

[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', 'man-

[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'micro-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'maikel-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'salescnx-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'mike-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'backgammonhalcyon@business-humanrights.org',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', '[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'lover-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'info@online-job-offer.co.cc',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'j-[email protected]',

'[email protected]',

'unsubscribe@tho',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'tpd-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', '[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'm--o--[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'lolooo-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'michael-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', '[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'coca-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'mumfordmedford@creativecommons.org',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'm24-2-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'agroseafoodinc@hotmail',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'love-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'lyric-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'info@deala-day.co.cc',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'revjamessmithclaimagent2011@gmail.com',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'europes_jackpotlotto_award@europe.com',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', '[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]', '[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'man-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'info@deal-extreme.co.cc',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'geraldjohnson.coca-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


Appendix-II

List of White-listed IP or Domain Names

___________________________________________________________

The white-listed IPs or domains are:

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]m',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',


'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'micro-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'maikel-[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',

'[email protected]',