Automated Rogue Behavior Detection for Android Applications · 2020. 5. 10. · to rectify. The experimental results show that the precision of rogue application detection reached

Automated Rogue Behavior Detection for AndroidApplications

Shuangmin Zhang, Ruixuan Li�, Junwei Tang, Xiwu GuSchool of Computer Science and Technology

Huazhong University of Science and TechnologyWuhan, China

E-mail: {shmzhang, rxli, jwtang, guxiwu}@hust.edu.cn

Abstract—There are a large number of third-party applicationmarkets that provide application download services for Androidusers. In order to improve users’ satisfaction, major applicationmarkets urgently need an automated solution to avoid somerogue behaviors that affect users’ experience, such as rogueadvertisements that induce users to click, rogue pop-up boxesthat cannot be closed normally, and rogue floating windowsthat affect users’ experience. To address such problems, wepropose a rogue behavior detection framework. We use theobject detection approach to identify advertisements in screenviews, the random forest method to identify the pop-up views,and then combine image analysis, natural language processingand heuristic methods to detect rogue behaviors. The proposedframework can also record evidences of rogue behaviors inapplications, so that application markets can ask developersto rectify. The experimental results show that the precision ofrogue application detection reached 96.7% and the recall reached90.1%.

I. INTRODUCTION

In the last decade, it has entered the era of mobile devices.According to the mobile operating system market share inMarch 2020, Android system has the highest share at 72.26%[1]. The huge user base of the Android system has alsospawned a large number of feature-rich Android application-s. Users with high security awareness generally choose todownload applications from large application markets, becausemost of them perform virus detection before allowing theapplications to be online. However, there are still a largenumber of problematic applications that can cause trouble tousers in application markets.

Research on Android applications mainly focuses on safetyissues that pose a significant risk, such as malware detection[2], privacy leak detection [3] and ad fraud detection [4].Little attention has been paid to user experience of usingAndroid applications. For the detection of rogue ads (a kind ofrogue behavior), the detection of ads in the screen views is akey point. Ad detection methods mainly adopt traffic analysismethod [4] [5]. These methods can only identify the screenviews containing ads, but cannot give the specific location ofthe ads. Hence, an efficient method is needed to detect thelocation of the ads in screen views. In order to improve users’experience and satisfaction, an automated solution is urgently

DOI reference number: 10.18293/SEKE2020-089

needed to identify rogue behaviors that affect users’ experiencein Android applications, including some rogue behaviors thatonly affect users’ experience without causing major securityissues.

In this paper, we propose a framework to find roguebehaviors that affect users’ experience. The contributions ofour work are as follows. We propose an object detectionapproach based on deep learning approach to detect ads inscreen views. It can be used to serve rogue ad detection. Wepresent a classifier that can divide the screen views into thosecontaining and not containing pop-up boxes based on randomforest. It can be used to serve rogue pop-up box and rogue addetection. We implement the rogue ad detection module basedon natural language processing and heuristic rules, the roguepop-up box detection module based on heuristic rules, andthe rogue floating window detection module based on imageanalysis and heuristic rules.

II. DEFINITION AND CATEGORIES OF ROGUEBEHAVIORS

A. Definition of Rogue BehaviorRogue Behavior: A kind of application behavior that

indirectly affects the user’s mobile device, making the userunable to use the mobile device conveniently, and bringingpotential threat to the user’s mobile device. It has no directdamage to the system after the execution, nor causing theinfringement of user’s personal information and fees [6].

B. Categories of Rogue Behavior1) Rogue Ad: Ads with contents that induce users to

click. Some ads often display some inductive information toinduce users to click. As shown in Fig. 1(a), the bottom of theleft screen view is an ad. The ”Clear Memory” and ”Close” inthe ad view are fake buttons. Such ads can easily mislead usersto think that their devices need to clear memory. When a userclicks on the green fake button, the installation package startsto be downloaded. Ads that overlay clickable components.To improve the click rate of ads, applications may pop upan ad pop-up box immediately after a normal pop-up box.The original intention of the user is to click the button ofthe normal pop-up box, but at this time, the ad pop-up boxpops out suddenly, and the user is highly likely to click onthe ad by mistake, as shown in Fig. 1(b). Ads that appear

before application exit. After a user’s click on the exit key,the application will usually pop up a prompt box to confirmif the user really wants to exit. When the user clicks the”Exit” button, the user’s intention is to leave the application.However, some applications show an ad after clicking the”Exit” button, as shown in Fig. 1(c).

2) Rogue Pop-up Box: Some developers may use pop-upbox inappropriately, causing trouble to users. As shown in Fig.1(d), the prompt pop-up box of the application only providesone button means ”update”, without the button to close thepop-up box. Through manual testing, even clicking the backkey of Android device still cannot exit, which will cause greattrouble to users.

3) Rogue Floating Window: Some applications misuse thefloating window to place ads for profit. Because a floatingwindow is floating on the normal application screen view,it will cover part of the contents in the application screenview, which will affect users’ experience. Android applicationdevelopers generally like to design the floating window usedfor advertising purposes as red-envelope-style to attract usersto click on the ad, as shown in Fig. 1(e).

Fig. 1. Rogue Behaviors.

III. METHODThe overall scheme of rogue behavior detection for Android

applications is shown in Fig. 2. The Android applicationtraversal module automatically runs applications and recordsinformation based on Appium1. The information of the screenviews and the events that caused the views’ transition arerecorded during the traversal process, as shown in Fig. 3.The main purpose of the Android application screen viewclassification module is to classify the application screenviews. This module will train a deep learning-based objectdetection model and a random forest-based classifier. Theobject detection model is used to identify ads in the screenviews. The random forest classifier is used to divide thescreen views into views containing pop-up box and those notcontaining pop-up box. The rogue ad detection module, therogue pop-up box detection module and the rogue floatingwindow detection module are used to identify rogue behaviorsin Android applications.

1Appium. http://appium.io/

Fig. 2. Overall Scheme of Rogue Behavior Detection.

Fig. 3. Recorded Information.

A. Ad DetectionWe design a lightweight network based on RetinaNet [7],

as shown in Fig. 4. The left is backbone, the medium isFeature Pyramid Networks (FPN) structure [8], the right areclassification and bound regression subnetworks. Each of L1and L2 is a convolution. L3, L4 and L5 are made up of 4,8 and 4 cells respectively, and each cell is a residual modulecomposed of three deep separable convolutions [9]. By FPNstructure, features of different scales are fused to combine thehigh-level information with the low-level information. Eachof the classification and bound regression branches containstwo residual modules consisting of two convolutions. Finally, aconvolution is added as the output layer. We design 24 anchorsof different scales and aspect ratios at each location. For theclassification branch, each anchor is classified as an ad or non-ad, so the number of the output channels is 48. For the boundregression branch, every four numbers make up the upper-leftand lower-right coordinates of the border, so the number ofthe output channels is 96.

Fig. 4. Ad Detection Model.

B. Pop-up View and Non-Popup View ClassificationWe divide the Android application screen views into views

containing pop-up box and those not containing pop-up box.We analyze the information of 1,000 Android applications,and summarize a series of features. These features includesome 0-1 features as well as continuous value features. Thesefeatures can be roughly divided into the following categories:component size features, component quantity features, compo-nent location features, keyword features and the proportions ofcomponents with specific attributes. The extracted features aresuitable for decision tree classifier, so we use a random forestclassifier based on decision tree. The random forest classifierhas higher accuracy by combining the results of multiple baseclassifiers.

C. Heuristics-based Detection of Rogue Behaviors1) Rogue Ad Detection: Detection of ads with contents

that induce users to click. We first get the ad images croppedfrom the screenshots according to the ad detection result, andthen we use the Optical Character Recognition (OCR) APIprovided by Baidu2 to extract the texts in ad images. Then,we analyze the semantics of the advertising texts. In orderto remove the interference of the texts outside the ad, weadopt difference set of OCR texts and view texts. Due toChinese has multiple expressions of the same meaning, weneed to understand the degree of similarity between differentwords. We adopt neural network word vectors for wordrepresentation, so we can identify other inductive sentencesof similar meaning. The detection can be considered as anadvertising text classification task. We adopt FastText [10] asthe classification model, because it is based on word vectorsand maintains high accuracy while training and testing timeare greatly reduced. Detection of ads that overlay clickablecomponents. In order to overlay the clickable components, thead view should have an ad that occupies more than 30% ofthe screen space. The previous screen view of the ad view isa pop-up view that contains at least one clickable component.Detection of ads that appear before the application exits.Generally, before exiting the application, users will be asked ifthey really want to quit. There are two situations at this time.One is that the user clicks the ”Exit” button, then will lead toexit. In normal circumstances, this prompt view should be the

2Baidu OCR. https://ai.baidu.com/tech/ocr/general

last view in the traversal process. If an ad view appears afterthat, this ad is a rogue ad. Another situation is that the userclicks the ”Cancel” button, then will go back to the previousview. Because the previous view may contain ads, the ads thatappear at this time should not be considered as rogue ads.

2) Rogue Pop-up Box Detection: The detection of pop-up boxes that force applications to upgrade requires thecooperation of the automated traversal module. In automatedtraversal process, we remove events of update buttons. If apop-up view contains ”update”, ”upgrade”, ”download” orother texts with similar meanings, and the event executed onthis view is ”Back” type event, and the ID of the next viewis equal to this view, the view is considered to have a pop-upbox that forces users to update the application.

3) Rogue Floating Window Detection: The most intuitivefeature of rogue floating windows is that they have red-envelope-style small icons that are approximately square. Weconsider components with an aspect ratio between 0.8 and 1.2to be regarded as approximate square, and extract the pixelvalue of each pixel for such components. The screen viewsthat contain such components with visual red pixels accountingfor more than 20% and visual yellow pixels accounting formore than 3% are considered as those likely to have red-envelope-style floating windows. It is found that rogue floatingwindows often exist in multiple screen views. Therefore, wefurther consider the context information of the current screenview. If one of the surrounding screen views also containsred-envelope-style icon, we believe it contains rogue floatingwindow.

IV. EXPERIMENTAL EVALUATION

A. DatasetThe dataset used in our work is from the application market

named ”Mango Download Station”3. We collected a totalof 4,000 applications. For Android application ad detection,500 applications were randomly selected as the dataset. Forthe classification of views containing pop-up box and viewsnot containing pop-up box, 1,000 applications were randomlyselected as the dataset. For the detection of rogue behaviors,the remaining 3,000 applications were selected as the test set,which did not include the training set used for ad detectionand the dataset used for the classification of screen views.

B. Result of Ad DetectionWe use 300 applications as the training set, and the re-

maining 200 applications as the validation set. The validationresults are shown in Fig. 5. Fig. 5(a) shows the precisioncurve. Fig. 5(b) shows the recall curve. Fig. 5(c) shows theF1 curve. Fig. 5(d) shows the precision-recall curve. Theabscissa represents the recall under different score thresholds,the ordinate represents the corresponding accuracy, and thearea under the curve represents the Average Precision (AP).Here, the score threshold means that during evaluation, apredicted box with a score lower than the threshold will be

3Mango Download Station. http://www.90370.com

thought as a negative case, otherwise as a positive case. Theevaluation was conducted with an Intersection Over Union(IOU) of 0.75, which means that the predicted box and theground truth box are considered to be matched if their IOU isgreater than 0.75. Based on the results and combined withthe need to identify ads in our work, the score thresholdwith higher recall rate is selected under the condition that theprecision and F1 value are not too low. The score threshold weuse for ad detection is 0.42. The AP is 0.808 on the validationset. The effect of ad detection is good enough to be used inour work to serve rogue ad detection.

Fig. 5. Precision, Recall, F1 and P-R Curve.

C. Result of Pop-up View and Non-Popup View Classifica-tion

In this experiment, 1,000 applications were randomly se-lected as the dataset. For the evaluation of the classifier, the10-fold cross-validation method is used. We randomly dividedthe data into 10 parts, one of which was taken as the validationset and the other 9 parts as the training set. The experimentwas conducted for 10 times, and the arithmetic mean value ofthe results was taken as the final result. The results show thatthe average accuracy of the classification is 98%. The effect ofthe classifier is good enough to be used in our work to serverogue pop-up box detection and rogue ad detection.

D. Result of Rogue Behavior Detection

By manually marking the data set, 131 of the 3000 appli-cations in the test set were marked as having rogue behaviors.Using our method to test the 3,000 applications, the resultsof rogue behavior detection are shown in Table I. The resultsshow that 122 applications have rogue behaviors, of which118 applications do have rogue behaviors, and 4 do not.13 applications with rogue behaviors were not identified. Insummary, the precision of rogue behavior detection in ourwork is 96.7%, and the recall is 90.1%.

TABLE IRESULT OF ROGUE BEHAVIOR DETECTION.

Predict LabelsRogue Apps Normal Apps

ActualLabels

Rogue Apps 118 13Normal Apps 4 2865

V. CONCLUSION AND FUTURE WORK

In this paper, we propose an effective rogue behaviordetection framework for Android applications. We implementad detection, screen view classifier, rogue ad detection, roguepop-up box detection, and rogue floating window detection.Using the methods we proposed, 118 Android applicationscontaining rogue behaviors were found in 3,000 applications.The results show that the precision of rogue applicationdetection reached 96.7% and the recall reached 90.1%.

We only identify five types of rogue behaviors, and otherrogue behaviors may exist in reality. Meanwhile, new roguebehavior is still emerging. The methods proposed in this paperis scalable, and this study can be easily expanded to new roguebehaviors.

VI. ACKNOWLEDGEMENT

This work is supported by the National Key Researchand Development Program of China under grants 2016YF-B0800402 and 2016QY01W0202, National Natural ScienceFoundation of China under grants U1836204 and U1936108.

REFERENCES

[1] statcounter, “Mobile operating system market share worldwide,” https://gs.statcounter.com/os-market-share/mobile/worldwide, 2020.

[2] Y. S. Sun, C.-C. Chen, S.-W. Hsiao, and M. C. Chen, “Antsdroid:automatic malware family behaviour generation and analysis for androidapps,” in Australasian Conference on Information Security and Privacy.Springer, 2018, pp. 796–804.

[3] W. Enck, P. Gilbert, S. Han, V. Tendulkar, B.-G. Chun, L. P. Cox,J. Jung, P. McDaniel, and A. N. Sheth, “Taintdroid: an information-flow tracking system for realtime privacy monitoring on smartphones,”ACM Transactions on Computer Systems (TOCS), vol. 32, no. 2, pp.1–29, 2014.

[4] F. Dong, H. Wang, L. Li, Y. Guo, T. F. Bissyande, T. Liu, G. Xu, andJ. Klein, “Frauddroid: Automated ad fraud detection for android apps,” inProceedings of the 2018 26th ACM Joint Meeting on European SoftwareEngineering Conference and Symposium on the Foundations of SoftwareEngineering, 2018, pp. 257–268.

[5] J. Crussell, R. Stevens, and H. Chen, “Madfraud: Investigating ad fraudin android applications,” in Proceedings of the 12th annual internationalconference on Mobile systems, applications, and services, 2014, pp.123–134.

[6] C. C. S. Association, “Description format for mobile internet maliciouscode,” http://www.ptsn.net.cn/standard/std query/show-yd-3983-1.htm,2013.

[7] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, “Focal lossfor dense object detection,” in Proceedings of the IEEE internationalconference on computer vision, 2017, pp. 2980–2988.

[8] Q. Zhao, T. Sheng, Y. Wang, Z. Tang, Y. Chen, L. Cai, and H. Ling,“M2det: A single-shot object detector based on multi-level featurepyramid network,” in Proceedings of the AAAI Conference on ArtificialIntelligence, vol. 33, 2019, pp. 9259–9266.

[9] F. Chollet, “Xception: Deep learning with depthwise separable convolu-tions,” in Proceedings of the IEEE conference on computer vision andpattern recognition, 2017, pp. 1251–1258.

[10] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jegou, and T. Mikolov,“Fasttext. zip: Compressing text classification models,” arXiv preprintarXiv:1612.03651, 2016.

Documents

Automated Rogue Behavior Detection for Android Applications · 2020. 5. 10. · to rectify. The experimental results show that the precision of rogue application detection reached