DSpace Repository

Feature Extraction And Detection of Malicious URLs Using Deep Learning Approach


dc.contributor.author Kushwaha, Rajni
dc.date.accessioned 2022-02-03T08:00:09Z
dc.date.available 2022-02-03T08:00:09Z
dc.date.issued 2019-07
dc.identifier.citation 28p. en_US
dc.identifier.uri http://hdl.handle.net/10263/7268
dc.description Dissertation under the supervision of Dr. K.S Ray en_US
dc.description.abstract A phishing attack is one of the cyber-bullying activities carried out over the internet. Most phishing websites try to look similar to legitimate websites, and their web content and URL features mimic those of legitimate URLs. Because new techniques keep emerging, detecting and analyzing these malicious URLs is very costly due to their complexity. Traditionally, blacklisting and whitelisting have been used for detection, but these techniques are not well suited to real-time use. To address this, recent years have witnessed several efforts to perform malicious URL detection using machine learning. The most popular and scalable approaches use lexical properties of the URL string, extracting bag-of-words-like features and then applying machine learning models such as SVMs, Random Forest, etc. Various machine learning and deep learning techniques are used to improve generalization in malicious URL detection. These approaches suffer from several limitations: (i) inability to effectively capture semantic meaning and sequential patterns in URL strings; (ii) the need for substantial manual feature engineering; and (iii) inability to handle unseen features and generalize to test data. To address these limitations, this dissertation focuses on building a real-time, language-independent phishing detection model that analyzes the anatomy of URLs using deep learning techniques. To achieve this, we first try to find static and dynamic features manually, based on previous work. After obtaining the feature-valued data set, we learn lexical features of the URL with a CNN that takes both the characters and the words of the URL string as input in order to learn a URL embedding. We then merge the manually selected features with the features learned by the CNN and feed them to a Bi-LSTM model, which preserves the sequence information of the URL. A hybrid model of a CNN (convolutional neural network) and a bidirectional LSTM (Long Short-Term Memory) is used to achieve this goal.
Our model analyzes the URL without accessing the web content of the website, which eliminates the time latency. en_US
dc.language.iso en en_US
dc.publisher Indian Statistical Institute, Kolkata en_US
dc.relation.ispartofseries Dissertation;;2019-20
dc.subject Malicious URL en_US
dc.subject Feature Extraction en_US
dc.title Feature Extraction And Detection of Malicious URLs Using Deep Learning Approach en_US
dc.type Other en_US
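
The abstract describes combining manually chosen URL features with character-level and word-level views of the URL string. The Python sketch below illustrates only that preprocessing step, under assumptions of our own: the feature names, the character vocabulary, and the `max_len` value are hypothetical and are not taken from the dissertation. It uses only the standard library and does not implement the CNN or Bi-LSTM models.

```python
import re
import string
from urllib.parse import urlsplit

# Hypothetical character vocabulary (an assumption, not the dissertation's).
# Index 0 is reserved for padding and for characters outside the vocabulary.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def lexical_features(url):
    """Return a few illustrative hand-crafted lexical features of a URL."""
    host = urlsplit(url).hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # Crude check: four dot-separated groups of 1 to 3 digits.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
    }


def word_tokens(url):
    """Split a URL into lowercase alphanumeric tokens (word-level view)."""
    return [tok for tok in re.split(r"[^A-Za-z0-9]+", url.lower()) if tok]


def encode_chars(url, max_len=200):
    """Map a URL to a fixed-length list of character ids (character-level view)."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in url.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))
```

In a pipeline of the kind the abstract outlines, the character ids and word tokens would feed embedding layers, while the dictionary of lexical features would be concatenated with the learned representation. How that merge is actually done in the dissertation is not stated in this record.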

