White list based approach:
(2007) developed a system based on the white list approach that prevented
access to phishing sites by checking URLs. When a user entered a website, the
website’s URL and IP address were paired. This pair was then sent to the Access
Enforcement Facility (AEF), which checked the URL against the trusted list; if
the URL matched, the IP address was also checked. If the address also matched,
the user was allowed to proceed; otherwise the user was warned.
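The two-stage check described above can be sketched as follows. This is a minimal illustration, not the original system: the `TRUSTED` table, the function name `check_access`, and the example host and addresses are all hypothetical, and the IP address is passed in rather than resolved over the network.

```python
from urllib.parse import urlsplit

# Hypothetical trusted list: hostname -> set of known-good IP addresses.
TRUSTED = {
    "bank.example": {"192.0.2.10", "192.0.2.11"},
}

def check_access(url, ip, trusted=TRUSTED):
    """Return 'allow' only if the URL's host and its IP both match the trusted list."""
    host = (urlsplit(url).hostname or "").lower()
    known_ips = trusted.get(host)
    if known_ips is None:      # stage 1: URL is not on the white list
        return "warn"
    if ip not in known_ips:    # stage 2: URL matches but the IP does not
        return "warn"
    return "allow"
```

For example, `check_access("https://bank.example/login", "192.0.2.10")` allows the visit, while the same URL paired with an unlisted address produces a warning.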
In 14, JungMin Kang and DoHoon Lee described an approach that detected phishing
based on users’ online activities. This method maintained a white list as part
of the user’s profile. The profile was dynamically updated whenever the user
visited a website. An engine classified a website by computing a score and
comparing it with a threshold score. The score was calculated from the
entries available in the user profile and details of the current website.
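The score-and-threshold idea can be illustrated with a small sketch. The source does not give the scoring formula, so everything below is an assumption made for illustration: the profile layout, the weights (0.6 for a known IP, 0.4 for visit frequency), the cap of five visits, and the threshold of 0.7 are hypothetical.

```python
from urllib.parse import urlsplit

def record_visit(profile, url, ip):
    """Dynamically update the per-user profile after a visit."""
    host = urlsplit(url).hostname or ""
    entry = profile.setdefault(host, {"visits": 0, "ips": set()})
    entry["visits"] += 1
    entry["ips"].add(ip)

def site_score(profile, url, ip):
    """Assumed score in [0, 1], built from profile entries and the current site."""
    host = urlsplit(url).hostname or ""
    entry = profile.get(host)
    if entry is None:
        return 0.0
    ip_match = 1.0 if ip in entry["ips"] else 0.0
    familiarity = min(entry["visits"], 5) / 5   # saturates after 5 visits
    return 0.6 * ip_match + 0.4 * familiarity

def classify(profile, url, ip, threshold=0.7):
    """Compare the score with the threshold score."""
    return "trusted" if site_score(profile, url, ip) >= threshold else "suspicious"
```

Under these assumed weights, a familiar host served from an address never seen before scores at most 0.4 and is flagged, which mirrors the URL-then-IP logic of the white list approaches above.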
In 2012, an Automated Individual White-List (AIWL) was proposed to protect users’
online credentials. This method maintained a list of the legitimate websites
that a user visits, rather than a list of all the websites available on the
internet. The list also recorded features of the websites where a user often
enters credentials. This information helped protect the user from different
online frauds.
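A per-user list of this kind might be sketched as below. The source does not say which website features are recorded, so the choice of IP address and login form field names is an assumption, and `LoginRecord` and `IndividualWhiteList` are hypothetical names rather than parts of AIWL itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoginRecord:
    host: str
    ip: str
    form_fields: frozenset  # assumed feature: names of the login form inputs

class IndividualWhiteList:
    """Holds only the sites where this user actually enters credentials."""

    def __init__(self):
        self._records = {}

    def learn(self, host, ip, form_fields):
        self._records[host] = LoginRecord(host, ip, frozenset(form_fields))

    def is_safe_to_submit(self, host, ip, form_fields):
        # Safe only if the host is known and every recorded feature still matches.
        record = self._records.get(host)
        return record == LoginRecord(host, ip, frozenset(form_fields))
```

Because the list covers only the sites one user logs in to, it stays small, and a credential submission to any page whose features differ from the stored record can be flagged.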
Visual similarity based approaches are broadly
classified into the following:
Document object model (DOM) tree:
The DOM is a language-independent, multiplatform convention for representing
objects in XML and HTML documents. It captures the logical structure of a
document in the form of a tree. In a DOM based phishing detection system, the
DOM tree of the suspicious webpage is compared with that of the legitimate web
page. Because the attacker usually tries to mimic the original legitimate web
page, the two page layouts, and hence the two trees, are expected to be similar.
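One simple way to compare two DOM trees is to collect the root-to-node tag path of every element and measure how much the two collections overlap. The sketch below uses Python's standard `html.parser`; the multiset Jaccard measure and the names `TagPathCollector`, `dom_paths` and `dom_similarity` are illustrative assumptions, not the method of any particular system surveyed here.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class TagPathCollector(HTMLParser):
    """Counts tag paths such as 'html/body/form/input'."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = Counter()

    def handle_starttag(self, tag, attrs):
        self.paths["/".join(self.stack + [tag])] += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

def dom_paths(html):
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths

def dom_similarity(html_a, html_b):
    """Multiset Jaccard similarity of tag paths, in [0, 1]."""
    a, b = dom_paths(html_a), dom_paths(html_b)
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0
```

A page that copies the layout of a legitimate login page but changes only the text scores 1.0 under this measure, so a high score on an unexpected domain is the suspicious signal.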
Cascading Style Sheet (CSS) similarity:
Cascading Style Sheets (CSS) is a language used to
describe the formatting of a document and to set the visual appearance of a
web page written in HTML or XML. CSS is used to style the contents of web
pages, such as fonts, page layout and colours.
In 2, Mishra and Gupta presented a hybrid solution
based on URL and CSS matching. This approach can detect embedded noise
content, such as an image in a web page, that is used to sustain the visual
similarity of the webpage.
To compare CSS similarity, they adopted the technique used in 3 by Jian Mao,
Pei Li, Kun Li, Tao Wei, and Zhenkai Liang.
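A CSS comparison can be illustrated with a deliberately naive sketch. It is not the technique of 3: the regular-expression parser handles only flat rules (no nested `@media` blocks), and the Jaccard measure over (selector, property, value) triples and the function names are assumptions chosen for illustration.

```python
import re

# Matches one flat rule: "selectors { declarations }".
RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")

def css_declarations(css):
    """Naively parse flat CSS into a set of (selector, property, value) triples."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop comments
    triples = set()
    for selectors, body in RULE.findall(css):
        for declaration in body.split(";"):
            if ":" not in declaration:
                continue
            prop, value = declaration.split(":", 1)
            for selector in selectors.split(","):
                triples.add((selector.strip(),
                             prop.strip().lower(),
                             value.strip().lower()))
    return triples

def css_similarity(css_a, css_b):
    """Jaccard similarity of the two declaration sets, in [0, 1]."""
    a, b = css_declarations(css_a), css_declarations(css_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0
```

Whitespace and letter case in properties and values do not affect the result, so a stylesheet copied and minified by an attacker still scores 1.0 against the original.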
The different types of visual features are text content and text features. Text
features include font colour, font size, background colour, font family and so
forth. This approach matches the visual features of different websites because the
attacker copies the page content from the actual website. A few of the visual
features of web page