ZENODO
Dataset . 2024
License: CC BY
Data sources: ZENODO, Datacite

PhishDecloaker Datasets

Authors: Teoh, Xiwen; Lin, Yun; Liu, Ruofan; Huang, Zhiyong; Dong, Jin Song

Abstract

This record contains the datasets that are part of the paper "PhishDecloaker: Detecting CAPTCHA-cloaked Phishing Websites via Hybrid Vision-based Interactive Models", published at USENIX Security'24.

Phishing Kit Dataset
Section: 2
Description: For empirical study.
Contents: 100 defanged PHP phishing kits representing the following list of brands:
1. Microsoft
2. Banco de Oro
3. Microsoft OneDrive
4. Deutsche Kreditbank
5. Adobe Acrobat
6. N26
7. Absa Group
8. DHL
9. Microsoft
10. Correos
11. Kempinski Summerland Hotel & Resort Beirut
12. Vantage West Credit Union
13. NetFlix
14. Agencia Tributaria
15. Square
16. Chronopost
17. PayPal
18. American Express
19. Allegro
20. LinkedIn
21. virtru
22. Citibank
23. AOL
24. Credit Agricole
25. Mercado Pago
26. Université de Pau et des Pays de l'Adour (UPPA)
27. Fifth Third Bank
28. Columbia Bank
29. Alibaba Mail
30. Microsoft OneDrive
31. Intesa Sanpaolo
32. Santander
33. America First Credit Union
34. Barclays
35. Interac
36. USPS
37. Wells Fargo
38. Yahoo
39. XFINITY
40. Berliner Sparkasse
41. OneDrive
42. Standard Bank
43. Wells Fargo
44. aruba.it
45. Bancolombia
46. Caisse d’Epargne
47. DubaiPay
48. Chase Bank
49. M&T Bank
50. Postmaster
51. Volksbanken Raiffeisenbanken
52. Facebook
53. Huntington Bank
54. Commonwealth Bank of Australia
55. Orange
56. shopify
57. Google Drive
58. WalletConnect
59. Meritrust Credit Union
60. Credit Agricole
61. Desjardins
62. Postbank
63. Dropbox
64. DocuSign
65. dpdgroup
66. L'Assurance Maladie
67. Adobe Acrobat
68. Global Sources
69. Microsoft Excel
70. SFR
71. FedEx
72. Citibank
73. Royal Credit Union
74. GoDaddy
75. ADP
76. International Card Services
77. Israeli Post
78. UNI Financial Cooperation
79. TD Bank
80. ATB Mobile
81. HSBC
82. Bank of Montreal
83. RBC Royal Bank
84. IONOS
85. AlaskaUSA Federal Credit Union
86. French Government
87. UOL SAC
88. Banco Itaú Paraguay
89. Amazon
90. Apple
91. AT&T
92. Australian Government
93. Bank of America
94. BNP Paribas
95. eBay
96. ING Group
97. Instagram
98. MetaMask
99. SingTel
100. Société Générale

Landscape Dataset
Section: 4.3
Description: For training the rotation CAPTCHA solver model.
Contents: 7,268 natural and man-made landscape images (320×180).
Format: JPEG images.

CAPTCHA Detection Dataset
Section: 5.2.1
Description: For training the CAPTCHA detection model.
Contents: 19,680 webpage screenshots (1920×1080), 10,680 with annotated CAPTCHA bounding boxes, 9,000 without.
Format: PNG images with annotations in PASCAL VOC and COCO format. All bounding boxes are labeled as the "CAPTCHA" class (no CAPTCHA type categorization).

CAPTCHA Recognition Dataset
Section: 5.2.2
Description: For training the CAPTCHA recognition model.
Contents: 6,612 CAPTCHA images distributed across 38 classes.
Format: PNG images with their corresponding class labels in CSV.
CAPTCHA classes:
1. baidu_slide_rotate
2. dingxiang_audio
3. dingxiang_click_area
4. dingxiang_click_difference
5. dingxiang_click_font
6. dingxiang_click_icon
7. dingxiang_click_vr
8. dingxiang_click_word
9. dingxiang_drag
10. dingxiang_slide_puzzle
11. dingxiang_slide_puzzle2
12. dingxiang_slide_rotate
13. geetest_checkbox
14. geetest_click_icon
15. geetest_click_phrase
16. geetest_click_word
17. geetest_game_playing
18. geetest_game_playing2
19. geetest_select
20. geetest_slide_puzzle
21. hcaptcha
22. hcaptcha_checkbox
23. netease_click_icon
24. netease_click_phrase
25. netease_click_vr
26. netease_click_word
27. netease_drag
28. netease_slide
29. press_and_hold
30. recaptchav2
31. recaptchav2_checkbox
32. tencent_slide
33. text_1
34. text_2
35. text_3
36. text_4
37. text_5
38. text_6

CAPTCHA Open-set Dataset
Section: 5.2.2
Description: For testing the CAPTCHA detection and recognition pipeline.
Contents: 1,100 webpage screenshots (1920×1080), all of which have annotated CAPTCHA classes spanning 11 different categories.
Format: PNG CAPTCHA and screenshot images with their corresponding class labels in CSV.
CAPTCHA classes:
1. arkose_select_2
2. capycaptcha_drag
3. dicecaptcha_qa
4. funcaptcha_select
5. funcaptcha_select_2
6. funcaptcha_select_3
7. funcaptcha_select_4
8. funcaptcha_select_5
9. funcaptcha_select_6
10. keycaptcha_drag
11. mtcaptcha_text

Ablation Dataset
Section: 5.4
Description: For training the CAPTCHA recognition model.
Contents: 722 webpage screenshots (1920×1080), 622 with CAPTCHAs spanning 38 classes, 100 without.
Format: PNG images with their corresponding bounding box and class labels in CSV. Class IDs 0-37 map directly to the class names in the CAPTCHA Recognition Dataset. Class ID 38 marks samples without CAPTCHAs.
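The class ID convention of the Ablation Dataset can be illustrated with a minimal Python sketch. It assumes that class ID i corresponds to entry i + 1 of the numbered CAPTCHA Recognition Dataset class list (the record only says the IDs "can be directly mapped", so the ordering is an assumption). The function name `class_name` and the label `no_captcha` are hypothetical and do not come from the dataset.

```python
# Class names in the order they are listed for the CAPTCHA Recognition Dataset.
# Assumption: zero-based class ID i refers to the (i + 1)-th entry of that list.
CAPTCHA_CLASSES = [
    "baidu_slide_rotate", "dingxiang_audio", "dingxiang_click_area",
    "dingxiang_click_difference", "dingxiang_click_font", "dingxiang_click_icon",
    "dingxiang_click_vr", "dingxiang_click_word", "dingxiang_drag",
    "dingxiang_slide_puzzle", "dingxiang_slide_puzzle2", "dingxiang_slide_rotate",
    "geetest_checkbox", "geetest_click_icon", "geetest_click_phrase",
    "geetest_click_word", "geetest_game_playing", "geetest_game_playing2",
    "geetest_select", "geetest_slide_puzzle", "hcaptcha", "hcaptcha_checkbox",
    "netease_click_icon", "netease_click_phrase", "netease_click_vr",
    "netease_click_word", "netease_drag", "netease_slide", "press_and_hold",
    "recaptchav2", "recaptchav2_checkbox", "tencent_slide",
    "text_1", "text_2", "text_3", "text_4", "text_5", "text_6",
]

NO_CAPTCHA_ID = 38                # per the record: samples without CAPTCHAs
NO_CAPTCHA_LABEL = "no_captcha"   # hypothetical label, not part of the dataset


def class_name(class_id: int) -> str:
    """Translate an Ablation Dataset class ID into a readable class name."""
    if class_id == NO_CAPTCHA_ID:
        return NO_CAPTCHA_LABEL
    if 0 <= class_id < len(CAPTCHA_CLASSES):
        return CAPTCHA_CLASSES[class_id]
    raise ValueError(f"unknown class ID: {class_id}")


if __name__ == "__main__":
    for cid in (0, 20, 37, 38):
        print(cid, class_name(cid))
```

If the CSV files use a different ordering, only the `CAPTCHA_CLASSES` list needs to change; the lookup logic stays the same.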

Keywords

Artificial intelligence, Computer vision
