Call for Volunteers for Analyzing Browser History for Anti-Phishing

We are conducting a research study that examines web browsing histories to discover and understand their commonalities. You are invited to participate. You must be aged 18 or older and have access to a computer running a web browser that is either Mozilla Firefox or Chromium-based (Google Chrome, Chromium, Brave).

Specifically, we are researching the distribution of website visits and commonalities of website visits across users to understand if sharing information about phishing targets will be effective.

For our experiment, we would like our volunteers to install a browser extension that will record their website visit history for a week, starting on March 27, 2020, and then share their anonymized history with us.

(You can install the extension at any time until the experiment is over.)

Your website visit history will consist of many entries in the form of {timestamp, anonymized website URL}. We describe how we anonymize a website URL below.

Recording your web browsing history involves private information. Our goal is to understand the commonalities between your web browsing history and that of other users in this study.

To protect the privacy of what you will share, we will anonymize your email address and visited websites by cryptographically hashing them. (This means we will replace each email address and website domain name component with an alphanumeric string that we cannot reverse, so we will not know either.)

The specific things we anonymize via cryptographic hashing are your email address and website URLs. Every website is specified by a URL, of which we will only record the domain. For each domain, we keep only the “generic” part of the domain (the right-most component) and cryptographically hash the other domain name components: www.example.com becomes xxxx.yyyy.com (xxxx and yyyy are each a unique, fixed-length, non-reversible string generated by a cryptographic hash function). Sometimes, we will keep the two right-most components (usc.edu, isi.edu, and generic second-level domains listed at https://publicsuffix.org/list/).
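For readers who want a concrete picture of the domain anonymization described above, here is a minimal Python sketch. It is an illustration only and not the extension's actual code: the choice of SHA-256, the 8-character truncation, the function names, and the small KEEP_TWO set (standing in for usc.edu, isi.edu, and the full Public Suffix List) are all assumptions made for this example.

```python
import hashlib

# Hypothetical, tiny stand-in for the Public Suffix List plus the
# institution domains named in the text (usc.edu, isi.edu).
KEEP_TWO = {"usc.edu", "isi.edu", "co.uk"}


def hash_label(label: str, length: int = 8) -> str:
    """Return a fixed-length hex string for one domain name component.

    SHA-256 and the truncation length are assumptions for illustration.
    """
    digest = hashlib.sha256(label.lower().encode("utf-8")).hexdigest()
    return digest[:length]


def anonymize_domain(domain: str) -> str:
    """Hash every component except the one or two right-most ones."""
    labels = domain.lower().strip(".").split(".")
    last_two = ".".join(labels[-2:])
    # Keep two right-most components only for the listed suffixes.
    keep = 2 if len(labels) >= 2 and last_two in KEEP_TWO else 1
    hashed = [hash_label(label) for label in labels[:-keep]]
    return ".".join(hashed + labels[-keep:])


if __name__ == "__main__":
    print(anonymize_domain("www.example.com"))  # two hashed parts, then "com"
    print(anonymize_domain("a.b.usc.edu"))      # two hashed parts, then "usc.edu"
```

Because the same component always hashes to the same string, visits to the same website by different participants can be matched without revealing the site's name.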

Finally, only members of the research team will have access to the data associated with this study.

If you agree to participate, you will be asked to do the following:

1) Read our Informed Consent Form, which includes information on our study, to understand what you are agreeing to participate in:

PDF: https://cardi.github.io/browser-history-research/informed_consent_form.pdf

Online (via Google Forms): https://docs.google.com/forms/d/e/1FAIpQLSd_xC9gvEkuYOkNBoOdrcbwqtnMSMNA5Z8BCqOrNupVmCqO8g/viewform

2) Install and run the browser extension during the week of March 27 – April 2, 2020:

Chrome: https://chrome.google.com/webstore/detail/uscisi-browser-history-re/ggnkccpdmkoophjblaokdncbdlmjbdng

Firefox: https://github.com/cardi/browser-history-research/releases/download/v1.0.2/browser-history-research-1.0.2-fx.xpi

All of the above information can be found on our research study website:

https://cardi.github.io/browser-history-research/

Below are some details.

Browser extension: An add-on for Mozilla Firefox or Chromium-based browsers (Google Chrome, Chromium, Brave). It can be installed by following the instructions here:

https://cardi.github.io/browser-history-research/

The browser extension will read and encode your web browser history and create an output for manual submission.

Web browser history: Your history will consist of many entries, with each entry formatted as (timestamp, coded webpage domain). A coded webpage domain is generated by taking the domain component of the original URL and then cryptographically hashing all but the one or two right-most domain components (www.example.com becomes xxxx.yyyy.com, and a.b.usc.edu becomes xxxx.yyyy.usc.edu).

Information we request:

At the end of the experiment on April 3, 2020, you will be asked to send the researchers (via Google Forms or by email) the information listed above.

Please contact me if you have questions or concerns. We will distribute the experimental results to you on request when they are completed.

This study has been approved by the University of Southern California’s Institutional Review Board (USC IRB #UP-19-00826).

Thank you for your time!

Calvin Ardi

calvin@isi.edu