Call for Volunteers for Analyzing Browser History for Anti-Phishing
We are conducting a research study that examines web browsing histories to discover and understand their commonalities. You are invited to participate. You must be 18 or older and have access to a computer running either Mozilla Firefox or a Chromium-based web browser (Google Chrome, Chromium, Brave).
Specifically, we are researching the distribution of website visits and commonalities of website visits across users to understand if sharing information about phishing targets will be effective.
For our experiment, we would like volunteers to install a browser extension that will record their website visit history for one week, starting on March 27, 2020, and then share their anonymized history with us.
(You can install the extension at any time until the experiment is over.)
Your website visit history will consist of many entries in the form of {timestamp, anonymized website URL}. We describe how we anonymize a website URL below.
Recording your web browsing history involves private information. Our goal is to understand the commonalities between your web browsing history and that of other users in this study.
To protect the privacy of what you share, we will anonymize your email address and visited websites by cryptographically hashing them. (This means we will replace each email address and website domain name component with an alphanumeric string that we cannot reverse, so we will not know either.)
The specific things we anonymize via cryptographic hashing are your email address and website URLs. Every website is specified by a URL, of which we will only record the domain. For each domain, we keep only the “generic” part of the domain (the right-most component) and cryptographically hash the other domain name components: www.example.com becomes xxxx.yyyy.com (xxxx and yyyy are each a unique, fixed-length, non-reversible string generated by a cryptographic hash function). Sometimes, we will keep the two right-most components (usc.edu, isi.edu, and generic second-level domains found at https://publicsuffix.org/list/).
Finally, only members of the research team will have access to the data associated with this study.
If you agree to participate, you will be asked to do the following:
1) Read our Informed Consent Form, which includes information on our study, to understand what you are agreeing to participate in:
PDF: https://cardi.github.io/browser-history-research/informed_consent_form.pdf
Online (via Google Forms): https://docs.google.com/forms/d/e/1FAIpQLSd_xC9gvEkuYOkNBoOdrcbwqtnMSMNA5Z8BCqOrNupVmCqO8g/viewform
2) Install and run the browser extension during the week of March 27 – April 2, 2020:
Chrome: https://chrome.google.com/webstore/detail/uscisi-browser-history-re/ggnkccpdmkoophjblaokdncbdlmjbdng
All of the above information can be found on our research study website:
https://cardi.github.io/browser-history-research/
Below are some details.
Browser extension: An add-on for Mozilla Firefox or Chromium-based browsers (Google Chrome, Chromium, Brave). It can be installed by following the instructions here:
https://cardi.github.io/browser-history-research/.
The browser extension will read and encode your web browser history and create an output for manual submission.
Web browser history: Your history will consist of many entries, with each entry formatted as (timestamp, coded webpage domain). A coded webpage domain is generated by taking the domain component of the original URL, and then cryptographically hashing all but one or two right-most domain components (www.example.com becomes xxxx.yyyy.com, and a.b.usc.edu becomes xxxx.yyyy.usc.edu).
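As a rough illustration of the coding described above, the Python sketch below turns a URL into a (timestamp, coded webpage domain) entry. It is not the extension's actual code: the choice of SHA-256, the truncation to 16 hexadecimal characters, the small hard-coded suffix set standing in for the full Public Suffix List, and all function names are assumptions made for this example.

```python
import hashlib
from typing import Tuple
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for the two-component suffixes that are kept
# (usc.edu, isi.edu, and entries from https://publicsuffix.org/list/).
KEPT_TWO_LABEL_SUFFIXES = {"usc.edu", "isi.edu", "co.uk"}


def hash_label(label: str) -> str:
    """Hash one domain name component.

    Assumption: SHA-256 truncated to 16 hex characters; the study does not
    state which hash function or output length the extension uses.
    """
    return hashlib.sha256(label.encode("utf-8")).hexdigest()[:16]


def code_domain(url: str) -> str:
    """Keep the one or two right-most domain components; hash the rest.

    The URL must include a scheme (e.g. https://) so the host can be parsed.
    The path, query, and fragment are discarded.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Keep two components if they form a known suffix, otherwise one.
    keep = 2 if ".".join(labels[-2:]) in KEPT_TWO_LABEL_SUFFIXES else 1
    keep = min(keep, len(labels))
    hashed = [hash_label(label) for label in labels[:-keep]]
    return ".".join(hashed + labels[-keep:])


def make_entry(timestamp: float, url: str) -> Tuple[float, str]:
    """Build one history entry: (timestamp, coded webpage domain)."""
    return (timestamp, code_domain(url))


if __name__ == "__main__":
    # 1585267200 is an example Unix timestamp.
    print(make_entry(1585267200, "https://www.example.com/some/page"))
    print(make_entry(1585267200, "https://a.b.usc.edu/"))
```

In this sketch, www.example.com yields two hashed components followed by com, and a.b.usc.edu yields two hashed components followed by usc.edu. The same domain always maps to the same coded string, which is what allows visits to be compared across participants.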
Information we request:
- the output (coded web browser history) from the browser extension
At the end of the experiment on April 3, 2020, you will be asked to send the researchers (via Google Forms or by email) the information listed above.
Please contact me if you have questions or concerns. We will distribute the experimental results to you on request when they are completed.
This study has been approved by the University of Southern California’s Institutional Review Board (USC IRB #UP-19-00826).
Thank you for your time!
Calvin Ardi