Cloudsweeper is a web application that lets users view the files in their Google Drive accounts and evaluate the sensitivity and usefulness of those files using machine learning. This app is the product of an academic research study conducted by researchers in the Computer Science Department at the University of Illinois at Chicago.
We have designed the application with a focus on respecting users’ right to privacy. No file information leaves your browser unless you use the “Analyze file” feature. The file information shown on the screen is kept only in the browser and is never sent to our servers.
This application uses OAuth2 to give our tool limited access to your Google Drive account, which you can revoke at any time. It does not have access to your password in any way. After you complete the OAuth process, we use a client-side API to scan your Google Drive account and collect metadata within your browser. We never store any of your file data on our servers.
We will never have access to your password, and we do not store your username. You can read more about how Google allows third parties to access your account, and how you can manage this access here.
Our application uses machine learning to classify the sensitivity and usefulness of files. For this process, we use metadata about your files that is provided by the Google Drive API. This metadata includes values such as the file size, last modified date, access type, and the number of shared users per file. For images, we analyze the files using the Google Vision API. Under Google's policy, data provided to the Vision API is not accessed or used except for the purpose of performing the offered services (see https://cloud.google.com/terms/). For documents in English, we gather the linguistic contexts of the words and their topic and vocabulary associations.
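To make the metadata step more concrete, the sketch below shows one way values like these could be derived in the browser from a Google Drive API (v3) file resource. It is an illustration only, not the code Cloudsweeper runs: the `size`, `modifiedTime`, and `permissions` fields come from the Drive API, while `extractFeatures`, `FileFeatures`, and the access-type categories are hypothetical names chosen for this example.

```typescript
// Subset of a Drive API v3 file resource; only the fields used below.
interface DrivePermission {
  type: "user" | "group" | "domain" | "anyone";
  role: string; // e.g. "owner", "writer", "reader"
}

interface DriveFileMetadata {
  size?: string;        // the API reports byte counts as strings; may be absent
  modifiedTime: string; // RFC 3339 timestamp
  permissions?: DrivePermission[];
}

// Hypothetical feature record; these names are assumptions, not the study's schema.
interface FileFeatures {
  sizeBytes: number;
  daysSinceModified: number;
  accessType: "private" | "shared" | "domain" | "public";
  sharedUserCount: number;
}

function extractFeatures(file: DriveFileMetadata, now: Date): FileFeatures {
  const permissions = file.permissions ?? [];
  // Ignore the owner's own permission entry when measuring sharing.
  const nonOwners = permissions.filter((p) => p.role !== "owner");
  const sharedUserCount = nonOwners.filter((p) => p.type === "user").length;

  // Assumed ordering: the broadest audience determines the access type.
  let accessType: FileFeatures["accessType"] = "private";
  if (nonOwners.some((p) => p.type === "anyone")) accessType = "public";
  else if (nonOwners.some((p) => p.type === "domain")) accessType = "domain";
  else if (nonOwners.length > 0) accessType = "shared";

  const msPerDay = 24 * 60 * 60 * 1000;
  const ageMs = now.getTime() - new Date(file.modifiedTime).getTime();

  return {
    sizeBytes: file.size ? Number(file.size) : 0,
    daysSinceModified: Math.max(0, Math.floor(ageMs / msPerDay)),
    accessType,
    sharedUserCount,
  };
}
```

Because a function like this only reads fields already returned to the browser, it is consistent with keeping file information on the client until the “Analyze file” feature is used.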
None of your files' data is retained in any way: as soon as a file has been analyzed, all information (including the ability to access it) is discarded by the analysis server.
We do not share any information collected through this app with any external entity. The files analyzed by the classifier are in no way tied to the user’s identity or to any personally identifiable information. Individual sensitivity and usefulness values for the files are shown only to the individual user and are stored only within that user’s browser.
For additional questions about this app and our research, you may contact: Taha Khan ([email protected]) or Chris Kanich ([email protected])