FAQ for Using Seven Bridges
1. How do we claim files on Seven Bridges?
When the files are transferred to Seven Bridges, the Sequence Request project identifier numbers (1234R) and Analysis Project identifier numbers (A4567) for each lab will be maintained and organized into a Seven Bridges project for the lab. The principal investigator or designated member of the lab will need to sign up for a Seven Bridges account for the lab and provide a payment method (PO#) to Seven Bridges. Seven Bridges has been added to UShop as an authorized vendor (use the non-catalog request form to generate a standing PO). Once an account is established, the lab’s GNomEx files will be claimed, owned, and maintained by the lab.
2. Where are my legacy GNomEx files?
Legacy files from GNomEx were bulk uploaded into a single project and named after the PI (First Last name). Click the Projects menu at the top; if you do not see it listed, click View All. Once in the project, click the Files tab. There are two folders, Analysis and Request. Each folder contains the legacy GNomEx data, with each project as a folder.
If you are unable to see the legacy GNomEx project, you may not have permission. Ask the lab division administrator (by default, the PI), or email the HCI Bioinformatics team, to be added to the project. By default, all lab division administrators should be able to see the project. Members can be added by clicking the "Manage members" button in the Dashboard tab of the Project.
3. How is Seven Bridges organized?
Each lab will be organized as a “division” within the Seven Bridges platform under an enterprise HCI account. The division is named after the PI (First Last name). Each lab division will have an administrator (initially the lab PI, although additional administrators may be assigned) who will be responsible for the billing, adding or removing users, and creating or removing projects. Each lab member who wants to use or access Seven Bridges will need an account.
Projects may be created within the lab division, and users may work on one or more projects. Unlike GNomEx, projects are not differentiated into raw data (Requests, or 1234R) and processed data (Analysis, or A4567). Data files and workflows may be shared between projects within the lab division, but not between lab divisions. However, users may be invited to other labs, with read and/or write privileges to individual projects within those.
4. May an individual be part of more than one lab division?
Yes, individuals can be invited as a member to another lab division and granted read and/or write access to individual projects by the lab division administrator. Any files written to the project or workflows executed on that data will be charged to the lab that owns the project.
5. How will Bioinformatics Shared Resource Core members access our data in Seven Bridges?
The Bioinformatics Shared Resource will automatically have a user account in every lab division and will be able to see and work on the data, just as with our current GNomEx setup. If a lab asks us to perform analysis on their data using the Seven Bridges platform, we will execute workflows as a member of that lab division, and compute costs will be incurred by the lab. If we perform analysis locally, we can upload the results to the project on Seven Bridges.
6. How do I estimate the cost of storage?
Seven Bridges passes through to its users the direct storage rates that it negotiates with Amazon Web Services. Currently, this is around $0.004 per gigabyte per month in AWS Glacier. You can use the AWS Simple Monthly Calculator to estimate storage costs.
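As a rough back-of-the-envelope check, the estimate is simply size times rate times duration. The sketch below uses a hypothetical helper name, estimate_storage_cost, and the approximate Glacier rate quoted above; it ignores restoration fees and any files kept in non-archived storage, so treat the result as a lower bound rather than a quote.

```shell
# Hypothetical helper (not part of Seven Bridges or AWS tooling).
# Usage: estimate_storage_cost SIZE_GB RATE_PER_GB_PER_MONTH MONTHS
estimate_storage_cost() {
    awk -v gb="$1" -v rate="$2" -v months="$3" \
        'BEGIN { printf "%.2f\n", gb * rate * months }'
}

# Example: 2000 GB archived for 12 months at about $0.004 per GB per month.
estimate_storage_cost 2000 0.004 12
```

Here the example call prints 96.00, that is, roughly $96 per year for 2 TB of archived data at the quoted rate.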
7. How do I check my current charges?
You must be logged in as an administrator. Click your account name in the upper right-hand corner, and select the lab Division name (the first and last name of the PI) from the pop-up menu. This will open a page with three tabs in the lower menu bar, under your account name in the upper right-hand corner: Members, Billing, Notifications. Click Billing. Charges are split into Analysis (compute charges for running analysis jobs) and Storage, and are shown for the current quarter.
8. When will charges begin?
For labs that have set up a PO number, storage costs will begin in 2019. HCI will absorb the storage cost through 2018. Seven Bridges will charge on a quarterly basis.
9. What if I cannot use a PO number? Will they accept a credit card?
Yes, Seven Bridges recently (December 2018) implemented the ability to charge credit cards. However, they currently prefer PO numbers if at all possible. You may enter the information in the Billing tab of the lab Division settings when logged in as an administrator. Click your account name in the upper right-hand corner, and select the lab Division name (the first and last name of the PI) from the pop-up menu. This will open a page with three tabs in the lower menu bar, under your account name in the upper right-hand corner: Members, Billing, Notifications. Click the Billing tab, then click Change payment method to enter your credit card information.
10. How do I add new division members?
You must be logged in as an administrator. Click your account name in the upper right-hand corner, and select the lab Division name (the first and last name of the PI) from the pop-up menu. This will open a page with three tabs in the lower menu bar, under your account name in the upper right-hand corner: Members, Billing, Notifications. In the Members tab, click Add Members and enter an email address.
11. How can I download data files on Seven Bridges to use in other compute resources?
You can generate secure, anonymous, signed URLs for any file in a project. These can be used either to download the file or to access it over HTTP from another compute resource (for example, a genome browser or a local compute node at Utah, depending on context or size). For more information about downloading, see the Seven Bridges documentation on downloading results.
To download files to a Linux analysis server, use a tool such as aria2 or curl. Because the URLs encode the authentication information after the file path, you will need to specify the downloaded file name explicitly and protect the URL by quoting it. For example,
curl -o file1.txt.gz 'https://sb-hci-archive-us-east-1.s3.blah.blah.blah'
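To see why the file name and the quotes matter, here is a small sketch. It assumes the signed URL carries its authentication information as a query string after a "?", with parameters joined by "&"; that layout, and the placeholder URL itself, are assumptions for illustration. Left unquoted, the shell would treat each "&" as a command separator, and a tool that names the file after the URL would include the query string in the name.

```shell
# Placeholder URL: host, path, and query parameters are made up for illustration.
url='https://example.invalid/project/file1.txt.gz?PlaceholderKey=abc&PlaceholderSig=def'

# Drop everything from the first "?" onward, then everything up to the last "/".
path=${url%%\?*}
name=${path##*/}
echo "$name"

# With a real signed URL, the download would then be:
# curl -o "$name" "$url"
```

The variable names url, path, and name are arbitrary; the point is only that the clean file name has to be supplied separately from the quoted URL.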
The aria2 program is recommended for large files, supports multiple simultaneous downloads, and conveniently accepts a text file of download links. The list may be edited to specify the download filename for each link. Download links are valid for two days.
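A minimal sketch of such a list follows. In the aria2 input-file format, each URL sits on its own line and an indented "out=" line beneath it sets the saved file name. The URLs and file names here are placeholders, not real links; paste in the signed URLs generated on Seven Bridges.

```shell
# Write an aria2 input file. Each URL is followed by an indented option line
# giving the name to save the download under. All values are placeholders.
cat > download_list.txt <<'EOF'
https://example.invalid/PLACEHOLDER_SIGNED_URL_1
  out=sample1_R1.fastq.gz
https://example.invalid/PLACEHOLDER_SIGNED_URL_2
  out=sample1_R2.fastq.gz
EOF

# Once real URLs are in place, download up to four files at a time:
# aria2c --input-file=download_list.txt --max-concurrent-downloads=4
```

Because the links expire after two days, regenerate the list if a large batch of downloads is interrupted and resumed later.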
12. How can I share data?
Data, unfortunately, cannot be shared between lab divisions. However, you may invite other members to your lab division to see or work with your data. Permissions may be set on projects so that members may only see and work with data that the administrator explicitly allows. Of course, data can also be simply downloaded and uploaded using a URL (see above).
13. When will my new GNomEx data be uploaded? Do I need to do that manually?
We are currently developing a mechanism for uploading current (most of 2018) and new data to Seven Bridges. Data will be uploaded only for labs that have signed up with Seven Bridges. Ideally, this will be a "button" in GNomEx for automatic transfer. Until this is implemented, users who need data uploaded expeditiously should contact us.
Files uploaded from GNomEx will not be immediately archived. It is the users
14. Should I turn on spot instances?
If you are planning on executing analysis pipelines on the Seven Bridges platform, you should strongly consider turning on spot instances. This means that your jobs will be placed in queue and executed when resources are available at a reduced cost. Otherwise, jobs will be executed immediately on demand at a much higher cost. In most cases, spot instance wait times are very short, unless it's Black Friday or Amazon Prime Day, in which case forget it.
15. Who are the extra people in my lab division?
If you review the list of members in your lab division (by going to the Members tab of the lab Division settings), you may see some unrecognized individuals. These are either individuals from Seven Bridges (with an sbgenomics.com address) or members of the HCI Bioinformatics team. The Seven Bridges team members were initially invited for purposes of demonstration, training, and/or uploading of legacy data, and may be removed if so desired. The HCI team members are included to help administer the enterprise-level account.
16. Can more than one person use an invitation link or share an account?
NO. Accounts should NOT be shared; this is generally a bad security practice across all Internet platforms, not just Seven Bridges. The invitation links and accounts are linked to a specific email address, and notifications are sent to these email addresses.
17. I need to access my files, but they're archived. How do I restore my files?
Navigate to your folder and/or files and select them by checkmarking the box. NOTE: you can navigate up and down into folders and select multiple files from different locations; a badge with a number indicates how many files are selected. Once the files are selected, click the "..." (More Actions) button and select "Restore". Note that members must have "Write" privileges in the project. If you do not, make a new project where you do have write permissions, copy the selected files to the new project, and restore there.
IMPORTANT: There are 3 things to remember: 1) Restoring incurs a per-file restoration fee. 2) Restoring files is NOT immediate and may take hours. 3) Restored files are temporary and will be automatically re-archived after 7 days.
For more information, see the Seven Bridges documentation on restoring files.
18. How do I see GNomEx metadata for the files in Seven Bridges?
Some of the GNomEx metadata was associated with the files during the bulk upload of legacy data, including values such as the GNomEx project name and ID, lab name, and user name. To see these, click the "Edit column names" button on the far right of the window, scroll down the list of columns, and select the ones you want to see. Unfortunately, individual sample information could not be incorporated into the Seven Bridges metadata scheme.
19. Will the sample information in the GNomEx database go away?
NO, the sample information and metadata will not be deleted from GNomEx. You will always be able to go back to GNomEx and view that information.
20. How do I protect my project from being deleted?
The AWS S3 and AWS Glacier file systems do not have an undelete or backup system. File deletions are permanent with no recourse. You can follow a few simple steps to avoid disaster. First, lab division administrators can limit access to projects with write (and delete) permissions. Members may be invited to a new project with "Copy" only permissions. Second, files may be copied to a new project. Copying files does NOT incur additional storage fees; a copy is simply a link. This makes deleting files harder, as all links must be deleted before a file is truly gone.
21. I cannot see a project, but others in my lab can.
Each project in a lab division has a members list, and members must be explicitly added to a project before they can see the project. Members can have different permissions to work in a project, including "Copy" (copy files to another project or download), "Write" (upload, add, or delete files), "Execute" (run computation tasks), and "Admin" (add new members). Lab division administrators by default can view, edit, and administer all projects in a division. To add new members, go to the project dashboard, and click the "Manage Members" button; enter the email address or user name of the lab member.
22. How do I archive files?
Archiving files means moving the files to AWS Glacier at substantially reduced storage cost. This should be done if you are no longer actively working with the files and do not plan to use them. For example, you may want to consider archiving Fastq files after aligning if you are satisfied with the alignment. To do so, simply checkmark the box next to the file name (or checkmark all using the box in the header). Hint: To select files embedded in folders, use the "Type" button to select file types (e.g. "FASTQ.GZ"), which selects files regardless of how deep they are in the folder structure. Next, click the "..." (More Actions) button and select "Archive". Note that members must have "Write" privileges.
23. Will I be charged for making copies of files?
No. Unlike on your desktop computer, making a copy of a file does not actually duplicate the file on the AWS file system; "copy" is a misnomer. Rather, it creates a link to the file (think "Alias" on a Mac, "Shortcut" on Windows, or symbolic link on Linux). You may have as many links to a file as you want.
This is useful for several reasons: One, you may not have "Write" permissions in a project. Two, you may want to create a working project where you collect lots of files from different places. Three, you can insure against accidental deletion, since you have to delete all links to a file before it is truly deleted.
If the file is archived, then all links to that file will also be archived. Consequently, if you restore an archived file, all links will then point to the restored file.
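As a loose local analogy only (nothing here touches Seven Bridges, and the platform's internals are not described in this FAQ), hard links on a Linux file system behave much like the copies described above: each link is another name for the same data, and the data disappears only when the last name is removed. The file names below are made up for the demonstration.

```shell
# Local analogy using hard links in a throwaway directory.
workdir=$(mktemp -d)
echo "reads" > "$workdir/original.fastq"

# A second name for the same data; no additional space is used.
ln "$workdir/original.fastq" "$workdir/copy.fastq"

# Deleting one name leaves the data reachable through the other.
rm "$workdir/original.fastq"
cat "$workdir/copy.fastq"
```

The final command still prints the file contents, which mirrors why every link to a file must be deleted before the file is truly gone.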
24. I have a Cancer Genomics Cloud account. How do I link them?
This is answered in detail here. In short, generate a Developer authorization token under your CGC account by going to your account settings and selecting Developer. In your University of Utah Seven Bridges account, go to your account settings (click your user name in the upper right corner). Paste your token in the Cancer Genomics Cloud box under the Dataset Access tab.