Cohort Builder
The Cohort Builder lets you define groups of participants using clinical and genomic criteria, then request access to their data for analysis in the Trusted Research Environment.
https://biobank.sca.iu.edu/cohorts
Cohort Types
Phenotype Cohorts are built from clinical data: diagnoses, medications, lab results, demographics, and hospitalizations.
Genotype Cohorts are built from genetic variant data: specific variants, gene regions, or variant annotations.
Combination Cohorts are built by combining two or more existing cohorts using set operations — useful when you want to intersect clinical and genomic criteria, or compare populations.
Building a Phenotype Cohort
Open the Cohort Builder and choose New Phenotype Cohort
Add filters — for example, an age range, a diagnosis code, or a lab value
Set the matching logic for your filters:
- All (AND) — participant must match every filter
- Any (OR) — participant must match at least one filter
- Excluding All (NOT) — participant must match none of the filters

Review the participant count as you refine your criteria
When satisfied, save and name your cohort
Building a Genotype Cohort
- Open the Cohort Builder and choose New Genotype Cohort
- Use the embedded Variant Xplorer to search for variants by gene, genomic coordinates, or rsID
- Apply filters (allele frequency, predicted impact, variant type, etc.) to narrow results
- Save and name your cohort
Combining Cohorts
Use a Combination Cohort to apply set operations across two or more existing cohorts:
- Intersection (AND) — participants in both cohort A and cohort B
- Union (OR) — participants in cohort A or cohort B (or both)
- Exclusion (NOT) — participants in cohort A but not cohort B


Cohort Visibility and Sharing
When you save a cohort, you can control who can see and use it:
- Private (default) — only you can access it
- Public — visible to all users on the platform; others can submit an access request to use it
You can also share a cohort directly with specific users, giving them access without requiring a formal request.
Writing a Good Description
When saving a cohort, a description is required. This is used as part of the governance review process, so it should be written in plain language that researchers from different disciplines can understand — not just in terms of the filters you selected.
The system may generate a description automatically, but auto-generated descriptions can be inaccurate or incomplete. Review it carefully and edit as needed to accurately describe the population your cohort represents and why this combination of criteria defines your target group.
Requesting Data Access
Once you've finalized a cohort, you can request access to participant-level clinical and genomic data for analysis in the Trusted Research Environment.
Lock the cohort — locking preserves its exact composition and prevents further changes. If you need to make changes later, duplicate the cohort first.
Submit a data access request from the locked cohort


Your request will be reviewed by the data governance committee
Upon approval, you will receive credentials and instructions for accessing the Trusted Research Environment
Tracking Your Request
After submitting, you can monitor your request's status from within the portal:
![]()
Status values you may see: Submitted, Under Review, Approved, Denied. You'll also receive email notifications when the status changes.
Trusted Research Environment
Once your access request is approved, you'll work with your cohort's data inside a Trusted Research Environment. See Trusted Research Environment Setup for how to connect.