## Dataset
* bucket_name: dataset_id prefixed with S3_BUCKET_PREFIX
`S3_BUCKET_PREFIX-dataset_id`
Example: for the dataset https://rdms.cottagelabs.com/concern/datasets/2v23vt540, the bucket name in S3 will be
`cl2-2v23vt540`
* All of the files will be inside `bucket_name/`
* The metadata will be saved in `metadata.json` at the top level of the bucket: `bucket_name/metadata.json`
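As a minimal sketch of the naming scheme above (the function names are hypothetical, and the prefix value `cl2` is only read off the example), the bucket name and metadata location could be derived like this:

```python
def dataset_bucket_name(s3_bucket_prefix: str, dataset_id: str) -> str:
    """Join the prefix and the dataset id with a hyphen."""
    return f"{s3_bucket_prefix}-{dataset_id}"


def dataset_metadata_path(bucket_name: str) -> str:
    """Location of the metadata file, in the bucket_name/ notation used above."""
    return f"{bucket_name}/metadata.json"


# Assumption: S3_BUCKET_PREFIX is "cl2", as the example bucket name suggests.
bucket = dataset_bucket_name("cl2", "2v23vt540")
print(bucket)
print(dataset_metadata_path(bucket))
```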
## CrcDataset
* bucket_name: experiment id prefixed with S3_BUCKET_PREFIX
`S3_BUCKET_PREFIX-crc_dataset_id`
Example: for the experiment https://rdms.cottagelabs.com/concern/crc_datasets/xs55mc178?locale=en, the bucket name in S3 will be
`cl2-xs55mc178`
* Experiment
  * All of the experiment files will be inside `bucket_name/`
  * The metadata will be saved in `metadata.json` at the top level of the bucket: `bucket_name/metadata.json`
* Subject
  * sanitised_subject_name:
    * The subject title will be sanitised to follow [S3 bucket naming rules](#s3-bucket-naming-rules).
    * After sanitising, a subject title must be unique within the experiment.
      For example, `Sub-001`, `sub-001` and `sub--001` all sanitise to `sub-001`, so they will be considered the same.
  * All of the subject files will be inside
    `bucket_name/sanitised_subject_name/`
  * The metadata will be saved in `metadata.json` within the subject folder: `bucket_name/sanitised_subject_name/metadata.json`
* Session
  * sanitised_session_name:
    * The session title will be sanitised to follow [S3 bucket naming rules](#s3-bucket-naming-rules).
    * After sanitising, a session title must be unique within the subject.
      For example, `Ses-001`, `ses-001` and `ses--001` all sanitise to `ses-001`, so they will be considered the same.
  * All of the session files will be inside
    `bucket_name/sanitised_subject_name/sanitised_session_name/`
  * The metadata will be saved in `metadata.json` within the session folder: `bucket_name/sanitised_subject_name/sanitised_session_name/metadata.json`
* Modality
  * sanitised_modality_name:
    * The first modality value will be copied to be the modality title.
    * The modality title will be sanitised to follow [S3 bucket naming rules](#s3-bucket-naming-rules).
    * After sanitising, a modality title must be unique within the session.
      For example, `Mod-001`, `mod-001` and `mod--001` all sanitise to `mod-001`, so they will be considered the same.
  * All of the modality files will be inside
    `bucket_name/sanitised_subject_name/sanitised_session_name/sanitised_modality_name/`
  * The metadata will be saved in `metadata.json` within the modality folder: `bucket_name/sanitised_subject_name/sanitised_session_name/sanitised_modality_name/metadata.json`
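The folder layout above can be sketched as a small path builder. This is an illustration only: `crc_folder` and `crc_metadata_path` are hypothetical names, and the subject, session and modality names passed in are assumed to be sanitised already.

```python
from typing import Optional


def crc_folder(bucket_name: str,
               subject: Optional[str] = None,
               session: Optional[str] = None,
               modality: Optional[str] = None) -> str:
    """Build the folder for an experiment, subject, session or modality.

    Levels nest in the order subject > session > modality, so a deeper
    level is only valid when every level above it is also given.
    """
    parts = [bucket_name]
    seen_gap = False
    for name in (subject, session, modality):
        if name is None:
            seen_gap = True
        elif seen_gap:
            raise ValueError("a deeper level was given without its parent")
        else:
            parts.append(name)
    return "/".join(parts) + "/"


def crc_metadata_path(bucket_name: str, *levels: str) -> str:
    """Location of metadata.json for the given level."""
    return crc_folder(bucket_name, *levels) + "metadata.json"


print(crc_folder("cl2-xs55mc178", "sub-001", "ses-001"))
print(crc_metadata_path("cl2-xs55mc178", "sub-001", "ses-001", "mod-001"))
```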
## S3 bucket naming rules
Amazon S3 has rules for naming general purpose buckets and directory buckets, as stated [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html).
We have implemented the following rules:
* Check for length (between 3 and 63 characters long)
* Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
  - Convert title to lowercase
  - Remove any characters other than `a-z`, `0-9`, `.` and `-`
* Bucket names must begin and end with a letter or number.
  - Remove `.` or `-` from beginning and end
* Bucket names must not contain two adjacent periods.
  - Replace `..` with `.`
  - Also, replace `--` with `-` for consistency
* Bucket names must not start with the prefix `xn--`
  - The `--` replacement above will already have changed this to `xn-`
* Bucket names must not start with the prefix `sthree-`
* Bucket names must not end with the suffix `-s3alias`
* Bucket names must not end with the suffix `--ol-s3`
  - The `--` replacement above will already have changed this to `-ol-s3`
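A rough sketch of how these rules could be applied to a title follows. It is not the actual implementation: `sanitise_title` is a hypothetical name, and the order of the steps, the collapsing of longer runs of `.` or `-` into a single character, and raising `ValueError` on a failed length check are all assumptions. The `sthree-` and `-s3alias` rules are left out because no action is described for them.

```python
import re


def sanitise_title(title: str) -> str:
    """Sanitise a title so it follows the bucket naming rules listed above."""
    name = title.lower()                      # convert title to lowercase
    name = re.sub(r"[^a-z0-9.-]", "", name)   # keep only a-z, 0-9, '.' and '-'
    name = re.sub(r"\.{2,}", ".", name)       # '..' becomes '.' (assumption: whole runs collapse)
    name = re.sub(r"-{2,}", "-", name)        # '--' becomes '-' (also covers 'xn--' and '--ol-s3')
    name = name.strip(".-")                   # must begin and end with a letter or number
    if not 3 <= len(name) <= 63:              # assumption: a failed length check raises
        raise ValueError(f"sanitised name {name!r} must be 3 to 63 characters long")
    return name


for title in ("Sub-001", "sub-001", "sub--001"):
    print(title, "->", sanitise_title(title))
```

All three example titles come out as `sub-001`, which is why they count as duplicates within one experiment.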