This process uses the Bulkrax *CSV from S3 parser* to do imports. Metadata is prepared in **CSV files**, with commas used to separate columns and semi-colons used (in some cases) to separate values within a single column. Any text containing commas or semi-colons that are not meant to be interpreted as separators (e.g. in a description, or when listing contributors as "LAST_NAME, FORENAME(S)") should be wrapped in quotation marks. Encoding: UTF-8 without BOM is advised.
This process uses the Bulkrax *CSV from S3 parser* to do imports. Metadata is prepared in CSV files, and the data for each dataset is provided in a separate folder (see below).
## Prepare the data
...
...
@@ -14,6 +14,10 @@ The data to be imported needs to have the following file structure
* There should be a file called `metadata.csv`
* The format of the columns in the `metadata.csv` file is explained in *The metadata CSV format* section (below)
* General CSV remarks
* Commas are used to separate columns, and semi-colons are used (in some cases) to separate values within a single column.
* Any text containing commas or semi-colons that are not meant to be interpreted as separators (e.g. in a description, or when listing contributors as "LAST_NAME, FORENAME(S)") needs to be wrapped in quotation marks.
* Encoding: UTF-8 without BOM is advised.
* The CSV file should contain one row for each dataset to be imported
* Each row should give the path to the dataset, relative to the directory containing `metadata.csv`, in the `dataset_path` column.
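
The CSV remarks above can be illustrated with a minimal Python sketch that builds one metadata row using the standard `csv` module. Only `dataset_path` is taken from this document; the other column names (`title`, `description`, `contributor`), the sample values, and the helper `build_metadata_csv` are hypothetical and only serve the illustration. The real columns are those described in *The metadata CSV format* section.

```python
import csv
import io

# Only `dataset_path` comes from this document; the other column names
# are hypothetical placeholders for illustration.
FIELDNAMES = ["dataset_path", "title", "description", "contributor"]


def build_metadata_csv(rows, fieldnames=FIELDNAMES):
    """Return CSV text for the given rows.

    List values are joined with semi-colons (several values in one column).
    The csv module wraps any field containing a comma in quotation marks.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({
            key: ";".join(value) if isinstance(value, list) else value
            for key, value in row.items()
        })
    return buffer.getvalue()


rows = [
    {
        # Path relative to the directory that contains metadata.csv
        "dataset_path": "dataset_1",
        "title": "Example dataset",
        "description": "Measurements, cleaned and annotated",
        "contributor": ["Doe, Jane", "Roe, Richard"],
    }
]

csv_text = build_metadata_csv(rows)

# Encoding with "utf-8" adds no byte-order mark ("utf-8-sig" would add one).
csv_bytes = csv_text.encode("utf-8")
print(csv_text)
```

In this sketch the description and the contributor list are quoted automatically because they contain commas. Python's `csv` module does not treat semi-colons specially, so free text that contains a literal semi-colon would need to be handled separately, according to the rules of the import.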