Commit ce79b701 authored by Francesco Garassino

Revert "first edits - created "easy" part of tutorial"

This reverts commit 1f6ae9cf.
parent 1f6ae9cf
Pipeline #327620 passed
......@@ -9,163 +9,17 @@ author:
- name: "G.Fraga Gonzalez"
orcid: '0000-0002-1857-8607'
affiliation: "Center for Reproducible Science, UZH"
- name: "F. Garassino"
orcid: '0000-0002-3568-9077'
affiliation: "Center for Reproducible Science, UZH"
date: last-modified
format:
html:
code-fold: show
editor: visual
editor_options:
chunk_output_type: console
---
# Creating JSON files from a metadata table
In this tutorial, we will show you how to create simple human and machine-readable metadata files in JavaScript Object Notation [(JSON)](https://www.json.org/json-en.html). JSON files consist of fields of key-value pairs. These are *sidecar* metadata files, that is, they accompany a separate source data file and provide essential information about the data.
JSON metadata files differ from structured metadata files (i.e., tables) in that they are meant to be fully machine-readable. A metadata table may contain columns intended only for human readers (e.g., "comment" columns with free-text notes); *JSON files should not. They can, however, hold more detail than the metadata tables.*
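As a minimal sketch of what such key-value pairs look like, the chunk below builds one sidecar record by hand with `jsonlite`. The field names and values are invented for illustration and do not come from a real dataset.

```{r sketch-key-value}
library(jsonlite)

# Hypothetical sidecar record for one image file (all values are invented)
sidecar <- list(
  id = "subject_1",
  sex = "female",
  condition = "A",
  treatment = "control"
)

# auto_unbox = TRUE writes length-1 vectors as plain values instead of arrays
sidecar_json <- toJSON(sidecar, pretty = TRUE, auto_unbox = TRUE)
cat(sidecar_json, "\n")
```

Each name in the list becomes a key, and each element becomes the value stored under that key.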
::: {.callout-tip collapse="false"}
## Making JSON files
Good editors with a graphical interface are available online to read and write JSON files. We recommend the following: <https://jsoneditoronline.org/>
However, we recommend creating JSON files with a script and not manually to save time and prevent data entry errors. We will demonstrate how in this tutorial.
:::
## Setting the stage for this tutorial
Here, we will work with an example situation derived from an imaging experiment. We will automate the creation of a JSON file from a metadata table containing image file names and locations, as well as information about the images (e.g., subject ID, subject sex, condition in which the subject was observed, treatment the subject received).
### Requirements for creating the JSON file
- Metadata table containing the names of the reference images and their metadata. *If the table holds sufficient information, we can rely on it entirely and do not need access to the actual files.*
- Potentially, additional table(s) with JSON fields. The JSON files will then contain additional information not found in the metadata table.
- Description of the file-naming convention, *codebook (i.e., a table explaining what each variable is)*, and glossary of abbreviations used in the metadata table
- R code
## Let's get to work!
Let's assume a relatively common structure for the dummy dataset we'll be using for this tutorial:
experiment_results \# The base folder of our dataset
├── ... \# Folder(s) with other kinds of data
└── imaging \# The folder containing the imaging data
├── subject_n \# Each measured subject has a folder
\| └── imgfile_subject_n.tiff \# The image file
└── ...
The code we provide will parse a dummy metadata table to create one JSON file for each row of the table, which describes one data file (in this case, an image). Our script thus will generate companion files for all image files, as necessary for....
::: callout-warning
## Machine readability
As we are automating a task, it's essential that our metadata table is formatted to be machine readable. This means that, when preparing the table, one should (among other things) avoid blank rows, avoid empty cells where possible, and use only the first row for header information (i.e. variable names).
Furthermore, the metadata table should be part of a spreadsheet that also contains a *codebook* explaining what each variable is. The number of variables (= the number of columns) and their names should be the same in the metadata table and in the codebook.
For further information on the readability of spreadsheets, see [Six tips for better spreadsheets](https://doi.org/10.1038/d41586-022-02076-1) by J.M. Perkel.
:::
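The points in the callout above can be checked with a few lines of base R. The following is only a sketch: `toy_metadata` and `toy_codebook` are hypothetical tables invented here, and we assume the codebook lists variable names in a column called `variable`.

```{r sketch-check-table}
# Toy metadata table and codebook (both invented for this sketch)
toy_metadata <- data.frame(
  id = c("subject_1", "subject_2"),
  sex = c("male", "female"),
  condition = c("A", "B"),
  stringsAsFactors = FALSE
)
toy_codebook <- data.frame(
  variable = c("id", "sex", "condition"),
  description = c("Subject identifier", "Sex of the subject", "Observation condition"),
  stringsAsFactors = FALSE
)

# Treat both NA and empty strings as empty cells
is_empty <- is.na(toy_metadata) | toy_metadata == ""

checks <- c(
  no_blank_rows = !any(rowSums(is_empty) == ncol(toy_metadata)),
  no_empty_cells = !any(is_empty),
  names_match_codebook = setequal(names(toy_metadata), toy_codebook$variable)
)
print(checks)
```

Running such checks before generating any JSON file makes formatting problems visible early, when they are still easy to fix in the spreadsheet.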
```{r setup, echo=T, results='hide'}
# these packages are required for correct functioning
easypackages::packages("dplyr", # data operations
"kableExtra", # for rendering of tables in HTML or PDF
"knitr") # rendering of the report
```
First of all, let's create a simple dummy (or toy) metadata table:
```{r dummy-metadata, echo=T, results='hide'}
n_rows <- 30 # how many rows (in this case, how many study "subjects") we want in the table
metadata <- tibble(
  # this simply creates "subject_n" entries with n from 1 to n_rows
  id = paste("subject", 1:n_rows, sep = "_"),
  img_location = paste0("experiment_results/imaging/subject_", 1:n_rows, "/imgfile_subject_", 1:n_rows, ".tiff"),
  # this and the following lines randomly fill a column with values drawn from a set of options (in this case, "male" or "female")
  sex = sample(c("male", "female"), size = n_rows, replace = TRUE),
  condition = sample(c("A", "B"), size = n_rows, replace = TRUE),
  treatment = sample(c("control", "treat_1", "treat_2", "treat_3"), size = n_rows, replace = TRUE)
)
```
Let's take a look:
```{r view-table}
knitr::kable(metadata) %>%
scroll_box(height= "300px")
```
Now let's make the corresponding JSON files. For readability, we use a `for` loop here to iterate over the rows of the metadata table. This approach can be very slow for large metadata tables, so below we will illustrate an alternative, faster solution.
```{r make-json}
library(jsonlite)
library(stringr)
saveoutput <- FALSE # set to TRUE to save the JSON files to disk
for (i in seq_len(nrow(metadata))) {
  # Create JSON for the current row
  row_metadata <- metadata %>%
    slice(i)
  json_metadata <- toJSON(row_metadata, pretty = TRUE, auto_unbox = TRUE)
  if (saveoutput) {
    # create the output path for the JSON
    json_path <- row_metadata %>%
      pull(img_location) %>% # this gives us the full path to the image
      str_replace("\\.tiff$", ".json") # and this swaps the file extension, keeping folder and file name
    # write the JSON file next to the image (the folder must already exist)
    write(json_metadata, file = json_path)
    print(paste0("Wrote ", json_path))
  }
}
```
::: callout-tip
The code snippet you just saw can save the generated JSON files into the folders containing the image files listed in the `img_location` column of the metadata table. To use this functionality, simply change the `saveoutput` variable to `TRUE` or `T`.
:::
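Saving only works if the target folders already exist. The sketch below shows one way to derive the sidecar path and create the folder first. It writes into a temporary directory (`tempdir()`) so that nothing touches a real dataset; the one-row table `toy_row` and the name `out_root` are assumptions made for this illustration.

```{r sketch-json-path}
library(jsonlite)

# Hypothetical row of the metadata table
toy_row <- data.frame(
  id = "subject_1",
  img_location = "experiment_results/imaging/subject_1/imgfile_subject_1.tiff",
  stringsAsFactors = FALSE
)

# Work inside a temporary directory so nothing is written to the real dataset
out_root <- tempdir()

# Swap the ".tiff" extension for ".json"; the folder part of the path is kept
toy_json_path <- file.path(out_root, sub("\\.tiff$", ".json", toy_row$img_location))

# Create the target folder if it does not exist yet
dir.create(dirname(toy_json_path), recursive = TRUE, showWarnings = FALSE)

write(toJSON(toy_row, pretty = TRUE, auto_unbox = TRUE), file = toy_json_path)
file.exists(toy_json_path)
```

Anchoring the pattern with `\\.` and `$` ensures that only a trailing `.tiff` extension is replaced, not a similar sequence elsewhere in the path.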
The `toJSON` function of `jsonlite` converts a table (in our case, a single row of the metadata table) into an R character vector of length 1, i.e. a single string. This string is formatted according to the JSON specification. Here's what one of our JSON strings looks like:
```{r print_json}
print(json_metadata) # prints the JSON string created for the last row of the table
```
::: callout-note
Notice that the JSON starts with a `[` and ends with a `]`. The content of a row of the metadata table is delimited by `{}`. This delimited field contains `column_name:value` pairs on separate lines (separated by a newline, `\n`). Therefore, we could say that `toJSON` "expands" the row of the metadata table into a list describing each of its cells.
:::
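To see where the outer `[` and `]` come from, the following sketch compares two ways of converting the same one-row table. The table `toy_subject` is invented for this illustration.

```{r sketch-json-structure}
library(jsonlite)

toy_subject <- data.frame(id = "subject_1", sex = "male", stringsAsFactors = FALSE)

# A data frame becomes a JSON array of row objects: [ { ... } ]
as_array <- toJSON(toy_subject, auto_unbox = TRUE)

# Converting the row to a named list first yields a bare object: { ... }
as_object <- toJSON(as.list(toy_subject), auto_unbox = TRUE)

print(as_array)
print(as_object)

# Reading the array form back returns a one-row data frame
round_trip <- fromJSON(as_array)
str(round_trip)
```

Both forms are valid JSON. Which one is preferable depends on the tools that will later read the sidecar files.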
And just like that, you've created your first JSON files. Congratulations!
# **Creating JSON files with a metadata table and information from additional files**
--- still to be edited —
**Create a JSON file with metadata from tables and data files**
In this example we create simple human and machine-readable metadata files in JavaScript Object Notation [(JSON)](https://www.json.org/json-en.html). JSON files consist of fields of key-value pairs. These are *sidecar* metadata files, that is, they accompany a separate source data file (for this example we use dummy images as data files). In this use case, researchers can edit a table specifying the fields in the JSON file. This script creates a JSON with these fields. Some of the values in the JSON fields are filled for each of the data files based on the filename and an additional table with metadata (subject information).
::: {.callout-important collapse="false"}
There are good editors with a graphical interface available online to read and write JSON. We recommend the following website: <https://jsoneditoronline.org/>
......