Metadata tables

The following sections describe the variables used in the different metadata tables. All tables should be machine-readable and should share one or more key variables, so that they can be linked (or combined into a single joined table containing all information).

| Metadata table | Content |
|---|---|
| Data Storage Information | Server address(es) and disks. One row per bundle of data packages/datasets stored on different disks (containing data from multiple mice). |
| Mice Information | Mouse features. One row per mouse. |
| Scan Information | List of folders/files available for each mouse. |
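
The linking of tables through shared key variables can be sketched as follows. This is only an illustration: the key column `Mouse_ID`, the other column names and all values are hypothetical and are not taken from the actual tables.

```r
# Toy versions of two metadata tables. Mouse_ID is a hypothetical key column;
# all values are invented for illustration.
mice <- data.frame(
  Mouse_ID = c("CA001", "CA002", "CA003"),
  Sex      = c("F", "M", "F"),
  stringsAsFactors = FALSE
)
scans <- data.frame(
  Mouse_ID = c("CA001", "CA001", "CA002"),
  Folder   = c("scan_a", "scan_b", "scan_c"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every mouse, even those without scan entries (left join).
joined <- merge(mice, scans, by = "Mouse_ID", all.x = TRUE)

# Mice without any scan entry appear with NA in the scan columns.
missing_scans <- joined$Mouse_ID[is.na(joined$Folder)]
```

A join like this only works if the key values are written identically in every table, which is one reason to keep the tables machine-readable.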

Data Storage Table

Data provenance

These variables describe the origin of the data: where, when and by whom they were collected, and whether the data package is complete.

| Variable | Description |
|---|---|
| University | University affiliation of the group that collected the data (e.g., UZH; see glossary) |
| Research_group | Abbreviation of the group that collected the data (e.g., TIG; see glossary) |
| Year | Year of data curation |
| Data_type | Abbreviation of the type of data collected (e.g., SRµCT; see glossary) |
| Dataproject_ID | (optional) Identifier of the project from which the data package derives, e.g., In Vivo CSF |
| Facility_proposal_ID | (optional) Number of the proposal for facility usage, e.g., the beamline numbers |
| Facility_1 | Name or acronym of the main facility (e.g., CLS, SPring-8) |
| Facility_2 | (optional) Additional facility information (e.g., beamline) |
| Facility_Country | Country where the facility is located |
| Start_date_acquisition | Start date of data acquisition, formatted as DD/MM/YYYY |
| Contact_researcher | Main researcher responsible for data collection, ideally someone who can be contacted if there are issues with the data package or dataset |
| Status | Either ‘complete’ or ‘in progress’ |
| Status_comment | (optional) Additional comments on the status of these data |
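
As a rough sketch, the DD/MM/YYYY format of Start_date_acquisition and the two allowed Status values could be checked in R like this. The example rows are invented for illustration, and the check is a suggestion rather than part of any existing pipeline.

```r
# Hypothetical rows of the data storage table, for illustration only.
storage <- data.frame(
  Start_date_acquisition = c("03/11/2021", "2021-11-03", "31/02/2022"),
  Status                 = c("complete", "in progress", "done"),
  stringsAsFactors = FALSE
)

# as.Date() returns NA for strings that do not match the given format
# or that are not real calendar dates (e.g., 31 February).
parsed_dates <- as.Date(storage$Start_date_acquisition, format = "%d/%m/%Y")
bad_date <- is.na(parsed_dates)

# Status must be one of the two values named in the table above.
bad_status <- !(storage$Status %in% c("complete", "in progress"))

# Rows that need attention.
storage[bad_date | bad_status, ]
```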

File descriptors

These variables give further details about the files, such as the type of images they contain and how much disk space they occupy.

| Variable | Description |
|---|---|
| Files_type | (if applicable) More detailed description of the files, depending on the data type. E.g., if Data_type is synchrotron: projections, reconstructions or both. |
| Data_size | Total volume of the stored data |
| File_subjectIDs | List of the samples (mice) stored in this location. Be precise: e.g., CA001-CA030 should be used only if there really are 30 subjects with IDs 001 to 030. |
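
The precision rule for File_subjectIDs can be checked by expanding a range into the individual IDs it claims and comparing them with the IDs actually present. The sketch below assumes IDs consist of a letter prefix followed by a zero-padded number; `expand_id_range` is a hypothetical helper and the list of found IDs is invented.

```r
# Expand a range such as "CA001-CA030" into the individual IDs it claims.
# Assumes a letter prefix followed by a zero-padded number.
expand_id_range <- function(id_range) {
  ends   <- strsplit(id_range, "-", fixed = TRUE)[[1]]
  prefix <- sub("[0-9]+$", "", ends[1])      # e.g., "CA"
  digits <- sub("^[^0-9]*", "", ends)        # e.g., "001" and "030"
  nums   <- seq(as.integer(digits[1]), as.integer(digits[2]))
  paste0(prefix, formatC(nums, width = nchar(digits[1]), flag = "0"))
}

claimed <- expand_id_range("CA001-CA030")

# Hypothetical list of IDs actually found in the location, with CA017 missing.
found <- setdiff(claimed, "CA017")

# IDs covered by the range but not present: the range entry would be wrong.
not_found <- setdiff(claimed, found)
```

If `not_found` is not empty, the entry should list the IDs or sub-ranges that really exist instead of the full range.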

Storage locations

Show the paths to the data. The columns should provide enough information to find the data and access it (whenever the right permissions are in place). There can be several locations accessible offline or online.

Online locations

If available, give the address on the servers maintained by the research group that collected the data. For example, synchrotron data collected by TIG are expected to be on a server maintained at the TIG facilities.

Make sure paths are machine-readable: avoid entries like “123.54.666.10\data\folder (additional info)”, where the text in parentheses would have to be removed before the path can be used. Also watch out for trailing whitespace at the end of the path.

| Variable | Description |
|---|---|
| Source_server_path | Full path to the data. Provide the server's IP (Internet Protocol) address, e.g., ‘\\123.45.679.01\data\synchrotron\brains’, and not the arbitrary letter used to map the network drive, like ‘O:\data\synchrotron\brains’ |
| Source_server_type | Specify whether the path refers to a disk, tape, server or online repository |
| Source_server_access | Public or private; specify any special access rules |
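
The path rules above (no annotations in parentheses, no mapped drive letters, no trailing whitespace) can be screened automatically. The following is a minimal sketch with invented example entries. The regular expressions are one possible choice and may need adjusting to the real paths.

```r
# Hypothetical entries for Source_server_path, for illustration only.
paths <- c(
  "\\\\123.45.679.01\\data\\synchrotron\\brains",
  "\\\\123.45.679.01\\data\\folder (additional info)",
  "O:\\data\\synchrotron\\brains",
  "\\\\123.45.679.01\\data\\brains  "
)

# Trailing "(...)" annotation that is not part of the path.
has_annotation <- grepl("\\(.*\\)[[:space:]]*$", paths)

# Mapped drive letter instead of a server address.
has_drive_letter <- grepl("^[A-Za-z]:", paths)

# Whitespace at the end of the entry.
has_trailing_space <- grepl("[[:space:]]+$", paths)

# Trailing whitespace can be removed safely; the other issues need a person
# to decide what the correct path is.
clean_paths <- trimws(paths, which = "right")
```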


Physical devices and copies

Disk_<research_group>: these columns describe a copy of the disk content held at the facilities of each research group. Groups can be TIG, BMC or TKI (see glossary).

| Variable | Description |
|---|---|
| Disk_<research_group>_status | ‘Copied’ if there is a copy at the facilities of this group |
| Disk_<research_group>_ID | Unique identifier of the hard disk, e.g., serial number and model |
| Disk_<research_group>_speed | Copy speed of this disk |
| Disk_<research_group>_time | How long it took to copy the data |
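
Because the same four columns repeat for every research group, the expected column names can be generated rather than typed by hand. This sketch uses the group abbreviations and suffixes listed above. The table header it is compared against is invented for illustration.

```r
groups   <- c("TIG", "BMC", "TKI")
suffixes <- c("status", "ID", "speed", "time")

# One block of four columns per group: Disk_TIG_status, Disk_TIG_ID, ...
disk_cols <- paste0(
  "Disk_", rep(groups, each = length(suffixes)), "_", suffixes
)

# Hypothetical header of a storage table that lacks the TKI columns.
tbl_cols <- c("University", disk_cols[1:8])

# Expected disk columns that are absent from the table.
missing_cols <- setdiff(disk_cols, tbl_cols)
```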

# Interactive, filterable view of the data storage table
library(DT)

input_table <- "Data_locations.csv"
miss_values_spec <- c("", "NA")  # strings read as missing values

tbl <- read.csv(input_table, na.strings = miss_values_spec)

datatable(
  tbl,
  filter = "top",
  escape = FALSE,       # cell contents are rendered as HTML, not escaped
  rownames = FALSE,
  width = "100%",
  class = "compact cell-border hover",
  extensions = c("Buttons", "Select", "ColReorder", "Scroller", "KeyTable"),
  selection = "none",   # turn off DT's own selection; the Select extension handles it
  options = list(
    dom = "Bfrtip",
    buttons = list(
      list(extend = "colvis", text = "Select columns"),
      "selectAll", "selectNone", "copy"
    ),
    select = list(style = "os", items = "row"),
    scrollX = TRUE,
    scrollY = "600px",
    paging = FALSE,     # all rows in one scrollable view, so pageLength is not needed
    scrollCollapse = FALSE,
    autoWidth = TRUE,
    colReorder = TRUE,
    # keys (KeyTable) and search are table-level options, not columnDefs entries
    keys = TRUE,
    search = list(regex = TRUE),
    initComplete = JS(
      "function(settings, json) {",
      "$('td').css('font-size', 'smaller');",
      "}"
    )
  )
)

Mice Information Table

::: panel-tabset

Variable codebook

Variables common to all Sinergia datasets

Data type-specific variables

These are not documented here as they will vary with each data type (e.g., synchrotron, microscopy, etc.)

:::
