A cybersecurity dataset sharing platform.
SCReeD is a platform purpose-built for sharing cybersecurity research data, including, but not limited to, cybersecurity datasets generated by CSCRC research projects. General-purpose dataset sharing platforms are not ideal for cybersecurity datasets, and are often unsuitable, for several reasons: the data can be sensitive, large, and complex; it comes in a wide range of file formats and serialisations; and it can be authentic, synthetic, or hybrid, with different potential applications. In addition, cybersecurity datasets are shared with a variety of audiences that require different levels of access, whether CSCRC partners, Australian researchers, Australian industry practitioners, Five Eyes researchers, the EU, or the world. This makes access control crucial, whereas many general-purpose, domain-independent dataset sharing platforms focus only on sharing datasets with the general public.
The data captured in cybersecurity research datasets can benefit not only the academic community but also industry, for example by adding new functionality to software tools and testing their existing capabilities, evaluating the analytic performance of tools, assessing the vulnerability of IT infrastructures, and identifying new types of cyberthreats.
SCReeD defines standards for sharing cybersecurity datasets and provides guidelines that streamline the metadata to capture for them, including technical and legal considerations, so that the uploader does not need to be an expert in data sharing.
Guidelines for uploading datasets are provided to SCReeD users to ensure a systematic approach and the inclusion of all relevant metadata.
Before uploading, users need to complete the following steps.
(Note: Metadata for the datasets uploaded to the SCReeD platform is available here)
Once you are done, it is time to upload the dataset. There are two upload methods, and users can follow either of them.
To upload data with method 1, users need to complete the following steps:
Click on the + on the top right as shown below.
Create a folder to hold your data file. In this example, we will create a folder called dataset. Click on the Create new file button and type in dataset/.githold. This will create a folder called dataset and place a file called .githold inside it.
.
├── README.md
├── .dataherb
├── dataset
└── your_data_file
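The placeholder-file step described above can also be reproduced on a local copy of the repository. The following is a minimal sketch, not part of the SCReeD tooling: the function name `create_dataset_folder` and the idea of preparing the folder locally are assumptions made for illustration. It relies only on the fact that Git does not track empty directories, which is why a placeholder such as .githold is needed.

```python
from pathlib import Path


def create_dataset_folder(repo_root: str, folder: str = "dataset") -> Path:
    """Create `folder` under `repo_root` and put an empty .githold file in it.

    Hypothetical helper: it mirrors the 'Create new file' step of method 1,
    where typing dataset/.githold creates both the folder and the file.
    """
    target = Path(repo_root) / folder
    # Create the folder (and any missing parents); do nothing if it exists.
    target.mkdir(parents=True, exist_ok=True)
    placeholder = target / ".githold"
    # An empty file is enough for Git to keep the folder in the repository.
    placeholder.touch(exist_ok=True)
    return placeholder
```

Calling `create_dataset_folder(".")` from the repository root would create dataset/.githold, after which the data file can be placed in the same folder.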
*.docx binary
*.pdf binary
name: [Name of your dataset]
description: [Describe your dataset here]
contributors:
  - name: [Name of the first contributor]
data:
  - name: [name of your data file, optional]
    description: [description of your data file, optional]
    path: [path_to_your_data_file]
    format: [format of your data file]
    size: [size of your data file]
    fields:
      - name: [name of the first column]
        description: [description of the first column]
      - name: [name of the second column]
        description: [description of the second column]
  - name: [name of your second data file, optional]
    description: [description of your second data file, optional]
    path: [path_to_your_data_file]
    format: [format of your data file]
    size: [size of your data file]
    fields:
      - name: [name of the first column]
        description: [description of the first column]
      - name: [name of the second column]
        description: [description of the second column]
license:
  name: [Name of the license of the dataset]
  link: [Link to the license page]
references:
  - name: [Name of the first reference]
    link: [https://link_to_your_first_reference]
This is only an example; the metadata may contain similar or fewer fields, as required.
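Some of the per-file values in the metadata template (path, format, size) can be read from the data file itself rather than typed by hand. The sketch below is illustrative only: the helper names `describe_data_file` and `missing_keys`, the choice of bytes as the size unit, and the set of keys treated as required are all assumptions, since the template does not specify them.

```python
import os
from pathlib import Path

# Assumption: these top-level keys are treated as the minimum to fill in.
# The guidelines do not define a required set.
REQUIRED_KEYS = ("name", "description", "data")


def describe_data_file(path, description=""):
    """Build one entry for the `data` list of the metadata template."""
    p = Path(path)
    return {
        "name": p.name,
        "description": description,
        "path": p.as_posix(),
        # File extension without the leading dot, e.g. "csv".
        "format": p.suffix.lstrip(".").lower(),
        # Assumption: size is reported in bytes.
        "size": os.path.getsize(p),
    }


def missing_keys(metadata, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `metadata`."""
    return [key for key in required if not metadata.get(key)]
```

A dictionary such as `{"name": ..., "description": ..., "data": [describe_data_file("dataset/your_data_file")]}` can then be checked with `missing_keys` before the values are copied into the metadata file.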
To upload data with method 2, users need to complete the following steps:
.
├── README.md
├── .dataherb
├── dataset