Data Structures
This chapter aims to provide a brief overview of the formats used to write CaRMeN's input files.
The first part of this chapter deals with the basic structures common to all programming languages. If you are already familiar with the topic, you may skip this part. The second part deals with the JSON-format (JSON: JavaScript Object Notation) used for packages. Here we progressively explain the structure and the syntax of files with the file extension .json. For more information about packages, please refer to the chapter Concepts of this documentation. The third part deals with the alternative YAML-format used to write files with the file extension .yml or .yaml.
JSON is already extensively described in the literature, so the content of this chapter is not new. An introductory description of JSON can be found at http://guide.couchdb.org/draft/json.html, as well as on the JSON website http://www.json.org and on Wikipedia. Our intention here is first to explain the basics of JSON's data types and syntax, together with examples relevant to the work we expect you to perform with CaRMeN. We will then present an alternative format called YAML, a superset of JSON with the same basic structure but a simpler notation, which is intended to be easier to read. Extensive information about YAML can also be found on the official site http://yaml.org and on Wikipedia.
NOTE: The code examples in this chapter are intended solely to explain the JSON- and the YAML-format in a relevant context and do not necessarily comply with the latest version of CaRMeN for defining configuration parameters. The code examples in every other chapter of this documentation are, however, up to date with CaRMeN's latest version.
Universal Data Structures
Virtually all modern programming languages support two basic structures:
A collection of name/value pairs. In JSON this is realized as an object. In other languages it is called a record, struct, dictionary, hash table, keyed list, or associative array.
An ordered list of values. In JSON this is realized as an array. In other languages it is called a vector, list, or sequence.
Let's first have a look at an object. An object can be visualized as a table
name | value |
---|---|
temperature | 298 |
pressure | 100000 |
with names in the left column and the corresponding values in the right one. Both names and values can be chosen freely. The order of the name/value pairs in an object is irrelevant. An array can also be visualized as a table
index | value |
---|---|
0 | O2 |
1 | H2O |
containing an index (a number) in the left column and values in the right column. Here only the values can be chosen freely. The index is determined solely by the position of the value in the array and is not written in the code. Values in both objects and arrays can be text, numbers, or even further objects and arrays. One can picture an object containing an array as a table inside of a table.
name | value |
---|---|
species | (an array, drawn as a nested table) |
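As a rough illustration, the two structures can be tried out in Python, where an object corresponds to a dict and an array to a list. Python is used here only for demonstration; the variable names are hypothetical and not part of CaRMeN.

```python
# An object: name/value pairs, looked up by name.
conditions = {"temperature": 298, "pressure": 100000}

# An array: values identified only by their position, counted from 0.
species = ["O2", "H2O"]

# An object whose value is an array: "a table inside of a table".
gas_phase = {"species": species}

print(conditions["pressure"])    # value stored under the name "pressure"
print(species[1])                # value at index 1
print(gas_phase["species"][0])   # first entry of the nested array
```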
JSON-format
JSON is an open standard format commonly used for configuration files and for browser-server communication in JavaScript-based applications like CaRMeN. It is therefore the only format allowed by CaRMeN for writing package.json files, which mark the enclosing directory as a so-called package. Packages are the entry points for all configuration parameters used by the program, including the definition of cases, mixins and resources. For further information about packages see the chapter Concepts of this documentation. The filename extension for JSON files is .json. package.json files can either contain all configuration parameters themselves or retrieve them from external files. These external files can be written either in JSON or in other formats like YAML, which will be introduced later in this chapter. Returning to JSON, the following five elements need to be discussed: objects, values, arrays, strings, and numbers.
Object
An object in JSON begins with a left brace { and ends with a right brace }. Each name is followed by a colon : and the name/value pairs are separated by a comma ,. Names are always given as strings, which will be described later. The basic syntax of an object is therefore {name:value,name:value}. In this example
{"temperature":293,"pressure":50000}
"temperature" and "pressure" are the names, and 293 and 50000 are the values of the pairs in the object. The braces are the delimiters of the object. Objects can either be written on a single line (called in-line), as in the example above, or split across multiple lines (called indented). When writing indented objects, a so-called off-side indentation is introduced by convention after the opening delimiter by writing two consecutive blank spaces before each name/value pair. Indentation in JSON is optional but contributes to readability. Tab characters should not be used for indentation.
{
  "name":"surface_reactions",
  "format":"cki",
  "path":"surface_reactions.txt"
}
Here the three name/value pairs of the object above receive the same indentation while the braces receive none. The commas separating the name/value pairs are written at the end of each pair. Professional text editors like Atom (for more see chapter Creating your First Own Simulation) will automatically introduce the correct indentation level when pressing enter at the end of a line. With simpler text editors indentation might have to be set manually.
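That the in-line and the indented notation describe the same object can be checked with any JSON parser. The following minimal sketch uses Python's standard json module purely as an illustration; it is not part of CaRMeN.

```python
import json

inline = '{"temperature":293,"pressure":50000}'
indented = """
{
  "temperature": 293,
  "pressure": 50000
}
"""

# Whitespace and line breaks between the elements carry no meaning.
same = json.loads(inline) == json.loads(indented)

# The order of the name/value pairs is irrelevant as well.
reordered = json.loads('{"pressure":50000,"temperature":293}')

# Writing the object back with two blank spaces per indentation level.
pretty = json.dumps(json.loads(inline), indent=2)
print(pretty)
```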
Value
A value can be a string in double quotes ", an object in braces {}, an array in brackets [], a number, a boolean (i.e. true or false), or null. In this example
{
  "name":"species_profile",
  "format":"csv",
  "schema":{
    "primaryKey":"z"
  },
  "path":"capillary.csv"
}
the values of the first, second, and fourth pairs are each a string, while the value of the third is an object. Note that the object of the third pair is split across three lines. The opening delimiter { is written after "schema": on the same line, the name/value pair "primaryKey":"z" is written on the next line and receives an additional indentation, and the closing delimiter } is written on yet another line and keeps the indentation level of the name "schema". Through delimiters and indentation a visible hierarchy of structures inside of other structures is achieved.
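The kinds of values listed above can be made visible by parsing a small document and inspecting what the parser returns. The sketch below uses Python's standard json module; the names "columns", "length", "enabled" and "comment" are made up for this illustration and are not CaRMeN parameters.

```python
import json

text = """
{
  "name": "species_profile",
  "schema": {"primaryKey": "z"},
  "columns": ["z", "CH4"],
  "length": 1.1e-2,
  "enabled": true,
  "comment": null
}
"""

parsed = json.loads(text)

# Map every name to the Python type its value was converted to.
kinds = {name: type(value).__name__ for name, value in parsed.items()}
for name, kind in kinds.items():
    print(name, "->", kind)
```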
Array
An array begins with a left bracket [ and ends with a right bracket ]. Values are separated by a comma ,. The basic syntax of an in-line array is then [value,value]. In this example
{
  "surface":{
    "species":[
      "Rh",
      "H2O-Rh",
      "H-Rh"
    ]
  }
}
the species for the "surface" are given as an indented array. Notice that indented arrays follow the same delimiter-indentation rules as objects.
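The following minimal sketch, again using Python's standard json module for illustration only, reads such an array, accesses its entries by position, and shows that a strict JSON parser rejects a comma after the last entry.

```python
import json

text = '{"surface":{"species":["Rh","H2O-Rh","H-Rh"]}}'
species = json.loads(text)["surface"]["species"]

first = species[0]     # positions are counted from 0
count = len(species)   # number of entries in the array

# A comma is a separator, not a terminator: a trailing comma is an error.
try:
    json.loads('["Rh","H2O-Rh","H-Rh",]')
    trailing_comma_ok = True
except json.JSONDecodeError:
    trailing_comma_ok = False

print(first, count, trailing_comma_ok)
```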
String
A string is a sequence of zero or more Unicode characters, delimited by double quotes ". Strings support escape characters by using backslashes \, which are however not relevant when working with CaRMeN. What is relevant, though, is the use of macros (e.g. $resource and $merge), which are introduced by a dollar character $. In this example
{
  "driver":"detchem_channel",
  "configuration":{
    "chemical_model":{
      "$merge":{
        "source":{"$resource": "surface_reactions"},
        "with":{"$resource": "thermo"}
      }
    }
  }
}
the $merge macro is used to merge two objects (or fragments) into one. Here the thermodynamic data is merged with the surface reactions to form the chemical model. The $resource macro takes the name of a resource as an argument. It can be seen as a "reference" to an external file. Further details about the use of macros are given in chapter Concepts of this documentation.
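To make the idea of macro expansion more tangible, here is a purely illustrative Python sketch. It is not CaRMeN's implementation: the function expand, the RESOURCES table and its contents are hypothetical, and the merge rule (a shallow merge in which entries from "with" override those from "source") is an assumption made only for this example.

```python
# Hypothetical stand-ins for the content of the referenced external files.
RESOURCES = {
    "surface_reactions": {"mechanism": "surface"},
    "thermo": {"thermo_data": "nasa"},
}

def expand(node, resources):
    """Recursively replace $resource and $merge nodes (illustration only)."""
    if isinstance(node, list):
        return [expand(item, resources) for item in node]
    if not isinstance(node, dict):
        return node  # strings, numbers, booleans and null stay as they are
    if set(node) == {"$resource"}:
        return resources[node["$resource"]]
    if set(node) == {"$merge"}:
        source = expand(node["$merge"]["source"], resources)
        extra = expand(node["$merge"]["with"], resources)
        return {**source, **extra}  # assumption: shallow merge, "with" wins
    return {name: expand(value, resources) for name, value in node.items()}

config = {
    "driver": "detchem_channel",
    "configuration": {
        "chemical_model": {
            "$merge": {
                "source": {"$resource": "surface_reactions"},
                "with": {"$resource": "thermo"},
            }
        }
    },
}

expanded = expand(config, RESOURCES)
print(expanded["configuration"]["chemical_model"])
```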
Number
Numbers can be given as positive integers (293), negative integers (-19), floating-point numbers (0.800), or in scientific notation (1.1e2, 1.1e-2, 1.1E2 or 1.1E-2). The following example shows the syntax of a configuration descriptor for CaRMeN written in JSON.
{
  "driver":"detchem_channel",
  "data":{"$resource":"species_profile"},
  "configuration":{
    "inlet":{
      "temperature":798,
      "gas_velocity":1.1,
      "mole_fractions":{
        "CH4":0.133,
        "O2":0.067,
        "N2":0.800
      }
    },
    "channel":{
      "length":1.1e-2,
      "radius":0.45e-3
    },
    "pressure":1.013e5,
    "wall_temperature":{"$resource":"temperature_profile"}
  }
}
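How the different number notations are read can be verified with a short Python sketch based on the standard json module. The check that the mole fractions add up to one is only an example of what can be done with the parsed values; it is not something this chapter claims CaRMeN performs.

```python
import json
import math

# All notations mentioned above are valid JSON numbers.
values = json.loads("[293, -19, 0.800, 1.1e2, 1.1e-2, 1.1E2, 1.1E-2]")

# Upper-case and lower-case exponent markers mean the same thing.
same_exponent = values[3] == values[5] and values[4] == values[6]

# Parsed numbers can be used directly in calculations.
inlet = json.loads('{"CH4":0.133,"O2":0.067,"N2":0.800}')
total = sum(inlet.values())
fractions_ok = math.isclose(total, 1.0)

print(values, same_exponent, fractions_ok)
```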
YAML-format
YAML is a data-oriented format with a syntax similar to that of JSON. The most significant advantage of YAML over JSON is its simplified syntax, which makes it easier to read and allows features not available in JSON, like comments. YAML cannot, however, be used to write package.json files directly, but only .yml files. These files can then be used to define configuration parameters externally in YAML-format rather than having to define them directly in package.json files using the more cumbersome JSON-format. The filename extension for YAML files can be either .yml or .yaml. We will now explain YAML's basic structure by using the same examples presented for JSON.
Object
Indented objects in YAML are not denoted by delimiters. Names are separated from values by a colon : + space. In-line objects keep the braces known from JSON, and their name/value pairs are separated from each other by a comma , + space. The basic syntax of an in-line object is then {name: value, name: value} as shown in the following example.
{temperature: 293, pressure: 50000}
Indented objects follow the same indentation rules as described for JSON, with the exception that neither braces nor commas are required to separate the name/value pairs. An indented object then looks as follows.
name: surface_reactions
loader: cki
input: surface_reactions.txt
Value
Values in YAML follow the same rules as in JSON. In the following example notice again the omission of delimiters and commas for indented objects.
name: species_profile
loader: csv
schema:
  primaryKey: z
input: capillary.csv
Array
Indented arrays in YAML are not denoted by delimiters. In-line arrays keep the brackets known from JSON, with values separated from each other by a comma , + space. The basic syntax of an in-line array is then [value, value]. Values in an indented array are denoted by a leading hyphen - + space.
surface:
  species:
    - Rh
    - H2O-Rh
    - H-Rh
As in JSON, arrays in YAML can also contain objects as entries. In such cases, convention suggests denoting the object in the array by a leading hyphen - followed by a line break (instead of a space). In this example
example_array:
  - value
  -
    example_object_inline: value
  -
    example_object_indented:
      option: value
the first entry of the array is a one-line value and is again denoted by a leading hyphen - + space. The second and the third entries of the array are objects and are each denoted by a leading hyphen - followed by a line break. Notice, however, that the objects do not start at the same indentation level as the hyphen on the line above, but two blank spaces after its beginning (or one blank space after its end). This way the objects begin at the same position they would have had before being pushed one line down. Professional text editors like Atom (for more see chapter Creating your First Own Simulation) will automatically introduce the correct indentation level when pressing enter after the hyphen -. With simpler text editors indentation might have to be set manually. This practice improves readability, as it helps to differentiate between one-line values and objects in arrays. However, an object in an array in YAML can also be written on the same line as the hyphen, without being pushed one line down.
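The correspondence between JSON and the indented YAML notation described so far can be sketched as a small Python function that writes parsed JSON data in indented YAML style. The helper names scalar and to_block_yaml are hypothetical, and the sketch makes simplifying assumptions: all objects and arrays are non-empty, and strings contain no characters with a special meaning in YAML. It is an illustration of the layout rules, not a general converter.

```python
import json

def scalar(value):
    """Write a single value; numbers, true, false and null look as in JSON."""
    if isinstance(value, str):
        return value  # assumption: plain string without special characters
    return json.dumps(value)

def to_block_yaml(value, indent=0):
    """Return the lines of an indented YAML rendering of parsed JSON data."""
    pad = " " * indent
    lines = []
    if isinstance(value, dict):
        for name, item in value.items():
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}{name}:")
                lines.extend(to_block_yaml(item, indent + 2))
            else:
                lines.append(f"{pad}{name}: {scalar(item)}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)):
                # Hyphen followed by a line break, entry two spaces deeper.
                lines.append(f"{pad}-")
                lines.extend(to_block_yaml(item, indent + 2))
            else:
                lines.append(f"{pad}- {scalar(item)}")
    else:
        lines.append(f"{pad}{scalar(value)}")
    return lines

document = json.loads('{"surface":{"species":["Rh","H2O-Rh","H-Rh"]}}')
print("\n".join(to_block_yaml(document)))
```

Entries that are themselves objects are written with the hyphen on its own line, following the convention described above.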
String
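Strings in YAML usually need no delimiters. Quotes are only required where a plain string would otherwise be misread, for example when it contains a colon followed by a space or when it looks like a number.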
Number
Numbers follow the same rules as described for JSON.
Additional features
YAML offers a number of features which are not allowed in JSON. One of these features is the ability to insert comments in a document. Comments in YAML are denoted by a number symbol # + space. Comments must always be separated from other elements by a blank space. The following example shows the syntax of the same configuration descriptor shown earlier, this time written in YAML.
driver: detchem_channel # This is an example of a comment
data:
  $resource: species_profile
configuration:
  inlet:
    temperature: 798
    gas_velocity: 1.1
    mole_fractions:
      CH4: 0.133
      O2: 0.067
      N2: 0.800
  channel:
    length: 1.1e-2
    radius: 0.45e-3
  pressure: 1.013e5
  wall_temperature:
    $resource: temperature_profile
Notice that values given as objects inside of other objects, in this example $resource: species_profile and $resource: temperature_profile, are written here as indented objects. Since YAML is a superset of JSON, such values can alternatively be written in-line with the delimiters {} (or [] for arrays), for example data: {$resource: species_profile}.
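The rule that a comment must be separated from other elements by a blank space can be illustrated with a few lines of Python. The helper strip_comment is hypothetical and only a sketch, not a YAML parser: it assumes that the number symbol never appears inside a quoted string.

```python
import re

def strip_comment(line):
    """Remove a trailing YAML comment from a single line (illustration only)."""
    # A "#" starts a comment at the start of the line or after whitespace.
    return re.sub(r"(^|\s)#.*$", "", line).rstrip()

with_comment = strip_comment("driver: detchem_channel # This is an example of a comment")
only_comment = strip_comment("# a line containing nothing but a comment")

# Without a preceding blank space the "#" is part of the value.
no_comment = strip_comment("path: file#1.csv")

print(repr(with_comment), repr(only_comment), repr(no_comment))
```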