Getting started
In this tutorial we will create a simple pipeline which fetches the current time from a Web API at
timeapi.io
, interprets it as RDF treating the response as JSON-LD, and finally serializes as n-triples.
Setting up the project
If you are not fluent with Node.js & NPM read more about how to create a package.json file.
First, install the main barnard59 package which serves a CLI to run pipelines. We will use it later in this tutorial.
- NPM
- Yarn
npm init -y
npm i --save barnard59
yarn init -y
yarn add barnard59
Add type: module
to the package.json to use ESM Modules
{
"name": "barnard59-time-zone",
"version": "1.0.0",
+ "type": "module",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC"
}
Then, add necessary dependencies which provide the operations we will use in the pipeline:
- NPM
- Yarn
npm i --save barnard59-formats barnard59-http barnard59-base
yarn add barnard59-formats barnard59-http barnard59-base
Pipeline definition
Create a file pipeline/main.ttl
. It will contain the turtle definition of the pipeline and its steps. The barnard59
CLI then parses this file and executes the processing.
First, add the necessary prefixes and a base URI.
@base <http://example.org/pipeline/> .
@prefix code: <https://code.described.at/> .
@prefix p: <https://pipeline.described.at/> .
@prefix op: <https://barnard59.zazuko.com/operations/> .
Then add the pipeline resource itself.
<tz> a p:Pipeline, p:Readable ;
p:variables
[
p:variable
[
a p:Variable ;
p:name "TZ" ;
p:value "UTC" ;
]
] ;
p:steps
[
p:stepList (
<fetch>
[ op:base\/json\/parse () ]
<jsonldStructure>
[ op:formats\/jsonld\/parse\/object () ]
[ op:formats\/ntriples\/serialize () ]
)
] .
Here, you started a Readable
pipeline with 5 steps and a variable which
will be needed by those steps and is the means to parametrise a pipeline.
Fetch time from API
Next, add implementation of <fetch>
step to retrieve the current time using a simple HTTP request.
<fetch> op:http\/get
[
code:name "url" ;
code:value "https://timeapi.io/api/Time/current/zone?timeZone=${TZ}"^^code:EcmaScriptTemplateLiteral ;
] .
Parse time as JSON-LD
The timeapi.io
returns a plain JSON response which represents a point in time. That response will be similar to:
{
"year": 2023,
"month": 7,
"day": 27,
"hour": 10,
"minute": 44,
"seconds": 23,
"milliSeconds": 88,
"dateTime": "2023-07-27T10:44:23.0884467",
"date": "07/27/2023",
"time": "10:44",
"timeZone": "UTC",
"dayOfWeek": "Thursday",
"dstActive": false
}
This JSON will be pushed as a single string chunk.
To convert it to RDF, first parse the JSON itself by using an operation from barnard59-base
:
[ op:base\/json\/parse () ]
barnard59-base
contains many steps which
will be useful in stream processing, such as filtering chunks, merging multiple stream, or parsing JSON seen above.
Next, you'll want to apply a JSON-LD context to extract the dateTime
from the JSON and have it parsed as an
xsd:dateTime
literal. Hence, add the implementation for step <jsonldStructure>
:
<jsonldStructure> op:base\/map (
[
a code:EcmaScriptModule ;
code:link <file:../lib/jsonldStructure.js#addContext>
]
) .
When processing a stream, a chunk can be modified by mapping to a new value. Here, the argument to the op:base\/map
operation
is a function which should return JSON-LD. As seen in the previous snippet, the implementation is addContext
function
exported from a module jsonldStructure.js
export function addContext(json) {
const TZ = this.variables.get('TZ')
return {
'@context': {
dateTime: 'http://purl.org/dc/elements/1.1/date'
},
'@id': `https://timeapi.io/api/Time/current/zone?timeZone=${TZ}`,
...json
}
}
Finally, the parsing itself is a dedicated operation provided by the package barnard59-formats
:
[ op:formats\/jsonld\/parse\/object () ]
Notice the final parse/object
segments used to parse JSON-LD which is used with object chunks. When processing a raw, string
JSON-LD stream parse
would be used instead.
Serialize
Finally, the last step is to serialize an RDF stream.
[ op:formats\/ntriples\/serialize () ]
Running the pipeline
You are now ready to run the pipeline:
npx barnard59 run pipeline/main.ttl --pipeline http://example.org/pipeline/tz
The CLI run
action is called with the path of the pipeline's source and its identifier.
In the output, you should see a single triple
<https://timeapi.io/api/Time/current/zone?timeZone=UTC>
<http://purl.org/dc/elements/1.1/date>
"2023-07-31T08:02:55.8602811" .
Substituting variables
To use a different timezone, you can provide an override for the variable declared in the pipeline definition from the command line.
npx barnard59 run pipeline/main.ttl \
--pipeline http://example.org/pipeline/tz \
--variable TZ=America/New_York
The output would be adjusted accordingly
<https://timeapi.io/api/Time/current/zone?timeZone=America/New_York>
<http://purl.org/dc/elements/1.1/date>
"2023-07-31T04:02:55.8602811" .