Quick Start Guide
Introduction
Welcome
We would like to thank you for your interest in Archipelago.
Purpose
This guide is intended to help get you started developing your own pipelines and services using Archipelago.
We will cover installing Archipelago for the first time, along with the other requirements for following this guide and creating some simple applications.
At the end of this document there are links to further information and support.
System Requirements
Archipelago requires Scala 2.13 and a Java JDK (version 11 or above); the remaining dependencies are packaged with Archipelago. We will not cover how to download and install Scala or the JVM in this guide, as this is covered comprehensively in other documents.
This guide is for Archipelago version 0.3.*
Installation Guide
- Installation Steps: [Detailed Installation Instructions]
- Troubleshooting: [Common Installation Issues & Fixes]
- Screenshots: [Visual Guide]
Download Instructions
- Account Creation: [How to Create an Account]
- Product Activation: [How to Activate the Product]
Coded examples
A selection of example Archipelago processes is presented in this section.
Create a new project using any tooling and techniques familiar to you that includes the Archipelago and Mhor jar files that were downloaded and installed in the previous section of the guide. Register the jar files <ARC.JAR> and <MHOR.JAR> as dependencies of your project.
By including the jar files as library dependencies we will have access to all the source code required for the example projects. However, it is preferable to place a copy of the referenced files in the source directory of the project and to use this source code as a starting point, building on and eventually extending the examples to explore what is possible.
Using IntelliJ IDEA, for example, the project structure could look something like this:
This example uses sbt; any other common build system, such as Maven or Gradle, should work just as well.
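As one possible sbt setup, and only as a sketch: sbt treats jar files placed in a project's lib directory as unmanaged dependencies, so copying the two downloaded jar files there is enough to put them on the classpath. The project name and the Scala patch version below are placeholders chosen for illustration, not values prescribed by Archipelago.

```scala
// build.sbt (sketch; the project name and the patch version are placeholders)
ThisBuild / scalaVersion := "2.13.12" // any 2.13.x release, per the system requirements

lazy val root = (project in file("."))
  .settings(
    name := "archipelago-getting-started", // hypothetical project name
    // Jars in this directory are added to the classpath as unmanaged dependencies.
    // lib/ is already sbt's default; it is spelled out here only for clarity.
    unmanagedBase := baseDirectory.value / "lib"
  )
```

With this layout no libraryDependencies entries are needed for the two jars; they are picked up from lib/ when the project is compiled.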
A. Archipelago Hello World
Objective
This example provides the classic and essential “proof of life” from a simple and minimal Archipelago process.
Steps
Assuming that Archipelago has been downloaded and installed, let’s follow the steps necessary to create a “Hello World” process.
Create a component to perform the task of writing “Hello World!” to the console.
Within the mhor_examples library there is a series of examples, including a number dedicated to this guide, that can be found at the following location:
- com/eileantech/archipelago/mhor/getting_started
There is a file HelloWorldComp.scala that contains the component definition (the content of the file is presented below).
package com.eileantech.archipelago.mhor.getting_started

import com.eileantech.archipelago.api.contract._
import com.eileantech.archipelago.core.base.Cell
import com.eileantech.archipelago.core.types.CellCfg
import com.eileantech.archipelago.macros.CellContract

object HelloWorldComp {
  object Go

  @CellContract(Receives[Go.type])
  case class HelloWorldCell() extends Cell[CellCfg] {
    message {
      case Go =>
        println("Hello World!")
    }
  }
}
The code consists of an outer object, HelloWorldComp, which contains the definition of the component and any other related definitions it requires. This is good practice: it imposes clear namespaces and, if required, access controls.
Typically the contents within the wrapper would include one or more ‘Cell’ definitions (Cell being the Archipelago name for a component), config types for the Cells, types used by the Cell definitions and common code (usually in the form of traits which are mixed into the Cell definitions).
Notice that the Cell is a case class, HelloWorldCell, that extends Cell[CellCfg]. All Archipelago components follow this pattern, but they can additionally support class parameters and custom config types that are subtypes of CellCfg. These features will be covered in detail later.
The @CellContract annotation defines the input and output message contracts for the Cell. The contract restricts the message types that the Cell can receive and send, which helps to eliminate mistakes when Cells are composed into services: if the contracts don’t align, the Cells cannot be composed together. Services have contracts too, and these must likewise be compatible.
Archipelago uses message passing rather than method invocation to deliver information to and from Cells. This enables full control over how elements are composed and recomposed.
Lastly, in the code above you can see the message handler defined for the Cell. This handler detects messages received by the Cell by type (and potentially value) and allows a corresponding action to be defined and performed when that particular message type arrives. In this example the code:
    case Go => println("Hello World!")
matches the Go type when a message of that type is sent to the Cell and performs the action of printing the “Hello World!” message to the console.
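Matching by type and by value, as described above, can be tried out in plain Scala without Archipelago on the classpath. The sketch below is not the Archipelago message block; it models a handler as an ordinary partial function. The Greet type and the deliver helper are hypothetical names introduced only for this illustration.

```scala
object MessageHandlerSketch {
  // Go mirrors the guide's example; Greet is a hypothetical second message type.
  object Go
  final case class Greet(name: String)

  // A handler is a partial function: it is defined only for the messages it names.
  val handler: PartialFunction[Any, String] = {
    case Go                   => "Hello World!"       // match the singleton Go
    case Greet("Archipelago") => "Hello Archipelago!" // match by type and value
    case Greet(name)          => s"Hello $name"       // match by type, bind the value
  }

  // Messages the handler does not name produce None instead of an action.
  def deliver(message: Any): Option[String] = handler.lift(message)
}
```

The order of the cases matters: the value-specific Greet("Archipelago") case must come before the general Greet(name) case, otherwise it would never be reached.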
Next we need a System Definition (system spec). Within the getting_started package this is presented in the file HelloWorldSpec.scala.
package com.eileantech.archipelago.mhor.getting_started

import com.eileantech.archipelago.core.systemdef.SystemDef
import com.eileantech.archipelago.mhor.getting_started.HelloWorldComp.{Go, HelloWorldCell}

object HelloWorldSpec extends SystemDef("HelloWorldSpec") {
  reactor("HelloWorldReactor") {
    triggering {
      trigger[Go.type](Go)
    }
    contract {
      receives[Go.type] via "HelloWorldCell"
    }
    cell[HelloWorldCell]("HelloWorldCell") {}
  }
}
The system definition defines how Cells and supervising elements are composed into services, detailing the Cell types and their interconnections.
In the example, HelloWorldSpec extends SystemDef, which provides a DSL to describe the system. The example spec has a single Reactor with a triggering block, which ensures that a message containing an instance of the triggering type is delivered to the Cell specified in the contract block, namely the HelloWorldCell.
Finally, the top-level entity needs to be defined. This can behave like a script, a standalone process or an agent process within a cluster; in this simple example we will stick with a script.
package com.eileantech.archipelago.mhor.getting_started

import com.eileantech.archipelago.core.process.{ArcMainAutoStart, ArcScript}

object HelloWorldService extends ArcScript(HelloWorldSpec) with ArcMainAutoStart
Notice that the object HelloWorldService is runnable, as indicated by the green arrow symbol in the picture below, and can therefore be invoked directly in the IDE (this example is from IntelliJ IDEA; other IDEs may differ).
On invocation the output from the process will be printed to the console. Before examining the output let’s take a look at some of the more interesting log output. In the logback.xml config file we have set up a rolling file appender to the following file location logs/arch.log relative to the resource root.
Open the log file and begin with the “System Webservices” section shown below:
-----------------------------System Webservices Online-----------------------------
System services:
HTML
Dashboard http://127.0.0.1:8009/Dashboard
JSON
Info Report http://127.0.0.1:8009/InfoReport
Status Report http://127.0.0.1:8009/StatusReport
Status Snapshot Report http://127.0.0.1:8009/StatusSnapshotReport
Status Deltas Report http://127.0.0.1:8009/StatusDeltaReport
Structure Report http://127.0.0.1:8009/StructureReport
Services Report http://127.0.0.1:8009/ServicesReport
Ops Report http://127.0.0.1:8009/OpsReport
Agent Report http://127.0.0.1:8009/AgentReport
21:42:44.158 INFO - SystemServicesRoot < ActId-ee64b5aa >: Received ServicesActivated from Actor[akka://UnnamedApp/user/SysSvcRoot/WebServer#1360268232]
21:42:44.158 INFO - Arc[Id-692c6200, ActId-ee64b5aa]: SystemSpec loaded by Builder, notifying host ...
21:42:44.158 INFO - Arc[UnnamedApp] has notified its startup
21:42:44.159 INFO - ArcProcess: Startup called
21:42:44.159 INFO - Arc[Id-692c6200, ActId-ee64b5aa]: Starting services ...
21:42:44.160 INFO - RootSupervisor < ActId-ee64b5aa >: Received ActivateServices(ActId-ee64b5aa)
21:42:44.161 INFO - Builder < ActId-ee64b5aa >: Sending the primary (default) op request ...
21:42:44.161 INFO - Builder < ActId-ee64b5aa >: Starting Application UnnamedApp in ScriptMode ...
21:42:44.161 INFO -
----------------------------------Build Complete----------------------------------
System Structure:
RootSupervisor: "SvcRoot"
Supervisor: "HelloWorldSpec"
Reactor: "HelloWorldReactor"
Cell[HelloWorldCell] "HelloWorldCell"
21:42:44.163 INFO - Controller < id:ActId-ee64b5aa >: Received Root Op Request
21:42:44.166 INFO - RootSupervisor[SvcRoot] Received: StartOp[RootOp, owner: /SysSvcRoot/Controller]
21:42:44.173 INFO - Cell[HelloWorldCell] Received StartOp signal for: RootOp, from: /SvcRoot/HelloWorldSpec/HelloWorldReactor/HelloWorldCell
21:42:44.175 INFO - RootSupervisor[SvcRoot] Received last response to directive StartOp[RootOp, owner: /SysSvcRoot/Controller]
21:42:44.175 INFO - RootSupervisor[SvcRoot] Root for service [RootService] sending status OpIsStarted(RootOp,LocRef[/SysSvcRoot/Controller, with 1 ref, Location Status = LOCAL],Map()) to /SysSvcRoot/Controller
21:42:44.176 INFO - Controller < id:ActId-ee64b5aa >: Root Op has started, sending triggers ...
21:42:44.177 INFO - Controller < id:ActId-ee64b5aa >: Received StartOpConfirm notification at system controller
We can see a table of web service endpoints that are presented by all Archipelago processes. They provide a variety of information about the process in HTML and JSON formats and, more importantly, via the ArcView web app. ArcView is accessed through the dashboard link, by default at http://127.0.0.1:8009/Dashboard, and provides a comprehensive and powerful schematic view of any Archipelago system.
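The JSON endpoints listed in the log can also be read programmatically while the process is running. The sketch below uses only the HTTP client that ships with JDK 11 and later; the base address and report names are taken from the log listing above, while the object and method names are hypothetical. The structure of the returned JSON is not described in this guide, so the body is returned as plain text.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object ReportFetchSketch {
  // Base address as printed in the log above; adjust it if your process reports another.
  val base = "http://127.0.0.1:8009"

  // Builds the address of a report, e.g. endpoint("StatusReport").
  def endpoint(report: String): URI = URI.create(s"$base/$report")

  // Fetches a report body as text. This only works while an Archipelago process is running.
  def fetch(report: String): String = {
    val client  = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(endpoint(report)).GET().build()
    client.send(request, HttpResponse.BodyHandlers.ofString()).body()
  }
}
```

For example, println(ReportFetchSketch.fetch("StructureReport")) would print the structure report of a running process to the console.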
The next section of interest is the “Build Complete” section, which gives a structured listing of the elements comprising the system, in this case featuring the single HelloWorldCell. More typical systems will comprise numerous Cells of various types together with supervising parent elements: Reactors and Supervisors.
We will return to the general log messages later in this guide. Their purpose is to assist diagnostics in what can become fairly complex systems, although the intention is to soon replace the majority of standard logged output with a much more powerful system for recording, presenting and analysing operational data from systems. Logging will always be supported in Archipelago systems, but its significance and processing overhead will reduce dramatically over time.
So what about the output? To be honest, it is pretty underwhelming. Let’s take a look:
The HelloWorldCell, the active element which produces the eponymous message, is hardly excessive at 18 lines of code and whitespace. Most of these lines are for naming, scope management and slightly more esoteric purposes such as contract and config definition and message reception and processing. These aspects are usually present in all Cell types, and as Cells become more sophisticated they will usually be presented within an outer wrapper which also contains any other definitions required by the Cells. These inclusions will be covered in the next (more sophisticated) example.
Cells can also be defined with extension in mind, as abstract classes to be subclassed into concrete types while defining all common factors present in the resulting family of derived Cell types.
Cells are designed intentionally to encourage re-use, thereby allowing common functionality to be evolved and refined to the benefit of increasingly wider audiences. Cells make strong guarantees about mutual exclusion during invocation which, in certain cases, allow shared and interleaved accesses to be efficiently made.
Let’s move on to a more involved example, which introduces more of the features of Archipelago. The next example hints strongly at the origins of the platform, rooted in the world of ETL processing for FinTech.
B. Trade Aggregation Service
Objective
One typical usage of Archipelago is in the creation of ETL (Extract, Transform, Load) feeds and services. Routing information between persistence, middleware and the systems processing the information is usually accompanied by inline validation, conversion, transformation, enrichment and marshalling. Archipelago is very well suited to this flow-based form of processing and to service-based architectures as a whole.
In this example we will create a simplified trade aggregation service. This is a common type of operation within most financial organisations: it allows the firm to benefit from placing larger trade sizes on the market, to reduce the number of trades executed and to select optimal bid and ask prices in its favour that are still acceptable to its clients or counterparties.
The service processes a stream of trade objects applying some basic aggregation rules to merge individual trades non-destructively into equivalent singular trades (resulting in one aggregate trade per category with a traded quantity equivalent to the sum of buys and sells). The aggregation trades are generated as replacements for the original trades in their categories and the original trades are marked up with a link to their replacement aggregation trade. The updated trades and the aggregation trades are passed out of the service.
In this example, the original and resultant trades are read from and written to the file system but in practice they would typically be received from and delivered to upstream and downstream services equivalently.
This example is focused on exploring Archipelago concepts, not on providing a realistic or complete implementation of an aggregation service. That said, we are sure it would be fun to evolve something more capable and realistic from this example as an exercise, especially as Archipelago’s features become more familiar.
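Before looking at the service itself, the aggregation rule described above can be sketched in plain Scala, independently of Archipelago. Everything in this sketch is an assumption made for illustration: the shape of the Trade type, the use of a signed quantity (buys positive, sells negative) and the AGG- identifier scheme are not taken from the example library.

```scala
object AggregationSketch {
  // Hypothetical trade shape: quantity is signed (buys positive, sells negative).
  final case class Trade(
      id: String,
      category: String,
      quantity: Long,
      aggregatedInto: Option[String] = None // link to the replacement aggregate trade
  )

  // Returns (originals marked with their aggregate's id, one aggregate per category).
  def aggregate(trades: Seq[Trade]): (Seq[Trade], Seq[Trade]) = {
    def aggregateId(category: String): String = s"AGG-$category"

    val aggregates = trades
      .groupBy(_.category)
      .toSeq
      .sortBy { case (category, _) => category } // deterministic output order
      .map { case (category, members) =>
        Trade(aggregateId(category), category, members.map(_.quantity).sum)
      }

    // Originals are kept intact apart from the added link (non-destructive merge).
    val marked = trades.map(t => t.copy(aggregatedInto = Some(aggregateId(t.category))))
    (marked, aggregates)
  }
}
```

Both sequences would be passed out of the service: the marked originals preserve the audit trail, and the aggregates carry the net quantity per category.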
Steps
Assuming again that Archipelago has been downloaded and installed, let’s follow the steps necessary to create the aggregation service.
Within the mhor_examples library, the code for the aggregator service can be found at the following location:
- com/eileantech/archipelago/mhor/getting_started/
Let’s start with the top level ‘Spec’ for the service (ETLTradesAggregator).
object ETLTradesAggregatorSpec extends SystemDef("ETLSpec") {
  case class Go()

  routing {...}

  reactor("CoordinatorReactor") {...}
  reactor("TradeReaderReactor") {...}
  reactor("AggregatorReactor") {...}
  reactor("TradeWriterReactor") {...}
}

object ETLTradesAggregator extends ArcScript(ETLTradesAggregatorSpec) with ArcMainAutoStart
The system specification contains four reactors and a routing section. All of these definitions are collapsed (folded) to allow us to focus on the top-level structure; we will return to the details of each in turn.
The four reactors define a coordinating function and three services, one per remaining reactor. The Coordinator orchestrates the startup of the system, delivers a request to each of the three services to prepare them for operation, and handles the conclusions of those requests.
The services respond directly to the request placed on them, to data subsequently arriving during the processing of the request, or to both. The services conclude their processing by declaring success or failure of the request as signals back to the coordinator.
Each service’s response to a request is termed an Operation (a lightweight transaction) and the sender of the request is the Operation Owner, the coordinator in this case.
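The relationship between requests, Operations and their owner can be modelled in a few lines of plain Scala. This is a conceptual sketch only: none of the types below are Archipelago types, and the real platform conveys these conclusions as signals rather than method calls.

```scala
object OperationSketch {
  sealed trait Outcome
  case object Succeeded extends Outcome
  final case class Failed(reason: String) extends Outcome

  // An Operation is one service's response to one request, owned by the request's sender.
  final case class Operation(service: String, owner: String, outcome: Option[Outcome] = None)

  final case class Coordinator(name: String, operations: Map[String, Operation] = Map.empty) {
    // Sending a request opens an Operation that this coordinator owns.
    def request(service: String): Coordinator =
      copy(operations = operations + (service -> Operation(service, owner = name)))

    // A service concludes its Operation by reporting success or failure to the owner.
    def conclude(service: String, outcome: Outcome): Coordinator =
      operations.get(service) match {
        case Some(op) => copy(operations = operations + (service -> op.copy(outcome = Some(outcome))))
        case None     => this // ignore conclusions for Operations that were never started
      }

    def allConcluded: Boolean =
      operations.nonEmpty && operations.values.forall(_.outcome.isDefined)

    def allSucceeded: Boolean =
      allConcluded && operations.values.forall(_.outcome.contains(Succeeded))
  }
}
```

In this model the coordinator can decide what to do next only once allConcluded is true, which mirrors the way the coordinator in the example handles the conclusions of its three requests.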
The routing section contains the linkage between each reactor and its siblings. We’ll expand on this later.
First let’s take a look at the CoordinatorReactor:
reactor("CoordinatorReactor") {
  triggering { trigger[Go](Go()) }
  contract {
    receives[Go] via "Coordinator"
    sends[TradeReaderRequest] via "Coordinator"
    sends[TradeAggregateRequest] via "Coordinator"
    sends[TradeWriteRequest] via "Coordinator"
  }
  cell[Coordinator] {}
}
Following on from the notion of contracts that apply to Cells, higher level containers, such as reactors and their parent supervisors also have contracts that can be used to restrict the input and output types that they allow to be routed to their child elements. These contracts also serve to document the function of a particular reactor and constrain its composition within a system.
In the above definition, you can see sections for triggering, contract and cell. Let’s look at each in turn:
- Cell: one cell definition per instantiated cell within the reactor.
- Contract: the input and output types allowed for this reactor.
- Triggering: messages delivered by the process to the reactor at startup.
For a coordinator, it is very typical to see request types in its contract; this indicates the use of lower-level services. Request types informally have a ‘Request’ suffix in their type name and, more formally, are subtypes of the Archipelago ServiceRequest type. Signals are messages within Archipelago that don’t need to be declared in contracts or specifically referenced; they convey status and commands between elements and Archipelago processes.
Next we will look at the TradeReaderReactor:
reactor("TradeReaderReactor") {
  contract {
    receives[TradeReaderRequest] via "TradeReader"
    sends[Trade] via "TradeReader"
  }
  cell[TradeReader] {}
}
Again we see the same structure, but without the need for triggering (the coordinator handles delivery of a request to the reader, which initiates the read operation). You can see that the reader receives TradeReaderRequest and sends Trade. Trade is the main data type, and it is routed to the aggregator.
You may well ask, “Where are the services?” The answer is that a service is the capability of a reactor to respond to a request of a particular type.
A reactor can accept multiple request types and thereby offer the same number of services. A request type can be qualified by a path to the Cell accepting that request type, so it is feasible to have multiple Cells that accept the same request type in a single reactor, but that is unlikely to be useful and is getting rather ahead of ourselves. Why support multiple request types in a reactor (or Cell)? Mostly to benefit from locality and from state that is interrelated between service Operations.
So in this example system, I am sure you will be happy to hear, there is only a single service offered by each service reactor, three in total. There is also an intrinsic ‘Root’ service, which is present in every Archipelago process and allows simple behaviours to be built without considering services or hierarchies of services. The Root service runs a single Operation which spans the lifetime of the process; it is the service for initial coordinators and simple single-reactor scripts and apps.
Let’s now see what the routing section contains:
routing {
  route[TradeReaderRequest] from "CoordinatorReactor" to "TradeReaderReactor"
  route[TradeAggregateRequest] from "CoordinatorReactor" to "AggregatorReactor"
  route[TradeWriteRequest] from "CoordinatorReactor" to "TradeWriterReactor"
  route[Trade] from "TradeReaderReactor" to "AggregatorReactor"
  route[Trade] from "AggregatorReactor" to "TradeWriterReactor"
}
The routing section specifies the wiring of the reactors. As noted above, each reactor specifies a contract which indicates which messages it can receive or send and which internal cell those messages are delegated to or from. The contracts, however, do not specify where those messages are sent or from where they are received. This is specified in the routing section.
In our example above, we can see that the TradeReaderRequest is delivered from the CoordinatorReactor to the TradeReaderReactor. Similarly, Trade messages are sent from TradeReaderReactor to AggregatorReactor and in turn from AggregatorReactor to TradeWriterReactor.
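The way routes and contracts constrain one another can be illustrated with a small plain-Scala check, shown below as a sketch rather than as a description of how Archipelago validates a system. Message types and reactors are represented by their names as strings. The coordinator and reader contracts are copied from the definitions above; the aggregator and writer contracts are not shown in this section, so the ones used here are assumptions inferred from the routes.

```scala
object RoutingCheckSketch {
  final case class Contract(receives: Set[String], sends: Set[String])
  final case class Route(message: String, from: String, to: String)

  // Coordinator and reader contracts follow the guide; the aggregator and writer
  // contracts are ASSUMED from the routes, as their definitions are not shown.
  val contracts: Map[String, Contract] = Map(
    "CoordinatorReactor" -> Contract(
      Set("Go"),
      Set("TradeReaderRequest", "TradeAggregateRequest", "TradeWriteRequest")),
    "TradeReaderReactor" -> Contract(Set("TradeReaderRequest"), Set("Trade")),
    "AggregatorReactor"  -> Contract(Set("TradeAggregateRequest", "Trade"), Set("Trade")),
    "TradeWriterReactor" -> Contract(Set("TradeWriteRequest", "Trade"), Set.empty)
  )

  val routes: Seq[Route] = Seq(
    Route("TradeReaderRequest", "CoordinatorReactor", "TradeReaderReactor"),
    Route("TradeAggregateRequest", "CoordinatorReactor", "AggregatorReactor"),
    Route("TradeWriteRequest", "CoordinatorReactor", "TradeWriterReactor"),
    Route("Trade", "TradeReaderReactor", "AggregatorReactor"),
    Route("Trade", "AggregatorReactor", "TradeWriterReactor")
  )

  // Returns one description per route end that is not backed by a contract.
  def problems(routes: Seq[Route], contracts: Map[String, Contract]): Seq[String] =
    routes.flatMap { r =>
      val senderOk   = contracts.get(r.from).exists(_.sends.contains(r.message))
      val receiverOk = contracts.get(r.to).exists(_.receives.contains(r.message))
      Seq(
        if (senderOk) None else Some(s"${r.from} does not send ${r.message}"),
        if (receiverOk) None else Some(s"${r.to} does not receive ${r.message}")
      ).flatten
    }
}
```

Under these assumed contracts every route in the example is backed at both ends, whereas a route such as Trade from CoordinatorReactor to TradeWriterReactor would be reported, because the coordinator's contract does not list Trade as a sent type.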