Welcome to Claro!

Claro is a statically compiled JVM language whose goal is first and foremost to provide you with the tools for reasoning about program design and construction using a very clear and direct mental model - data flowing through a system of steps where new data may be produced, and existing data may be updated by the system at each step. This stated mental model is intentionally simple. Where it may be vague, I hope to add clarity in the rest of these intro docs.

Where other languages tend to mostly fall into the "Object-Oriented" (Java, Python, etc.) paradigm, or the "Functional Programming" (Haskell, Elm, Roc, etc.) paradigm, Claro eschews both of these extremes and lands somewhere more closely resembling a mix of procedural and declarative programming. Claro aims to be easily digestible to programmers coming from other paradigms by striving to add as few totally novel features as possible - only adding net-new ideas where they can be easily understood by any user with moderate effort, and only after the net-new idea has proven to provide some substantial and observable benefit in the real world. Claro is not an esoteric language to be marvelled at by experts or language geeks (like myself). Claro is a practical language whose mission is to make writing readable, extensible, and performant programs significantly easier than it is with existing languages and tools. Rather than depend heavily on layers of frameworks to achieve things like dependency injection, safe concurrency and more, Claro gives you powerful capabilities out of the box.

Learning Claro will involve a bit of unlearning of previous language principles, but will leave you with a single, well-lit path.

Hello, World!

print("Hello, World!");

As you can already see from the most minimal program possible, Claro programs eliminate unnecessary boilerplate. Every Claro program is simply a sequence of statements that are executed from top to bottom as if it were a "script". You don't need to specify a "main" method as in other languages like Java. Instead, much like Python, you simply specify a starting file which will execute top-down at program start.

(Note: For now, almost all Claro statements must end with a semicolon. I am aware that many people aren't fans of this, so a TODO item exists to attempt removal of semicolons from the grammar.)

Common Programming Concepts

Variables & Primitive Types

Claro is a statically-compiled, strictly typed language. Practically speaking, this means that the type of all variables must be statically determined upon declaration of the variable, and may never change thereafter.

Claro has a few builtin "primitive" types representing generally small or low-level "value types" that are immutable to the programmer. They are referred to as "primitive" because they are foundational to the language's type system, and make up the basic building blocks of which every other type in the language is just some structured combination. Values of these primitive types are generally cheap to allocate on the stack, and are passed as copies to other functions (strings, being handled in typical JVM fashion, are actually heap allocated with references to strings passed instead of copying the value itself).

More are coming soon, but for now the supported set of primitives includes: int, float, boolean, string. The example below shows how you'd define variables to represent values of each type:

# All immutable.
var i: int = 10; # Any whole number. 
var f: float = 1.15; # Any decimal number.
var b: boolean = true; # true or false.
var s: string = "very first string"; # Any sequence of chars. Heap allocated.

To break the syntax down further:

var : Keyword introducing / declaring a new variable.

b : the name we chose for this particular var.

: : a syntactic divider between a variable's name and its type.

boolean : the type of the variable, which constrains the domain of values which this variable may hold.

Separate Variable Declaration & Initialization

The previous example demonstrates declaring a new variable and initializing it in a single statement. It is also possible to delay initialization so that it happens independently of declaration.

var i: int;
i = 10;

(Note: this is particularly useful when you may want to initialize to different values in different branches of an if-else chain for example.)

Variable Reassignment

By definition, the value represented by a variable may vary, or change, over time throughout the program:

var s: string = "Hello";
s = "goodbye";

Control Flow

Claro has only a few control flow structures at the current moment. These structures allow programs to execute code both conditionally and repeatedly. The only thing to keep an eye on, coming from a dynamic language like Python, is that Claro will statically validate that you do not misuse conditional execution to run code that may attempt to use a variable before initialization. The examples in the following sections will also demonstrate invalid code that Claro rejects with a compile-time error.

If-Else

var x: boolean = getBoolFromUser();
var s: string; # Declared but uninitialized. 
if (x) { # Curly braces are not optional.
    s = "blue";
} else if (...) {
    s = "green";
} else {
    s = "red";
}
print(s); # Prints "blue", "green", or "red".

The above example is valid, but would become a compilation error if you removed one of the branches, because s might not have a value when you go to print it:

var x: boolean = ...;
var s: string;
if (x) {
    s = "blue";
} else if (...) {
    s = "green";
}
print(s); #Error: Use of uninitialized var.
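
For contrast, here is a rough Python analogue of the invalid snippet above (Python, not Claro; the function name `pick_color` and its second parameter are made up for illustration). It sketches the kind of failure that Claro's compile-time check is meant to rule out: Python accepts the code and only fails when the unassigned path actually runs.

```python
def pick_color(x: bool, y: bool) -> str:
    # Mirrors the invalid Claro snippet: there is no final `else` branch.
    if x:
        s = "blue"
    elif y:
        s = "green"
    return s  # `s` was never assigned if both x and y are False.


print(pick_color(True, False))  # blue

try:
    pick_color(False, False)
except UnboundLocalError as err:
    # Python only discovers the problem when this path actually executes.
    print(f"runtime failure: {type(err).__name__}")
```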

While Loops

var i: int = 0;
while (i < 10) {
    print (i++);
}

Possible use of an uninitialized variable is a compile-time error:

var s: int;
while (...) {
    s = ...;
}
print(s); #Error

(Note: At the moment Claro has no builtin mechanisms for breaking out of a loop early or skipping ahead to the next iteration. You'll have to do this manually which is really annoying for now - stay tuned.)

Pipes

Piping is a control flow mechanism that is not common to all languages, but is truly just syntactic sugar (one of the few pure-sugar features in the language). Piping gives you a mechanism to decompose deeply nested function calls into a linear chain of operations that happen one after the other, much like any other imperative code you're familiar with. The main thing to know is that on each line beginning with the "pipe" operator |>, the token ^ (known as the "backreference" operator) refers to the value of the expression before the pipe operator. The ^ operator is intended to visually resemble an arrow pointing upwards to the value produced on the line above.

var firstPipingSource = ["Claro", "piping", "is", "so", "cool"];

firstPipingSource
  |> getFirstAndLast(^)
  |> join(^, " is damn ")
  |> "{^}! I'll say it again... {^}!!"  # <-- Can backreference prev value more than once.
  |> var firstPipingRes = ^;

print(firstPipingRes);

The above prints the following:

Claro is damn cool! I'll say it again... Claro is damn cool!!

Compare to the highly nested alternative without piping (notice how use of piping in the above example even allows elimination of an entire temporary variable):

var nonPipingSource = ["Claro", "piping", "is", "so", "cool"];

# With piping, this tmp var is unnecessary.
var joinedTmp = join(getFirstAndLast(nonPipingSource), " is damn ");

var nonPipingRes = "{joinedTmp}! I'll say it again... {joinedTmp}!!";

print(nonPipingRes);
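
To make the "pure sugar" claim concrete, the pipe chain can be read as a sequence of plain assignments. The sketch below is Python rather than Claro, and the bodies of `get_first_and_last` and `join` are assumptions, since these docs use the helpers without defining them. The variable `prev` stands in for the `^` backreference.

```python
def get_first_and_last(words):
    # Assumed behavior of the doc's `getFirstAndLast`: keep the two end elements.
    return [words[0], words[-1]]


def join(words, separator):
    # Assumed behavior of the doc's `join`.
    return separator.join(words)


source = ["Claro", "piping", "is", "so", "cool"]

# Each pipe stage becomes one assignment; `prev` plays the role of `^`.
prev = source
prev = get_first_and_last(prev)
prev = join(prev, " is damn ")
prev = f"{prev}! I'll say it again... {prev}!!"
first_piping_res = prev

print(first_piping_res)
# Claro is damn cool! I'll say it again... Claro is damn cool!!
```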

Types

Claro is a statically typed, compiled programming language. This means that if you attempt to assign a value of the wrong type to a variable, Claro will emit a compiler error asking for a correction before your program will be able to run. This will prevent you from waiting until runtime to find many program errors.

var s: string = 100.55; # Compiler Error - expected string found float.

Builtin Collections

Claro also rounds out its builtin types with a small set of convenient collection types that allow you to manipulate many values using a single variable. These are provided as builtins for your convenience, but their implementations have been hand selected to cover the majority of your general purpose programming use cases.

Lists

The simplest collection type allows you to keep an arbitrary number of values in some ordering. The list is very much like a Python list in that it allows arbitrary appends, and random access to read values at a 0-based index. Unlike Python, as Claro is statically typed, all values in the list must be of the same type, and this type must be pre-determined upon declaration of the variable which references the list.

var l: [int] = [1, 3, 7, 2, -115, 0];
append(l, 99);
print(len(l)); # 7
print(l[1] == l[0]); # false
print(l[6] == 99); # true

Empty Lists

It's worth noting that Claro has no way of inferring the correct element type of an empty list when its type is not constrained by context. For example, the below variable declaration would be a compile error:

var l = []; # Compiler Error: ambiguous type.

Empty List Type Inference By Later Usage (Will Never Be Supported)

You might think that Claro should be able to infer the type intended for this empty list based on the later usage of the variable it's assigned to. Claro takes the opinionated stance that this would be inherently undesirable behavior. Type inference shouldn't follow some esoteric resolution rules. It would be all too easy to implement a complex type inference system that can infer types far better than any real world human reader could - the end result would simply be enabling code to be written that is intrinsically difficult for your colleagues (and your future self) to read later on. This is an anti-goal of Claro.

The following will never be supported:

# Hypothetically, Claro could infer that the type of `l` is [string] based
# solely on the usage of `l` later on.
var l = []; 

...a bunch of code...

append(l, "foo");

Sets

Claro sets are much like Python sets, with a fixed, single type for all elements. You may initialize them with many elements and then check for membership in the set later.

var mySet: {int} = {1, 6, -12};
print(10 in mySet); # false
print(6 in mySet); # true

(Note: for now the usefulness of sets is very limited as the available operations are extremely limited. A serious TODO is open to support all expected set operations: add, remove, union, intersect, etc.)

Maps

A mapping of keys of a fixed type to values of a fixed type.

var myMap: {string: int} = {};
myMap["Jason"] = 28; 
print("Jason" in myMap); # true
myMap["Kenny"] = 29;
print(myMap); # {"Jason": 28, "Kenny": 29}

(Note: for now maps are also missing many useful operations that should be builtin. Will be built as part of stdlib later.)

Tuple

Tuples are a fixed-order, fixed-size collection of values which do not all have to be of the same type.

var myPair: tuple<int, string> = (1, "one");

# Claro will interpret literal int subscripts at compile-time for type validation.
var myInt: int = myPair[0]; 
var myStr: string = myPair[1]; 

# Claro requires a type cast for non-literal index.
var index: int = ...;
myInt = myPair[index]; # Compile Error
myInt = (int) myPair[index]; # OK, opting into runtime type validation.

As you can see in the example above, tuples interact with type validation in an interesting way worth noting. When you index into a tuple, you should generally prefer to use a literal int constant. When you do, Claro can statically determine the type of the value you're accessing at compile time, which allows cleaner, safer code. If your index value is hidden behind some indirection, Claro can't know the type at compile-time and will require a runtime type cast (which is slow & unsafe).

Aliases

Aliases are a powerful feature that allows the expression of arbitrary types. In their simplest form, they may be used as syntactic sugar to reduce keystrokes and cognitive overhead from typing out a full type literal.

# You can imagine typing this out is verbose/annoying.
alias IntsTo2TupleFn: function<|int, int| -> tuple<int, int>>

var swapped: IntsTo2TupleFn = lambda (a, b) -> (b, a);
print("swapped(1, 2) -> {swapped(1, 2)}"); # swapped(1, 2) -> (2, 1)

var doubled: IntsTo2TupleFn = lambda (a, b) -> (2*a, 2*b);
print("doubled(1, 2) -> {doubled(1, 2)}"); # doubled(1, 2) -> (2, 4)
type(doubled); # function<|int, int| -> tuple<int, int>>

var ddd: [IntsTo2TupleFn] = [doubled];
type(ddd); # [function<|int, int| -> tuple<int, int>>]

Aliases are Not Syntactic Sugar

To be absolutely clear, Aliases are not simply syntactic sugar as shown in the trivial example above. Without aliases there would be no way to define a recursive data type in the language. Read on to the next sections to learn about recursive alias type definitions.

Aliases are Not a New Type Declaration

It's important to know that, in general, defining an Alias does not declare a "new type", instead it is just providing a shorthand for referring to some type. For simple (non-recursive) Alias definitions, you are simply defining a new, more convenient way to refer to a type that is equivalent to typing out the long-form of the type.

The example below demonstrates how variables with types declared using equivalent aliases will in fact type-check as having the same type:

alias IntList1 : [int]
alias IntList2 : [int]

var i1: IntList1 = [1];
var i2: IntList2 = [2];
i1 = i2;                    # IntList1 is equivalent to IntList2.

var iLiteral: [int] = [3];
i2 = iLiteral;              # IntList2 is equivalent to [int].

Aliases as "Structural Typing"

Claro's Aliases are a mechanism to define values with "structural types". Roughly speaking, this is why any variables declared with "structurally equivalent" aliases are considered to have interchangeable types, regardless of the originally used alias in each variable's declaration. Many languages use a different form of typing known as "nominal typing" which implies that the name of the type is the thing that determines equivalence, rather than the structure, but this is not the case with Aliases in Claro.
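
As a loose analogy only (Python's type hints, not Claro's type system), the same structural/nominal distinction shows up in Python's `typing` module. In this sketch the names `UserId` and `OrderId` are hypothetical, chosen purely for illustration.

```python
from typing import List, NewType

# Two alias names for the same structure: Python treats them as the same type,
# much like `IntList1` and `IntList2` in the Claro example above.
IntList1 = List[int]
IntList2 = List[int]
aliases_are_interchangeable = IntList1 == IntList2

# NewType is Python's opt-in nominal wrapper: same underlying structure (int),
# but static checkers such as mypy treat the two names as distinct types.
UserId = NewType("UserId", int)
OrderId = NewType("OrderId", int)
names_are_distinct = UserId is not OrderId

print(aliases_are_interchangeable)  # True
print(names_are_distinct)           # True
```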

Note on "Nominal Typing"

Nominal typing can actually be very useful, however, for enforcing maintenance of inter-field invariants in structured data, so, in the future Claro will provide a mechanism to define new, "nominally typed" type definitions. This will allow making a distinction between two "structurally equivalent" types that have different names.

This will let you confidently protect data whose type carries inter-field invariants that must be maintained across mutations: such data could no longer be semantically invalidated by passing it off to a mutating procedure that wasn't implemented with the nominal type's invariants in mind. This kind of semantic invalidation would be easy to run into if structural typing were the only type validation scheme available, as structural typing makes type equivalence decisions with zero semantic knowledge.

Stay tuned for opt-in nominal types support.

(Advanced) Recursive Alias Type Definitions

A more advanced usage of type aliases includes using recursive self-reference(s) to the alias type in order to define recursive types. This allows you to define self-similar nested structures of arbitrary (but finite) depth as in the following example:

# TODO(steving) Rewrite this example w/ oneof<int, [IntOrList]> when supported.
alias IntOrList : tuple<boolean, int, [IntOrList]>

print("All of the following values satisfy the type definition for IntOrList:");
var myIntOrList: IntOrList;
myIntOrList = (true, 9, []);
print(myIntOrList); # (true, 9, [])

myIntOrList = (false, -1, []);
print(myIntOrList); # (false, -1, [])

myIntOrList = (false, -1,
  [
    (true, 2, []),
    (false, -1,
      [
        (false, -1, []),
        (true, 99, [])
      ]
    )
  ]
);
print(myIntOrList); # (false, -1, [(true, 2, []), (false, -1, [(false, -1, []), (true, 99, [])])])

append(myIntOrList[2], (true, 999, []));
print(myIntOrList); # (false, -1, [(true, 2, []), (false, -1, [(false, -1, []), (true, 99, [])]), (true, 999, [])])
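
A typical reason to want such a type is to walk it recursively. The sketch below uses Python rather than Claro and assumes a convention that the example implies but never states: when the first field is true the node is a plain int, otherwise the node is a list and its int field is a placeholder. The function name `collect_ints` is made up for illustration.

```python
from typing import List, Tuple

# Python has no direct equivalent of Claro's alias declaration; the string
# "IntOrList" is a forward reference that makes the annotation self-referential.
IntOrList = Tuple[bool, int, List["IntOrList"]]


def collect_ints(node: IntOrList) -> List[int]:
    is_int, value, children = node
    if is_int:
        return [value]
    # An empty child list is the "bottom" that ends the recursion.
    collected: List[int] = []
    for child in children:
        collected.extend(collect_ints(child))
    return collected


my_int_or_list: IntOrList = (
    False, -1,
    [
        (True, 2, []),
        (False, -1, [(False, -1, []), (True, 99, [])]),
        (True, 999, []),
    ],
)

print(collect_ints(my_int_or_list))  # [2, 99, 999]
```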

(Advanced) Impossible-to-Initialize Recursive Alias Type Definitions

Some recursive aliases like the following should be rejected at compile-time because they're impossible to instantiate. The issue with these aliases is that the type recursion has no implicit "bottom" and implies an infinitely nested value. Because it's impossible to ever initialize a value composed of infinitely many values (you'd never finish typing the code), Claro lets you know right away at compile time that the infinite type is rejected for being unusable.

The below alias definitions all trigger compile-time errors from Claro indicating that these types aren't usable and are therefore illegal.

alias IllegalUnboundedRecursiveAlias : tuple<int, IllegalUnboundedRecursiveAlias>
alias InfiniteRecursion : InfiniteRecursion
alias PartialUnbounded : tuple<PartialUnbounded, [PartialUnbounded]>

Example error message:

Impossible Recursive Alias Type Definition: Alias `IllegalUnboundedRecursiveAlias`
represents a type that is impossible to initialize in a finite number of steps. To
define a recursive type you must ensure that there is an implicit "bottom" type to
terminate the recursion. Try wrapping the Alias self-reference in some builtin
empty-able collection:
	E.g.
		Instead of:
			alias BadType : tuple<int, BadType>
		Try something like:
			alias GoodType : tuple<int, [GoodType]>
...

Type Inference

So far, through each code snippet you've seen, each variable has always included an explicit type declaration. This may be useful for the sake of very explicit readability, however, these type annotations littering your entire codebase may begin to feel very clunky and inconvenient - particularly when the type is very obvious to the reader, or sometimes if it becomes very long to type (as the result of many layers of nested collections for example). In almost every case, however, these explicit type annotations are optional in Claro!

Claro is smart enough to be able to infer the vast majority of types in any given program. So, unless you feel that the type annotation being present makes the code more readable in a particular situation, you can generally omit it entirely! Please keep in mind, however, that while this may indeed make your code visually resemble something like Python or JavaScript, Claro is 100% statically typed. Therefore, in this regard, Claro is much more like Rust/Java/Haskell than it is like any dynamic language. And, importantly, Claro is not an "Optionally Statically Typed" language like TypeScript - the compiler must always statically know the type of every value; you may at times simply choose to avoid explicitly including the type annotation in the source code.

Inference Examples

Instead of:

var i: int = 1;
var b: boolean = true;
var l: [tuple<int, boolean>] = [(1, true), (2, false)];

You could write:

var i = 1;
var b = true;
var l = [(1, true), (2, false)];

Each corresponding statement has exactly the same meaning. They differ only syntactically. Each variable is still declared to have the same static type you'd expect.

Required Type Annotations

Some of this will be skipping ahead to more advanced topics that haven't been brought up yet, so come back to this part later if you want to, but just take away the fact that there are some limited situations where Claro will require a type annotation to understand your intent. Note that these situations are not just a limitation of the compiler: even if Claro could somehow choose a type for you in these situations, your colleagues (or your future self) would struggle to comprehend what type was being inferred.

For clarity and correctness in the following situations, you will be required to write an explicit type annotation:

  1. Function / Consumer args.
  2. Lambda Expressions assigned to variables.
  3. Non-literal Tuple Subscript - in the form of a runtime type cast.
  4. Function/Provider call for a Generic return type - when the generic return type can't be inferred from arg(s) of the same generic type.
  5. Any of the above Expressions when passed to a generic function arg position.

Types Constrained by Context

Whenever a type is contextually required, the value/expression placed in that position will be type checked to have the expected type. Otherwise, the compiler tries to infer the type.

For example, when you assign a value to a variable declared to have some type, the assigned value must contextually have the same type as the variable, and Claro will statically type-check that this is true:

var i: int = 10;
i = "foo"; # Error. Expected int found string.

Alternatively, Claro may infer the type of a newly declared variable instead by checking against the known type of the value being assigned:

var i: int = 10;
var i2 = i; # Ok. Claro infers that i2 must be an int.

If the context does not provide enough information for some type to be inferred, you would be required to annotate your intended type:

var unknown; # Error. Each var's type must be set at declaration time.
var known: string; # Ok.

Procedures

All languages tend to have a way to encapsulate a block of logic in one place so that it can be reused throughout the program. Generally, however, languages tend to provide only a single tool for this job, the function. The problem I see with this is that not all functions in these languages are created equal, yet they're all forced to share the same structure, which has some unfortunate implications. The general idea is straightforward: a function takes in some data, manipulates it somehow, and possibly returns some data. However, not all functions take input, and not all of them return data ("void" is not data... looking at you, Java and friends). To me, using a single structure, the function, for meaningfully different purposes is very unclear. Claro addresses this by getting specific. Claro provides "Procedures" broken into a few sub-categories: Functions, Consumers, and Providers.

Functions

A Procedure that takes in some data and returns some data.

function add(x: int, y: int) -> int {
    return x + y;
}

# Call the function.
var res = add(10, 5);
print(res); # 15

Consumers

A Procedure that takes in some data but doesn't return any data.

consumer dump(s: string, age: int, heightFt: int) {
    # String formatting.
    print("{s} is {age} years old and {heightFt}ish feet tall.");
}

# Calling the consumer. Syntactically, consumers are always used as statements,
# never as an expression (something that has a value).
dump("Laura", 28, 5); # Laura is 28 years old and 5ish feet tall.

Note: Consumers tend to be an inherent waste of computation time unless that consumer does some side-effecting operation observable outside the program scope. So, it may be a useful hint that if you're reading code that includes a call to a consumer, some I/O is very likely taking place (if not, you should delete the call entirely as it's a waste of work).

Providers

A Procedure that takes in no data but returns some data.

provider getInt() -> int {
    return 10;
}

# Calling a provider.
var myInt = getInt();
print(myInt); # 10

Lambdas & First Class Procedures

Claro opens you up to taking full advantage of functional programming techniques by allowing you to assign Procedures to variables and to pass them around as data, allowing you to hand them off to be called later. As such you can do the following:

var f: function<int -> int> = x -> x + 1;
var c: consumer<int> = x -> { print(x); };
var p: provider<int> = () -> 10;

You may also reference defined procedures as data:

function add(x: int, y: int) -> int {...}

var biConsumer: consumer<int, int, function<|int, int| -> int>>
    = lambda (x, y, mapFn) -> {
        print(mapFn(x, y));
    };

# Pass a reference to the `add()` function as a first class arg.
biConsumer(10, 5, add); #15.

Lambdas are "Closures" (for now)

Technically a "Closure" is a lambda that is able to capture long-lived references to the values defined outside the body of the lambda, importantly, keeping that reference even as the lambda itself leaves the scope (passed into another scope or returned). This is exactly how Python lambdas work, for example.

Unfortunately, this leads to hard-to-understand code as you end up with "spooky action at a distance" where calling a lambda can cause some faraway data to be changed without realizing it. For Claro's more advanced "Fearless Concurrency" goals, this is even worse because it represents hidden mutable state which would invalidate Claro's goals of making multithreaded code unable to run into data races. Instead, to solve this, when lambdas reference names in outer scopes, they make a local copy, and can't mutate the outer scope.
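
The difference between capturing a reference and capturing a copy can be seen in Python itself. This is a sketch in Python, not Claro, and the names `demo`, `by_reference`, and `by_copy` are illustrative. The default-argument trick is a Python idiom for taking a snapshot at definition time; it only approximates the copy semantics described above.

```python
def demo():
    i = 0

    # Captures the variable itself: `i` is looked up when the lambda is called.
    by_reference = lambda x: i + x

    # Default arguments are evaluated once, at definition time, so this lambda
    # holds a snapshot of `i` taken right here.
    by_copy = lambda x, i=i: i + x

    i = 100  # A later change elsewhere in the enclosing scope.

    return by_reference(1), by_copy(1)


print(demo())  # (101, 1)
```

The first result shows the "spooky action at a distance": the lambda's behavior changed because a variable outside its body was reassigned after the lambda was created.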

The Bad (The TODO)

The below example demonstrates the main current implementation flaw of lambdas which will be updated to ensure that lambdas are always pure functions:

var i = 0;
var f: function<int -> int> = x -> {
    i = i + x; # `i` is captured, and also locally updated. Will be impossible in the future. 
    return i;
};

print(f(0)); # 0   <-- `f` is stateful and mutates its internal state on each call.
print(f(1)); # 1
print(f(5)); # 6
print(f(5)); # 11  
print(i);    # 0   <-- at least `i` is still unchanged.

Generics

Oftentimes, you'll find that some code patterns keep coming up over and over and you'll want to find some way to factor out the major commonalities in logic from the minor specific details that you'd want to just plug in as needed. For example, you might realize that you're writing loops to filter lists based on conditions all over your code, with the only differences between these occurrences being the element types and the specific condition. But because you want to filter lists of all kinds of types, you might not immediately think you could write a single function that could be called wherever filtering is needed. Enter Generics!

function filter<T>(l: [T], pred: function<T -> boolean>) -> [T] {
    var res: [T] = [];
    var i = 0;
    while (i < len(l)) {
        if (pred(l[i])) {
            append(res, l[i]);
        }
        ++i;
    }
    return res;
}

The function filter<T>(...) is defined to take a list of elements of some arbitrary (generic) type, T, and a "predicate" (single arg func returning a boolean) that takes in values of that generic type, T. In this example, the particular type T is "unconstrained". The only constraint is the typical type behavior that for whatever values are passed as args, the list elements must be of the same type as the input to the given predicate function.

So, that generic type can take on the "concrete" type of whatever data happens to be passed into the function at the call site:

filter([1, 90, 10, 40], x -> x > 15); # [90, 40]
filter([[0], [0,0], [0, 0]], l -> len(l) > 1); # [[0, 0], [0, 0]]

Contracts

Consider the example of the generic function:

function filter<T>(l: [T], pred: function<T -> boolean>) -> [T] {...}

If you really squint, you might notice that there's very little information available in the body of the filter<T>(...) function to tell you about the type T. As a result, you're unable to do much with values of such an unconstrained generic type beyond passing the value along to another generic function accepting an unconstrained generic arg, or putting it into some collection defined over the same generic type. This would be very limiting if this was all that could be done with generics.

Enter Contracts! It will take a bit of a buildup, but we should be able to write generic functions that will be able to put constraints on the acceptable types, for example saying something like "this procedure will accept any type, T, for which the function foo(arg1: T, arg2: T) exists."

For example, we should be able to write the following generic function:

requires(Operators<T>)    # <-- What is this `requires(...)`?
function sum<T>(l: [T]) -> T {
    var res = l[0];
    var i = 0;
    while (++i < len(l)) {
        res = Operators::add(res, l[i]); # <-- What is this `Operators::add`?
    }
    return res;
}

The function above has a new requires(...) clause in the signature which we haven't seen before. This is the mechanism by which a function constrains the set of types that may be passed into this function to only types that definitely have a certain associated procedure implementation existing. The requires(...) clause takes in a list of "Contracts" that must be implemented over the generic type. In this case that contract's definition looks like:

contract Operators<X> {
    function add(lhs: X, rhs: X) -> X;
}

This Contract specifies a single function signature that any implementation of this Contract must implement. Other Contracts may specify more than one signature, or even more than one generic type param. There are no restrictions on where the generic Contract param(s) may be used in the procedure signatures, so it may even be included in the return type as shown in the example above.

The only requirement on signatures is that each one must make use of each generic arg type listed in the Contract's signature. This is mandatory as Claro looks up the particular implementations by inspecting the arg types provided at the Contract procedure's call-sites.

Contracts are Not Interfaces

Coming from an Object-Oriented background, you may be tempted to compare Contracts to "Interfaces", but you'll find that while they may be used to a similar effect, they are not the same thing. The intention of an "Interface" is to encode subtyping relationships between types, whereas Claro has absolutely no notion of subtyping. All defined types are strictly independent of one another. Claro asks you to simplify your mental model and simply think of Contracts as a mechanism for encoding a required bit of functionality that needs to be implemented uniquely over values of unrelated, arbitrary (generic) types.

Implementing a Contract

Simply defining a contract is not sufficient to actually be useful, however, since the definition itself doesn't provide any logic. So, to actually use a Contract, we must implement it for a certain (set of) concrete type(s):

implement Operators<int> {
    function add(lhs: int, rhs: int) -> int {
        return lhs + rhs;
    }
}

implement Operators<string> {
    function add(lhs: string, rhs: string) -> string {
        return "{lhs}{rhs}";
    }
}

Now that you have implementations, you can either call them directly:

print(Operators::add(10, 20)); # 30
print(Operators::add("Hello, ", "world")); # "Hello, world"

Or, even more valuable, you can also call the generic sum function over concrete types int or string because the requirements are met for both!

requires(Operators<T>)
function sum<T>(l: [T]) -> T {...}

print(sum([1, 2, 3])); # 6 
print(sum(["a", "bc", "d"])); # abcd

In this way, Claro's Contracts interact with Generics to create a powerful form of code reuse where custom behavior can be uniquely dictated by type information. And, unlike in an Object-Oriented language, this code reuse did not rely on creating any subtyping relationships.

A Note on Static Dispatch via "Monomorphization"

As a performance note - even beyond the conceptual simplification benefits of avoiding dependence on subtyping relationships to achieve custom behaviors, Claro also achieves performance gains through its ability at compile-time to statically know which custom Contract implementation will be called. In the Object-Oriented approach, generally speaking the procedure receiving an arg of an interface type doesn't know which particular implementation will be called at runtime. This leads to the situation where a runtime "dispatch table"/"vtable" lookup is required to determine which particular implementation to call for each particular value passed into the procedure. Claro is a "monomorphizing" compiler, meaning that during compilation each Generic Procedure has a customized implementation codegen'd for each set of concrete types the procedure is actually called with. In this way, there's no runtime dispatch overhead when types are statically known (which is always true unless you're explicitly calling a generic procedure over a oneof<...> type - but in this case you're consciously opting into dynamic dispatch overhead).
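
The contrast can be sketched by hand. The Python below is only a conceptual illustration, not Claro's generated code: `sum_dynamic` models per-call runtime lookup using a table keyed by type, while `sum_int` and `sum_string` show roughly what one specialized copy per concrete type looks like. All names here are hypothetical.

```python
# Runtime dispatch: one generic body, with the implementation looked up per
# call from a table keyed by the value's type (the kind of overhead that the
# paragraph above attributes to vtable-style dispatch).
OPERATORS_ADD = {
    int: lambda lhs, rhs: lhs + rhs,
    str: lambda lhs, rhs: f"{lhs}{rhs}",
}


def sum_dynamic(items):
    res = items[0]
    for item in items[1:]:
        add = OPERATORS_ADD[type(item)]  # Lookup repeated at runtime.
        res = add(res, item)
    return res


# Roughly what monomorphization yields: one specialized copy per concrete
# type, each using its implementation directly with no lookup.
def sum_int(items):
    res = items[0]
    for item in items[1:]:
        res = res + item
    return res


def sum_string(items):
    res = items[0]
    for item in items[1:]:
        res = f"{res}{item}"
    return res


print(sum_dynamic([1, 2, 3]), sum_int([1, 2, 3]))  # 6 6
print(sum_dynamic(["a", "bc", "d"]), sum_string(["a", "bc", "d"]))  # abcd abcd
```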

Generic Return Type Inference

One very interesting capability that you get from the combination of Claro's bidirectional type inference and generics is the ability to infer which Contract implementation to defer to based on the expected/requested return type at a procedure call-site. Let's get more specific.

contract Index<T, R> {
    function get(l: T, ind: int) -> R;
}

implement Index<[int], int> {
    function get(l: [int], ind: int) -> int {
        return l[ind];
    }
}

alias SafeRes : tuple<boolean, int>

implement Index<[int], SafeRes> {
    function get(l: [int], ind: int) -> SafeRes {
        if (ind >= 0 and ind < len(l)) {
            return (true, l[ind]);
        }
        return (false, -1);
    }
}

For the above implementations of Index<T, R>, you'll notice that each function, Index::get, only differs in its return type but not in the arg types. So, Claro must determine which implementation to defer to by way of the contextually expected return type. This, I believe, leads to some very convenient ergonomics for configurability, though the onus for "appropriate" use of this feature falls on developers as a design decision.

var l = [1,2,3];
var outOfBoundsInd = 10;
var unsafeRes: int = Index::get(l, outOfBoundsInd); # out of bounds runtime err.
var safeRes: SafeRes = Index::get(l, outOfBoundsInd); # (false, -1)
var ambiguous = Index::get(l, outOfBoundsInd); # Compiler error, ambiguous call to `Index::get`.

Concurrency

There is one remaining significant factor that a programming language should provide builtin mechanisms for in order to enable programmers to develop very highly performant code that can take full advantage of the available CPU hardware: concurrency.

Sometimes you have already squeezed every last drop of performance out of your algorithmic designs, or you are constrained by waiting for slow operations to complete (DB requests, networked API calls, file I/O) before your program may even make progress through its workload. In these situations, often the only way possible to get more work done is to do more than one thing at the same time.

In order to achieve this, Claro asks you to first think about the dependencies between the various steps in your desired workflow. These dependencies come in the form of data, so you should be asking yourself, "At any given step in my workflow, what data do I need to be available in order to make the decisions I'll need to make or to take the actions needed?". When you start to reason in this way, you will likely come across opportunities where certain components of your workflow are completely independent, in the sense that they do not rely at all upon the same data in order to do their work. Examples of this are easy to see in web service request handling (each request can typically be handled independently of any others), or if you look a bit closer it can also be seen in MapReduce style batch processing (the large input is partitioned for the workers to map independently of other partitions). There will be many more examples, but the key takeaway is that if these work items can be partitioned to be completely independent like this, then they can be run at the same time rather than sequentially. In a single-machine context, you achieve this by using multiple threads to execute your program, or portions of your program, concurrently.

Unfortunately, using threads is known to have inherent dangers. Mistakes with threaded programs have been known to cause "deadlocking" or other issues where a program becomes completely stuck and is unable to make forward progress. Alternatively, you may run into "data races" where multiple threads attempt to read/write the same shared data simultaneously, each not knowing that another thread may be impacting or be impacted by the state change - this leads to consistency problems where threads end up operating on stale, corrupted, or inconsistent data. These have tended to be reasons for people to fully avoid working with multithreaded code at all - but that caution is just leaving performance on the table. Thankfully, Claro addresses these issues and provides convenient, fearless concurrency!

Graph Procedures

A Graph Procedure is much like a regular Procedure, with the only difference coming in how you structure code in the body. As its name implies, the body of a Graph Procedure will be structured as a graph of operations. Specifically it is a DAG (directed-acyclic-graph) where each node in the DAG represents some isolated unit of work which may depend on data produced by one or more other nodes and will produce its own resulting data. This structure is inherently parallelizable as Claro can analyze the provided DAG to schedule nodes to run as soon as possible once all of the data depended on by that node is ready. If any two nodes happen to have all of their dependent data ready at the same time, then Claro may schedule those nodes to run concurrently.

In fact, not only does Claro enable concurrency, it actually is able to create the optimal schedule to run your nodes. You don't need to think about scheduling at all, simply encode the data relationships between your operations, and Claro does the rest.

All of this is achieved by scheduling nodes to run cooperatively on a threadpool currently configured to have a single thread per CPU core. As of this writing, this default is the only option, but it will become configurable in the future (e.g. Google Java services default to 50 request threads). This allows you to trivially achieve significantly better utilization of your available hardware resources than single threaded code, and much more safely and more easily than can generally be achieved with a handcrafted threaded program.

The example below shows syntax vs DAG visualization:

graph function getWatchlist(userId: UserId) -> future<Watchlist> {
    root recWatchlist <- mergeTop10(@movies, @shows);
    node movies <- getTopMovies(@profile);
    node shows <- getTopShows(@profile);
    node profile <- getUserFromDB(userId);
}

As the DAG for this graph makes clear, profile must run first, but movies and shows may then be computed concurrently.
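If you're used to wiring this sort of thing up by hand on the JVM, the following Java sketch shows roughly the same dependency structure using CompletableFuture. It is only an analogy under stated assumptions: the String and List stand-ins for UserId, the profile, and Watchlist are hypothetical, as are the bodies of the helper methods, and Claro's own scheduling is not implemented this way as far as this doc states.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Java analogy of the getWatchlist graph. All helper bodies are hypothetical stubs.
final class WatchlistSketch {
    static String getUserFromDB(String userId) { return "profile:" + userId; }
    static List<String> getTopMovies(String profile) { return List.of("movieA", "movieB"); }
    static List<String> getTopShows(String profile) { return List.of("showA"); }

    // Concatenates both lists and keeps at most the first 10 entries.
    static List<String> mergeTop10(List<String> movies, List<String> shows) {
        return Stream.concat(movies.stream(), shows.stream())
                .limit(10)
                .collect(Collectors.toList());
    }

    static CompletableFuture<List<String>> getWatchlist(String userId) {
        // node profile: no upstream nodes, so it can start immediately.
        CompletableFuture<String> profile =
                CompletableFuture.supplyAsync(() -> getUserFromDB(userId));
        // nodes movies and shows: both depend only on profile, so once it
        // completes they are free to run concurrently with each other.
        CompletableFuture<List<String>> movies =
                profile.thenApplyAsync(WatchlistSketch::getTopMovies);
        CompletableFuture<List<String>> shows =
                profile.thenApplyAsync(WatchlistSketch::getTopShows);
        // root: runs only after both movies and shows are ready.
        return movies.thenCombine(shows, WatchlistSketch::mergeTop10);
    }

    public static void main(String[] args) {
        System.out.println(getWatchlist("u1").join());
    }
}
```

Notice how much of the Java version is plumbing. In the Graph Procedure, the same information is carried entirely by the @node references.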

Graph Procedure Composition

Great! Now Graph Procedures have given us free concurrency just by structuring our code declaratively rather than imperatively. But as we'd realistically only want to put a few nodes in a single Graph Procedure from a code maintenance and readability point of view, how do we write DAGs that are larger than just a few nodes? Composition! By this I mean simply calling another Graph Procedure from within the current one.

For example:

graph function bar(argB: ..., argC: ...) -> future<...> {
    root barRes <- doBar(@b1);
    node b1 <- doBar1(@b2, @b3);
    node b2 <- doBar2(argB);
    node b3 <- doBar3(argC);
}
graph function foo(argA: ...) -> future<...> {
    root fooRes <- doFoo(@f1, @f2);
    node f1 <- doFoo1(@f3);
    node f2 <- bar(10, @f3); # <-- Graph Composition via Call to `bar`.
    node f3 <- doFoo3(argA);
}

Because foo(...) includes a call to bar(...) as a subgraph, you can imagine that node f2 in graph foo actually composes around the entire bar graph.

Composition is simple to reason about in this way: the entire subgraph is started once all data dependencies of the node wrapping it are ready.
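As a rough JVM-side analogy, composing one graph inside another behaves a lot like CompletableFuture's thenCompose: the inner graph isn't even constructed until the wrapping node's inputs are ready. The sketch below mirrors the shape of foo and bar above, but the arithmetic inside each node is made up purely so there is something to compute.

```java
import java.util.concurrent.CompletableFuture;

// Java analogy of graph composition. The arithmetic in each node is a
// hypothetical placeholder for doBar*, doFoo*, etc.
final class CompositionSketch {
    static CompletableFuture<Integer> bar(int argB, int argC) {
        CompletableFuture<Integer> b2 = CompletableFuture.supplyAsync(() -> argB * 2); // doBar2
        CompletableFuture<Integer> b3 = CompletableFuture.supplyAsync(() -> argC + 1); // doBar3
        CompletableFuture<Integer> b1 = b2.thenCombine(b3, Integer::sum);              // doBar1
        return b1.thenApply(x -> x * 10);                                              // root barRes
    }

    static CompletableFuture<Integer> foo(int argA) {
        CompletableFuture<Integer> f3 = CompletableFuture.supplyAsync(() -> argA + 1); // doFoo3
        CompletableFuture<Integer> f1 = f3.thenApply(x -> x * 2);                      // doFoo1
        // Composition: the whole `bar` subgraph is only started once f3 is ready.
        CompletableFuture<Integer> f2 = f3.thenCompose(x -> bar(10, x));
        return f1.thenCombine(f2, Integer::sum);                                       // root fooRes
    }

    public static void main(String[] args) {
        System.out.println(foo(1).join());
    }
}
```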

Calling Graph Procedures

As you've already seen, if you call a Graph Procedure from within another Graph (composition) then Claro will automatically handle the scheduling for you so that downstream nodes receive the value when it's ready. If you call a Graph Procedure from the top-level of a file, or from a non-Graph Procedure, then you'll see that you receive a value wrapped in a future<...>. This is because, as Claro follows the Async pattern for concurrent execution, some nodes in the Graph Procedure may not be done running yet, meaning that the overall Graph result may not be ready either.

var graphRes: future<Foo> = fooGraph(...);

There's not much you can do with a future<...> as it's really just a handle representing work whose result you'd like to be able to access when it's ready. In this situation (outside a Graph), as a future<...> represents some computation that may not be done yet, the only way to get the actual result is to block the current thread until the other threads running the graph backing the future<...> have finished. To do so, use the "blocking unwrap" op <-|:

var unwrappedRes: Foo <-| fooGraph(...);

The number one thing to keep in mind is that between calling a Graph and blocking on its result, any operations between may be running concurrently with the graph backing the future<...> (you don't know when the graph actually finishes except that it will certainly have finished after the <-| operation).

var graphFuture: future<Foo> = fooGraph(...);

# These two instructions are likely running concurrently with respect to
# `graphFuture`, as `graphFuture` likely hasn't finished yet, but they are
# definitely serialized with respect to each other.
doSomething(...);
doAnotherThing(...);

# Blocking the current thread to "unwrap" the `future<Foo>` into a raw `Foo`
# value we can operate on.
var unwrapped: Foo <-| graphFuture; 
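For comparison, the closest everyday Java equivalent of this pattern is kicking off a CompletableFuture, doing some unrelated work on the calling thread, and then calling join(). This is just an analogy; fooGraph here is a hypothetical stub returning a constant.

```java
import java.util.concurrent.CompletableFuture;

// Java analogy of "call a graph, do other work, then block to unwrap".
final class BlockingUnwrapSketch {
    // Hypothetical stand-in for a Graph Procedure call.
    static CompletableFuture<Integer> fooGraph() {
        return CompletableFuture.supplyAsync(() -> 42);
    }

    static int run() {
        CompletableFuture<Integer> graphFuture = fooGraph();

        // This work happens on the calling thread and may overlap with the
        // work backing graphFuture.
        int local = 1 + 1;

        // Analogous to `<-|`: block the calling thread until the result is ready.
        int unwrapped = graphFuture.join();
        return unwrapped + local;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```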

(Advanced) Conditional Subgraph Execution

There will be times when you actually only want to execute some portion of the graph upon satisfying some condition. In this case, you may inject the node into a procedure expecting a provider<future<...>> so that you may conditionally trigger execution yourself after checking the condition:

graph function getHomepage(userId : UserId) -> future<Homepage> {
    root homepage <- renderPage(@basePage, @maybeUpgradeBanner);
    node basePage <- getBasePageFromDB();
    node maybeUpgradeBanner
        <- getOptionalUpgradeBannerFromDB(
               @userIsPremium,
               @upgradeBanner  # <-- "Lazy Subgraph" injection requested.
           ); 
    node userIsPremium <- checkPremiumFromDB(userId);
    node upgradeBanner <- getUpgradeBannerFromDB();
}

...

function getOptionalUpgradeBannerFromDB(
    alreadyPremium: boolean,
    getUpgradeBannerFromDBProvider: provider<future<Upgrade>>
) -> Optional<future<Upgrade>> {
    if (alreadyPremium) {
        return Nothing;
    }
    return getUpgradeBannerFromDBProvider();
}

Read closely above. The function shown requests an arg of type provider<future<Upgrade>> which is injected as a lazy subgraph rooted at node upgradeBanner. In this way, the subgraph of the getHomepage(...) graph is only run sometimes, upon satisfying the condition that the user is not already a "premium" member.
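The same "don't start the work unless I ask for it" idea can be sketched in Java by passing a Supplier of a future rather than the future itself. Again, this is only an analogy with hypothetical names: the bannerFetches counter exists purely so you can observe whether the lazy work was ever triggered.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Java analogy of lazy subgraph injection via provider<future<...>>.
final class LazySubgraphSketch {
    // Counts how many times the banner work was actually started (demo only).
    static final AtomicInteger bannerFetches = new AtomicInteger();

    // Hypothetical stand-in for the subgraph rooted at node upgradeBanner.
    static CompletableFuture<String> getUpgradeBannerFromDB() {
        bannerFetches.incrementAndGet();
        return CompletableFuture.supplyAsync(() -> "Upgrade now!");
    }

    // Nothing is started unless the provider is invoked.
    static Optional<CompletableFuture<String>> getOptionalUpgradeBanner(
            boolean alreadyPremium,
            Supplier<CompletableFuture<String>> bannerProvider) {
        if (alreadyPremium) {
            return Optional.empty();
        }
        return Optional.of(bannerProvider.get());
    }

    public static void main(String[] args) {
        getOptionalUpgradeBanner(true, LazySubgraphSketch::getUpgradeBannerFromDB);
        System.out.println("fetches after premium user: " + bannerFetches.get());
        getOptionalUpgradeBanner(false, LazySubgraphSketch::getUpgradeBannerFromDB)
                .ifPresent(f -> System.out.println(f.join()));
        System.out.println("fetches after non-premium user: " + bannerFetches.get());
    }
}
```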

Note on Usage of Optional in Above Example:

If you read the above example very closely, you may have noticed that the return type of the getOptionalUpgradeBannerFromDB(...) is Optional<future<Upgrade>> but yet the two return statements in the function are return Nothing; and return getUpgradeBannerFromDBProvider();, neither of which reference Optional. This is making use of a type system feature upcoming very soon in the language but not yet available, the oneof<...> type. I've done this to make a less distracting example, but for now, until oneof<...> is available, you would actually need to do a workaround to define your own quasi Optional perhaps looking something like:

# Impl of `Optional` before `oneof<...>` type system support.
alias Optional : tuple<boolean, future<Upgrade>, NothingType>

As a preview, for anyone very interested in what the definition of this more convenient Optional in the example above would look like making use of the in-development oneof<...> type:

# Impl of `Optional` using `oneof<...>`.
alias Optional<T> : oneof<T, !! NothingType>

Stay tuned for updates on oneof<...> support.

Guaranteed Deadlock-Free Concurrency

One of Claro's most powerful advantages is that it is able to statically analyze your concurrent code to determine that it is impossible to run into a deadlock at runtime.

A deadlock is a situation where a thread blocks its execution waiting for another thread or some other action to complete before it can continue, but that other thread or action never completes, thereby leaving the waiting thread permanently blocked. Threads are not free, and effectively losing access to a deadlocked thread has costlier implications than just losing that unit of work. Each thread costs about 1MB of RAM, and in a server application deployed with a fixed number of threads, losing even one can lead to cascading failures such as thread starvation (having no more threads in a healthy state available to do meaningful work) or simply falling behind on incoming request handling. That in turn decreases the server's effective throughput, causing other servers to pick up the load (making them more likely to fail in turn), or it leads to straight up dropping user requests, returning errors to users and degrading the product experience.

To mitigate these risks at scale, high-throughput, low-latency services turn to the async concurrency pattern to handle all operations in a non-blocking way. Claro's Graph Procedures implement the async pattern for you for free, while statically validating that your concurrent code is entirely non-blocking. It does so by modeling every Graph node as an async operation that will not even be started until after all of its data dependencies are resolved. Once a node is ready for execution it will be scheduled on a threadpool with as many threads as available CPU cores (will be configurable in the future).

In this way, calling a Graph Procedure is actually an extremely lightweight operation from the perspective of the calling thread. The calling thread simply:

  1. traverses the Graph (without executing any nodes)
  2. composes a future<...> representing a handle to the work to be done by the Graph
  3. submits the Graph to the Graph Executor to schedule on its threadpool when threads become available

After these steps the calling thread is completely freed to move on, knowing that the work represented by the Graph Procedure's returned future<...> will be handled by other threads. As a result, in a web server, after calling a request handling Graph the service thread is free to just immediately move on to accepting new requests. The service thread never needs to block to wait for request handling business logic to complete. Now, a server built using this approach will no longer be bound by the number of incoming requests as it will be able to continuously schedule incoming requests to be processed when Graph Executor threads become available. Of course, the server may still fail due to heavy load, though this will end up coming from OOMs (out-of-memory errors) as the result of storing all of the queued requests. Even so, as a general rule, this will happen much later than if you were to execute request handling logic using thread blocking operations, and it will almost always degrade more gracefully when it does eventually reach its limit.
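To illustrate the "submit and move on" behavior described above, here is a small Java sketch in which a calling thread hands work to a fixed pool sized to the number of CPU cores and immediately continues. The names (graphExecutor, handleAll) and the doubling "business logic" are hypothetical, and the final join loop exists only so the demo can report a result; a real service thread would not block like that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Java analogy of a service thread handing requests to a fixed-size executor.
final class NonBlockingSubmitSketch {
    static int handleAll(int requests) {
        // One thread per core, mirroring the default described in this doc.
        ExecutorService graphExecutor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<CompletableFuture<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int requestId = i;
                // Submitting returns a handle right away; the calling thread
                // is free to accept the next request without waiting.
                pending.add(CompletableFuture.supplyAsync(() -> requestId * 2, graphExecutor));
            }
            // Demo only: block at the very end to collect the results.
            int total = 0;
            for (CompletableFuture<Integer> f : pending) {
                total += f.join();
            }
            return total;
        } finally {
            graphExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleAll(4));
    }
}
```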

The only concession that you, as a programmer, have to make is simply defining all of your concurrent logic inside a Graph Procedure. Claro will then manage all of the scheduling for you, while enforcing that you never block one of the Graph Executor threads (you may not use the <-| operator in any code transitively reachable from your Graph, or else you'll receive a compiler error). To provide a single, well-lit path for safely designing scalable services in Claro, the only available mechanism to access threads is Graph Procedures.

Blocking Procedures

Whereas other languages with some form of builtin concurrency mechanism may tend to make it harder to write async code than blocking code, Claro is very intentional about inverting that balance. Make the good things easy and the bad things hard. So, you may write blocking code in Claro, but as it's really only intended to be used in limited contexts, Claro forces your hand. Any procedure that makes use of the <-| operator, either directly or indirectly, must be explicitly annotated to be blocking:

blocking function doBlocking(...) -> ... {
    ...do stuff...

    var unwrappedGraphRes: Foo <-| fooGraph(...); # Blocking unwrap.

    ...do stuff using `unwrappedGraphRes`...

    return ...;
}

To prevent deadlocking, procedures annotated blocking may not be called from a Graph. Therefore, you can be confident that the threading implementation of any logic defined within a Graph Procedure will certainly not suffer from liveness issues in the form of deadlocks (of course, you may still write code with bugs such as infinite loops that may lead to a "livelock").
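If it isn't obvious why blocking from inside a threadpool is so dangerous, the following Java sketch reproduces the hazard in miniature. A task running on a one-thread pool submits a second task to the same pool and then blocks waiting on it; the second task can never start because the only thread is the one doing the waiting. This is plain Java with hypothetical names, shown only to illustrate the class of bug that Claro's blocking rule is described as ruling out at compile time.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Demonstrates a pool thread deadlocking by blocking on work queued behind it.
final class PoolDeadlockSketch {
    // Returns true if the outer task was still stuck when the timeout expired.
    static boolean blockingInsidePoolStalls() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<Integer> outer = pool.submit(() -> {
                Future<Integer> inner = pool.submit(() -> 1);
                // Blocks the pool's only thread, so `inner` can never be run.
                return inner.get() + 1;
            });
            try {
                outer.get(200, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            // Interrupts the stuck worker so the demo can exit cleanly.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("stalled: " + blockingInsidePoolStalls());
    }
}
```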

Re: "What Color is Your Function?"

(For context, the blog post "What Color is Your Function?" by Bob Nystrom is highly recommended reading.)

Unfortunately, introducing the blocking procedure type variant has the effect of "coloring" all functions that transitively reach a blocking procedure. This ends up being a problem for any code that provides some generic functionality over first-class procedure arguments that we would ideally like to be able to reuse and call from any context, whether blocking or not.

Take, for example, Functional Programming's common filter function with the following signature:

function filter<T>(l: [T], pred: function<T -> boolean>) -> [T];

As currently defined, the filter function with the above signature could only be used over non-blocking pred function args. You'd need to write a duplicate function explicitly accepting a blocking pred function in its signature if you wanted to filter lists using a pred function that makes use of blocking operations:

blocking function filterBlocking<T>(
    l: [T], pred: blocking function<T -> boolean>) -> [T];

This duplication would be pervasive throughout functional-style code, and would discourage using functional-style at all, both of which are very undesirable outcomes. So, Claro handles this using one more form of generics inspired by Rust's Keyword Generics Initiative, "Blocking Generics".

Blocking Generics

You're able to define a procedure whose "blocking"-ness is generically determined by the type of the first-class procedure arg that the function is called with. Taking inspiration from Rust's Keyword Generics Initiative, a Claro procedure may be declared "Blocking-Generic" with the following syntax:

blocking:pred function filter<T>(
    l: [T], pred: blocking? function<T -> boolean>) -> [T] {...}

Now, with only a single implementation of your filter function, calls may be statically determined to be either a blocking or non-blocking call depending on the type of the passed pred function arg. So now, from within a Graph, you may call this "blocking-generic" function as long as you pass in a non-blocking pred function.

Note on the blocking:argName and blocking? Syntax

Claro localizes Generics only to procedure signatures. This is done with the intention of making Generics more easily understandable, such that Generics itself may be conceptualized simply as a form of "templating" (regardless of whether this is how the compiler is actually implementing the feature).

As a result, these type modifier syntaxes are restricted to being used within top-level procedure definition signatures only. In particular, you may not define a variable of a blocking-generic procedure type:

# Illegal use of `blocking:...`, and `blocking?` outside of top-level Procedure definition.
var myBlockingGenericFn:
    blocking:arg1 function<|[int], blocking? function<int -> boolean>| -> [int]>;

This has the implication that lambdas may not make use of blocking generics. But this is in line with Claro's single-use intention for lambdas, encouraging the definition of lambdas that will only be used in a single limited scope. For any cases that actually need to make use of blocking-generics, you are by definition defining a procedure that should have more than one use case, and you should define a top-level procedure instead.

You can, however, still make first-class references to top-level blocking-generic procedures in order to pass them around as data. The only restriction is that you must statically declare which blocking variant the reference will take on:

# A blocking function var, to which you may *only* assign blocking functions.
var myBlockingFn: blocking function<
        |[int], blocking function<int -> boolean>| -> [int]>
    = filter;

# A non-blocking function var, to which you may *only* assign non-blocking functions.
var myNonBlockingFn: function<|[int], function<int -> boolean>| -> [int]>
    = filter;