For our first parser, we'll implement a string interpolator equivalent to the standard s interpolator. That is, an interpolator that builds a string consisting of the concatenation of literal parts and arguments converted to strings using toString.
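
As a reference point for the behaviour we are aiming for, the following is a minimal sketch in plain Scala, without the library. The helper name concatLikeS is made up for this illustration, and the sketch ignores the escape-sequence processing that the real s interpolator also performs on the literal parts.

```scala
// Hypothetical helper: interleave the literal parts with args.toString,
// which is the concatenation the tutorial's interpolator should reproduce.
// A StringContext always has exactly one more part than it has arguments,
// so parts.head is assumed to exist.
def concatLikeS(parts: Seq[String], args: Seq[Any]): String =
  val sb = new StringBuilder(parts.head)
  for (arg, part) <- args.zip(parts.tail) do
    sb.append(arg.toString).append(part)
  sb.toString

val n = 4
val viaS = s"2 + 2 = ${n}!"
val viaConcat = concatLikeS(Seq("2 + 2 = ", "!"), Seq(n))
```

Both viaS and viaConcat hold the same string, which is the result the finished s2 interpolator should also produce for that input.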

The crux of parser combinators is creating a parser by combining smaller parsers together, so we'll start with a small parser and build up to a parser with the desired properties.

The leaf parsers that we'll be using for now are provided in Interpolator.idInterpolators, and we're defining a new string interpolator as an extension method on StringContext. So, the scaffolding will look something like:

type Result
import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def prefix(args:Any*):Result =
    val interpolator:Interpolator[Result] = ???
    interpolator.interpolate(sc, args)

We'll start by creating an Interpolator that will match any single character from the processed string. charWhere is one of the leaf parsers that can be used for this; charWhere takes a predicate and creates a parser in which, if the next character passes the predicate, the parser passes and the character is captured. Since we want this parser to match any character, we will use a predicate that always returns true.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):Char =
    val anyChar = charWhere(_ => true)
    anyChar.interpolate(sc, args)

s2"Hello" // 'H'
s2"${1 + 1}" // throws
s2"" // throws

Note that the parser matches the first character if there is one. Parsing starts at the start of the processed string.

Using this parser that can match any single character, we can create a parser that can match a sequence of characters using the repeat operator. The repeat operator creates a parser that will invoke the operand repeatedly until the operand parser fails, and then combine the results of the repeated operand runs into a new value. Using the default givens, since the operand is an Interpolator[Char], the result is an Interpolator[String].

import _root_.name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):String =
    val anyChars = charWhere(_ => true)
        .repeat()
    anyChars.interpolate(sc, args)

s2"Hello" // "Hello"
val name = "Mr. Smith"
s2"Hello ${name}!" // "Hello "
s2"" // ""

Next, let's handle processed string arguments. We will set aside anyChars for now. Of the leaf parsers that handle args, ofType is the most straightforward. ofType takes a type argument, along with evidence for that type, and will match and capture any argument that is an instance of that type. So, an ofType[Int] would match any argument that is an Int. Since we want to match any argument, we will use ofType[Any]. The result type of this parser is the same as its type parameter, in this case Any.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):Any =
    val anyArg = ofType[Any]
    anyArg.interpolate(sc, args)

s2"${2 + 2}" // 4
s2"Hello" // throws: Expected ofType[Object]
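
To make the matching rule concrete, here is a small sketch of a class-based check written with scala.reflect.ClassTag from the standard library. The helper matchArg is hypothetical and is not the library's implementation; it only illustrates why a check against Any accepts every argument while a narrower type rejects some.

```scala
import scala.reflect.ClassTag

// Hypothetical helper: capture `arg` only when it is an instance of A.
// ClassTag#unapply performs the run-time class check (including boxed primitives).
def matchArg[A](arg: Any)(using ct: ClassTag[A]): Option[A] =
  ct.unapply(arg)

val intFromInt = matchArg[Int](2 + 2)        // Some(4)
val intFromString = matchArg[Int]("Hello")   // None
val anyFromString = matchArg[Any]("Hello")   // Some("Hello")
```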

We don't want the result of the interpolator to be an Any here, though; we want the result to be a String.

We can use the map operator to convert an interpolator on one type to an interpolator on another type.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):String =
    val anyArg = ofType[Any]
        .map(_.toString)
    anyArg.interpolate(sc, args)

s2"${2 + 2}" // "4"
s2"Hello" // throws: Expected ofType[Object]

Now that we have one parser that will match a sequence of characters and another that will match an arg, we can create a parser that will match either a sequence of characters or an arg by combining the two other parsers using the orElse operator, written here as <|>. The <|> operator creates a parser that will attempt the left parser, passing on the result of the left parser if that result was a success, and otherwise attempting the right parser and passing on that result.

Using the default givens, since both arguments to the <|> operator are Interpolator[String], the result of the operator will also be an Interpolator[String].

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):String =
    val anyChars = charWhere(_ => true)
        .repeat()
    val anyArg = ofType[Any]
        .map(_.toString)
    val segment = anyChars <|> anyArg
    segment.interpolate(sc, args)

s2"2 + 2 = ${2 + 2}" // "2 + 2 = "
s2"${2 + 2}" // ""

Oh, the parser didn't do quite what we wanted: the second example produced an empty string instead of "4". Here, the parser saw that the processed string started with zero characters and considered that to be a match of the anyChars branch. To fix this, we are going to modify the repeat call in anyChars. repeat has several optional arguments, the first of which is the minimum number of repeats required for the parse to be considered a success. This argument defaults to zero, but if it is explicitly set to one and the processed string starts with an arg, then anyChars will not consider a run of zero characters to be a success, and segment will try the anyArg branch after the anyChars branch fails.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):String =
    val anyChars = charWhere(_ => true)
        .repeat(1)
    val anyArg = ofType[Any]
        .map(_.toString)
    val segment = anyChars <|> anyArg
    segment.interpolate(sc, args)

s2"2 + 2 = ${2 + 2}" // "2 + 2 = "
s2"${2 + 2} = 2 + 2" // "4"
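
The interaction between a zero-minimum repeat and an or-else choice can be reproduced in a few lines of plain Scala. The following is a toy model only: Token, Toy, repeatChars and orElse are names invented for this sketch and do not reflect how the library is implemented. It assumes a parser is a function from the remaining input to an optional pair of a result and the rest of the input.

```scala
// Toy input: a processed string is a sequence of literal characters and args.
enum Token:
  case Lit(c: Char)
  case Arg(value: Any)

// Toy parser: None on failure, Some((result, remainingInput)) on success.
type Toy[A] = List[Token] => Option[(A, List[Token])]

val anyChar: Toy[Char] = input =>
  input match
    case Token.Lit(c) :: rest => Some((c, rest))
    case _ => None

val anyArg: Toy[String] = input =>
  input match
    case Token.Arg(v) :: rest => Some((v.toString, rest))
    case _ => None

// Greedy repetition that succeeds only if at least `min` characters matched.
def repeatChars(p: Toy[Char], min: Int): Toy[String] = input =>
  val sb = new StringBuilder
  var rest = input
  var next = p(rest)
  while next.isDefined do
    val (c, r) = next.get
    sb.append(c)
    rest = r
    next = p(rest)
  if sb.length >= min then Some((sb.toString, rest)) else None

// Try the left parser; only if it fails, try the right parser.
def orElse[A](left: Toy[A], right: Toy[A]): Toy[A] = input =>
  left(input).orElse(right(input))

// Models a processed string that starts with an arg, such as s2"${2 + 2}!"
val input: List[Token] = List(Token.Arg(4), Token.Lit('!'))

val withMin0 = orElse(repeatChars(anyChar, 0), anyArg)(input).map(_._1)
val withMin1 = orElse(repeatChars(anyChar, 1), anyArg)(input).map(_._1)
```

In this model withMin0 is Some(""), because the empty run counts as a success and the right branch is never consulted, while withMin1 is Some("4"), because the left branch now fails and the choice falls through to anyArg.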

Now that we have a parser that can match either a run of characters or an argument, we can repeat that parser to create a parser that can match a sequence of character-sequences-or-arguments. This time, since the input to the repeat parser is an Interpolator[String], the result will be an Interpolator[List[String]]. In general, unless a higher priority instance of the Repeated typeclass can be found, repeat will create a parser that produces a List[A]; the Char to String conversion seen before was a built-in higher priority Repeated typeclass instance.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):List[String] =
    val anyChars = charWhere(_ => true)
        .repeat(1)
    val anyArg = ofType[Any]
        .map(_.toString)
    val segment = anyChars <|> anyArg
    val segments = segment
        .repeat()
    segments.interpolate(sc, args)

s2"2 + 2 = ${2 + 2}" // List("2 + 2 = ", "4")
s2"${2 + 2} = 2 + 2" // List("4", " = 2 + 2")
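
The way a typeclass can choose String for Char elements and fall back to List[A] otherwise can be sketched with prioritized givens. This is an illustration under assumptions: ToyRepeated, LowPriorityToyRepeated and combineAll are invented names, and the library's actual Repeated typeclass may be shaped differently.

```scala
// Hypothetical typeclass: how to combine repeated results of type A into a Z.
trait ToyRepeated[A, Z]:
  def combine(items: List[A]): Z

trait LowPriorityToyRepeated:
  // Fallback instance: any element type is collected into a List.
  given toList[A]: ToyRepeated[A, List[A]] =
    new ToyRepeated[A, List[A]]:
      def combine(items: List[A]): List[A] = items

object ToyRepeated extends LowPriorityToyRepeated:
  // Declared in the companion itself, so it outranks the inherited fallback.
  given charsToString: ToyRepeated[Char, String] =
    new ToyRepeated[Char, String]:
      def combine(items: List[Char]): String = items.mkString

// The result type Z is determined by whichever instance is selected.
def combineAll[A, Z](items: List[A])(using r: ToyRepeated[A, Z]): Z =
  r.combine(items)

val joined = combineAll(List('a', 'b', 'c'))    // String
val listed = combineAll(List("2 + 2 = ", "4"))  // List[String]
```

Here the call with Char elements resolves to the String-producing instance, and the call with String elements resolves to the List-producing fallback, mirroring the two repeat results seen so far.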

We have an Interpolator[List[String]], and we can map a List[String] to a String to finish the simple string context reimplementation.

import name.rayrobdod.stringContextParserCombinator.Interpolator.idInterpolators._

extension (sc:StringContext)
  def s2(args:Any*):String =
    val anyChars = charWhere(_ => true)
        .repeat(1)
    val anyArg = ofType[Any]
        .map(_.toString)
    val segment = anyChars <|> anyArg
    val segments = segment
        .repeat()
        .map(_.mkString)
    segments.interpolate(sc, args)

s2"2 + 2 = ${2 + 2}" // "2 + 2 = 4"

This interpolator works, but it does all of its parsing at run time.

The library also supports creating macro-level parsers, which instead run at compile time. There are several advantages to using a macro-based interpolator:

  • Parsing errors are reported at compile time, instead of being thrown as run-time exceptions
  • ofType can work with types instead of classes
  • The Lifted interpolator only works in the Quoted context

The leaf interpolators used for a macro-based interpolator are provided in Interpolator.quotedInterpolators (or equivalently provided directly in the Interpolator companion object). The extension method declaration changes to that of a macro definition.

Together, the scaffolding of the string context extension method becomes:

type Result
import scala.quoted.{Expr, Quotes}
import name.rayrobdod.stringContextParserCombinator.Interpolator._

extension (inline sc:StringContext)
  inline def prefix(inline args:Any*):Result =
    ${prefixImpl('sc, 'args)}

def prefixImpl(sc:Expr[StringContext], args:Expr[Seq[Any]])(using Quotes):Expr[Result] =
  val interpolator:Interpolator[Expr[Result]] = ???
  interpolator.interpolate(sc, args)

The interpolate method handles extracting string context parts and arguments from the Expr arguments. Most of the changes involve changing return values to be wrapped in an Expr.

charWhere and the other character parsers capture Char values even in macro interpolators. However, the final result must be wrapped in an Expr, so the captured parts must be lifted into an Expr at some point. This can be done with the map operator, such as .map(Expr(_)), or equivalently with the mapToExpr method.

 val anyChars = charWhere(_ => true)
     .repeat(1)
+    .mapToExpr

ofType changes from requiring an implicit scala.reflect.ClassTag and returning an unwrapped value, to requiring an implicit scala.quoted.Type and returning an Expr that builds a value. Since the ofType result is in an Expr, the mapping applied to this value must be changed from an Any => String to an Expr[Any] => Expr[String], essentially wrapping the mapping in a quote.

 val anyArg = ofType[Any]
-    .map(arg => arg.toString)
+    .map(arg => '{$arg.toString})

The operands that create a segment have changed from both being an Interpolator[String] to both being an Interpolator[Expr[String]], so the <|>-combination of the two parts changes in the same way, but no source changes occur as a consequence of this.

 val segment = anyChars <|> anyArg

Lastly, the segments mapping has to be changed from a List[String] => String to a List[Expr[String]] => Expr[String].

 val segments = segment
     .repeat()
-    .map(_.mkString)
+    .map(strExprs => '{${Expr.ofList(strExprs)}.mkString})
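
The new mapping does two things in one expression, which may be easier to follow when split into steps. The sketch below uses only scala.quoted from the standard library. The function name concatExprs is hypothetical, and the function can only be called from within a macro expansion, where a Quotes instance is available.

```scala
import scala.quoted.{Expr, Quotes}

// Hypothetical helper showing the two steps of the segments mapping.
def concatExprs(strExprs: List[Expr[String]])(using Quotes): Expr[String] =
  // Step 1: turn a list of expressions into one expression that builds a list.
  val listExpr: Expr[List[String]] = Expr.ofList(strExprs)
  // Step 2: splice that expression into a quote that calls mkString,
  // so the concatenation happens in the generated code, not in the macro.
  '{ $listExpr.mkString }
```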

Taken together, the macro-level reimplementation of the standard string interpolator is as follows:

import scala.quoted.{Expr, Quotes}
import name.rayrobdod.stringContextParserCombinator.Interpolator._

extension (inline sc:StringContext)
  inline def s2(inline args:Any*):String =
    ${s2Impl('sc, 'args)}

def s2Impl(sc:Expr[StringContext], args:Expr[Seq[Any]])(using Quotes):Expr[String] =
  val anyChars = charWhere(_ => true)
      .repeat(1)
      .mapToExpr

  val anyArg = ofType[Any]
      .map(arg => '{$arg.toString})

  val segment = anyChars <|> anyArg

  val segments = segment
      .repeat()
      .map(strExprs => '{${Expr.ofList(strExprs)}.mkString})

  segments.interpolate(sc, args)